CC
==

Norcroft C compiler.
See Acorn C/C++ Manual for further documentation.

Contents:
  List of all pragmas
  List of all feature flags
  List of all warning suppression flags
  List of all error suppression flags
  Extra notes

  Changes in cc 5.07 - 5.10
  Changes in cc 5.11
  Changes in cc 5.12
  Changes in cc 5.13 - 5.19
  Changes in cc 5.20 - 5.36
  Changes in cc 5.37 - 5.39
  Changes in cc 5.40 - 5.50
  Changes in cc 5.51 - 5.53
  Changes in cc 5.54
  Changes in cc 5.55 - 5.56
  Changes in cc 5.57
  Changes in cc 5.58 - 5.60
  Changes in cc 5.61 - 5.64
  Changes in cc 5.65
  Changes in cc 5.66 - 5.68
  Changes in cc 5.69
  Changes in cc 5.70 - 5.71
  Changes in cc 5.72
  Changes in cc 5.73 - 5.74
  Changes in cc 5.75 - 5.76
  Changes in cc 5.77
  Changes in cc 5.78
  Changes in cc 5.79 - 5.80
  Changes in cc 5.81 - 5.82
  Changes in cc 5.83
  Changes in cc 5.84 - 5.85
  Changes in cc 5.86
  Changes in cc 5.87 - 5.88
  Changes in cc 5.88 - 5.89
  Changes in cc 5.90

List of all pragmas
===================
The following maps the long pragma names recognised by the preprocessor to
the short versions.  The digit is the value to use with the short
versions to effect the same thing.  (ie. #pragma -v4 is the same
as #pragma check_swix_formats).  Do not use -v3 as it is for
internal use only (by the compiler's own source code).

Short form   Long form

-a1          warn_implicit_fn_decls
-c1          check_memory_accesses
-d1          warn_deprecated
-e1          continue_after_hash_error
-g1          disable_fpargs_in_regs
-g2          force_fpargs_in_regs
-i1          include_only_once
-j1          optimise_crossjump
-k1          localcg_dataflow_checks
-l1          optimise_schedule
-m1          optimise_multiple_loads
-o1          anon_unions
-p1          profile
-p2          profile_statements
-s0          check_stack
-t1          force_top_level
-u1          utf8_source
-v1          check_printf_formats
-v2          check_scanf_formats
-v3          __compiler_msg_format_check (not for public use)
-v4          check_swix_formats
-y0          side_effects
-z1          optimise_cse
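
As a rough illustration of the table above, the sketch below asks the
compiler to check the format string of a printf-style function using the
long form of -v1. The function format_msg is hypothetical. The guard on
__CC_NORCROFT_VERSION (predefined since 5.11) keeps the pragmas away from
other compilers. Turning the check off again with a no_ prefix is an
assumption, based on the no_optimise_schedule form described under
"Scheduling".

```c
#include <stdarg.h>
#include <stdio.h>

#ifdef __CC_NORCROFT_VERSION
#pragma check_printf_formats        /* long form of #pragma -v1 */
#endif
/* format_msg is a hypothetical helper, not part of any library. */
int format_msg(char *buf, const char *format, ...);
#ifdef __CC_NORCROFT_VERSION
#pragma no_check_printf_formats     /* assumed to undo the pragma above */
#endif

/* Formats into buf (which the caller must make large enough) and
   returns the number of characters written, as vsprintf does. */
int format_msg(char *buf, const char *format, ...)
{
    va_list ap;
    int n;

    va_start(ap, format);
    n = vsprintf(buf, format, ap);
    va_end(ap);
    return n;
}
```

With the check active, a call such as format_msg(buf, "x=%d", "three") would
be expected to draw a diagnostic about the mismatched argument.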


List of all feature flags
=========================
Option       Meaning

-fa          Check for certain data flow anomalies.
-fb          Verbose
-fc          Limited pcc option
-fd          Unused
-fe          Check 6-char case-insensitive external symbols unique
-ff          Don't embed function names (cf. -fn)
-fg          Unused
-fh          Require pre-declaration of external objects
-fi          Retain #include "..." statements in listing output
-fj          Retain #include <...> statements in listing output
-fk          Use K&R include search rules
-fl          Don't use link register
-fm          Emit warning for unused preprocessor symbols
-fn          Embed function names (cf. -ff)
-fo          Warn of old-style function declarations
-fp          Report on explicit casts of integers to pointer
-fq          Unused
-fr          Let longjmp() corrupt register variables
-fs          Annotate
-ft          Enable variable-sized enums (cf. -fy)
-fu          Unexpanded listing
-fv          Report on all unused declarations including standard headers
-fw          Allow string literals to be writable
-fx          Unused
-fy          Store all enums as ints (cf. -ft).
-fz          Unused


List of all warning suppression flags
=====================================
(+ = disabled by default, use -W+<opt> to enable)

Option       Meaning

-Wa          "Use of = in a conditional context"
-Wb          Unknown pragma
-Wc          Unused
-Wd          Deprecated declaration foo() - give arg types
             and Old-style function
-We          cast between function pointer and non-function object
-Wf          Inventing "extern int foo()"
-Wg          Unguarded or wrongly guarded header file (+)
-Wh          'format' arg to printf/scanf/_swixs etc. is variable
-Wi to -Wk   Unused
-Wl          Lower precision in wider context
-Wm          Unused
-Wn          Implicit narrowing cast
-Wo          Unused
-Wp          non-ANSI #include <...>
-Wq          Unused
-Wr          (_swix) Format requires x parameters, but y given
-Ws to -Wt   Unused
-Wu          Use of C++ keyword (+)
-Wv          Implicit return in non-void context
-Ww          Unused
-Wx          xxx declared but not used
-Wy          Unused
-Wz          Undefined macro in #if - treated as 0


List of all error suppression flags
===================================
(* = documented in Acorn C/C++ manual)
(! = documented in manual, but not actually implemented in the compiler)

Option       Meaning

-ea          Unused
-eb          Unused
-ec          Implicit cast (*)
-ed          Unused
-ee          Unused
-ef          Unclean casts (eg. short to pointer) (!)
-eg          Unused
-eh          Unused
-ei          Suppress syntax checking for #if (!)
-ej          Unused
-ek          Unused
-el          Unused
-em          Same as -epz -fq
-en          Unused
-eo          Unused
-ep          Junk at end of preprocessor line (*)
-eq to -ey   Unused
-ez          Zero-length array (*)


Extra notes
===========

#warning directive
------------------
Although #warning is not part of the ISO C standards that cc implements, it
is widely supported by other compilers in practice, and cc supports it as of
version 5.71. The usage is similar to #error, for example:

#warning This is a warning

will result in cc emitting the line:

"myfile.c", line x: Warning: #warning encountered "This is a warning"

and generating an object file anyway. Because this is a non-standard extension,
it is not permitted if you specify -fussy or -strict.
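
A common use is to flag a deprecated build option without stopping the
build. The following is a minimal sketch; LEGACY_BUFFERS and BUFFER_SIZE are
hypothetical names invented for the illustration.

```c
/* LEGACY_BUFFERS is a hypothetical build option used only in this sketch. */
#ifdef LEGACY_BUFFERS
#warning LEGACY_BUFFERS is deprecated, using small buffers
#define BUFFER_SIZE 64
#else
#define BUFFER_SIZE 256
#endif

/* Returns the buffer size chosen at compile time. */
int buffer_size(void)
{
    return BUFFER_SIZE;
}
```

The warning is only emitted when LEGACY_BUFFERS is defined, because
directives in a skipped #ifdef group are not processed.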

restrict
--------
Previously the restrict qualifier was recognised, but didn't affect code
generation. As of version 5.59, restrict will lead to improved code in some
circumstances. See the "Restricted pointers" section in Chapter 6 of the C/C++
manual for further details.
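
As a hedged illustration, the function below is the kind of loop where
restrict can help. The name scale_add is hypothetical, and whether the
generated code actually improves depends on the circumstances.

```c
/* Adds k * src[i] to dst[i] for n elements. The restrict qualifiers
   promise the compiler that dst and src do not overlap, so values read
   from src need not be treated as changed by the stores to dst. */
void scale_add(int n, float *restrict dst, const float *restrict src, float k)
{
    int i;

    for (i = 0; i < n; i++)
        dst[i] += k * src[i];
}
```

Calling such a function with overlapping arrays breaks the promise and gives
undefined behaviour, so restrict should only be added where the caller can
guarantee the arrays are distinct.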

SMULxy/SMLAxy
-------------
These instructions, new in architecture 5TE, provide the ability to work on
16-bit signed numbers packed into 32-bit words, and are of particular use in
certain signal processing applications (such as video decompressors).

The compiler can now generate these instructions for multiplications of
narrow, signed values. (Previously they could only be accessed from inline
assembler).

For example, 

      int mul1(short *a)
      {
          return a[1]*a[2] + a[3]*a[0];
      }
      
will compile as:

   mul1
        LDRH     a2,[a1,#2]
        LDRH     a3,[a1,#4]
        SMULBB   a2,a2,a3
        LDRH     a3,[a1,#6]
        LDRH     a1,[a1,#0]
        SMLABB   a1,a3,a1,a2
        MOV      pc,lr

As this example illustrates, the compiler actually loads shorts into separate
registers, limiting its ability to take full advantage of the unpacking
ability. To give the compiler a hint, you can either manually unpack the
values from ints (using masks and shifts), or use bitfields. For example:

    struct hl
    {
        signed int l:16,h:16;
    };
    
    int mul2(struct hl *a)
    {
        return a[0].h*a[1].l + a[1].h*a[0].l;
    }

will compile as:

   mul2
        LDR      a2,[a1,#0]
        LDR      a1,[a1,#4]
        SMULTB   a3,a2,a1
        SMLATB   a1,a1,a2,a3
        MOV      pc,lr

Such packing may result in reduced performance with other operations.
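
The other approach mentioned above, unpacking from ints by hand, might look
like the sketch below. The names LO16, HI16 and mul3 are hypothetical, a
32-bit int is assumed, and the code relies on signed right shifts being
arithmetic, as this compiler defines them. Whether this exact form is
compiled to SMULxy/SMLAxy has not been verified here.

```c
/* Signed low and high 16-bit halves of a 32-bit int. The low half is
   moved to the top with an unsigned shift, then sign-extended by an
   arithmetic shift right. */
#define LO16(w) ((int)((unsigned int)(w) << 16) >> 16)
#define HI16(w) ((w) >> 16)

/* Same computation as mul2, on packed ints rather than bitfields. */
int mul3(const int *a)
{
    return HI16(a[0]) * LO16(a[1]) + HI16(a[1]) * LO16(a[0]);
}
```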

-apcs /fpregargs
----------------
From version 5.58 onwards the calling convention for /fpregargs has been
changed to align better with ARM's later tools. Note that /fpregargs is not
normally used under RISC OS. The changes are:

1) FP arguments are never passed in FP registers for variadic functions.
   Instead they're passed in integer registers or on the stack. In -pcc mode,
   FP registers are not used unless the first argument to a function is
   floating. (See ARM's ATPCS documentation for an explanation of this
   heuristic.)
   
2) Homogeneous floating-point structure arguments are now passed in
   floating-point registers. For example, a structure consisting of 3 floats
   (only) will be passed in 3 adjacent floating-point registers. This
   includes complex numbers, which will be passed in a pair of registers. If
   the appropriate number of registers is not available, then the entire
   structure is passed on the stack - a structure cannot be split between FP
   registers and the stack.
   
3) Homogeneous floating-point structure returns, using __value_in_regs,
   are now returned in F0-F3, regardless of the APCS setting. Complex
   numbers are returned in F0 and F1.
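
For illustration, the sketch below shows a homogeneous structure of the kind
points 2 and 3 refer to. The type vec3, both functions and the VALUE_IN_REGS
macro are hypothetical, and the placement of __value_in_regs before the
return type is an assumption.

```c
/* __value_in_regs is only meaningful to this compiler, so it is hidden
   behind a macro (hypothetical name) that expands to nothing elsewhere. */
#ifdef __CC_NORCROFT_VERSION
#define VALUE_IN_REGS __value_in_regs
#else
#define VALUE_IN_REGS
#endif

/* A homogeneous structure: three floats and nothing else. */
typedef struct { float x, y, z; } vec3;

/* With /fpregargs, each argument is a candidate for three adjacent
   FP registers, if that many are free (point 2). */
float vec3_dot(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* The result is a candidate for return in FP registers (point 3). */
VALUE_IN_REGS vec3 vec3_scale(vec3 v, float k)
{
    vec3 r;

    r.x = v.x * k;
    r.y = v.y * k;
    r.z = v.z * k;
    return r;
}
```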

-arch
-----
The compiler now tunes its output to match processor timing characteristics,
depending on the setting of -cpu or the new option -arch. In particular, the
number of cycles for LDR and MUL are considered.

The new command-line argument -arch, when used on its own, is equivalent to
-cpu, except it only accepts architecture names (eg "-arch 5TE"). When both
-cpu and -arch are used simultaneously, the compiler will optimise for the
processor/architecture given by -cpu, but generate code that will run on the
architecture given by -arch.

This allows the user to request optimising for a new processor while not
ruling out the code being run on an older one.

For example:
                              
      default                 code optimised for XScale, runs on ARMv3
      -cpu ARM7TDMI           code optimised for ARM7TDMI, runs on ARMv4T
      -cpu 5                  code optimised for a typical ARMv5 processor,
                              runs on ARMv5
      -arch 5                 same as -cpu 5
      -arch 4T -cpu XScale    code optimised for XScale but runs on ARMv4T
      -cpu ARM2 -arch 5TE     code optimised for ARM2, but requires ARMv5TE
                              (nonsensical, but allowed)

-arch and -cpu can be specified in either order. Only the last -cpu and last
-arch given on the command line are significant.

Scheduling
----------
As of version 5.63 the scheduler re-orders instructions to tune performance
for the chosen CPU. Regardless of the precise CPU the code is tuned for, any
scheduling is generally preferable to none for all CPUs after the ARM7. It
will also schedule for the FPA, if an ARM7 is selected.

The CPU to optimise for is selected via the -cpu command-line option; the
default is XScale. If backwards compatibility is required, the -arch option
should be used in conjunction with -cpu.

Scheduling can be disabled on a per-function basis with

  #pragma no_optimise_schedule
  
or on a per-file basis with the command-line option -zpl0.

Scheduling is also disabled if debugging is enabled with -g.
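
A minimal sketch of the per-function form follows. The function checksum is
hypothetical. Re-enabling scheduling afterwards with #pragma
optimise_schedule (the long form of -l1) is an assumption about how the
pragma's scope is ended.

```c
#ifdef __CC_NORCROFT_VERSION
#pragma no_optimise_schedule    /* leave this function's code unscheduled */
#endif
/* Sums n bytes starting at p. */
unsigned int checksum(const unsigned char *p, unsigned int n)
{
    unsigned int sum = 0;
    unsigned int i;

    for (i = 0; i < n; i++)
        sum += p[i];
    return sum;
}
#ifdef __CC_NORCROFT_VERSION
#pragma optimise_schedule       /* assumed to turn scheduling back on */
#endif
```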

Enums
-----
Since version 5.63 enums can be different sizes. Depending on their values
they can be contained in a signed char, unsigned char, signed short,
unsigned short, signed int or unsigned int.

For backwards compatibility, this is not the default. The default remains
to have all enums be stored as ints. The command-line option -fy selects
this explicitly.

To enable variable-sized enums, use the option -ft.
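
The effect can be observed with sizeof. In the sketch below (all names are
hypothetical) the comments give the container each enum could use under -ft,
going by the list of types above. With -fy every size equals sizeof(int).

```c
#include <stddef.h>

enum colour { RED, GREEN, BLUE };          /* values fit an unsigned char */
enum step   { BACK = -1, STAY, FORWARD };  /* values fit a signed char */
enum big    { LIMIT = 100000 };            /* too large for 16 bits */

/* Report the storage chosen by the compiler for each enum. */
size_t colour_size(void) { return sizeof(enum colour); }
size_t step_size(void)   { return sizeof(enum step); }
size_t big_size(void)    { return sizeof(enum big); }
```

Because the size of a structure containing an enum can then differ between
the two modes, it seems wise to compile all files that share such structures
with the same setting.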

Intrinsic functions
-------------------
These are functions which are built into the compiler and will be output inline
to perform assorted low level operations. They are ARM architecture specific,
so should be avoided in source code intended to be used across other platforms.

The syntax matches that of ARM's compiler:

  void __breakpoint(int arg);
  Creates a hardware BKPT instruction at the point the intrinsic is used
  with <arg> encoded into it. It is an error to use anything other than 
  a constant integer for <arg>. Note that BKPT will cause an undefined 
  instruction prior to ARMv5, and a prefetch abort if there is no debugger
  attached.

  unsigned int __current_pc();
  Returns the value of the program counter at the point the intrinsic is used.

  unsigned int __current_sp();
  Returns the value of the stack pointer at the point the intrinsic is used.

  void __nop(void);
  Emits a single no operation instruction, and flushes the compiler's
  scheduler. The instruction will not be optimised away (unless it is 
  unreachable). Note that the exact instruction used for a no operation may
  vary based on the selected architecture.

  unsigned int __return_address();
  Returns the value of the link register that would be used to return from
  the current function. Note that inlining and tail call optimisation may
  mean that the return address is actually that of the outer function.

  void __schedule_barrier();
  This intrinsic flushes the compiler's scheduler without any instructions
  being output. Compare this with the __nop() intrinsic.

  int __semihost(int reason, const void *arg);
  Calls the semihosting SWI 0x123456 with <reason> and supplementary <arg>
  data. For RISC OS this intrinsic should not be used as all SWI numbers
  must be allocated, and none is reserved for semihosting purposes.
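
Since the intrinsics are specific to ARM compilers, portable code may want
to hide them behind macros. The sketch below is one way of doing so; the
macro and function names are hypothetical, and the guard assumes a cc
version new enough to provide the intrinsics, so adjust it as needed.

```c
#ifdef __CC_NORCROFT_VERSION
#define SPIN_PAUSE()  __nop()              /* one real no-op instruction */
#define CALLER()      __return_address()   /* link register value */
#else
#define SPIN_PAUSE()  ((void)0)            /* fallback: does nothing */
#define CALLER()      0u                   /* fallback: address unknown */
#endif

/* Executes SPIN_PAUSE() count times and returns the number of
   iterations performed. */
unsigned int spin(unsigned int count)
{
    unsigned int i;

    for (i = 0; i < count; i++)
        SPIN_PAUSE();
    return i;
}

/* Returns the address this function will return to, or 0 when the
   intrinsic is unavailable. */
unsigned int caller_address(void)
{
    return CALLER();
}
```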

Arm C Language Extension (ACLE) compatible macros
-------------------------------------------------
The instruction set architecture macros, given in the ACLE 2021Q2 ssec-ATisa
specification, are supported. They reflect the selections made on the command
line via -arch and -cpu, and are useful where, for example, selecting a
radically different algorithm for different cores is worthwhile.

The predefines are:

  __ARM_ARCH           Set to the architecture number, so ARMv7 would be 7.
                       From ARMv8.1 onwards the value is 100 * the whole
                       part plus the fractional part, so ARMv8.1 would be 801.
  __ARM_ARCH_ISA_THUMB States the level of Thumb support. Set to either 1 or
                       2, or unset if the selected CPU does not support Thumb
                       at all.
  __ARM_ARCH_ISA_A64   Defined if the selected CPU has AArch64 support.
  __ARM_ARCH_ISA_ARM   Defined if the selected CPU has AArch32 support.
  __ARM_32BIT_STATE    Defined if the code produced targets AArch32.
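
A sketch of how the predefines might be tested in source follows. The
function names and the choice of ARMv6 as a threshold are hypothetical, and
the #ifdef tests let the same file build with compilers that do not provide
the macros.

```c
/* Returns the architecture number the code was compiled for, or 0 when
   the compiler does not provide __ARM_ARCH. */
int target_arm_arch(void)
{
#ifdef __ARM_ARCH
    return __ARM_ARCH;
#else
    return 0;
#endif
}

/* Returns 1 if the build selected ARMv6 or later, else 0. A real program
   would put its alternative algorithms in the two branches. */
int use_v6_path(void)
{
#if defined(__ARM_ARCH) && __ARM_ARCH >= 6
    return 1;
#else
    return 0;
#endif
}
```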


Changes in versions 5.07 to 5.10 of the compiler
================================================

* New warnings and warning suppression flags:

-Ws   suppression of "module has init to static data"
-Wr   suppression of _swix format warnings
-Wc   suppression of "use of reserved C++ keyword" warnings
-Wb   suppression of unknown pragma warnings
-Wg   suppression of non-const format parameter warnings
        (..printf, ..scanf, _swix)

* The compiler now knows (with #pragma -v4 in swis.h) about _swi
  and _swix and will check that the correct number of parameters
  have been supplied and that they are of suitable types.
* The compiler now identifies *which* function parameter it is 
  grumbling about when it has a complaint about one of them.


Changes in version 5.11 of the compiler
=======================================

* -Ws is assumed and ignored, but not faulted.
* -ccversion <version * 100> option is supported.  eg. --ccversion 512
  will abort compilation if you aren't using version 5.12 or later.
* A new macro __CC_NORCROFT_VERSION with a numeric value equal to the
  version number * 100 is predefined.
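
The macro allows the same kind of check to be made from within the source.
A minimal sketch, in which the function name norcroft_version is
hypothetical:

```c
/* Refuse to build with versions of cc older than 5.12. Other compilers
   do not define the macro, so they skip the check. */
#if defined(__CC_NORCROFT_VERSION) && __CC_NORCROFT_VERSION < 512
#error This file needs cc 5.12 or later
#endif

/* Returns the compiler version * 100, or 0 if not built with cc. */
int norcroft_version(void)
{
#ifdef __CC_NORCROFT_VERSION
    return __CC_NORCROFT_VERSION;
#else
    return 0;
#endif
}
```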


Changes in version 5.12 of the compiler
=======================================

* Signed shift right following a bitwise AND with a constant used to do
  a logical shift, not arithmetic.  This is now fixed.  (C standard says
  that the LSR/ASR choice is implementation defined - and our manual
  defines it as ASR)
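
  The sketch below shows the general shape of expression described. The
  function is hypothetical and is not claimed to be a case that the old
  bug actually miscompiled.

```c
/* Clears the low four bits, then shifts right by four. For negative x
   the result stays negative only if the shift is arithmetic. */
int high_part(int x)
{
    return (x & ~15) >> 4;
}
```

  With an arithmetic shift high_part(-32) gives -2. A logical shift would
  give a large positive number instead.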


Changes in versions 5.13 to 5.19 of the compiler
================================================

* SFMFD instructions are now generated correctly, along with other floating
  point instruction fixes; LDR with writeback to sp as the base register is
  done properly to avoid interrupt holes; use of single register LDM and STM
  instructions is minimised.
* More warning suppression options added.
* "Unused earlier static declaration of '<symbol>'" is only generated in
  fussy mode.
* Unused symbols starting with the 6 character prefix __link are no longer
  warned about (because the linker set stuff nearly always should not be
  referenced - that's what the linker does)
* The following symbols may be predefined depending on the APCS variant in
  use: __APCS_32, __APCS_FPREGARGS, __APCS_NOFP, __APCS_REENT, __APCS_NOSWST.


Changes in versions 5.20 to 5.36 of the compiler
================================================

* Fix for obscure bug involving signed char array dereference by signed char
  index
* RISC OS module static data relocations are now understood as a general data
  access concept similar to using a static base register, and can now be
  optimised properly - helps register allocation and caching of the static
  data relocation offset.
* Self-referential structures are now trapped better (the following used to
  cause an infinite loop:  struct foo { struct foo x[2]; })
* Optimisation fix for LDR/STR with writeback when the stack pointer is the
  base register - this could have resulted in valuable data being on the stack
  below the stack pointer.
* A bug in the register usage detection has been fixed, stopping accidental
  corruption of registers.
* Now defaults to APCS-32 (ie /32bit/fpe3).
* Fix to occasional incorrect code transferring FP registers into integer
  registers.
* Fix to common string literal detection code which occasionally crashed.
* Fix to internal inconsistency failure - failure to remove FP no-ops.
* Some other compiler crashes fixed.
* Frame pointer no longer assumed to be word-aligned if using a /nofp APCS.
* *(int *)"TASK" no longer crashes the compiler in MemCheck mode.
* Invoking data as a function no longer crashes the optimiser.
* Lots of new peepholes for FPA code.


Changes in versions 5.37 to 5.39 of the compiler
================================================

* Common string literal detection code fixed again.


Changes in versions 5.40 to 5.50 of the compiler
================================================

* Numerous C99 features added (see separate documentation).
* Peephole for MVN+AND -> BIC extended to handle immediate shifts.
* #pragma force_fpargs_in_regs ("g2" for short)
* All FP comparisons tightened up to ensure correct behaviour with unordered
  operands; generated code will work correctly independently of state of AC
  bit in FPSR
* Compile-time NaN and infinity handling tightened up - this was previously
  untested as there was no way of creating NaNs or infinities at compile-time
* Improved behaviour (and speed) for casts from float to unsigned.
* <FP op>D, MVFS -> <FP op>S peephole strongly tightened to not violate our
  claim of FLT_EVAL_METHOD 0
* Extra peepholes for ABSD MVFS; MVFS MNFS; MVFS ABSS.
* Raw IEEE 754 floating-point constants (0d_xxxxxxxxxxxxxxxx and
  0f_xxxxxxxx), used to implement NAN and INFINITY macros
* Added "__caller_narrow" keyword - not for external use, just for function
  inlining (__r_cos et al). At least until it works fully.
* Added inline functions for floor[f], ceil[f], trunc[f], rint[f] and lrint[f].
* Added software floating point support (if you have a library).
* Empty initialiser lists now diagnosed (illegal in ANSI C)
* Code generation improvements (in particular the ability to defer saving
  registers at the start of a function, and register assignment of structure
  fields).
* "-cpu" and "-memaccess" options, as per ARM ADS (-memaccess currently
  ineffective)
* Bitfield debugging information now output for DDT


Changes in versions 5.51 to 5.53 of the compiler
================================================

* Incorrect code generated by the new "structure splitting" optimisation
  in certain cases fixed.
* Attaching "static" to a block-scope function declaration is now an error
  rather than a serious error.
* Callee-narrowing of function arguments was not occurring in old-style
  functions.
* Casts from long long types to short or char did not work correctly. This
  affected the build of the compiler, meaning that all compile-time
  narrowings to char or short were ineffective.
* Assignments of an array to a pointer in pcc mode were not marking the
  array as having its address taken, leading to some incorrect optimisations
  (notably structure splitting).
* Errors in conditions were causing internal compiler errors
  ("cg_expr(1 = <previous error>)").
* Badly malformed function definitions could cause a compiler crash.
* Some inlined long long operations were being incorrectly conditionally
  executed, potentially producing incorrect results.
* In some very rare cases, backwards references to already-used string
  literals were wrong - probably only when the compiler inlined a strcpy
  from the literal.
* #if __STDC_VERSION__ > 199901 corrected to >= 199901 in <stdlib.h>
* strtoll, strtoull, atoll, imaxabs, imaxdiv, strtoimax and strtoumax
  added to C library (SharedCLibrary 5.45 required).


Changes in version 5.54 of the compiler
=======================================

* __abs operator added, used to inline abs() and labs().
* __inline is a synonym for inline, but available in pre-C99 modes.
* ___select keyword added, to implement <tgmath.h>.
* bool bitfields caused problems in condition contexts.
* Various header file updates to match newer C libraries.
* Various fixes to flowgraph analysis - some shrinkwrap optimisations
  produced broken code, another never actually happened.


Changes in versions 5.55 to 5.56 of the compiler
================================================

* Added support for __packed structures and pointers.
* Inline assembler (as per ARM ADS).
* Added Thumb interworking (-APCS /inter).
* int f() { } is now correctly treated as an old-style function - use
  int f(void) { } instead.
* Improved error reporting for function arguments.
* C99 pragmas added.
* C99 concatenation of wide and normal string literals supported.
* Universal character name and UTF-8 support.
* Support for LDRH etc added. 
* -Otime and -Ospace now have more effect.
* CSE optimises long long expressions.
* Optimisation of static integer and floating const variables.
* long long shifts by constant much improved.
* All long long multiplies will now be inlined (if -cpu 3M or later
  and -Otime). Can also generate SMLAL and UMLAL.
* ARMv5 BLX instruction used for faster function pointer calls.
* Other more complex function pointer forms (such as MOV lr,pc; LDR pc,xxx)
  reinstated; these had been disabled at some point between 5.02 and 5.54
  as a bug workaround.
* Very slow CSE analysis of complex functions (eg various cyphers) fixed.
* Header file guard #defines interpreted and handled internally (a speed
  optimisation which saves having to reload header files).
* Improved compile-time evaluation of floating arithmetic.
* Able to inline more <math.h> functions. Now knows the CNF instruction.
* FPA float->double conversions no longer eliminated (for correct
  behaviour with signalling NaNs and underflow traps).
* Fixes to inline functions with difficult return types.
* Trailing padding removed from structs with flexible array members.
* Many general code generation improvements and bug fixes.
* Constant data placed in read-only area.
* Mapping symbols ($d/$a) now placed in object file to aid DecAOF and DDT.
* long long variables now output correctly in debugging information,
  (requires latest version of DDT).
* Changes to warning control options (see below).
* -zz and -zt options to control data allocation.
* APCS-A and APCS-M support removed.


Changes in version 5.57 of the compiler
=======================================

* Fixed some cases where ADDS x,y,#0 was being turned into MOVS x,y
  when the C flag was needed.
* Minor fix to __packed.
* Long long literals were stored wrongly in big-endian mode.
* Shrinkwrap optimisation fixed for cases where CSEed comparisons were
  expecting flags preserved across the entry sequence.
* Tail-call recursion args-in-vregs optimisation fixed.


Changes in versions 5.58 to 5.60 of the compiler
================================================

* Added support for switch on a long long expression.
* Added complex and imaginary number support (<complex.h>, including
  C99 Annex G).
* _Bool (and _Complex/_Imaginary) now available in pcc and C90 modes.
* MLA instruction used more often, and handled more intelligently.
* CSE aliasing optimisations take account of "restrict" qualifier.
* Optimised handling of narrow (<32 bit) data and computations.
* Will use ARMv5E's SMULxy and SMLAxy instructions where appropriate.
* Improved checking of printf/scanf format strings.
* Inlines signbit().
* Improved compilation of (int) longlong, (int) longlong >> 32, and
  longlong >> 1 or << 1. Also, long long multiplication and division by
  powers of two transformed into shifts.
* Can transform integer division by constant into a 32x32->64
  multiplication (if available on CPU, and -Otime selected).
* Pointer subtraction optimised to use only multiply and/or shift.
* Added inter-statement compile-time evaluation of long long arithmetic.
* Improved CSE handling, especially for FP constants and comparisons.
* IEEE 754 conformance improved; generally edging the compiler closer to
  C99 Annex F. 
* asm keyword recognised in C++ mode.
* -arch command-line parameter added.
* Some improvements to treatment of volatile objects.
* Changes to handling of floating arguments for -apcs /fpregargs.
* Numerous code generation improvements and bug fixes.
* Banner and help now sent to stderr instead of stdout.


Changes in versions 5.61 to 5.64 of the compiler
================================================

* Added instruction scheduling.
* Various other performance enhancements in low-level code generation.
* Various CSE enhancements.
* Copy-propagation optimisation added to register allocator.
* Now defaults to tuning performance for XScale, while remaining ARMv2
  compatible (ie -cpu XScale -arch 2).
* BLX opcode added to inline assembler.
* Enums can now use smaller containers rather than always being int-sized.
* Some hard-coded function size limits increased.
* Various bug fixes.


Changes in version 5.65 of the compiler
=======================================

* Fixed code generation errors related to packed long longs, and to passing
  packed values as arguments to functions.


Changes in versions 5.66 to 5.68 of the compiler
================================================

* Change to unaligned load behaviour. The default setting is now not to use
  unaligned loads as a shortcut to rotate the aligned 32-bit value, because the
  behaviour of unaligned loads has changed (optional in ARMv6, mandatory in
  ARMv7). The same thing can be achieved on earlier versions using
  -memaccess [+/-]L22[+/-]S22-L41, -zh2[+/-][+/-]- or -za1.


Changes in version 5.69 of the compiler
=======================================

* Added the following code generation and optimisation options: -arch 6k,
  -arch 6t2, -arch 7 and -cpu cortex-a8. When -arch 7 or -cpu cortex-a8 is
  specified, a new set of scheduling rules is selected which targets an
  approximation of the Cortex-A8.
* Multiple bug fixes for internal inconsistency errors and compile-time long
  long arithmetic.
* Code generation improvement: unaligned halfword loads/stores fabricated from
  LDRB/STRB instructions can access larger offsets from the base register than
  LDRH/STRH instructions.


Changes in versions 5.70 to 5.71 of the compiler
================================================

* The #warning pre-processor directive is now supported.
* You can now use -Wz to suppress "Undefined macro in #if - treated as 0"
* Code tuning for ARM11 now available. To generate code scheduled for the
  ARM11, use '-cpu ARM1176JZF-S'. As before, if compatibility with earlier
  architectures is desired, you must specify a -arch switch in addition.
* The built-in standard C headers now incorporate the changes from the
  Shared C Library, up to version 5.64. This includes a correction to FOPEN_MAX
  as well as support for 64-bit file pointers. wchar.h and wctype.h are also
  included for the first time (but note that the Shared C Library does not
  implement the functions defined therein at the time of writing).
* Improved error message "Error: linkage disagreement for 'xx' - inconsistent
  '__global_reg' usage". This means that you have attempted to declare a
  variable with both __global_reg and either static or extern linkage.
* New error message "References to non-const static data are not supported when
  using -zM with APCS /noswst". Code generated with the -zM switch uses the
  stack limit register to locate its static data, but you have specified a
  calling standard that doesn't use a stack limit register.
* Fixed a bug where data on the stack could be liable to corruption by
  interrupts (for privileged mode code) or signals (for user mode code) in a
  few circumstances, where the stack pointer aliased another register or was
  used for zero-initialising variables.
* Fixed a bug where a 64-bit integer addition of a small number to itself was
  sometimes performed incorrectly.
* Fixed a bug where designated initialisers didn't work correctly with union
  members of structs.
* Static structs can now be initialised using compound literals.


Changes in version 5.72 of the compiler
=======================================

* Fixed an accidental NULL pointer access when the compiler was called with
  the -C++ switch on the command line. This caused an abort when used on a
  version of RISC OS with zero page relocation.


Changes in versions 5.73 to 5.74 of the compiler
================================================

* Module relocation offsets are now annotated correctly in the disassembly
  produced by calling the compiler with switches -zM -S at the same time.
* Following an inlined memset() of structures where a loop is deemed 
  necessary, and the structure size can't be expressed as an ARM immediate,
  any subsequent use of a pointer to the structure is now correctly based.
* Fixed a bug where a mask operation comprising a shift left followed by a
  shift right, used as an array index (which needs another shift left to
  compute the byte offset) was incorrectly optimised.
* The scheduler will now track LDM instructions which are indirectly reading
  the stack via another base register. This ensures the instructions aren't
  scheduled to a point before the write of the data which they want to load.
* Integer expressions of the form
    a = b - (c * d)
  will now produce the MLS instruction when architecture 6T2, or greater, is 
  selected. 
  The inline assembler similarly now accepts MLS. During transformation if the
  target architecture doesn't support it, a MUL/SUB pair will be substituted.
* Fixed a CSE bug where an inlined function whose arguments comprised a string
  that was short enough to fit in the literal pool, and where that same string
  had pointer arithmetic performed on it, the resulting ADR instruction output
  would have the top 16 bits of the instruction set to 0xFFFF, which is an 
  illegal opcode.
* Fixed a CSE bug where the 2nd result from a built in function (such as 
  div/rem) was needed and where the function could be evaluated at compile
  time. The compiler would instead keep track of the 1st result so a later
  use of the result would incorrectly substitute some earlier value. Note that
  run time evaluated functions were not affected.
* The -cpu switch now recognises additional architectures 2a/3g/5tewmmx/
  5tewmmx2/6z for consistency with ObjAsm. They are otherwise treated as
  synonyms for 2/3/5tej/5tej/6k respectively.


Changes in versions 5.75 to 5.76 of the compiler
================================================

* Compare operations involving the PC introduced via the inline assembler are
  no longer faulted. This allows the common idiom
    TEQ pc, pc
  to be used to check if the processor is in 32 bit mode. Other ALU operations
  involving the PC are still faulted because the result would be undefined.
* An improvement has been made to the register allocation when a substitution
  of MLS for MUL/SUB is used.
* The BKPT instruction is now accepted by the inline assembler, allowing
  program flow to be stopped with a suitable JTAG debugger attached using
    __asm { BKPT #0x5678 }
  The constant has no significance to the compiler, so it can be used to
  distinguish between several such breakpoints, for example by using the
  preprocessor symbol __LINE__.
  Note that BKPT is only available from ARMv5 and later. If no JTAG debugger
  is attached the processor will cause a prefetch abort.


Changes in version 5.77 of the compiler
=======================================

* Fix to the opcode output when PLD is used from the inline assembler.
  Previously the condition bits were wrong, as PLD is always unconditional.
  The parser will now also give an error if the source code tries to use
  condition codes other than an implied or explicit AL.
* Fixed a CSE bug where using both results from a 2 result function (such
  as div/rem) would reuse the first result in any subsequent calculation
  if the exact same subsequent calculation was also performed on the second
  result. So 11*a/3 and 11*a%3 were affected, whereas 21*a/3 and 11*a%3 were not.
* Fixed a CSE bug for inlined memsets that were a subset of a larger area
  such as a structure whose values were used both before and after the inlined
  memset. The stale values from before the memset could be reused after it,
  rather than loading the later value of zero. 
* The command line keyword -fpu is now reserved for compatibility with ObjAsm.
* The preprocessor no longer faults function-like expressions nested inside
  an outer #if (or #elif) which evaluated to 0. Previously their syntax was
  checked even if not needed, contrary to the description in ISO9899, so
    #if defined(__colourof)
    #if __colourof(banana)==YELLOW
      grill(banana);
    #endif
    #endif
  now silently ignores the inner 'if' when __colourof is undefined.
* The -cpu switch now recognises architectures 7-a/7-r/7-a.security/8-a
  and Cortex-A5/Cortex-A7/Cortex-A9/Cortex-A15 for consistency with ObjAsm. 
  For code generation purposes they are otherwise treated as synonyms 
  for 7/Cortex-A8 respectively.
* Using SWP[B] in the inline assembler for a -cpu or -arch that doesn't support
  that instruction will now emit a warning.
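
The div/rem CSE fix above concerns the same calculation being applied to both
results. A minimal sketch of that source shape follows, using the 11*a/3 and
11*a%3 pair quoted in the note. The struct and function names are hypothetical,
and the sketch only shows the affected pattern; it is not the compiler's own
test case.

```c
/* Hypothetical container for the two results. */
struct div_rem {
    int quotient;
    int remainder;
};

/* Both results of the division are wanted, and the same preceding
 * calculation (11 * a) feeds each of them. */
struct div_rem scaled_div_rem(int a)
{
    struct div_rem result;

    result.quotient  = 11 * a / 3;
    result.remainder = 11 * a % 3;
    return result;
}
```

With a equal to 5 the quotient is 18 and the remainder is 1, so a remainder
which matches the quotient would indicate the first result had been reused.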


Changes in version 5.78 of the compiler
=======================================

* When the option -apcs/interwork is in force, symbol __APCS_INTERWORK is now
  predefined by the preprocessor as for the other available -apcs options.
* The maximum size limit for any statically declared array or structure has
  been increased to 16MB. Previously this was inconsistently checked as 8MB
  and 16MB internally in different places, with sizes above 8MB leading to a
  fault in some circumstances.
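
Source code can test for the __APCS_INTERWORK symbol described above in the
usual way. The following is a minimal sketch; the macro BUILT_FOR_INTERWORKING
and the function name are hypothetical names used only for this example.

```c
/* __APCS_INTERWORK is predefined when -apcs/interwork is in force.
 * BUILT_FOR_INTERWORKING is a hypothetical convenience macro. */
#ifdef __APCS_INTERWORK
#define BUILT_FOR_INTERWORKING 1
#else
#define BUILT_FOR_INTERWORKING 0
#endif

/* Report whether this translation unit was compiled for interworking. */
int built_for_interworking(void)
{
    return BUILT_FOR_INTERWORKING;
}
```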


Changes in versions 5.79 to 5.80 of the compiler
================================================

* When -fpu None is specified, if a floating point operation is used in the
  source code an error is now given as a reminder that only integer code is
  allowed.
* The table of known CPUs has been further extended (to closely match ObjAsm)
  up to and including the Cortex-A57.
* Masking operations of the form
    a = b & constant
  where the constant is made up of a run of contiguous 0's which can't be
  expressed as a conventional ARM immediate constant will now produce
  the BFC instruction when architecture 6T2, or greater, is selected.
* Literals that can't be formed with a shifted 8 bit constant might now select
  a MOVW and/or MOVT instruction when architecture 6T2, or greater, is
  selected. In the event of a tie for space (compared with an LDR from the
  literal pool) the selection of -Otime and -Ospace has an effect.
  Relocations will similarly use MOVW/MOVT when deemed the best method of
  generating an address, though both will always be present since the area
  offset isn't known until link time (requires link 5.34 or later to resolve). 
* The response to -help has been updated with previously missing keywords from
  the Acorn C/C++ manual.
* Fixed a bug which could cause a memory read and subsequent sign extension
  to be emitted in the wrong order in some rare situations, such that the
  memory read came second.
* A number of intrinsic functions (see above for details) are built in to the
  compiler, allowing selected low level primitive operations to be performed
  using "C like" functions. These are expanded directly inline.
* When a NOP is required and architecture 6K, or greater, is selected a hint
  instruction is output instead of MOV r0,r0. 
* Peepholes will now look for common sequences expressed in C that have
  equivalents in architecture 6 onwards. Specifically:
  - REV to reverse endianness (see ARM document DDI0100E section 9.1.4 for
    an example sequence).
  - PKHBT or PKHTB when masking and combining two halfwords into one.
  - SXTB or SXTH sign extension when performing a left shift up by 8 or 16
    bits then an equal sized signed right shift down.
  - UXTAB or SXTAB when an unsigned/signed char type is added to.
  - UXTAH or SXTAH when an unsigned/signed short type is added to.
  - UBFX or SBFX when selected bits are extracted from a word. 
* A further improvement has been made to the register allocation when a 
  substitution of MLS for MUL/SUB is used.
* The inline assembler has been updated to add the extra instructions which
  were introduced in architecture 6 and later, up to and including ARMv8.4
  which is current at the time of writing.
  The same restrictions apply with regards to stack and program counter
  manipulation as previously. Therefore it is not possible to use the ERET,
  RFE, or SRS instructions as these manipulate SP or PC.
* A longstanding bug in the inline assembler parser has been fixed which
  confused those instructions whose mnemonic is also a substring of a longer
  mnemonic, when condition codes are taken into account.
  For example, SMLALBB and SMLALCS (SMLAL with 'CS' condition) can now be
  assembled.
* Type qualifiers, such as const, are now accepted by the classification
  macros in both <math.h> and <stdarg.h>. Previously this caused a safety
  check assertion to fail.
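
The masking and peephole items above describe sequences written in plain C.
The following sketch shows the kind of source which might be recognised. All
of the function names are hypothetical, and whether a given instruction is
selected depends on the architecture chosen and on the exact form the compiler
looks for, which these notes don't specify in full.

```c
#include <stdint.h>

/* Endianness reversal with shifts and masks (a REV candidate). */
uint32_t byte_reverse(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Mask and combine two halfwords into one word (a PKHBT candidate). */
uint32_t pack_halfwords(uint32_t low, uint32_t high)
{
    return (low & 0x0000FFFFu) | (high << 16);
}

/* Extract selected bits from a word (a UBFX candidate). */
uint32_t extract_field(uint32_t x)
{
    return (x >> 4) & 0xFFu;
}

/* Mask with a constant containing one contiguous run of zero bits which
 * isn't a conventional ARM immediate (a BFC candidate). */
uint32_t clear_middle_bits(uint32_t x)
{
    return x & 0xFF0000FFu;
}
```

For example byte_reverse(0x11223344) gives 0x44332211 regardless of which
instructions are used to compute it.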


Changes in versions 5.81 to 5.82 of the compiler
================================================

* Bug fix to BFI optimisation that could get combined with a peephole
  to result in incorrect code.
* The line number reported after a #warning or #error preprocessing directive
  is no longer out-by-1.


Changes in version 5.83 of the compiler
=======================================

* The built in header <wchar.h> has been updated. Previously a missing typedef
  resulted in errors when attempting to use it.
* Bug fix to LDRSB peephole whose sense was inverted at version 5.79. The
  result on ARMv4+ was to emit an LDRSB and an arithmetic shift right of 24,
  when only the former should have been emitted.
* Fix static initialisation of a pointer using an expression that reduced
  to an integer constant that was written as an array index. For example
    static void *ptr = (void *)&((char *)0)[57];
  would cause an internal assertion to fail.
* A spurious error is no longer given by the preprocessor when consuming
  unwanted #if (or #elif) directives followed by a quoted string.
* Right shift by a 64 bit variable of a signed 32 bit number no longer
  causes a fault due to an impossible cast. Note that ANSI states the result
  is undefined should the shift be more than 32 bits; this is warned as before.
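
A minimal sketch of the right shift construct described in the last item
above is given below. The function name is hypothetical, and int is assumed
to be 32 bits wide with long long being 64 bits wide.

```c
/* A signed 32 bit value shifted right by a 64 bit count. The result is
 * only defined while the count is smaller than the width of the value. */
int shift_right_wide_count(int value, unsigned long long count)
{
    return value >> count;
}
```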


Changes in versions 5.84 to 5.85 of the compiler
================================================

* Numerous C18 features added (see separate documentation).
* The built in headers have been updated with the C18 changes, and
  <stdalign.h> and <stdnoreturn.h> have been added. The header <uchar.h> is
  also available, but note that the Shared C Library does not implement the
  functions defined therein at the time of writing.
* When a packed struct is held only on the stack (as an automatic variable)
  it was possible that an unaligned LDR would be used to read a halfword
  even if the -memaccess switch requested not to. This has been fixed and
  will use LDRH, LDRSH, or a pair of LDRB's as appropriate.
* During optimisation the information about a packed struct could be lost
  resulting in an unaligned LDR of a word sized member even if the -memaccess
  switch requested not to.
* The error messages relating to oversized arrays declared inside a structure,
  and to checks where long longs or shorts are expected, have been improved.
* Conditional uses of the __breakpoint() intrinsic no longer emit a
  conditional BKPT instruction, as it can only use the 'AL' condition.
* EOR of a constant which can fit into an 8 bit immediate if inverted, such
  as 0xFFFF00FF, when debugging is enabled with -g, now produces correct code.
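
The two added headers provide the alignas, alignof and noreturn convenience
macros defined by the C standard. The following is a minimal sketch of their
use; the buffer and function names are hypothetical, and the separate C18
documentation remains the reference for what this compiler supports.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdnoreturn.h>

/* Hypothetical scratch buffer placed on an 8 byte boundary. */
static alignas(8) unsigned char scratch[32];

/* Hypothetical fatal exit routine which never returns to its caller. */
noreturn void give_up(void)
{
    exit(EXIT_FAILURE);
}

/* Check that the requested alignment was honoured. */
int scratch_is_aligned(void)
{
    return ((uintptr_t)scratch % 8u) == 0;
}

/* Report the alignment requirement of a double. */
size_t alignment_of_double(void)
{
    return alignof(double);
}
```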


Changes in version 5.86 of the compiler
=======================================

* Fix for the -desktop command line switch wanting to link with non-existent
  library C:hostlib.
* Fix for sometimes using the wrong offsets in anon structs containing pointer
  type variables.
* Stop the scheduler reordering instructions which used LDM from a stack
  offset of zero, as this could lead to incorrectly ordered code.
* The table of known CPUs has been further extended (to closely match ObjAsm)
  up to and including the Cortex-X1. Also, those that opted not to include the
  CRC instruction will now cause a warning if one is used in the inline
  assembler.
* An assignment to a structure from a structure returning function could
  ignore the subsequent result if there was an earlier reference to the same
  structure. This problem has been fixed and the result is re-read after the
  function call.
* Restore the special case from cc 5.83 of a halfword load from offset 2 when
  the base address is known to be word aligned but unaligned loads are
  configured off. Loading from offset 0 and shifting results in smaller code
  (2 instructions rather than 3) while still honoring the unaligned load
  setting, freeing up a register too.
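
The structure assignment item above can be pictured with a sketch like the
one below, where a member is referenced before the structure is overwritten
by a function result. The names are hypothetical and the precise conditions
which triggered the bug aren't given in these notes, so this only shows the
general shape.

```c
/* Hypothetical structure and structure returning function. */
struct point {
    int x;
    int y;
};

struct point make_point(int x, int y)
{
    struct point p;

    p.x = x;
    p.y = y;
    return p;
}

int sum_before_and_after(void)
{
    struct point p = { 1, 2 };
    int before = p.x;          /* earlier reference to the structure */

    p = make_point(10, 20);    /* assignment from the function result */
    return before + p.x;       /* p.x must be re-read after the call */
}
```

The expected result is 11, being 1 from before the call plus 10 from after.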


Changes in versions 5.87 to 5.88 of the compiler
================================================

* Functions which are inlined and have volatile formal arguments will now
  correctly propagate that volatile qualifier after inlining. Previously
  the volatility of the actual arguments was used.
* Fixed an internal error when a pointer variable which is passed in the
  arguments to an inline function is then used as the address by LDR (or STR)
  of an __asm statement in the inlined function.
* Don't try to inline fma() when the floating point unit is FPA.
* Fix for the space saving which shares literals in the literal pool when two
  adjacent doubles happen to share one identical word. Previously, the result
  was to produce one valid double constant and one invalid one.
* Functions called using BL or BLX from the inline assembler now take into
  account potential memory access side effects during CSE optimisation, unless
  the called function is explicitly marked pure.
* The CLREX inline assembler opcode and similar barrier operations are now
  flagged as having effects that the compiler may not be able to observe, and
  so will not be optimised away unless unreachable.
* Set the SoftFP flag on function symbols when the softfp APCS option is
  enabled. Note however that this option is not officially supported at this
  time.
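
One reading of the volatile item above is a function taking a pointer to
volatile data which is called with a pointer to ordinary data. The sketch
below assumes that reading; the function names are hypothetical.

```c
/* The formal argument is volatile qualified, so both reads should be
 * performed even after the function has been inlined. */
static inline int read_twice(volatile int *reg)
{
    int first  = *reg;
    int second = *reg;

    return first + second;
}

/* The actual argument is a pointer to a plain (non volatile) int. */
int sum_two_reads_of(int value)
{
    int plain = value;

    return read_twice(&plain);
}
```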


Changes in versions 5.88 to 5.89 of the compiler
================================================

* The values of CHAR_MIN and CHAR_MAX in <limits.h> are now adjusted when
  the -zc switch changes the default char representation.
* Fixed a fatal error when recovering from an earlier syntax error involving a
  ptrdiff_t.
* The code generator no longer emits LDM instructions with (ineffective)
  writeback when the base register is in the register list.
* Fix for selecting the wrong cast function for the softfp APCS option in
  certain circumstances.
* Fix for calculating the stack offset used by the __return_address()
  intrinsic for the nofp APCS option.
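
With the CHAR_MIN and CHAR_MAX change above, code can rely on <limits.h> to
discover the char representation in force. A minimal sketch follows; the
function names are hypothetical.

```c
#include <limits.h>

/* Non zero when plain char behaves as a signed type. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Check that the plain char limits agree with those of the matching
 * explicitly signed or unsigned type. */
int char_limits_are_consistent(void)
{
    if (plain_char_is_signed())
        return CHAR_MIN == SCHAR_MIN && CHAR_MAX == SCHAR_MAX;
    return CHAR_MIN == 0 && CHAR_MAX == UCHAR_MAX;
}
```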


Changes in version 5.90 of the compiler
=======================================

* Calling the _swix library function may now be inlined directly, subject
  to a few rules. This can skip the library call entirely by emitting a SWI
  instruction instead. The compiler knows the calling conventions of the
  most common SWIs and will substitute those which follow the usual APCS
  function calling rules and only take/corrupt values in R0-R3. An error is
  returned from R0 if V was set after the SWI, or NULL if there was no error.
* An erroneous memory access to below application space when a pointer type
  variable was used with _Generic has been fixed.
* A fatal error caused by using compound literals to initialise bitfields
  within a struct has been fixed.
* The <stdatomic.h> and <threads.h> header files reserved in the C18 standard
  aren't built into the compiler itself, but their use no longer causes a
  warning about non ANSI headers if they are supplemented on disc.
* Marking a function as noreturn will now apply more aggressive culling of
  unreachable code compared with earlier versions. Excessive warnings about
  potential non void returns from a caller who calls a non-returning function
  are no longer given.
* All of the ssec-ATisa predefines (see above for details) set out in the
  Arm C Language Extensions 2021Q2 specification are now implemented. These
  reflect the -arch and -cpu selections made on the command line.
* The sign of long long variables reported in debug tables when enabled with
  the -g option is now correct. Previously all 64 bit integers were marked
  as unsigned.
* Certain pathological constructs of nested goto statements used to form
  overlapping loops no longer cause a data abort when compiled.
* The feature z (-fz) switch has been withdrawn as it has had no effect since
  version 5.63 of the compiler.
* Tentative declarations that are then followed up by a definition such as
    const int myvar;
    const int myvar = 1;
  no longer result in an internal compiler error; the value is placed into the
  read only data area.
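
The noreturn item above concerns callers shaped like the following sketch,
where a non void function ends with a call to a function which never returns.
The function names are hypothetical.

```c
#include <stdlib.h>
#include <stdnoreturn.h>

/* Hypothetical failure routine which never returns to its caller. */
noreturn void fail_hard(void)
{
    abort();
}

/* Every path either returns a value or ends in fail_hard(), so there is
 * no path which falls off the end of the function without a value. */
int require_positive(int value)
{
    if (value > 0)
        return value;
    fail_hard();
}
```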
