

                    OS/2 WARP (POWERPC EDITION)
                    USING THE METAWARE COMPILER



PPC 604 COMPATIBILITY:
Compile your applications using the MetaWare hcoppc flag "-Hon=cpu_603".
Your application will then be compatible with both the PowerPC 601 and the
PowerPC 604.


METAWARE POWERPC COMPILER, ASSEMBLER, AND LINKER NOTES:
The following information applies to:

    hcoppc  R2.6h   (Compiler version)
    asppc   1.80    (Assembler version)
    ldppc   3.54    (Linker version)


NOTES:
       1. Some new compiler options have been added.  Refer to
          USING YOUR TOOLKIT (POWERPC EDITION) for new
          compiler options.


The ABI specifies support for long_long and long_double datatypes.  Support
for these datatypes is not present in the current system-provided libraries;
however, the compiler supports these datatypes in single-precision mode via
the options "-Hon=Long_long_8" and "-Hon=Long_double_16".


LIMITATIONS AND RESTRICTIONS:
OS/2 runtime libraries and compiler support now include full C++
support (the version shipped in the previous Beta level did not).


UNDOCUMENTED COMPILER OPTIONS:
-Hoff=off_fp_reference - this option specifies generation of debug data
    not using the stack frame pointer reference for offset values.
    This option is required for compatibility with the ICATPM debugger.
    This has been set as the default in the compiler .cnf file
    (hcae6k.cnf and hcoppc.cnf).

-Hon=zero_word_before_functions - this option generates a word of zeros
    prior to the code generated for each function; this is used for minimal
    tag word support, as specified in the ABI.

Push_align_members pragma usage note: you must provide an argument
value to the #pragma push_align_members(n) pragma; otherwise, the error
"Insufficient Number of Arguments in Pragma Specification" will occur.
Using #pragma push_align_members(64) will always cause the compiler to
revert to packing structures to the natural alignment size.


__FUNCTION__ pre-defined macro:

MetaWare supports a pre-defined macro named "__function()" that yields
a string identifying the current function. This can essentially replace
"__FUNCTION__" (IBM C/Set++ compiler macro) in code which used that macro
with the IBM C/Set++ compiler. MetaWare's "__function()" isn't identical
to "__FUNCTION__" in all respects, but can be used as a replacement in
most cases.

To enable mapping of __FUNCTION__ to __function() automatically, the
following lines can be added to existing source code:

    #if __HIGHC__
    #define __FUNCTION__ __function()
    #endif
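
Where the same source must also build with compilers other than MetaWare
High C, one possible approach is a private macro that selects the right
spelling. The sketch below is illustrative only: CURRENT_FUNCTION,
reports_own_name, and trace_enter are hypothetical names, and the fallback
assumes a C99 (or later) compiler that provides __func__. The exact text
returned by __function() may differ from the plain function name.

```c
#include <stdio.h>
#include <string.h>

#if defined(__HIGHC__)
#define CURRENT_FUNCTION __function()   /* MetaWare High C */
#else
#define CURRENT_FUNCTION __func__       /* assumption: C99 or later */
#endif

/* Returns 1 if the compiler reports this function's own name.
 * (With __function() the string may not be the bare name.) */
static int reports_own_name(void)
{
    return strcmp(CURRENT_FUNCTION, "reports_own_name") == 0;
}

/* Typical use: tag a trace message with the enclosing function. */
static const char *trace_enter(void)
{
    const char *name = CURRENT_FUNCTION;
    printf("entering %s\n", name);
    return name;
}
```

Using a private macro name avoids redefining __FUNCTION__ on compilers
that already treat it as a built-in identifier.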


UNDOCUMENTED LINKER OPTIONS:

    -Bpic               Issue warning message for link of PIC and non-PIC
                        (absolute) code

    -Blstrip            Strip symbol table data from executable during the link

    -Bstart_addr=nnnn   Where nnnn is the start address in decimal
                        (for hex, 0xnnnn)

    -Bpage_size=nnnn    Where nnnn is the desired page size in decimal
                        (for hex, 0xnnnn)

    -Bhard_align        Sets segment start to true page start boundary

    -Bwired             For wired shared library (all PT_LOAD segments
                        will have PF_M flag set)

    -Bkernel            For kernel extension shared library (all PT_LOAD
                        segments will have PF_K flag set)

    -Bnocopyrel         Disables generation of R_PPC_COPY relocations for
                        data relocations; R_PPC_GLOB_DAT relocations are
                        generated instead.


LINKER NOTES:
TYPE values are set to SHT_EXPORTS.  ALIGN value is set to 8 (machine
alignment).

The linker generates OS/2 DLL EXPORT entries exactly as specified in the
.DEF file EXPORTS entries.  No automatic upper-casing of these entries is
performed, as is the default for the LNK386 linker.  Therefore, programs
which reference these entries via their .DEF file IMPORTS entries must match
the spelling and case of these entries exactly.  For compatibility purposes,
additional uppercase "aliases" for the same export entry can be added to
the .DEF file EXPORTS section.


COMPILER DIRECT REGISTER ASSIGNMENT SUPPORT:
Direct register assignment is based on internal compiler register
numbering reference, not user-referenced register numbers.

Example:

The current documentation states that direct register assignment should be
coded as:

       int i == REG(20);       // assign register # 20 to i.

Since the current compiler internal numbering of the registers is
as follows:

       0..31 == GR0..GR31, 32..63 == FPR0..FPR31

and since REG() is not a keyword and should not be treated
literally, the direct register assignment should be coded as:

       int i == 20;    // no REG() around it.


VARIABLE ARGUMENT LIST IMPLEMENTATION NOTES:
The PowerPC ABI specifies the implementation of variable argument
list processing. Any code that is affected by the implementation
specification in the ABI or references the va_list structure must
conform to the ABI. This is an ABI conformance issue and a portability
issue so it's important to identify and change any non-conforming code.
We recommend reviewing the ABI document section on the variable
argument list definition and the compiler's include file definition
(in file "_stdarg.h").

Due to the specification of implementation for variable argument
lists in the PowerPC ABI, some source code changes are required for
certain code sequences which directly reference the variable
argument list structure.  This should only affect certain non-portable
source code sequences.  Code will have to be inspected for the use of
this sequence and then updated in the following manner:

The current __va_list source code definition is a one-element array
of a structure. Because arrays cannot be assigned in C, direct assignment
of va_list variables is now illegal.  For example:

    int foo(va_list *a) {
        va_list save;
        save = *a;  // save the original va_list (ILLEGAL now)

        int peek = va_arg(*a,int);  // the va_arg macro modifies
        ....        // the original va_list 'a'. So if we just
                    // want to peek at the argument without actually
                    // extracting it, we have to save and restore
                    // as above.

        *a = save;  // restore. (ILLEGAL now)
        }

To port this code, either conditionally compile ("#ifdef") an element-wise
copy (that is, save[0] = (*a)[0]) or, better yet, use memcpy as follows:

    int foo(va_list *a) {
        va_list save;
        memcpy(&save,&(*a),sizeof(va_list));

        int peek = va_arg(*a,int);  // the va_arg macro modifies
        ....        // the original va_list 'a'. So if we just
                    // want to peek at the argument without actually
                    // extracting it, we have to save and restore
                    // as above.

        memcpy(&(*a),&save,sizeof(va_list));
        }

Using memcpy as shown above works with the va_list definition described here.
(There is no performance degradation using either the inline memcpy
or the memcpy library function.)

The above memcpy technique will suffice for va_list implementations
as an array or as a pointer to an argument in a stack-based argument
list, but wouldn't be portable to an implementation of va_list as a
pointer to a malloc'ed memory segment. The ABI specifies a newer va_list
structure; portable va_list access via the __va_copy macro in the MetaWare
compiler's STDARG.H file is recommended for va_list portability. This is
to be used only with the -abi_va_arg compiler option for current ABI
conforming variable argument list support.
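
As a rough illustration of the copy-based approach, the sketch below peeks
at the next int argument without consuming it. It is not taken from the
MetaWare headers: it assumes a compiler that provides the standard C99
va_copy macro in <stdarg.h> (with the MetaWare compiler, __va_copy would
take its place), and peek_int and sum_with_peek are hypothetical names.

```c
#include <stdarg.h>

/* Hypothetical helper: return the next int argument without consuming it.
 * The caller's va_list is left untouched; only the copy is advanced. */
static int peek_int(va_list *ap)
{
    va_list tmp;
    int value;

    va_copy(tmp, *ap);          /* assumption: C99 va_copy is available */
    value = va_arg(tmp, int);
    va_end(tmp);
    return value;
}

/* Sums `count` int arguments, checking each peek against the real fetch.
 * Returns -1 if a peeked value ever differs from the value then fetched. */
static int sum_with_peek(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++) {
        int peeked = peek_int(&ap);
        int taken  = va_arg(ap, int);
        if (peeked != taken) {
            va_end(ap);
            return -1;
        }
        total += taken;
    }
    va_end(ap);
    return total;
}
```

Because the copy is made by the macro rather than by assignment or memcpy,
the code does not depend on whether va_list is an array, a structure, or a
pointer.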

This is the section of the MetaWare _STDARG.H file relevant
to the PowerPC ABI implementation of va_arg support:


    /***********************************************************************
     * PowerPC ABI varargs                                                 *
     ***********************************************************************/
    #ifndef _VA_LIST_DEFINED
    #define _VA_LIST_DEFINED
        typedef struct {
            /* The one that is going to be documented in the ABI:
             * gpr -- index into the array of 8 GPRs stored in the register
             *        save area; gpr=0 corresponds to r3, gpr=1 to r4, etc.
             * fpr -- index into the array of 8 FPRs stored in the
             *        register save area; fpr=0 corresponds to f1, fpr=1
             *        to f2, etc.
             * input_arg_area -- location in input argument area which
             *        may have the next var arg that was passed in memory.
             * reg_save_area -- where r3:r10 and f1:f8 (if saved) are stored.
             */
            char gpr;
            char fpr;
            char *input_arg_area;
            char *reg_save_area;
            } __va_list[1];
    #endif

    extern __builtin_va_info(void *);
    extern void* __va_arg(void *, int);

    #define __va_start(ap,fmt) __builtin_va_info(&ap)
    #define __va_arg(ap,t)      (*((t*)__va_arg(ap,_INFO(t,__VA_INFO_CLASS))))



PACKED DATA STRUCTURES AND ALIGNMENT PADDING CONSIDERATIONS:
The MetaWare PowerPC compiler supports sizeof() for packed structures
consistent with the ANSI and K&R definitions.  Quoting from ANSI:
"When applied to an operand that has structure or union type, the result
is the total number of bytes in such an object, including internal and
trailing padding."

For the MetaWare PowerPC compiler, sizeof() includes the padding needed
at the end of a packed structure to force proper alignment. Code that uses
arrays of packed structures, or compound packed structures, may need
changes if it did not account for architectures where structure alignment
padding is included. For instance, code which uses sizeof() to index through
structures will be off by the amount of the padding needed at the end of
the packed structure.

The recommended method for a more portable implementation is to
use offsetof() of the last element plus the last element's size in the
calculation to get the actual size of members of packed structures.
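
A minimal sketch of that calculation follows, using the standard offsetof
macro from <stddef.h>. The structure and the names PAYLOAD_SIZE,
sample_record_payload, and sample_record_tail_padding are hypothetical and
are not part of the toolkit. Note that this only excludes trailing padding;
any padding between members under the packing in effect is still counted.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical record; the trailing char usually forces tail padding. */
struct sample_record {
    int  a;
    int  b;
    char c;
};

/* Bytes occupied by the members, up to and including `last`. */
#define PAYLOAD_SIZE(type, last) \
    (offsetof(type, last) + sizeof(((type *)0)->last))

/* Size to use when copying one record's members into a byte buffer. */
static size_t sample_record_payload(void)
{
    return PAYLOAD_SIZE(struct sample_record, c);
}

/* Tail padding that sizeof() adds on top of the payload. */
static size_t sample_record_tail_padding(void)
{
    return sizeof(struct sample_record) - sample_record_payload();
}
```

Advancing a buffer pointer by sample_record_payload() rather than by
sizeof(struct sample_record) keeps consecutive records adjacent in the
buffer regardless of how much tail padding the compiler adds.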

An example showing the results of the alignment effect on packed
structures is shown below:


// example of packing first_struct and second_struct into a buffer...
#include <stdio.h>   // printf
#include <string.h>  // memcpy

#pragma push_align_members(1);  // Forces one byte structure member alignment
struct first_struct {
   int *a;
   short b;
   };

struct second_struct {
   int a;
   int b;
   char c;
   };

#pragma pop_align_members();

char buf[1024];
struct first_struct an_instance_of_first_struct = { 0, 1};
struct second_struct an_instance_of_second_struct = { 5, 7, 'a'};


int main(void) {
   char *p = buf;
   int i, n;
   // memcpy takes addresses, so pass pointers to the structure instances
   memcpy(p, &an_instance_of_first_struct, sizeof(struct first_struct));
   p += sizeof(struct first_struct);
   memcpy(p, &an_instance_of_second_struct, sizeof(struct second_struct));
   n = sizeof(struct first_struct) + sizeof(struct second_struct);

   // print out just to verify that we got it working..
   // otherwise, you can write it out as fwrite(buf,1,n,outfile);

   // should see the following output, which is Little endian
   // 00 00 00 00 01 00 05 00 00 00 07 00 00 00 61
   for (i= 0; i < n; i++) {
       // cast via unsigned char so bytes >= 0x80 are not sign-extended
       printf("%02x ", (unsigned int) (unsigned char) buf[i]);
       }
   printf("\n");
   return 0;
   }


KNOWN PROBLEMS AND WORKAROUNDS

The compiler optimization option "-Os" (optimize for size, not speed)
should not be used; some test cases have produced errors in which required
compare instructions were not generated for "if" statements.

DLL's which contain C++ code or use thread-local storage (-Hthread compile
option) must have the MetaWare C/C++ runtime initialization run at global
process initialization. Therefore, if your executable doesn't load any DLL's
linked with the MetaWare runtime libraries via default DLL load (that is,
the DLL's name is specified in the .def file IMPORTS statement, or its
import library is linked in), then issuing a dynamic load of such a DLL
("DosLoadModule") will result in errors (most likely a trap in the runtime
library). Either change the executable or DLL which dynamically loads the
MetaWare-built DLL to load it at the executable's initialization (specify it
in the .def file IMPORTS statement), or create a stub DLL built with the
MetaWare C/C++ runtime libraries and include this stub DLL in the .def file
IMPORTS statement.

Multi-threaded programs which start threads via the DosCreateThread API
should use the C runtime library "_beginthread" function instead.


A link error "Expected end of line" will occur when linking with
.DEF files that contain legacy OS/2 attribute names. The attribute
names known to cause this problem thus far are "NEWFILES" and
"LONGNAMES". These attribute names are obsolete and can be safely
omitted from the .def files.



Using a module definition file that has a CODE statement with the
following: CODE MOVABLE DISCARDABLE PRELOAD
will generate multiple executable segments in the resulting executable.
Instead, the CODE statement should specify only the PRELOAD and MOVABLE
attributes.

To set a data segment to have the SHARED attribute the following
.def file entry must be used in the "SEGMENTS" section of the .DEF file:

   seg_name CLASS 'DATA' SHARED

(where "seg_name" is the named data segment's name)

Alternatively, the .DEF file can specify SHARED for all data segments by using:

   DATA SHARED

Please note, however, that using this method of setting data segments SHARED
sets this not only for user-defined data segments but also for any that are
linked in from archive libraries, for instance compiler runtime library
data segments, and this may result in unexpected results if these other data
segments are not intended to be set SHARED.

SHARED is the default for user-defined data segments for OS/2 Warp (Intel)
DLL's; non-shared is the default for EXE's.

A loader error may result from code which makes heavy use of user-named
code segments (via the "alloc_text" pragma). User-named code segments are
typically used to group related functions into the same physical area of the
resulting executable or DLL. If the loader error occurs, either reduce the
number of "alloc_text" named segments (or the number of functions placed
in these segments) or remove the use of alloc_text named segments entirely.


Notes on MetaWare Linker:
  - If no name is given in the LIB statement, the default name is the entire
   path name along with the object name.  Use the -h option to get just the
   object name as the default.


