
    The Alpha Unix-NT assembler (asaxp) compiles source files written in
    Alpha AXP Assembly Language.

    This implementation is for the Windows/NT AXP platform.

    It is assumed that the user is familiar with both the Digital Alpha AXP
    Architecture and the Digital Unix assembly language (as described
    in the Digital Unix Assembly Language Programmer's Guide, Order Number
    AA-PS31D-TE or the asaxp.hlp online help file). 

    Please mail bug reports to "gaf@zk3.dec.com" or "alpha::gaf".

ASAXP Version 3.02.0 ev6 beta Alpha Release Notes.
Copyright (c) Digital Equipment Corporation, 1995-1996


1  FILES
   -----

    readme.txt      Release notes (this file).

    asaxp.exe       The assembler executable.

    asaxp.hlp       The assembler help file.

    The cl command must be installed in order for the asaxp preprocessor
    to run correctly. Preprocessing can be optionally bypassed by using the
    /nopp command line option. 


2  Command Syntax
   --------------

    asaxp  [options] <filename>

2a.  Command Line Options.

    /?
    /help       Display list of supported options.

    /arch<symbol> Where the symbols are defined as follows:
          EV4   -   Generate instructions for the EV4 architecture.
          EV5   -   Generate instructions for the EV4 architecture.
          EV56  -   Generate instructions for the EV56 architecture.
          EV6   -   Generate instructions for the EV56 architecture. (EV6 in future).
          HOST  -   Generate instructions for the machine on which the 
                    assembler is executing.
          GENERIC - Generate instructions that most current alpha
                    chips use. (Currently EV4).

          Note: Whitespace can optionally precede the case insensitive <symbol>.

    /D<symbol>  Tells the C preprocessor the definition
       or       of <symbol>.
    /D<symbol>=<string>

    /E          Writes the preprocessor output to standard output. 
                (Same as C compiler's /E option)

    /I<dir>     Tells the preprocessor to search <dir> for
                include files. Whitespace can optionally precede
                the <dir>.

    /U<symbol>  Tells the C preprocessor to undefine <symbol>.


    /Gy         Procedure Linking. When used, each procedure is
                placed in its own section.

    /Fo <output filename>    Specifies output filename other than the
                default. (The default is the basename of the source
                file with the .obj extension.)

    /O0 - Turn off code scheduling optimization.
    /O1 - Turn on code scheduling optimization. (Default)

    /noloc  - Ignore the .loc and .file directives.
    /nologo - Suppress the logo.
    /nopp   - Do not invoke the C preprocessor.
    /nowrn  - Do not warn if the .end directive does not refer to 
              the current procedure name.

    /wnt3.1 - Generate object files compatible with the wnt 3.1 linker.

    /QAltls       - Set flag for large tls section.
    /QApdst       - Set EXCEPTION_MODE_SILENT        (default)   
    /QApdsg       - Set EXCEPTION_MODE_SIGNAL
    /QApdsa       - Set EXCEPTION_MODE_SIGNAL_ALL
    /QApdie       - Set EXCEPTION_MODE_IEEE
    /QApdca       - Set EXCEPTION_MODE_CALLER
    /QAarchEV4    - Generate instructions for the EV4 architecture. (default)
    /QAarchEV5    - Generate instructions for the EV4 architecture.
    /QAarchEV56   - Generate instructions for the EV56 architecture.
    /QAarchEV6    - Generate instructions for the EV56 architecture. (EV6 in future)
    /QAtuneEV4    - Schedule instructions for the EV4 architecture.
    /QAtuneEV5    - Schedule instructions for the EV5 architecture.(default)
    /QAtuneEV56   - Schedule instructions for the EV5 architecture.
    /QAtuneEV6    - Schedule instructions for the EV5 architecture. (EV6 in future)
    /QA21064      - Generate and schedule instructions for the EV4
                    architecture. 
    /QA21066      - same as /QA21064.
    /QA21064A     - same as /QA21064.
    /QA21066A     - same as /QA21064.
    /QA21164      - Generate instructions for the EV4 architecture and
                    schedule for the EV5 architecture. (default)
    /QA21164A     - Generate instructions for the EV56 architecture and
                    schedule for the EV5 architecture.
    /QA21264      - Generate instructions for the EV56 architecture and
                    schedule for the EV5 architecture. (EV6 in future)


    /tune<symbol> Where the symbols are defined as follows:
          EV4   -   Schedule instructions for the EV4 architecture.
          EV5   -   Schedule instructions for the EV5 architecture.(default)
          EV56  -   Schedule instructions for the EV5 architecture.
          EV6   -   Schedule instructions for the EV5 architecture. (EV6 in future)
          HOST  -   Schedule instructions appropriate for the machine on
                    which the assembler is executing.
          GENERIC - Schedule instructions that most current alpha
                    chips use. (Currently EV5).

          Note: Whitespace can optionally precede the case insensitive <symbol>.

    /V            - Display assembler version.
    /v            - Verbose. Displays command line to preprocessor and 
                    assembly options.
    /Zd           - Limited debugging. Inserts line numbers into symbol
                    table.
    /Zi           - CodeView(TM) symbolic information is emitted.

   <filename> is the name of the assembly source file.

   Note:  The asaxp assembler assumes that a source filename extension of .i means
      the source has already been preprocessed.

2b. Advanced options.

    /symbols_aligned_0mod4        - Default. Assume symbols are longword
                                    aligned. 
    /symbols_not_aligned          - Do not make assumptions about symbols.
    /stack_aligned_0mod8          - Default. Assume stack is aligned on
                                    quadword boundary.
    /stack_not_aligned            - Do not make any assumptions about stack
                                    alignment. 
    /thread_pointer_aligned_0mod8 - Default. Applies to TLS section. Assume
                                    that the thread pointer is quadword
                                    aligned. 
    /thread_pointer_not_aligned   - Do not make any assumptions about
                                    the thread pointer.

    /resumption_safe  - This option tightens up the scheduling for floating
                        point instructions. This is primarily intended in
                        cases where the programmer wants to catch floating
                        point exceptions.

    /diag_pipe        - Print out the scheduling pipeline. (Only when
                       /O1 is set)

    /1, /2, /3, /4    - Allow debugging output after assembler phase.
                        These options are used with other debug options.
                        (Note that /4 provides output before and after
                        scheduling). /1 - Parse, /2 - Layout, 
                        /3 - Encoding, /4 - Encoding and Scheduling.

    /dump_ir          - Display internal instruction table. Must have one or
                        more of the phases set.

    /dump_sym         - Display the internal symbol table.  Must have one or
                        more of the phases set. Also displays the coff
                        symbol table when /3 or /4 is set.
                       
    /dump_inst        - Display the internal instruction encoding table.

    /dbg_out[&]<file> - Redirect debug output to <file name>. If & precedes
                        name, then append to file.
                       
    /err_out[&]<file> - Redirect error output to <file name>. If & precedes
                        name, then append to file.
                       
2c. Options provided for compatibility with the ACC and Digital Unix assembler.
    /eflag number - Set exception mode to specified number. 

    /g, /g2, /g3   - Same as /Zi.
    /g1            - Same as /Zd.
    /g0            - No debug. (Default)
                    
    /G <num>       - Ignored.

    /o <file name> - Same as /Fo.

    /std, /std0, /std1 - Ignored.

    /t <file name> - Ignored.

2d. Options accepted by the assembler, but silently ignored:
    /a /b /c /coff /d /l /M /n /p /w /X /z


3  Differences from the Digital Unix assembler, as0/as1 (a.k.a. acc)
   -------------------------------------------------------------------

3.1  Directives

3.1.1 Unsupported Directives

    The following directives are silently ignored:

    .set volatile/novolatile,

    .bgnb, .gjsrlive, .lab, .livereg, .option, .vreg,
    .weakext, .ugen, .alias, .noalias

    The following directives generate warnings when encountered:

    .const, .gjsrsaved, .gprel32, .gretlive, .save_ra


3.1.2 .aent Directive

    As0/as1 would insert a nop instruction, for example, bis $31, $31, $31,
    when an .aent directive is specified immediately after an .ent
    directive.

    Asaxp does not insert an additional instruction.

    Updated for version 3.02.0. The .aent directive is fully supported, and
    will cause a procedure descriptor to be inserted into the .pdata section.

3.1.3 .edata Directive

      .edata <flag> <exception_data> <lang_handler>

      If flag is 0, the current section changes to .xdata.

      If flag is 1, the exception data and language handler are inserted
      into the procedure descriptor for the current procedure. If the
      .edata directive precedes the .ent directive, the information applies
      to the next .ent directive. 


3.1.4 .eflag Directive

      .eflag number
      This sets the exception mode bits in the procedure descriptor for the
      current procedure. This overrides the default set by the /QApdxx
      command line options. Asaxp treats the number as an expression.
      
3.1.5 .end Directive

    As0/as1 would ignore the optional symbol following the .end directive.

    Asaxp checks that an .ent directive is active for any symbol argument
    to the .end directive. In addition, asaxp verifies that the procedure
    is terminated within the proper section.

    Asaxp generates a .end directive when encountering a .ent directive
    when a procedure is active, or when encountering an end of file when a
    procedure is active. In both cases a warning is issued.


3.2 Unsupported Instructions
    ------------------------

    The following instruction is silently ignored by asaxp, but will
    generate a warning in a future asaxp release:

    ldgp


3.3 Code Generated
    --------------

    The code generated by asaxp is meant to be functionally equivalent
    to as0/as1 -- however it is not identical.


3.3.1 Optimizations

    Asaxp makes no attempt to optimize code.  The code must be
    optimized "by hand." 


    As0/as1 attempt to keep track of register contents in the emitted
    code.  For instance,

        ldiq    $3, 0x7fff0001
        ldiq    $4, 0x7fff0002

    is emitted as 3 instructions by as0/as1 because it realizes the second
    constant can be built from the first.  Asaxp does not do this and
    will generate four instructions.
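
    The instruction counts above can be checked with a short Python sketch.
    It is an illustration only. It assumes that each 32-bit constant is
    built with an ldah/lda pair, and that as0/as1 derives the second
    constant from the first with a single lda whenever the difference fits
    the signed 16-bit displacement of lda. The helper name fits_lda is
    hypothetical.

```python
def fits_lda(delta):
    """True if delta fits the signed 16-bit displacement of one lda."""
    return -0x8000 <= delta <= 0x7FFF

first, second = 0x7FFF0001, 0x7FFF0002

# Assumption: an ldah/lda pair (2 instructions) builds each constant.
independent = 2 + 2

# Assumption: when the difference fits, one lda off the first register
# replaces the second ldah/lda pair.
derived = 2 + (1 if fits_lda(second - first) else 2)

print(independent, derived)   # 4 3
```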

    As0/as1 has a similar ability to do this with relocatable expressions
    that is not done by asaxp.

    As0/as1 uses tricky ways to load constants that are larger than
    2**32. Asaxp also tries to reduce constants. Those constants that
    are not reduced are placed in a constant pool. 

3.3.2 Code Scheduling (reordering)

      Asaxp uses a post-generation code scheduler to reorder instructions
      based on the characteristics of the individual instructions in the
      architecture. This reordering generally results in faster
      performance. It does, however, affect the line number information
      used by the debugger. The reordering algorithms used are consistent
      with as0/as1.

3.3.3 Corrected Instruction Encodings.

    as0/as1 does not encode the following instructions correctly:

    excb, mb, rpcc, trapb, wmb

    Asaxp has the correct encodings.


3.3.4 jsr Instructions

    As0/as1 transforms jsr instructions with symbolic operands to bsr
    instructions.

    Asaxp generates the long form of the jsr (ldah-lda-jsr sequence using
    register $at) for such cases.  If a bsr instruction is desired, then
    the code should be modified to use bsr with a symbolic operand. 

3.4 Preprocessor

    Asaxp uses the cl command for preprocessing the source files.  One
    limitation from this is that it is not legal to use the # character to
    begin an end-of-line comment in #include files. 

    Asaxp does work to make it legal to use the '#' character to
    begin an end-of-line comment in the main source file (the file
    specified on the command line).

    The C++ style comment (//) still works in header files.

3.5 Libc.lib

    References to the divide and remainder instructions actually get
    transferred to calls to special run-time library routines.

    These routines can be found in the system library libc.lib.

3.6 Other

    As0/as1 emits local BRADDR relocation records for unconditional
    branches to local labels.  Since branches are pc-relative and the
    distance to the target is known if the symbol is local, asaxp fills
    in the correct branch displacement itself. Relocations are issued for
    non-local branches as well as branches between sections.


4.0 Features added for version 2.0 to support WNT 3.5 (Daytona)
    -----------------------------------------------------------

4.1. The maximum value for the .align directive is 6, not 4.
    Section alignments are adjusted based upon the strictest alignment
    specified by a .align directive within that section. Please note that
    the .align directive specifies the low-order bits of the PC to be
    cleared, not the boundary. To align on a quadword (8 byte) boundary, use
    .align 3, not .align 8.
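
    The relationship between the .align argument and the resulting
    boundary can be sketched in Python. This is an illustration only, not
    asaxp code. The helper names boundary and align_up are hypothetical,
    and the lower limit of 0 is an assumption.

```python
def boundary(n):
    """Byte boundary selected by '.align n': 2**n bytes."""
    # Upper limit of 6 is from section 4.1; lower limit of 0 is assumed.
    if not 0 <= n <= 6:
        raise ValueError(".align argument out of range")
    return 1 << n

def align_up(pc, n):
    """Advance pc to the next address whose low n bits are clear."""
    mask = boundary(n) - 1
    return (pc + mask) & ~mask

print(boundary(3))                 # 8  (quadword)
print(boundary(6))                 # 64 (largest boundary permitted)
print(hex(align_up(0x1003, 3)))    # 0x1008
```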

4.2. Command line options have been updated:
     /Zi - Same as /g or /g2. Emit CodeView(tm) debugging information.
     /Zd - Same as /g1.       Emit COFF line numbers.
     /O0 - Turn off code scheduling optimization.
     /O1 - Turn on code scheduling optimization. (Default)
     /nologo - Suppress the logo.
     /wnt3.1 - Generate object files compatible with the WNT 3.1 linker.

     The following command line options set the default exception handling
     run time procedure descriptor flags. Please refer to Windows NT for
     Alpha AXP Calling Standard. 
         /QApdst        - Set EXCEPTION_MODE_SILENT. (default)
         /QApdsg        - Set EXCEPTION_MODE_SIGNAL.
         /QApdsa        - Set EXCEPTION_MODE_SIGNAL_ALL.
         /QApdie        - Set EXCEPTION_MODE_IEEE.
         /QApdca        - Set EXCEPTION_MODE_CALLER.
         /eflag number  - Set the default exception handling. 

        /symbols_aligned_0mod4 - Symbols are longword granular. (default)
        /symbols_not_aligned   - No attempt is made to align symbols.
        /stack_aligned_0mod8   - Stack is aligned on quadword boundary.
        /stack_not_aligned     - No attempt is made to align the stack.

4.3. The store byte (stb) and store word (stw) pseudo instructions use 
     one additional register, $t11. The assembler now uses the additional
     registers $at, $t9, $t10, $t11. The stb and stw instructions are
     implemented as hardware instructions by the 21164-366 and newer
     processors. (See section 6.4, below)

4.4. The nomove and noreorder assembler options behave identically as
     documented under the .set nomove directive.

4.5. Errors and warnings are now issued to stdout so they can be
     redirected optionally.

4.6. Internally generated symbols are now emitted to the COFF symbol 
     table. The generated names have the form $$nnn. Since $$nnn names are
     also generated by the acc compiler, the asaxp-generated names are
     guaranteed to be unique. 

4.7  Identifier names can now contain the @ and ? characters.

4.8  The load and store sequences are now longword granular.

4.9  Added a new directive, .tls$, for thread local storage.
     The .extern directive now contains a modifier:
     .extern (thread) <symbol>

5.0 Features added for version 2.02.
    --------------------------------

5.1 Procedure linking (/Gy). The /Gy command line option causes every
    procedure to be placed into a section by itself. This is compatible
    with the VC++ function linking.   

5.2 The syntax for the section directives, .text, .data, .rdata, .tls$, 
    .sdata has been extended. The new syntax is:
    .text  [identifier]
    .data  [identifier]
    .rdata [identifier]
    .tls$  [identifier]
    .sdata [identifier]

    Identifier is optional. If specified, that identifier is used to name
    the section. If identifier is not specified, the section name is
    generated by asaxp as standard section names. There is no limit to the
    length of a section name. The same rules that apply to other
    identifiers also apply to section names. Section names are maintained
    in a separate name space from other symbols used in the assembler. 
    There is a special case of the .text directive. The assembler allows a
    non-text directive to occur within the scope of a procedure. The use of
    the .text directive following the non-text instructions causes the
    assembler to reset its context to the procedure. Asaxp does produce
    a warning in this case. This feature has been retained in asaxp to
    support code generated by the acc compiler.
    For example:

    .text dohello # Assign the following code to the dohello section
    .ent dohello
dohello:
   	lda	$sp, -16($sp)
	stq	$26, 0($sp)
	.prologue
	.rdata hello_world	# Assign the following code to the hello_world
                        # section. Note that asaxp does produce a warning.
hello:
	.asciiz	"Hello world\n"
    .text               # Restore context to the dohello section
	lda	$16, hello
	bsr	$26, printf
	ldq	$26, 0($sp)
	lda	$sp, 16($sp)
	ret	$31, ($26), 1
    .end

5.3 Interaction between procedure linking and using named sections. 
    With /Gy in effect, each procedure is placed in a separate section.
    That section is named .text. There may be multiple .text sections.
    If the programmer elects to place a procedure into a named section
    other than .text, that is permissible. There is one caveat: The named
    section specified by the programmer must be empty. If it is not empty,
    the procedure linking code will cause the following procedure to be
    placed into a new .text section.

5.4 Additional error checking on sections. The assembler does not allow a
    procedure to be declared in other than a .text section. Instructions
    occurring in non-text sections cause the assembler to flag an error. 

5.5 Symbols defined by a symbolic equate can now be exported:

    .global foo
    foo = <some constant>

5.6 The subtraction of two identifiers is now permitted as part of a constant
    expression. An optional expression can follow the symbolic difference.
    The expression is limited to addition, subtraction, multiplication and
    division binary operators. The normal expression evaluation rules have
    been altered a bit such that the symbolic difference is assumed to be
    contained within parentheses. The expression following the binary
    operator can be any legal constant expression. 

    foo:
        .quad bar - foo * 2 # The result here is 16. 
    bar:
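
    The altered evaluation rule can be modeled in Python. The addresses
    below are hypothetical; only the 8-byte distance between foo and bar
    (the size of the .quad) comes from the example above.

```python
# Hypothetical addresses: 'foo' labels a .quad (8 bytes), 'bar' follows it.
foo = 0x100
bar = foo + 8

# asaxp rule: the symbolic difference is treated as if parenthesized.
asaxp_rule = (bar - foo) * 2

# Ordinary precedence would multiply first, giving an address-dependent value.
usual_rule = bar - foo * 2

print(asaxp_rule)   # 16
print(usual_rule)   # -248
```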
    
6.0 Features added for version 3.01. 
    --------------------------------

    Support for the 21164 architecture family including support for the new
    word and byte instructions, architecture mask, and implementation version. 

    Added command line options and directives to allow selection of new
    instructions as they are introduced. The default architecture is
    currently EV4 with EV5 scheduling.

     The term <arch argument> is used to specify the architecture or
     tuning features (upper or lower case):
        EV4      - Available in all chips.
        EV5      - For /arch - same as EV4. For /tune, selects scheduling
                   for the 21164 chip architecture. 
        EV56     - For /arch, emit byte instructions. For /tune use 21164
                   scheduling. 
        EV6      - Same as ev56. Will reflect the 21264 chip when available.
        host     - Generate instructions or tune as appropriate for the
                   system on which the code is assembled. 
        generic  - Generate instructions or schedule based upon the most
                   prevalent Alpha architecture. At present, this is EV5,
                   but will change in the future.

6.1 New command line switches for architecture selection.
    Added command line options for new Alpha chip architectures. These
    switches are compatible with VC++4.0 and Digital Unix. The QA21 family
    sets both the architecture and tuning parameters. The arch, tune, QAarch
    set both the architecture and tuning parameters. The arch, tune, QAarch
    and QAtune options work independently of each other. 
        
        /QA21064      - Generate and schedule instructions for the EV4
                        architecture. 
        /QA21066      - same as /QA21064.
        /QA21064A     - same as /QA21064.
        /QA21066A     - same as /QA21064.
        /QA21164      - Generate instructions for the EV4 architecture and
                        schedule for the EV5 architecture. (default)
        /QA21164A     - Generate instructions for the EV56 architecture and
                        schedule for the EV5 architecture.
        /QA21264      - Generate instructions for the EV56 architecture and
                        schedule for the EV5 architecture. (EV6 in future)
        /QAarch<arch argument> - Generate instructions based on <arch argument>
        /arch <arch argument>  - Generate instructions based on <arch argument>
        /QAtune<arch argument> - Schedule based upon <arch argument>
        /tune <arch argument>  - Schedule based upon <arch argument>
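
        The /QA21xxx options listed above can be read as shorthand for an
        (arch, tune) pair. The Python table below restates that mapping as
        a sketch. The name QA_OPTIONS is hypothetical, and the table is
        not taken from the assembler source.

```python
# (instruction set, scheduling model) selected by each /QA21xxx option,
# as listed in section 6.1.
QA_OPTIONS = {
    "/QA21064":  ("EV4",  "EV4"),
    "/QA21066":  ("EV4",  "EV4"),
    "/QA21064A": ("EV4",  "EV4"),
    "/QA21066A": ("EV4",  "EV4"),
    "/QA21164":  ("EV4",  "EV5"),   # default
    "/QA21164A": ("EV56", "EV5"),
    "/QA21264":  ("EV56", "EV5"),   # EV6 in future
}

arch, tune = QA_OPTIONS["/QA21164A"]
print(arch, tune)   # EV56 EV5
```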

         Some additional options have been added to be silently ignored
         for compatibility with Microsoft's masm assembler.

6.2 Added two new directives. These directives override the command line
    options. 

    .arch <arch argument> - Generate instructions based on <arch argument>
    .tune <arch argument> - Schedule based upon <arch argument>

6.3 Macro substitution in the .repeat/.endr block. The %r token may be used
    within an identifier. This expands to the repeat iteration number,
    starting at 0. For example, 

    .repeat 3
    .globl  aglob%r
    .endr

    This expands to:
    .globl aglob0
    .globl aglob1
    .globl aglob2
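
    The substitution can be modeled in a few lines of Python. This is a
    sketch of the behavior described above, not the assembler's
    implementation. The function name expand_repeat is hypothetical.

```python
def expand_repeat(count, body_lines):
    """Repeat body_lines count times, replacing %r with 0, 1, 2, ..."""
    out = []
    for i in range(count):
        out.extend(line.replace("%r", str(i)) for line in body_lines)
    return out

for line in expand_repeat(3, [".globl aglob%r"]):
    print(line)
```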

6.4 Added ldbu, ldwu, stb, and stw instructions for EV56.
    These instructions are implemented as pseudo instructions for EV4 and
    prior releases of the assembler and are documented correctly in the
    asaxp.hlp file.

6.5 Added the sextb and sextw instructions for EV56.
    These instructions sign extend byte and word respectively. These are
    new instructions as of ASAXP release 3.01.0. These instructions have
    the same syntax as the sextl instruction. They are implemented as
    pseudo instructions for EV4 and EV5. Note that the immediate value
    versions of these instructions are implemented as pseudo instructions,
    with the sign extension being performed at assembly time. 
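
    The assembly-time sign extension of an immediate value can be
    sketched in Python. The function names mirror the instruction names
    for readability. This is an illustration, not the assembler's code.

```python
def sign_extend(value, bits):
    """Interpret the low 'bits' bits of value as a signed integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def sextb(value):
    return sign_extend(value, 8)     # byte

def sextw(value):
    return sign_extend(value, 16)    # word

print(sextb(0xFF))      # -1
print(sextb(0x7F))      # 127
print(sextw(0x8000))    # -32768
```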

6.6 Added encodings for the amask and implver instructions.
    These instructions test the architecture and implementation variants
    of the machine. These instructions are supported on all architectures.

6.6.1     Architecture Mask
    
    Syntax
    	amask $s_reg, $d_reg
    	amask $d_reg/$s_reg
    	amask val_immed, $d_reg
    
    The source register or immediate value represents a mask of architectural 
    extensions requested. Bits corresponding to architectural extensions that
    are present are cleared; reserved bits and bits corresponding to absent
    extensions are copied unchanged. The result is placed into the
    destination register. If the result is zero, all requested features are
    present. Software may specify a source value of all 1's to determine
    the complete set of architectural extensions implemented by a
    processor.  
    
    	Bit		Feature
    	---		----------------------
    	0		Byte/Word instructions are present
    	1..63   Reserved for use by Digital. 
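
    The amask result described above can be modeled in Python. The
    'implemented' argument stands in for the set of extensions a given
    processor provides. It is a hypothetical parameter used for
    illustration, since the real instruction reads this from the hardware.

```python
BWX = 1 << 0   # bit 0: byte/word instructions (from the table above)

def amask(requested, implemented):
    """Clear each requested bit whose extension is implemented.

    Bits for absent extensions and reserved bits are copied unchanged.
    """
    return requested & ~implemented & 0xFFFFFFFFFFFFFFFF

# Hypothetical processors: one without and one with byte/word support.
print(amask(BWX, implemented=0))      # 1: byte/word instructions absent
print(amask(BWX, implemented=BWX))    # 0: byte/word instructions present
```

    A result of zero therefore means that every requested feature is
    present, which is the condition tested by a beq on the destination
    register.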
    
    
6.6.2   Implementation Version
    
    Syntax:
    	implver $d_reg
    	
    A small integer is placed into the destination register. The integer
    specifies the major implementation version of the processor on which it
    is executed. This information can be used to make code-scheduling or
    tuning decisions, or it can be used to branch to different pieces of code
    optimized for different implementations. 
    
    	Value		Implementations
    	------		-----------------
    	0		For EV4, EV45, LCA and LCA 45 chips
    			(21064, 21064A, 21066, 21068, and 21066A)
    	1		For EV-5 and EV56 chips
    			(21164, 21164A)
    	2		For EV-6 and derivative chips
    			(21264, etc).
    
6.7 Added EV5 code scheduling. Code scheduling may be altered by setting
    the command line option (as noted in section 6.1) or by using the .tune
    directive (as noted in section 6.2). Code scheduling is performed on an
    extended basic block. In asaxp, an extended basic block is a block of code
    with a single entry point and possibly multiple exit points. In addition,
    the .prologue directive starts a new basic block. The .tune directive
    may start a basic block if the tuning context is changed. Conditional
    branch instructions are included within an extended basic block. This
    enhances the scheduling in cases where the branch is not taken. 

6.8 Example using a single source code block for run-time architecture
    determination. This example uses the .repeat/.endr macro substitution
    and the instructions documented in sections 6.4 and 6.6.1.

    .text
    .tune ev5               # Select 21164 tuning 
                            # Will run ok for 21064
    .arch ev4               # Default EV4 architecture
 #  Prototype:char *cpystr(char *, const char *); 
    .globl  cpystr
    .ent    cpystr
cpystr:
    mov     $16, $0         # Save return value
    amask   1, $1           # Test if hardware ldbu/stb is present
    beq     $1, while1      # If bit 0 == 0, go to second sequence.
    #
    # Generate 2 sequences of instructions. 
    # The first instance generates code for the 21064 and the 
    # early 21164 chips.
    # The second sequence will cause the ldbu/stb hardware
    # instructions to be generated. These are supported by the 
    # 21164-366 and later versions of the 21164 chips.
    # Note that .arch EV56 directive at the end of the sequence
    # will cause the assembler to switch into EV56 mode. 
    .repeat 2
while%r:                    # This is while0 for first instance, 
                            # while1 for second
    ldbu $1, ($17)          # Get source byte 
    stb  $1, ($16)          # Store to destination
    beq  $1, done           # If at EOS, return.
    addq $16, 1, $16        # Next destination
    addq $17, 1, $17        # Next source
    br while%r
    .arch ev56              # Change architecture at end
                            # of repetition
    .endr
done:           
    ret                     # Return to caller
    .end    cpystr
     
6.9  Relocation operands
     Relocation operands are generally useful in three situations:

     o  As support for independent expression of the high and low addresses.

     o  In application programs in which the programmer needs
        precise control over scheduling

     o  In source code written for compiler development

     Some macro instructions (for example, lda) require special
     coordination between the machine-code instructions and the
     relocation sequences given to the linker.  By using the macro
     instructions, the assembler programmer relies on the assembler to
     generate the appropriate relocation sequences. 

     In some instances, the use of macro instructions may be undesirable.
     For example, a compiler that supports the generation of assembly
     language files may not want to defer instruction scheduling to the
     assembler.  Such a compiler will want to schedule some or all of the
     machine-code instructions.  To do this, the compiler must have a
     mechanism for emitting an object file's relocation sequences without
     using macro instructions. The mechanism for establishing these
     sequences is the relocation operand.

     A relocation operand is placed after the normal operand on an
     assembly language statement:

   opcode operand relocation_operand

   The syntax of the relocation_operand is as follows:

      !relocation_type! sequence_number or
      ! sequence_number

      relocation_type
         Any one of the following relocation types can be specified:

         Relocation type    Windows NT Description
         ---------------    ----------------------
         braddr             Branch address
         hint               Jump hint
         poffset            Procedure offset
         refhi              High reference. Associated
                            with an ldah instruction, this is
                            normally accompanied by a Pair relocation.
         sechi              Similar to refhi. This is for use by TLS.
         reflo              This is associated with a load, store or
                            lda instruction. The associated
                            instruction may be the target of a pair
                            relocation. 
         seclo              Similar to reflo. This is for use by TLS.
         reflong            Longword relative reference. This is used
                            when the address of a symbol is stored by
                            a .long directive.
         refquad            Quadword relative reference. This is used
                            when the address of a symbol is stored by
                            a .quad directive.
         snum               Section number
         soffset            Section offset

 
    The relocation types must be enclosed within a pair of
    exclamation points (!) and are not case sensitive. Only the refhi,
    sechi, reflo and seclo are useful in asaxp.

    sequence_number
       The sequence number is a positive integer constant with a value range of
       1 to 2147483647. The sequence number may be specified in decimal,
       hex, or octal notations.

    The following examples contain relocation operands in the source code:

   # Equivalent C statement:
   # extern __int64 sym1, sym2;
   # sym1 += sym2;

   # Assembly statements containing macro instructions:
   ldq   $1, sym1
   ldq   $2, sym2
   addq  $1, $2, $3
   stq   $3, sym1

   # Assembly statements containing machine-code instructions
   # requiring relocation operands. In asaxp, an ldah instruction using a
   # symbol without a relocation operand will produce a syntax error. By
   # using the relocation operands, an ldah instruction is eliminated.
   
   ldah  $4, sym1!refhi!1       #     Load high value for sym1.
   ldah  $5, sym2!refhi!2       #     Load high value for sym2.
   ldq   $1, sym1($4)!reflo!1   #     Load sym1. Paired to ldah.
   ldq   $2, sym2($5)!reflo!2   #     Load sym2. Paired to ldah.
   addq  $1, $2, $3
   stq   $3, sym1($4)!reflo!1   #     Store value in sym1 unpaired.
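
    The high and low parts of an address interact because the 16-bit
    displacement of a load, store or lda instruction is sign-extended,
    while ldah adds its displacement multiplied by 65536. The Python
    sketch below shows the usual way to split a 32-bit address so that
    the two parts add back up. It is an illustration under that
    assumption, not a description of what the linker does internally.
    The function names and sample addresses are hypothetical.

```python
def split_address(addr):
    """Split a 32-bit address into signed (high, low) 16-bit parts."""
    low = addr & 0xFFFF
    if low >= 0x8000:            # low part is sign-extended when used
        low -= 0x10000
    high = ((addr - low) >> 16) & 0xFFFF
    if high >= 0x8000:           # ldah displacement is also signed
        high -= 0x10000
    return high, low

def rebuild(high, low):
    """Address formed by ldah (high * 65536) followed by a low offset."""
    return (high * 65536 + low) & 0xFFFFFFFF

# Hypothetical addresses; the second has bit 15 set in its low half.
for addr in (0x00010004, 0x12348000):
    high, low = split_address(addr)
    print(hex(addr), high, low, hex(rebuild(high, low)))
```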

6.10 Support for large tls section. The /QAltls command line option turns on
     large thread local storage for the entire module. Use the relocation
     operand to control references at an instruction level. 

7.0  Features added for release 3.02.0.
     Support for EV6 ECO 84, 87, 88, 90, 96. These new instructions require
     either the .arch ev6 directive or /arch ev6 command line option. 
     These instructions are generated as primitive instructions for all
     architectures. However, when these are unimplemented, the hardware
     will generate an OPDEC trap. Added support for .split directive
     (section 7.6).

7.1  ECO 84 - SQRT instructions.
     SQRTx Fb.rx, Fc.wx
     Major modes:
     VAX Modes:
          SQRTF - F_floating
          SQRTG - G_floating
          Includes qualifiers s, c, u (eg. sqrtgsuc).
     
     IEEE Modes:
          SQRTS - S_floating
          SQRTT - T_floating
          Includes qualifiers d, m, c, s, u, i (eg. sqrtssuic).
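
     A minimal sketch of the SQRT syntax above, assuming the usual $f
     names for the floating-point registers. The register choices are
     illustrative only.

          .arch  ev6               # Enable the EV6 instructions (see 7.0).
          sqrtt  $f16, $f0         # $f0 = square root of $f16 (T_floating).
          sqrts  $f17, $f1         # $f1 = square root of $f17 (S_floating).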

7.2  CTPOP/CTLZ/CTTZ     
     CTPOP Rb.rq, Rc.wq
     CTLZ  Rb.rq, Rc.wq
     CTTZ  Rb.rq, Rc.wq
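
     For example (register numbers are illustrative, and .arch ev6 or
     /arch ev6 is assumed to be in effect):

          ctpop  $1, $2            # $2 = number of one bits in $1.
          ctlz   $1, $3            # $3 = number of leading zero bits in $1.
          cttz   $1, $4            # $4 = number of trailing zero bits in $1.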

7.3  Integer/Floating Register moves.
     FTOIx Fa.rq, Rc.wq
     FTOIS - s_floating to longword.
     FTOIT - t_floating to quadword.

     ITOFx Ra.rq, Fc.wq
     ITOFS - Longword to s_floating
     ITOFF - Longword to f_floating
     ITOFT - Quadword to t_floating
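
     A short sketch of a register-to-register move in each direction,
     which avoids a store and reload through memory. The register
     choices are illustrative, and the $f floating-point register names
     are assumed.

          ftoit  $f1, $2           # Copy the 64 bits of $f1 into $2.
          itoft  $2, $f3           # Copy the quadword in $2 into $f3.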

7.4  Instructions for Graphics and Video Algorithms
     PERR Ra.rq, Rb.rq, Rc.wq - Pixel Error

     MINxxx       Ra.rq, Rb.rq, Rc.wq
     MAXxxx       Ra.rq, Rb.rq, Rc.wq

     MAXSB8 - Vector Signed byte Maximum
     MAXSW4 - Vector Signed word Maximum
     MAXUB8 - Vector Unsigned byte Maximum
     MAXUW4 - Vector Unsigned word Maximum
     MINSB8 - Vector Signed byte Minimum
     MINSW4 - Vector Signed word Minimum
     MINUB8 - Vector Unsigned byte Minimum
     MINUW4 - Vector Unsigned word Minimum

     UNPKBx Rb.rq, Rc.wq
     UNPKBL Unpack Bytes to Longwords
     UNPKBW Unpack Bytes to Words

     PKxB Rb.rq, Rc.wq
     PKLB Pack Longwords to Bytes
     PKWB Pack Words to Bytes
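
     A brief sketch of the operand order for these instructions. The
     register numbers are illustrative only.

          perr    $1, $2, $3       # $3 = sum of the absolute differences of
                                   # the eight unsigned bytes in $1 and $2.
          minub8  $1, $2, $4       # Each byte of $4 = the smaller of the
                                   # corresponding unsigned bytes.
          maxsw4  $1, $2, $5       # Each word of $5 = the larger of the
                                   # corresponding signed words.
          unpkbw  $1, $6           # Unpack the low four bytes of $1 into
                                   # the four words of $6.
          pkwb    $6, $7           # Pack the low byte of each word of $6
                                   # into the low four bytes of $7.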

7.5  Data Cache Control Instructions
     WH64 (Rb.ab) - Write Hint - 64 bytes
     ECB  (Rb.ab) - Evict Cache Block
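
     For example, assuming $1 holds the address of a cache block (the
     register number is illustrative). Both instructions are hints and
     do not change the program's results:

          wh64  ($1)               # Hint: the aligned 64-byte block at the
                                   # address in $1 will be fully written.
          ecb   ($1)               # Hint: the cache block at the address
                                   # in $1 may be evicted.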

7.6  Added support for NTOM and other profiling tools. This includes the
     addition of the .split directive. A related change is that data
     directives (eg. .long) placed within a procedure cause the
     assembler to generate an additional procedure descriptor so that
     tools like NTOM may move code segments efficiently.

     .split   proc_name
     The proc_name refers to the procedure name of the associated mainline
     procedure. Proc_name must have been the subject of a previous .ent
     directive. The .split directive differs from the .aent directive in
     that the code it introduces is part of the referenced procedure, but
     need not be adjacent to that procedure.

8.0  Features added for 3.03.2

8.1  Support for GEM generated .s files.

8.2  Added .tlscomm directive. 
     .tlscomm       name, expression[, section identifier]

     The .tlscomm directive causes name (unless defined elsewhere) to become
     a global common symbol at the head of a block of at least expression
     bytes of storage within the .TLS$ section. The linker overlays
     like-named common blocks, using the expression value of the largest
     block as the byte size of the overlay. If a section identifier is
     specified, it is appended to .tls$.
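
     A minimal sketch of the two-operand form. The symbol name thread_buf
     is hypothetical:

          .tlscomm  thread_buf, 256     # Reserve at least 256 bytes of
                                        # thread local storage for
                                        # thread_buf.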

8.3  Added the .comdat directive
     .comdat symbol [ comdat type [ section identifier ]]

     The .comdat directive declares the referenced symbol to be a
     comdat symbol. If specified, the comdat type is one of (case
     insensitive): 
     
           "IMAGE_COMDAT_SELECT_NODUPLICATES",
           "IMAGE_COMDAT_SELECT_ANY",
           "IMAGE_COMDAT_SELECT_SAME_SIZE",
           "IMAGE_COMDAT_SELECT_EXACT_MATCH",
           "IMAGE_COMDAT_SELECT_ASSOCIATIVE",
           "IMAGE_COMDAT_SELECT_LARGEST",
           "IMAGE_COMDAT_SELECT_NEWEST"
     These correspond to the definitions of the same names in winnt.h.
     The "IMAGE_COMDAT_SELECT_" prefix may be omitted, as may the "_SIZE"
     or "_MATCH" suffix.
     If the comdat type is not specified, it defaults to
     IMAGE_COMDAT_SELECT_EXACT_MATCH. If the comdat type is
     IMAGE_COMDAT_SELECT_ASSOCIATIVE, then the section
     identifier is required, and must be the name of a section defined
     within the compilation unit.
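
     A sketch of the directive with an explicit comdat type. The symbol
     name my_proc is hypothetical. Writing the type as a quoted string is
     an assumption made by analogy with the quoted attributes in the
     .section examples, so check the exact form against asaxp.hlp.

          .comdat  my_proc "IMAGE_COMDAT_SELECT_ANY"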

8.4 Added the .drectve directive.
    
    .drectve string [, string]...

    The .drectve directive places linker directives into the .drectve
    section of the object file. This directive may appear anywhere in the
    source file, and does not change the current section context. The
    string must be enclosed in quotation marks ("). Each string is padded
    by a single space as required by the linker. 

    example:
    .drectve "-defaultlib:libc", "-defaultlib:oldnames"

8.5 Added the .section directive

    .section section-name[, COFF section name] [section-attribute[, section-attribute]...]

    Where section attribute is one of the IMAGE_SCN_  manifest constants
    that are defined in winnt.h. The full attribute name
    (eg. IMAGE_SCN_xxx), or the suffix may be used as shorthand.

    For alignment purposes, the assembler assumes that sections with the
    CNT_CODE or MEM_EXECUTE characteristic are text type sections, and
    generates nops to fill gaps. All other sections are treated as data, and
    the assembler fills gaps with binary zeros. The assembler uses
    whatever values are selected for the object file section header, and
    makes no attempt to warn in the event of a conflict. Section
    attributes are ORed together bitwise.

    It is important that thread-local-storage data sections have a
    section name beginning with .tls. This affects relocations. However,
    relocation operands can supply the correct relocations.

    The COFF section name is a quoted string containing the name which will
    be placed into the symbol table for this section. The assembler does
    not look at this name. If not supplied, the section name will be the
    asaxp section name.

    If the section directive is used without characteristics, a lookup for
    that name is performed, and the first section in the section table
    matching that name is chosen. If attributes are supplied, the lookup
    will match by both name and attributes. If the assembler is unable to
    find an appropriate match, a new section is created. Also note that
    there are standard section names. The assembler creates empty sections
    with the names .text, .rdata, .xdata, .pdata, .data, .sdata, .tls$,
    .debug$, .debug$, .drectve, .bss

    examples:
        .section .text           # Look up a section named ".text" and
                                 # set the context to that section.

        .section gallagher       # Look up a section named gallagher.
                                 # If found, the context is changed to the
                                 # gallagher section. If not found, an
                                 # error is generated concerning unknown
                                 # attributes.

        .section .text "CNT_CODE", "MEM_EXECUTE", "MEM_WRITE"
                                 # The assembler will search for a section
                                 # named .text with these attributes. If
                                 # not found, it creates a .text section.
                                 # The assembler will not make any
                                 # judgements even though the attributes
                                 # conflict.
    Attribute names:
    
        IMAGE_SCN_CNT_CODE
        IMAGE_SCN_CNT_INITIALIZED_DATA
        IMAGE_SCN_CNT_UNINITIALIZED_DATA
        IMAGE_SCN_LNK_OTHER
        IMAGE_SCN_LNK_INFO
        IMAGE_SCN_LNK_REMOVE
        IMAGE_SCN_LNK_COMDAT
        IMAGE_SCN_ALIGN_1BYTES
        IMAGE_SCN_ALIGN_2BYTES
        IMAGE_SCN_ALIGN_4BYTES
        IMAGE_SCN_ALIGN_8BYTES
        IMAGE_SCN_ALIGN_16BYTES
        IMAGE_SCN_ALIGN_32BYTES
        IMAGE_SCN_ALIGN_64BYTES
        IMAGE_SCN_ALIGN_1KBYTES
        IMAGE_SCN_MEM_DISCARDABLE
        IMAGE_SCN_MEM_NOT_CACHED
        IMAGE_SCN_MEM_NOT_PAGED
        IMAGE_SCN_MEM_SHARED
        IMAGE_SCN_MEM_EXECUTE
        IMAGE_SCN_MEM_READ
        IMAGE_SCN_MEM_WRITE
        
------------
Trademarks
   CodeView is a trademark of the Microsoft Corporation.
        

