3 ASSEMBLY LANGUAGE

This chapter contains the following sections:

Input Specification
Assembler Significant Characters
Sections
Section Memory Types
Section Attributes
Absolute Sections
Relocatable Section Names
Grouped Sections
Registers
Special Function Registers

3.1 Input Specification

An assembly program consists of zero or more statements, one statement per line. A statement may optionally be followed by a comment, which is introduced by a semicolon character (;) and terminated by the end of the input line. Any source statement can be extended over one or more additional lines by including the line continuation character (\) as the last character on the line to be continued. The length of a source statement (first line and any continuation lines) is only limited by the amount of available memory. Upper and lower case letters are considered equivalent for assembler mnemonics and directives, but are considered distinct for labels, symbols, directive arguments, and literal strings.

A statement can be defined as:

[label:] [instruction | directive | macro_call] [;comment]

where,

label is an identifier or number. A label does not have to start on the first position of a line, but a label must always be followed by a colon.

instruction is any valid XA assembly language instruction consisting of a mnemonic and one, two, three or no operands. Operands are described in the chapter Operands and Expressions. The instructions are described separately in the chapter Instruction Set.

directive is any one of the assembler directives. Directives are described separately in the chapter Assembler Directives.

macro_call is a call to a previously defined macro. See the chapter Macro Operations.

A statement may be empty.
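The statement forms described above can be sketched as follows. This is an illustrative fragment only: the labels Start, START and Done are hypothetical, and the operands reuse forms that appear in examples elsewhere in this chapter.

```asm
; a statement that consists of a comment only
Start:  MOV   R0, #5H          ; label, instruction and trailing comment
START:  mov   R0, #5H          ; mnemonic case is ignored; START and Start are distinct labels
        MOV   R0, \
              #5H              ; one statement continued over two lines
Done:                          ; a label with no instruction
```

The fourth statement relies on the backslash being the last character on its line, so the assembler processes both lines as a single statement.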

3.2 Assembler Significant Characters

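There are several one-character sequences that are significant to the assembler. Some have multiple meanings depending on the context in which they are used. Special characters associated with expression evaluation are described in Chapter 4, Operands and Expressions. Other assembler-significant characters are:

;    Comment delimiter character
\    Line continuation character or macro dummy argument concatenation operator
?    Return value of symbol character
%    Return hex value of symbol character
^    Macro local label character
"    Macro string delimiter or quoted string DEFINE expansion character
@    Function delimiter
$    Location counter substitution
[]   Location addressing mode operator
#    Immediate addressing mode

Individual descriptions of each of the assembler special characters follow. They include usage guidelines, functional descriptions, and examples.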

;

Comment Delimiter Character

Any number of characters preceded by a semicolon (;), but not part of a literal string, is considered a comment. Comments are not significant to the assembler, but they can be used to document the source program. Comments will be reproduced in the assembler output listing. Comments are preserved in macro definitions.

Comments can occupy an entire line, or can be placed after the last assembler-significant field in a source statement.

Examples:

; This comment begins in column 1 of the source file
Loop: CALL COMPUTE    ; This is a trailing comment
                      ; These two comments are preceded
                      ; by a tab in the source file
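A semicolon inside a literal string does not start a comment. A minimal sketch, assuming that the ASCII directive used in a later example of this chapter accepts such a string:

```asm
      ASCII  'A;B'      ; only this second semicolon starts a comment
```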

\

Line Continuation Character or
Macro Dummy Argument Concatenation Operator

Line Continuation

The backslash character (\), if used as the last character on a line, indicates to the assembler that the source statement is continued on the following line. The continuation line will be concatenated to the previous line of the source statement, and the result will be processed by the assembler as if it were a single line source statement. The maximum source statement length (the first line and any continuation lines) is 512 characters.

Example:

; THIS COMMENT \
EXTENDS OVER \
THREE LINES

Macro Argument Concatenation

The backslash (\) is also used to cause the concatenation of a macro dummy argument with other adjacent alphanumeric characters. For the macro processor to recognize dummy arguments, they must normally be separated from other alphanumeric characters by a non-symbol character. However, sometimes it is desirable to concatenate the argument characters with other characters. If an argument is to be concatenated in front of or behind some other symbol characters, then it must be followed by or preceded by the backslash, respectively.

See also section 5.5.1.

Example:

Suppose the source input file contained the following macro definition:

SWAP_MEM   MACRO  REG1,REG2      ;swap memory contents
     XCH   R\REG1, R\REG2
     ENDM

The concatenation operator (\) indicates to the macro processor that the substitution characters for the dummy arguments are to be concatenated in both cases with the character R. If this macro were called with the following statement,

the resulting expansion would be:

?

Return Value of Symbol Character

The ?symbol sequence, when used in macro definitions, will be replaced by an ASCII string representing the value of symbol. This operator may be used in association with the backslash (\) operator. The value of symbol must be an integer.

See also section 5.5.2.

Example:

Consider the following macro definition:

SWAP_MEM  MACRO  REG1,REG2       ;swap memory contents
     XCH  R\?REG1, R\?REG2
     ENDM

If the source file contained the following SET statements and macro call,

AREG SET       1
BREG SET       2
     SWAP_MEM  AREG,BREG

the resulting expansion as it would appear on the source listing would be:

     XCH  R1, R2

%

Return Hex Value of Symbol Character

The %symbol sequence, when used in macro definitions, will be replaced by an ASCII string representing the hexadecimal value of symbol. This operator may be used in association with the backslash (\) operator. The value of symbol must be an integer.

See also section 5.5.3.

Example:

Consider the following macro definition:

GEN_LAB    MACRO  LAB,VAL,STMT
LAB\%VAL:  STMT
     ENDM

If this macro were called as follows,

NUM  SET      10
     GEN_LAB  HEX,NUM,'NOP'

the resulting expansion as it would appear in the listing file would be:

HEXA: NOP

^

Macro Local Label Character

The circumflex (^), when used as a unary operator in a macro expansion, causes name mangling of the associated local label. Normally, the macro preprocessor leaves a local label inside a macro expansion as a normal label in the current module. With the local label character (^), the label is made unique. This is done by removing the leading underscore and appending a unique string "__M_Lxxxxxx", where "xxxxxx" is a unique sequence number. The ^ operator has no effect outside of a macro expansion. It is useful for passing label names as macro arguments to be used as local label names in the macro. Note that the circumflex is also used as the binary exclusive OR operator.

See also section 5.5.5.

Example:

Consider the following macro definition:

LOAD MACRO ADDR
     ADDR:
          MOV  R0, ADDR
     ^ADDR:
          MOV  R0, ^ADDR
     ENDM

If this macro were called as follows,

     LOAD _LOCAL

the resulting expansion as it would appear in the listing file would be:

_LOCAL:
     MOV  R0, _LOCAL
_LOCAL__M_L000001:
     MOV  R0, _LOCAL__M_L000001

"

Macro String Delimiter or
Quoted String DEFINE Expansion Character

Macro String

The double quote ("), when used in macro definitions, is transformed by the macro processor into the string delimiter, the single quote ('). The macro processor examines the characters between the double quotes for any macro arguments. This mechanism allows the use of macro arguments as literal strings.

See also section 5.5.4.

Example:

Using the following macro definition,

CSTR  MACRO     STRING
      ASCII     "STRING"
      ENDM

and a macro call,

      CSTR      ABCD

the resulting macro expansion would be:

      ASCII     'ABCD'

Quoted String DEFINE Expansion

A sequence of characters which matches a symbol created with a DEFINE directive will not be expanded if the character sequence is contained within a quoted string. Assembler strings generally are enclosed in single quotes ('). If the string is enclosed in double quotes (") then DEFINE symbols will be expanded within the string. In all other respects usage of double quotes is equivalent to that of single quotes.

Example:

Consider the source fragment below:

     DEFINE LONG  'short'
STR_MAC     MACRO STRING
     MSG    'This is a LONG STRING'
     MSG    "This is a LONG STRING"
     ENDM

If this macro were invoked as follows,

     STR_MAC  sentence

then the resulting expansion would be:

     MSG    'This is a LONG STRING'
     MSG    'This is a short sentence'

@

Function Delimiter

All assembler built-in functions start with the @ symbol. See section 4.5 for a full discussion of these functions.

Example:

SVAL EQU @ABS(VAL)     ; Obtain absolute value

$

Location Counter Substitution

When used as an operand in an expression, the dollar sign ($) represents the current integer value of the run-time location counter.

Example:

      CSEG  AT 100H
XBASE EQU   $+20H           ; XBASE = 120H

[]

Location Addressing Mode Operator

Square brackets are used to indicate to the assembler to use a location addressing mode.

Example:

MOV R0, [_Value]

#

Immediate Addressing Mode

The pound sign (#) is used to indicate to the assembler to use the immediate addressing mode.

Example:

CNST EQU  5H
     MOV  R0, #CNST           ;Load R0 with the value 5H

3.3 Sections

Sections group logical pieces of code or data. Each section has a memory type and optionally some properties. There are two types of sections: relocatable sections and absolute sections. The next paragraphs explain the use of sections in more detail.

3.3.1 Section Memory Types

The section memory type specifies the address space where the section will reside. For relocatable sections the memory type is specified with a type specifier following the SEGMENT directive. For absolute sections the memory type is implied by the directive that initiates the absolute section. Valid section memory types are:

Type specifier          Directive
(relocatable section)   (absolute section)   Description
BIT                     BSEG                 bit data space
CODE                    CSEG                 code space
DATA                    DSEG                 direct addressable data space
IDATA                   ISEG                 indirect addressable data space
XDATA                   XSEG                 external data space
BITADDR                 DBSEG                bitaddressable DATA (same as DATA BITADDRESSABLE)
HCODE                   HCSEG                huge code space
EDATA                   ESEG                 SmartXA EEPROM Data Memory for SmartXA only
HDATA                   HSEG                 huge indirect addressable data space
XSHORT                  XSSEG                first page of external (movx) data memory (same as XDATA SHORT)

Table 3-1: Section memory types

The first group of memory types listed above are the TASKING 8051 compatible memory spaces, the second group lists the extended XA memory spaces.

3.3.2 Section Attributes

The optional section attributes define the properties of the section. Depending on the memory type of the section an attribute is or is not allowed. Possible attributes are:

Section attribute  Allowed on            Description

BITADDRESSABLE     DATA                  Specifies a section to be relocated within the
                                         bit space on a byte boundary. The section size
                                         is limited to 32 bytes. DATA BITADDRESSABLE is
                                         equivalent to BITADDR.

SHORT              XDATA                 With this attribute, the locator allocates the
                                         section in the first 64K of external data
                                         memory.

PAGE               CODE, HCODE, XDATA,   Ignored, TASKING C51 compatibility
                   HDATA

INPAGE             CODE, XDATA           Ignored, TASKING C51 compatibility

INBLOCK            CODE                  Ignored, TASKING C51 compatibility

INSEGMENT          HCODE, HDATA          Specifies a section which must be contained in
                                         a 64K-byte page (a segment).

UNIT               all                   A default attribute: the section will not be
                                         aligned.

NOCLEAR            all                   Specifies that the section is not to be cleared
                                         at program startup. This is a default attribute.

CLEAR              DATA, IDATA, XDATA,   Specifies that the section is to be cleared at
                   XSHORT, HDATA, BIT,   program startup.
                   BITADDR

INIT               DATA, IDATA, XDATA,   Specifies that the section contains initialized
                   XSHORT, HDATA, BIT,   data. The initial data is copied from ROM to
                   BITADDR               RAM at program startup.

OVERLAY            DATA, EDATA, IDATA,   Specifies that the section may be overlaid by
                   XDATA, XSHORT,        another overlayable section. An overlayable
                   HDATA, BIT, BITADDR   section implicitly gets the NOCLEAR attribute.
                                         No overlaying will be done if the OVERLAY
                                         attribute is omitted.

ROMDATA            CODE, HCODE, DATA,    Specifies that the section contains initialized
                   IDATA, HDATA          data.

JOIN               all                   Group sections.

Table 3-2: Section attributes

For absolute sections only the attributes NOCLEAR, CLEAR, INIT and ROMDATA are allowed.

The section attributes can be divided into the following two groups:

From each group you can specify at most one attribute. The attributes UNIT and NOCLEAR are the default attributes. An OVERLAY attribute cannot be combined with a PAGE, INPAGE, INBLOCK or INSEGMENT attribute. A section with an OVERLAY attribute implicitly also has a NOCLEAR attribute. A section with a ROMDATA attribute implicitly also has an INIT attribute.

3.3.3 Absolute Sections

An absolute section directive switches to an absolute section. Of the section attributes described in the previous section, only NOCLEAR, CLEAR, INIT, INSEGMENT and ROMDATA are allowed on an absolute section. An absolute section can be declared with or without a name.

The AT attribute is also allowed on all absolute sections. The expression following 'AT' defines the start address of the absolute section. If no attributes are specified, the absolute section continues the last absolute section with the same memory type. If the AT attribute is not specified and the other attributes do not match those of the last absolute section with the same memory type, a new section is created starting at the first free address following that section. When an absolute section without the AT attribute is the first absolute section with that memory type, the section starts at the first valid address for that memory type, that is, zero for all sections.
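As a hedged sketch of these rules: the addresses and the labels First and Second are hypothetical, and the fragment assumes that an absolute section directive may be used without any attribute, as the text describes.

```asm
        CSEG  AT 100H          ; absolute code section starting at 100H
First:  MOV   R0, #1H
        XSEG  AT 200H          ; absolute external data section starting at 200H
        CSEG                   ; no attributes: continues the last absolute code section
Second: MOV   R0, #2H          ; placed directly after the instruction at First
```

Because the second CSEG carries no attributes, Second follows First in the same absolute code section rather than starting a new one.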

3.3.4 Relocatable Section Names

The assembler generates object files in relocatable IEEE-695 object format. The assembler groups units of code and data in the object file using sections. All relocatable information is related to the start address of a relocatable section. The locator assigns absolute addresses to sections. A section is the smallest unit of code or data that can be moved to a specific address in memory after assembling a source file.

A relocatable section must be declared before it can be used. The SEGMENT pseudo declares a section with its attributes. A section name can be any identifier. The '@' character is not allowed in regular section names. The assembler and linker use this character to create overlayable or joined sections. This is explained below.

You can group sections together with the JOIN attribute. For example, when several sections have to be located within the same data page, you can use this attribute.

A section becomes overlayable by specifying the OVERLAY attribute. Because it is useless to initialize overlaid sections at program startup time (code using overlaid data cannot assume that the data is in the defined state upon first use), the NOCLEAR attribute is defined implicitly when OVERLAY is specified. Overlayable section names are composed as follows:

The linker overlays sections with the same pool name. To decide whether sections can be overlaid, the linker builds a call graph. Data in sections belonging to functions that call each other cannot be overlaid. The compiler generates pseudo instructions (CALLS) with information for the linker to build this call graph. The CALLS pseudo has the following syntax:

If the function main() has overlayable data allocations in the zero page and calls nfunc(), the following sections and call information will be generated:

If a section declaration contains the OVERLAY attribute and the section name does not contain exactly one '@' character, the assembler will report an error.

3.3.5 Grouped Sections

When you have to group sections together, you can use the JOIN section attribute. For example, when two data sections have to be located within the same range, you can write this as follows:

and for the second section:

Note that sections are grouped by the extension used in the section name. So, the definition is:

Combining the JOIN and OVERLAY attributes gives the following result:

3.4 Registers

The XA architecture permits bit, byte, word and, in a few cases, double word access to the registers.

A register name convention is introduced to enable the assembler to use generic instructions. The assembler deduces a hardware instruction from the generic mnemonic by interpreting the size of the operands. So if register addressing is used the register name should indicate the register's size.

A word register is composed of two byte registers. The low order byte of a word register is identified with an 'L' postfix, e.g., R0L. The high order byte of a word register is identified with an 'H' postfix, e.g., R0H. Registers R0..R7 are byte addressable.

Word registers are named R0..R15. The assembler, like the baseline XA core, supports only registers R0..R7; registers R8..R15 are not implemented.

Double word registers are composed of two adjacent registers. Valid combinations are R1:R0, R3:R2, R5:R4 or R7:R6. A double word register is referenced by adding the postfix 'D' to the low order register, e.g., R0D.

Examples:
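A hedged sketch of the naming convention, assuming that the generic MOV mnemonic accepts each of these operand forms. The constant 5H is arbitrary.

```asm
     MOV  R0L, #5H      ; byte access: low order byte of word register R0
     MOV  R0H, #5H      ; byte access: high order byte of word register R0
     MOV  R0,  #5H      ; word access: the whole word register R0
                        ; R0D would name the double word register R1:R0
```

In each case the register name, not the mnemonic, tells the assembler which operand size to use.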

There are four different instances of registers R0 through R3. Of these four banks only one bank can be active at any given time, referenced as R0 through R3. The contents of the other banks are inaccessible. PSW bits RS1 and RS0 select the active register bank:

RS1   RS0   register bank
0     0     bank 0
0     1     bank 1
1     0     bank 2
1     1     bank 3

Table 3-3: Register bank selection

3.4.1 Special Function Registers

The XA SFRs are not mapped in the address space as with data memory. SFRs have control functions associated with them. For example, an SFR could control and/or provide status information of an on-chip peripheral. All SFRs reside in a 64 byte region starting at location 400H. The SFR space is both byteaddressable and bitaddressable.


Copyright © 2000 TASKING, Inc.