K IEEE-695 OBJECT FORMAT

This appendix contains the following sections:

TIOF and IEEE-695
Command Language Concept
Notational Conventions
Expressions
Functions without Operands
Monadic Functions
Dyadic Functions and Operators
MUFOM Variables
@INS and @EXT Operator
Conditional Expressions
MUFOM Commands
Module Level Commands
MB Command
ME Command
DT Command
AD Command
Comment and Checksum Command
Sections
SB Command
ST Command
SA Command
Symbolic Name Declaration and Type Definition
NI Command
NX Command
NN Command
AT Command
TY Command
Value Assignment
AS Command
Loading Commands
LD Command
IR Command
LR Command
RE Command
Linkage Commands
RI Command
WX Command
LI Command
LX Command
MUFOM Functions

1 TIOF and IEEE-695

The IEEE-695 standard describes MUFOM: Microprocessor Universal Format for Object Modules. It defines a target-independent storage standard for object files. However, the standard does not describe how symbolic debug information should be encoded. Symbolic debug information can be part of an object file. A debugger that reads an object file uses the symbolic debug information to relate the executable code to the original high-level language source files. Because the IEEE-695 standard does not describe the representation of debug information, working implementations of the standard show vendor-specific and microprocessor-specific solutions in this area.

TIOF, which stands for Target Independent Object Format, is specified as a MUFOM-based standard that includes the representation of symbolic debug information for high-level languages, without introducing microprocessor-dependent solutions. The current version of the TASKING debugger is not yet prepared to read TIOF, so you have to select IEEE-695 as the output format of the locator when you want to debug a program.

Since TIOF and IEEE-695 both use the MUFOM concept as their basis, the two formats are very similar.

2 Command Language Concept

Most object formats are record oriented: there are one or more section headers at a fixed position in the file which describe how many sections are present. A section header contains information such as start address, file offset, etc. The contents of a section are in some data part, which can only be processed after the header has been read. So a tool that reads such an object file relies on implicit assumptions about how to process it. Seeking through the file to get the relevant records is usual.

MUFOM (IEEE-695) uses a different approach. It is designed as a command language which steers the linker, the locator and the object reader in the debugger.

An assembler or compiler may create an object module where most of the data contained in it is relocatable. The next phase in the translation process is linking several object modules into one new object module. A relocatable object uses relocation expressions at places where the absolute values are not yet known. An expression evaluator in the locator transforms the relocation expressions into absolute values.

Finally the object is ready for loading into memory. Since an object file is transformed by several processes, MUFOM implements an object file as a sequence of commands which steers this transformation process.

These commands are created, executed or copied by one of five processes which act on a MUFOM object file:

1. Creation process
Creation of the object file by an assembler or compiler. The assembler or compiler tells other MUFOM processes what to do, by emitting commands generated from assembly source text or a high-level language.

2. Linkage process
Linking of several object modules into one module resolving external references by renaming X variables into I variables, and by generating new commands (assigning of R variables).

3. Relocation process
Relocation, giving all sections an absolute address by assigning their L variable.

4. Expression evaluation process
Evaluation of loader expressions, generated in one of the three previously mentioned MUFOM processes.

5. Loader process
Loading the absolute memory image.

The last four processes are in fact command interpreters: the assembler writes an object file which is basically a large sequence of instructions for the linker. For example, instead of writing the contents of a section as a sequence of bytes at a specific position in the file, IEEE-695 defines a load command, LR, which instructs the linker to load a number of bytes. The LR command specifies the number of MAUs (minimum addressable unit) that will be relocated, followed by the actual data. This data can be a number of absolute bytes, or an expression which must be evaluated by the linker.

Transforming relocation expressions into new expressions or absolute data and combining sections is the actual linkage process.

It is possible that one or more of the above MUFOM processes are combined in one tool. For instance, the locator is built from process 3 and process 4 above.

3 Notational Conventions

The following conventions are used in this appendix:

| select one of the items listed between '|'

" " literal characters are between " "

[ ]+ item repeats one time or more

[ ]? optional item repeats zero times or one time

[ ]* optional item repeats zero times or more

::= can be read as "is defined as"

4 Expressions

An expression in an IEEE-695 file or a TIOF file is a combination of variables, operators and absolute data.

The variable name always starts with a non-hexadecimal letter (G...Z), immediately followed by an optional hexadecimal number. The first non-hexadecimal letter gives the class of the variable. Reading an object file you encounter the following variables:

G - Start address of a program. If not assigned this address defaults to the address of low-level symbol _start.

I - An I variable represents a global symbol in an object module.

L - Start address of a section. This variable is only used for absolute sections. The 'L' is followed by a section index, which is a hexadecimal number. L variables are created by an assignment command, but the section index must have been defined by an ST command.

N - Name of internal symbol. This variable is used to assign values of local symbols, to build complex types for use by a high-level language debugger, or for inter-modular type checking during linkage. The N variable is created with an NN command.

P - Program pointer per section. This variable always contains the current address of the target memory location. The P variable is followed by a section index, which is a hexadecimal number. The section index must have been defined with an ST command (section type command). The variable is created after its first assignment.

R - The R type variable is a relocation reference for a particular section. All references to addresses in this section must be made relative to the R variable. Linking is accomplished by assigning a new value to R. The R variable consists of the letter 'R', followed by a section index, which is a hexadecimal number. The section index must have been defined with an ST command. The default value of an (unassigned) R variable is 0.

S - The S type variable is the section size (in MAUs) for a section. There is one S variable per section. The 'S' is followed by a section index. An S variable is created by its first assignment.

W - Work variable. A work variable holds a value that can be used in following MUFOM commands. Work variables maintain values in a workspace without any additional meaning. A work variable consists of the letter 'W' followed by a hexadecimal number. W variables are created by their first assignment.

X - An X type variable refers to an external reference. An X variable cannot have a value assigned to it. An X variable consists of the letter 'X' followed by a hexadecimal number.
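
The naming rule above (one non-hexadecimal class letter, optionally followed by a hexadecimal number) can be sketched as a small parser. This is an illustration only: the function name parse_variable and its error handling are assumptions, not part of the standard.

```python
NONHEX_LETTERS = "GHIJKLMNOPQRSTUVWXYZ"
HEX_DIGITS = "0123456789ABCDEF"


def parse_variable(name):
    """Split a MUFOM variable such as 'R2' into (class letter, index or None).

    Hypothetical helper: the class letter must be a non-hexadecimal letter,
    and the optional index is read as a hexadecimal number.
    """
    if not name or name[0] not in NONHEX_LETTERS:
        raise ValueError(f"not a MUFOM variable: {name!r}")
    digits = name[1:]
    if any(ch not in HEX_DIGITS for ch in digits):
        raise ValueError(f"bad hexadecimal index in {name!r}")
    return name[0], (int(digits, 16) if digits else None)
```

For example, parse_variable("P1F") yields the class letter 'P' and section index 31, while parse_variable("G") yields 'G' with no index.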

The MUFOM language uses the following data types to form expressions:

digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

hex_letter ::= "A" | "B" | "C" | "D" | "E" | "F"

hex_digit ::= digit | hex_letter

hex_number ::= [ hex_digit ]+

nonhex_letter ::= "G" | "H" | "I" | "J" | "K" | "L" | "M" | "N" |
"O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V" |
"W" | "X" | "Y" | "Z"

letter ::= hex_letter | nonhex_letter

alpha_num ::= letter | digit

identifier ::= letter [ alpha_num ]*

character ::= 'value valid within chosen character set'

char_string_length ::= hex_digit hex_digit

char_string ::= char_string_length [ character ]*

The numeric value specified in 'char_string_length' should be followed by an equal number of characters.
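
A minimal sketch of reading a char_string in the textual notation used here, assuming the two-digit length prefix is hexadecimal as the grammar states. The helper name parse_char_string is hypothetical, and the binary encoding in an actual object file is not covered by this sketch.

```python
def parse_char_string(text):
    """Return (string, remaining text) for a char_string at the start of text.

    Assumes the textual form: two hexadecimal digits giving the length,
    followed by exactly that many characters.
    """
    if len(text) < 2:
        raise ValueError("missing char_string_length")
    length = int(text[:2], 16)
    body = text[2:2 + length]
    if len(body) != length:
        raise ValueError("char_string shorter than its length prefix")
    return body, text[2 + length:]
```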

Expressions may be formed out of immediate numbers and MUFOM variables. The MUFOM processes 2 to 4, which form the linker and the locator, contain expression evaluators which parse and calculate the values for the expressions. If a MUFOM process cannot calculate the absolute value of an expression, because the values of the variables are not yet known, it copies the expression (with modifications) into the output file.

Expressions are coded in reverse Polish notation. (The operator follows the operands.)

expression ::= boolean_function |
one_operand_function |
two_operand_function |
three_operand_function |
four_operand_function |
conditional_expr | hex_number | MUFOM_variable
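
The reverse Polish evaluation can be sketched with a stack. The sketch assumes that the elements of an expression are separated by commas in the textual notation (as in the conditional expressions of this appendix) and that numbers are hexadecimal; the names eval_rpn and trunc_div are hypothetical. Only a few of the functions and operators described in the following subsections are included.

```python
def trunc_div(a, b):
    """Integer division that rounds toward zero; undefined for a zero divisor."""
    if b == 0:
        raise ZeroDivisionError("result undefined when operand2 is zero")
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


MONADIC = {"@ABS": abs, "@NEG": lambda a: -a}
DYADIC = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": trunc_div,
    "@MAX": max,
    "@MIN": min,
}


def eval_rpn(text, variables=None):
    """Evaluate a comma-separated reverse Polish expression (assumed syntax)."""
    variables = variables or {}
    stack = []
    for token in text.split(","):
        if token in MONADIC:
            stack.append(MONADIC[token](stack.pop()))
        elif token in DYADIC:
            operand2 = stack.pop()      # the last operand pushed is operand2
            operand1 = stack.pop()
            stack.append(DYADIC[token](operand1, operand2))
        elif token in variables:
            stack.append(variables[token])
        else:
            stack.append(int(token, 16))  # immediate hexadecimal number
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]
```

With R2 assigned the value 0x4000, eval_rpn("R2,100,+", {"R2": 0x4000}) gives 0x4100. A real MUFOM process would copy an expression with unknown variables to its output instead of raising an error, which this sketch does not attempt.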

4.1 Functions without Operands

@F : false function

@T : true function

The false and true functions produce the boolean result false or true, which may be used in logical expressions. Neither function has operands.

4.2 Monadic Functions

Monadic functions have one operand which precedes the function.

@ABS: returns the absolute value of an integer operand

@NEG: returns the negative value of an integer operand

@NOT: returns the negation of a boolean operand or the one's complement value if the operand is an integer

@ISDEF: returns logical true if all variables in an expression are defined, and false otherwise.

4.3 Dyadic Functions and Operators

Dyadic functions and operators have two operands which precede the operator or function.

@AND: returns the boolean true/false result of the logical 'and' operation when both operands are logical values. When neither operand is a logical value, the bitwise and is performed.

@MAX: compares both operands arithmetically and returns the largest value.

@MIN: compares both operands arithmetically and returns the smallest value.

@MOD: returns the modulo result of the division of operand1 by operand2. The result is undefined if either operand is negative, or if operand2 is zero.

@OR: returns the boolean true/false result of the logical 'or' operation when both operands are logical values. When neither operand is a logical value, the bitwise or is performed.

+, -, *, /: These are the arithmetic operators for addition, subtraction, multiplication and division. The result is an integer. For division the result is undefined if operand2 equals zero. The result of a division rounds toward zero.

<, >, =, #: These are operators for the following logical relations: 'less than', 'greater than', 'equals', 'is unequal'. The result is true or false.

4.4 MUFOM Variables

The meaning of the MUFOM variable is explained in section 4. The following syntax rules apply for the MUFOM variables:

4.5 @INS and @EXT Operator

The @INS operator inserts a bit string.

operand2 is inserted in operand1 starting at position operand3, and ending at position operand4.

The @EXT operator extracts a bit string.

A bit string is extracted from operand1 starting at position operand2 and ending at position operand3.
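
The two operators can be sketched with shifts and masks. The text above does not state how bit positions are numbered, so this sketch assumes that position 0 is the least significant bit and that the start and end positions are both inclusive; the function names are hypothetical.

```python
def ext_bits(operand1, start, end):
    """Extract bits start..end of operand1 (assumed: bit 0 = LSB, inclusive)."""
    width = end - start + 1
    if start < 0 or width <= 0:
        raise ValueError("invalid bit range")
    return (operand1 >> start) & ((1 << width) - 1)


def ins_bits(operand1, operand2, start, end):
    """Insert operand2 into bits start..end of operand1 (same assumptions)."""
    width = end - start + 1
    if start < 0 or width <= 0:
        raise ValueError("invalid bit range")
    mask = ((1 << width) - 1) << start
    # Clear the target field, then place the (truncated) new value in it.
    return (operand1 & ~mask) | ((operand2 << start) & mask)
```

Under these assumptions, extracting bits 4 to 11 of 0xABCD gives 0xBC, and inserting 0x12 into the same field gives 0xA12D.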

4.6 Conditional Expressions

conditional_expr ::= err_expr | if_else_expr

err_expr ::= value "," condition "," err_num "," "@ERR"

value ::= expression

condition ::= expression

err_num ::= expression

if_else_expr ::= condition "," "@IF" "," expression ","
"@ELSE" "," expression "," "@END"

5 MUFOM Commands

5.1 Module Level Commands

At module level there are four commands: one command to start and one to end a module, one command to set the date and time of creation of the module, and one command to specify address formats.

5.1.1 MB Command

The MB command is the first command in a module. It specifies the target machine configuration and an optional command with the module name.

Example: MB XA.

5.1.2 ME Command

The module end command is the last command in an object file. It defines the end of the object module.

5.1.3 DT Command

The DT command sets the date and time of creation of an object module.

Example: DT19930120120432.

The format of display of the date and time is "YYYYMMDDHHMMSS":

4 digits for the year, 2 digits for the month, 2 digits for the day, 2 digits for the hour, 2 digits for the minutes and 2 digits for the seconds.
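
A short sketch of decoding the textual DT command with the Python standard library. The function name parse_dt is hypothetical; the format string follows the "YYYYMMDDHHMMSS" layout described above.

```python
from datetime import datetime


def parse_dt(command):
    """Parse 'DT' + YYYYMMDDHHMMSS + '.' into a datetime object."""
    if not (command.startswith("DT") and command.endswith(".")):
        raise ValueError(f"not a DT command: {command!r}")
    return datetime.strptime(command[2:-1], "%Y%m%d%H%M%S")
```

Applied to the example above, parse_dt("DT19930120120432.") yields 20 January 1993, 12:04:32.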

5.1.4 AD Command

The AD command specifies the address format of the target execution environment.

AD_command ::= "AD" bits_per_MAU [ "," MAU_per_address
[ "," order ]? ]?

MAU_per_address ::= hex_number

bits_per_MAU ::= hex_number

order ::= "L" | "M"

MAU stands for minimum addressable unit. This is target processor dependent.

L means least significant byte at lowest address (little endian)
M means most significant byte at lowest address (big endian)

Example:

Specifies a 3-byte addressable 8-bit processor running in little endian mode.
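
The AD grammar can be sketched as a parser for the textual form. The fields are read as hexadecimal numbers, as the grammar states. The grammar shown above has no terminating period, so accepting an optional trailing "." is an assumption of this sketch, as is the function name parse_ad.

```python
def parse_ad(command):
    """Parse a textual AD command into (bits_per_mau, mau_per_address, order).

    Optional fields that are absent are returned as None.
    """
    body = command[:-1] if command.endswith(".") else command  # assumed optional "."
    if not body.startswith("AD"):
        raise ValueError(f"not an AD command: {command!r}")
    fields = body[2:].split(",")
    if not 1 <= len(fields) <= 3:
        raise ValueError("AD takes one to three fields")
    bits_per_mau = int(fields[0], 16)
    mau_per_address = int(fields[1], 16) if len(fields) > 1 else None
    order = fields[2] if len(fields) > 2 else None
    if order not in (None, "L", "M"):
        raise ValueError("order must be 'L' or 'M'")
    return bits_per_mau, mau_per_address, order
```

A string such as "AD8,3,L" built from the grammar is parsed as 8 bits per MAU, 3 MAUs per address, little endian.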

5.2 Comment and Checksum Command

The comment command makes it possible to store information in an object module about the object module itself and the translators that created it. The comment may be used to record the file name of the source file of the object module or the version number of the translator that created it. Because the standard supports several layers, each of which has its own revision number, an object module may contain several comment commands which specify which revision of the standard has been used to create the module. The contents of a comment are not prescribed by the standard, so it is implementation defined how a MUFOM process handles a comment command.

The comment levels 0 - 6 are reserved to pass information about the revision number of the layers in this standard.

The checksum command starts and checks the checksum calculation of an object module.

5.3 Sections

A section is the smallest unit of code or data that can be controlled separately. Each section has a unique number which is introduced at the first section begin (SB) command. The contents of a section may follow its introduction. A section ends at the next SB command with a number different from the current number. A section resumes at an SB command with a number that has been introduced before.

5.3.1 SB Command

The maximum number of sections in an object module is implementation defined.

5.3.2 ST Command

The ST command specifies the type of a section.

ST_command ::= "ST" section_number [ "," section_type ]*
[ "," section_name ]? "."

section_type ::= letter

section_name ::= char_string

A section can be named or unnamed. If section_name is omitted a section is unnamed. A section can be relocatable or absolute. If the section start address is an absolute number the section is called absolute. If the section start address is not yet known, the section is called relocatable. In relocatable sections all addresses are specified relative to the relocation base of that section. The relocation phase of the linker or locator may map the relocation base of a section onto a fixed address.

During linkage edition the section name and the section attributes identify a section and thus the actions to be taken. If a section is defined in several modules, the linkage editor must determine how to act on sections with the same name. This can be either one of the following strategies:

A section type gives additional information to the linkage editor about the section, which may be used to layout a section in memory. Section type information is encoded with letters, which may be combined in one ST command. Some combinations of letters are invalid or may be meaningless.

Letter  Meaning          Class    Explanation
A       absolute         access   section has absolute address assigned to corresponding L-variable
R       read only        access   no write access to this section
W       writable         access   section may be read and written
X       executable       access   section contains executable code
Z       zero page        access   if target has zero page or short addressable page, Z-sections map into it
Ynum    addressing mode  access   section must be located in addressing mode num
B       blank            access   section must be initialized to '0' (cleared)
F       not filled       access   section is not filled or cleared (scratch)
I       initialize       access   section must be initialized in ROM
E       equal            overlap  if sections in two modules have different lengths, an error must be raised
M       max              overlap  use largest value as section size
U       unique           overlap  the section name must be unique
C       cumulative       overlap  concatenate sections if they appear in several modules; the section alignment for partial sections must be preserved
O       overlay          overlap  sections with the name name@func must be combined to one section name, according to the rules for func obtained from the call graph
S       separate         overlap  multiple sections can have the same name and they may be relocated at unrelated addresses
N       now              when     section is located before normal sections (without N or P)
P       postpone         when     section is located after normal sections (without N or P)

Table K-1: Section types
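
As an illustration, the letters of Table K-1 can be kept in a lookup table so that the section types of an ST command can be described. The names SECTION_TYPES and describe_section_types are hypothetical, and the sketch does not check for invalid or meaningless combinations.

```python
# Letter -> (meaning, class), transcribed from Table K-1.
SECTION_TYPES = {
    "A": ("absolute", "access"),
    "R": ("read only", "access"),
    "W": ("writable", "access"),
    "X": ("executable", "access"),
    "Z": ("zero page", "access"),
    "B": ("blank", "access"),
    "F": ("not filled", "access"),
    "I": ("initialize", "access"),
    "E": ("equal", "overlap"),
    "M": ("max", "overlap"),
    "U": ("unique", "overlap"),
    "C": ("cumulative", "overlap"),
    "O": ("overlay", "overlap"),
    "S": ("separate", "overlap"),
    "N": ("now", "when"),
    "P": ("postpone", "when"),
}


def describe_section_types(items):
    """Map section type items such as ['X', 'R', 'Y2'] to (meaning, class) pairs."""
    result = []
    for item in items:
        if item.startswith("Y") and len(item) > 1:
            # 'Ynum': the section must be located in addressing mode num.
            result.append(("addressing mode " + item[1:], "access"))
        else:
            result.append(SECTION_TYPES[item])
    return result
```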

5.3.3 SA Command

SA_command ::= "SA" section_number "," [ MAU_boundary ]?
[ "," page_size ]? "."

MAU_boundary ::= expression

page_size ::= expression

The MAU boundary value forces the relocator to align a section on the number of MAUs specified. If page_size is present the relocator checks that the section does not exceed a page boundary limit when it is relocated.
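
The two checks can be sketched as follows. The sketch assumes that "aligning on the number of MAUs" means rounding the start address up to a multiple of MAU_boundary, and that the page check means the whole section must lie within one page of page_size MAUs; both readings, and the function names, are assumptions.

```python
def align_up(address, mau_boundary):
    """Round address up to the next multiple of mau_boundary (in MAUs)."""
    if mau_boundary <= 0:
        raise ValueError("MAU boundary must be positive")
    return -(-address // mau_boundary) * mau_boundary


def fits_in_page(start, size, page_size):
    """True if the section [start, start + size) stays within a single page."""
    if page_size <= 0:
        raise ValueError("page size must be positive")
    if size == 0:
        return True
    return start // page_size == (start + size - 1) // page_size
```

For example, a section of 0x10 MAUs placed at 0xF0 fits in a page of 0x100 MAUs, but a section of 0x11 MAUs at the same address crosses the page boundary.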

5.4 Symbolic Name Declaration and Type Definition

5.4.1 NI Command

The NI command defines an internal symbol. An internal symbol is visible outside the module. Thus it may resolve an undefined external in another module.

The NI_command must precede any reference to the I_variable in a module. There may not be more than one I_variable with the same name or number.

5.4.2 NX Command

The NX command defines an external symbol which is undefined in the current module. The NX command must precede all occurrences of the corresponding X variable.

The unresolved reference corresponding to an NX command can be resolved by an internal symbol definition (NI_command) in another module.

5.4.3 NN Command

The NN command defines a local name which may be used for defining a name of a local symbol in a module or a name in a type definition.

A name defined with an NN command is not visible outside the scope of the module. The NN command must precede all occurrences of the corresponding N variable.

5.4.4 AT Command

The attribute command may be used to define debugging related information of a symbol, such as the symbol type number. Level 2 of the standard does not prescribe the contents of the optional fields of the AT command. The language dependent layer (level 3) describes how these fields can be used to pass high-level symbol information with the AT command.

AT_command ::= "AT" variable "," type_table_entry [ "," lex_level
[ "," hex_number ]* ]? "."

variable ::= I_variable | N_variable | X_variable

type_table_entry ::= hex_number

lex_level ::= hex_number

The type_table_entry is a type number introduced with a type command (TY). References to type numbers in the AT command may precede the definition of the type in the TY command.

The meaning of the lex_level field is defined at layer 3 or higher. The same applies to the optional hex_number fields.

5.4.5 TY Command

The TY-command defines a new type table entry. The type number introduced by the type command can be seen as a reference index to this type. The TY-command defines the relation between the newly introduced type and other types that are defined in other places in the object module. It also establishes a relation between a new type index and symbols (N_variable).

TY_command ::= "TY" type_table_entry [ "," parameter ]+ "."

type_table_entry ::= hex_number

parameter ::= hex_number | N_variable | "T" type_table_entry

Level 2 does not define the semantics of the parameters. These are defined at level 3, the language layer. A linkage editor which does not have knowledge of the semantics of the parameter in a type command can still perform type comparison: Two types are considered to compare equal when the following conditions hold:

Variable N0 is supposed to compare equal to any other name.

Type table entry T0 is supposed to compare equal to any other type.

5.5 Value Assignment

5.5.1 AS Command

The assignment command assigns a value to a variable.

5.6 Loading Commands

The contents of a section are either absolute data (code) or relocatable data (code). Absolute data can be loaded with the LD command. The address where loading takes place depends on the value of the P variable belonging to the section. Data which is contiguous in an LD command is supposed to be loaded contiguously in memory.

If data is not absolute it contains expressions which must be evaluated by the expression evaluator. The LR command allows a relocation expression to be part of the loading command.

5.6.1 LD Command

The constants loaded with the LD command are loaded with the most significant part first.

5.6.2 IR Command

A relocation base is an expression which can be associated with a relocation letter. This relocation letter can be used in subsequent load relocate commands.

IR_command ::= "IR" relocation_letter "," relocation_base
[ "," number_of_bits ]? "."

relocation_letter ::= nonhex_letter

relocation_base ::= expression

number_of_bits ::= expression

Example:

The number_of_bits must be less than or equal to the number of bits per address, which is the product of the number of MAUs per address and the number of bits per MAU, both of which are specified in the AD command. If the number_of_bits is not specified it equals the number of bits per address.
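
The rule for number_of_bits can be written out as a small check. The function and parameter names are hypothetical; the arithmetic follows the paragraph above.

```python
def effective_number_of_bits(bits_per_mau, mau_per_address, number_of_bits=None):
    """Return the number of bits used for an IR relocation base.

    bits_per_mau and mau_per_address are the values given in the AD command.
    """
    bits_per_address = bits_per_mau * mau_per_address
    if number_of_bits is None:
        return bits_per_address          # default: the full address width
    if number_of_bits > bits_per_address:
        raise ValueError("number_of_bits exceeds the number of bits per address")
    return number_of_bits
```

With 8 bits per MAU and 3 MAUs per address, an omitted number_of_bits defaults to 24, and a value of 32 is rejected.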

5.6.3 LR Command

LR_command ::= "LR" [ load_item ]+ "."

load_item ::= relocation_letter offset "," | load_constant |
"(" expression [ "," number_of_MAUs ]? ")"

load_constant ::= [ hex_digit ]+

number_of_MAUs ::= expression

Examples:

The first example shows immediate constants which may be loaded as a part of an LR command.

The second example shows the use of the relocation base defined in the previous paragraph, followed by a constant.

The third example shows how the value of the expression R2 + 100 is used to load 4 MAUs.

The three commands in this example may be combined into one LR command:

5.6.4 RE Command

The replicate command defines the number of times an LR command must be replicated:

The LR command must immediately follow the RE command.

Example:

The commands above load 16 MAUs: 4 times the 4 MAU value of R2 + 200.

5.7 Linkage Commands

5.7.1 RI Command

The retain internal symbol command indicates that the symbolic information of an NI command must be retained in the output file.

5.7.2 WX Command

The weak external command flags a previously defined external (NX_command) as weak. This means that if the external remains unresolved, the value of the expression in the WX command is assigned to the X variable.

5.7.3 LI Command

The LI command specifies a default library search list. The library names specified in the LI_command are searched for unresolved references.

5.7.4 LX Command

The LX command specifies a library to search for a named unresolved variable.

The paragraphs above showed the commands and operators as ASCII strings. In an object file they are binary encoded. The following tables show the binary representation.

6 MUFOM Functions

The following table lists the first byte of MUFOM elements. Each value between 0 and 255 classifies the MUFOM language element that follows, or it is a language element itself. For example, numbers outside the range 0 - 127 are preceded by a length field: 0x82 specifies that a 2-byte integer follows. 0xE4 is the function code for the LR command.

Overview of first byte of MUFOM language elements

Value Description
0x00 - 0x7F Start of regular string, or one byte numbers ranging from 0 - 127
0x80 Code for omitted optional number field
0x81 - 0x88 Numbers outside the range 0 - 127
0x89 - 0x8F Unused
0x90 - 0x9F User defined function codes
0xA0 - 0xBF MUFOM function codes
0xC0 Unused
0xC1 - 0xDA MUFOM letters
0xDB - 0xDF Unused
0xE0 - 0xF9 MUFOM commands
0xFA - 0xFF Unused

Table K-2: Overview of first byte of MUFOM language elements
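
Table K-2 can be sketched as a range lookup, together with a decoder for numbers. Two assumptions are made and marked in the code: the user defined range is taken to end at 0x9F, because 0xA0 is the code of the @F function, and the bytes of a multi-byte number are taken to be stored most significant byte first. The function names are hypothetical.

```python
FIRST_BYTE_RANGES = [
    (0x00, 0x7F, "string start or one-byte number"),
    (0x80, 0x80, "omitted optional number field"),
    (0x81, 0x88, "number outside the range 0 - 127"),
    (0x89, 0x8F, "unused"),
    (0x90, 0x9F, "user defined function code"),   # assumption: 0xA0 is @F
    (0xA0, 0xBF, "MUFOM function code"),
    (0xC0, 0xC0, "unused"),
    (0xC1, 0xDA, "MUFOM letter"),
    (0xDB, 0xDF, "unused"),
    (0xE0, 0xF9, "MUFOM command"),
    (0xFA, 0xFF, "unused"),
]


def classify_first_byte(value):
    """Return the Table K-2 description for a first byte (0 - 255)."""
    for low, high, description in FIRST_BYTE_RANGES:
        if low <= value <= high:
            return description
    raise ValueError("first byte must be in the range 0 - 255")


def decode_number(data, pos=0):
    """Decode a number at data[pos]; return (value or None, next position)."""
    first = data[pos]
    if first <= 0x7F:
        return first, pos + 1                     # one-byte number
    if first == 0x80:
        return None, pos + 1                      # omitted optional field
    if 0x81 <= first <= 0x88:
        length = first - 0x80                     # e.g. 0x82: 2 bytes follow
        raw = data[pos + 1:pos + 1 + length]
        if len(raw) != length:
            raise ValueError("truncated number")
        # Assumption: most significant byte first.
        return int.from_bytes(raw, "big"), pos + 1 + length
    raise ValueError("not the start of a number")
```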

Binary encoding of MUFOM letters and function codes

Function  Code    Letter  Code
@F        0xA0
@T        0xA1    A       0xC1
@ABS      0xA2    B       0xC2
@NEG      0xA3    C       0xC3
@NOT      0xA4    D       0xC4
+         0xA5    E       0xC5
-         0xA6    F       0xC6
/         0xA7    G       0xC7
*         0xA8    H       0xC8
@MAX      0xA9    I       0xC9
@MIN      0xAA    J       0xCA
@MOD      0xAB    K       0xCB
<         0xAC    L       0xCC
>         0xAD    M       0xCD
=         0xAE    N       0xCE
!= <>     0xAF    O       0xCF
@AND      0xB0    P       0xD0
@OR       0xB1    Q       0xD1
@XOR      0xB2    R       0xD2
@EXT      0xB3    S       0xD3
@INS      0xB4    T       0xD4
@ERR      0xB5    U       0xD5
@IF       0xB6    V       0xD6
@ELSE     0xB7    W       0xD7
@END      0xB8    X       0xD8
@ISDEF    0xB9    Y       0xD9
                  Z       0xDA

Table K-3: Binary encoding of MUFOM letters and function codes
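
The letter codes in Table K-3 are consecutive, so 'A' to 'Z' map onto 0xC1 to 0xDA. A minimal sketch of the conversion in both directions, with hypothetical function names:

```python
def encode_letter(letter):
    """Return the binary code of a MUFOM letter ('A' -> 0xC1 ... 'Z' -> 0xDA)."""
    if len(letter) != 1 or not "A" <= letter <= "Z":
        raise ValueError(f"not a MUFOM letter: {letter!r}")
    return 0xC1 + ord(letter) - ord("A")


def decode_letter(code):
    """Return the MUFOM letter for a code in the range 0xC1 - 0xDA."""
    if not 0xC1 <= code <= 0xDA:
        raise ValueError(f"not a letter code: {code:#x}")
    return chr(code - 0xC1 + ord("A"))
```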

MUFOM Command codes

Command Code Description
MB 0xE0 Module begin
ME 0xE1 Module end
AS 0xE2 Assign
IR 0xE3 Initialize relocation base
LR 0xE4 Load with relocation
SB 0xE5 Section begin
ST 0xE6 Section type
SA 0xE7 Section alignment
NI 0xE8 Internal name
NX 0xE9 External name
CO 0xEA Comment
DT 0xEB Date and time
AD 0xEC Address description
LD 0xED Load
CS (with sum) 0xEE Checksum followed by sum value
CS 0xEF Checksum (reset sum to 0)
NN 0xF0 Name
AT 0xF1 Attribute
TY 0xF2 Type
RI 0xF3 Retain internal symbol
WX 0xF4 Weak external
LI 0xF5 Library search list
LX 0xF6 Library external
RE 0xF7 Replicate
SC 0xF8 Scope definition
LN 0xF9 Line number
0xFA Undefined
0xFB Undefined
0xFC Undefined
0xFD Undefined
0xFE Undefined
0xFF Undefined

Table K-4: MUFOM Command codes
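
Because the command codes of Table K-4 are consecutive from 0xE0, an object reader could map a command byte to its mnemonic with a simple table. The names below are hypothetical; both checksum variants are listed as "CS", as in the table.

```python
# Mnemonics in code order, starting at 0xE0 (transcribed from Table K-4).
COMMAND_NAMES = [
    "MB", "ME", "AS", "IR", "LR", "SB", "ST", "SA",
    "NI", "NX", "CO", "DT", "AD", "LD", "CS", "CS",
    "NN", "AT", "TY", "RI", "WX", "LI", "LX", "RE",
    "SC", "LN",
]
COMMAND_CODES = {0xE0 + i: name for i, name in enumerate(COMMAND_NAMES)}


def command_name(code):
    """Return the mnemonic for a command byte, or None if it is undefined."""
    return COMMAND_CODES.get(code)
```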


Copyright © 2000 TASKING, Inc.