Assemblers (3)
Machine-Independent Assembler Features
• There are some common assembler features that are not closely related to machine architecture
– The presence or absence of such features is much more closely related to issues such as programmer convenience and software environment than it is to machine architecture
• Five such features are introduced in our textbook (Chapter 2.3)
Literals
- So that a constant value can be written directly where it is used!
• It is often convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it.
– This avoids having to define the constant elsewhere in the program and make up a label for it.
– Such an operand is called a “literal” because the value is stated literally in the instruction.
• A literal is identified with the prefix “=”, which is followed by a specification of the literal value, using the same notation as in the BYTE statement.
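The notation can be illustrated with a small sketch that turns a literal operand into the bytes the assembler would store. The function name `literal_bytes` is hypothetical, and only the C'...' and X'...' forms of the BYTE notation are handled here.

```python
def literal_bytes(literal: str) -> bytes:
    """Return the data bytes for a literal such as =C'EOF' or =X'05'.

    Hypothetical helper for illustration; only C'...' and X'...' are supported.
    """
    if len(literal) < 4 or not literal.startswith("="):
        raise ValueError("a literal must start with '='")
    spec = literal[1:]                      # drop the '=' prefix
    if spec[1] != "'" or spec[-1] != "'":
        raise ValueError("expected the BYTE notation C'...' or X'...'")
    kind, body = spec[0], spec[2:-1]        # type letter and quoted body
    if kind == "C":
        return body.encode("ascii")         # one byte per character
    if kind == "X":
        return bytes.fromhex(body)          # two hex digits per byte
    raise ValueError(f"unsupported literal type {kind!r}")


print(literal_bytes("=C'EOF'").hex().upper())  # 454F46
print(literal_bytes("=X'05'").hex().upper())   # 05
```

The length of the returned byte string is the amount of space the literal would occupy in a literal pool.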
Literals vs. Immediate Operand
- They look similar from the programmer's point of view, but the actual behavior is completely different!
• It is important to understand the difference between a literal and an immediate operand
• With immediate addressing, the operand value is assembled as part of the machine instruction.
– The immediate value is within the machine instruction itself.
– [e.g.] Line 55 in Fig. 2.10: LDA #3 -> object code 010003 (the constant value itself appears in the object code)
• With a literal, the assembler generates the specified value as a constant at some other memory location.
– The address of this generated constant is used as the target address for the machine instruction.
– The literal value is obtained from data memory.
– [e.g.] Line 45 in Fig. 2.10: ENDFIL LDA =C'EOF' -> object code 032010 (the value is not inside the instruction; the assembler stores it elsewhere in memory)
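The difference is visible in the two object codes quoted above. The following sketch decodes a SIC/XE format 3 instruction into its opcode, addressing flags, and displacement; the function name `decode_format3` is hypothetical, and the bit layout (6-bit opcode, flags n i x b p e, 12-bit displacement) is assumed from the standard SIC/XE format 3 definition rather than stated in these notes.

```python
def decode_format3(hex_code: str) -> dict:
    """Split a 3-byte SIC/XE format 3 object code into its fields.

    Assumed layout: 6-bit opcode, flags n i x b p e, 12-bit displacement.
    """
    word = int(hex_code, 16)
    n = (word >> 17) & 1
    i = (word >> 16) & 1
    modes = {(0, 1): "immediate", (1, 0): "indirect", (1, 1): "simple"}
    return {
        "opcode": (word >> 16) & 0xFC,      # upper 6 bits of the first byte
        "mode": modes.get((n, i), "SIC-compatible"),
        "pc_relative": bool((word >> 13) & 1),
        "disp": word & 0xFFF,               # lower 12 bits
    }


# LDA #3: immediate mode, the value 3 sits in the instruction itself.
print(decode_format3("010003"))
# LDA =C'EOF': simple mode, the displacement is a PC-relative offset
# to the place where the assembler stored the literal.
print(decode_format3("032010"))
```

In the first case the displacement field is the operand; in the second it only locates the operand in memory.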
Literal Pool
- A pool that stores constants; instructions reach their constants through it
- The memory area in which literal values are stored
• Assembler collects all the literal operands used in a program into one or more literal pools
• The default location is at the end of the program, i.e., after the END directive
– Fig. 2.10 shows a literal pool listing immediately after the END statement, to make the code easier to read. In this case, the pool consists of the single literal =X'05'.
• In some cases, however, a programmer can place a literal pool at some other location in the object program
– This is done with the assembler directive LTORG (Line 93 in Fig. 2.10)
: creates a literal pool close to the instructions, so that the literals can also be reached with Format 3!
– When the assembler encounters a LTORG statement, it creates a literal pool that contains all of the literal operands used since the previous LTORG (or the beginning of the program).
: All literal values encountered before the LTORG are stored in that literal pool.
– The purpose is to keep each literal operand close to the instruction that uses it.
Duplicate Literals
• Most assemblers recognize duplicate literals (= the same literal used in more than one place in the program), and store only one copy of the specified data value
– For example, the literal =X'05' is used on lines 215 and 230. However, only one data area with this value is generated. Both instructions refer to the same address in the literal pool for their operand.
How does the Assembler handle Literal Operands?
• The basic data structure needed: LITTAB (Literal Table)
– For each literal used, the table contains [Literal name, Operand value and length, Address assigned to the operand when it is placed in a literal pool]
! As with SYMTAB, literals are handled with the two-pass algorithm!
• During Pass1: Each literal operand is recognized.
– Assembler searches LITTAB for the specified literal name.
– If found, no action is needed (duplicates are not stored); otherwise, the literal is added to LITTAB (leaving the address unassigned).
– If the statement is LTORG or END, assign addresses to the literals in LITTAB that do not have one yet.
-> Each literal operand is entered into LITTAB when it is first encountered
-> The address is filled in only when END or LTORG is reached <-> in SYMTAB, the address is filled in immediately
• During Pass2: Each literal operand is translated to its address.
– The operand address for use in generating object code is obtained by searching LITTAB for each literal operand encountered
– The data values of literals are inserted into the object program
-> LITTAB is used to translate each literal operand into its assigned address
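The two passes can be sketched as follows. This is a minimal illustration, not a full assembler: the statement tuples, the sizes, and the names `pass1` and `pass2_operand_address` are all assumptions made for the example, and LITTAB is modeled as a plain dictionary.

```python
def literal_value(literal):
    """Bytes for =C'...' or =X'...' (hypothetical helper)."""
    kind, body = literal[1], literal[3:-1]
    return body.encode("ascii") if kind == "C" else bytes.fromhex(body)


def pass1(statements, start=0):
    """statements: list of (opcode, operand, size_in_bytes) tuples (assumed form)."""
    littab = {}      # literal name -> {"value", "length", "address"}
    locctr = start

    for opcode, operand, size in statements:
        # Recognize a literal operand; add it only if it is not a duplicate.
        if operand.startswith("=") and operand not in littab:
            value = literal_value(operand)
            littab[operand] = {"value": value, "length": len(value), "address": None}
        if opcode in ("LTORG", "END"):
            # Place a literal pool here: assign addresses to pending literals.
            for entry in littab.values():
                if entry["address"] is None:
                    entry["address"] = locctr
                    locctr += entry["length"]
        else:
            locctr += size
    return littab


def pass2_operand_address(operand, littab):
    """Pass 2: a literal operand is translated to its assigned address."""
    return littab[operand]["address"]


program = [
    ("LDA", "=C'EOF'", 3),
    ("LTORG", "", 0),        # pool 1: =C'EOF' is placed at address 3
    ("TD", "=X'05'", 3),
    ("WD", "=X'05'", 3),     # duplicate: no second LITTAB entry
    ("END", "", 0),          # pool 2: =X'05' is placed at address 12
]
littab = pass1(program)
print(pass2_operand_address("=C'EOF'", littab))  # 3
print(pass2_operand_address("=X'05'", littab))   # 12
```

Because the address stays unassigned until LTORG or END, a literal always lands in the first pool that follows its first use.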
Symbol Definitions
: The EQU directive defines a symbol, giving a name to a constant value
• Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.
• EQU is the assembler directive whose main function is the definition of symbols.
• One common use of EQU (for “equate”)
– Form: symbol EQU value (symbol: the symbol label, value: the assigned value)
– To establish symbolic names that can be used for improving readability in place of numeric values.
Symbol Definitions with EQU
• When the assembler encounters an EQU statement such as MAXLEN EQU 4096, it enters MAXLEN into SYMTAB (with value 4096).
=> Like other symbols, symbols created with the EQU directive are processed in Pass 1 through SYMTAB:
the defined symbol and its value are entered into SYMTAB.
• During assembly of the LDT instruction, the assembler searches SYMTAB for the symbol MAXLEN, using its value as the operand in the instruction.
=> SYMTAB is then used to handle the instruction that uses the symbol.
• The resulting object code is exactly the same as the original version of the instruction (i.e., the one using the value instead of symbol)
• However, the source statement is easier to understand. It is also much easier to find and change the value of MAXLEN if this becomes necessary
=> These are the benefits of using EQU
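The claim that both versions produce identical object code can be checked with a small sketch. It assembles only a format 4 instruction with an immediate operand; the function name is hypothetical, and the LDT opcode (74 hex) and the format 4 layout are assumed from the standard SIC/XE definition rather than taken from these notes.

```python
LDT_OPCODE = 0x74  # assumed SIC/XE opcode for LDT


def assemble_format4_immediate(opcode, operand, symtab):
    """Object code for '+OP #operand' (format 4, immediate addressing).

    The operand is either a decimal constant or a symbol defined in symtab.
    """
    value = symtab[operand] if operand in symtab else int(operand)
    first_byte = opcode | 0x01          # flags n=0, i=1 (immediate)
    xbpe = 0x1                          # x=0, b=0, p=0, e=1 (format 4)
    return f"{first_byte:02X}{xbpe:X}{value:05X}"


symtab = {"MAXLEN": 4096}               # entered by MAXLEN EQU 4096

print(assemble_format4_immediate(LDT_OPCODE, "4096", symtab))    # 75101000
print(assemble_format4_immediate(LDT_OPCODE, "MAXLEN", symtab))  # 75101000
```

Changing the buffer size then means editing the single EQU statement instead of every instruction that uses the value.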
Symbol Definitions with EQU
• Another common use of EQU is to define mnemonic names for registers.
-> Register names can be defined as well
– In a machine with many general-purpose registers (unlike SIC), having mnemonic names for registers can help!
cf. The standard mnemonics for registers are already defined in SIC
– The programmer can establish and use names that reflect the logical function of the registers in the program
Restriction in Symbol Definitions
• The description of the EQU statement contains a restriction that is common to all symbol-defining assembler directives.
• All symbols used on the right-hand side of the statement (= all terms used to specify the value of the new symbol) must have been defined previously in the program.
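A short sketch of why the restriction matters: when EQU statements are processed in order during Pass 1, a right-hand side that names a symbol not yet in SYMTAB cannot be given a value. The function `define_equ` and its input format are hypothetical, and only a single constant or symbol is allowed on the right-hand side.

```python
def define_equ(statements):
    """Process (symbol, operand) EQU statements in source order.

    The operand is a decimal constant or a previously defined symbol.
    """
    symtab = {}
    for symbol, operand in statements:
        if operand.isdigit():
            symtab[symbol] = int(operand)
        elif operand in symtab:
            symtab[symbol] = symtab[operand]
        else:
            # Forward reference: the value is not known at this point.
            raise ValueError(f"{symbol} EQU {operand}: {operand} is not yet defined")
    return symtab


# Allowed: BETA is defined before ALPHA uses it.
print(define_equ([("BETA", "5"), ("ALPHA", "BETA")]))

# Not allowed: ALPHA refers to BETA before BETA is defined.
try:
    define_equ([("ALPHA", "BETA"), ("BETA", "5")])
except ValueError as err:
    print("error:", err)
```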
Expression (1/3)
: A code construct that computes a value from other values (e.g., MAXLEN)
• Most assemblers allow the use of expressions wherever a single operand (labels, literals, etc.) is permitted. Each such expression must, of course, be evaluated by the assembler to produce a single operand address or value.
• Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /
• Individual terms in the expression may be constants, user-defined symbols, or special terms.
– The most common such special term is the current value of the location counter (often designated by *).
This term represents the value of the next unassigned memory location
– [e.g.] Line 106 in Fig. 2.10: BUFEND EQU *
* : the current value of the location counter
-> location counter: the value it holds at the moment this statement is processed
-> This statement gives BUFEND a value that is the address of the next byte after the buffer area
-> The current value of the location counter is stored in BUFEND
Expression (2/3)
• Expressions are classified depending upon the types of values they produce
– Absolute expressions: independent of program location (a fixed value)
– Relative expressions: relative to the beginning of the program (the value changes with the actual program location)
• A symbol defined by EQU can also be absolute or relative
Expression (3/3)
• To determine the type of an expression, we must keep track of the types of all symbols defined in the program. For this purpose we need a flag in the symbol table to indicate type of value (absolute or relative) in addition to the value itself.
– Thus, SYMTAB needs a type field to discern absolute symbols from relative symbols
(to tell which symbols are relative and which are absolute, as needed for program relocation)
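How the type flag might be used can be sketched as follows. The classification rule applied here is an assumption not spelled out in these notes: it is the usual textbook rule that an expression is absolute when its relative terms cancel in pairs of opposite sign, relative when exactly one positive relative term remains, and illegal otherwise. Only + and - are supported, and the addresses in `symtab` are illustrative values.

```python
def evaluate(terms, symtab, locctr):
    """Evaluate a list of (sign, term) pairs; return (value, type).

    type is "A" (absolute) or "R" (relative). A term is a decimal constant,
    a symbol in symtab (name -> (value, type)), or "*" for the location counter.
    """
    value = 0
    net_relative = 0                      # positive minus negative relative terms
    for sign, term in terms:
        if term == "*":
            term_value, term_type = locctr, "R"
        elif term.isdigit():
            term_value, term_type = int(term), "A"
        else:
            term_value, term_type = symtab[term]
        value += sign * term_value
        if term_type == "R":
            net_relative += sign
    if net_relative == 0:
        return value, "A"
    if net_relative == 1:
        return value, "R"
    raise ValueError("expression is neither absolute nor relative")


# Illustrative addresses for a 4096-byte buffer.
symtab = {"BUFFER": (0x0036, "R"), "BUFEND": (0x1036, "R")}

print(evaluate([(+1, "*")], symtab, locctr=0x1036))                # BUFEND EQU *
print(evaluate([(+1, "BUFEND"), (-1, "BUFFER")], symtab, 0x1036))  # BUFEND-BUFFER
```

The difference of two relative symbols is absolute because the unknown load address cancels out, which is why a length such as BUFEND-BUFFER needs no relocation.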
• Operands of format 4 instructions may have relative values; such relative values must be modified for relocation by the loader later.
– We need to know which operands are relative
: A relative format 4 operand has to be fixed up by the loader, so the assembler must generate a modification record for it.
– e.g., +JSUB RDREC -> relative value, we need a modification record
(a relative address that depends on the actual load address -> the loader must patch the object code)
+LDT #MAXLEN -> absolute value
(does not depend on the actual load address)
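The decision can be sketched as a small function that emits a modification record only for relative operands. The record layout used here is an assumption taken from the usual SIC/XE object program format (M, then the 6-digit hex address of the field to patch, then its length in half-bytes); the function name and the instruction addresses are illustrative.

```python
def modification_record(instr_address, operand_type):
    """Return a modification record for a format 4 instruction, or None.

    operand_type is "R" (relative) or "A" (absolute), as kept in SYMTAB.
    Assumed layout: 'M' + address of the 20-bit address field + length 05.
    """
    if operand_type != "R":
        return None                     # absolute values need no relocation
    # The address field starts one byte after the start of the instruction
    # and is 5 half-bytes (20 bits) long.
    return f"M{instr_address + 1:06X}05"


# +JSUB RDREC at an illustrative address 0006: RDREC is relative.
print(modification_record(0x0006, "R"))   # M00000705
# +LDT #MAXLEN: MAXLEN is absolute, so no record is produced.
print(modification_record(0x0103, "A"))   # None
```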