Inline Assembler
The compiler includes a powerful inline assembler. With it, assembly language instructions can be used directly in C and C++ source programs without requiring a separate assembler program. Assembly language enables optimizing critical functions, interfacing to the BIOS, operating system, and special hardware, and accessing capabilities of the processor that are not available from C++. The inline assembler supports both 16-bit and 32-bit code generation in all memory models.
What's in This Chapter
- Basic features of the inline assembler.
- The ASM statement and the ASM block.
- Using ASM registers.
- Calling C and C++ from assembly language.
- Registers and opcodes.
Advantages of writing inline assembly language functions
Use assembly language functions to:
- Create a subroutine that executes as quickly as possible
- Provide an interface to functions compiled with another compiler
- Access capabilities of the CPU and the native instruction set that are not available from C++
- Interface to specialized hardware
- Write code where specific instruction selection and ordering is critical
The asm Statement
The asm statement invokes the assembler. Use this statement wherever a C or C++ statement is legal. You can use asm in any of three ways.
The first example shows asm followed simply by an assembly instruction:
    asm mov AH,2
    asm mov DL,7
    asm int 21H
The second example shows asm followed by a set of assembly instructions enclosed by braces. An empty set of braces may follow the directive.
    asm
    {
        mov AH,2
        mov DL,7
        int 21H
    }
Because the asm statement is a statement separator, assembly instructions can appear on the same line:
asm mov AH,2 asm mov DL,7 asm int 21H
The three previous examples generate identical code. But enclosing assembly language in braces, as in the second example, has the advantage of setting the assembly language off from the surrounding C++ code and avoids repeating the asm statement.
No assembler instruction can continue onto a second line. Use the form of the last example primarily for writing macros, which must be one line long after expansion.
Note
The Digital Mars C++ asm statement emulates the Borland asm statement. The _asm and __asm statements emulate the Microsoft _asm and __asm statements.
The ASM Block
A series of assembler instructions enclosed by braces following the asm keyword is called an "ASM block." Unlike C++ blocks, ASM blocks do not affect the scope of variables.
Restrictions on using C and C++ in an ASM block
An ASM block can use the following C and C++ language elements:
- Symbols, including labels, variables, and function names.
- Constants, including symbolic constants and enum members.
- Macros and preprocessor directives.
- Comments delimited by /**/ symbols or defined by // symbols. In Microsoft-compatible mode, semicolons (;) can also be used to delimit comments.
- Type names (where a MASM type would be legal).
- Type names, including declarations of structs and pointers.
- C-style type casts.
Note
Microsoft and Borland inline assemblers do not support type casts.
Inline assembly instructions within C or C++ statements can refer to C or C++ variables by name.
MASM-Style hexadecimal constants
Support for MASM-style hexadecimal constants provides easy conversion of MASM-style source code. The constants take the form:
    digit {hex_digit} ('H' | 'h')
You cannot use hexadecimal constants if the -A (for ANSI compatibility) option is used.
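The form above can be read as a small grammar: one decimal digit, any number of hexadecimal digits, then an H or h suffix. The following C sketch checks a string against that grammar and computes its value. The function parse_masm_hex is a hypothetical helper written for illustration; it is not part of the compiler or its library, and it does not guard against values too large for an unsigned long.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption, not a compiler API): parses a constant
   of the form  digit {hex_digit} ('H' | 'h')  and stores its value in *out.
   Returns 1 on success, 0 if the text does not match the form. */
int parse_masm_hex(const char *text, unsigned long *out)
{
    size_t len;
    size_t i;
    unsigned long value = 0;

    assert(text != NULL && out != NULL);
    len = strlen(text);

    /* At least one digit plus the suffix is required. */
    if (len < 2)
        return 0;
    /* The first character must be a decimal digit, so FFh is rejected
       while 0FFh is accepted. */
    if (!isdigit((unsigned char)text[0]))
        return 0;
    if (text[len - 1] != 'H' && text[len - 1] != 'h')
        return 0;

    /* Accumulate every character before the suffix as a base-16 digit. */
    for (i = 0; i < len - 1; i++) {
        unsigned char c = (unsigned char)text[i];
        unsigned digit;

        if (isdigit(c))
            digit = (unsigned)(c - '0');
        else if (isxdigit(c))
            digit = (unsigned)(tolower(c) - 'a') + 10u;
        else
            return 0;
        value = value * 16u + digit;
    }
    *out = value;
    return 1;
}
```

With this sketch, 21H and 0FFh are accepted, while FFh is rejected because it does not begin with a decimal digit.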
C and C++ operators in an ASM block
An ASM block cannot use operators specific to C and C++, such as the left-shift (<<) operator.
Operators common to C, C++, and MASM can be used within an ASM block, but they are interpreted as assembly language operators. For example, C and C++ interpret brackets ([]) as enclosing array subscripts and scale them to the size of an array element. Within an asm statement, brackets are interpreted as the MASM index operator, which adds an unscaled byte offset to any adjacent operand.
In Microsoft-compatible mode, the semicolon delimits comments, as in MASM.
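The difference between a scaled C subscript and an unscaled byte offset can be modeled in portable C. This is a sketch only: read_scaled and read_unscaled are hypothetical names, and the 16-bit element type is an assumption chosen so that the element size is the same on every platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* C subscript: the index is scaled by sizeof(int16_t). */
int16_t read_scaled(const int16_t *table, size_t index)
{
    assert(table != NULL);
    return table[index];
}

/* Models the MASM index operator: the offset is a plain byte count and is
   not scaled by the element size. */
int16_t read_unscaled(const int16_t *table, size_t byte_offset)
{
    int16_t value;

    assert(table != NULL);
    memcpy(&value, (const unsigned char *)table + byte_offset, sizeof value);
    return value;
}
```

For a table holding 10, 20, 30, a subscript of 2 selects the third element, while a byte offset of 2 selects the second. To reach the same element from assembly language, multiply the index by the element size first.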
Assembly language in an asm statement
In common with other assemblers, the inline assembler accepts any instruction that is legal in MASM. The following are some of the assembly language features of the asm statement:
- Expressions. Inline assembly code can use any MASM expression. A MASM expression is any combination of operands and operators that evaluates to a single value or address.
- Data directives and operators. An asm statement can define data objects in the code segment with the MASM directives DB, DW, and DQ. The MASM directives DT, DF, STRUC, and RECORD, and the operators DUP, THIS, WIDTH, and MASK are not accepted.
- Macros. An asm statement can use C preprocessor macros even though the inline assembler is not a macro assembler and does not support MASM macro directives.
Other restrictions on C and C++ symbols
An asm statement can reference any C or C++ variable name, function, or label in scope, provided those names are not symbolic constants. However, you cannot call a C++ member function from within an asm statement.
Prototype the functions referenced in an asm statement before using them in programs. This lets the compiler distinguish them from names and labels.
Each assembly language instruction can contain a single C or C++ symbol.
C or C++ symbols within an ASM block must not have the same spelling as an asm reserved word.
The inline assembler allows structure or union tags in asm statements, but only as qualifiers of references to members of the structure or union.
Accessing C or C++ data in an asm statement
In general, instructions in an asm statement can reference any symbol in scope where the statement appears. The following statement loads the AX register with the value of var, a C variable in scope:
    asm mov AX,var
An asm statement can reference a uniquely named member of a class, structure, or union without specifying the variable name or type before the period operator. But if the member name is not unique, you must specify a variable or type name before the period operator. If two structure types have a member name in common, as in this example:
    struct first_type
    {
        char *mold;
        int common_name;
    };
    struct second_type
    {
        char *mildew;
        long common_name;
        int unique_name;
    };
then qualify the reference to common_name with the tag name:
asm mov [bx]first_type.common_name,10
You need not qualify a reference to a unique member name. In the following example, unique_name can be referenced anonymously because only one structure type declares a member with that name.
asm mov [bx].unique_name,10
This statement generates the same instruction whether or not a qualifying name or type is present. For more information, see the section "Making anonymous references to structure members" later in this chapter.
Functions in inline assembly language
Because ASM blocks do not require separate source file assembly steps, writing a function using ASM blocks is easier than using a separate assembler. In addition, the compiler generates function prolog and epilog code.
The expon2 function is an example of a function written in inline assembly language:
    int expon2(int num, int power)
    {
        asm
        {
            mov AX,num      // get first argument
            mov CX,power    // get second argument
            shl AX,CL       // AX = AX * (2 to the power of CL)
        }
    }
An inline function refers to its arguments by name and may appear in the same source file as the callers of the function.
Refer to Using Assembly Language Functions for a description of the register stacks used by inline assembly instructions.
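For comparison, the shift that expon2 performs can be written in portable C. This is a sketch rather than the compiler's own output: the name expon2_c is an assumption, and it takes unsigned operands so that the shift is well defined in C.

```c
#include <assert.h>
#include <limits.h>

/* Portable C model of expon2 (hypothetical name): shifting left by `power`
   multiplies `num` by 2 to the power of `power`. */
unsigned expon2_c(unsigned num, unsigned power)
{
    /* C leaves a shift by the operand width or more undefined, so reject it. */
    assert(power < sizeof(unsigned) * CHAR_BIT);
    return num << power;
}
```

The C version hands back its result with an explicit return statement. The inline assembly version has no return statement and instead leaves the result in AX, where a 16-bit function's int return value is expected.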
Making anonymous references to structure members
You can make anonymous references to members of a given structure, as in the following:
    struct x
    {
        int i;
        int j;
        int k;
    } foo;
You can refer to these members i, j, and k anonymously, for example, with the assembly instruction:
    asm
    {
        mov BX,4
        mov AX,foo[BX]  ; Refers to member j of foo
    }
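An indexed reference of this kind amounts to reading the structure at a byte offset. The portable C sketch below models that with a hypothetical helper, read_member_at, and uses offsetof to obtain each member's offset. It assumes nothing about the size of int: which member a fixed offset such as 4 selects depends on sizeof(int) in the memory model being compiled for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct x
{
    int i;
    int j;
    int k;
};

/* Hypothetical helper: reads the int stored byte_offset bytes past the
   start of *s, which is what an indexed reference such as foo[BX] does. */
int read_member_at(const struct x *s, size_t byte_offset)
{
    int value;

    /* The whole int must lie inside the structure. */
    assert(s != NULL && byte_offset + sizeof value <= sizeof *s);
    memcpy(&value, (const unsigned char *)s + byte_offset, sizeof value);
    return value;
}
```

Calling read_member_at(&foo, offsetof(struct x, j)) returns foo.j regardless of the size of int, which is why computing offsets from the structure definition is safer than hard-coding them.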
Using register variables
Digital Mars C++ supports register variables. Register variables are useful with inline assembly. If asm statements place results in registers, you can use register variables to access those values.
For more information see "Using Register Variables" in Using Assembly Language Functions.
Using the __LOCAL_SIZE symbol
When using the inline assembler, the special symbol __LOCAL_SIZE expands to the number of bytes used by all local symbols. __LOCAL_SIZE is useful in combination with __declspec(naked), as __LOCAL_SIZE is the amount of space to reserve on the stack.
For example:
    __declspec(naked) int test()
    {
        int x, y, z;
        _asm
        {
            push BP
            mov  BP,SP
            sub  SP,__LOCAL_SIZE
            mov  BX,__LOCAL_SIZE[BP]
            mov  BX,__LOCAL_SIZE+2[BP]
            mov  AX,__LOCAL_SIZE
            mov  AX,__LOCAL_SIZE+2
        }
        _AX = x + y + z;
        _asm
        {
            mov SP,BP
            pop BP
            ret
        }
    }
Using ASM Registers
The asm statement alters the registers outside the programmer's explicit assembly language instructions. Registers contain whatever values the normal control flow leaves in them at the point of the asm statement.
For 16-bit memory models
You do not need to preserve the following registers when writing inline assembly language: AX, BX, CX, DX, SI, DI, ES, and flags (other than DF).
C and C++ do not expect these registers to be maintained between statements, but they do preserve the following registers: CS, DS, SS, SP and BP.
Note
The compiler does not use registers to hold register variables for functions containing inline assembly code.
For 32-bit memory models
Functions can change the values in the EAX, ECX, and EDX registers.
Functions must preserve the values in the EBX, ESI, EDI, EBP, ESP, SS, CS, and DS registers (plus ES and GS for the NT memory model).
Always set the direction flag to forward.
To maximize speed on 32-bit buses, make sure data aligns along 32-bit boundaries.
Function return values
- For 16-bit models. If the return value for a function is short (a char, int, or near pointer) store it in the AX register, as in the previous example, expon2. If the return value is long, store the high word in the DX register and the low word in AX. To return a longer value, store the value in memory and return a pointer to the value.
- For 32-bit models. Return near pointers, ints, unsigned ints, chars, shorts, longs and unsigned longs in EAX. 32-bit models return far pointers in EDX, EAX, where EDX contains the segment and EAX contains the offset.
- When C linkage is in effect. Floats are returned in EAX and doubles in EDX, EAX, where EDX contains the most significant 32 bits and EAX the least significant.
- When C++ linkage is in effect. The compiler creates a temporary copy on the stack and returns a pointer to it.
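The register pairs described above can be modeled in portable C to show which half of a value lands in which register. This is a sketch under stated assumptions: the structure and function names are hypothetical, and the float and double cases assume IEEE 754 representations.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical models of the register pairs used for return values. */
struct reg_pair16 { uint16_t dx;  uint16_t ax;  };
struct reg_pair32 { uint32_t edx; uint32_t eax; };

/* 16-bit models: a long comes back with its high word in DX, low word in AX. */
struct reg_pair16 split_long16(uint32_t value)
{
    struct reg_pair16 r;

    r.dx = (uint16_t)(value >> 16);
    r.ax = (uint16_t)(value & 0xFFFFu);
    return r;
}

/* C linkage, 32-bit models: the bit pattern of a float travels in EAX. */
uint32_t float_bits(float value)
{
    uint32_t bits;

    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* C linkage, 32-bit models: a double is split with its most significant
   32 bits in EDX and its least significant 32 bits in EAX. */
struct reg_pair32 split_double(double value)
{
    uint64_t bits;
    struct reg_pair32 r;

    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);
    r.edx = (uint32_t)(bits >> 32);
    r.eax = (uint32_t)(bits & 0xFFFFFFFFu);
    return r;
}
```

For example, the long value 12345678h is returned as DX = 1234h and AX = 5678h, and the double 1.0 as EDX = 3FF00000h and EAX = 0.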
Interfacing to a member function
The easiest way to interface an assembly language routine to a class member function is to provide a C wrapper function that can be called from the assembly language routine.
An alternative is to write the member function in C++ and compile it with normal out-of-line member functions.
Calling C Functions from an ASM Block
C functions, including C library functions, can be called from within the asm block, as in the following example:
    #include <stdio.h>
    char format[] = "%s %s %s\n";
    char alas[] = "Alas,";
    char poor[] = "poor";
    char Yorick[] = "Yorick!";
    void main(void)
    {
        asm
        {
            mov  AX, offset Yorick
            push AX
            mov  AX, offset poor
            push AX
            mov  AX, offset alas
            push AX
            mov  AX, offset format
            push AX
            call printf
        }
    }
Because function arguments are passed on the stack, push the needed arguments from right to left before calling the function. To print the message, the example pushes pointers to the three strings, then a pointer to the format string, and then calls printf.
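Pushing the arguments from right to left produces the same call as listing them from left to right in C. The sketch below shows that equivalent C call. It uses snprintf instead of printf so that the result can be inspected, and format_message is a hypothetical name introduced for illustration.

```c
#include <assert.h>
#include <stdio.h>

char format[] = "%s %s %s\n";
char alas[] = "Alas,";
char poor[] = "poor";
char Yorick[] = "Yorick!";

/* Hypothetical helper: formats the message into buf with the arguments in
   the order printf receives them. The asm block pushes Yorick first and
   format last, so format ends up as the first argument.
   Returns the length of the formatted message, as snprintf does. */
int format_message(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, format, alas, poor, Yorick);
}
```

With a sufficiently large buffer, the result is the line "Alas, poor Yorick!" followed by a newline.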
Calling C++ functions
An ASM block can call only global C++ functions that are not overloaded because the types of the arguments are unknown. The compiler issues an error if an ASM block calls an overloaded global C++ function or a C++ member function.
You can also call a function declared with extern "C" linkage from an asm statement within a C++ program, because all the standard header files declare the library functions to have extern "C" linkage.
Defining ASM blocks as C macros
A C++ macro is a convenient way to insert assembly language into source code. But, because a macro expands into a single, logical line, take care when writing one.
If the macro expands into multiple instructions, enclose the instructions in an ASM block. The asm statement must precede each instruction. Also separate comments from code with /**/ characters rather than //. Unless you take these precautions, the compiler can be confused by C or C++ statements to the left or right of the assembly code or interpret instructions as comments when the macro becomes a single line. Without the closing brace, the compiler cannot tell where the assembly language ends.
Warning: Do not use double-slash (//) characters within a macro. The compiler terminates the macro when it sees a double-slash.
An ASM block written as a macro can accept arguments but, unlike a C macro, it cannot return values. But some MASM macros can be written as macros for C. The following MASM macro sets a video page to the value specified in the argument page:
    findpage MACRO page
        mov AH,5
        mov AL,page
        int 10h
    ENDM
The following C macro does the same thing:
    #define findpage(page) asm \
    { \
        asm mov AH,5 \
        asm mov AL,page \
        asm int 10h \
    }
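The single-logical-line rule applies to any multi-line C macro, not only to ones containing assembly language. The portable sketch below models the two register loads from findpage with a hypothetical structure, byte_regs, standing in for AH and AL. It does not issue the interrupt, and LOAD_PAGE_REGS and page_after_load are names invented for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the AH and AL registers. */
struct byte_regs
{
    unsigned char ah;
    unsigned char al;
};

/* The backslashes join the lines below into one logical line, so only
   block comments are safe inside the body; a double-slash comment would
   swallow everything after it once the lines are joined. */
#define LOAD_PAGE_REGS(regs, page)                                      \
    do {                                                                \
        (regs).ah = 5;                     /* AH = 5, as in findpage */ \
        (regs).al = (unsigned char)(page); /* AL = page argument */     \
    } while (0)

/* Expands the macro once and returns the value loaded into the AL model. */
unsigned char page_after_load(unsigned char page)
{
    struct byte_regs r = {0, 0};

    LOAD_PAGE_REGS(r, page);
    assert(r.ah == 5);
    return r.al;
}
```

The do/while wrapper plays the same role here that the braces of the ASM block play in findpage: it marks where the expansion begins and ends, so surrounding statements cannot run into it.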
Registers
The following registers are supported. Register names are in upper or lower case.
- AL, AH, AX, EAX
- BL, BH, BX, EBX
- CL, CH, CX, ECX
- DL, DH, DX, EDX
- BP, EBP
- SP, ESP
- DI, EDI
- SI, ESI
- ES, CS, SS, DS, GS, FS
- CR0, CR2, CR3, CR4
- DR0, DR1, DR2, DR3, DR6, DR7
- TR3, TR4, TR5, TR6, TR7
- ST
- ST(0), ST(1), ST(2), ST(3), ST(4), ST(5), ST(6), ST(7)
- MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7
- XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7
Opcodes
The following instructions are supported. Opcode names are in upper or lower case.
aaa | aad | aam | aas | adc |
add | addpd | addps | addsd | addss |
and | andnpd | andnps | andpd | andps |
arpl | bound | bsf | bsr | bswap |
bt | btc | btr | bts | call |
cbw | cdq | clc | cld | clflush |
cli | clts | cmc | cmova | cmovae |
cmovb | cmovbe | cmovc | cmove | cmovg |
cmovge | cmovl | cmovle | cmovna | cmovnae |
cmovnb | cmovnbe | cmovnc | cmovne | cmovng |
cmovnge | cmovnl | cmovnle | cmovno | cmovnp |
cmovns | cmovnz | cmovo | cmovp | cmovpe |
cmovpo | cmovs | cmovz | cmp | cmppd |
cmpps | cmps | cmpsb | cmpsd | cmpss |
cmpsw | cmpxch8b | cmpxchg | comisd | comiss |
cpuid | cvtdq2pd | cvtdq2ps | cvtpd2dq | cvtpd2pi |
cvtpd2ps | cvtpi2pd | cvtpi2ps | cvtps2dq | cvtps2pd |
cvtps2pi | cvtsd2si | cvtsd2ss | cvtsi2sd | cvtsi2ss |
cvtss2sd | cvtss2si | cvttpd2dq | cvttpd2pi | cvttps2dq |
cvttps2pi | cvttsd2si | cvttss2si | cwd | cwde |
da | daa | das | db | dd |
de | dec | df | di | div |
divpd | divps | divsd | divss | dl |
dq | ds | dt | dw | emms |
enter | f2xm1 | fabs | fadd | faddp |
fbld | fbstp | fchs | fclex | fcmovb |
fcmovbe | fcmove | fcmovnb | fcmovnbe | fcmovne |
fcmovnu | fcmovu | fcom | fcomi | fcomip |
fcomp | fcompp | fcos | fdecstp | fdisi |
fdiv | fdivp | fdivr | fdivrp | feni |
ffree | fiadd | ficom | ficomp | fidiv |
fidivr | fild | fimul | fincstp | finit |
fist | fistp | fisub | fisubr | fld |
fld1 | fldcw | fldenv | fldl2e | fldl2t |
fldlg2 | fldln2 | fldpi | fldz | fmul |
fmulp | fnclex | fndisi | fneni | fninit |
fnop | fnsave | fnstcw | fnstenv | fnstsw |
fpatan | fprem | fprem1 | fptan | frndint |
frstor | fsave | fscale | fsetpm | fsin |
fsincos | fsqrt | fst | fstcw | fstenv |
fstp | fstsw | fsub | fsubp | fsubr |
fsubrp | ftst | fucom | fucomi | fucomip |
fucomp | fucompp | fwait | fxam | fxch |
fxrstor | fxsave | fxtract | fyl2x | fyl2xp1 |
hlt | idiv | imul | in | inc |
ins | insb | insd | insw | int |
into | invd | invlpg | iret | iretd |
ja | jae | jb | jbe | jc |
jcxz | je | jecxz | jg | jge |
jl | jle | jmp | jna | jnae |
jnb | jnbe | jnc | jne | jng |
jnge | jnl | jnle | jno | jnp |
jns | jnz | jo | jp | jpe |
jpo | js | jz | lahf | lar |
ldmxcsr | lds | lea | leave | les |
lfence | lfs | lgdt | lgs | lidt |
lldt | lmsw | lock | lods | lodsb |
lodsd | lodsw | loop | loope | loopne |
loopnz | loopz | lsl | lss | ltr |
maskmovdqu | maskmovq | maxpd | maxps | maxsd |
maxss | mfence | minpd | minps | minsd |
minss | mov | movapd | movaps | movd |
movdq2q | movdqa | movdqu | movhlps | movhpd |
movhps | movlhps | movlpd | movlps | movmskpd |
movmskps | movntdq | movnti | movntpd | movntps |
movntq | movq | movq2dq | movs | movsb |
movsd | movss | movsw | movsx | movupd |
movups | movzx | mul | mulpd | mulps |
mulsd | mulss | neg | nop | not |
or | orpd | orps | out | outs |
outsb | outsd | outsw | packssdw | packsswb |
packuswb | paddb | paddd | paddq | paddsb |
paddsw | paddusb | paddusw | paddw | pand |
pandn | pavgb | pavgw | pcmpeqb | pcmpeqd |
pcmpeqw | pcmpgtb | pcmpgtd | pcmpgtw | pextrw |
pinsrw | pmaddwd | pmaxsw | pmaxub | pminsw |
pminub | pmovmskb | pmulhuw | pmulhw | pmullw |
pmuludq | pop | popa | popad | popf |
popfd | por | prefetchnta | prefetcht0 | prefetcht1 |
prefetcht2 | psadbw | pshufd | pshufhw | pshuflw |
pshufw | pslld | pslldq | psllq | psllw |
psrad | psraw | psrld | psrldq | psrlq |
psrlw | psubb | psubd | psubq | psubsb |
psubsw | psubusb | psubusw | psubw | punpckhbw |
punpckhdq | punpckhqdq | punpckhwd | punpcklbw | punpckldq |
punpcklqdq | punpcklwd | push | pusha | pushad |
pushf | pushfd | pxor | rcl | rcpps |
rcpss | rcr | rdmsr | rdpmc | rdtsc |
rep | repe | repne | repnz | repz |
ret | retf | rol | ror | rsm |
rsqrtps | rsqrtss | sahf | sal | sar |
sbb | scas | scasb | scasd | scasw |
seta | setae | setb | setbe | setc |
sete | setg | setge | setl | setle |
setna | setnae | setnb | setnbe | setnc |
setne | setng | setnge | setnl | setnle |
setno | setnp | setns | setnz | seto |
setp | setpe | setpo | sets | setz |
sfence | sgdt | shl | shld | shr |
shrd | shufpd | shufps | sidt | sldt |
smsw | sqrtpd | sqrtps | sqrtsd | sqrtss |
stc | std | sti | stmxcsr | stos |
stosb | stosd | stosw | str | sub |
subpd | subps | subsd | subss | sysenter |
sysexit | test | ucomisd | ucomiss | ud2 |
unpckhpd | unpckhps | unpcklpd | unpcklps | verr |
verw | wait | wbinvd | wrmsr | xadd |
xchg | xlat | xlatb | xor | xorpd |
xorps |
Pentium 4 (Prescott) Opcodes Supported
addsubpd | addsubps | fisttp | haddpd | haddps |
hsubpd | hsubps | lddqu | monitor | movddup |
movshdup | movsldup | mwait |
AMD Opcodes
pavgusb | pf2id | pfacc | pfadd | pfcmpeq |
pfcmpge | pfcmpgt | pfmax | pfmin | pfmul |
pfnacc | pfpnacc | pfrcp | pfrcpit1 | pfrcpit2 |
pfrsqit1 | pfrsqrt | pfsub | pfsubr | pi2fd |
pmulhrw | pswapd |