
Last update May 14, 2003


D x86 Inline Assembler

D, being a systems programming language, provides an inline assembler. The inline assembler is standardized for D implementations across the same CPU family. For example, the Intel Pentium inline assembler for a Win32 D compiler will be syntax compatible with the inline assembler for a Linux D compiler running on an Intel Pentium.

Differing D implementations, however, are free to innovate upon the memory model, function call/return conventions, argument passing conventions, etc.

This document describes the x86 implementation of the inline assembler.

	AsmInstruction:
		Identifier : AsmInstruction
		align IntegerExpression
		even
		naked
		db Operands
		ds Operands
		di Operands
		dl Operands
		df Operands
		dd Operands
		de Operands
		Opcode
		Opcode Operands

	Operands:
		Operand
		Operand , Operands
	

Labels

Assembler instructions can be labeled just like other statements. They can be the target of goto statements. For example:
	void *pc;
	asm
	{
	    call L1		;
	 L1:			;
	    pop	EBX		;
	    mov	pc[EBP],EBX	;	// pc now points to code at L1
	}
	

align IntegerExpression

Causes the assembler to emit NOP instructions to align the next assembler instruction on an IntegerExpression boundary. IntegerExpression must evaluate to an integer that is a power of 2.

Aligning the start of a loop body can sometimes have a dramatic effect on the execution speed.
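As an illustration, the following sketch aligns the head of a small counting loop on a 16 byte boundary. It assumes a compiler that defines the D_InlineAsm_X86 version identifier for 32 bit x86 targets; the function name sumTo and the labels are hypothetical, and the plain D fallback is there only so the example builds on other targets. Whether the alignment actually helps depends on the CPU.

```d
// Sums 1 + 2 + ... + n. Returns 0 when n <= 0.
int sumTo(int n)
{
    int total;
    version (D_InlineAsm_X86)   // assumption: 32 bit x86 inline assembler
    {
        asm
        {
            mov  ECX, n;        // loop counter
            xor  EAX, EAX;      // running sum
            test ECX, ECX;
            jle  Ldone;         // nothing to add for n <= 0
            align 16;           // pad with NOPs so Lloop starts on a 16 byte boundary
        Lloop:
            add  EAX, ECX;
            dec  ECX;
            jnz  Lloop;
        Ldone:
            mov  total, EAX;
        }
    }
    else
    {
        foreach (i; 1 .. n + 1)
            total += i;
    }
    return total;
}
```

The NOPs emitted by align lie on the fall-through path into the loop, so they execute once on entry and never inside the loop itself.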

even

Causes the assembler to emit NOP instructions to align the next assembler instruction on an even boundary.

naked

Causes the compiler to not generate the function prolog and epilog sequences. Generating them then becomes the responsibility of the inline assembly programmer. naked is normally used when the entire function is to be written in assembler.
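A minimal sketch of a naked function follows. It assumes a 32 bit x86 target and the C calling convention (arguments on the stack, caller cleans up), which is why the function is declared extern (C). That choice is an assumption made for the example, since the argument passing convention of D functions is implementation defined. The name addNaked is hypothetical.

```d
// Adds two ints. With naked there is no prolog, so on entry
// [ESP] holds the return address, [ESP+4] is a and [ESP+8] is b
// (assumption: 32 bit cdecl stack layout).
extern (C) int addNaked(int a, int b)
{
    version (D_InlineAsm_X86)
    {
        asm
        {
            naked;
            mov EAX, [ESP + 4];   // a
            add EAX, [ESP + 8];   // a + b, returned in EAX
            ret;                  // no epilog is generated, so return explicitly
        }
    }
    else
    {
        return a + b;             // fallback for targets without the x86 assembler
    }
}
```

Because no stack frame is set up, the parameters are addressed relative to ESP rather than by name.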

db, ds, di, dl, df, dd, de

These pseudo ops insert raw data directly into the code. db is for bytes, ds is for 16 bit words, di is for 32 bit words, dl is for 64 bit words, df is for 32 bit floats, dd is for 64 bit doubles, and de is for 80 bit extended reals. Each can have multiple operands. A string literal operand is treated as if it were length operands, where length is the number of characters in the string, with one character used per operand. For example:
	asm
	{
	    db 5,6,0x83;   // insert bytes 0x05, 0x06, and 0x83 into code
	    ds 0x1234;     // insert bytes 0x34, 0x12
	    di 0x1234;     // insert bytes 0x34, 0x12, 0x00, 0x00
	    dl 0x1234;     // insert bytes 0x34, 0x12, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
	    df 1.234;      // insert float 1.234
	    dd 1.234;      // insert double 1.234
	    de 1.234;      // insert extended 1.234
	    db "abc";      // insert bytes 0x61, 0x62, and 0x63
	    ds "abc";      // insert bytes 0x61, 0x00, 0x62, 0x00, 0x63, 0x00
	}
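Because the data lands in the instruction stream, the pseudo ops can also be used to hand-encode an instruction. The sketch below assumes a 32 bit x86 target, where the byte 0xB8 followed by a 32 bit little-endian immediate is the encoding of mov EAX,imm32. The function name fortyTwo is hypothetical.

```d
// Loads 42 into EAX using raw bytes instead of a mov mnemonic.
int fortyTwo()
{
    int result;
    version (D_InlineAsm_X86)   // assumption: 32 bit x86 inline assembler
    {
        asm
        {
            db 0xB8;            // opcode byte of mov EAX,imm32
            di 42;              // the immediate: bytes 0x2A, 0x00, 0x00, 0x00
            mov result, EAX;
        }
    }
    else
    {
        result = 42;            // fallback for other targets
    }
    return result;
}
```

The compiler cannot see which registers such raw bytes modify, so this is only reasonable for scratch registers such as EAX.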
	

Opcodes

A list of supported opcodes is at the end.

The following registers are supported. Register names are always in upper case.

AL, AH, AX, EAX
BL, BH, BX, EBX
CL, CH, CX, ECX
DL, DH, DX, EDX
BP, EBP
SP, ESP
DI, EDI
SI, ESI
ES, CS, SS, DS, GS, FS
CR0, CR2, CR3, CR4
DR0, DR1, DR2, DR3, DR6, DR7
TR3, TR4, TR5, TR6, TR7
ST
ST(0), ST(1), ST(2), ST(3), ST(4), ST(5), ST(6), ST(7)
MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7

Special Cases

lock, rep, repe, repne, repnz, repz
These prefix instructions do not appear in the same statement as the instructions they prefix; they appear in their own statement. For example:
	asm
	{
	    rep   ;
	    movsb ;
	}
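A slightly fuller sketch of the same pattern is a byte copy routine. It assumes a 32 bit x86 target with a flat memory model (ES and DS address the same memory), and it relies on the compiler preserving ESI and EDI around the asm block, which is an assumption about the implementation. The name copyBytes is hypothetical.

```d
// Copies n bytes from src to dst.
void copyBytes(void* dst, const(void)* src, size_t n)
{
    version (D_InlineAsm_X86)
    {
        asm
        {
            mov ESI, src;   // source pointer
            mov EDI, dst;   // destination pointer
            mov ECX, n;     // repeat count
            cld;            // copy upward
            rep;            // the prefix is its own statement
            movsb;          // the instruction it prefixes
        }
    }
    else
    {
        import core.stdc.string : memcpy;
        memcpy(dst, src, n);
    }
}
```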
	
pause
This opcode is not supported by the assembler. Instead, use:
	asm
	{
	    rep  ;
	    nop  ;
	}
	
which produces the same result.
floating point ops
Use the two operand form of the instruction:
	fdiv ST(1);	// wrong
	fmul ST;        // wrong
	fdiv ST,ST(1);	// right
	fmul ST,ST(0);	// right
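As a sketch of the two operand form in context, the function below squares a double on the x87 stack. It assumes a 32 bit x86 target and that the assembler takes the operand size of fld and fstp from the declared type of the variable. The name square is hypothetical.

```d
// Returns x * x.
double square(double x)
{
    double r;
    version (D_InlineAsm_X86)
    {
        asm
        {
            fld  x;             // ST(0) = x
            fmul ST, ST(0);     // ST(0) = ST(0) * ST(0), two operand form
            fstp r;             // store the result and pop the stack
        }
    }
    else
    {
        r = x * x;              // fallback for other targets
    }
    return r;
}
```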
	

Operands

	Operand:
	    AsmExp

	AsmExp:
	    AsmLogOrExp
	    AsmLogOrExp ? AsmExp : AsmExp

	AsmLogOrExp:
	    AsmLogAndExp
	    AsmLogAndExp || AsmLogAndExp

	AsmLogAndExp:
	    AsmOrExp
	    AsmOrExp && AsmOrExp

	AsmOrExp:
	    AsmXorExp
	    AsmXorExp | AsmXorExp

	AsmXorExp:
	    AsmAndExp
	    AsmAndExp ^ AsmAndExp

	AsmAndExp:
	    AsmEqualExp
	    AsmEqualExp & AsmEqualExp

	AsmEqualExp:
	    AsmRelExp
	    AsmRelExp == AsmRelExp
	    AsmRelExp != AsmRelExp

	AsmRelExp:
	    AsmShiftExp
	    AsmShiftExp < AsmShiftExp
	    AsmShiftExp <= AsmShiftExp
	    AsmShiftExp > AsmShiftExp
	    AsmShiftExp >= AsmShiftExp

	AsmShiftExp:
	    AsmAddExp
	    AsmAddExp << AsmAddExp
	    AsmAddExp >> AsmAddExp
	    AsmAddExp >>> AsmAddExp

	AsmAddExp:
	    AsmMulExp
	    AsmMulExp + AsmMulExp
	    AsmMulExp - AsmMulExp

	AsmMulExp:
	    AsmBrExp
	    AsmBrExp * AsmBrExp
	    AsmBrExp / AsmBrExp
	    AsmBrExp % AsmBrExp

	AsmBrExp:
	    AsmUnaExp
	    AsmBrExp [ AsmExp ]

	AsmUnaExp:
	    AsmTypePrefix AsmExp
	    offset AsmExp
	    seg AsmExp
	    + AsmUnaExp
	    - AsmUnaExp
	    ! AsmUnaExp
	    ~ AsmUnaExp
	    AsmPrimaryExp

	AsmPrimaryExp:
	    IntegerConstant
	    FloatConstant
	    __LOCAL_SIZE
	    $
	    Register
	    DotIdentifier

	DotIdentifier:
	    Identifier
	    Identifier . DotIdentifier
	
The operand syntax more or less follows the Intel CPU documentation conventions. In particular, for two operand instructions the destination is the left operand and the source is the right operand. The syntax differs from Intel's in order to be compatible with the D language tokenizer and to simplify parsing.
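The grammar above allows constant expressions as operands, with multiplication binding more tightly than addition. The following sketch assumes a 32 bit x86 target and that the assembler folds the expression to a single immediate. The name constantOperand is hypothetical.

```d
// Loads the constant 2 + 3 * 4 as an immediate operand.
int constantOperand()
{
    int r;
    version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, 2 + 3 * 4;   // AsmAddExp over AsmMulExp: 2 + (3 * 4)
            mov r, EAX;           // destination on the left, source on the right
        }
    }
    else
    {
        r = 2 + 3 * 4;            // fallback for other targets
    }
    return r;
}
```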

Operand Types

	AsmTypePrefix:
		near ptr
		far ptr
		byte ptr
		short ptr
		int ptr
		word ptr
		dword ptr
		float ptr
		double ptr
		extended ptr
	
In cases where the operand size is ambiguous, as in:
	add	[EAX],3		;
	
it can be disambiguated by using an AsmTypePrefix:
	add	byte ptr [EAX],3	;
	add	int ptr [EAX],7		;
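The difference between the prefixes is observable. In the sketch below, adding 1 to the byte 0xFF wraps within that byte, while adding 1 to the same address as an int carries into the next byte. It assumes a 32 bit x86 target for the asm path and a little-endian target for the fallback. The names incByte and incInt are hypothetical.

```d
// Adds 1 to the single byte at p.
void incByte(ubyte* p)
{
    version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, p;
            add byte ptr [EAX], 1;   // 8 bit add, carry is not propagated
        }
    }
    else
    {
        *p += 1;
    }
}

// Adds 1 to the 32 bit value starting at p.
void incInt(ubyte* p)
{
    version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, p;
            add int ptr [EAX], 1;    // 32 bit add, carry reaches the next byte
        }
    }
    else
    {
        *cast(uint*) p += 1;
    }
}
```

Starting from the bytes 0xFF, 0, 0, 0, incByte leaves 0, 0, 0, 0 and incInt leaves 0, 1, 0, 0.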
	

Struct/Union/Class Member Offsets

To access members of an aggregate when a pointer to the aggregate is in a register, use the qualified name of the member:
	struct Foo { int a,b,c; }
	int bar(Foo *f)
	{
	    asm
	    {	mov	EBX,f		;
		mov	EAX,Foo.b[EBX]	;
	    }
	}
	

Special Symbols

$
Represents the program counter of the start of the next instruction. So,
	jmp	$  ;
branches to the instruction following the jmp instruction.

__LOCAL_SIZE
This gets replaced by the number of bytes of local variables in the stack frame. It is most useful when naked is used and a custom stack frame is programmed.

Opcodes Supported

aaa aad aam aas adc
add addpd addps addsd addss
and andnpd andnps andpd andps
arpl bound bsf bsr bswap
bt btc btr bts call
cbw cdq clc cld clflush
cli clts cmc cmova cmovae
cmovb cmovbe cmovc cmove cmovg
cmovge cmovl cmovle cmovna cmovnae
cmovnb cmovnbe cmovnc cmovne cmovng
cmovnge cmovnl cmovnle cmovno cmovnp
cmovns cmovnz cmovo cmovp cmovpe
cmovpo cmovs cmovz cmp cmppd
cmpps cmps cmpsb cmpsd cmpss
cmpsw cmpxch8b cmpxchg comisd comiss
cpuid cvtdq2pd cvtdq2ps cvtpd2dq cvtpd2pi
cvtpd2ps cvtpi2pd cvtpi2ps cvtps2dq cvtps2pd
cvtps2pi cvtsd2si cvtsd2ss cvtsi2sd cvtsi2ss
cvtss2sd cvtss2si cvttpd2dq cvttpd2pi cvttps2dq
cvttps2pi cvttsd2si cvttss2si cwd cwde
da daa das db dd
de dec df di div
divpd divps divsd divss dl
dq ds dt dw emms
enter f2xm1 fabs fadd faddp
fbld fbstp fchs fclex fcmovb
fcmovbe fcmove fcmovnb fcmovnbe fcmovne
fcmovnu fcmovu fcom fcomi fcomip
fcomp fcompp fcos fdecstp fdisi
fdiv fdivp fdivr fdivrp feni
ffree fiadd ficom ficomp fidiv
fidivr fild fimul fincstp finit
fist fistp fisub fisubr fld
fld1 fldcw fldenv fldl2e fldl2t
fldlg2 fldln2 fldpi fldz fmul
fmulp fnclex fndisi fneni fninit
fnop fnsave fnstcw fnstenv fnstsw
fpatan fprem fprem1 fptan frndint
frstor fsave fscale fsetpm fsin
fsincos fsqrt fst fstcw fstenv
fstp fstsw fsub fsubp fsubr
fsubrp ftst fucom fucomi fucomip
fucomp fucompp fwait fxam fxch
fxrstor fxsave fxtract fyl2x fyl2xp1
hlt idiv imul in inc
ins insb insd insw int
into invd invlpg iret iretd
ja jae jb jbe jc
jcxz je jecxz jg jge
jl jle jmp jna jnae
jnb jnbe jnc jne jng
jnge jnl jnle jno jnp
jns jnz jo jp jpe
jpo js jz lahf lar
ldmxcsr lds lea leave les
lfence lfs lgdt lgs lidt
lldt lmsw lock lods lodsb
lodsd lodsw loop loope loopne
loopnz loopz lsl lss ltr
maskmovdqu maskmovq maxpd maxps maxsd
maxss mfence minpd minps minsd
minss mov movapd movaps movd
movdq2q movdqa movdqu movhlps movhpd
movhps movlhps movlpd movlps movmskpd
movmskps movntdq movnti movntpd movntps
movntq movq movq2dq movs movsb
movsd movss movsw movsx movupd
movups movzx mul mulpd mulps
mulsd mulss neg nop not
or orpd orps out outs
outsb outsd outsw packssdw packsswb
packuswb paddb paddd paddq paddsb
paddsw paddusb paddusw paddw pand
pandn pavgb pavgw pcmpeqb pcmpeqd
pcmpeqw pcmpgtb pcmpgtd pcmpgtw pextrw
pinsrw pmaddwd pmaxsw pmaxub pminsw
pminub pmovmskb pmulhuw pmulhw pmullw
pmuludq pop popa popad popf
popfd por prefetchnta prefetcht0 prefetcht1
prefetcht2 psadbw pshufd pshufhw pshuflw
pshufw pslld pslldq psllq psllw
psrad psraw psrld psrldq psrlq
psrlw psubb psubd psubq psubsb
psubsw psubusb psubusw psubw punpckhbw
punpckhdq punpckhqdq punpckhwd punpcklbw punpckldq
punpcklqdq punpcklwd push pusha pushad
pushf pushfd pxor rcl rcpps
rcpss rcr rdmsr rdpmc rdtsc
rep repe repne repnz repz
ret retf rol ror rsm
rsqrtps rsqrtss sahf sal sar
sbb scas scasb scasd scasw
seta setae setb setbe setc
sete setg setge setl setle
setna setnae setnb setnbe setnc
setne setng setnge setnl setnle
setno setnp setns setnz seto
setp setpe setpo sets setz
sfence sgdt shl shld shr
shrd shufpd shufps sidt sldt
smsw sqrtpd sqrtps sqrtsd sqrtss
stc std sti stmxcsr stos
stosb stosd stosw str sub
subpd subps subsd subss sysenter
sysexit test ucomisd ucomiss ud2
unpckhpd unpckhps unpcklpd unpcklps verr
verw wait wbinvd wrmsr xadd
xchg xlat xlatb xor xorpd
xorps

AMD Opcodes Supported

pavgusb pf2id pfacc pfadd pfcmpeq
pfcmpge pfcmpgt pfmax pfmin pfmul
pfnacc pfpnacc pfrcp pfrcpit1 pfrcpit2
pfrsqit1 pfrsqrt pfsub pfsubr pi2fd
pmulhrw pswapd

Copyright (c) 1999-2002 by Digital Mars, All Rights Reserved