How to Write Assembly Code Equivalent to C Program
x86 Assembly Language Programming
The x86 architecture is the most popular architecture for desktop and laptop computers. Let's see how we can program in assembly language for processors in this family.
Introduction
This document contains very brief examples of assembly language programs for the x86. The topic of x86 assembly language programming is messy because:
- There are many different assemblers out there: MASM, NASM, gas, as86, TASM, a86, Terse, etc. All use radically different assembly languages.
- There are differences in the way you have to code for Linux, OS X, Windows, etc.
- Many different object file formats exist: ELF, COFF, Win32, OMF, a.out for Linux, a.out for FreeBSD, rdf, IEEE-695, as86, etc.
- You generally will be calling functions residing in the operating system or other libraries so you will have to know some technical details about how libraries are linked, and not all linkers work the same way.
- Modern x86 processors run in either 32 or 64-bit mode; there are quite a few differences between these.
We'll give examples written for NASM, MASM and gas for both Win32 and Linux. We will even include a section on DOS assembly language programs for historical interest. These notes are not intended to be a substitute for the documentation that accompanies the processor and the assemblers, nor are they intended to teach you assembly language. Their only purpose is to show how to assemble and link programs using different assemblers and linkers.
Assemblers and Linkers
Regardless of the assembler, object file format, linker or operating system you use, the programming process is always the same:
Each assembly language file is assembled into an "object file" and the object files are linked with other object files to form an executable. A "static library" is really nothing more than a collection of (probably related) object files. Application programmers generally make use of libraries for things like I/O and math.
Assemblers you should know about include
- MASM, the Microsoft Assembler. It outputs OMF files (but Microsoft's linker can convert them to win32 format). It supports a massive and clunky assembly language. Memory addressing is not intuitive. The directives required to set up a program make programming unpleasant.
- GAS, the GNU assembler. This uses the rather ugly AT&T-style syntax so many people do not like it; however, you can configure it to use and understand the Intel-style syntax. It was designed to be part of the back end of the GNU compiler collection (gcc).
- NASM, the "Netwide Assembler." It is free, small, and best of all it can output zillions of different types of object files. The language is much more sensible than MASM in many respects.
There are many object file formats. Some you should know about include
- OMF: used in DOS but has 32-bit extensions for Windows. Old.
- AOUT: used in early Linux and BSD variants
- COFF: "Common object file format"
- Win, Win32: Microsoft's version of COFF, not exactly the same! Replaces OMF.
- Win64: Microsoft's format for Win64.
- ELF, ELF32: Used in modern 32-bit Linux and elsewhere
- ELF64: Used in 64-bit Linux and elsewhere
- macho32: NeXTstep/OpenStep/Rhapsody/Darwin/OS X 32-bit
- macho64: NeXTstep/OpenStep/Rhapsody/Darwin/OS X 64-bit
The NASM documentation has great descriptions of these.
You'll need to get a linker that (1) understands the object file formats you produce, and (2) can write executables for the operating systems you want to run code on. Some linkers out there include
- LINK.EXE, for Microsoft operating systems.
- ld, which exists on all Unix systems; Windows programmers get this in any gcc distribution.
Programming for Linux
Programming Using System Calls
64-bit Linux installations use the processor's SYSCALL instruction to jump into the portion of memory where operating system services are stored. To use SYSCALL, first put the system call number in RAX, then the arguments, if any, in RDI, RSI, RDX, R10, R8, and R9, respectively. In our first example we will use system calls for writing to a file (call number 1) and exiting a process (call number 60). Here it is in the NASM assembly language:
hello.asm
; ----------------------------------------------------------------------------------------
; Writes "Hello, World" to the console using only system calls. Runs on 64-bit Linux only.
; To assemble and run:
;
;     nasm -felf64 hello.asm && ld hello.o && ./a.out
; ----------------------------------------------------------------------------------------

          global    _start

          section   .text
_start:   mov       rax, 1                  ; system call for write
          mov       rdi, 1                  ; file handle 1 is stdout
          mov       rsi, message            ; address of string to output
          mov       rdx, 13                 ; number of bytes
          syscall                           ; invoke operating system to do the write
          mov       rax, 60                 ; system call for exit
          xor       rdi, rdi                ; exit code 0
          syscall                           ; invoke operating system to exit

          section   .data
message:  db        "Hello, World", 10      ; note the newline at the end
Here's the same program in gas:
hello.s
# ----------------------------------------------------------------------------------------
# Writes "Hello, World" to the console using only system calls. Runs on 64-bit Linux only.
# To assemble and run:
#
#     gcc -c hello.s && ld hello.o && ./a.out
#
# or
#
#     gcc -nostdlib hello.s && ./a.out
# ----------------------------------------------------------------------------------------

        .global _start

        .text
_start:
        # write(1, message, 13)
        mov     $1, %rax                # system call 1 is write
        mov     $1, %rdi                # file handle 1 is stdout
        mov     $message, %rsi          # address of string to output
        mov     $13, %rdx               # number of bytes
        syscall                         # invoke operating system to do the write

        # exit(0)
        mov     $60, %rax               # system call 60 is exit
        xor     %rdi, %rdi              # we want return code 0
        syscall                         # invoke operating system to exit
message:
        .ascii  "Hello, world\n"
Since gas is the "native" assembler under Linux, assembling and linking is automatic with gcc, as explained in the program's comments. If you just enter "gcc hello.s" then gcc will assemble and then try to link with a C library. You can suppress the link step with the -c option to gcc, or do the assembly and linking in one step by telling the linker not to use the C library with -nostdlib.
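For comparison, the same system calls can be issued from C. The sketch below assumes Linux with glibc, whose syscall() wrapper places the call number and arguments in the registers described above. The helper names hello_via_syscall and exit_via_syscall are made up for this illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* asks glibc to declare syscall() */
#endif
#include <sys/syscall.h>     /* SYS_write, SYS_exit */
#include <unistd.h>          /* syscall() */

static const char message[] = "Hello, World\n";

/* Hypothetical helper: performs write(1, message, 13) as a raw system call
   and returns the number of bytes written, or -1 on error. */
long hello_via_syscall(void)
{
    return syscall(SYS_write, 1, message, sizeof message - 1);
}

/* Hypothetical helper: performs exit(status) as a raw system call. */
void exit_via_syscall(int status)
{
    syscall(SYS_exit, status);
}
```

Calling hello_via_syscall() and then exit_via_syscall(0) from main mirrors the two syscall instructions in the assembly listings.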
System Calls in 32-bit Linux
There are some systems with 32-bit builds of Linux out there still. On these systems you invoke operating system services through an INT instruction (INT 0x80), and use different registers for system call arguments (specifically EAX for the call number and EBX, ECX, EDX, ESI, and EDI for the arguments). Although it might be interesting to show some examples for historical reasons, this introduction is probably better kept short.
Programming with a C Library
Sometimes you might like to use your favorite C library functions in your assembly code. This should be trivial because the C library functions are all stored in a C library, such as libc.a. Technically the code is probably in a dynamic library, like libc.so, and libc.a just has calls into the dynamic library. Still, all we have to do is place calls to C functions in our assembly language program, and link with the static C library and we are set.
Before looking at an example, note that the C library already defines _start, which does some initialization, calls a function named main, does some clean up, then calls the system function exit! So if we link with a C library, all we have to do is define main and end with a ret instruction! Here is a simple example in NASM, which illustrates calling puts.
hola.asm
; ----------------------------------------------------------------------------------------
; Writes "Hola, mundo" to the console using a C library. Runs on Linux.
;
;     nasm -felf64 hola.asm && gcc hola.o && ./a.out
; ----------------------------------------------------------------------------------------

          global    main
          extern    puts

          section   .text
main:                                       ; This is called by the C library startup code
          mov       rdi, message            ; First integer (or pointer) argument in rdi
          call      puts                    ; puts(message)
          ret                               ; Return from main back into C library wrapper
message:
          db        "Hola, mundo", 0        ; Note strings must be terminated with 0 in C
And the equivalent program in GAS:
hola.s
# ----------------------------------------------------------------------------------------
# Writes "Hola, mundo" to the console using a C library. Runs on Linux or any other system
# that does not use underscores for symbols in its C library. To assemble and run:
#
#     gcc hola.s && ./a.out
# ----------------------------------------------------------------------------------------

        .global main

        .text
main:                                   # This is called by C library's startup code
        mov     $message, %rdi          # First integer (or pointer) parameter in %rdi
        call    puts                    # puts(message)
        ret                             # Return to C library code
message:
        .asciz "Hola, mundo"            # asciz puts a 0 byte at the end
The previous example shows that the first argument to a C function, if it's an integer or pointer, goes in register RDI. Subsequent arguments go in RSI, RDX, RCX, R8, R9, and then subsequent arguments (which no sane programmer would ever use) will go "on the stack" (more about this stack thing later). If you have floating point arguments, they'll go in XMM0, XMM1, etc. There is even quite a bit more to calling functions; we'll see this later.
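To make the register assignment concrete, here is a small C sketch. It assumes the System V AMD64 calling convention used on 64-bit Linux; the function names sum8 and scale are invented for illustration, and the register notes in the comments describe what a compiler is expected to do under that convention rather than anything the C language itself guarantees.

```c
/* Under the assumed convention, a..f arrive in RDI, RSI, RDX, RCX, R8, R9,
   while g and h are passed on the stack. The result is returned in RAX. */
long sum8(long a, long b, long c, long d, long e, long f, long g, long h)
{
    return a + b + c + d + e + f + g + h;
}

/* A floating point parameter travels in XMM0 and does not use up an
   integer register, so n is still expected in RDI. The double result
   comes back in XMM0. */
double scale(long n, double factor)
{
    return (double)n * factor;
}
```

Compiling a file like this with gcc -S and reading the generated assembly is one way to check the register usage on your own system.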
Programming for OS X
Rather than getting into OS X system calls, let's just show the simple hello program using the C library. We'll assume a 64-bit OS, and we'll also assume you've installed gcc (usually obtained by downloading Xcode).
hola.asm
; ----------------------------------------------------------------------------------------
; This is a macOS console program that writes "Hola, mundo" on one line and then exits.
; It uses puts from the C library. To assemble and run:
;
;     nasm -fmacho64 hola.asm && gcc hola.o && ./a.out
; ----------------------------------------------------------------------------------------

          global    _main
          extern    _puts

          section   .text
_main:    push      rbx                     ; Call stack must be aligned
          lea       rdi, [rel message]      ; First argument is address of message
          call      _puts                   ; puts(message)
          pop       rbx                     ; Fix up stack before returning
          ret

          section   .data
message:  db        "Hola, mundo", 0        ; C strings need a zero byte at the end
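There are some differences here! C library functions have leading underscores, the stack had to be aligned before the call, and we had to address message with rel (written [rel message], which asks for RIP-relative addressing) for some strange reason, which you can read about in the NASM documentation.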
Programming for Win32
Win32 is the primary operating system API found in most of Microsoft's 32-bit operating systems including Windows 9x, NT, 2000 and XP. We will follow the plan of the previous section and first look at programs that just use system calls and then programs that use a C library.
For historical reference only. These notes are pretty old. I've never learned Win64.
Calling the Win32 API Directly
Win32 defines thousands of functions! The code for these functions is spread out in many different dynamic libraries, but the majority of them are in KERNEL32.DLL, USER32.DLL and GDI32.DLL (which exist on all Windows installations). The interrupt to execute system calls on the x86 processor is hex 2E, with EAX containing the system call number and EDX pointing to the parameter table in memory. However, according to z0mbie, the actual system call numbers are not consistent across different operating systems, so to write portable code you should stick to the API calls in the various system DLLs.
Here is the "Hello, World" program in NASM, using only Win32 calls.
hello.asm
; ----------------------------------------------------------------------------
; hello.asm
;
; This is a Win32 console program that writes "Hello, World" on one line and
; then exits.  It uses only plain Win32 system calls from kernel32.dll, so it
; is very instructive to study since it does not make use of a C library.
; Because system calls from kernel32.dll are used, you need to link with
; an import library.  You also have to specify the starting address yourself.
;
; Assembler: NASM
; OS: Any Win32-based OS
; Other libraries: Use gcc's import library libkernel32.a
; Assemble with "nasm -fwin32 hello.asm"
; Link with "ld -e go hello.obj -lkernel32"
; ----------------------------------------------------------------------------

        global  go
        extern  _ExitProcess@4
        extern  _GetStdHandle@4
        extern  _WriteConsoleA@20

        section .data
msg:    db      'Hello, World', 10
handle: dd      0                       ; 32-bit handle, so reserve a doubleword
written:
        dd      0                       ; 32-bit count of characters written

        section .text
go:
        ; handle = GetStdHandle(-11)
        push    dword -11
        call    _GetStdHandle@4
        mov     [handle], eax

        ; WriteConsole(handle, &msg[0], 13, &written, 0)
        push    dword 0
        push    written
        push    dword 13
        push    msg
        push    dword [handle]
        call    _WriteConsoleA@20

        ; ExitProcess(0)
        push    dword 0
        call    _ExitProcess@4
Here you can see that the Win32 calls we are using are GetStdHandle, WriteConsoleA, and ExitProcess, and parameters are passed to these calls on the stack. The comments instruct us to assemble into an object format of "win32" (not "coff"!) then link with the linker ld. Of course you can use any linker you want, but ld comes with gcc and you can download a whole Win32 port of gcc for free. We pass the starting address to the linker, and specify the static library libkernel32.a to link with. This static library is part of the Win32 gcc distribution, and it contains the right calls into the system DLLs.
The gas version of this program looks very similar:
hello.s
/*****************************************************************************
* hello.s
*
* This is a Win32 console program that writes "Hello, World" on one line and
* then exits.  It uses only plain Win32 system calls from kernel32.dll, so it
* is very instructive to study since it does not make use of a C library.
* Because system calls from kernel32.dll are used, you need to link with
* an import library.  You also have to specify the starting address yourself.
*
* Assembler: gas
* OS: Any Win32-based OS
* Other libraries: Use gcc's import library libkernel32.a
* Assemble with "gcc -c hello.s"
* Link with "ld -e go hello.o -lkernel32"
*****************************************************************************/

        .global go

        .data
msg:    .ascii  "Hello, World\n"
handle: .int    0
written:
        .int    0

        .text
go:
        /* handle = GetStdHandle(-11) */
        pushl   $-11
        call    _GetStdHandle@4
        mov     %eax, handle

        /* WriteConsole(handle, &msg[0], 13, &written, 0) */
        pushl   $0
        pushl   $written
        pushl   $13
        pushl   $msg
        pushl   handle
        call    _WriteConsoleA@20

        /* ExitProcess(0) */
        pushl   $0
        call    _ExitProcess@4
In fact the differences between the two programs are really only syntactic. Another minor point is that gas doesn't really care if you declare external symbols with some sort of "extern" directive or not.
As in the NASM version, we've specified our entry point, and will be passing it to the linker in the -e option. To assemble this code, do
gcc -c hello.s
The -c option is important! It tells gcc to assemble but not link. Without the -c option, gcc will try to link the object file with a C runtime library. Since we are not using a C runtime library, and in fact are specifying our own starting point, and cleaning up ourselves with ExitProcess, we definitely want to link ourselves. The linking step is the same as the NASM example; the only difference is that gcc produces win32 object files with extension .o rather than .obj.
If you really want to pay a vendor for an assembler and linker you can use Microsoft's MASM assembler. Anything less than version 6.14 will be extremely painful to use. Here is the version of the hello program in MASM:
hello.asm
; ----------------------------------------------------------------------------
; hello.asm
;
; This is a Win32 console program that writes "Hello, World" on one line and
; then exits.  It uses only plain Win32 system calls from kernel32.dll, so it
; is very instructive to study since it does not make use of a C library.
; Because system calls from kernel32.dll are used, you need to link with
; an import library.
;
; Processor: 386 or later
; Assembler: MASM
; OS: Any Win32-based OS
; Other libraries: Use Microsoft's import library kernel32.lib
; Assemble with "ml hello.asm /c"
; Link with "link hello kernel32.lib /subsystem:console /entry:go"
; ----------------------------------------------------------------------------

        .386P
        .model  flat
        extern  _ExitProcess@4:near
        extern  _GetStdHandle@4:near
        extern  _WriteConsoleA@20:near
        public  _go

        .data
msg     byte    'Hello, World', 10
handle  dword   ?
written dword   ?

        .stack

        .code
_go:
        ; handle = GetStdHandle(-11)
        push    -11
        call    _GetStdHandle@4
        mov     handle, eax

        ; WriteConsole(handle, &msg[0], 13, &written, 0)
        push    0
        push    offset written
        push    13
        push    offset msg
        push    handle
        call    _WriteConsoleA@20

        ; ExitProcess(0)
        push    0
        call    _ExitProcess@4
        end
The processor (.386P) and model (.model) directives are an annoyance, but they have to be there and the processor directive must precede the model directive or the assembler will think the processor is running in 16-bit mode (*sigh*). As before we have to specify an entry point and pass it to the linker. Assemble with
ml hello.asm /c
The /c option is required since ml will otherwise try to link. Not only is the MASM assembler, ml, not free, but neither is Microsoft's linker, link.exe, nor are static versions of the Win32 libraries, such as kernel32.lib. After you buy those you link your code with
link hello.obj kernel32.lib /subsystem:console /entry:go
To get this to work, kernel32.lib needs to be in a known library path or additional options must be passed to the linker. You might find the /subsystem option interesting; leave it out to see a ridiculous error message when running the linked executable (at least under Win9x).
Most of MASM's syntactic weirdness, like using the "offset" keyword to get the address of a variable, is not present in NASM. While NASM is probably gaining popularity, there is far more MASM code out there, and it is a good idea to have at least a passing acquaintance with MASM, since most publications use it. It is the closest thing to a "standard" x86 assembly language there is.
Using a C Runtime Library for Win32 Programming
As under Linux, using a C runtime library makes it very easy to write simple assembly language programs. Here is one in NASM:
powers.asm
; ----------------------------------------------------------------------------
; powers.asm
;
; Displays powers of 2 from 2^0 to 2^30 (31 values), one per line, to
; standard output.
;
; Assembler: NASM
; OS: Any Win32-based OS
; Other libraries: Use gcc's C runtime library
; Assemble with "nasm -fwin32 powers.asm"
; Link with "gcc powers.obj" (C runtime library linked automatically)
; ----------------------------------------------------------------------------

        extern  _printf
        global  _main

        section .text
_main:
        push    esi                     ; callee-save registers
        push    edi

        mov     esi, 1                  ; current value
        mov     edi, 31                 ; counter
L1:
        push    esi                     ; push value to print
        push    format                  ; push address of format string
        call    _printf
        add     esp, 8                  ; pop off parameters passed to printf
        add     esi, esi                ; double value
        dec     edi                     ; keep counting
        jne     L1

        pop     edi
        pop     esi
        ret

format: db      '%d', 10, 0
The same program in gas looks like this:
powers.s
/*****************************************************************************
* powers.s
*
* Displays powers of 2 from 2^0 to 2^30 (31 values), one per line.  It should
* be linked with a C runtime library.  The C runtime library contains startup
* code so you do not have to specify a starting label.  The startup code in
* the C library eventually calls main.
*
* Assembler: gas
* OS: Any Win32-based OS
* Other libraries: Use gcc's C runtime library
* Assemble and link: "gcc powers.s" (gcc links the C library automatically)
*****************************************************************************/

        .global _main

        .text
format: .asciz  "%d\n"
_main:
        pushl   %esi                    /* callee save registers */
        pushl   %edi

        movl    $1, %esi                /* current value */
        movl    $31, %edi               /* counter */
L1:
        pushl   %esi                    /* push value of number to print */
        pushl   $format                 /* push address of format */
        call    _printf
        addl    $8, %esp
        addl    %esi, %esi              /* double value */
        decl    %edi                    /* keep counting */
        jnz     L1

        popl    %edi
        popl    %esi
        ret
Note you can assemble and link with
gcc powers.s
For the MASM version of this program, you can go purchase C Runtime Libraries from Microsoft as well. There are many versions of the library, but for single threaded programs, libc.lib is fine. Here is the powers program in MASM:
powers.asm
; ----------------------------------------------------------------------------
; powers.asm
;
; Displays powers of 2 from 2^0 to 2^30 (31 values), one per line, to
; standard output.
;
; Processor: 386 or later
; Assembler: MASM
; OS: Any Win32-based OS
; Other libraries: Use a Microsoft-compatible C library (e.g. libc.lib).
; Assemble with "ml powers.asm /c"
; Link with "link powers libc.lib"
;
; By default, the linker uses "/subsystem:console /entry:mainCRTStartup".
; The function "mainCRTStartup" is inside libc.lib.  It does some
; initialization, calls a function "_main" (which will end up in powers.obj)
; then does more work and finally calls ExitProcess.
; ----------------------------------------------------------------------------

        .386P
        .model  flat
        extern  _printf:near
        public  _main

        .code
_main:
        push    esi                     ; callee-save registers
        push    edi

        mov     esi, 1                  ; current value
        mov     edi, 31                 ; counter
L1:
        push    esi                     ; push value to print
        push    offset format           ; push address of format string
        call    _printf
        add     esp, 8                  ; pop off parameters passed to printf
        add     esi, esi                ; double value
        dec     edi                     ; keep counting
        jnz     L1

        pop     edi
        pop     esi
        ret

format: byte    '%d', 10, 0

        end
When linking with libc.lib you get nice linker defaults. To assemble and link:
ml powers.asm /c
link powers.obj libc.lib
You'll have to make sure the linker knows where to find libc.lib by setting some environment variables, of course, but you get the idea.
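Since each powers listing is essentially a hand translation of a short C loop, a C counterpart may help when reading them. This is only a sketch: the function name print_powers and its return value are invented here, it assumes a 32-bit unsigned int, and it is meant for counts up to 31. Unlike the assembly, which always runs the loop body at least once, this version prints nothing for a count of zero.

```c
#include <stdio.h>

/* Hypothetical C counterpart of the powers programs: prints `count`
   successive powers of two starting at 1, one per line, and returns the
   last value printed (0 if count is not positive). */
int print_powers(int count)
{
    unsigned value = 1;       /* plays the role of esi; unsigned so that the
                                 final doubling does not overflow a signed int */
    int last = 0;

    while (count-- > 0) {     /* count plays the role of edi */
        last = (int)value;
        printf("%d\n", last); /* push esi / push format / call _printf */
        value += value;       /* add esi, esi */
    }
    return last;
}
```

Calling print_powers(31) from main prints the same 31 lines as the assembly versions, from 1 up to 1073741824.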
OpenGL Programming in NASM for Win32
For fun, here is a complete assembly language program that implements an OpenGL application running under GLUT on Windows systems:
triangle.asm
; ----------------------------------------------------------------------------
; triangle.asm
;
; A very simple *Windows* OpenGL application using the GLUT library.  It
; draws a nicely colored triangle in a top-level application window.  One
; interesting thing is that the Windows GL and GLUT functions do NOT use the
; C calling convention; instead they use the "stdcall" convention which is
; like C except that the callee pops the parameters.
; ----------------------------------------------------------------------------

        global  _main
        extern  _glClear@4
        extern  _glBegin@4
        extern  _glEnd@0
        extern  _glColor3f@12
        extern  _glVertex3f@12
        extern  _glFlush@0
        extern  _glutInit@8
        extern  _glutInitDisplayMode@4
        extern  _glutInitWindowPosition@8
        extern  _glutInitWindowSize@8
        extern  _glutCreateWindow@4
        extern  _glutDisplayFunc@4
        extern  _glutMainLoop@0

        section .text
title:  db      'A Simple Triangle', 0
zero:   dd      0.0
one:    dd      1.0
half:   dd      0.5
neghalf:dd      -0.5

display:
        push    dword 16384
        call    _glClear@4              ; glClear(GL_COLOR_BUFFER_BIT)
        push    dword 9
        call    _glBegin@4              ; glBegin(GL_POLYGON)
        push    dword 0
        push    dword 0
        push    dword [one]
        call    _glColor3f@12           ; glColor3f(1, 0, 0)
        push    dword 0
        push    dword [neghalf]
        push    dword [neghalf]
        call    _glVertex3f@12          ; glVertex(-.5, -.5, 0)
        push    dword 0
        push    dword [one]
        push    dword 0
        call    _glColor3f@12           ; glColor3f(0, 1, 0)
        push    dword 0
        push    dword [neghalf]
        push    dword [half]
        call    _glVertex3f@12          ; glVertex(.5, -.5, 0)
        push    dword [one]
        push    dword 0
        push    dword 0
        call    _glColor3f@12           ; glColor3f(0, 0, 1)
        push    dword 0
        push    dword [half]
        push    dword 0
        call    _glVertex3f@12          ; glVertex(0, .5, 0)
        call    _glEnd@0                ; glEnd()
        call    _glFlush@0              ; glFlush()
        ret

_main:
        push    dword [esp+8]           ; push argv
        lea     eax, [esp+8]            ; get addr of argc (offset changed :-)
        push    eax
        call    _glutInit@8             ; glutInit(&argc, argv)
        push    dword 0
        call    _glutInitDisplayMode@4
        push    dword 80
        push    dword 80
        call    _glutInitWindowPosition@8
        push    dword 300
        push    dword 400
        call    _glutInitWindowSize@8
        push    title
        call    _glutCreateWindow@4
        push    display
        call    _glutDisplayFunc@4
        call    _glutMainLoop@0
        ret
Programming for DOS
Both MASM and NASM can create DOS executables. DOS is a primitive operating system (indeed, many people, perhaps correctly, refuse to call it an operating system), which runs in real mode only. Real mode addresses are 20-bit values written in the form SEGMENT:OFFSET where the segment and offset are each 16-bits wide and the physical address is SEGMENT * 16 + OFFSET.
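The address arithmetic is easy to check in C. The helper below is a minimal sketch (the name real_mode_address is made up); it just evaluates SEGMENT * 16 + OFFSET and does not model what the hardware does when the sum exceeds 20 bits.

```c
#include <stdint.h>

/* Physical address named by a real-mode SEGMENT:OFFSET pair.
   Multiplying the segment by 16 is the same as shifting it left 4 bits. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, B800:0000 maps to physical address 0xB8000, and 1234:0010 and 1235:0000 both map to 0x12350, which shows that many different SEGMENT:OFFSET pairs name the same byte.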
A DOS program is a collection of segments. When the program is loaded, DS:0 and ES:0 point to a 256-byte section of memory called the program segment prefix and this is immediately followed by the segments of the program. CS:0 will point to the code segment and SS:0 to the stack segment. SP will be loaded with the size of the stack specified by the programmer, which is perfect because on the x86 a PUSH instruction decrements the stack pointer and then moves the pushed value into the memory addressed by SS:SP. The length of the command line argument string is placed in the byte at offset 80h of the prefix and the actual argument string begins at offset 81h.
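The layout of the argument string in the prefix can be sketched in C. This is an illustration only: it works on a 256-byte copy of the prefix held in an ordinary array, and the names psp_command_tail, PSP_SIZE, PSP_TAIL_LENGTH and PSP_TAIL_TEXT are invented for the example.

```c
#include <stddef.h>
#include <string.h>

enum {
    PSP_SIZE        = 256,   /* size of the program segment prefix */
    PSP_TAIL_LENGTH = 0x80,  /* byte holding the argument string length */
    PSP_TAIL_TEXT   = 0x81   /* first byte of the argument string */
};

/* Copies the argument string out of a PSP image into `out` as a
   NUL-terminated C string and returns the number of characters copied. */
size_t psp_command_tail(const unsigned char psp[PSP_SIZE], char *out, size_t out_size)
{
    size_t length = psp[PSP_TAIL_LENGTH];
    size_t room = PSP_SIZE - PSP_TAIL_TEXT;  /* bytes available after offset 80h */

    if (out_size == 0)
        return 0;
    if (length > room)                       /* never read past the prefix */
        length = room;
    if (length > out_size - 1)               /* leave space for the terminator */
        length = out_size - 1;
    memcpy(out, psp + PSP_TAIL_TEXT, length);
    out[length] = '\0';
    return length;
}
```

This performs the same two lookups, the length at offset 80h and the text at offset 81h, that a DOS program makes through DS when it reads its arguments.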
Here is a simple DOS program to echo the command line argument string:
echo.asm
; ----------------------------------------------------------------------------
; echo.asm
;
; Echoes the command line to standard output.  Illustrates DOS system calls
; 40h = write to file, and 4ch = exit process.
;
; Processor: 386 or later
; Assembler: MASM
; OS: DOS 2.0 or later only
; Assemble and link with "ml echo.asm"
; ----------------------------------------------------------------------------

        .model  small
        .stack  64                      ; 64 byte stack
        .386
        .code
start:  movzx   cx,byte ptr ds:[80h]    ; size of parameter string
        mov     ah, 40h                 ; write
        mov     bx, 1                   ; ... to standard output
        mov     dx, 81h                 ; ... the parameter string
        int     21h                     ; ... by calling DOS
        mov     ah, 4ch
        int     21h
        end     start
Note that with MASM you have to place the .model directive before the processor directive so that the assembler generates the 16-bit code required for DOS.
Note that all "operating system services" such as input/output are accessible through the processor's interrupt instruction so there is no need to link your program to a special library. Of course if you wanted to link to a 16-bit C runtime library you certainly can.
The echo program defines only a code and stack segment; an example of a program with a programmer-defined data segment is:
hello1.asm
; ----------------------------------------------------------------------------
; hello1.asm
;
; Displays a silly message to standard output.  Illustrates user-defined data.
; The easiest way to do this is to put the data in a data segment, separate
; from the code, and access it via the ds register.  Note that you must have
; ds:0 pointing to your data segment (technically to your segment's GROUP)
; before you reference your data.  The predefined symbol @data refers to
; the group containing the segments created by .data, .data?, .const,
; .fardata, and .fardata?.
;
; Processor: 386 or later
; Assembler: MASM
; OS: DOS 2.0 or later only
; Assemble and link with "ml hello1.asm"
; ----------------------------------------------------------------------------

        .model  small
        .stack  128
        .code
start:  mov     ax, @data
        mov     ds, ax
        mov     ah, 9
        lea     dx, Msg
        int     21h
        mov     ah, 4ch
        int     21h

        .data
Msg     byte    'Hello, there.', 13, 10, '$'

        end     start
Although DOS has been obsolete for many years, a brief study of DOS systems and the x86 real-addressing mode is somewhat interesting. First, real-mode addresses correspond to real, physical memory, so one can watch exactly what is happening in the machine very easily with a good debugger. In fact, most embedded microprocessors work in a kind of "real mode." Less than 1% of microprocessors run desktop PCs, servers and workstations; most are simple embedded processors. Finally a lot of DOS applications still exist, so it might be useful to know what kind of technology underlies it all.
Writing Optimized Code
Assembly language programmers and compiler writers should take great care in producing efficient code. This requires a fairly deep understanding of the x86 architecture, especially the behavior of the cache(s), pipelines and alignment bias. These specifics are well beyond the scope of this little document, but an excellent place to begin your study of this material is Agner Fog's Optimization Guide or even Intel's.
Differences between NASM, MASM, and GAS
The complete syntactic specification of each assembly language can be found elsewhere, but you can learn 99% of what you need to know by looking at a comparison table:
Operation | NASM | MASM | GAS |
---|---|---|---|
Move contents of esi into ebx | mov ebx, esi | mov ebx, esi | movl %esi, %ebx |
Move contents of si into dx | mov dx, si | mov dx, si | movw %si, %dx |
Clear the eax register | xor eax, eax | xor eax, eax | xorl %eax, %eax |
Move immediate value 10 into register al | mov al, 10 | mov al, 10 | movb $10, %al |
Move contents of address 10 into register ecx | mov ecx, [10] | mov ecx, dword ptr ds:[10] | movl 10, %ecx |
Move contents of variable dog into register eax | mov eax, [dog] | mov eax, dog | movl dog, %eax |
Move address of variable dog into register eax | mov eax, dog | mov eax, offset dog | movl $dog, %eax |
Move immediate byte value 10 into memory pointed to by edx | mov byte [edx], 10 | mov byte ptr [edx], 10 | movb $10, (%edx) |
Move immediate 16-bit value 10 into memory pointed to by edx | mov word [edx], 10 | mov word ptr [edx], 10 | movw $10, (%edx) |
Move immediate 32-bit value 10 into memory pointed to by edx | mov dword [edx], 10 | mov dword ptr [edx], 10 | movl $10, (%edx) |
Compare eax to the contents of memory 8 bytes past the cell pointed to by ebp | cmp eax, [ebp+8] | cmp eax, [ebp+8] | cmpl 8(%ebp), %eax |
Add into esi the value in memory ecx quadwords past the cell pointed to by eax | add esi, [eax+ecx*8] | add esi, [eax+ecx*8] | addl (%eax,%ecx,8), %esi |
Add into esi the value in memory ecx doublewords past 128 bytes past the cell pointed to by eax | add esi, [eax+ecx*4+128] | add esi, [eax+ecx*4+128] | addl 128(%eax,%ecx,4), %esi |
Add into esi the value in memory ecx doublewords past eax bytes past the beginning of the variable named array | add esi, [eax+ecx*4+array] | add esi, [eax+ecx*4+array] | addl array(%eax,%ecx,4), %esi |
Add into esi the value in memory ecx words past the beginning of the variable named array | add esi, [ecx*2+array] | add esi, [ecx*2+array] | addl array(,%ecx,2), %esi |
Move the immediate value 4 into the memory cell pointed to by eax using selector fs | mov byte [fs:eax], 4 | mov byte ptr fs:[eax], 4 | movb $4, %fs:(%eax) |
Jump into another segment | jmp S:O | jmp far S:O | ljmp $S, $O |
Call to another segment | call S:O | call far S:O | lcall $S, $O |
Return from an intersegment call | retf V | ret far V | lret $V |
Sign-extend al into ax | cbw | cbw | cbtw |
Sign-extend ax into eax | cwde | cwde | cwtl |
Sign-extend ax into dx:ax | cwd | cwd | cwtd |
Sign-extend eax into edx:eax | cdq | cdq | cltd |
Sign-extend bh into si | movsx si, bh | movsx si, bh | movsbw %bh, %si |
Sign-extend bh into esi | movsx esi, bh | movsx esi, bh | movsbl %bh, %esi |
Sign-extend cx into esi | movsx esi, cx | movsx esi, cx | movswl %cx, %esi |
Zero-extend bh into si | movzx si, bh | movzx si, bh | movzbw %bh, %si |
Zero-extend bh into esi | movzx esi, bh | movzx esi, bh | movzbl %bh, %esi |
Zero-extend cx into esi | movzx esi, cx | movzx esi, cx | movzwl %cx, %esi |
100 doublewords, all initialized to 8192 | times 100 dd 8192 | dd 100 dup (8192) | .fill 100, 4, 8192 |
Reserve 64 bytes of storage | resb 64 | db 64 dup (?) | .space 64 |
Hello World | db 'Hello, World' | db 'Hello, World' | .ascii "Hello, World" |
Hello World with a newline, and zero-terminated | db 'Hello, World', 10, 0 | db 'Hello, World', 10, 0 | .asciz "Hello, World\n" |
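The memory operands in the table all compute an address of the form base + index * scale + displacement. As a rough C illustration of what a row such as add esi, [eax+ecx*4+128] reads, here is a sketch with an invented helper name, load_scaled, that fetches the 32-bit value at that address.

```c
#include <stdint.h>
#include <string.h>

/* Loads the 32-bit value stored at base + index*4 + displacement, the
   address that [eax+ecx*4+128] or 128(%eax,%ecx,4) names when eax holds
   base, ecx holds index, and the displacement is 128. */
int32_t load_scaled(const unsigned char *base, uint32_t index, uint32_t displacement)
{
    int32_t value;
    /* memcpy avoids any alignment assumptions about the computed address */
    memcpy(&value, base + (size_t)index * 4 + displacement, sizeof value);
    return value;
}
```

With an array of 32-bit integers as the base, a displacement of 128 skips 32 elements, so index 3 selects element 35.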
Good to know:
- NASM and MASM use what is sometimes called the Intel syntax, while GAS uses what is called the AT&T syntax.
- GAS uses % to prefix registers
- GAS is source(s) first, destination last; MASM and NASM go the other way.
- GAS denotes operand sizes on instructions (with b, w, l suffixes), rather than on operands
- GAS uses $ for immediates, but also for addresses of variables.
- GAS puts rep/repe/repne/repz/repnz prefixes on separate lines from the instructions they modify
- MASM tries to simplify things for the programmer but makes headaches instead: it tries to "remember" segments, variable sizes and so on. The result is a requirement for stupid ASSUME directives, and the inability to tell what an instruction does by looking at it (you have to go look for declarations; e.g. dw vs. equ).
- MASM writes FPU registers as ST(0), ST(1), etc.
- NASM treats labels case-sensitively; MASM is case-insensitive.
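The sign-extend and zero-extend rows of the comparison table correspond to widening conversions in C. The sketch below uses invented function names and spells out the arithmetic on raw bit patterns, so it does not rely on how a particular compiler converts out-of-range values to signed types.

```c
#include <stdint.h>

/* Sign-extend the low 8 bits, as movsx esi, bh (movsbl %bh, %esi) does. */
int32_t sign_extend_8_to_32(uint8_t bits)
{
    return (int32_t)((bits ^ 0x80) - 0x80);
}

/* Sign-extend the low 16 bits, as cwde or movsx esi, cx does. */
int32_t sign_extend_16_to_32(uint16_t bits)
{
    return (int32_t)((bits ^ 0x8000) - 0x8000);
}

/* Zero-extend the low 8 bits, as movzx esi, bh (movzbl %bh, %esi) does. */
uint32_t zero_extend_8_to_32(uint8_t bits)
{
    return bits;
}
```

The byte 0xF0 becomes -16 when sign-extended and 240 when zero-extended, which is the difference between the movsx and movzx rows.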
Source: https://cs.lmu.edu/~ray/notes/x86assembly/