Copyright © 1999-2002 by Konstantin Boldyshev
Copyright © 1996-1999 by Francois-Rene Rideau
$Date: 2002/08/17 08:35:59 $
Note: You can skip this chapter if you are familiar with HOWTOs, or just hate to read all this assembly-unrelated crap.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License Version 1.1; with no Invariant Sections, with no Front-Cover Texts, and no Back-Cover texts. A copy of the license is included in the GNU Free Documentation License appendix.
The most recent official version of this document is available from the Linux Assembly and LDP sites. If you are reading a few-months-old copy, consider checking the above URLs for a new version.
If you don't know what free software is, please do read carefully the GNU General Public License (GPL or copyleft), which is used in a lot of free software, and is the model for most of their licenses. It generally comes in a file named COPYING (or COPYING.LIB). Literature from the Free Software Foundation (FSF) might help you too. Particularly, the interesting feature of free software is that it comes with source code which you can consult and correct, or sometimes even borrow from. Read your particular license carefully and do comply with it.
To contribute, please contact the maintainer.
Note: At the time of writing, it is Konstantin Boldyshev and no longer Francois-Rene Rideau (since version 0.5). I (Fare) had been looking for some time for a serious hacker to replace me as maintainer of this document, and am pleased to announce Konstantin as my worthy successor.
A Korean translation of this HOWTO is available at http://kldp.org/HOWTO/html/Assembly-HOWTO/. Also, there was a French translation of the early HOWTO versions, but I couldn't find it now.
your code can be fairly difficult to understand and modify, i.e. to maintain
the result is non-portable to other architectures, existing or upcoming
And in any case, as moderator John Levine says on comp.compilers,
"compilers make it a lot easier to use complex data structures,
and compilers don't get bored halfway through
and generate reliably pretty good code."
They will also correctly propagate code transformations throughout the whole (huge) program when optimizing code between procedures and module boundaries.
have automatic tools translate these programs into assembly code
All of the above, i.e. write (an extension to) an optimizing compiler back-end.
Even when assembly is needed (e.g. OS development), you'll find that not so much of it is required, and that the above principles still hold.
See the Linux kernel sources concerning this: as little assembly as needed, resulting in a fast, reliable, portable, maintainable OS. Even a successful game like DOOM was written almost entirely in C, with only a tiny part written in assembly for speed.
As Charles Fiterman says on comp.compilers about human vs computer-generated assembly code:
The human should always win and here is why.
First the human writes the whole thing in a high level language.
Second he profiles it to find the hot spots where it spends its time.
Third he has the compiler produce assembly for those small sections of code.
Fourth he hand tunes them looking for tiny improvements over the machine generated code.
The human wins because he can use the machine.
Hence, if you identify some code portion as being too slow, you should
Finally, before you end up writing assembly, you should inspect generated code, to check that the problem really is with bad code generation, as this might really not be the case: compiler-generated code might be better than what you'd have written, particularly on modern multi-pipelined architectures! Slow parts of a program might be intrinsically so. The biggest problems on modern architectures with fast processors are due to delays from memory access, cache-misses, TLB-misses, and page-faults; register optimization becomes useless, and you'll more profitably re-think data structures and threading to achieve better locality in memory access. Perhaps a completely different approach to the problem might help, then.
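To make the point about locality concrete, here is a minimal sketch (the function names are made up for illustration). Both functions compute the same sum over the same flat array; only the order of memory accesses differs. On typical cached hardware the row-by-row walk tends to be faster for large matrices, though you should measure on your own machine rather than take this on faith.

```c
#include <stddef.h>

/* Sum a rows x cols matrix stored row by row, walking memory sequentially. */
long sum_row_major(const int *m, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Same sum, but each access jumps a whole row ahead of the previous one,
 * which touches a different cache line almost every time on big inputs. */
long sum_col_major(const int *m, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```

No amount of register-level tuning of the second loop will recover what the access pattern loses; reordering the traversal (or the data) is the profitable change.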
The standard way to have assembly code be generated is to invoke your compiler with the -S flag. This works with most Unix compilers, including the GNU C Compiler (GCC), but YMMV. As for GCC, it will produce more understandable assembly code with the -fverbose-asm command-line option. Of course, if you want to get good assembly code, don't forget your usual optimization options and hints!
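As a small illustration, assume the function below is saved as sum.c (a hypothetical file name). Running gcc -O2 -S -fverbose-asm sum.c should then leave annotated assembly in sum.s, which you can read before deciding whether hand-tuning is worth it.

```c
/* sum.c: a deliberately simple function whose generated assembly
 * is short enough to read in one sitting. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Try it with and without -O2: the difference between the two listings is often more instructive than either one alone.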
The original GCC site is the GNU FTP site ftp://prep.ai.mit.edu/pub/gnu/gcc/ together with all released application software from the GNU project. Linux-configured and pre-compiled versions can be found in ftp://metalab.unc.edu/pub/Linux/GCC/ There are a lot of FTP mirrors of both sites everywhere around the world, as well as CD-ROM copies.
GCC development split into two branches some time ago (GCC 2.8 and EGCS), but they merged back, and the current GCC webpage is http://gcc.gnu.org.
Sources adapted to your favorite OS and pre-compiled binaries should be found at your usual FTP sites.
The DOS port of GCC is called DJGPP.
There are two Win32 GCC ports: cygwin and mingw.
There is also an OS/2 port of GCC called EMX; it works under DOS too, and includes lots of unix-emulation library routines. Look around the following site: ftp://ftp-os2.cdrom.com/pub/os2/emx09c.
The right section to look for is C Extensions::Extended Asm::
The DJGPP Games resource (not only for game hackers) had a page specifically about assembly, but it's down. Its data have nonetheless been recovered on the DJGPP site, which contains a mine of other useful information: http://www.delorie.com/djgpp/doc/brennan/, and in the DJGPP Quick ASM Programming Guide.
GCC depends on GAS for assembling and follows its syntax (see below); do mind that in inline asm with operands, a literal percent character must be quoted by doubling it (%%), so that a single % is passed on to GAS. See the section about GAS below.
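Here is a minimal sketch of the percent quoting, assuming GCC (or a compatible compiler) with the default AT&T syntax. The function name add_asm is made up for illustration. Operands are written %0, %1, %2, while the explicitly named register needs the doubled percent sign. On non-x86 targets the sketch simply falls back to plain C.

```c
/* Add two integers through extended inline asm. */
int add_asm(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    int out;
    __asm__ ("movl %1, %%eax\n\t"   /* %%eax reaches GAS as %eax */
             "addl %2, %%eax\n\t"
             "movl %%eax, %0"
             : "=r" (out)           /* %0: output, any general register */
             : "r" (a), "r" (b)     /* %1, %2: inputs */
             : "eax", "cc");        /* tell GCC what we overwrite */
    return out;
#else
    return a + b;                   /* portable fallback, no asm */
#endif
}
```

The clobber list matters as much as the quoting: without "eax" there, GCC would be free to keep one of the operands in that very register.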
Find lots of useful examples in the linux/include/asm-i386/ subdirectory of the sources for the Linux kernel.
More generally, good compile flags for GCC on the x86 platform are
gcc -O2 -fomit-frame-pointer -W -Wall
-W -Wall enables all useful warnings and helps you to catch obvious stupid errors.
You can add some CPU-specific -m486 or such flag so that GCC will produce code that is more adapted to your precise CPU. Note that modern GCC has -mpentium and such flags (and PGCC has even more), whereas GCC 2.7.x and older versions do not. A good choice of CPU-specific flags should be in the Linux kernel. Check the TeXinfo documentation of your current GCC installation for more.
-m386 will help optimize for size, hence also for speed on computers whose memory is tight and/or loaded, since big programs cause swap, which more than counters any "optimization" intended by the larger code. In such settings, it might be useful to stop using C, and use instead a language that favors code factorization, such as a functional language and/or FORTH, and use a bytecode- or wordcode- based implementation.
Note that you can vary code generation flags from file to file, so performance-critical files will use maximum optimization, whereas other files will be optimized for size.
To optimize even more, option -mregparm=2 and/or corresponding function attribute might help, but might pose lots of problems when linking to foreign code, including libc. There are ways to correctly declare foreign functions so that the right call sequences are generated, or you might want to recompile the foreign libraries to use the same register-based calling convention...
Note that you can make these flags the default by editing file /usr/lib/gcc-lib/i486-linux/2.7.2.3/specs or wherever that is on your system (better not add -W -Wall there, though). The exact location of the GCC specs files on your system can be found by gcc -v.
GAS is the GNU Assembler, that GCC relies upon.
Find it at the same place where you've found GCC, in the binutils package. The latest version of binutils is available from http://sources.redhat.com/binutils/.
Here are the major caveats about GAS syntax:
Note: There are a few programs which may help you to convert source code between AT&T and Intel assembler syntaxes; some of them are capable of performing conversion in both directions.
GAS has comprehensive documentation in TeXinfo format, which comes at least with the source distribution. Browse extracted .info pages with Emacs or whatever. There used to be a file named gas.doc or as.doc around the GAS source package, but it was merged into the TeXinfo docs. Of course, in case of doubt, the ultimate documentation is the sources themselves! A section that will particularly interest you is Machine Dependencies::i386-Dependent::
Again, the sources for Linux (the OS kernel) come in as excellent examples; see under linux/arch/i386/ the following files: kernel/*.S, boot/compressed/*.S, math-emu/*.S.
If you are writing some kind of language, a thread package, etc., you might as well see how other languages (OCaml, Gforth, etc.), or thread packages (QuickThreads, MIT pthreads, LinuxThreads, etc.), or whatever else do it.
Finally, just compiling a C program to assembly might show you the syntax for the kind of instructions you want. See section Do you need assembly? above.
The good news is that starting with the binutils 2.10 release, GAS supports Intel syntax too. It can be triggered with the .intel_syntax directive. Unfortunately this mode is not documented (yet?) in the official binutils manual, so if you want to use it, try to examine http://home.snafu.de/phpr/lhpas86.html.gz, which is an extract from the AMD 64bit port of binutils 2.11.
GAS also has GASP (GAS Preprocessor), which adds all the usual macroassembly tricks to GAS. GASP comes together with GAS in the GNU binutils archive. It works as a filter, like CPP and M4. I have no idea on details, but it comes with its own texinfo documentation, which you would like to browse (info gasp), print, grok. GAS with GASP looks like a regular macro-assembler to me.
http://nasm.sourceforge.net, http://www.cryogen.com/nasm/
Binary release on your usual metalab mirror in devel/lang/asm/ directory. Should also be available as .rpm or .deb in your usual RedHat/Debian distributions' contrib.
The syntax is Intel-style. Comprehensive macroprocessing support is integrated.
NASM can be used as a backend for the free LCC compiler (support files included).
Note: NASM comes with a disassembler, NDISASM.
Note: There are a few programs which may help you to convert source code between AT&T and Intel assembler syntaxes; some of them are capable of performing conversion in both directions.
The current version is 0.16; it can be found at http://www.cix.co.uk/~mayday/, in the bin86 package with the linker (ld86), or as a separate archive.
Note: A completely outdated version 0.4 of AS86 is distributed by HJLu just to compile the Linux kernel versions prior to 2.4, in a package named bin86, available in any Linux GCC repository. But I advise no one to use it for anything else but compiling Linux. This version supports only a hacked minix object file format, which is not supported by the GNU binutils or anything, and it has a few bugs in 32-bit mode, so you really had better keep it only for compiling Linux.
Note: They can be in various stages of development, and can be non-classic/high-level/whatever else.
It looks promising; it is under heavy development, and you may want to take part. See http://www.tortall.net/projects/yasm/.
FASM (flat assembler) is a fast, efficient 80x86 assembler that runs in 'flat real mode'. Unlike many other 80x86 assemblers, FASM only requires the source code to include the information it really needs. It is written in itself and is very small and fast. It runs on DOS/Windows/Linux and can produce flat binary, DOS EXE, Win32 PE and COFF output. See http://fasm.sourceforge.net.
It is (of course) slower than other assemblers. It has its own syntax (and uses its own names for x86 opcodes). Fairly good documentation is included. Check it out: ftp://linux01.gwdg.de/pub/cLIeNUX/interim/. You probably won't use it on a regular basis, but at least it deserves your interest as an interesting idea.
HLA is a High Level Assembly language. It uses a high level language like syntax (similar to Pascal, C/C++, and other HLLs) for variable declarations, procedure declarations, and procedure calls. It uses a modified assembly language syntax for the standard machine instructions. It also provides several high level language style control structures (if, while, repeat..until, etc.) that help you write much more readable code.
HLA is free and comes with source; Linux and Win32 versions are available. On Win32 you need MASM and a 32-bit version of MS-link; on Linux you need GAS, because HLA produces code for the specified assembler and uses that assembler for final assembling and linking.
TALC is another free MASM/Win32 based compiler (though it supports ELF output, doesn't it?).
TAL stands for Typed Assembly Language. It extends traditional untyped assembly languages with typing annotations, memory management primitives, and a sound set of typing rules, to guarantee the memory safety, control flow safety, and type safety of TAL programs. Moreover, the typing constructs are expressive enough to encode most source language programming features including records and structures, arrays, higher-order and polymorphic functions, exceptions, abstract data types, subtyping, and modules. Just as importantly, TAL is flexible enough to admit many low-level compiler optimizations. Consequently, TAL is an ideal target platform for type-directed compilers that want to produce verifiably safe code for use in secure mobile code applications or extensible operating system kernels.
Free Pascal has an internal 32-bit assembler (based on NASM tables) and a switchable output that allows:
Binary (ELF and coff when crosscompiled .o) output
NASM
MASM
TASM
AS (aout,coff, elf32)
The MASM and TASM outputs are not as well debugged as the other two, but can be handy sometimes.
The assembler's look and feel are based on Turbo Pascal's internal BASM, and the IDE supports similar highlighting, and FPC can fully integrate with gcc (on C level, not C++).
Using a dummy RTL, one can even generate pure assembler programs.
Win32Forth is a free 32-bit ANS FORTH system that successfully runs under Win32s, Win95, Win/NT. It includes a free 32-bit assembler (either prefix or postfix syntax) integrated into the reflective FORTH language. Macro processing is done with the full power of the reflective language FORTH; however, the only supported input and output context is Win32For itself (no dumping of .obj file, but you could add that feature yourself, of course). Find it at ftp://ftp.forth.org/pub/Forth/Compilers/native/windows/Win32For/.
Terse is a programming tool that provides THE most compact assembler syntax for the x86 family! However, it is evil proprietary software. It is said that there was a project for a free clone somewhere, that was abandoned after worthless pretenses that the syntax would be owned by the original author. Thus, if you're looking for a nifty programming project related to assembly hacking, I invite you to develop a terse-syntax frontend to NASM, if you like that syntax.
As an interesting historic remark, on comp.compilers,
1999/07/11 19:36:51, the moderator wrote:
"There's no reason that assemblers have to have awful syntax. About
30 years ago I used Niklaus Wirth's PL360, which was basically a S/360
assembler with Algol syntax and a a little syntactic sugar like while
loops that turned into the obvious branches. It really was an
assembler, e.g., you had to write out your expressions with explicit
assignments of values to registers, but it was nice. Wirth used it to
write Algol W, a small fast Algol subset, which was a predecessor to
Pascal. As is so often the case, Algol W was a significant
improvement over many of its successors. -John"
You may find more about them, together with the basics of x86 assembly programming, in Raymond Moon's x86 assembly FAQ.
Note that all DOS-based assemblers should work inside the Linux DOS Emulator, as well as other similar emulators, so that if you already own one, you can still use it inside a real OS. Recent DOS-based assemblers also support COFF and/or other object file formats that are supported by the GNU BFD library, so that you can use them together with your free 32-bit tools, perhaps using GNU objcopy (part of the binutils) as a conversion filter.
Assembly programming is a bore, except for critical parts of programs.
%.s: %.S other_dependencies
	$(FILTER) $(FILTER_OPTIONS) < $< > $@
See macro4th (this4th) or the Tunes 0.0.0.25 sources as examples of advanced macroprogramming using m4.
However, its dysfunctional quoting and unquoting semantics force you to use explicit continuation-passing tail-recursive macro style if you want to do advanced macro programming (which is reminiscent of TeX -- BTW, has anyone tried to use TeX as a macroprocessor for anything other than typesetting?). This is NOT worse than CPP, which does not allow quoting and recursion anyway.
The right version of M4 to get is GNU m4 1.4 (or later, if one exists), which has the most features and the fewest bugs or limitations of all. m4 is designed to be slow for anything but the simplest uses, which might still be ok for most assembly programming (you are not writing million-line assembly programs, are you?).
For instance, you could use a program outputting source code
Think about it!
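As one possible reading of "a program outputting source code", here is a minimal sketch of a generator written in C that emits an unrolled sequence of GAS (AT&T syntax) instructions. The function name and the choice of %esi as base pointer and %eax as accumulator are assumptions made purely for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Write `count` instructions of the form "addl N(%esi), %eax" into buf,
 * one per 32-bit array element, i.e. a fully unrolled summing loop.
 * Returns the number of characters written, or -1 if buf is too small. */
int emit_unrolled_sum(char *buf, size_t size, int count)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';

    for (int i = 0; i < count; i++) {
        int n = snprintf(buf + used, size - used,
                         "\taddl\t%d(%%esi), %%eax\n", i * 4);
        if (n < 0 || (size_t)n >= size - used)
            return -1;          /* output would have been truncated */
        used += (size_t)n;
    }
    return (int)used;
}
```

Printing the buffer to a file and feeding that file to the assembler gives you loop unrolling, table generation and the like in a language with real control flow, with no macro processor involved.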
There is a project, using the programming language Icon (with an experimental ML version), to build a basis for producing assembly-manipulating code. See around http://www.eecs.harvard.edu/~nr/toolkit/
The TUNES Project for a Free Reflective Computing System is developing its own assembler as an extension to the Scheme language, as part of its development process. It doesn't run at all yet, though help is welcome.
The assembler manipulates abstract syntax trees, so it could equally serve as the basis for an assembly syntax translator, a disassembler, a common assembler/compiler back-end, etc. Also, the full power of a real language, Scheme, makes it unchallenged as for macroprocessing/metaprogramming.
FP stack: I'm not sure, but I think the result is in st(0), whole stack caller-saved. The SVR4 i386 ABI specification at http://www.caldera.com/developer/devspecs/ is a good reference point if you want more details.
Note that GCC has options to modify the calling conventions by reserving registers, having arguments in registers, not assuming the FPU, etc. Check the i386 .info pages.
Beware that you must then declare the cdecl or regparm(0) attribute for a function that will follow standard GCC calling conventions. See C Extensions::Extended Asm:: section from the GCC info pages. See also how Linux defines its asmlinkage macro...
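A minimal sketch of the two attributes, assuming GCC on i386. The macro names REGPARM2 and STACKARGS are hypothetical. The Linux kernel's asmlinkage on i386 has been defined along the same lines as the second one. On other targets the macros expand to nothing, so the file still compiles.

```c
#if defined(__i386__)
/* First two integer arguments in registers instead of on the stack. */
#define REGPARM2   __attribute__((regparm(2)))
/* Force the standard stack-based convention, even under -mregparm=N. */
#define STACKARGS  __attribute__((regparm(0)))
#else
#define REGPARM2
#define STACKARGS
#endif

/* Internal hot-path function: register arguments are fine here. */
REGPARM2 int fast_add(int a, int b)
{
    return a + b;
}

/* Function meant to be called from hand-written assembly or foreign
 * code that expects arguments on the stack. */
STACKARGS int plain_add(int a, int b)
{
    return a + b;
}
```

Putting the attribute on the declaration in a shared header is what keeps callers and callee in agreement; a mismatch will link without complaint and then pass garbage.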
Some C compilers prepend an underscore before every symbol, while others do not.
Particularly, Linux a.out GCC does such prepending, while Linux ELF GCC does not.
You can also override the implicit C->asm renaming by inserting statements like
void foo(void) asm("bar");
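A slightly fuller sketch of the renaming, assuming GCC or a compatible compiler: the asm label goes on a declaration placed before the definition. The names foo and bar are just the placeholders used above.

```c
/* Declaration: C code says foo, the object file will say bar. */
int foo(int x) __asm__("bar");

/* Definition: still written as foo in C. */
int foo(int x)
{
    return x + 1;
}
```

Running nm on the resulting object file should list bar rather than foo, with no underscore added by the compiler, whatever the platform's usual prefix is.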
Here is a summary of the pros and cons of direct system calls.
the smallest possible size; squeezing the last byte out of the system
the highest possible speed; squeezing cycles out of your favorite benchmark
no pollution by C calling conventions (if you're developing your own language or environment)
just for the fun out of it (don't you get a kick out of assembly programming?)
Cons:
If any other program on your computer uses the libc, then duplicating the libc code will actually waste memory, not save it.
Services redundantly implemented in many static binaries are a waste of memory. But you can make your libc replacement a shared library.
Size is much better saved by having some kind of bytecode, wordcode, or structure interpreter than by writing everything in assembly. (the interpreter itself could be written either in C or assembly.) The best way to keep multiple binaries small is to not have multiple binaries, but instead to have an interpreter process files with #! prefix. This is how OCaml works when used in wordcode mode (as opposed to optimized native code mode), and it is compatible with using the libc. This is also how Tom Christiansen's Perl PowerTools reimplementation of unix utilities works. Finally, one last way to keep things small, that doesn't depend on an external file with a hardcoded path, be it library or interpreter, is to have only one binary, and have multiply-named hard or soft links to it: the same binary will provide everything you need in an optimal space, with no redundancy of subroutines or useless binary headers; it will dispatch its specific behavior according to its argv[0]; in case it isn't called with a recognized name, it might default to a shell, and be possibly thus also usable as an interpreter!
You cannot benefit from the many functionalities that libc provides besides mere linux syscalls: that is, functionality described in section 3 of the manual pages, as opposed to section 2, such as malloc, threads, locale, password, high-level network management, etc.
Therefore, you might have to reimplement large parts of libc, from printf() to malloc() and gethostbyname(). It's redundant with the libc effort, and can be quite boring sometimes. Note that some people have already reimplemented "light" replacements for parts of the libc -- check them out! (Redhat's minilibc, Rick Hohensee's libsys, Felix von Leitner's dietlibc, Christian Fowelin's libASM; the asmutils project is working on a pure assembly libc)
Static libraries prevent you from benefiting from libc upgrades as well as from libc add-ons such as the zlibc package, which does on-the-fly transparent decompression of gzip-compressed files.
The few instructions added by the libc can be a ridiculously small speed overhead as compared to the cost of a system call. If speed is a concern, your main problem is in your usage of system calls, not in their wrapper's implementation.
Using the standard assembly API for system calls is much slower than using the libc API when running in micro-kernel versions of Linux such as L4Linux, that have their own faster calling convention, and pay high convention-translation overhead when using the standard one (L4Linux comes with libc recompiled with their syscall API; of course, you could recompile your code with their API, too).
See the previous discussion for general speed optimization issues.
If syscalls are too slow for you, you might want to hack the kernel sources (in C) instead of staying in userland.
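The single-binary trick mentioned in the list above (one executable, many names, dispatch on argv[0]) can be sketched as follows. The applet names and the enum are made up for illustration; a real multi-call binary would have a table of entry points instead.

```c
#include <string.h>

enum applet { APPLET_TRUE, APPLET_FALSE, APPLET_SHELL };

/* Strip any directory part: "/bin/true" and "true" must match alike. */
const char *applet_name(const char *argv0)
{
    const char *slash = strrchr(argv0, '/');
    return slash ? slash + 1 : argv0;
}

/* Decide what to behave as from the name we were invoked under.
 * Unrecognized names fall through to the default "shell" behaviour. */
enum applet pick_applet(const char *argv0)
{
    const char *name = applet_name(argv0);

    if (strcmp(name, "true") == 0)
        return APPLET_TRUE;
    if (strcmp(name, "false") == 0)
        return APPLET_FALSE;
    return APPLET_SHELL;
}
```

A main() would call pick_applet(argv[0]) and switch on the result; the extra names are then created as hard or soft links to the one binary.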
If you've pondered the above pros and cons, and still want to use direct syscalls, then here is some advice.
You can easily define your system calling functions in a portable way in C (as opposed to unportably in assembly), by including asm/unistd.h, and using the provided macros.
Since you're trying to replace it, go get the sources for the libc, and grok them. (And if you think you can do better, then send feedback to the authors!)
As an example of pure assembly code that does everything you want, examine Linux assembly resources.
Basically, you issue an int 0x80, with the __NR_syscallname number (from asm/unistd.h) in eax, and parameters (up to six) in ebx, ecx, edx, esi, edi, ebp respectively.
Result is returned in eax, with a negative result being an error, whose opposite is what libc would put into errno. The user-stack is not touched, so you needn't have a valid one when doing a syscall.
Note: Passing the sixth parameter in ebp appeared in Linux 2.4; previous Linux versions understand only 5 parameters in registers.
Linux Kernel Internals, and especially its chapter How System Calls Are Implemented on i386 Architecture?, will give you a more thorough overview.
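The return convention described above can be sketched in C with GCC inline asm. Both function names are hypothetical. The raw wrapper is only compiled on i386 Linux, since that is where int 0x80 with these registers applies. The converter treats only values from -4095 to -1 as errors, which is the range glibc uses; that range is a refinement on "any negative result" and is stated here as an assumption.

```c
#include <errno.h>

/* Turn a raw kernel return value into the libc convention:
 * -1 with errno set on error, the value itself otherwise. */
long raw_to_libc(long raw)
{
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}

#if defined(__i386__) && defined(__linux__)
/* One-argument system call: number in eax, argument in ebx,
 * result back in eax. No libc involved. */
long raw_syscall1(long nr, long arg1)
{
    long ret;
    __asm__ volatile ("int $0x80"
                      : "=a" (ret)
                      : "0" (nr), "b" (arg1)
                      : "memory");
    return ret;
}
#endif
```

On i386, something like raw_to_libc(raw_syscall1(__NR_close, fd)), with __NR_close taken from asm/unistd.h, would then behave much like close(fd).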
As for the invocation arguments passed to a process upon startup, the general principle is that the stack originally contains the number of arguments argc, then the list of pointers that constitute *argv, then a null-terminated sequence of null-terminated variable=value strings for the environment. For more details, do examine Linux assembly resources, read the sources of C startup code from your libc (crt0.S or crt1.S), or those from the Linux kernel (exec.c and binfmt_*.c in linux/fs/).
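The layout just described can be walked with a few lines of C. This is a sketch under the assumption that sp points at the argc word exactly as at process entry; the struct and function names are made up. The argv list, like the environment list, ends with a null pointer, which is why the environment starts at index argc + 2.

```c
#include <stddef.h>
#include <stdint.h>

struct startup_info {
    long   argc;   /* number of arguments */
    char **argv;   /* argc pointers followed by NULL */
    char **envp;   /* "variable=value" pointers followed by NULL */
};

/* sp[0] holds argc (stored as a machine word), sp[1..argc] are the
 * argument pointers, sp[argc+1] is NULL, then the environment begins. */
struct startup_info parse_startup_stack(char **sp)
{
    struct startup_info info;

    info.argc = (long)(intptr_t)sp[0];
    info.argv = sp + 1;
    info.envp = sp + 1 + info.argc + 1;
    return info;
}

/* Count environment strings up to the terminating NULL. */
size_t count_env(char **envp)
{
    size_t n = 0;
    while (envp[n] != NULL)
        n++;
    return n;
}
```

In a pure assembly _start, the same arithmetic is done on esp directly, before anything has been pushed.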
Particularly, if what you want is Graphics programming, then do join one of the GGI or XFree86 projects.
Some people have even done better, writing small and robust XFree86 drivers in an interpreted domain-specific language, GAL, and achieving the efficiency of hand C-written drivers through partial evaluation (drivers not only not in asm, but not even in C!). The problem is that the partial evaluator they used to achieve efficiency is not free software. Any taker for a replacement?
Anyway, in all these cases, you'll be better off using GCC inline assembly with the macros from linux/asm/*.h than writing full assembly source files.
Such thing is theoretically possible (proof: see how DOSEMU can selectively grant hardware port access to programs), and I've heard rumors that someone somewhere did actually do it (in the PCI driver? Some VESA access stuff? ISA PnP? dunno). If you have some more precise information on that, you'll be most welcome. Anyway, good places to look for more information are the Linux kernel sources, DOSEMU sources (and other programs in the DOSEMU repository), and sources for various low-level programs under Linux... (perhaps GGI if it supports VESA).
Basically, you must either use 16-bit protected mode or vm86 mode.
The first is simpler to set up, but only works with well-behaved code that won't do any kind of segment arithmetic or absolute segment addressing (particularly addressing segment 0), unless by chance it happens that all segments used can be set up in advance in the LDT.
The latter allows for more "compatibility" with vanilla 16-bit environments, but requires more complicated handling.
In both cases, before you can jump to 16-bit code, you must
mmap any absolute address used in the 16-bit code (such as ROM, video buffers, DMA targets, and memory-mapped I/O) from /dev/mem to your process' address space,
set up the LDT and/or vm86 mode monitor,
grab proper I/O permissions from the kernel (see the above section).
Again, carefully read the source for the stuff contributed to the DOSEMU project, particularly these mini-emulators for running ELKS and/or simple .COM programs under Linux/i386.
Docs about DPMI (and much more) can be found on ftp://x2ftp.oulu.fi/pub/msdos/programming/ (again, the original x2ftp site is closing (no more?), so use a mirror site).
DJGPP comes with its own (limited) glibc derivative/subset/replacement, too.
It is possible to cross-compile from Linux to DOS, see the devel/msdos/ directory of your local FTP mirror for metalab.unc.edu; Also see the MOSS DOS-extender from the Flux project from the university of Utah.
Other documents and FAQs are more DOS-centered; we do not recommend DOS development.
Windows and Co. This document is not about Windows programming; you can find lots of documents about it everywhere. The thing you should know is that Cygnus Solutions developed the cygwin32.dll library for GNU programs to run on the Win32 platform; thus, you can use GCC, GAS, all the GNU tools, and many other Unix applications.
Hence, for easier debugging purpose, you might like to develop your "OS" first as a process running on top of Linux (despite the slowness), then use the Flux OS kit (which grants use of Linux and BSD drivers in your own OS) to make it stand-alone. When your OS is stable, it is time to write your own hardware drivers if you really love that.
This HOWTO will not cover topics such as bootloader code, getting into 32-bit mode, handling Interrupts, the basics about Intel protected mode or V86/R86 braindeadness, defining your object format and calling conventions.
The main place to find reliable information about all that is the source code of existing OSes and bootloaders. Lots of pointers are on the following webpage: http://www.tunes.org/Review/OSes.html
You may also want to read the Introduction to UNIX assembly programming tutorial; it contains sample code for other UNIX-like OSes.
First of all you need an assembler (compiler) -- nasm or gas.
As for nasm, you may have to download and install binary packages for Linux and docs from the nasm site; note that several distributions (Stampede, Debian, SuSe, Mandrake) already have nasm, so check first.
If you're going to dig in, you should also install include files for your OS, and if possible, kernel source.
section .data                           ;section declaration

msg     db      "Hello, world!",0xa     ;our dear string
len     equ     $ - msg                 ;length of our dear string

section .text                           ;section declaration

                        ;we must export the entry point to the ELF linker or
    global _start       ;loader. They conventionally recognize _start as their
                        ;entry point. Use ld -e foo to override the default.

_start:

;write our string to stdout

        mov     edx,len ;third argument: message length
        mov     ecx,msg ;second argument: pointer to message to write
        mov     ebx,1   ;first argument: file handle (stdout)
        mov     eax,4   ;system call number (sys_write)
        int     0x80    ;call kernel

;and exit

        mov     ebx,0   ;first syscall argument: exit code
        mov     eax,1   ;system call number (sys_exit)
        int     0x80    ;call kernel

.data                                   # section declaration

msg:
        .ascii  "Hello, world!\n"       # our dear string
        len = . - msg                   # length of our dear string

.text                                   # section declaration

                        # we must export the entry point to the ELF linker or
    .global _start      # loader. They conventionally recognize _start as their
                        # entry point. Use ld -e foo to override the default.

_start:

# write our string to stdout

        movl    $len,%edx       # third argument: message length
        movl    $msg,%ecx       # second argument: pointer to message to write
        movl    $1,%ebx         # first argument: file handle (stdout)
        movl    $4,%eax         # system call number (sys_write)
        int     $0x80           # call kernel

# and exit

        movl    $0,%ebx         # first argument: exit code
        movl    $1,%eax         # system call number (sys_exit)
        int     $0x80           # call kernel

The first step of building an executable is compiling (or assembling) an object file from the source:
$ nasm -f elf hello.asm
$ as -o hello.o hello.S
Your main resource for Linux/UNIX assembly programming material is:
Do visit it, and get plenty of pointers to assembly projects, tools, tutorials, documentation, guides, etc, concerning different UNIX operating systems and CPUs. Because it evolves quickly, I will no longer duplicate it here.
If you are new to assembly in general, here are a few starting pointers:
ftp.luth.se mirrors the hornet and x2ftp former archives of msdos assembly coding stuff
CoreWars, a fun way to learn assembly in general
Usenet: comp.lang.asm.x86; alt.lang.asm
Mailing list address is <linux-assembly@vger.kernel.org>.
To subscribe, send a message to <majordomo@vger.kernel.org> with the following line in the body of the message:
subscribe linux-assembly
Detailed information and list archives are available at http://linuxassembly.org/list.html.
Here are frequently asked questions (with answers) about Linux assembly programming. Some of the questions (and the answers) were taken from the the linux-assembly mailing list.
OK, you have a number of options for graphics in Linux. Which one you use depends on what you want to do. There isn't one Web site with all the information, but here are some tips:

SVGALib: This is a C library for console SVGA access.
Pros: very easy to learn, good coding examples, not all that different from equivalent gfx libraries for DOS, all the effects you know from DOS can be converted with little difficulty.
Cons: programs need superuser rights to run since they write directly to the hardware, doesn't work with all chipsets, can't run under X-Windows.
Search for svgalib-1.4.x on http://ftp.is.co.za

Framebuffer: do-it-yourself graphics at SVGA res.
Pros: fast, linear mapped video access, ASM can be used if you want :)
Cons: has to be compiled into the kernel, chipset-specific issues, must switch out of X to run, relies on good knowledge of Linux system calls and kernel, tough to debug.
Examples: asmutils (http://www.linuxassembly.org) and the leaves example, and my own site for some framebuffer code and tips in asm (http://ma.verick.co.za/linux4k/)

Xlib: the application and development libraries for XFree86.
Pros: complete control over your X application.
Cons: difficult to learn, horrible to work with, and requires quite a bit of knowledge as to how X works at the low level. Not recommended, but if you're really masochistic go for it. All the include and lib files are probably installed already, so you have what you need.

Low-level APIs: include PTC, SDL, GGI and Clanlib.
Pros: very flexible, run under X or the console, generally abstract away the video hardware a little so you can draw to a linear surface, lots of good coding examples, can link to other APIs like OpenGL and sound libs, Windows DirectX versions for free.
Cons: not as fast as doing it yourself, often in development so versions can (and do) change frequently.
Examples: PTC and GGI have excellent demos, SDL is used in sdlQuake, Myth II, Civ CTP, and Clanlib has been used for games as well.

High-level APIs: OpenGL - any others?
Pros: clean API, tons of functionality and examples, industry standard so you can learn from SGI demos, for example.
Cons: hardware acceleration is normally a must, some quirks between versions and platforms.
Examples: loads - check out www.mesa3d.org under the links section.

To get going, try looking at the svgalib examples and also install SDL and get it working. After that, the sky's the limit.
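Both the framebuffer and the low-level APIs above hand you a linear surface, so drawing a pixel comes down to one address calculation. The sketch below illustrates that calculation in C on an ordinary memory buffer. The `struct surface` type and the function names are made up for this illustration; on a real framebuffer the geometry would come from the driver (for instance the FBIOGET_VSCREENINFO and FBIOGET_FSCREENINFO ioctls), and 32-bit pixels are an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of a linear surface. On a real framebuffer
 * these values would be read from the driver, not filled in by hand. */
struct surface {
    uint8_t *base;            /* start of the pixel memory */
    size_t   line_length;     /* bytes per scanline (may exceed width * bpp) */
    size_t   bytes_per_pixel; /* 4 for the 32-bit case assumed below */
    size_t   width, height;   /* visible size in pixels */
};

/* Byte offset of pixel (x, y) from the start of the surface. */
size_t pixel_offset(const struct surface *s, size_t x, size_t y)
{
    assert(x < s->width && y < s->height);
    return y * s->line_length + x * s->bytes_per_pixel;
}

/* Store one 32-bit pixel; assumes bytes_per_pixel == 4. */
void put_pixel32(struct surface *s, size_t x, size_t y, uint32_t color)
{
    assert(s->bytes_per_pixel == 4);
    /* memcpy avoids any alignment assumption about the base address */
    memcpy(s->base + pixel_offset(s, x, y), &color, sizeof color);
}
```

In assembly the same offset is a multiply and an add (or a single lea when the scanline length allows it), which is why linear surfaces are pleasant to program by hand.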
There's an early version of the Assembly Language Debugger, which is designed to work with assembly code, and is portable enough to run on Linux and *BSD. It is already functional and should be the right choice, check it out!
You can also try gdb ;). Although it is a source-level debugger, it can be used to debug pure assembly code, and with some trickery you can make gdb do what you need (unfortunately, the nasm '-g' switch does not generate proper debug info for gdb; this is a nasm bug, I think). Here's an answer from Dmitry Bakhvalov:
Personally, I use gdb for debugging asmutils. Try this:

1) Use the following stuff to compile:

    $ nasm -f elf -g smth.asm
    $ ld -o smth smth.o

2) Fire up gdb:

    $ gdb smth

3) In gdb:

    (gdb) disassemble _start

Place a breakpoint at _start+1 (if placed at _start the breakpoint wouldn't work, dunno why):

    (gdb) b *0x8048075

To step through the code I use the following macro:

    (gdb) define n
    >ni
    >printf "eax=%x ebx=%x ...etc...",$eax,$ebx,...etc...
    >disassemble $pc $pc+15
    >end

Then start the program with the r command and debug with n.

Hope this helps.
An additional note from ???:
I have had such a macro in my .gdbinit for quite some time now, and it certainly makes life easier. A small difference: I use "x /8i $pc", which guarantees a fixed number of disassembled instructions. Then, with a well-chosen size for my xterm, the gdb output looks like it is refreshed in place rather than scrolling.
If you want to set breakpoints across your code, you can just use the int 3 instruction as a breakpoint (instead of entering the address manually in gdb).
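To see what int 3 does when no debugger is attached, here is a small C sketch. On Linux the instruction raises SIGTRAP; gdb catches that signal and stops, while a plain process is killed by it unless it installs a handler. The function name `hit_breakpoint` is invented for this example, and the inline assembly assumes GCC-style syntax on x86 (other architectures fall back to raise()).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t trap_seen = 0;

/* SIGTRAP handler: just record that the trap arrived. */
static void on_trap(int signo)
{
    (void)signo;
    trap_seen = 1;
}

/* Executes one breakpoint instruction.
 * Returns 1 if SIGTRAP was delivered, 0 if not, -1 on setup failure. */
int hit_breakpoint(void)
{
    struct sigaction sa, old;

    sa.sa_handler = on_trap;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGTRAP, &sa, &old) != 0)
        return -1;

    trap_seen = 0;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile ("int3");   /* same opcode (0xCC) that gdb plants */
#else
    raise(SIGTRAP);              /* non-x86 stand-in for this sketch */
#endif

    sigaction(SIGTRAP, &old, NULL);  /* restore the previous disposition */
    return trap_seen;
}
```

Because int 3 is a trap, the saved instruction pointer already points past it, so execution simply continues after the handler returns. Remember to remove such breakpoints from code you ship.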
If you're using gas, you should consult gas and gdb related tutorials.
section .text

    global init_module
    global cleanup_module
    global kernel_version

    extern printk

init_module:
    push dword str1
    call printk
    pop eax
    xor eax,eax
    ret

cleanup_module:
    push dword str2
    call printk
    pop eax
    ret

str1            db "init_module done",0xa,0
str2            db "cleanup_module done",0xa,0

kernel_version  db "2.2.18",0
$ nasm -f elf -o module.m module.asm

$ ld -r -o module.o module.m
A laconic answer from H-Peter Recktenwald:
    ebx := 0 (in fact, any value below .bss seems to do)
    sys_brk
    eax := current top (of .bss section)

    ebx := [ current top < ebx < (esp - 16K) ]
    sys_brk
    eax := new top of .bss
An extensive answer from Tiago Gasiba:
; sys_brk, erro1 and erro2 are assumed to be defined elsewhere

section .bss

var1    resb    1

section .text

;
;allocate memory
;

%define LIMIT   0x4000000           ; 64 MB

    mov     ebx,0                   ; get bottom of data segment
    call    sys_brk

    cmp     eax,-1                  ; ok?
    je      erro1

    add     eax,LIMIT               ; allocate +LIMIT memory
    mov     ebx,eax
    call    sys_brk

    cmp     eax,-1                  ; ok?
    je      erro1

    cmp     eax,var1+1              ; has the data segment grown?
    je      erro1

;
;use allocated memory
;
                                    ; now eax contains bottom of
                                    ; data segment
    mov     ebx,eax                 ; save bottom
    mov     eax,var1                ; eax=beginning of data segment
repeat:
    mov     byte [eax],1            ; fill up with 1's, one byte at a time
    inc     eax
    cmp     ebx,eax                 ; current pos = bottom?
    jne     repeat

;
;free memory
;

    mov     ebx,var1                ; deallocate memory
    call    sys_brk                 ; by forcing its beginning=var1

    cmp     eax,-1                  ; ok?
    je      erro2
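For comparison, the same allocate, use, free cycle can be sketched in C through the libc wrappers around the brk system call. This is only an illustration under assumptions: it relies on sbrk() and brk() as provided by glibc on Linux, the function name `brk_roundtrip` is invented here, and nothing else (malloc included) may move the break while it runs.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Grows the data segment by `size` bytes, fills the new region with 1s,
 * then moves the break back. Returns 0 on success, -1 on failure. */
int brk_roundtrip(size_t size)
{
    assert(size > 0);

    /* sbrk(n) moves the break up by n and returns the OLD break,
     * which is the first byte of the newly allocated region. */
    unsigned char *start = sbrk((intptr_t)size);
    if (start == (void *)-1)
        return -1;

    /* sbrk(0) only queries the current break (like ebx = 0 above). */
    unsigned char *top = sbrk(0);
    if (top != start + size)
        return -1;              /* someone else moved the break */

    memset(start, 1, size);     /* use the memory: [start, top) */
    int filled = (start[0] == 1 && start[size - 1] == 1);

    /* Free the memory by putting the break back where it was. */
    if (brk(start) != 0)
        return -1;

    return filled ? 0 : -1;
}
```

A call such as `brk_roundtrip(0x4000000)` mirrors the 64 MB request of the assembly version. Note that the region ends one byte before the new break, so the fill must stop at `top - 1`.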
An answer from Patrick Mochel:
When you call sys_open, you get back a file descriptor, which is simply an index into a table of all the open file descriptors that your process has. stdin, stdout, and stderr are always 0, 1, and 2, respectively, because that is the order in which they are always opened for your process. Also, notice that the first file descriptor that you open yourself (w/o first closing any of those magic three descriptors) is always 3, and they increment from there.

Understanding the index scheme will explain what select does. When you call select, you are saying that you are waiting on certain file descriptors to read from, certain ones to write to, and certain ones to watch for exceptions. Your process can have up to 1024 file descriptors open, so an fd_set is just a bit mask describing which file descriptors are valid for each operation. Make sense?

Since each fd that you have open is just an index, and it only needs to be on or off for each fd_set, you need only 1024 bits for an fd_set structure. 1024 / 32 = 32 longs needed to represent the structure.

Now, for the loose example. Suppose you want to read from a file descriptor (w/o timeout).

- Allocate the equivalent to an fd_set.

    section .data
    my_fds: times 32 dd 0

- Open the file descriptor that you want to read from.

- Set that bit in the fd_set structure.

First, you need to figure out which of the 32 dwords the bit is in. Then, use bts to set the bit in that dword. bts will do a modulo 32 when setting the bit. That's why you need to first figure out which dword to start with.

    ; eax = the file descriptor
    mov edx, 0
    mov ebx, 32
    div ebx                         ; eax = dword index, edx = bit number
    lea ebx, [my_fds]
    bts dword [ebx + eax * 4], edx

- Repeat the last step for any file descriptors you want to read from.

- Repeat the entire exercise for either of the other two fd_sets if you want action from them.

That leaves two other parts of the equation - the n parameter and the timeout parameter. I'll leave the timeout parameter as an exercise for the reader (yes, I'm lazy), but I'll briefly talk about the n parameter.

It is the value of the largest file descriptor you are selecting from (from any of the fd_sets), plus one. Why plus one? Well, because it's easy to determine a mask from that value. Suppose that there is data available on x file descriptors, but the highest one you care about is (n - 1). Since an fd_set is just a bitmask, the kernel needs some efficient way for determining whether to return or not from select. So, it masks off the bits that you care about, checks if anything is available from the bits that are still set, and returns if there is (pause as I rummage through kernel source). Well, it's not as easy as I fantasized it would be. To see how the kernel determines that mask, look in fs/select.c in the kernel source tree.

Anyway, you need to know that number, and the easiest way to do it is to save the value of the last file descriptor opened somewhere so you don't lose it.

Ok, that's what I know. A warning about the code above (as always) is that it is not tested. I think it should work, but if it doesn't, let me know. But, if it starts a global nuclear meltdown, don't call me. ;-)
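The bit arithmetic and the n parameter described above can be checked from C before committing them to assembly. The sketch below is an illustration, not part of the original answer: `fdmask_set`, `fdmask_isset` and `readable_now` are invented names, the 32-dword mask mirrors the my_fds layout, and a zero timeout is used (unlike the blocking case in the text) so that the call returns immediately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

#define FD_WORDS 32   /* 1024 bits / 32 bits per dword */

/* Mirrors the div/bts pair: the quotient picks the dword,
 * the remainder picks the bit inside it. */
void fdmask_set(uint32_t mask[FD_WORDS], int fd)
{
    assert(fd >= 0 && fd < FD_WORDS * 32);
    mask[fd / 32] |= (uint32_t)1 << (fd % 32);
}

int fdmask_isset(const uint32_t mask[FD_WORDS], int fd)
{
    assert(fd >= 0 && fd < FD_WORDS * 32);
    return (int)((mask[fd / 32] >> (fd % 32)) & 1u);
}

/* Returns 1 if `fd` has data to read right now, 0 if not, -1 on error. */
int readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };   /* zero timeout: poll, do not block */

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    /* n = highest descriptor in any of the sets, plus one */
    if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0)
        return -1;

    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

For example, descriptor 35 lands in dword 1, bit 3. Passing NULL instead of `&tv` gives the blocking behaviour (no timeout) that the answer assumes.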
That's all for now, folks.
Revision History | ||
---|---|---|
Revision 0.6f | 17 Aug 2002 | Revised by: konst |
Added FASM, added URL to Korean translation, added URL to SVR4 i386 ABI specs, update on HLA/Linux, small fix in hello.S example, misc URL updates; | ||
Revision 0.6e | 12 Jan 2002 | Revised by: konst |
Added URL describing GAS Intel syntax; Added OSIMPA(former SHASM); Added YASM; FAQ update. | ||
Revision 0.6d | 18 Mar 2001 | Revised by: konst |
Added Free Pascal; new NASM URL again | ||
Revision 0.6c | 15 Feb 2001 | Revised by: konst |
Added SHASM; new answer in FAQ, new NASM URL, new mailing list address | ||
Revision 0.6b | 21 Jan 2001 | Revised by: konst |
new questions in FAQ, corrected few URLs | ||
Revision 0.6a | 10 Dec 2000 | Revised by: konst |
Remade section on AS86 (thanks to Holluby Istvan for pointing out obsolete information). Fixed several URLs that can be incorrectly rendered from sgml to html. | ||
Revision 0.6 | 11 Nov 2000 | Revised by: konst |
HOWTO is completely rewritten using DocBook DTD. Layout is totally rearranged; too many changes to list them here. | ||
Revision 0.5n | 07 Nov 2000 | Revised by: konst |
Added question regarding kernel modules to FAQ, fixed NASM URLs, GAS has Intel syntax too | ||
Revision 0.5m | 22 Oct 2000 | Revised by: konst |
Linux 2.4 system calls can have 6 args, Added ALD note to FAQ, fixed mailing list subscribe address | ||
Revision 0.5l | 23 Aug 2000 | Revised by: konst |
Added TDASM, updates on NASM | ||
Revision 0.5k | 11 Jul 2000 | Revised by: konst |
Few additions to FAQ | ||
Revision 0.5j | 14 Jun 2000 | Revised by: konst |
Complete rearrangement of Introduction and Resources sections. FAQ added to Resources, misc cleanups and additions. | ||
Revision 0.5i | 04 May 2000 | Revised by: konst |
Added HLA, TALC; rearrangements in Resources, Quick Start Assemblers sections. Few new pointers. | ||
Revision 0.5h | 09 Apr 2000 | Revised by: konst |
finally managed to state LDP license on document, new resources added, misc fixes | ||
Revision 0.5g | 26 Mar 2000 | Revised by: konst |
new resources on different CPUs | ||
Revision 0.5f | 02 Mar 2000 | Revised by: konst |
new resources, misc corrections | ||
Revision 0.5e | 10 Feb 2000 | Revised by: konst |
URL updates, changes in GAS example | ||
Revision 0.5d | 01 Feb 2000 | Revised by: konst |
Resources (former "Pointers") section completely redone, various URL updates. | ||
Revision 0.5c | 05 Dec 1999 | Revised by: konst |
New pointers, updates and some rearrangements. Rewrite of sgml source. | ||
Revision 0.5b | 19 Sep 1999 | Revised by: konst |
Discussion about libc or not libc continues. New web pointers and overall updates. | ||
Revision 0.5a | 01 Aug 1999 | Revised by: konst |
Quick Start section rearranged, added GAS example. Several new web pointers. | ||
Revision 0.5 | 01 Aug 1999 | Revised by: konst, fare |
GAS has 16-bit mode. New maintainer (at last): Konstantin Boldyshev. Discussion about libc or not libc. Added Quick Start section with examples of assembly code. | ||
Revision 0.4q | 22 Jun 1999 | Revised by: fare |
process argument passing (argc, argv, environ) in assembly. This is yet another "last release by Fare before new maintainer takes over". Nobody knows who might be the new maintainer. | ||
Revision 0.4p | 06 Jun 1999 | Revised by: fare |
clean up and updates | ||
Revision 0.4o | 01 Dec 1998 | Revised by: fare |
Revision 0.4m | 23 Mar 1998 | Revised by: fare |
corrections about gcc invocation | ||
Revision 0.4l | 16 Nov 1997 | Revised by: fare |
release for LSL 6th edition | ||
Revision 0.4k | 19 Oct 1997 | Revised by: fare |
Revision 0.4j | 07 Sep 1997 | Revised by: fare |
Revision 0.4i | 17 Jul 1997 | Revised by: fare |
info on 16-bit mode access from Linux | ||
Revision 0.4h | 19 Jun 1997 | Revised by: fare |
still more on "how not to use assembly"; updates on NASM, GAS. | ||
Revision 0.4g | 30 Mar 1997 | Revised by: fare |
Revision 0.4f | 20 Mar 1997 | Revised by: fare |
Revision 0.4e | 13 Mar 1997 | Revised by: fare |
Release for DrLinux | ||
Revision 0.4d | 28 Feb 1997 | Revised by: fare |
Vapor announce of a new Assembly-HOWTO maintainer | ||
Revision 0.4c | 09 Feb 1997 | Revised by: fare |
Added section Do you need assembly?. | ||
Revision 0.4b | 03 Feb 1997 | Revised by: fare |
NASM moved: now is before AS86 | ||
Revision 0.4a | 20 Jan 1997 | Revised by: fare |
CREDITS section added | ||
Revision 0.4 | 20 Jan 1997 | Revised by: fare |
first release of the HOWTO as such | ||
Revision 0.4pre1 | 13 Jan 1997 | Revised by: fare |
text mini-HOWTO transformed into a full linuxdoc-sgml HOWTO, to see what the SGML tools are like | ||
Revision 0.3l | 11 Jan 1997 | Revised by: fare |
Revision 0.3k | 19 Dec 1996 | Revised by: fare |
What? I had forgotten to point to terse??? | ||
Revision 0.3j | 24 Nov 1996 | Revised by: fare |
point to French translated version | ||
Revision 0.3i | 16 Nov 1996 | Revised by: fare |
NASM is getting pretty slick | ||
Revision 0.3h | 06 Nov 1996 | Revised by: fare |
more about cross-compiling -- See on sunsite: devel/msdos/ | ||
Revision 0.3g | 02 Nov 1996 | Revised by: fare |
Created the History. Added pointers in cross-compiling section. Added section about I/O programming under Linux (particularly video). | ||
Revision 0.3f | 17 Oct 1996 | Revised by: fare |
Revision 0.3c | 15 Jun 1996 | Revised by: fare |
Revision 0.2 | 04 May 1996 | Revised by: fare |
Revision 0.1 | 23 Apr 1996 | Revised by: fare |
Francois-Rene "Fare" Rideau creates and publishes the first mini-HOWTO, because "I'm sick of answering ever the same questions on comp.lang.asm.x86" |
Linus Torvalds for Linux
Bruce Evans for bcc from which as86 is extracted
Simon Tatham and Julian Hall for NASM
Greg Hankins and now Tim Bynum for maintaining HOWTOs
Raymond Moon for his FAQ
Eric Dumas for his translation of the mini-HOWTO into French (sad thing for the original author to be French and write in English)
Paul Anderson and Rahim Azizarab for helping me, if not for taking over the HOWTO
Marc Lehman for his insight on GCC invocation
Abhijit Menon-Sen for helping me figure out the argument passing convention
This version of the document is endorsed by Konstantin Boldyshev.
Modifications (including translations) must remove this appendix according to the license agreement.
$Id: Assembly-HOWTO.sgml,v 1.7 2002/08/17 08:35:59 konst Exp $
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.
To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:
Copyright (c) YEAR YOUR NAME.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1
or any later version published by the Free Software Foundation;
with the Invariant Sections being LIST THEIR TITLES, with the
Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
A copy of the license is included in the section entitled "GNU
Free Documentation License".
If you have no Invariant Sections, write "with no Invariant Sections" instead of saying which ones are invariant. If you have no Front-Cover Texts, write "no Front-Cover Texts" instead of "Front-Cover Texts being LIST"; likewise for Back-Cover Texts.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.