                ==============================
                Welcome to the FCUTILS package
                ==============================

This package originated as a collection of ELF binary tools, based
on Libelf.  Now it's actually going to be useful, as a software
development toolchain for the not-yet-existing Freedom CPU.

To compile this package, you need:

    * An ANSI (err, ISO) C compiler.  Recent versions of gcc work fine,
      but you should avoid pure C89 compilers - they will fail to
      compile some files because they use `long long', explicit
      initializers and similar stuff.  You may also need GNU Make, but
      I'm not sure.

    * Yacc (the Berkeley version, or Bison) and Flex.  I'm sorry but
      ordinary Lex won't work; the code uses some features that only
      Flex provides.  In case you don't have any of these, take a look
      at ftp://ftp.gnu.org/.

    * Libelf (version 0.8.2 or higher).  You can get it from
      http://www.stud.uni-hannover.de/~michael/software/, or from the
      Linux archive at ftp://ftp.ibiblio.org/pub/Linux/libs/.
      You may also succeed with the Solaris 8 version, but I didn't
      check that.  Other versions probably won't work at all.
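
If you are unsure whether your compiler qualifies, a small test file
along the following lines may help.  This is only a sketch: it assumes
that `explicit initializers' above refers to C99 designated initializers,
and the struct and function names are made up for illustration (they do
not appear anywhere in the package).

```c
/* Hypothetical compiler check; not part of the package. */

struct range {
    long long lo;   /* `long long' is C99; pure C89 compilers reject it */
    long long hi;
};

/* Designated initializers are C99 as well. */
static const struct range wide = { .lo = -1LL, .hi = 1LL << 40 };

/* Returns hi - lo, which does not fit into 32 bits. */
long long range_span(void)
{
    return wide.hi - wide.lo;
}
```

If a file like this compiles without errors (e.g. with `cc -c'), the
compiler most likely handles the constructs used in the sources.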

You may also want to install the F-CPU port of gcc 3.  The latest gcc
patch should be available at the same place where you found this package;
most likely it was http://f-cpu.seul.org/~f-cpu/new/.  That's also the
place where you should look for updates.

To install fcutils, cd to the top directory and run

    ./configure
    make

If you enter `./emu/emu' now, the F-CPU emulator will start and try
to execute a tiny built-in F-CPU program (the "smoke test": if you see
smoke when you turn the machine on, something is wrong - but of course
you would only see virtual smoke in this case ;-).  You may also want
to try the `hello, world' program I generated with gcc and fctools: type
`./emu/emu emu/hello.bin' to execute it.  There's also an ELF version
that you can run with `./emu/elfemu emu/hello.elf'.  If everything works
well, you can install the package with

    make install

By default, the package installs in /usr/local.  To avoid overwriting
existing files, names are prefixed with `fcpu-'.  A link (or copy) with
the original name can be found below /usr/local/fcpu.  You can change
the installation directory when you run configure, e.g. with

    ./configure --prefix=/opt/f-cpu

If you find a bug - that is, if one of the programs crashes or complains
about a `failed assertion' - please tell me (my mail address is at the
end of this document, as well as in most source files).  But don't
forget to mention what you did and in which order.  A small test input
file that makes the bug appear would be most helpful.

                ==============================

What's in here? Besides the set of generic ELF tools -- ar, elfdump,
mcs, nm, ranlib, size, strings and strip -- there are an F-CPU assembler
(as), a disassembler (disasm), a linker (ld) and two instruction-level
emulators.  The emulators share the same emulation engine, but while
`emu' accepts only fully-relocated, binary images that start at physical
address 0, `elfemu' was designed to run `real' binaries, i.e. statically
linked ELF files.  That enables us to do both `bare metal' development
and higher-level software ports.

The assembler will produce binary input files for `emu' if you call it
with `-O bin' (Note: the input files must not contain undefined symbols).
If you use `-O elf' or no `-O' option at all, it will produce relocatable
ELF files (type ET_REL) which must be postprocessed with ld to create
executable (type ET_EXEC) ELF files.  Shared libraries aren't supported
yet.
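
To tell the two kinds of ELF files apart without the tools, you can look
at the `e_type' field of the ELF header.  The following C sketch does
that for a header already read into memory.  The field offsets and the
values ET_REL = 1 and ET_EXEC = 2 come from the ELF specification; the
function name is hypothetical and the code is not taken from the package.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the e_type value of an ELF image (1 = ET_REL, 2 = ET_EXEC),
 * or -1 if `hdr' does not look like an ELF header.  `hdr' points at
 * the first `len' bytes of the file.
 */
int elf_file_type(const unsigned char *hdr, size_t len)
{
    if (len < 18)
        return -1;                      /* too short to contain e_type */
    if (memcmp(hdr, "\177ELF", 4) != 0)
        return -1;                      /* wrong magic number */

    /* e_ident[5] is EI_DATA: 1 = little-endian, 2 = big-endian. */
    /* e_type is the 16-bit field right after the 16-byte e_ident. */
    if (hdr[5] == 1)
        return hdr[16] | (hdr[17] << 8);
    if (hdr[5] == 2)
        return (hdr[16] << 8) | hdr[17];
    return -1;                          /* unknown data encoding */
}
```

A result of 1 means the file still has to go through ld; a result of 2
means it is an executable of the kind `elfemu' is designed to run.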

If you're running some variant of Linux, and have enabled kernel support
for `misc' binaries, you may register the F-CPU ELF format as a valid
binary format - that is, you will be able to start F-CPU binaries directly
from the command line.  Install the `emu/linux-fcpu' script somewhere and
run it (as root) once after system startup to register the file format
(you may want to include this in one of your `rc' files).  Remember that
you must also make your F-CPU ELF binaries executable (with `chmod +x').

Currently, you must link your binaries manually.  That is, you compile
your C and assembler source files with

    fcpu-gcc -c *.c
    fcpu-gcc -c *.s

and link them with

    fcpu-ld -o binary crt1.o *.o

Note:  If you install fctools in the default place, and configure gcc
when the fctools package is already installed, gcc should find the
assembler automatically.

Assembler source for the `initialization' file `crt1.o' is available
in ld/crt1.s.  It contains some short glue code that grabs the arguments
passed by `elfemu' and calls `main'.  After that, you should be able
to run `binary'.

                ==============================

The assembler/disassembler/emulators quartet should understand the same
instruction set (it's rather hard to keep them aligned).  There's an
F-CPU opcode library that makes things a little easier; if you want
to use it in your own programs, add -I/usr/local/include to CPPFLAGS
and -L/usr/local/lib to LDFLAGS, #include <fcpu/fcpu_opcodes.h> and
link with -lfcpu_opcodes.  Here's a list of all F-CPU instructions the
library and the tools support:

    [s]abs[.b|.d|.q|.o]
    [s]add[c|s][.b|.d|.q|.o]
    [s]addi[.b|.d|.q|.o]
    [s]addsub[.b|.d|.q|.o]
    [s]amac[h][s][.b|.d|.q|.o]
    [s]and[.b|.d|.q|.o]
    [s]and.and[.b|.d|.q|.o]
    [s]and.or[.b|.d|.q|.o]
    [s]andi[.b|.d|.q|.o]
    [s]andn[.b|.d|.q|.o]
    [s]andn.and[.b|.d|.q|.o]
    [s]andn.or[.b|.d|.q|.o]
    [s]andni[.b|.d|.q|.o]
    [s]bchg[.b|.d|.q|.o]
    [s]bchgi[.b|.d|.q|.o]
    [s]bclr[.b|.d|.q|.o]
    [s]bclri[.b|.d|.q|.o]
    [s][d]bitrev[h][.b|.d|.q|.o]
    [s][d]bitrevi[.b|.d|.q|.o]
    [s]bset[.b|.d|.q|.o]
    [s]bseti[.b|.d|.q|.o]
    [s]btst[.b|.d|.q|.o]
    [s]btsti[.b|.d|.q|.o]
    [s]byterev[.b|.d|.q|.o]
    cload[e][n]{cc}[.b|.d|.q|.o]
    [s]cmpg[s][.b|.d|.q|.o]
    [s]cmpg[s]i[.b|.d|.q|.o]
    [s]cmple[s][.b|.d|.q|.o]
    [s]cmple[s]i[.b|.d|.q|.o]
    cshiftl[.b|.d|.q|.o]
    cshiftr[.b|.d|.q|.o]
    cstore[e][n]{cc}[.b|.d|.q|.o]
    d2int{rnd}[.b|.d|.q|.o]
    [s]dec[.b|.d|.q|.o]
    [s]div[s][.b|.d|.q|.o]
    [s]divi[.b|.d|.q|.o]
    [s]divrem[s][.b|.d|.q|.o]
    [s]divremi[.b|.d|.q|.o]
    expand[.b|.d|.q|.o]
    expandh[.b|.d|.q|.o]
    expandl[.b|.d|.q|.o]
    f2int{rnd}[.b|.d|.q|.o]
    f2int{rnd}[.d|.f]
    [s]fadd[.d|.f]
    [s]faddsub[.d|.f]
    [s]fdiv[.d|.f]
    [s]fexp[.d|.f]
    [s]fiaprx[.d|.f]
    [s]flog[.d|.f]
    [s]fmac[.d|.f]
    [s]fmul[.d|.f]
    [s]fsqrt[.d|.f]
    [s]fsqrtiaprx[.d|.f]
    [s]fsub[.d|.f]
    get
    geti
    halt
    [s]inc[.b|.d|.q|.o]
    int2d{rnd}[.b|.d|.q|.o]
    int2f{rnd}[.b|.d|.q|.o]
    jmp[n]{cc}
    load[e][.b|.d|.q|.o]
    loadaddr[d]
    loadaddri[d]
    loadcons[.0|.1|.2|.3]
    loadconsx[.0|.1|.2|.3]
    loadi[e][.b|.d|.q|.o]
    loadm
    loop
    [s]lsb0[.b|.d|.q|.o]
    [s]lsb1[.b|.d|.q|.o]
    [s]mach[s][.b|.d|.q|.o]
    [s]macl[s][.b|.d|.q|.o]
    [s]max[s][.b|.d|.q|.o]
    [s]max[s]i[.b|.d|.q|.o]
    [s]min[s][.b|.d|.q|.o]
    [s]min[s]i[.b|.d|.q|.o]
    [s]minmax[s][.b|.d|.q|.o]
    [s]minmax[s]i[.b|.d|.q|.o]
    mix[.b|.d|.q|.o]
    mixh[.b|.d|.q|.o]
    mixl[.b|.d|.q|.o]
    move[n]{cc}[.b|.d|.q|.o]
    [s]msb0[.b|.d|.q|.o]
    [s]msb1[.b|.d|.q|.o]
    [s]mul[h][s][.b|.d|.q|.o]
    [s]muli[.b|.d|.q|.o]
    [s]mux[.b|.d|.q|.o]
    [s]nabs[.b|.d|.q|.o]
    [s]nand[.b|.d|.q|.o]
    [s]nand.and[.b|.d|.q|.o]
    [s]nand.or[.b|.d|.q|.o]
    [s]nandi[.b|.d|.q|.o]
    [s]neg[.b|.d|.q|.o]
    nop
    [s]nor[.b|.d|.q|.o]
    [s]nor.and[.b|.d|.q|.o]
    [s]nor.or[.b|.d|.q|.o]
    [s]nori[.b|.d|.q|.o]
    [s]or[.b|.d|.q|.o]
    [s]or.and[.b|.d|.q|.o]
    [s]or.or[.b|.d|.q|.o]
    [s]ori[.b|.d|.q|.o]
    [s]orn[.b|.d|.q|.o]
    [s]orn.and[.b|.d|.q|.o]
    [s]orn.or[.b|.d|.q|.o]
    [s]orni[.b|.d|.q|.o]
    [s]popc[.b|.d|.q|.o]
    [s]popci[.b|.d|.q|.o]
    put
    puti
    [s]rem[s][.b|.d|.q|.o]
    [s]remi[.b|.d|.q|.o]
    [s]rotl[h][.b|.d|.q|.o]
    [s]rotli[.b|.d|.q|.o]
    [s]rotr[h][.b|.d|.q|.o]
    [s]rotri[.b|.d|.q|.o]
    sdup[.b|.d|.q|.o]
    sdupi[.b|.d|.q|.o]
    [s][d]shiftl[h][.b|.d|.q|.o]
    [s][d]shiftli[.b|.d|.q|.o]
    [s][d]shiftr[h][.b|.d|.q|.o]
    [s][d]shiftra[h][.b|.d|.q|.o]
    [s][d]shiftrai[.b|.d|.q|.o]
    [s][d]shiftri[.b|.d|.q|.o]
    store[e][.b|.d|.q|.o]
    storei[e][.b|.d|.q|.o]
    storem
    [s]sub[b|f][.b|.d|.q|.o]
    [s]subi[.b|.d|.q|.o]
    syscall
    trap
    [s]vsel[.b|.d|.q|.o]
    [s]vseli[.b|.d|.q|.o]
    widen[.b|.d|.q|.o]
    [s]xnor[.b|.d|.q|.o]
    [s]xnor.and[.b|.d|.q|.o]
    [s]xnor.or[.b|.d|.q|.o]
    [s]xnori[.b|.d|.q|.o]
    [s]xor[.b|.d|.q|.o]
    [s]xor.and[.b|.d|.q|.o]
    [s]xor.or[.b|.d|.q|.o]
    [s]xori[.b|.d|.q|.o]

Not all of these work the way they're documented.  Some of them
still need to be discussed; in other cases the manual is simply wrong.
If in doubt, please refer to the source code (emu/emu.c and emu/emu.h).
This is, however, an almost complete set that we can *work* with.
If there's something wrong with the way an instruction is defined,
we now have a chance to find out, instead of just guessing.

The `syscall' instruction does not really do what it's supposed to.
Instead, it returns control to the emulator.  There is a minimal "operating
system" with only three "system calls":

                        // Unix/C equivalent:
    loadconsx $0, r1
    syscall $0, r0      // exit(r2);

    loadconsx $1, r1
    syscall $0, r0      // r1 = read(r2, r3, r4);

    loadconsx $2, r1
    syscall $0, r0      // r1 = write(r2, r3, r4);

so that you can terminate your programs and communicate with the outside
world.  `read' and `write' understand the usual file handles (0 = stdin,
1 = stdout, 2 = stderr) which are directly mapped to those of the emulator.
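
On the emulator side, the calling convention above (call number in r1,
arguments in r2 to r4, result in r1) could be handled roughly as in the
following C sketch.  This is NOT the code from emu/emu.c: the function
name, the register array, the flat memory buffer and the error value
for bad arguments are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Handle one `syscall'.  `regs' is the emulated register file, `mem'
 * the emulated memory of `size' bytes; emulated addresses are assumed
 * to be plain offsets into `mem'.  Returns 1 if the program asked to
 * exit (status stored in *status), 0 if execution should continue.
 */
int mini_os_call(uint64_t *regs, unsigned char *mem, size_t size, int *status)
{
    uint64_t fd = regs[2], addr = regs[3], len = regs[4];

    switch (regs[1]) {
    case 0:                             /* exit(r2) */
        *status = (int)regs[2];
        return 1;
    case 1:                             /* r1 = read(r2, r3, r4) */
    case 2:                             /* r1 = write(r2, r3, r4) */
        if (addr > size || len > size - addr) {
            regs[1] = (uint64_t)-1;     /* buffer outside emulated memory */
            return 0;
        }
        if (regs[1] == 1)
            regs[1] = (uint64_t)(int64_t)read((int)fd, mem + addr, (size_t)len);
        else
            regs[1] = (uint64_t)(int64_t)write((int)fd, mem + addr, (size_t)len);
        return 0;
    default:                            /* unknown call (assumed behaviour) */
        regs[1] = (uint64_t)-1;
        return 0;
    }
}
```

Passing the file handle straight through to the host's read() and
write() is what makes 0, 1 and 2 map to the emulator's own stdin,
stdout and stderr.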

The assembler also supports a lot of directives:

    .byte           allocate 1-byte integers

    .2byte          allocate 2-byte integers
    .be_2byte       allocate 2-byte integers (big-endian)
    .le_2byte       allocate 2-byte integers (little-endian)
    .4byte          allocate 4-byte integers
    .be_4byte       allocate 4-byte integers (big-endian)
    .le_4byte       allocate 4-byte integers (little-endian)
    .8byte          allocate 8-byte integers
    .be_8byte       allocate 8-byte integers (big-endian)
    .le_8byte       allocate 8-byte integers (little-endian)
    .16byte         allocate 16-byte integers
    .be_16byte      allocate 16-byte integers (big-endian)
    .le_16byte      allocate 16-byte integers (little-endian)

    .half           allocate 2-byte integers
    .be_half        allocate 2-byte integers (big-endian)
    .le_half        allocate 2-byte integers (little-endian)
    .word           allocate 4-byte integers
    .be_word        allocate 4-byte integers (big-endian)
    .le_word        allocate 4-byte integers (little-endian)
    .dword          allocate 8-byte integers
    .be_dword       allocate 8-byte integers (big-endian)
    .le_dword       allocate 8-byte integers (little-endian)
    .qword          allocate 16-byte integers
    .be_qword       allocate 16-byte integers (big-endian)
    .le_qword       allocate 16-byte integers (little-endian)

    .short          allocate 2-byte integers
    .be_short       allocate 2-byte integers (big-endian)
    .le_short       allocate 2-byte integers (little-endian)
    .int            allocate 4-byte integers
    .be_int         allocate 4-byte integers (big-endian)
    .le_int         allocate 4-byte integers (little-endian)
    .long           allocate 8-byte integers
    .be_long        allocate 8-byte integers (big-endian)
    .le_long        allocate 8-byte integers (little-endian)
    .llong          allocate 16-byte integers
    .be_llong       allocate 16-byte integers (big-endian)
    .le_llong       allocate 16-byte integers (little-endian)

    .addr           allocate pointers
    .be_addr        allocate pointers (big-endian)
    .le_addr        allocate pointers (little-endian)

    .sleb128        allocate DWARF encoded signed integer
    .uleb128        allocate DWARF encoded unsigned integer

    .float          allocate 4-byte (IEEE single) floats
    .be_float       allocate 4-byte (IEEE single) floats (big-endian)
    .le_float       allocate 4-byte (IEEE single) floats (little-endian)
    .single         allocate 4-byte (IEEE single) floats
    .be_single      allocate 4-byte (IEEE single) floats (big-endian)
    .le_single      allocate 4-byte (IEEE single) floats (little-endian)
    .double         allocate 8-byte (IEEE double) floats
    .be_double      allocate 8-byte (IEEE double) floats (big-endian)
    .le_double      allocate 8-byte (IEEE double) floats (little-endian)
    .tfloat         allocate 10-byte (Intel extended) floats
    .be_tfloat      allocate 10-byte (Intel extended) floats (big-endian)
    .le_tfloat      allocate 10-byte (Intel extended) floats (little-endian)
    .ldouble        allocate 16-byte (SPARC extended) floats
    .be_ldouble     allocate 16-byte (SPARC extended) floats (big-endian)
    .le_ldouble     allocate 16-byte (SPARC extended) floats (little-endian)

    .ascii          allocate strings w/o trailing NUL
    .asciz          allocate strings w/ trailing NUL
    .string         allocate strings w/ trailing NUL

    .space          allocate any number of bytes
    .block          allocate any number of bytes (same as `.space')
    .skip           allocate any number of bytes (same as `.space')
    .fill           allocate filled bytes

    .org            move the location counter (forward only!)
    . = value       same as `.org value'
    .balign         align location counter to a multiple of <n>
    .balignl        align location counter to a multiple of <n>
    .balignw        align location counter to a multiple of <n>
    .p2align        align location counter to a multiple of 2**<n>
    .p2alignl       align location counter to a multiple of 2**<n>
    .p2alignw       align location counter to a multiple of 2**<n>

    .equ            define a symbol's value
    .equiv          define a symbol's value
    .set            define a symbol's value
    symbol = value  same as `.set symbol, value'
    symbol:         same as `symbol = .'
    .comm           define a global common symbol
    .lcomm          define a local common symbol
    .global         make a symbol global (*)
    .globl          make a symbol global (same as .global) (*)
    .export         make a symbol global (same as .global) (*)
    .local          make a symbol local (*)
    .weak           make a symbol weak (*)
    .weakext        make a symbol weak (*)
    .size           set a symbol's `size' attribute (*)
    .type           set a symbol's `type' attribute (*)
    .hidden         set a symbol's visibility to `hidden' (*)
    .internal       set a symbol's visibility to `internal' (*)
    .protected      set a symbol's visibility to `protected' (*)

    .text           switch to the `.text' section
    .data           switch to the `.data' section
    .bss            switch to the `.bss' section
    .pushsection    switch to an arbitrary (named) section
    .section        switch to an arbitrary (named) section
    .popsection     switch to the previous section (discards old section)
    .previous       switch to the previous section (toggles)

    .if             conditional assembly
    .ifdef          conditional assembly
    .ifndef         conditional assembly
    .ifnotdef       conditional assembly (same as `.ifndef')
    .elif           conditional assembly
    .else           conditional assembly
    .endif          conditional assembly

    .err            raise an error
    .extern         do nothing (ignored for compatibility)
    .include        include another source file
    .file           specify a source file name (*)
    .ident          add a string to the `.comment' section of an ELF file (*)
    .version        add a string to the `.note' section of an ELF file (*)

                    (*) only makes sense in ELF files
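
The `.sleb128' and `.uleb128' directives use the variable-length LEB128
format defined by the DWARF standard: seven bits of payload per byte,
least significant group first, with the top bit set on every byte except
the last.  The following C sketch shows the encoding itself; it is an
independent illustration with made-up function names, not the
assembler's own code.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode `value' as unsigned LEB128; returns the byte count (max. 10). */
size_t uleb128_encode(uint64_t value, unsigned char *out)
{
    size_t n = 0;

    do {
        unsigned char byte = value & 0x7f;
        value >>= 7;
        if (value != 0)
            byte |= 0x80;               /* more bytes follow */
        out[n++] = byte;
    } while (value != 0);
    return n;
}

/* Encode `value' as signed LEB128; returns the byte count (max. 10). */
size_t sleb128_encode(int64_t value, unsigned char *out)
{
    const int negative = value < 0;
    uint64_t v = (uint64_t)value;
    size_t n = 0;
    int more = 1;

    while (more) {
        unsigned char byte = v & 0x7f;
        v >>= 7;
        if (negative)
            v |= ~(uint64_t)0 << 57;    /* portable sign extension */
        /* Stop once the rest is pure sign and bit 6 agrees with it. */
        if ((v == 0 && !(byte & 0x40)) || (v == ~(uint64_t)0 && (byte & 0x40)))
            more = 0;
        else
            byte |= 0x80;
        out[n++] = byte;
    }
    return n;
}
```

For example, the unsigned value 624485 encodes as the three bytes
e5 8e 26, and the signed value -123456 as c0 bb 78.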

Note that you can pick multi-byte data encodings explicitly or let the
assembler choose the appropriate format for the target machine (that
is, little-endian for the F-CPU).  Instructions are stored MSByte first
(that is, big-endian) for the F-CPU, but that can be changed by flipping
a single #define switch.
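
To make the difference concrete: with `.le_4byte' the value 0x12345678
ends up in memory as the bytes 78 56 34 12, with `.be_4byte' as
12 34 56 78, and plain `.4byte' behaves like the little-endian variant
for the F-CPU.  The two byte orders can be sketched in C as follows;
the helper names are hypothetical and the code is not taken from the
assembler.

```c
#include <stddef.h>
#include <stdint.h>

/* Store the low `size' bytes (size <= 8) of `value', LSByte first. */
void put_le(unsigned char *out, uint64_t value, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++)
        out[i] = (unsigned char)(value >> (8 * i));
}

/* Store the low `size' bytes (size <= 8) of `value', MSByte first. */
void put_be(unsigned char *out, uint64_t value, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++)
        out[size - 1 - i] = (unsigned char)(value >> (8 * i));
}
```

Both helpers write bytes one at a time and never reinterpret host
memory, so the result does not depend on the byte order of the machine
the assembler itself runs on.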

                ==============================

Copyright (C) 2002, 2003 Michael "Tired" Riepe
<michael@stud.uni-hannover.de>
