/f-cpu/qdcpoc/FORMAT.txt
created Sun Jul 15 00:18:53 by wygee@f-cpu.org
version : Thu Aug  2 04:09:26 2001

WARNING !!!
Thu Aug  2 02:13:09 2001 : the instruction encoding changes !!!
All the bits will be numbered in the reverse order, and the opcode
will be stored in "big endian" format.
In case of problems, the endianness can be tuned in opcode.h.
The code in this archive doesn't reflect this change yet.
The manual will not be updated for a long time, either.


Instruction format (in this QDCPOC implementation) :

General census :
31-24 : opcode (some opcodes are truncated : some bits are used as special flags)
26-24 : logic function (3-bits)
25-22 : Shift for Loadcons
23    : Data/Instruction flag for the TLB (0=I)
23-22 : size flags, branch hint
22    : imm16 sign extension for loadaddri (?)
21-6  : imm16
21    : SIMD flag or Endian Flag or Condition Negation flag
20-19 : condition (00 : nullity, 10 : MSB, 11 : LSB )
20-18 : stream hint
20    : reserved in most arithmetic operations,
        intended for logarithmic operations,
        can be used as a sign extension bit for the imm8 format
18    : sign-extension in move (optional, will be removed,
        sign-ext should be done in a specific SHL unit )
18    : reserved in jmpa but can be used to indicate function return.
19-12 : imm8
17-12 : Register 0 / SRC1 (condition register too)
11-6  : Register 1 / SRC2 (pointer reg too)
5-0   : Register 2 / SRC3 (destination too)
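
A minimal C sketch of how the fields of the census could be pulled out
of a 32-bit word. It assumes bit 0 is the LSB, as in the list above (the
planned bit renumbering is not taken into account). The names field()
and insn_*() are invented for this illustration and are not taken from
opcode.h.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits hi..lo (inclusive) of a 32-bit instruction word.
 * Assumption: bit 0 is the LSB, as in the census above. */
static uint32_t field(uint32_t insn, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (insn >> lo) & mask;
}

/* Hypothetical accessors for the most common fields. */
static uint32_t insn_opcode(uint32_t insn) { return field(insn, 31, 24); }
static uint32_t insn_imm16(uint32_t insn)  { return field(insn, 21, 6); }
static uint32_t insn_imm8(uint32_t insn)   { return field(insn, 19, 12); }
static uint32_t insn_reg0(uint32_t insn)   { return field(insn, 17, 12); }
static uint32_t insn_reg1(uint32_t insn)   { return field(insn, 11, 6); }
static uint32_t insn_reg2(uint32_t insn)   { return field(insn, 5, 0); }
```

Several fields overlap (imm16 covers imm8 and both source registers, for
example), so which accessor is meaningful depends on the opcode.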


MOVE :
24-31 : opcode byte
22-23 : size flags
21    : Condition Negation flag
19-20 : condition (00 : nullity, 10 : MSB, 11 : LSB )
18    : sign-extension in move (opt.) (????)
   (this adds too much "meat" to the Xbar !!!)
12-17 : condition register
6-11  : source
0-5   : destination reg
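
The MOVE layout above could be decoded as in the following sketch. The
struct and function names are hypothetical, and bit 0 is assumed to be
the LSB. Condition code 01 is not listed in this note, so it gets no
name here.

```c
#include <stdint.h>

/* Condition codes listed for bits 19-20 (01 is not documented here). */
enum move_cond { COND_NULLITY = 0, COND_MSB = 2, COND_LSB = 3 };

/* Hypothetical container for the decoded MOVE fields. */
struct move_fields {
    uint8_t opcode;    /* bits 24-31 */
    uint8_t size;      /* bits 22-23 */
    uint8_t negate;    /* bit  21    : Condition Negation flag */
    uint8_t cond;      /* bits 19-20 */
    uint8_t sign_ext;  /* bit  18    : optional sign-extension */
    uint8_t cond_reg;  /* bits 12-17 */
    uint8_t src;       /* bits 6-11  */
    uint8_t dest;      /* bits 0-5   */
};

static struct move_fields decode_move(uint32_t insn)
{
    struct move_fields f;
    f.opcode   = (uint8_t)(insn >> 24);
    f.size     = (uint8_t)((insn >> 22) & 0x3u);
    f.negate   = (uint8_t)((insn >> 21) & 0x1u);
    f.cond     = (uint8_t)((insn >> 19) & 0x3u);
    f.sign_ext = (uint8_t)((insn >> 18) & 0x1u);
    f.cond_reg = (uint8_t)((insn >> 12) & 0x3Fu);
    f.src      = (uint8_t)((insn >> 6) & 0x3Fu);
    f.dest     = (uint8_t)(insn & 0x3Fu);
    return f;
}
```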


LOADCONS(X)
31-26 : opcode (bit 26 means : sign extension when set)
25-22 : Shift (256-bit operation is thus possible)
6-21 : imm16
0-5  : destination
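
A small sketch of the LOADCONS(X) fields in C. The names are invented
for illustration. The helper loadcons_bit_offset() rests on an
assumption: the 4-bit shift is read as a count of 16-bit chunks, which
is one way to reach the 256 bits mentioned above (16 positions x 16
bits). The note does not state the unit explicitly.

```c
#include <stdint.h>

/* Hypothetical container for the decoded LOADCONS(X) fields. */
struct loadcons_fields {
    uint8_t  sign_ext; /* bit  26    : sign extension when set */
    uint8_t  shift;    /* bits 22-25 */
    uint16_t imm16;    /* bits 6-21  */
    uint8_t  dest;     /* bits 0-5   */
};

static struct loadcons_fields decode_loadcons(uint32_t insn)
{
    struct loadcons_fields f;
    f.sign_ext = (uint8_t)((insn >> 26) & 0x1u);
    f.shift    = (uint8_t)((insn >> 22) & 0xFu);
    f.imm16    = (uint16_t)((insn >> 6) & 0xFFFFu);
    f.dest     = (uint8_t)(insn & 0x3Fu);
    return f;
}

/* ASSUMPTION: the shift field counts 16-bit chunks, so the immediate
 * lands at bit offset shift*16 inside the destination register. */
static unsigned loadcons_bit_offset(const struct loadcons_fields *f)
{
    return (unsigned)f->shift * 16u;
}
```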

ROP2 :
27-31 : Opcode
24-26 : function (3 bits)
22-23 : size flags
20-21 : optional "combine" size flag
  (not implemented yet, 8-bit chunks only, 0 by default)
18-19 : mode (0=normal, see the tables for more)
12-17 : Register 0 / SRC1
6-11  : Register 1 / SRC2
0-5   : Register 2 / SRC3 (used when mode=MUX)

ROP2i : (only mode 0 is available)
27-31 : Opcode
24-26 : function (3 bits)
22-23 : size flags
21    : SIMD flag 
20    : sign extension bit for the imm8 format
12-19 : imm8
6-11  : Register 1 / SRC2
0-5   : Register 2 / dest
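
The ROP2i layout can be sketched in C as below. All names are
hypothetical. The helper rop2i_immediate() is only one possible reading
of "sign extension bit for the imm8 format": it assumes that bit 20,
when set, fills every bit above imm8 with ones (i.e. bit 20 acts as the
sign of a 9-bit value). The note does not confirm this.

```c
#include <stdint.h>

/* Hypothetical container for the decoded ROP2i fields. */
struct rop2i_fields {
    uint8_t opcode;   /* bits 27-31 */
    uint8_t function; /* bits 24-26 : logic function */
    uint8_t size;     /* bits 22-23 */
    uint8_t simd;     /* bit  21    */
    uint8_t imm_ext;  /* bit  20    : sign extension bit for imm8 */
    uint8_t imm8;     /* bits 12-19 */
    uint8_t src;      /* bits 6-11  */
    uint8_t dest;     /* bits 0-5   */
};

static struct rop2i_fields decode_rop2i(uint32_t insn)
{
    struct rop2i_fields f;
    f.opcode   = (uint8_t)((insn >> 27) & 0x1Fu);
    f.function = (uint8_t)((insn >> 24) & 0x7u);
    f.size     = (uint8_t)((insn >> 22) & 0x3u);
    f.simd     = (uint8_t)((insn >> 21) & 0x1u);
    f.imm_ext  = (uint8_t)((insn >> 20) & 0x1u);
    f.imm8     = (uint8_t)((insn >> 12) & 0xFFu);
    f.src      = (uint8_t)((insn >> 6) & 0x3Fu);
    f.dest     = (uint8_t)(insn & 0x3Fu);
    return f;
}

/* ASSUMPTION: when bit 20 is set, the bits above imm8 are all ones,
 * so the immediate is imm8 - 256; otherwise it is imm8 unchanged. */
static int32_t rop2i_immediate(const struct rop2i_fields *f)
{
    int32_t v = f->imm8;
    if (f->imm_ext)
        v -= 256;
    return v;
}
```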


3R1W and 2R2W : the Reg2 field (destination) is used
with 5 bits. The LSB is negated for the optional destination.

example :
 OP_3R1W R1, R2, R3 : src1=R1, src2=R2, src3=R3, dest=(R3 xor 1)
 OP_2R2W R1, R2, R3 : src1=R1, src2=R2, dest1=R3, dest2=(R3 xor 1)
This behaviour favors 2x loop unrolling, where data and pointers
are handled in pairs (loop cut into halves).
This also simplifies the scoreboard/scheduler and removes
an adder from the CDP.
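
The pairing rule from the example above, written as a C sketch. The
function name paired_reg() is invented here. The point is that the
second register number comes from flipping the LSB, with no adder
involved.

```c
#include <stdint.h>

/* Register number of the "other half" of a pair: the LSB of the
 * 6-bit register number is negated (R2 <-> R3, R4 <-> R5, ...). */
static unsigned paired_reg(unsigned reg)
{
    return (reg ^ 1u) & 0x3Fu;
}
```

With this rule OP_2R2W R1, R2, R4 would write R4 and R5, and
OP_2R2W R1, R2, R5 would write R5 and R4.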

RFC/note : all instructions which write to R0 are discarded
(except some cases such as load/prefetch).
The behaviour concerning the scheduling is not yet examined :
should the instruction stall until all sources are ready ?


Mon Jul 30 20:00:56 2001 :
NOP is now definitely different from MOVE.
The 8 MSB are zero and the rest is not analysed.
This behaviour is interesting in the case where
one wants to align code pages, so we can encode
the number of skipped opcodes in the LSB.
This remains unclear, however, and the default
for NOP is 0x00000000. A different value in the LSB
would be caught at fetch stage so the fetcher will
discard all the useless NOPs. Alignment at the end
of cache lines will thus consume fewer cycles
but it makes the fetcher more complex.
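
A sketch of the NOP test described above, in C. Only the 8 MSB are
examined. The helper nop_skip_hint() is hypothetical: the note leaves
the skip-count idea open and gives no width for it, so the sketch simply
returns the 24 bits that the decoder does not analyse.

```c
#include <stdint.h>

/* An instruction is a NOP when its 8 MSB (the opcode byte) are zero.
 * Returns 1 for a NOP, 0 otherwise. */
static int is_nop(uint32_t insn)
{
    return (insn >> 24) == 0;
}

/* HYPOTHETICAL: the low 24 bits of a NOP, which could carry the number
 * of opcodes to skip. 0 for the default NOP 0x00000000. */
static uint32_t nop_skip_hint(uint32_t insn)
{
    return insn & 0x00FFFFFFu;
}
```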


MOVE is now 0x01xxxxxx.
Maybe we can make CMOVE 0x02xxxxxx (?)
because it will ease the opcode decoding in the LUT.
I don't know yet.
The "real" problem is that the LUT is in "fixed" format
and we need to fetch bits all over the neighbouring bits.
It is easy to hardwire but the SW LUT is bloated
with 'X' and duplicate entries.
