
Re: [openrisc] Important or1k question...




> > I see no reason why implementation shouldn't be able to have 64-bit VFPRs
> > and 32-bit GPRs. So the answer would be yes.
> 
> Then you need to add an instruction between
> lf.ftoi.s and lf.ftoi.d (that copies a single fpr to two gprs).
>
>
> --
> Matan Ziv-Av.                         matan@svgalib.org

I don't think it's necessary to actually do a 64-bit integer
to floating point conversion in this architecture, but there
has to be a way to transfer data (unmodified) between floating
point registers and general purpose registers. Currently, this
is done with the l.mtspr and l.mfspr instructions, but that
is only useful if the integer size and floating point size are
the same.

Question: what is the behavior of l.mfspr/l.mtspr when you have
a 32 bit architecture and are accessing a 64 bit register? (i.e.
a floating point register mapped at 1536-2047)
 
It really is impossible to use the floating point unit without
this functionality, because there is no way to load data into
it. E.g. you have a pointer to floating point data in a general
purpose register (the stack pointer, for example), and you must
move data to/from this memory region.

Data can be loaded into/out of the floating point registers
using the lvf.ld/lvf.lw/lvf.sd/lvf.sw instructions. However,
these instructions require the address pointer in a floating
point register. (This seems really odd to me, but I can imagine
why it makes things easier for the hardware guys.) But, before
I can put this address into a floating point register, I must
save the current contents of the floating point register...

The only way to do this is either to use lvf.s[d/w] (which
clearly creates an infinite regress), or to get the contents
into a general purpose register using l.mfspr, which is no
good because it only moves 32 bits.

There is one way out of this which I thought of, but it is a
terrible hack, and certainly not something I would recommend
as a design decision. If l.m[f/t]spr accesses only the bottom
32 bits, and lvf.[s/l][d/w] ignores the upper 32 bits of the
address register, then you could bootstrap the process by
saving the lowest 32 bits of f0, putting the address into
the lowest 32 bits of f0, using lvf.sd f0,f0 to store the
64 bit word to memory, and then overwriting the lowest 32
bits with the saved value. Ugly, but it would work without
changing the architecture definition...but I really don't
like this method. It's inefficient, confusing, and easy to
get wrong.
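
To make the bootstrap sequence concrete, here is a toy Python model of it.
This is only a sketch under the assumptions stated above (l.m[f/t]spr
touches only the low 32 bits, and lvf.sd uses only the low 32 bits of the
address register). The class and function names are made up for
illustration, big-endian byte order is assumed, and "overwriting the lowest
32 bits" is read as patching the low word in memory with an ordinary 32-bit
store. The last step, restoring f0 itself, is an extra that the text above
does not mention.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


class Model:
    """Toy machine: one 64-bit FP register (f0) and byte-addressed memory."""

    def __init__(self, f0):
        self.f0 = f0 & MASK64
        self.mem = {}

    def mfspr_f0(self):
        # Assumption: l.mfspr returns only the low 32 bits of f0.
        return self.f0 & MASK32

    def mtspr_f0(self, value):
        # Assumption: l.mtspr replaces only the low 32 bits of f0.
        self.f0 = (self.f0 & ~MASK32) | (value & MASK32)

    def lvf_sd_f0_f0(self):
        # Assumption: the address is the low 32 bits of f0 (upper bits ignored).
        addr = self.f0 & MASK32
        for i, b in enumerate(self.f0.to_bytes(8, "big")):
            self.mem[addr + i] = b

    def store_word(self, addr, value):
        # Ordinary 32-bit integer store, big-endian.
        for i, b in enumerate((value & MASK32).to_bytes(4, "big")):
            self.mem[addr + i] = b


def read64(m, addr):
    """Read a big-endian 64-bit word back from the toy memory."""
    return int.from_bytes(bytes(m.mem[addr + i] for i in range(8)), "big")


def save_f0(m, addr):
    saved_low = m.mfspr_f0()           # 1. save the low 32 bits of f0
    m.mtspr_f0(addr)                   # 2. put the address in the low 32 bits
    m.lvf_sd_f0_f0()                   # 3. store 64 bits; the low word is wrong
    m.store_word(addr + 4, saved_low)  # 4. patch the low word in memory
    m.mtspr_f0(saved_low)              # 5. (extra) restore f0 itself


m = Model(0x400921FB54442D18)
save_f0(m, 0x1000)
print(hex(read64(m, 0x1000)))
```

Five steps and two extra SPR moves to store one register, which is the
inefficiency complained about above.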

I came across this when I was trying to write the RTEMS floating
point context save/restore routines. I would like to resolve the
issue by changing section 7.1 on p.24 of the architecture spec.

It currently lists the following for group 0:

0 1024-1535	GPR0-GPR511   R/W GPRs mapped to SPR space
0 1536-2047	VFR0-VFR511   R/W VFRs mapped to SPR space

I would like to change this to the following:

0 512-1023	GPR0-GPR511   R/W GPRs mapped to SPR space
0 1024-1535	VFR0-VFR511   R/W VFRs mapped to SPR space
0 1536-2047	VFR0-VFR511   R/W Shadow VFRs mapped to SPR space

A 32 bit l.m[f/t]spr access to registers 1024-1535 would hit the
lower 32 bits of a 64 bit double. An access to 1536-2047 would
hit the upper 32 bits of the same 64 bit double.

If the architecture uses the same size integer and floating point
registers, then both regions are identical.

In the unlikely event of a 64 bit integer unit and a 32 bit floating
point unit, the region 1024-1535 would cause the exchange to
occur with the lower 32 bits of the general purpose register. The
shadow register would cause the exchange to occur with the upper 32
bits of the general purpose register.
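
The proposed mapping can be sketched as a small Python model. This is a
hedged illustration of the proposal, not anything from the spec: the
function names and the fp_bits parameter are invented here, and only the
cases of 32 bit GPRs with 64 bit or 32 bit VFRs are modelled (the 64 bit
integer / 32 bit floating point case is left out).

```python
MASK32 = 0xFFFFFFFF
VFR_LOW_BASE = 1024   # proposed: VFR0-VFR511, lower 32 bits
VFR_HIGH_BASE = 1536  # proposed: shadow VFR0-VFR511, upper 32 bits


def decode(spr):
    """Map a proposed group 0 SPR number to (VFR index, bit shift)."""
    if VFR_LOW_BASE <= spr < VFR_LOW_BASE + 512:
        return spr - VFR_LOW_BASE, 0
    if VFR_HIGH_BASE <= spr < VFR_HIGH_BASE + 512:
        return spr - VFR_HIGH_BASE, 32
    raise ValueError("not a VFR SPR number")


def mfspr(vfr, spr, fp_bits=64):
    """32 bit read of one half of a VFR."""
    index, shift = decode(spr)
    if fp_bits == 32:
        shift = 0  # same-size registers: both regions are identical
    return (vfr[index] >> shift) & MASK32


def mtspr(vfr, spr, value, fp_bits=64):
    """32 bit write of one half of a VFR; the other half is preserved."""
    index, shift = decode(spr)
    if fp_bits == 32:
        shift = 0
    cleared = vfr[index] & ~(MASK32 << shift)
    vfr[index] = (cleared | ((value & MASK32) << shift)) & ((1 << fp_bits) - 1)


# Context save/restore of VFR3 through two 32 bit GPR-sized values.
vfr = [0] * 512
vfr[3] = 0x400921FB54442D18
low = mfspr(vfr, VFR_LOW_BASE + 3)
high = mfspr(vfr, VFR_HIGH_BASE + 3)

restored = [0] * 512
mtspr(restored, VFR_LOW_BASE + 3, low)
mtspr(restored, VFR_HIGH_BASE + 3, high)
print(hex(restored[3]))
```

With this map a context save needs two l.mfspr plus two ordinary integer
stores per 64 bit register, and never needs an address in a floating
point register.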


And, as long as we're on the subject (I can already guess why
Damjan doesn't want to do this, but it is REALLY annoying): can
we either A) add an offset addressing mode to the lvf.[s/l][d/w]
instructions, or B) make these instructions auto-increment
the address register so that sequential loads/stores will go into
sequential locations?

I can't just add 8 to a floating point register to increment the
address, now can I?

Alternatively, can we simply use an integer register for the address
so that all of these problems just go away?

Comments?

Chris Ziomkowski
chris@asics.ws


_______________________________________________________
www.asics.ws - Solutions for your ASIC needs