[oc] MAC Core - Starts
Hi,
This is the proposal for MAC Core.
Specifications:
- The MAC should handle inputs 16 bits wide.
- The MAC should provide an output either 32 bits or 40 bits wide.
- The MAC should be totally independent of the other modules of a DSP core, except for the control signals.
- When integrated with any DSP core, the MAC should handle instructions such as Multiply, Multiply-Accumulate and Repeated Multiply-Accumulate efficiently.
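To make the three instruction classes concrete, here is a minimal arithmetic sketch in Python. It is not part of the proposal: the function names are made up for illustration, and it assumes signed two's-complement operands and wrap-around at 40 bits, neither of which the specification above states.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def multiply(x, y):
    """Multiply: 16-bit by 16-bit product (always fits in 32 bits)."""
    return to_signed(x, 16) * to_signed(y, 16)


def multiply_accumulate(acc, x, y):
    """Multiply-Accumulate: add the product to a 40-bit accumulator.

    Wrapping at 40 bits is an assumption; a real core might saturate instead.
    """
    return to_signed(acc + multiply(x, y), 40)


def repeated_mac(xs, ys, acc=0):
    """Repeated Multiply-Accumulate: one MAC step per operand pair."""
    for x, y in zip(xs, ys):
        acc = multiply_accumulate(acc, x, y)
    return acc
```

For example, repeated_mac([1, 2, 3], [4, 5, 6]) computes 1*4 + 2*5 + 3*6. The 8 bits above the 32-bit product give headroom for long accumulation runs before the 40-bit result overflows.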
Description:
This is a general-purpose Multiply-Accumulate unit. The main goal is for
it to fit into any DSP core that needs a MAC unit. The primary
assumptions made for this are:
The memory architecture of the target DSP is a Harvard architecture.
Because of the Harvard architecture, there are two memories (data and
program), each with its own pair of data and address buses. A result bus
(Rbus) also exists for handling the data from the arithmetic units.
DMDB (Data Memory Data Bus) - 16 bits wide.
PMDB (Program Memory Data Bus) - 16 bits wide.
Rbus (Result Bus) - 32 bits wide.
Although the present MAC architecture (shown in the attached figure)
assumes the above set of buses, it also works with a Von Neumann
architecture (single memory). In that case, when the MAC is used as a
component in a higher-level description, the two buses PMDB and DMDB can
simply be merged. Since the registers are loaded only when qualified by
the control signals from the Control Unit, the architecture functions
normally. The only difference is that an extra cycle is needed to load
the data into both registers.
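The cycle-count difference can be illustrated with a small Python sketch. This is only a toy model under assumptions of my own: the function name and the per-cycle bookkeeping are hypothetical, and it assumes each bus delivers exactly one operand per cycle.

```python
def load_operands(mx_operand, my_operand, harvard=True):
    """Load MX and MY and return (registers, cycles_used).

    Assumption: each bus delivers one 16-bit operand per cycle, and a
    register loads only in the cycle where its control signal is asserted.
    """
    registers = {"MX": None, "MY": None}
    if harvard:
        # Separate DMDB and PMDB: both operands arrive in the same cycle.
        registers["MX"] = mx_operand & 0xFFFF  # from DMDB
        registers["MY"] = my_operand & 0xFFFF  # from PMDB
        cycles = 1
    else:
        # Merged bus: the control unit enables one register per cycle.
        registers["MX"] = mx_operand & 0xFFFF  # cycle 1
        registers["MY"] = my_operand & 0xFFFF  # cycle 2
        cycles = 2
    return registers, cycles
```

The register contents are the same in both cases; only the number of cycles changes, which is why the datapath itself needs no modification.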
Now let us look at the MAC architecture. The MXmux selects whether the
MX register is loaded from DMDB or from Rbus (the operand may be the
result of an earlier arithmetic operation). The MYmux selects whether
the MY register is loaded from DMDB (Data Memory Data Bus) or from PMDB
(Program Memory Data Bus). The multiplier handles 16-bit by 16-bit
multiplications. The accumulator takes two inputs, one from the
multiplier and the other from the result registers, and can accumulate
up to 40 bits. The 40-bit accumulator result is shared between two
registers: the least significant 32 bits go into MR1 and the most
significant 8 bits go into MR2.
There can be instances when only a multiplication is needed, with no
accumulation. In such cases the multiplier result bypasses the
accumulator via the MR1mux.
Since the Rbus can carry only 32 bits at a time, a mux network formed by
MRUmux and MRLmux takes care of driving the result onto the Rbus. For a
32-bit result, MRUmux puts out the upper 24 bits and MRLmux puts out the
lower 8 bits. For a 40-bit result, an extra cycle is taken to put out
the top 8 bits through MRLmux. To clarify the mux network further:
MRUmux always drives 24-bit data onto Rbus[31:8], and MRLmux drives
8-bit data onto Rbus[7:0]. The control signals CM1 through CM4 can be
issued by the control unit depending on the memory architecture of the
target DSP core.
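The datapath described above can be sketched as a small behavioural model in Python. This is an illustration, not the proposed RTL. The class and method names are made up, and the model makes several assumptions the text does not state: operands are signed two's complement, the accumulator wraps at 40 bits, a bypassed product is sign-extended into MR2, and MRUmux drives zeros during the extra cycle of a 40-bit dump.

```python
MASK40 = (1 << 40) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


class MacModel:
    """Behavioural sketch of the MX/MY registers, accumulator and MR1/MR2."""

    def __init__(self):
        self.mx = 0   # 16-bit operand register
        self.my = 0   # 16-bit operand register
        self.mr1 = 0  # low 32 bits of the result
        self.mr2 = 0  # high 8 bits of the result

    def load(self, mx, my):
        """Load both operand registers (mux selection is not modelled)."""
        self.mx = mx & 0xFFFF
        self.my = my & 0xFFFF

    def execute(self, accumulate):
        """Multiply, then either accumulate or bypass the accumulator."""
        product = to_signed(self.mx, 16) * to_signed(self.my, 16)
        if accumulate:
            previous = to_signed((self.mr2 << 32) | self.mr1, 40)
            result = previous + product
        else:
            result = product  # bypass path via MR1mux
        result &= MASK40      # assumption: wrap at 40 bits
        self.mr1 = result & 0xFFFFFFFF
        self.mr2 = result >> 32

    def dump(self, wide):
        """Return the 32-bit Rbus words driven, one list entry per cycle."""
        mru = (self.mr1 >> 8) & 0xFFFFFF  # MRUmux -> Rbus[31:8]
        mrl = self.mr1 & 0xFF             # MRLmux -> Rbus[7:0]
        words = [(mru << 8) | mrl]
        if wide:
            # Extra cycle: MRLmux drives MR2 onto Rbus[7:0].
            # Assumption: Rbus[31:8] is driven with zeros in this cycle.
            words.append(self.mr2 & 0xFF)
        return words
```

As a usage example, loading 3 and 4 and calling execute(False) followed by dump(False) yields a single Rbus word holding 12; a subsequent load of 2 and 5 with execute(True) accumulates to 22. Calling dump(True) returns two words, the second carrying MR2 in its low 8 bits.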
These considerations make the MAC architecture totally independent of
the target DSP core, and hence it can be ported to any core.
Damjan, you can put this on the MAC core webpage.
The MAC architecture MAC.GIF is attached.
Regards
Harish