
Re: [openrisc] Address pipeline during exception...



> Or1k architecture will have a lot of different applications, DSP
> processors, superscalars, lowpower, high performance.
> This is true, because or1k RTL is completely configurable.

This can still happen, even if you specify the pipelines. It may
require different architecture families for different applications.
However, the generated and written code will be more efficient for
each application, thus actually increasing its capabilities.

> This way you get SW:HW independence.

I don't believe there is such a thing. Object code is made for a
specific processor. If you change the processor, you recompile.
Higher level languages such as C provide the capabilities for what
you are requiring.

> When you are doing OS port, you should cover all implementations.
> I think things are defined in a way that you don't lose much,
> when writing portable code. It is up to implementation to use
> or1k code in a way that application needs.

Fine, give me 7 special cases and I'll happily write 7 completely
customized solutions. It will take me only a little longer, but
generate code that is significantly more efficient, and thus more
useable in a real implementation.

> You can write code as there were no pipelines - but you should
> decrease dependencies as much as possible - because you have
> in mind pipelined and superscalar versions. I don't think this is
> so hard to achieve.

A nice idea, but I think untrue. If the machine isn't pipelined, I
get the best performance by reusing a few temporary register
locations as quickly as possible while leaving some untouched for
global storage. In a pipelined situation, my best efficiency comes
from not using registers until they are available, even if this means
dumping to memory now and then. Different solutions for different
problems. I also need to know the latency for instructions. Assuming
the worst case everywhere always generates worst-case code. It's not a
recipe for success.
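
To make the register-reuse point concrete, here is a minimal sketch of a
single-issue, in-order machine that stalls on read-after-write hazards. It
is not OR1K timing: the latencies, the register names, and the
count_cycles helper are all assumptions made up for illustration.

```python
from typing import NamedTuple

class Insn(NamedTuple):
    op: str        # "load" or "add"
    dest: str      # destination register
    srcs: tuple    # source registers read by the instruction

def count_cycles(program, latency):
    """Cycles until the last result is available on a single-issue,
    in-order machine that stalls until every source register is ready."""
    ready = {}     # register -> cycle at which its value becomes available
    next_issue = 0 # earliest cycle the next instruction may issue
    finish = 0
    for insn in program:
        issue = max([next_issue] + [ready.get(r, 0) for r in insn.srcs])
        done = issue + latency[insn.op]
        ready[insn.dest] = done
        next_issue = issue + 1
        finish = max(finish, done)
    return finish

# Same work, two register allocations (hypothetical registers r1..r4).
# "reuse" recycles the temporary r1 right away, so the second load cannot
# be hoisted above the first add.
reuse = [
    Insn("load", "r1", ()),
    Insn("add",  "r2", ("r1",)),
    Insn("load", "r1", ()),
    Insn("add",  "r3", ("r1",)),
]
# "spread" spends an extra register (r4) so both loads can issue
# back to back before either result is consumed.
spread = [
    Insn("load", "r1", ()),
    Insn("load", "r4", ()),
    Insn("add",  "r2", ("r1",)),
    Insn("add",  "r3", ("r4",)),
]

unit_latency = {"load": 1, "add": 1}  # stand-in for a non-pipelined core
slow_loads   = {"load": 3, "add": 1}  # assumed 3-cycle load-use latency

print(count_cycles(reuse, unit_latency), count_cycles(spread, unit_latency))  # 4 4
print(count_cycles(reuse, slow_loads), count_cycles(spread, slow_loads))      # 8 5
```

With unit latency the two orderings cost the same, so the version that
ties up fewer registers is the better one. With a three-cycle load the
ordering that holds off on each register until its value is ready
finishes in 5 cycles instead of 8. The compiler can only pick between
them if it knows the latency.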

> Anyway for our applications it is much better that code is portable
> than the fact that the user must write its own OS and other utils
> if he wants to use or1k... this way nobody will do the effort.

Don't follow the logic here. If the pipeline is in the architecture
document, then the user will have a known pipeline. Changes to the
pipeline require changes to gcc and hand written assembly to get
good efficiency. The code is still portable across all implementations
which subscribe to the architecture. The issue is why you don't
want to specify the pipeline.

> This architecture was designed mostly by SW guys, and we designed
> towards the goal, that no implementation loses anything, and that
> architecture is orthogonal.

In an embedded processor, efficiency in power and cost are the
primary goals. Any project needs to take real world business
concerns into account. There is a huge difference in power and
cost budgets to make something run at 400 MHz instead of 200
MHz. If I can get the same effect by running at a lower speed
and using an optimized compiler, why would I want to use
a worst case generic compiler?
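
As a back-of-the-envelope check on that trade-off, run time is just
cycle count divided by clock rate. The cycle counts below are
hypothetical, chosen only to show the case where tuned code needs half
the cycles of generic code; they are not measurements of any real
compiler or core.

```python
def run_time_us(cycles, clock_mhz):
    # A clock of N MHz delivers N cycles per microsecond.
    return cycles / clock_mhz

generic_cycles = 8_000_000  # hypothetical: worst-case generic code
tuned_cycles = 4_000_000    # hypothetical: pipeline-aware code

at_400 = run_time_us(generic_cycles, 400)
at_200 = run_time_us(tuned_cycles, 200)
print(at_400, at_200)  # 20000.0 20000.0
```

Under that assumption the 200 MHz part finishes the job in the same
time as the 400 MHz part.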

What you are suggesting is that the compiler assume worst case,
and my guess is most real world implementations will lead to a
significant number of pipeline stalls. Gcc will either start
moving things to memory when it could be reusing registers, or
else it will prematurely access registers.

If this is really what you guys want, then I suggest that a
cycle-by-cycle simulation is imperative to assess the impact of what you
are suggesting, and the efficiency with which code can be
generated in absence of this knowledge. Truthfully, I don't know,
but my gut tells me you're trying to do something that shouldn't
be done. The reality is I can't effect a change in what you guys
are doing, but this is a bad business decision in my opinion. A
word of advice, rethink what you are proposing. It doesn't address
real world issues. I don't know anyone who has a problem with a
restriction on implementation, but a lot of people will take
exception to code that runs at suboptimal efficiency. If I can use
a processor that uses even 10% less power to accomplish the same
goal, that is a significant advantage. I've based purchasing
decisions on less (even in the face of higher cost).

What problem space are you addressing that requires this pipeline
restriction?  I don't understand your concerns. Can you cite an
actual example where this was requested, so I can understand your
motivation? If you can explain to me what you are trying to solve,
I might be able to make some more helpful suggestions. As it is,
I don't see a use for what you are proposing, and a lot of drawbacks.

Care to support me in finding out what you are sacrificing to
achieve this?

Chris.
chris@asics.ws




_______________________________________________________
www.asics.ws - Solutions for your ASIC needs