Task level processing vs. Interrupt level processing

Discussion in 'Embedded' started by NewToFPGA, Jan 20, 2008.

  1. NewToFPGA

    NewToFPGA Guest

    Hi,

    I have just started working on low-level programming for embedded
    systems. I understand some of the basics and am trying to familiarize
    myself with some aspects of low-level programming. Here are some of
    the questions I have:

    How expensive is a task switch on a 400 MHz processor (MPC82xx
    based)?

    How much time does this processor take to run one assembly instruction?

    Can I implement polling at microsecond intervals? If I implement the
    polling in task-level (or thread-level) code, what are the common
    problems I would face? If not at the task level, is there anything I
    can do in the hardware configuration to request a timed interrupt?

    How do I configure an interrupt so that an FPGA can raise it when
    there is some data for the software to read?


    Thanks,
    Eswar.
     
    NewToFPGA, Jan 20, 2008
    #1

  2. NewToFPGA

    Tim Wescott Guest

    Task switches are as expensive as your RTOS makes them. The figure
    should be in the RTOS documentation.

    The time per instruction depends on your processor, and should be in
    the processor documentation, or is at least something that you can
    benchmark. Generally the execution time varies with the instruction,
    and for processors with pipelines it also depends on the instructions
    that precede and follow the one in question, which makes it very
    difficult to predict how long a given instruction will take.

    Whether you can poll at microsecond intervals depends on your
    environment. With a 400MHz processor, probably, but if you're polling
    once every 1us you'll find that you use a lot of clock ticks just for
    the polling.

    Whether you can request a timed interrupt depends on your processor,
    and should be in its documentation. Does it have hardware timers? Can
    the timers throw interrupts?

    As for configuring the FPGA interrupt: you read the processor
    documentation, and maybe some application notes, and you figure it out.

    I'm not trying to be snide here -- every processor has the World's Most
    Clever way of turning on interrupts, and every processor designer thinks
    that all the rest are idiots -- so techniques vary.

    Usually you have to set (or clear) a global interrupt mask, and set (or
    clear) an interrupt mask for the specific interrupt you want to enable.
    You'll have to tell the processor where the ISR is, unless your
    processor vectors to fixed locations. On many microcontrollers, each pin
    can do approximately one bazillion different things, so you also have to
    configure the pin correctly as an interrupt input.

    Finally, you have to spend a week or two struggling with the one
    important part that got left out of the manual, or is in the manual for
    some seemingly unrelated part of the processor. Usually this involves
    flipping the default value of one frigging little bit in an obscure
    register someplace, but sometimes it requires completely rewriting all
    your code.
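
    The generic sequence described above (install the handler, configure
    the pin, unmask the specific source, then unmask globally) can be
    sketched in C. This is only an illustration under loud assumptions:
    the `fake_intc` struct stands in for memory-mapped interrupt
    controller registers, and the bit layout, the source number
    `FPGA_IRQ_SRC` and all function names are hypothetical. A real part
    will differ in every detail, so the processor manual stays the
    reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SOURCES  32u
#define FPGA_IRQ_SRC 5u   /* hypothetical source number wired to the FPGA */

typedef void (*isr_t)(void);

/* Stand-in for memory-mapped registers. On real hardware these would be
 * volatile accesses to addresses taken from the processor manual. */
struct fake_intc {
    uint32_t global_enable;        /* 1 = interrupts enabled globally */
    uint32_t source_mask;          /* bit n set = source n unmasked */
    uint32_t pin_is_irq_input;     /* bit n set = pin n is an IRQ input */
    isr_t    vector[NUM_SOURCES];  /* handler table, NULL = no handler */
};

struct fake_intc intc;             /* zero-initialized: everything off */
volatile int fpga_data_ready;      /* flag the task-level code would test */

/* Hypothetical ISR: just note that the FPGA has data to read. */
void fpga_isr(void)
{
    fpga_data_ready = 1;
}

/* Enable one interrupt source, following the order given in the post. */
void irq_setup(struct fake_intc *c, unsigned src, isr_t handler)
{
    assert(src < NUM_SOURCES && handler != NULL);
    c->global_enable = 0;              /* keep interrupts off while configuring */
    c->vector[src] = handler;          /* tell the "processor" where the ISR is */
    c->pin_is_irq_input |= 1u << src;  /* configure the pin as an IRQ input */
    c->source_mask |= 1u << src;       /* unmask this specific source */
    c->global_enable = 1;              /* finally, unmask globally */
}

/* Simulate the hardware raising a source. Returns 1 if the ISR ran. */
int irq_raise(struct fake_intc *c, unsigned src)
{
    if (src >= NUM_SOURCES || !c->global_enable)
        return 0;
    if (!(c->source_mask & (1u << src)))
        return 0;
    if (!(c->pin_is_irq_input & (1u << src)))
        return 0;
    if (c->vector[src] == NULL)
        return 0;
    c->vector[src]();
    return 1;
}
```

    Calling `irq_setup(&intc, FPGA_IRQ_SRC, fpga_isr)` and then
    `irq_raise(&intc, FPGA_IRQ_SRC)` runs the handler in this simulation;
    skipping any one of the setup steps makes `irq_raise` return 0, which
    mirrors the "one little bit" failure mode described above.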

    --
    Tim Wescott
    Control systems and communications consulting
    http://www.wescottdesign.com

    Need to learn how to apply control theory in your embedded system?
    "Applied Control Theory for Embedded Systems" by Tim Wescott
    Elsevier/Newnes, http://www.wescottdesign.com/actfes/actfes.html
     
    Tim Wescott, Jan 20, 2008
    #2

  3. NewToFPGA

    Didi Guest

    How expensive are task switchings in a 400 MHz processor (MPC82xx
    It probably varies a lot between operating systems. For DPS on an
    824x this would be somewhere between 1 and 5 us, depending on
    whether the task being switched out uses the FPU (so all 32 64-bit
    FPU registers have to be saved), whether the task being given
    control uses the FPU (so all 32 FPU registers have to be restored),
    and some other, less influential factors. Another factor would
    be memory speed, i.e. whether the 824x has a 64- or a 32-bit data
    path; I have only had a system with a 64-bit memory path here.
    IRQ latency is a whole lot better, of course: IRQ stays masked
    for just a few cycles while the CPU is put in a recoverable state.
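
    To see why FPU use matters, it helps to count the bytes moved per
    switch. The sketch below is rough arithmetic, not a description of
    how DPS does it: it assumes a 603e-class core with 32 32-bit integer
    registers and 32 64-bit FPU registers, and it ignores special-purpose
    registers and all cache and memory effects. The function names are
    made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_GPRS  32u
#define GPR_BYTES  4u   /* assumed 32-bit integer registers */
#define NUM_FPRS  32u
#define FPR_BYTES  8u   /* 64-bit FPU registers, as stated in the post */

/* Bytes of register file belonging to one task (SPRs ignored). */
uint32_t context_bytes(int uses_fpu)
{
    uint32_t bytes = NUM_GPRS * GPR_BYTES;
    if (uses_fpu)
        bytes += NUM_FPRS * FPR_BYTES;
    return bytes;
}

/* Bytes saved for the outgoing task plus bytes restored for the incoming one. */
uint32_t switch_bytes(int out_uses_fpu, int in_uses_fpu)
{
    return context_bytes(out_uses_fpu) + context_bytes(in_uses_fpu);
}

/* Convert a time in microseconds to core clock cycles (clock in MHz). */
uint32_t us_to_cycles(uint32_t us, uint32_t clock_mhz)
{
    assert(clock_mhz != 0 && us <= UINT32_MAX / clock_mhz);
    return us * clock_mhz;
}
```

    Under these assumptions a switch between two integer-only tasks
    moves 256 bytes, and one between two FPU-using tasks moves 768
    bytes. At 400 MHz the quoted 1 to 5 us range corresponds to 400 to
    2000 core clock cycles.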

    Although DPS has really low latency and a tiny footprint, it is not
    what you would typically associate with an RTOS; it is a full-blown
    OS by any standards.

    Note that the above is written by the author and owner of DPS.

    Dimiter
     
    Didi, Jan 20, 2008
    #3
  4. This also depends on the particular hardware and on the instant of
    execution. Whether the memory holding the saved context is cached,
    whether a page fault occurs, and whether the cache or SDRAM bank hits
    or misses all create a lot of variation.

    Vladimir Vassilevsky
    DSP and Mixed Signal Consultant
    www.abvolt.com
     
    Vladimir Vassilevsky, Jan 20, 2008
    #4
  5. For our proprietary RTOS and Blackfin CPU, the interrupt latency is under
    200ns. The task switching time depends on many factors (number of tasks,
    semaphores, messages etc.) and is generally on the order of microseconds.
    It could be done better than that; however, the goal was portability and
    convenience rather than performance.

    I am curious to know what "a tiny footprint full blown OS by any
    standards" means.

    Our RTOS takes about 20K for the core alone; any practical configuration
    is likely to take over 40K. Still, this is a small RTOS built as a library
    with support for a very minimal set of basic services.

    Vladimir Vassilevsky
    DSP and Mixed Signal Consultant
    www.abvolt.com
     
    Vladimir Vassilevsky, Jan 20, 2008
    #5
  6. NewToFPGA

    Didi Guest

    For our proprietary RTOS and Blackfin CPU, the interrupt latency is under
    Things are in the same ballpark for the PPC running DPS, obviously
    depending on clock frequency and perhaps memory speed. At 400 MHz it
    should be perhaps half that or so.

    By a tiny-footprint, full-blown OS I mean an OS with multiple tasks,
    multiple windows, many hundreds of system calls to utilize these, a
    filesystem of course, a tcp/ip stack, a (pretty unique) inherent
    mechanism for object maintenance and probably many things I cannot
    think of now. All the above takes less than 1M on a PPC; think 1/3
    of that on a CPU32 (I stopped developing the CPU32 version a while
    ago, though). It has more than enough that if one needs to write an
    application, one does not have to write much, if anything, but the
    application. The minimum you can boot with, while still having the
    scheduler, the filesystem and about half of all calls, is something
    like 100K on the PPC, and < 30K on a CPU32.

    A more or less representative view - running some applications on top
    of DPS, perhaps another few hundred K - is at
    http://tgi-sci.com/tgi/tools.gif . It is an old screenshot (>5 years),
    but will give you an idea.

    I hope this year I will get around to making a less platform-dependent
    package available. I would be doing this at a much higher
    priority if there were any documented PPC-based hardware in the
    PS3/XBOX price range, which is not the case.

    Dimiter
     
    Didi, Jan 20, 2008
    #6
  7. NewToFPGA

    NewToFPGA Guest

    Can I implement polling at microsecond intervals?
    Leaving performance aside, what is the maximum number of ticks that
    I can have on a 400 MHz processor? Is it 400,000,000 ticks per
    second (or 2.5 nanoseconds per tick)?

    If I have a periodic task which wakes up every 25 microseconds,
    what is the overhead of the timer itself? How do we find that out?

    Is there any good general reference book or online documentation
    that talks about processors in general?
     
    NewToFPGA, Jan 20, 2008
    #7
  8. NewToFPGA

    Didi Guest

    If I dont look at the performance point of view what is the maximum
    This is explained in some detail in the 603e core databook.
    On the Freescale website, in the 824x section, you will find
    either it or the one for the G2 core (which is pretty much the same
    core, with some enhancements on some implementations).
    They specify "up to 3 instructions per clock cycle",
    which means that in a single cycle the core can do an integer
    instruction, an FPU instruction and fold a prefetched branch.
    Obviously the branch cannot be fetched in that same cycle, since
    the data path to the cache is only 64 bits.
    Assuming 1 clock per instruction is pretty safe as long
    as you run from cache; you need to calculate external delays
    separately yourself, depending on your hardware and software.
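
    With the rule of thumb of 1 clock per instruction, the cycle budget
    for a periodic poll is simple arithmetic. At 400 MHz one cycle is
    2.5 ns, so a 1 us period leaves 400 cycles and a 25 us period leaves
    10,000. The helper names below are hypothetical, and the per-poll
    cost passed to `overhead_percent` is a figure you would have to
    measure on your own system; the 100 cycles used in the note after
    the code is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Core clock cycles available in one period (period in us, clock in MHz). */
uint32_t cycles_per_period(uint32_t period_us, uint32_t clock_mhz)
{
    assert(clock_mhz != 0 && period_us <= UINT32_MAX / clock_mhz);
    return period_us * clock_mhz;
}

/* Percentage of the CPU (rounded down) consumed if every period costs
 * overhead_cycles for the poll, timer handling, task switch and so on. */
uint32_t overhead_percent(uint32_t overhead_cycles,
                          uint32_t period_us, uint32_t clock_mhz)
{
    uint32_t budget = cycles_per_period(period_us, clock_mhz);
    assert(budget != 0);
    return (uint32_t)(((uint64_t)overhead_cycles * 100u) / budget);
}
```

    With an assumed cost of 100 cycles per poll, polling every 1 us
    would eat 25% of the processor, while polling every 25 us would eat
    1%. External memory and bus delays come on top of that, as noted
    above.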

    Dimiter
     
    Didi, Jan 21, 2008
    #8
  9. Didi wrote:

    That's very impressive. Although you have an interesting notion of
    tininess :)

    What is your paradigm for the following problem: passing an object from
    one task to another?

    Let's say the first task is preparing a block of data. The second task
    is sleeping. When the block is ready, it has to be passed to the second
    task, and that task has to be awakened. Who owns the memory occupied by
    the data block? If the memory is dynamic, who allocates and releases it?
    If the memory is static, how does the first task know when the second
    task no longer needs the object? Do you support the object transfer
    mechanism at the OS level, or is it left to the application?


    Vladimir Vassilevsky
    DSP and Mixed Signal Design Consultant
    http://www.abvolt.com
     
    Vladimir Vassilevsky, Jan 21, 2008
    #9
  10. NewToFPGA

    NewToFPGA Guest

    Thanks for directing me to this manual. There is a lot of info in
    this. I am going to read it in the next couple of days...

    How many instructions are there in C code like "int i = 100; int j =
    i;"? Again, any reference that covers these details would be
    appreciated.
     
    NewToFPGA, Jan 21, 2008
    #10
  11. NewToFPGA

    Didi Guest

    Let's say the first task is preparing a block of data. The second task
    In DPS memory is allocated dynamically. At the lowest level, a task
    can either allocate pieces in a registered manner (so if the task
    gets killed "by force" the pieces will be deallocated) or in a
    non-registered way, where the allocated piece stays allocated.
    Tasks also have the option to put in their history record (the same
    record which contains the registrations for allocated pieces) one
    or more addresses in their program section which the system will
    call upon kill, along with some parameters passed via that same
    record. And one has the option to allocate a registered piece of
    memory to a third party, i.e. task a allocates it but it gets
    registered on behalf of task b. There is a variety of intertask
    communication facilities, starting with the common data section
    that groups of tasks share, through the inter-task signalling
    mechanism, to the (highest level) object-specific ways. The latter
    also offer higher-level facilities for memory allocate/deallocate,
    which turned out to be very convenient.
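
    The idea of registered allocation, including allocating on behalf
    of another task, can be illustrated with a toy registry. This is
    not DPS code or its API: the names `alloc_registered`, `kill_task`
    and `blocks_owned`, the fixed-size table and the use of `malloc`
    are all assumptions made for the sketch, and task ids are plain
    integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_BLOCKS 16

struct reg_entry {
    void *ptr;    /* NULL means the slot is free */
    int   owner;  /* id of the task the block is registered to */
};

struct reg_entry registry[MAX_BLOCKS];  /* zero-initialized: all slots free */

/* Allocate size bytes and register them to owner. The owner need not be
 * the calling task, which models allocating on behalf of a third party. */
void *alloc_registered(size_t size, int owner)
{
    assert(size > 0);
    for (int i = 0; i < MAX_BLOCKS; i++) {
        if (registry[i].ptr == NULL) {
            void *p = malloc(size);
            if (p == NULL)
                return NULL;
            registry[i].ptr = p;
            registry[i].owner = owner;
            return p;
        }
    }
    return NULL;  /* registry full */
}

/* Count the blocks currently registered to a task. */
int blocks_owned(int task)
{
    int n = 0;
    for (int i = 0; i < MAX_BLOCKS; i++)
        if (registry[i].ptr != NULL && registry[i].owner == task)
            n++;
    return n;
}

/* Free every block registered to a task; returns how many were freed. */
int kill_task(int task)
{
    int freed = 0;
    for (int i = 0; i < MAX_BLOCKS; i++) {
        if (registry[i].ptr != NULL && registry[i].owner == task) {
            free(registry[i].ptr);
            registry[i].ptr = NULL;
            freed++;
        }
    }
    return freed;
}
```

    If task 1 calls `alloc_registered(128, 2)`, the block belongs to
    task 2 from the start, so killing task 1 leaves it alone and killing
    task 2 releases it. That answers the "who owns the memory" question
    for the producer/consumer case in this toy model.
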

    Oh well, I guess my notion of tiny can only get more interesting
    if I go on :).
    But I meant "tiny" in an apples-to-apples comparison: say, a running
    OS with a filesystem and about 300 calls in 100k of PPC program code
    is tiny... Now if you turn the VM on, with page tables and all
    (which I normally do), things get a lot less tiny, and if you add
    the other 300+ calls for the graphics, window maintenance etc., it
    can still be called tiny if compared apples-to-apples.

    Dimiter
     
    Didi, Jan 21, 2008
    #11
  12. NewToFPGA

    Didi Guest

    How many instructions are there in C code like "int i = 100; int j =
    You should be able to talk the compiler into generating an assembly
    output list and look at it. Different compilers would likely
    produce different sequences.
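
    As a concrete way to try this, the two statements from the question
    can be wrapped in a small function and compiled to an assembly
    listing. The file name `example.c` and the function name are
    arbitrary. With GCC, `gcc -S -O0 example.c` writes the listing to
    `example.s`; a PowerPC cross-compiler build of GCC should accept the
    same `-S` option, though which toolchain you have is an assumption
    here.

```c
/* example.c: the two statements from the question, wrapped in a function
 * so that the compiler has something to emit code for. */
int copy_example(void)
{
    int i = 100;  /* unoptimized: usually a load-immediate plus a store */
    int j = i;    /* unoptimized: usually a load of i plus a store to j */
    return j;
}
```

    With optimization enabled the compiler will most likely reduce the
    whole function to returning the constant, so compare the `-O0` and
    `-O2` listings to see how much the instruction count depends on the
    compiler settings.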

    But for understanding things at the level you want to, C (or Basic
    or Pascal or whatever HLL) is not the right place to look. You
    need to understand how things work in machine code; then you can
    choose a higher level language to hide the machine
    level from you. Right now you are trying to understand the machine
    level, though, and hiding it from yourself does not seem to be
    a good idea :).
    You could spend a while reading the PPC programming environment
    book (or a title along those lines); you can find it on the Freescale
    site as well. It is bulky, but pretty straightforward and easy to
    understand, and should make useful reading, I suppose.

    Dimiter
     
    Didi, Jan 21, 2008
    #12
  13. As others have pointed out, the execution time of a single
    instruction is not constant, but depends on the context (pipeline
    state, cache state, maybe other processor state). So you have to
    consider entire code snippets, for example functions or even whole
    threads.

    The aiT static execution-time analysis tool from AbsInt
    (www.absint.com) can compute bounds on the worst-case execution
    time for PPC code, for some PPC models (which models are supported
    I don't exactly know). It takes into account pipeline and cache
    effects using a very detailed hardware model. It covers all
    execution paths by static analysis and abstract interpretation. But
    it's not cheap.

    HTH
     
    Niklas Holsti, Jan 21, 2008
    #13
  14. NewToFPGA

    CBFalconer Guest

    Actually, for the query case, almost all machines will produce at
    most:

    mvi  regno, 100     ; Move the immediate value to regno
    stoi regno, baserg  ; Store that via the address in baserg
    inc  baserg, sz     ; Advance baserg by sz, i.e. the size of an int

    and something else has set up baserg, etc. The int j = i will be
    almost the same, except that it will start by replacing "mvi regno,
    100" with:

    movm regno, value   ; Load the content of memory location 'value' into regno

    and the details of how those assembly instructions are constructed,
    manipulated, etc. will vary from machine to machine. But the idea
    is quite consistent.

    Assembly language differs from higher level languages in that each
    instruction performs a known action, and the assembly language writer
    has to combine those actions to get the desired effect. In a
    higher level language, he just writes the effect, and other
    software (the compiler, usually) selects the assembly sequence.
     
    CBFalconer, Jan 21, 2008
    #14