Motherboard Forums


AVR interrupt response time

Pygmi
01-12-2005, 07:40 PM


I just started my first time-critical project with AVRs,
where "time-critical" means interrupt response time.
So far I have been using avr-gcc (3.3.x) and have been
pretty happy with it. I have written ALL the code in C.

I'm hoping to get some code executed within 2 us or so
after an external interrupt (INT0/INT1 on an ATmega32).
I wrote the code to be executed today and it came to
approx. 20 instructions/cycles. With a 16 MHz clock that
means something like 1.25 us. There is not much left to
optimize there.
From the datasheets I found that it takes a minimum of
4 cycles (?) to jump to the interrupt handler. Allowing for
some register saving and the like, I was expecting less than
0.5 us before my own code starts executing => a total of <2 us.

OK, that was what I was hoping...

When I compiled the code and ran it, I noticed that it
took about 2.5 us to start executing my code?!?!
(I used an ATmega8 as I don't have any M32 at the
moment, but I guess that isn't relevant??)
I checked the list file, and one reason is the
LENGTHY prologue added by gcc to the interrupt
handler (17 instructions!!!), saving a LOT of registers...

Two questions:
1. Even with 4 cycles + 17 instructions there is 1 us
missing?? What else happens before my own handler
code starts executing?
2. Is there any way to tell gcc NOT to push all those
registers in the prologue??

And finally:
If I expect my own code to start executing
within 0.5 us, is assembler the only way to go??

Thanks in advance for any info....
Any links to good resources are appreciated as well.
I would REALLY like to know exactly what happens there.

Pygmi


 
Jeroen
01-12-2005, 08:32 PM

The processor first synchronizes the external input to its own clock;
that takes 2 clocks. The processor also has to finish the currently
executing instruction. It takes 3 cycles to go to the interrupt vector, from
where it executes a jump to your ISR, another 3 cycles. This is 8 to
11 cycles in total, depending on the executing instruction; 11 cycles at
16 MHz is 0.6875 us.
Then it has entered your ISR; you at least need to save the status register
and a few registers before useful work can be done.
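
The cycle budget above can be sketched in C as follows. The entry-latency numbers are the ones quoted in this post; the 2-cycle PUSH figure applies to classic megaAVR cores, and the split of the 17-instruction gcc prologue into 15 pushes plus 2 single-cycle instructions is an assumption that should be checked against the list file. All names are illustrative.

```c
#include <stdint.h>

/* Cycle counts as quoted in this post; treat them as the poster's
   estimate rather than as datasheet values. */
enum {
    SYNC_CYCLES       = 2, /* synchronizing the pin to the CPU clock */
    VECTOR_CYCLES     = 3, /* getting to the interrupt vector */
    JUMP_CYCLES       = 3, /* jump from the vector to the ISR */
    MAX_FINISH_CYCLES = 3  /* assumed worst case for the instruction in flight */
};

/* Convert CPU cycles to microseconds at the given clock frequency. */
static double cycles_to_us(uint32_t cycles, uint32_t f_cpu_hz)
{
    return (double)cycles * 1e6 / (double)f_cpu_hz;
}

/* Best case: the interrupt arrives just as an instruction completes. */
static uint32_t entry_cycles_min(void)
{
    return SYNC_CYCLES + VECTOR_CYCLES + JUMP_CYCLES;
}

/* Worst case: the longest instruction has only just started. */
static uint32_t entry_cycles_max(void)
{
    return entry_cycles_min() + MAX_FINISH_CYCLES;
}

/* Prologue cost: PUSH takes 2 cycles, IN/EOR and similar take 1. */
static uint32_t prologue_cycles(uint32_t n_push, uint32_t n_single_cycle)
{
    return 2u * n_push + n_single_cycle;
}
```

Under these assumptions, 8 entry cycles plus a 32-cycle prologue come to 40 cycles, i.e. 2.5 us at 16 MHz, which would account for the delay seen on the scope.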

How did you check the response time? With a scope?

Assembly will be necessary if you want to squeeze out every last bit of
performance. What's the application that makes this so critical?

Jeroen


 
Pygmi
01-12-2005, 10:10 PM

Thanks for the response.

Yes, I checked the response time with a scope: from the external
signal to the first executed instruction of my "own" code in the interrupt
handler.

I need to service ISA bus logic (I/O reads/writes), and I have
been told that R/W requests should be serviced within 2.5 us
(so not actually 2 us). I'm not quite sure about the 2.5 us requirement,
but if it is valid, it seems to be too much for an AVR at 16 MHz...
Maybe it would work if this were the only interrupt in the system, or with
nested interrupts.

...or I should forget all about interrupts and do what I need
by polling. Not very tempting.
...or just use a faster processor (which would also mean a jump from
AVR to another architecture)
...or the solution is a dual-ported RAM??
...or some other option... there are of course options, but at
additional HW cost

Pygmi


 
Mike Harrison
01-12-2005, 10:35 PM
On Wed, 12 Jan 2005 19:40:17 GMT, "Pygmi" <> wrote:

>I just started my first time critical project with AVR's.
>And time critical meaning interrupt response times.
>So far I have been using avr-gcc (3.3.x) and I have been
>pretty happy with it. And I have written ALL code in C.


For the lowest latency, the fastest way is to dedicate some registers for use only within the
interrupt code - that way you don't have to push/pop anything, just copy status to a register.

If you can tell your C compiler to never use certain registers in foreground code, and write your
int code in assembler, this will give the fastest response.

It may be that the standard C int handler can be modified to reduce what it saves if it doesn't use
all the regs it saves - take a look at the assembler it generates - you may be able to hand-tweak
it.
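
A minimal sketch of the dedicated-register idea for avr-gcc. It assumes a recent avr-libc (the `ISR()` macro and `ISR_NAKED` are newer than the 3.3.x toolchain mentioned in this thread) and that the whole project, libraries included, is built with `-ffixed-r2` so that nothing else touches r2. The register choice, the name `isr_byte` and the PINB read are illustrative only. The compiler gives no firm guarantee about C statements inside naked functions, so the generated assembly should be checked.

```c
#include <stdint.h>

/* PUSH takes 2 cycles on classic megaAVR cores, so every register that
   no longer has to be pushed removes 2 cycles from the entry latency
   (and 2 more from the epilogue, where the matching POP disappears). */
static uint32_t entry_cycles_saved(uint32_t n_regs)
{
    return 2u * n_regs;
}

#if defined(__AVR__)
#include <avr/io.h>
#include <avr/interrupt.h>

/* r2 is reserved for the ISR: this declaration must be visible in every
   translation unit, and the build must use -ffixed-r2. */
register uint8_t isr_byte __asm__("r2");

/* ISR_NAKED: no compiler-generated prologue or epilogue at all. */
ISR(INT0_vect, ISR_NAKED)
{
    isr_byte = PINB;   /* expected to become a single IN, which leaves SREG untouched */
    reti();            /* a naked ISR must return explicitly */
}
#endif
```

The AVR-specific part is guarded so the helper can still be compiled on a host. If the handler needs instructions that change the flags, SREG would have to be copied into a second reserved register first, as described above.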


 
Jeroen
01-12-2005, 10:55 PM

Latency on bigger processors is usually even worse... Interrupt latency on,
for example, an 80386 can be hundreds of cycles.

It's better to have some hardware to interface to the ISA bus. A small, cheap
CPLD is best; all you really need is an address decoder and a few registers.
The ISA bus runs at 8 MHz and the AVR at 16 MHz; that is just 2 AVR clock
cycles (at most 2 instructions) for each ISA bus clock. A jump alone is
3 cycles. So it's not possible; the AVR just can't do anything useful in that
time. Only a much faster CPU could do it, and even then the load would be
very high.

The ISA interface can be done in plain HCT logic and will only take a few
chips. A possible solution is a GAL20V8 as the address decoder. This decoder
generates two strobes: one to enable a '574 that stores the data from the
data bus, and another to pass data from the AVR to the ISA bus via a '244.
INT0/1 on the AVR can be used to let the AVR know something has been written.
An external interrupt pulse needs to last at least 2 AVR clock cycles before
it's recognized, but to be on the safe side it's better to use a flip-flop
that's set by the address decoder and reset by the AVR. The output of the FF
goes to the INT0/1 input. This takes only 4 chips that cost next to nothing.
If board space is at a premium, a 44-pin CPLD like a MAX7000S could be used.
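
The latch-plus-flip-flop handshake described above can be modelled in a few lines of C to show why it decouples ISA timing from AVR interrupt latency. This is a host-side illustration only; the type and function names are made up for the sketch and do not correspond to any real part or library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of the glue logic (hypothetical names). */
typedef struct {
    uint8_t latch;     /* '574: holds the byte written by the ISA host */
    bool    int_flag;  /* flip-flop: set by the decoder, cleared by the AVR */
} isa_glue_t;

/* What the address decoder's write strobe does in hardware: the data is
   captured immediately, however busy the AVR happens to be. */
static void isa_host_write(isa_glue_t *g, uint8_t data)
{
    g->latch = data;
    g->int_flag = true;   /* keeps INT0/INT1 asserted until serviced */
}

/* What the AVR's interrupt handler does once it finally runs.
   Returns true and stores the byte in *out if a write was pending. */
static bool avr_service(isa_glue_t *g, uint8_t *out)
{
    if (!g->int_flag) {
        return false;
    }
    *out = g->latch;
    g->int_flag = false;  /* models the AVR resetting the flip-flop */
    return true;
}
```

The model also makes the remaining constraint visible: a second host write that arrives before the AVR has serviced the first one simply overwrites the latch.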

Jeroen


 
CBFalconer
01-12-2005, 11:04 PM
Jeroen wrote:
> This is 8 cycles to 11 cycles total time, depending on the
> executing instruction; or 0.6875 us.

And all that assumes that the executing code has no critical
sections implemented by disabling interrupts. Does no ARM
instruction take over 3 cycles? What about a return? What about
other interrupts and returns from them, if any. Hairy.

--
Chuck F () ()
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!


 
Ulf Samuelsson
01-13-2005, 12:26 AM
> When I compiled the code and ran it, I noticed that it took about 2.5 us
> to start executing my code?!?!
> (I used ATMega8 as I don't have any M32 at the moment, but I guess it
> isn't relevant??)
> I checked the list file and one reason is the LENGTHY prologue added by
> gcc into interrupt handler (17 instructions!!!), saving LOT of registers...


You can try a better compiler than WinAVR!


// IAR C interrupt handler
#pragma vector=12
__interrupt void handler()
{
BYTE i = PORTB;
PORTB = 0xF0;
PORTB = 0x0F;
PORTB = i;
}


Generated code

51 __interrupt void handler()
\ handler:
52 {
\ 00000000 931A ST -Y,R17
\ 00000002 930A ST -Y,R16
53 BYTE i = PORTB;
\ 00000004 B318 IN R17,0x18
54 PORTB = 0xF0;
\ 00000006 EF00 LDI R16,240
\ 00000008 BB08 OUT 0x18,R16
55 PORTB = 0x0F;
\ 0000000A E00F LDI R16,15
\ 0000000C BB08 OUT 0x18,R16
56 PORTB = i;
\ 0000000E BB18 OUT 0x18,R17
57 }
\ 00000010 9109 LD R16,Y+
\ 00000012 9119 LD R17,Y+
\ 00000014 9518 RETI
58

Two registers used, two registers pushed.
As you can see, there is no reason to even save the status register in this
case, since the flags do not get updated.

If you need fast interrupt response and need to do a lot of work,
then consider dividing the handler into two parts.

The first part does minimal, fast processing and, at the end, sets an
external interrupt pending, which continues the processing after the fast
interrupt has exited.

__no_init __register BYTE SavePortB @4; // Put the saved value in register r4
#pragma vector=TIMER
__interrupt void fast_handler(void)
{
SavePortB = PORTB;
set_ext_interrupt_pending();
}

#pragma vector=EXT_INT_HANDLER
__interrupt void slow_handler(void)
{
// Continue slow processing after fast handler has exited.
PORTB = 0xF0;
PORTB = 0x0F;
PORTB = SavePortB; // restore the value saved by fast_handler
}

Since the processing is minimal in the fast handler, very few registers
should be pushed by a good compiler.


There is a 4 kB restricted version of the C compiler for tests.
You have to contact IAR personally to get it. It is not on their web page.
It does not generate assembly code, only object code.



--
Best Regards
Ulf at atmel dot com
These comments are intended to be my own opinion and they
may, or may not be shared by my employer, Atmel Sweden.




 
Ulf Samuelsson
01-13-2005, 12:29 AM
> And all that assumes that the executing code has no critical
> sections implemented by disabling interrupts. Does no ARM
> instruction take over 3 cycles? What about a return? What about
> other interrupts and returns from them, if any. Hairy.
>


Don't forget that the main reason for long worst-case interrupt latencies is
probably another interrupt handler which does not re-enable the global
interrupt flag and so does not allow nested interrupts.
According to Murphy's law, this conflict will only appear AFTER customer
shipment.
You have to add together ALL interrupts in the system which have higher
priority to find your worst-case latency.
This is not something that can be tested. You have to do the calculations.
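
A minimal sketch of that calculation in C, assuming nesting is not enabled so that each blocking handler runs to completion first. The extra term for the longest interrupts-disabled section in foreground code comes from the point quoted at the top of this post. The function and parameter names are illustrative, and the cycle counts have to come from your own list files.

```c
#include <stddef.h>
#include <stdint.h>

/* Worst-case latency, in CPU cycles, for one interrupt:
   hardware entry time
   + longest foreground section that runs with interrupts disabled
   + the full duration of every handler that can run ahead of it. */
static uint32_t worst_case_latency_cycles(const uint32_t *blocking_isr_cycles,
                                          size_t n_isrs,
                                          uint32_t longest_cli_section_cycles,
                                          uint32_t entry_cycles)
{
    uint32_t total = entry_cycles + longest_cli_section_cycles;
    for (size_t i = 0; i < n_isrs; ++i) {
        total += blocking_isr_cycles[i];
    }
    return total;
}
```

For example, two blocking handlers of 40 and 100 cycles, a 10-cycle critical section and an 11-cycle entry time give 161 cycles, or just over 10 us at 16 MHz.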

--
Best Regards
Ulf at atmel dot com
These comments are intended to be my own opinion and they
may, or may not be shared by my employer, Atmel Sweden.


 
CBFalconer
01-13-2005, 07:37 AM

Of course it is not impossible that the OP has something that runs
a basic loop and has only one interrupt in the system, in which
case there will be no critical sections and the latency is
controlled by the longest instruction. However the return
instruction in many systems implies interrupt disable for the
following instruction, as a measure to avoid stack overflow in some
worst cases. There are also special cases, such as the x86 string
instructions when using a repeat prefix. Don't know about the ARM.

--
Chuck F () ()
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!


 
CBarn24050
01-13-2005, 10:10 AM
>Subject: Re: AVR interrupt response time
>From: "Jeroen"
>Date: 12/01/2005 20:32 GMT Standard Time


>> I'm hoping to get some code executed within 2 us or so
>> after external interrupt


>> If I expect to have my own code to execute
>> within 0.5 us, is the assembler the only way to go??


Yes, you don't get many instructions in 2 us.
 