High-end AVR vs. low-end ARM?

Discussion in 'Embedded' started by Bresco, Nov 6, 2008.

  1. Bresco

    Bresco Guest

    In terms of pricing, how do high-end AVRs (Mega-128) compare to low-end ARM
    processors? The ARMs are much more powerful and have large on-chip RAM.

    Has anyone compared them? I heard that ARMs are cheaper than AVRs these
    days. Is this true?
     
    Bresco, Nov 6, 2008
    #1

  2. Leon Guest

    ARM chips like the NXP LPC2000 can be cheaper than high-end AVRs, and
    offer much more performance. However, they consume more power and
    could work out more expensive by the time they are put on a PCB,
    because of the requirement for two supplies.

    Leon
     
    Leon, Nov 6, 2008
    #2

  3. Hi,

    @ Leon, I agree with everything but one point, the power consumption. For
    example, the mentioned LPC2000 can run at 40 mA @ 70 MHz (2103); my
    guess is that you would need 10 AVRs running @ 16 MHz to match its
    computing performance. AFAIK each of them needs more than 4 mA @ 16 MHz.
    On the other hand, there is a HUGE difference in standby current. AVRs,
    at least the older ones, can go into standby mode at or below 1 uA; if
    one of the ARM devices gets hot, the standby current easily exceeds
    100 uA.
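
    To make the active-versus-standby trade-off concrete, here is a minimal C
    sketch of the average current of a duty-cycled device. The function name
    and the duty-cycle values are assumptions for illustration; the current
    figures are the rough ones quoted above (4 mA / 1 uA for the AVR,
    40 mA / 100 uA for the ARM), not datasheet values.

```c
#include <assert.h>

/* Average supply current in microamps for a device that is active for
 * duty_permille/1000 of the time and in standby for the rest.
 * Hypothetical helper; all inputs are rough figures, not datasheet data. */
double avg_current_ua(double active_ua, double standby_ua,
                      unsigned duty_permille)
{
    assert(duty_permille <= 1000);
    double duty = duty_permille / 1000.0;
    return duty * active_ua + (1.0 - duty) * standby_ua;
}
```

    If the ARM really is about 10 times faster, it can sleep 10 times longer
    for the same work. With the AVR awake 1% of the time,
    avg_current_ua(4000.0, 1.0, 10) comes to about 41 uA, while the ARM awake
    0.1% of the time, avg_current_ua(40000.0, 100.0, 1), comes to about
    140 uA, so at low duty cycles the standby current dominates.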

    There is another reason to stick with AVR: simplicity. If you are
    familiar with AVR, you can finish your project a lot faster than with
    an ARM. If you are starting a new project and wonder whether the AVR
    will be powerful enough, go with the ARM; it is going to provide more
    for the money. If your deadline matters most and the AVR is powerful
    enough, spend the extra money and stay with the 8-bit.

    As for AVR32, in case you were thinking about that one, I know of no
    real reason to start with that device. Use a Cortex-M3 device instead,
    the upcoming standard.

    AnSchwob
     
    An Schwob in the USA, Nov 7, 2008
    #3
  4. linnix Guest

    Just one more thing. The 1.8V to 5.5V operating range of the AVR is
    very useful for battery devices. You usually need more than 1.8V,
    even for an ARM with a built-in regulator.
     
    linnix, Nov 7, 2008
    #4
  5. As soon as code size goes over 64K, the simplicity of the AVR
    vanishes. Not having 24-bit or 32-bit pointers causes all sorts of
    problems. Also, if one needs to execute code from RAM, then the AVR
    is a no-go.
    The Cortex-M3 devices tend to have a built-in regulator for generating
    all the needed supplies from one supply. They are also a LOT cheaper
    when one starts looking at >= 128K flash. The Cortex-M3 devices have
    removed a lot of the complexity one has to deal with when using the
    ARM7 and ARM9 MCUs.

    ATMEGA128-16AU is US$15 while a LM3S1607-IQR50 is US$5 in
    single quantities at Digikey. The ATMEGA128 is 16 MHz, 8-bit with 128K
    of flash. The LM3S1607 is 50 MHz, 32-bit with 128K of flash.
    Agreed.

    Regards
    Anton Erasmus
     
    Anton Erasmus, Nov 7, 2008
    #5
  6. Let's see,

    Where do I get the Cortex-M3 flash chip with

    * Lower power consumption than any existing Cortex-M3 chip
    * Single 1.8V +/- 10% power supply for CORE *AND* I/O?
    * 5V VCC , desirable for motor control?
    * debug support allowing you to read/write internal registers without
    stopping the MCU.
    * High Speed USB
    * Free Eclipse/GCC tool directly supported by the silicon vendor
    * Sustained 33 DSP MIPS when doing vector sums
    for (sum = 0, i = 0; i < n; i++) sum = sum + C[i] * X[i];
    * Migration path to low cost versions supporting Linux.
    * Same H/W tools as the AVR (JTAG-ICE Mk II & STK600)
    * Trace capable emulator at below $600 (AVRONE)
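
    For reference, the vector sum from the list above can be written as a
    self-contained C function. This is only a sketch: the function name, the
    16-bit sample type and the 32-bit accumulator are assumptions, not
    something taken from Atmel material.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply-accumulate over two vectors: sum += c[i] * x[i].
 * Assumes 16-bit coefficients/samples and a 32-bit accumulator; the
 * caller must keep n small enough that the sum cannot overflow. */
int32_t vector_mac(const int16_t *c, const int16_t *x, size_t n)
{
    assert(n == 0 || (c != NULL && x != NULL));
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += (int32_t)c[i] * x[i];
    }
    return sum;
}
```

    Each loop iteration is one MAC, which is the operation the "DSP MIPS"
    figure counts.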

    Googling does not give any clue...
     
    Ulf Samuelsson, Nov 7, 2008
    #6


  7. The full combination does not exist.
    Just listed some properties, that could make people want
    to think twice about focusing 100% on CM3.

    UC3L = 1.8V VCC
    UC3C = 5V
    UC3A3 = High Speed USB
    UC3B & UC3L should be lower power than CM3
    UC3A/C has 66 MHz operation and thus 33 DSP MIPS
    AP7 runs Linux, Need Cortex-A8 for this and that ain't cheap.

    In the end, it will be the right combination of peripherals
    which will be key to the decision.
     
    Ulf Samuelsson, Nov 7, 2008
    #7
  8. Well, duuuuuh. It's an impossible question!
     
    Clifford Heath, Nov 7, 2008
    #8
  9. Yep, but I think people get the hint ;-)
     
    Ulf Samuelsson, Nov 8, 2008
    #9
  10. voices Guest

    You can compare Cortex-M3 to AVR32 UC3A and UC3B series, but not to
    AP7(hi-speed usb, mmu, linux) - it's a different class of devices.
    We also don't compare Intel Core2Duo to AVR ;)
     
    voices, Nov 8, 2008
    #10
  11. steve Guest

    googling doesn't give you a clue for 1.8V, 5V AVR32s either....
     
    steve, Nov 10, 2008
    #11
  12. steve Guest

    If you need really cheap and you're watching every penny, then ARMs are
    still higher priced than low-end AVRs. Cortex has low power similar to
    AVR and MSP430, both running and in standby, and operates down to 2V.
    ARMs tend to come in bigger packages and require more external parts
    (caps), in general. As a wild guess, I would say 90% of high-end AVR
    applications could switch to an ARM. There are some ultra-low-power
    applications where AVR and MSP430 are still king and there is no ARM
    substitute.
     
    steve, Nov 10, 2008
    #12
  13. "steve" <> wrote in message

    Well that proves that google doesn't know everything :)

    All the things above are mentioned in the official UC3 presentation.
    The average Joe won't see UC3L/UC3C/UC3A3 until the beginning of next year.

    The technology behind the 1.8V devices is already available in the AT91SAM7L.
    The SAM7L runs the flash down to 1.55V.
     
    Ulf Samuelsson, Nov 10, 2008
    #13
  14. steve Guest

    Ok, the 7L are nice, though I wish they would expand the family.

    I've noticed that the Atmel slide packages say an FIR filter is 11
    times faster than on a Cortex-M3. That is hard to believe; I'm not sure
    why. Cortex is a 2-cycle MAC, AVR32 is single cycle; maybe with the 2
    wait states on the Cortex flash they came up with that number?


    * Sustained 33 DSP MIPS when doing vector sums
    for (sum = 0, i = 0; i < n; i++) sum = sum + C[i] * X[i];

    The 33 MIPS is at what clock speed?
     
    steve, Nov 11, 2008
    #14
  15. "steve" <> wrote in message


    Ok, the 7L are nice, though I wish they would expand the family.

    ==> There is a new family in the works with more SRAM.

    I've noticed that the Atmel slide packages say an FIR filter is 11
    times faster than on a Cortex-M3. That is hard to believe; I'm not sure
    why. Cortex is a 2-cycle MAC, AVR32 is single cycle; maybe with the 2
    wait states on the Cortex flash they came up with that number?

    ==> Not only that.
    I am not sure about 11 times though.

    You win by having
    * 1 clock cycle load instructions.
    Cortex-M3 implementations take at least 2, maybe more.
    If running from flash, there will be plenty of clocks.
    The AVR32 with the AHB will probably use two clocks
    to read from the flash at 66 MHz.
    Furthermore, this is non-blocking in some cases,
    since the core can read instructions from the instruction
    queue instead of from the flash.
    * The ability to use the upper part of the 32-bit register
    for MAC instructions, so you load TWO samples/coefficients
    in a single clock cycle.

    The unrolled loop then becomes:

    LOAD 1 clock
    LOAD 1 clock
    MAC 1 clock
    MAC 1 clock

    * The hidden accumulator
    The register file on a low-end RISC processor normally
    has only two read ports.
    You cannot do A = A + C*X in a single clock
    because you need to read A, C and X in the same clock cycle.

    The AVR32 has a "hidden" accumulator (patented) which
    allows you to use the two read ports for C and X.

    After the last MAC, you write the accumulator back to the
    register file, adding one clock of latency.

    * The AVR32 runs with 1 waitstate, while the STM32 runs with 2.

    * Sustained 33 DSP MIPS when doing vector sums
    for (sum = 0, i = 0; i < n; i++) sum = sum + C[i] * X[i];

    * The last feature is instructions which handle saturation
    the way a DSP should, and this has to be handled
    manually in other RISCs like CM3
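
    As an illustration of what handling saturation "manually" means, here
    is a minimal portable C sketch that clamps a 32-bit intermediate result
    to the 16-bit range. The function names are hypothetical, and a compiler
    may or may not map this to a dedicated saturate instruction on a given
    core.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit value to the signed 16-bit range instead of letting it
 * wrap around. */
int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Saturating 16-bit add: compute in 32 bits, then clamp. */
int16_t add_sat16(int16_t a, int16_t b)
{
    return sat16((int32_t)a + (int32_t)b);
}

/* Small self-check of the boundary cases. */
void sat16_selfcheck(void)
{
    assert(add_sat16(INT16_MAX, 1) == INT16_MAX);
    assert(add_sat16(INT16_MIN, -1) == INT16_MIN);
    assert(add_sat16(100, 23) == 123);
}
```

    The two compares per result are the manual overhead that a saturating
    instruction removes.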

    the 33 MIPS is at what clock speed?

    ==> 66 MHz (with a 100% unrolled loop)
    I.E: n = 6 =>

    LOAD 1 clock
    LOAD 1 clock
    MAC 1 clock
    MAC 1 clock
    LOAD 1 clock
    LOAD 1 clock
    MAC 1 clock
    MAC 1 clock
    LOAD 1 clock
    LOAD 1 clock
    MAC 1 clock
    MAC 1 clock
    ; Hidden writeback: 1 clock
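
    The clock count of the unrolled sequence above can be put into a small
    C model. This is only a sketch of the counting used in this post (two
    loads and two MACs per pair of samples, plus one writeback clock); the
    function names are made up, and flash wait states are ignored.

```c
#include <assert.h>

/* Clocks for n MACs in the fully unrolled sequence: every pair of MACs
 * costs LOAD, LOAD, MAC, MAC (4 clocks), plus 1 clock for the final
 * accumulator writeback. n must be even because each load fetches two
 * packed values. */
unsigned avr32_mac_clocks(unsigned n)
{
    assert(n % 2 == 0);
    return 2u * n + 1u;
}

/* MACs per second at a core clock of f_hz under the same model. */
double avr32_macs_per_second(unsigned n, double f_hz)
{
    return f_hz * n / avr32_mac_clocks(n);
}
```

    For n = 6 this gives the 13 clocks listed above. The writeback clock
    is paid once, so for long vectors the rate approaches half the core
    clock, which is where 33 DSP MIPS at 66 MHz comes from.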

    --
    --
    Best Regards,
    Ulf Samuelsson

    This message is intended to be my own personal view and it
    may or may not be shared by my employer Atmel Nordic AB
     
    Ulf Samuelsson, Nov 12, 2008
    #15
  16. Indeed, people are still spreading lies about Cortex-M3 as usual.
    Cortex-M3 loads take 2 cycles unless the next instruction is a load or
    store, in which case they take 1 cycle. So a sequence of N loads takes
    N+1 cycles.
    This is the same trick the ARM9E introduced a long time ago.
    The Luminary Cortex-M3 cores run with 0 wait states. But even with a
    wait state you don't necessarily see a slowdown if the fetch width is at
    least 64 bits (3-4 Thumb-2 instructions). Wait states primarily slow down
    branches.


    Actually Cortex-M3 has a saturate instruction.
    On Cortex-M3 this would take the following sequence:

    LDRH r2, [r0,#0]
    LDRH r3, [r0,#2]
    LDRH r4, [r0,#4]
    LDRH r5, [r1,#0]
    LDRH r6, [r1,#2]
    LDRH r7, [r1,#4]
    MLA r8,r2,r5,r8
    MLA r8,r3,r6,r8
    MLA r8,r4,r7,r8

    The LDRHs take 7 cycles (6 + 1), the MLAs take 6 cycles, so 13 cycles in
    total for 3 MACs, or 26 cycles for the 6 MACs of the AVR32 example.
    That is exactly twice as slow as AVR32 on the above code. So the claim of 11
    times slower is a total lie. Those Atmel marketeers should be ashamed of
    themselves.
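
    The same counting can be written as a small C model, using only the
    per-instruction costs given above: N back-to-back loads take N+1 cycles
    and each MLA takes 2 cycles. The function names are hypothetical, and
    repeating the 3-MAC block for longer vectors is an assumption of this
    sketch.

```c
#include <assert.h>

/* Cycles for one block as in the sequence above: 2 loads per MAC issued
 * back to back (N loads take N+1 cycles), then one 2-cycle MLA per MAC. */
unsigned cm3_block_cycles(unsigned macs_per_block)
{
    assert(macs_per_block > 0);
    unsigned loads = 2u * macs_per_block;
    return (loads + 1u) + 2u * macs_per_block;
}

/* Cycles for n MACs done as repeated blocks of macs_per_block MACs. */
unsigned cm3_total_cycles(unsigned n, unsigned macs_per_block)
{
    assert(macs_per_block > 0 && n % macs_per_block == 0);
    return (n / macs_per_block) * cm3_block_cycles(macs_per_block);
}
```

    cm3_block_cycles(3) is 7 + 6 = 13 cycles, and cm3_total_cycles(6, 3) is
    26 cycles, against 13 clocks for 6 MACs in the AVR32 listing.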

    Wilco
     
    Wilco Dijkstra, Nov 12, 2008
    #16
  17. steve Guest

    Ok, I took the Atmel published FIR filter cycle count and the STM FIR
    filter cycle count, both from their websites (using their optimized
    in-house DSP packages)

    http://www.atmel.com/dyn/resources/prod_documents/doc32076.pdf

    http://www.st.com/stonline/products/literature/um/14988.pdf

    Of course they don't give data for the same size FIR filter, so I have
    to normalize...

    For Atmel, a 64 point, 24 tap, 41 output FIR takes 2,439 cycles, which
    is 41*24 = 984 MACs, for a ratio of 2.478 cycles/MAC

    For STM Cortex at full speed with 2 wait states, a 63 point, 32 tap,
    32 output FIR takes 3,929 cycles, which is 32*32 = 1024 MACs,
    for a ratio of 3.83 cycles/MAC (2 wait states)

    a difference of 1.54X

    At zero wait states (below 24 MHz) STM reports 3,478 cycles,

    so 3.396 cycles/MAC (0 wait states), a difference of 1.37 times
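
    The normalization above is just total cycles divided by outputs times
    taps. A minimal C sketch, with a hypothetical function name and the
    cycle counts quoted in this post as the only inputs:

```c
#include <assert.h>

/* Cycles per MAC for an FIR run that produces `outputs` results with
 * `taps` multiply-accumulates each. */
double cycles_per_mac(unsigned cycles, unsigned outputs, unsigned taps)
{
    assert(outputs > 0 && taps > 0);
    return (double)cycles / ((double)outputs * (double)taps);
}
```

    cycles_per_mac(2439, 41, 24) is about 2.48 for the AVR32, and
    cycles_per_mac(3929, 32, 32) and cycles_per_mac(3478, 32, 32) are about
    3.84 and 3.40 for the STM32, so the ratios come out near 1.55 and 1.37.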
     
    steve, Nov 12, 2008
    #17
  18. And you are comparing 3 MACs with 6 MACs.

    6 MACs from memory using AVR32 = 13 clocks.
    6 MACs from memory using CM3 = 52 clocks or 4 x difference.

    --
    Best Regards,
    Ulf Samuelsson

    This message is intended to be my own personal view and it
    may or may not be shared by my employer Atmel Nordic AB
     
    Ulf Samuelsson, Nov 13, 2008
    #18
  19. No, read again. It's 13 cycles to do 3 MACs, so 26 to do 6 MACS.

    Wilco
     
    Wilco Dijkstra, Nov 13, 2008
    #19
