1. This forum section is a read-only archive which contains old newsgroup posts. If you wish to post a query, please do so in one of our main forum sections (here). This way you will get a faster, better response from the members on Motherboard Point.

linear interpolation / Assembler / ATMega32

Discussion in 'Embedded' started by Rolf Bredemeier, Jun 12, 2004.

  1. Hi all,

    i`m searching an ASM-example for 2d lin. interpolation.
    (No need for floating point)

    In the EEPROM is stored an table, like so

    ..db 0, 75
    ..db 40, 37
    ..db 60, 48
    ..db 80, 180
    ..db 100, 170

    (The first byte in every .db line is the index, the second the value.)

    Now i need the interpolated value for an index 55.

    The calculation for that is not so difficult:

    48 - 37
    X = 37 + --------- * (55 - 40)
    60 - 40

    So result is 45 for the index 55

    But, how is this do do in Assembler?
    And how to handle negative slope?

    Perhaps, someone can post example code, or a link?

    Thanks for your time,and best regards

    Rolf Bredemeier, Jun 12, 2004
    1. Advertisements

  2. On Saturday, in article <cafeto$1mc$03$-online.com>
    But you do need FIXED point..
    Or in integer maths (assembler)

    X = 37 + -- * 15

    X = 37 + 0 * 15 = 37

    What precision do you need??

    Does your hardware support 8 x 8 or 16 x 16 bit Multiply?

    Does your hardware support 16 / 16 bit?

    For most cases work out the precision and accuracy you require in BINARY
    places, then multiply up numbers that are used in the divide and multiply
    by at LEAST that number of places do the calculation and remove fixed
    point multiplier to give integer result you need.

    Remember 6 binary bits of precision is not the same as 6 binary bits of
    accuracy as some fractions are always awkward (prime numbers especially).
    Depends on the target and what instructions it has. Also what register
    and data sizes it supports in assembler. As well as the points above.
    That is the least of your worries, you need to trap for divide by zero.
    negative slope just means you add a negative number! Remember that
    a positive / negative gives a negative number so does a negative / positive
    number, similarly a positive * negative number gives a negative number.
    You need to give more info as on controllers I use (H8 series) a lot of
    this is easy because it has 32bit registers as well hardware mult and div

    On some processors you will need to do multi byte arithmetic as they only
    support 8 bit data types.
    Paul Carpenter, Jun 12, 2004
    1. Advertisements

  3. Rolf Bredemeier

    R Adsett Guest

    Or more properly (taking into account finite precision, elementary
    numerical analysis and the fact that 60-40 = 20 ;) )

    X = 37 + (11*15)/20
    X = 37 + 165/20
    X = 37 + 8
    X = 45

    Of course this chops rather than rounds so the question of precision
    still comes up. Also you need an intermediate value of 2n+1 bits (where
    n is the number of bits in your original values), check against overflows
    and of course against divide by zero. You also need to be careful
    that the table step size isn't too large. Some of those checks can be
    eliminated if you can guarantee the composition of the table.

    As long as you don't need better than +-1 than there is no need to have
    any representation other than whole numbers.

    R Adsett, Jun 12, 2004
  4. On Saturday, in article
    Damn retyped it several times to line up everything then put wrong value
    Alternative method..
    Doesn't it always come up...
    As one says depending on required precision...
    Paul Carpenter, Jun 12, 2004
  5. Rolf Bredemeier

    R Adsett Guest

    I hate it when that happens :)
    Mind you if the OP needs better than +/- 1 the first thing to address is
    the data representation in the table.

    I don't know if the AVR has (or has a SW library that has) an 8bit x 8bit
    to 16bit multiply and a 16bit / 8 bit -> 8bit divide but if it does than
    with a few 'reasonble restrictions' on the table there will be no worry
    about overflows or loss of precision.

    R Adsett, Jun 12, 2004
  6. Rolf Bredemeier

    Unbeliever Guest

    Just a suggestion or three: if you drop the index and add a value for 20,
    you'll use less eeprom space. you will know that index = offset * 20. Your
    values are very coarse. and the result is not likely to be very accurate.
    This appears to be a complex curve (at least a cubic). I would throw in
    some extra values. You could insert a value every 10, or perhaps 8 (the
    arithmetic becomes simpler/faster/cheaper if the index is a power of 2 as
    multiplies and particularly divides can be replaced by shifts. ..... but
    I'm not going to write the assembler for you, sorry.

    Unbeliever, Jun 13, 2004
  7. Rolf Bredemeier

    Unbeliever Guest

    You'll notice that the denominator is a constant. There are two ways of
    handling this simply. The first is to multiply by a constant of the
    reciprocal of the denominator (i.e. 1/20 - or 2^N * 1/20 in the exemplar
    case), or better still change the interval to 2^N (e.g. 8 or 16) and shift
    right N to accomplish the division.

    Unbeliever, Jun 13, 2004
  8. Hi Paul!

    I think no, because +/-1 is good enough.

    In the subject of my posting the type of MC is given, AVR ATMega32.
    This device does not support DIV, and has only 8 bit Registers.
    But math software routines are available.

    Unfortunately i'm not an good asm-programmer.
    Due to that fact, to write such an interpol function in asm, it will
    take the rest of my life.
    So i need a piece of code, which i can use as "black box". Values in,
    result out...

    Anyway many thanks for your answer, it shows me some significant details
    and explain the principle.

    And, of course, also thanks do Mel and Robert.

    Regards and greetings from Petershagen, Rolf
    Rolf Bredemeier, Jun 13, 2004
  9. Hi Alf!

    My values only an example, in reality there are more entrys in the table,
    perhaps 20 for an not linear analog value from 0 to 100 percent.

    I am afraid for that, but i understand ;-))

    Best regards, and thanks for the tip to do index steps by eight.

    Rolf Bredemeier, Jun 13, 2004
  10. If possible, use a table with evenly spaced indexes, thus eliminating
    the need to store the index into the table. If possible, fill the
    tables with indexes that are spaced with some power of 2, e.g. using
    point values at 0, 16, 32, 48 and so on, thus the range is always 16
    and the final division by 16 can be implemented by a 4 position right
    If the indexes are spaced by 16 into a byte table, simply shifting the
    index right by 4 places (3 in this case) will give the offset into a
    byte table. Thus, the value can be found from address Table+3 in this
    Do the multiplication before division to avoid truncation problems.
    The multiplication should produce a 16 bit result.

    If the difference between the index values is 16 in the table, the
    division by 16 can be replaced by shifting the 16 bit product right
    by four places.
    If signed multiply and division are used, there should not be a
    problem. If arithmetic shift right is used to handle 2's complement
    negative numbers, the results might not be quite as expected, since
    e.g. -1 >> 4 = -1.

    If the 16 bit multiplication product is negative, convert it first
    from 2's complement to 1's complement (or negate it), shift it right 4
    places and then convert it back to 2's complement (or negate the
    result respectively). If the product was positive, just shift it right
    4 places.

    Paul Keinanen, Jun 13, 2004
  11. Rolf Bredemeier

    Ville Voipio Guest

    You have already got several good answers, but there are two
    things which might help you:

    #1 Can the intervals in the EEPROM be a power of two?
    #2 There is a simple way for proper rounding of the results.

    If you could make this table so that the intervals are powers
    of two, then you can replace the division by a simple shift
    (arithmetic shift right, ASR). Then you can also find the
    table position by shifting instead of dividing. This will
    save a lot of time and code.

    Let's say you have the tabulated data with 32-unit intervals.
    Then when you need to find the value for f(x):

    ; index to the table (i <- x/32, rounded down)
    i = x / 32

    ; sub-index (remainder of the previous calculation)
    s = x mod 32

    ; base value from the table
    b = t

    ; difference to the next tabulated value
    d = t[i+1] - t

    ; the result
    r = b + s/32 * d

    Now, here we still have all the precision and performance worries.
    This can be simplified significantly because of the simpler division
    and modulo with a power-of-two.

    i = x shr 5
    s = x and 0x1f
    b = t
    d = t[i+1] - t
    r = b + (s * d) asr 5

    Beware, be sure to check there is no possibility for any
    overflow on the last row. Also, if d can be negative, you have to
    use asr intead of shr. Alternatively, you may branch the

    if t[i+1] < t then
    d = t - t[i+1]
    r = b - (s * d) shr 5
    d = t[i+1] - t[1]
    r = b + (s * d) shr 5

    This enables the use of positive (unsigned) numbers only.
    This will give you two bits more headroom and sometimes simpler
    code, depending on the machine. (This does not apply to ATmega,
    since there is a built-in signed multiply instruction.)

    There is still one more problem: rounding. Basically, the standard
    method is to do something like this:

    r = b + ((s * d shl 1) + 1) asr 6

    This adds a one bit bias to the first drop-out bit. Everything
    above 1/2 will be rounded up, just as it should be. Of course,
    you may shift your look-up table by one if you want to avoid the
    shift at this point.

    If you want to use this rounding algorithm with the sign branching
    shown above, be sure to _subtract_ the half-bit in the negative
    branch. Otherwise -1.5 will round to -2 instead of -1. This
    will introduce an odd jump around zero. Difficult to spot
    and gives very annoying noise (been there, done that).

    There are at least umpteen ways to polish the algorithm to
    give the fastest performance. Some machines might benefit from
    using biased unsigned instead of signed, and the best way to
    perform the look-up depends on the machine. If you have a
    lot of room in the memory, you may calculate the differences

    If you do not have a hardware multiplier, you may be fastest
    and smallest off by rolling your own: (Doing this on ATmega
    isn't a very good idea, as there is the hw multiplier.)

    i = x shr 5

    ; this becomes unnecessary:
    ; s = x and 0x1f

    r = t
    d = t[i+1] - t

    ; check the highest bit of s (16's in x)
    a = 0
    if (x and 0x10) = 0x10 then
    a = a + d

    ; second highest (8's)
    d = d shr 1
    x = x shl 1
    if (x and 0x10) = 0x10 then
    a = a + d

    ; 4's
    d = d shr 1
    x = x shl 1
    if (x and 0x10) = 0x10 then
    a = a + d

    ; 2's
    d = d shr 1
    x = x shl 1
    if (x and 0x10) = 0x10 then
    a = a + d

    ; 1's
    d = d shr 1
    x = x shl 1
    if (x and 0x10) = 0x10 then
    a = a + d

    ; round-up
    a = a + 1
    a = a shr 1

    r = r + a

    In the code above, the multiplication loop has been rolled out,
    but there is no reason why it couldn't be written as a loop
    (apart from slight performance hit and need for yet another
    variable). Many small processors offer a simple method for
    checking a bit, something like "skip if bit n in register r is

    One of the advantages of this method is that it requires only
    one extra bit. If you can toss rounding (by taking it into
    account in the tabulated data!), then even this bit is not

    - Ville
    Ville Voipio, Jun 13, 2004
  12. Rolf Bredemeier

    Ville Voipio Guest

    Well, no problem, as:

    -17 >> 4 = 11101111b >> 4 = 11111110b = -2
    -16 >> 4 = 11110000b >> 4 = 11111111b = -1
    -1 >> 4 = 11111111b >> 4 = 11111111b = -1
    0 >> 4 = 00000000b >> 4 = 00000000b = 0
    15 >> 4 = 00001111b >> 4 = 00000000b = 0
    16 >> 4 = 00010000b >> 4 = 00000001b = 1

    So, everything is rounded down:

    -2 <= x < -1 -> -2 (bin average -1.5)
    -1 <= x < 0 -> -1 (bin average -0.5)
    0 <= x < 1 -> 0 (bin average 0.5)
    1 <= x < 2 -> 1 (bin average 1.5)

    The results are half-step lower than they should be (they should
    coincide with the bin averages). This can be corrected by
    adding 8 (that is 0.5 * 2^4) to the initial value:

    (-9 + 8) >> 4 = 11111111b >> 4 = -1
    (-8 + 8) >> 4 = 00000000b >> 4 = 0
    ( 7 + 8) >> 4 = 00001111b >> 4 = 0
    ( 8 + 8) >> 4 = 00010000b >> 4 = 1

    Now the limits are exactly where they shoud be:

    -9 / 16 = -0.5625 -> -1
    -8 / 16 = -0.5 -> 0
    7 / 16 = 0.4375 -> 0
    8 / 16 = 0.5 -> 1

    Everything is rounded to the nearest integer, halves are rounded

    So, the plain ASR works fine apart from the average -0.5 bias in
    the result. The bias can be compensated for in the tabulated data
    or by adding the half before shifting.

    The worst thing you may do is to unsignedify the number, round
    it down, and then put the sign back. Because:

    -17 / 16 -> -(17 >> 4) -> -1
    -16 / 16 -> -(16 >> 4) -> -1
    -15 / 16 -> -(15 >> 4) -> 0
    0 / 16 -> ( 0 >> 4) -> 0
    15 / 16 -> (15 >> 4) -> 0
    16 / 16 -> (16 >> 4) -> 1

    The function becomes dead around zero! The result is zero over
    two units, and everything linear becomes non-linear. Cross-over
    distortion sets in...

    Unfortunately, this is exactly what C does when you write:

    int z, x, y;

    z = x / y;

    There is a technical reason for this; division is easier to do,
    if you do not need to write different branches for different
    signs. C rounds towards zero. Not up, not down, but towards
    zero. (This may even be a desired behaviour in some cases,
    but with measurements it is a very bad thing.)

    One simple way around this is to use biased unsigned divisions:

    // bias by 15:
    z = (x + 15 * y) / y - 15;

    If x + 15 * y always >= 0. This is rather easy with constant divisors.
    With variable divisors finding a suitable constant is very difficult.
    Pick a too low one, and the sum will underflow. Pick a too high one,
    and the sum will overflow.

    The case with the possibility of a negative divisor is even
    more complicated. Then the only alternative is to branch the
    calculations by the signs before doing anything;

    int x, y, z;
    unsigned int ax, ay, az;
    bool sign;

    // find out the sign of the result and the absolute values of x, y
    if (x < 0)
    ax = -x;
    sign = false;
    ax = x;
    sign = true;

    if (y < 0)
    ay = -y;
    sign = ~sign;
    ay = y;

    // if the sign of the result is negative, round away from zero
    if (sign == false)
    az = (ax + ay - 1) / ay;
    z = -az;
    az = ax / ay;
    z = az;

    Yes, there are a lot of extra variables which may be avoided
    by using type casts. However, I think it is better avoid the
    type casts and let the compiler get rid of the extra variables.

    So, twenty-something lines of code to achive a very simple
    thing. The good thing is that this does not give a very large
    performance hit, as the signed division algorithm does this

    To make the algorithm round right (towards nearest, halves up)
    requires only the following changes:

    negative branch:
    az = (ax + ay - 1 - ay/2) / ay

    positive branch:
    az = (ax + ay/2) / ay

    Even though it is very tempting to say:

    ay - 1 - ay/2 == ay/2 - 1,

    don't. Because it isn't. For, e.g., ay = 17:

    17 - 1 - 8 == 8 - 1
    8 == 7


    I know this is somewhat complicated. The effect of false rounding
    is usually quite small. I personally crashed into this well-known
    (generally, that is, not well-known by me) problem when I was
    looking at a F-transform of some real-world data. There were some
    harmonics there shouldn't've been. I scratched through
    the skin of my head before finding out the problem.

    If off-by-one is bad, off-by-half is not much nicer, either.

    - Ville
    Ville Voipio, Jun 13, 2004
  13. Hello Ville, hello Paul!

    I'm very astonished about the lot of answers you
    have written for me.

    Many, many thanks! :))

    So my goal is to do the implementation personally.
    While doing this, i will learn much for writing
    better asm code.

    Perhaps, in 10 or 12 years, i can help other
    people for implementing algorithm! ;-)

    Thanks again for the lot of time you spend to me.
    Now i will try it!

    Best regards, Rolf
    Rolf Bredemeier, Jun 13, 2004
  14. Rolf Bredemeier

    dwight elvey Guest


    Most high level languages round towards zero, mostly
    because it is traditional. Doing floored rounding makes
    more sense but even that has to be done carefully.
    In something like DSP, one can accumulate a DC offset
    because of floored rounding. As you point out, you
    get a cross over distortion for rounding towards zero
    so there is no absolutely right way to do this.
    One has to look carefully at the application one intends
    to use it in before determining which way to go on this.
    One might even use both methods in the same stretch of code.
    I once designed a XY table that requires 24 bit math
    that a fellow was doing 16 bit math by simply extending
    it to 32 bits while using the round towards zero.
    It was a disaster. The table would occasionally jump
    by a sizable amount while incrememnting by a joy stick.
    This was a table that should have positioned to about
    two 10/1000th of an inch. This was traced to his round
    towards zero math in the Pascal he was using to do the
    extended math.
    It would have been nice if the uP makes had included
    both types in the hardware so we could choose which
    to use and where.
    dwight elvey, Jun 15, 2004
  15. I think it should be more appropriate to talk about truncation in this
    case and this behaviour makes sense in many situations.

    IMHO, rounding should be reserved for rounding towards the nearest

    At least old Pascal did not allow direct assignment of floating point
    values to integer, but instead the trunc or round functions had to be
    This is a third method and makes sense in this interpolation case,
    however, this is more of a special case.
    It should be noted that if the hardware uses 1's complement or
    sign/magnitude representation, there is the problem with +0 and -0 and
    thus the potential for "cross over distortion".

    Paul Keinanen, Jun 15, 2004
  16. On Sun, 13 Jun 2004 07:38:13 +0200, "Rolf Bredemeier"

    Than I'd say: Write it in C, compile it to assembly and see where you
    can improve it.
    But first: Improve your C code !
    42Bastian Schick, Jun 17, 2004
  17. Rolf Bredemeier

    George Guest

    Here's a suggestion. Do you have a C compiler for your CPU. If not
    try to get a demo version. THen write the solution is C and look at
    the asm output. Then you have the asnwer you're looking for.

    George, Jun 17, 2004
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.