# Subsampling and interpolation in hardware using a LUT?

Discussion in 'Embedded' started by A.E lover, Feb 2, 2009.

1. ### A.E loverGuest

Hi all,
I read a technical report and get confused. I would like to ask you
guys for help.

In the technical report, they describe a LUT-based approach for
approximately implementing a function f(x).
Let's assume the input x is a 4-bit number. (The actual number in
the report is 25 bits, but take 4 bits for the sake of simplicity.)
The problem is that they do not have enough memory to store all 2^4
entries. Their solution is to sub-sample the original range
0...2^4-1 down to 2^2 entries, store the data in a 2-bit-address
LUT, and then use linear interpolation to obtain the remaining
values. To do the sub-sampling, I see they use the right-shift
operator.

I tried to infer how they implement this idea in more detail, but I
am confused: how do they populate the 2-bit-address LUT?

The original x is that : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15.

After applying x>>2, I guess the sampled values x_s are as follows:

0 4 8 12

If so, the 2-bit-address LUT is filled with f(x_s). The question is
how to obtain the remaining values f(13), f(14), and f(15). Perhaps
we need one more memory cell to store f(16) for the interpolation.

In your opinion, how can this work? If you had this problem, how
would you implement it?
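If I understand the scheme, a minimal sketch in C might look like the following. Everything here is my own assumption, not from the report: I use f(x) = x*x as a stand-in function, and I store one extra entry f(16) so that the top segment can be interpolated.

```c
#include <stdint.h>

/* Placeholder for whatever function the report implements. */
static uint32_t f(uint32_t x) { return x * x; }

/* 2-bit-address LUT for x = 0, 4, 8, 12, plus one extra entry f(16)
   so the last segment (x = 13..15) can be interpolated. */
static uint32_t lut[5];

static void init_lut(void)
{
    for (int i = 0; i <= 4; i++)
        lut[i] = f((uint32_t)i << 2);  /* f(0), f(4), f(8), f(12), f(16) */
}

/* Linear interpolation: the high bits (x>>2) select the segment,
   the low 2 bits give the position within it. */
static uint32_t f_approx(uint32_t x)
{
    uint32_t idx  = x >> 2;   /* segment index      */
    uint32_t frac = x & 3u;   /* fraction, 0..3 of 4 */
    uint32_t y0 = lut[idx];
    uint32_t y1 = lut[idx + 1];
    return y0 + ((y1 - y0) * frac) / 4u;  /* y0 + frac/4 * (y1 - y0) */
}
```

With the extra f(16) entry, f_approx(15) interpolates between f(12) and f(16) rather than needing special handling.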

Thank you

A.E lover, Feb 2, 2009

2. ### rickmanGuest

I'm not sure what you are looking for. Are the samples f(13) through
f(15) the only ones you are having trouble figuring out? To
interpolate, you need two values, one at either end of a range. So,
yes, you would need a value for f(16). Otherwise, you
would need to extrapolate by assuming these samples follow the same
line as the values between f(8) and f(12). If your actual data is
over a wide range and there are a large number of decimated data
samples, the extrapolation could work fairly well, but it would
require a special circuit to do that. It would just be a lot easier
to provide a value for f(16), even if this requires an extra location
in memory.
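The extrapolation alternative mentioned above can be sketched like this, under the same assumptions as before (a placeholder function g(x) = x*x; only four entries stored, no f(16)); the names are mine, not from the report:

```c
#include <stdint.h>

static uint32_t g(uint32_t x) { return x * x; }  /* placeholder f */

static uint32_t lut4[4];   /* only f(0), f(4), f(8), f(12) stored */

static void init_lut4(void)
{
    for (int i = 0; i < 4; i++)
        lut4[i] = g((uint32_t)i << 2);
}

/* For x = 13..15 there is no stored right-hand endpoint, so the
   last segment borrows the slope of the 8..12 segment. */
static uint32_t f_extrap(uint32_t x)
{
    uint32_t idx  = x >> 2;
    uint32_t frac = x & 3u;
    uint32_t seg  = (idx == 3) ? 2 : idx;        /* reuse 8..12 slope */
    uint32_t slope = lut4[seg + 1] - lut4[seg];  /* rise over 4 steps */
    return lut4[idx] + (slope * frac) / 4u;
}
```

Note the accuracy cost: for g(x) = x*x this gives f_extrap(15) = 204 against a true value of 225, which is the "special circuit" trade-off described above.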

Another possibility, a very messy one, would be to break the range
into n-1 subranges. In your case of 2^4, this leaves three subranges
of 5 elements each, with 4 endpoints, giving 3*5+1=16. So it fits
well in this case. If x_s is larger, say 256, after decimating an
original x of 65,536 by 256, the 65,536 samples must be subdivided
into 255 ranges of 257 points, with an extra end point. If x_s is
not the square root of x, this does not result in evenly distributed
interpolation subranges.

Of course, this is just me talking off the top of my head. Others may
have worked on this before and found an easier way to deal with it.

Rick

rickman, Feb 2, 2009

3. ### Paul CarpenterGuest

Sounds like all they have done is reduced the number of stored points.
No, you need at least an implied end value, if not an actual one
(either first or last). To know what your end points are: if the
process always starts at 0, use points 3, 7, 11, 15; if it always
ends at 15, use your set. Otherwise you need 5 points.
Either that or f(0), depending on your application.
A lot depends on your application.
--
Paul Carpenter |
<http://www.pcserviceselectronics.co.uk/> PC Services
<http://www.pcserviceselectronics.co.uk/fonts/> Timing Diagram Font
<http://www.gnuh8.org.uk/> GNU H8 - compiler & Renesas H8/H8S/H8 Tiny
<http://www.badweb.org.uk/> For those web sites you hate

Paul Carpenter, Feb 2, 2009
4. ### Paul KeinanenGuest

To reduce computational load, you can store both the function values
for x = 0, 4, 8 and 12 and also the multiplier (slope) by which the
fractional part of x is multiplied; values larger than 12 are then
extrapolated automatically.

Also storing square and even cube terms for each table entry will
follow the function more accurately, letting you reduce the number
of elements in the table.
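A sketch of this offset-plus-slope table, assuming the same placeholder f(x) = x*x and 4-bit input used earlier in the thread. The table names are mine, and I use a simple endpoint fit here; a per-segment least-squares fit would minimize overall error instead:

```c
#include <stdint.h>

/* Per-segment offset and slope, precomputed offline from f(x) = x*x
   with an endpoint fit.  The last slope encodes f(16) - f(12), so no
   fifth function-value entry is needed. */
static const uint32_t seg_off[4]   = {0, 16, 64, 144};   /* f at 0,4,8,12   */
static const uint32_t seg_slope[4] = {16, 48, 80, 112};  /* f(end)-f(start) */

static uint32_t f_segment(uint32_t x)
{
    uint32_t idx  = x >> 2;   /* segment select     */
    uint32_t frac = x & 3u;   /* fraction, 0..3 of 4 */
    return seg_off[idx] + (seg_slope[idx] * frac) / 4u;
}
```

This reproduces the same results as interpolating against a stored f(16), but with one multiply-add per lookup and no extra endpoint cell.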

Paul

Paul Keinanen, Feb 2, 2009
5. ### rickmanGuest

THAT is the way to do it. You don't need to store *any* data points
in memory. The slope and offset can be stored for each segment, and
this can be applied to whatever segment sizes you need; as you say,
it reduces the computational load. These values can be precomputed
to *minimize* the overall error, rather than storing some points
exactly and having an error on all the interpolated points. That
interpolation error is mostly in the same direction within any given
segment, so it is not a minimum error overall.

I knew there had to be a good solution to this.

Rick

rickman, Feb 2, 2009
6. ### Paul KeinanenGuest

Trigonometric functions are often calculated by first taking the
argument modulo 2*pi to get it into 0..2pi. The two most significant
bits select the quadrant, the next 2-4 bits select the segment, and
the remaining bits are inserted into the polynomial a + bx + cx² +
dx³ + ..., in which a, b, c, d, ... are constants stored for each
segment.

For single precision, a 3rd- or 4th-degree polynomial is usually
enough; for double precision, 7th or 8th degree is usually needed,
depending on the number of segments used.

Depending on the processor's computational speed and the available
constant-memory size, quite a lot can be traded between speed and
size.

Paul

Paul Keinanen, Feb 2, 2009