# floating point to fixed point conversion

Discussion in 'Embedded' started by riya, Feb 21, 2006.

1. ### riyaGuest

hello guys,

I need some help from you. I am doing a DSP project and for that I need
to do some C coding for the conversion of sample data which is in
floating point representation to fixed point representation.
the sample data is in floating point like

2.296968
-0.448350
-2.779426

My DSP algorithm is implemented in C and is supposed to be using fixed
point representation.
The above data is intended to be converted to fixed integer format. I
request you to help me out regarding this conversion. I will be very
glad if you give me some hints or algorithms for this conversion.

riya, Feb 21, 2006

2. ### Tim WescottGuest

If you must post the same question to multiple newsgroups please cross-post.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Tim Wescott, Feb 21, 2006

3. ### Isaac BosompemGuest

I will use single precision.

As you may or may not know the IEEE754 format is as follows:

SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM

S = Sign bit (0 = +, 1 = -)
E = 8-bit biased exponent (Bias = 0x7F)
M = Fractional portion/ Significand

The IEEE754 has a single implied integer bit of 1 (which is excluded
from the mantissa).

Really conversion from FP to Fixed point will be shifting and maybe
negation as well (only whole part, when negating, DO NOT touch the
fractional portion of the fixed point value).

Extract the 23-bit mantissa and OR in the implied integer bit
(mantissa | 0x800000).
Zero extend the result to 32 bits (ideally larger, since you will risk
losing some integer bits; if the values that you have given represent
the range of FP values expected, then 32 bits will be sufficient).

Shift left this value, and decrement the exponent with each shift, if
unbiased exponent is positive.

Shift right this value, and increment the exponent with each shift, if
unbiased exponent is negative.
Repeat until the exponent = 0 (remember to remove the bias first).
Take bits 31-24 as your integer portion and bits 23-0 as your
fractional portion.
Say if you are using 16.16 fixed point you will have to truncate the
fractional portion. So the least significant byte of the fractional
portion will have to be discarded.

Last you will need to test the sign value to determine if negation of
the whole portion should take place.

I don't know how efficient this algorithm is or if there are any
mistakes, hopefully someone will point this out. If I was given this
task, that is how I would attack it.
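
The steps above could be sketched in C roughly as follows. This is only a minimal sketch, not a tested library routine. The function name `float_to_q16_16_bits` is made up for illustration, and it assumes `float` is a 32-bit IEEE 754 single. Instead of looping one bit at a time, it computes the net shift from the unbiased exponent in one go. By this sketch's accounting the implied bit (bit 23) is the units bit, so the net shift to 16 fractional bits is `exponent - 7`; that offset is worth checking against the bit positions described above. It also negates the whole two's-complement word rather than only the integer part, which is an assumption about the target representation. Out-of-range inputs saturate, and zero or denormal inputs return 0.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: IEEE 754 single -> Q16.16 by bit manipulation.
 * Assumes float is a 32-bit IEEE 754 value. */
int32_t float_to_q16_16_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);           /* reinterpret the float's bits */

    uint32_t sign     = bits >> 31;           /* S: 0 = +, 1 = - */
    uint32_t biased   = (bits >> 23) & 0xFFu; /* E: 8-bit biased exponent */
    uint32_t mantissa = bits & 0x007FFFFFu;   /* M: 23-bit fraction */

    if (biased == 0)
        return 0;                             /* zero or denormal: below Q16.16 resolution */

    int32_t exponent = (int32_t)biased - 0x7F; /* remove the bias */
    if (exponent > 14)                        /* |f| >= 32768, or Inf/NaN: saturate */
        return sign ? INT32_MIN : INT32_MAX;

    mantissa |= 0x00800000u;                  /* restore the implied integer bit */

    /* mantissa holds |f| * 2^(23 - exponent); Q16.16 wants |f| * 2^16. */
    int32_t shift = exponent - 7;
    uint32_t magnitude;
    if (shift >= 0)
        magnitude = mantissa << shift;
    else if (shift > -32)
        magnitude = mantissa >> (-shift);     /* truncates the low bits */
    else
        magnitude = 0;

    return sign ? -(int32_t)magnitude : (int32_t)magnitude;
}
```

With the sample value 2.296968 this gives 150534, which is 2.296968 * 65536 truncated.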

-Isaac

Isaac Bosompem, Feb 22, 2006
4. ### Charles OramGuest

Depending on what precision you need, the simplest way is just to multiply
the floating point numbers by an integer constant, then do all your maths
processing in integers. To convert back to the floating point values just
divide by that same integer constant. Think of it like changing your units
from metres to millimetres.
For example, you could use 32-bit integers as your fixed point numbers -
multiply your floating point numbers by 65536 (or shift 16 bits) to get the
fixed point numbers, then divide by 65536 to get back.
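
As a rough illustration of the scaling approach, here is a minimal sketch in C. The names `to_q16_16` and `from_q16_16` are invented for this example, and it assumes the input magnitude is below 32768 so the result fits in 32 bits. It rounds to nearest instead of truncating, which is a choice made here rather than something stated above.

```c
#include <stdint.h>

#define Q16_16_ONE 65536.0   /* 2^16, the scale factor (hypothetical name) */

/* Convert a floating point value to Q16.16, rounding to nearest.
 * Assumes |x| < 32768 so the scaled value fits in an int32_t. */
int32_t to_q16_16(double x)
{
    double scaled = x * Q16_16_ONE;
    return (int32_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert a Q16.16 value back to floating point. */
double from_q16_16(int32_t q)
{
    return (double)q / Q16_16_ONE;
}
```

For the sample data, `to_q16_16(2.296968)` yields 150534 and `to_q16_16(-0.448350)` yields -29383.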

- Charles

Charles Oram, Feb 22, 2006
5. ### Isaac BosompemGuest

This will work and will be a lot easier if your target has the
capability (or instruction) to convert a fp value to an integer value
(like x86).

Isaac Bosompem, Feb 22, 2006
6. ### Meindert SprangGuest

It's not that simple. If you start with metres and you want to calculate 1m
x 1m, the result is obvious. If you now convert 1m to 1000mm first and
simply do 1000mm x 1000mm, the result is not quite what you want.
DSPs solve this by switching the MAC into either integer or fractional
mode, where the latter shifts the result one bit left after each multiply.

So you have to follow some rules of thinking. If you convert your floats to,
say 1.15 fixed point and you multiply two of these, the result is 2.30,
possibly truncated to 2.14 (16 bits). As long as you keep this in mind, you
can multiply anything like this.
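
To make the format bookkeeping concrete, here is a small sketch in C of a 1.15 x 1.15 multiply. The function names are hypothetical. The full product is kept in 32 bits as 2.30, and the second function keeps only the top 16 bits to get 2.14. Division is used in place of a right shift so that negative products truncate toward zero in a portable way; a DSP would typically use an arithmetic shift.

```c
#include <stdint.h>

/* 1.15 x 1.15 -> 2.30: the fractional bit counts add (15 + 15 = 30). */
int32_t q15_mul_full(int16_t a_q15, int16_t b_q15)
{
    return (int32_t)a_q15 * (int32_t)b_q15;
}

/* Reduce the 2.30 product to 2.14 by dropping the low 16 fractional bits. */
int16_t q15_mul_to_q2_14(int16_t a_q15, int16_t b_q15)
{
    return (int16_t)(q15_mul_full(a_q15, b_q15) / 65536);
}
```

For example, 0.5 in 1.15 is 16384; multiplying it by itself gives 268435456 in 2.30 and 4096 in 2.14, both of which represent 0.25.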

If you look at the numbers the OP gave:
2.296968 can be represented as 2.14 and -0.448350 in 1.15. The result of the
multiplication will be in 3.29 format. Additions do not have this effect.
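
A sketch of that mixed-format multiply in C might look like this. The function names are invented here. One assumption to flag: the 2.14 operand is held as an unsigned 16-bit value, because 2.296968 lies outside the [-2, 2) range of a signed 16-bit 2.14 number. The 1.15 operand is signed, and the 32-bit product is treated as signed 3.29.

```c
#include <stdint.h>

/* Unsigned 2.14 times signed 1.15 -> signed 3.29 (14 + 15 = 29 fractional bits).
 * Assumption: the 2.14 operand is unsigned so that values up to just under 4 fit. */
int32_t mul_u2_14_by_q1_15(uint16_t a_u2_14, int16_t b_q1_15)
{
    return (int32_t)a_u2_14 * (int32_t)b_q1_15;
}

/* Interpret a 3.29 product as a floating point value (for checking only). */
double q3_29_to_double(int32_t p_q3_29)
{
    return (double)p_q3_29 / 536870912.0;   /* 2^29 */
}
```

With 2.296968 stored as 37634 (scaled by 2^14) and -0.448350 stored as -14692 (scaled by 2^15), the product is -552918728, which reads back as roughly -1.0299 in 3.29.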

So as long as you keep the resulting format in mind, you can indeed multiply
each float by for instance 2^16 to convert them into fixed-point numbers.
No, you must divide by 65536 * 65536
Shift the input 16 bits left, shift the output *17* bits right.

Meindert

Meindert Sprang, Feb 22, 2006
7. ### Charles OramGuest

You're dead right - I got confused because I was working on a calculation at
the time where only one of the numbers being multiplied has been scaled by
16 bits. The difficult part about doing it yourself (with no support from your
processor) is that you have to go through each calculation and check that
you are scaling the output correctly each time and that your calculations
are not overflowing the integer size.
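
One way to handle both concerns for a 16.16 multiply is sketched below in C. The name `q16_16_mul_sat` is hypothetical, and the sketch assumes the compiler provides a 64-bit integer type, which may be slow or unavailable on small targets. The product of two 16.16 values carries 32 fractional bits, so it is rescaled by 65536 and then clamped to the 32-bit range instead of being allowed to wrap.

```c
#include <stdint.h>

/* Multiply two Q16.16 values with rescaling and saturation.
 * Assumes a 64-bit intermediate (int64_t) is available on the target. */
int32_t q16_16_mul_sat(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;  /* 32 fractional bits */
    product /= 65536;                           /* back to 16 fractional bits, truncating toward zero */

    if (product > INT32_MAX)
        return INT32_MAX;                       /* clamp positive overflow */
    if (product < INT32_MIN)
        return INT32_MIN;                       /* clamp negative overflow */
    return (int32_t)product;
}
```

For instance, 2.0 times 3.0 (131072 and 196608) gives 393216, which is 6.0, while 200.0 times 200.0 exceeds the 16.16 range and clamps to `INT32_MAX`.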

- Charles

Charles Oram, Feb 24, 2006
8. ### Meindert SprangGuest

But the good part of this is when you get the hang (and the discipline) of
it, you can do the calculations in any fixed point representation you like.
You can even treat numbers in one format at a certain stage in the
calculation and then move on just "thinking" them in a different format in
the next stage.

Meindert

Meindert Sprang, Feb 24, 2006