
Position independent code with position dependent data ?

Discussion in 'Embedded' started by nono240, Jun 30, 2010.

  1. nono240

    nono240 Guest

    Hi there !

    My CPU has no MMU, very little RAM (8KB), and is running a modified
    FreeRTOS. I'd like to have the ability to "load" and run some code
    from USART/DATAFLASH to FLASH as a RTOS task. Of course, for
    convenience, the compiled code must be fully position independent.
    Using the -fPIC or -fpic option and looking at the assembler, the code
    seems OK : a dynamically computed offset is applied to every
    operation.

    BUT, looking more closely, the base offset is applied to both DATA and
    CODE ! While this is the expected behavior for CODE (running anywhere in
    FLASH), moving my CODE around the flash space doesn't mean moving
    my RAM ! This only makes sense when executing everything from SDRAM !

    I'm looking for a solution to generate position independent *code*,
    but with position dependent *data* using GCC/LD... Any help ?
     
    nono240, Jun 30, 2010
    #1

  2. nono240

    David Brown Guest

    Unless you want to download multiple tasks like this, and have them
    stored in arbitrary places in flash (and ram), then there is no need for
    position-independent data or code. Simply arrange (by linker script
    magic, typically) for the code and data to link at the specific
    addresses of the flash and ram slots you have available.
     
    David Brown, Jun 30, 2010
    #2

  3. nono240

    Jon Kirwan Guest

    Position independent code (in my mind, anyway) is machine
    code that does not require a relinking or fixup phase in
    order to be moved to a different address in memory. There
    are a lot of assumptions in that statement.

    Since your program requires _flash_ and _sram_ to operate and
    you don't say that much about the cpu and software environs,
    I can only add a few thoughts.

    Code and constants used, either directly as data, such as a
    table of values, or else copied into sram prior to the start
    of the program to initialize writable variable instances that
    need specific initial values to be present, can be placed
    into flash either as a single segment or multiple ones.

    However, with sram fixed in one place and flash in another
    place (I'm assuming that's the case as you point out there is
    no MMU), there is no question about the fact that there are
    at least two separate segments in your situation. The base
    address of the flash-located segment might be the PC register
    so that this flash block can be moved around freely and uses
    the PC register as a cheap way to figure out where its own
    stuff is at (assuming the processor supports that), but that
    won't work for the sram data instance segment which is
    obviously located "elsewhere." Somehow, a base address for
    that region needs to be made available to your code and
    applied at run time. What mechanism is available for that?
    Can you reserve a segment for it located at a physical
    address that is not permitted to change, for example? In
    other words, reserve it?

    Also, there are (not infrequently) specific hardware
    peripherals and other features that may be "memory mapped" to
    specific addresses. Clearly, these should not depend upon
    the flash memory locations of your code.

    How all this gets done does depend on what else you are
    doing. And you haven't talked about that. I don't think
    there is a universal, always-works-everywhere, answer. More
    info is needed, I believe.

    The above is general theory and applies broadly.

    Jon
     
    Jon Kirwan, Jun 30, 2010
    #3
  4. nono240

    Jon Kirwan Guest

    And I'm still not sure _why_ you actually want position
    independence in code and data. If this is the entire
    application and there is nothing else present, no operating
    system for example, then why do you care? Just let the tools
    do their job using mainstream approaches.

    Jon
     
    Jon Kirwan, Jun 30, 2010
    #4
  5. A few decades ago, this was a typical minicomputer configuration with
    8-64 KiB of core, occupying at least one rack.
    Running any kind of pre-emptive multi tasking requires some kind of
    per task stack.

    To be practical, this requires a processor with a stack pointer and
    addressing modes that are stack pointer relative. As an absolute
    minimum, some index register+offset addressing is required (unless
    self modified code is used :).
    You really have to study the instruction set of your processor very
    carefully to find the most effective way of handling things.

    There is no single "correct" answer to your problem.
     
    Paul Keinanen, Jun 30, 2010
    #5
  6. nono240

    D Yuniskis Guest

    I find, in resource starved applications, using interpreters
    is a big win. If you're loading apps dynamically, I suspect
    the speed penalty would be insignificant (esp with careful
    choice of language)
     
    D Yuniskis, Jun 30, 2010
    #6
  7. Dynamic data, stack and heap (even if static) typically aren't a
    problem with PI code because you always need to specify their
    locations, and those values can be set dynamically when the code is
    loaded ... it's usually only compile time static data that you need to
    worry about when relocating code - you either need to copy the static
    data with the code, maintaining the relative spacing, or rebase the data
    to where it will be when the code runs.

    There ought to be a compiler option to base data references at a
    different address from the code. Unfortunately, I have never needed
    to do this using GCC so I can't tell you how. A quick manual search
    didn't turn up anything, but GCC has so many switches you can go blind
    trying to find the right one.

    Even so, it won't necessarily help you unless you know at compile time
    where the static data will be. Once you rebase, ALL relative
    addressing will use the new base - if your CPU has a range limit for
    relative addressing, it may not be possible to do what you want.

    George
     
    George Neuner, Jul 1, 2010
    #7
  8. Any suggestions for such interpreters, Don? Experiences?

    Peter
     
    Peter Dickerson, Jul 1, 2010
    #8
  9. nono240

    D Yuniskis Guest

    Hi Peter,

    Remember, this is c.a.e so, for the most part, you *know*
    what the application is -- and what it will *remain*
    (i.e., we're not looking at an environment where you have to
    be able to handle infinite variety of applications).

    In the past, I've written C-ish, PL/M-ish and BASIC-ish interpreters
    along with Forth. Note that you can use these as guidelines
    for a pseudo-language without strictly complying with any
    formal language definition.

    E.g., you can opt to implement integer only math instead of
    supporting "doubles", etc. You can force limits to be defined
    for string lengths (static memory allocation). You can
    discount recursion, etc.

    The advantage of interpreters has always seemed to be coming
    up with really tight representations of algorithms and
    spending "ROM" instead of needing space in (loadable) RAM...
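To make the interpreter idea concrete, here is a minimal sketch of an integer-only bytecode interpreter with a fixed-size stack and no dynamic allocation. The opcode set (`OP_PUSH`, `OP_ADD`, `OP_MUL`, `OP_HALT`), the stack depth and the function name `run_bytecode` are all made up for illustration; they do not come from any interpreter mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for an integer-only stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_DEPTH 16  /* fixed limit: no dynamic allocation */

/* Runs a bytecode program (which can live in ROM/flash) and returns
 * the value left on top of the stack, or 0 if the stack is empty. */
int32_t run_bytecode(const uint8_t *code, size_t len)
{
    int32_t stack[STACK_DEPTH];
    size_t sp = 0;   /* number of values currently on the stack */
    size_t pc = 0;   /* index of the next byte to decode */

    while (pc < len) {
        switch (code[pc++]) {
        case OP_PUSH:              /* operand: one signed byte */
            assert(pc < len && sp < STACK_DEPTH);
            stack[sp++] = (int8_t)code[pc++];
            break;
        case OP_ADD:
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_MUL:
            assert(sp >= 2);
            stack[sp - 2] *= stack[sp - 1];
            sp--;
            break;
        case OP_HALT:
        default:
            pc = len;              /* stop on HALT or unknown opcode */
            break;
        }
    }
    return sp ? stack[sp - 1] : 0;
}
```

With this encoding, the program `{ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT }` evaluates (2 + 3) * 4. The program itself is a `const` byte array, so only the small fixed stack needs RAM.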
     
    D Yuniskis, Jul 1, 2010
    #9
  10. nono240

    nono240

    Hi there ! Thank you for your replies !

    I'm running FreeRTOS. We need "dynamic task loading".

    IT IS my case. It's a (commercial) product that lets the user load multiple (so-called) "tasklets" into FLASH, but only ONE is allowed to run at a time. So, we need those "tasklets" to be CODE position independent, but share DATA.

    All RODATA is stored in FLASH, and the mechanism used for position independence is a PC-relative offset : before any I/O operation, the *real* offset from the original link address is computed and added automatically.

    For example, the following code :

    Code:
    extern int myarray[];  // @0x4000 (DATA)
    int foo()
    {
        return myarray[0];
    }
    
    Gives :

    Code:
    80018196:	lddpc	    r6, 80018204 <---- R6 = 0x80014198 (Load PC relative)
    80018198:	rsub	    r6,pc    <---- R6 = PC - R6 = 0x4000
    8001819a:	ld.w	    r12,r6[0]    <---- R12= *(uint32_t *) R6
    8001819e:	ret
    ....
    80018204: 
                   .word      0x80014198
    
    So, if I run my code from elsewhere, let's say 16KB farther, the PIC-computed address for myarray is 0x8000, which is not what I want.

    The same applies to ROM constants (but it's OK in that case).
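The arithmetic behind that listing can be checked with a small host-side sketch. It only models the `lddpc`/`rsub` pair using the two constants from the disassembly above; the macro names and the function name `pic_data_address` are made up for illustration.

```c
#include <stdint.h>

#define LINK_PC      0x80018198u  /* PC value used by rsub at the link address */
#define LITERAL_WORD 0x80014198u  /* constant fetched by lddpc from the literal pool */

/* Address the PIC sequence computes when the code has been moved
 * load_offset bytes away from the address it was linked at. */
uint32_t pic_data_address(uint32_t load_offset)
{
    uint32_t pc = LINK_PC + load_offset;  /* the PC moves with the code */
    return pc - LITERAL_WORD;             /* rsub r6, pc: R6 = PC - R6 */
}
```

`pic_data_address(0)` gives 0x4000, the intended address of `myarray`, while `pic_data_address(16 * 1024)` gives 0x8000: the data reference moves by exactly as much as the code, which is the problem described above.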
     
    Last edited: Jul 1, 2010
    nono240, Jul 1, 2010
    #10
  11. nono240

    nono240 Guest

    Hi there ! Thank you for your replies !

    I'm running FreeRTOS. We need "dynamic task loading".

    IT IS my case. It's a (commercial) product that lets the user load
    multiple (so-called) "tasklets" into FLASH, but only ONE is allowed to
    run at a time. So, we need those "tasklets" to be CODE position
    independent, but share DATA.

    For example, the following code :

    extern int myarray[]; // @0x4000 (DATA)
    int foo()
    {
        return myarray[0];
    }

    Gives :

    80018196: lddpc r6, 80018204 <---- R6 = 0x80014198 (Load PC relative)
    80018198: rsub r6,pc <---- R6 = PC - R6 = 0x4000
    8001819a: ld.w r12,r6[0] <---- R12= *(uint32_t *) R6
    8001819e: ret
    .....
    80018204:
    .word 0x80014198

    So, if I run my code from elsewhere, let's say 16KB farther, the
    PIC-computed address for myarray is 0x8000, which is not what I want.

    The same applies to ROM constants (but it's OK in that case).
     
    nono240, Jul 1, 2010
    #11
  12. If you can run only one task at a time, why do you need position
    independent code ? Just link each program to the same fixed load
    address. You need PIC code only when there are _multiple_ programs to
    be loaded somewhere into the RAM.

    If you want to share data between these programs, first link the data
    area to a fixed address and then link each program to that address.

    This is how it was done half a century ago. In FORTRAN, create a
    COMMON area, install it into a fixed address (usually at the top of
    the core) and then load each "transient program" into low memory,
    since the whole program could not fit into the core at once. No
    base/stack pointer relative addressing needed, since the data
    addresses were known at compile time.

    With modern processors with versatile addressing modes, why not
    reserve one data area pointer at a known location (such as the first
    or last address in RAM or ROM) and use this to access the shared
    variables in each program ?

    Data = GetPersistentDataAreaPointer() ;
    ...
    Data->SharedVar1 = Data->SharedVar2 ;
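A slightly fuller sketch of this idea follows. The struct layout, `SetPersistentDataAreaPointer` and `tasklet_copy` are assumptions added for illustration. The pointer slot is an ordinary static here so that the sketch also builds on a host; on the target it would have to live at a fixed, agreed address so that a relocated tasklet can reach it without any PC-relative computation.

```c
#include <stdint.h>

/* Hypothetical layout of the data shared by all tasklets. */
typedef struct {
    int32_t SharedVar1;
    int32_t SharedVar2;
} SharedData;

/* Stand-in for the one pointer slot at a known location. On real
 * hardware this would be placed at a fixed RAM (or ROM) address. */
static SharedData *persistent_data_slot;

/* Called once by the resident firmware before any tasklet runs. */
void SetPersistentDataAreaPointer(SharedData *area)
{
    persistent_data_slot = area;
}

/* Called by each tasklet to find the shared variables. */
SharedData *GetPersistentDataAreaPointer(void)
{
    return persistent_data_slot;
}

/* Example tasklet body, mirroring the two-line snippet above. */
void tasklet_copy(void)
{
    SharedData *Data = GetPersistentDataAreaPointer();
    Data->SharedVar1 = Data->SharedVar2;
}
```

The tasklets then depend on a single absolute location, and every other data access goes through the returned pointer.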
     
    Paul Keinanen, Jul 1, 2010
    #12
  13. A basic concept in linking ( ld program) is the ``section''.
    A section is an area of memory belonging together, such that
    e.g. distances within the section are fixable.
    A section may have a relocation table identifying the places
    in the section that still need to be adjusted to the final
    place it will be used in the program.

    Now you want to have different sections behave differently with
    regard to location.

    What the linker (ld) does is combine sections from different object
    modules together into larger sections with names like .bss, .text and
    .data, possibly fixing up the relocation table.
    From that point whatever went into such a section will
    be treated in the same way, i.e. once you combined DATA and CODE
    into one section, data and code will be either fixed at a position
    or have a relocation table. The linker is blind to the difference
    between code and data, the only information it gets is by naming
    convention of input sections. This information is generated
    by the compiler.

    Now you have to understand which sections you have, then tell
    the linker what to do with them.
    Using the --verbose option to the linker you get a so called
    linker script which details what the linker does.
    What you want can be accomplished by adapting the linker script,
    which is -- I admit -- not necessarily easy.

    Groetjes Albert
     
    Albert van der Horst, Jul 1, 2010
    #13
  14. nono240

    nono240 Guest

    Thanks for all your replies !

    Because those "tasklets" are stored in different places in FLASH : we
    ship the device with 4 embedded tasklets,
    but a dozen more are available to download and can be uploaded to
    any of those 4 slots, and we don't want the user to have to care about
    the "link address" !
    Moreover, if we move to a CPU with more FLASH, we don't want to deal
    with multiple tasklet versions.

    Relocating a tasklet "on demand" to a "predefined fixed area" would
    prematurely wear out the FLASH, since there's not enough RAM to run code
    from...
    Because we want the tasklets to be "RTOS" unaware. Our FreeRTOS is
    running as a "hypervisor" (we did have an MPU though).

    I just need a way to tell LD that my DATA section is ABSOLUTE, and not
    relative from CODE..
     
    nono240, Jul 2, 2010
    #14
  15. OK, different aim. In my case I have a scientific instrument that is making
    various low level measurements. Users, who are typically chemists or
    biochemists, want real answers not raw measurements. For this they apply
    "Methods" that turn instrumental measurements into stuff like
    concentrations. These methods are all pretty standard but there are lots of
    them, with the occasional new one turning up. I'd prefer the applications
    chemists to be able to implement the methods so that I can concentrate on
    measuring femtoamps. So, I'm looking for something scriptable.

    Peter
     
    Peter Dickerson, Jul 2, 2010
    #15
  16. nono240

    D Yuniskis Guest

    Hi Peter,

    Yes, we wrote/implemented a "QBASIC" for some of our instruments
    for just this reason (blood assays). Allowed the customer to
    design new tests without having to contract with us to code
    them. I.e., we just provided a device that came up with
    the raw data and let the customer come up with the means
    of interpreting that data based on the reagents, etc. that
    he was using in the assay.

    Note that you can do this two different ways:
    - *source* level interpreter in the instrument
    - "bytecode" interpreter in the instrument with
    an external "compiler/parser".

    (I'm talking *really* limited resources, here)

    If you have a more fleshy implementation to work with,
    look at Lua. Lately I am doing a lot with Inferno/Limbo
    (but would not suggest it for "end users")
     
    D Yuniskis, Jul 2, 2010
    #16
  17. Yes, I'd go for source since that is conceptually the simplest. Otherwise I
    need a bytecode compiler somewhere in the machine or on a PC.
    I did get Lua linked in but ran out of memory almost immediately. In
    particular I couldn't measure anything. The problem seems to be that a lot
    of stuff gets stored in RAM - dictionaries, strings etc. I'd prefer to be
    able to keep that stuff in Flash only even at the cost of performance.

    Peter
     
    Peter Dickerson, Jul 2, 2010
    #17
  18. nono240

    D Yuniskis Guest

    Hi Peter,

    I should probably clarify... :<

    To us, customer was the OEM. *They* would design new
    assays, debug them (in terms of the chemistry) and
    code up the "QBASIC" routines to implement those assays
    for *their* customers ("end users"). So, it was reasonable
    to expect them to have certain tools available for that
    process (though they surely wouldn't want to be dealing
    with "raw iron").

    I don't think the end user ever *saw* QBASIC. They would
    typically purchase "Paks" (ROM based) for each assay.
    FDA issues, etc. (though there was no reason why, from
    *our* standpoint, this utility couldn't be "exposed")
    Yes, I find that to be true of "modern" languages. :<
    They try to be overly friendly instead of overly *efficient*.
    Hence my comments regarding fixing/declaring string sizes,
    integer only math, etc. You can do a *lot* in an environment
    thusly constrained *if* you are made aware of those constraints.
    Often, a "user" only needs the ability to do calculations,
    handle conditionals and "print" (emit?) formatted text with
    "results". So, a lot of the flowery features of modern languages
    are wasted...

    Nowadays, I would look into support for connectivity as
    more and more devices talk to each other (or *should*! :> )

    One advantage of Limbo/Inferno is the application can extend
    beyond the confines of the device itself. E.g., you can
    "export" the hardware from the device/instrument and
    actually run the *application* external to the device
    (remotely!)
     
    D Yuniskis, Jul 2, 2010
    #18
  19. Basically the same here except that when the customer snaps his fingers we
    jump. Very few customers and all big. In one case we have something like 14
    versions of an instrument. Actually, more than one case, because often the
    casework and badge is the main difference. Part of the problem though is
    that each "Method" or assay has lots of UI parameters to set, and lots of
    languages to work in (including Chinese and Japanese).

    [snip]
     
    Peter Dickerson, Jul 2, 2010
    #19
  20. I was suggesting putting data _AND_ code into the 8 KiB RAM. Assuming
    4 KiB for data and 4 KiB allocated for code, you could use functions
    up to 4 KiB in length. When a function is called, it is copied from a
    unique address in FLASH into RAM starting at address 0 and this
    function is then executed starting at 0.

    When this function calls something else that needs to be loaded, the
    current segment ID and offset from start must be saved, so that the
    calling function code can reload the code and continue execution after
    the call instruction. The fast flash read times would make such
    systems much faster than the old overlay loading systems loading from
    slow disk drives.

    Why do you need a DATA section ?

    Think about the system initialization routine as the "main" function
    and, within that function, allocate all variables that need to be
    shared between the tasklets as automatic local variables.

    The tasklets can then be called as subfunctions and each call will
    then pass pointers to the "main" local variables as call parameters.
    In order to keep the tasklet parameter list length reasonable, the
    "main" local variables are grouped into structures.
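The scheme in the last two paragraphs might look roughly like the sketch below. `SharedState`, `TaskletEntry`, the two example tasklets and `system_main` are hypothetical names; in the real system the tasklet entry points would be addresses in FLASH rather than functions linked into the same image.

```c
#include <stdint.h>

/* Hypothetical grouping of the variables shared between tasklets. */
typedef struct {
    int32_t counter;
    int32_t last_result;
} SharedState;

/* Every tasklet has the same entry signature and touches shared data
 * only through the pointer it receives, never through globals. */
typedef void (*TaskletEntry)(SharedState *state);

void tasklet_increment(SharedState *state)
{
    state->counter++;
}

void tasklet_double(SharedState *state)
{
    state->last_result = state->counter * 2;
}

/* Stand-in for the initialization routine: the shared variables are
 * automatic locals here, so they need no DATA section of their own. */
int32_t system_main(const TaskletEntry *tasklets, int count)
{
    SharedState state = { 0, 0 };
    for (int i = 0; i < count; i++) {
        tasklets[i](&state);   /* one tasklet runs at a time */
    }
    return state.last_result;
}
```

Because each tasklet reaches its data only through the `state` argument, the compiled tasklet code contains no references to statically allocated data that would need relocating.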
     
    Paul Keinanen, Jul 2, 2010
    #20
