Inline assembler on PowerPC

Discussion in 'Embedded' started by David R Brooks, Jun 13, 2005.

  1. Consider the following (compiler=GCC3.4.3, host=I686,
    target=powerpc-eabi):

    typedef void(*pVoid)(void);

    static inline bool1 kSetVector(uint1 level, pVoid func, int type) {
    int r;
    const int code = 0;
    __asm__ __volatile__ (
    " li 0, %1 \n" /* code */
    " mr 3, %2 \n" /* level */
    " mr 4, %3 \n" /* func */
    " mr 5, %4 \n" /* type */
    " sc \n" /* System Call: may corrupt regs: result in r3 */
    " mr %0, 3 \n" /* Return result */
    : "=r" (r)
    : "rI" (code), "0" (level), "r" (func), "r" (type)
    : "r0", "cc", "memory"
    );
    return r;
    }
    ....
    (void)kSetVector(31, SerialIoInterrupt, 3);

    This compiles, & runs fine (producing the code below). However I
    would like to improve the efficiency, by eliminating the "mr"
    instructions to move arguments to & from registers. The "sc" needs the
    data in precisely the registers shown, so GCC needs to be coaxed into
    using those registers itself.

    Generated code (comments added):

    54:h/services.h **** static inline bool1 kSetVector(uint1 level,
    pVoid func, int type) {
    203 .loc 2 54 0
    204 019c 3940001F li 10,31 /* level */
    205 01a0 3D200000 lis 9,SerialIoInterrupt@ha /* func */
    206 01a4 39290000 la 9,SerialIoInterrupt@l(9)
    207 01a8 39600003 li 11,3 /* type */
    208 .LBB3:
    55:h/services.h **** int r;
    56:h/services.h **** const int code = 0;
    57:h/services.h **** __asm__ __volatile__ (
    209 .loc 2 57 0
    210 01ac 38000000 li 0, 0
    211 01b0 7D435378 mr 3, 10 /* The "mr's" I want to remove */
    212 01b4 7D244B78 mr 4, 9
    213 01b8 7D655B78 mr 5, 11
    214 01bc 44000002 sc
    215 01c0 7C6A1B78 mr 10, 3 /* result */

    In the X86 builds of GCC, there are "register loading codes", as "c",
    "a" & "D" in the following example (from: "Using Inline Assembly With
    gcc" by Clark L. Coleman).

    asm ("cld\n\t" "rep\n\t" "stosl"
    : /* no output registers */
    : "c" (count), "a" (fill_value), "D" (dest)
    : "%ecx", "%edi" );

    Is there a similar device for the PowerPC, whereby I can tell GCC to
    create the values in specific registers, so eliminating the need for
    those "mr" instructions?
    TIA,
     
    David R Brooks, Jun 13, 2005
    #1

  2. David R Brooks

    l'indien Guest

    On Mon, 13 Jun 2005 21:50:28 +0800, David R Brooks wrote:

    > Consider the following (compiler=GCC3.4.3, host=I686,
    > target=powerpc-eabi):
    >
    > typedef void(*pVoid)(void);
    >
    > static inline bool1 kSetVector(uint1 level, pVoid func, int type) {
    > int r;
    > const int code = 0;
    > __asm__ __volatile__ (
    > " li 0, %1 \n" /* code */
    > " mr 3, %2 \n" /* level */
    > " mr 4, %3 \n" /* func */
    > " mr 5, %4 \n" /* type */
    > " sc \n" /* System Call: may corrupt regs: result in r3 */
    > " mr %0, 3 \n" /* Return result */
    > : "=r" (r)
    > : "rI" (code), "0" (level), "r" (func), "r" (type)
    > : "r0", "cc", "memory"
    > );
    > return r;
    > }
    > ...
    > (void)kSetVector(31, SerialIoInterrupt, 3);
    >
    > This compiles, & runs fine (producing the code below). However I
    > would like to improve the efficiency, by eliminating the "mr"
    > instructions to move arguments to & from registers. The "sc" needs the
    > data in precisely the registers shown, so GCC needs to be coaxed into
    > using those registers itself.


    Imho, the easiest way is to do it ... in C:
    static inline bool1 kSetVector (uint1 level, pVoid func, int type)
    {
    register uint1 _level __asm__ ("r3");
    register pVoid _func __asm__ ("r4");
    register int _type __asm__ ("r5");

    _level = level;
    _func = func;
    _type = type;
    __asm__ __volatile__ (
    "li 0, %1 \n"
    "sc \n"
    : "=r" (_level)
    : "rI" (code)
    : "r0", "cc", "memory");

    return _level;
    }

    GCC can then optimise the register allocation and emit mr or lwz only
    where necessary.
    The second thing to consider is that this code is easier to read than
    inline assembly that shuffles the registers by hand.
    The only drawback is that you have to use the same local variable for the
    first argument and the returned value.

    [...]
     
    l'indien, Jun 13, 2005
    #2

  3. David R Brooks

    David Brown Guest

    l'indien wrote:
    >
    >
    > Imho, the easiest way is to do it ... in C:
    > static inline bool1 kSetVector (uint1 level, pVoid func, int type)
    > {
    > register uint1 _level __asm__ ("r3");
    > register pVoid _func __asm__ ("r4");
    > register int _type __asm__ ("r5");
    >
    > _level = level;
    > _func = func;
    > _type = type;
    > __asm__ __volatile__ (
    > "li 0, %1 \n"
    > "sc \n"
    > : "=r" (_level)
    > : "rI" (code)
    > : "r0", "cc", "memory");
    >
    > return _level;
    > }
    >
    > Then gcc will be able to optimise variables allocations then only produce
    > mr or lwz if necessary.
    > The second thing to consider is that this code is more easily readable
    > than any inline assembly dependency.
    > The only drawback is that you have to use the same local variable for the
    > first argument and the returned value.
    >
    > [...]
    >


    Of course, you will still get pretty much the same "mr" instructions in
    the stand-alone version of the function (if it is generated) - it is
    only in in-lined versions that they could be eliminated.

    And I presume you are only doing this optimisation for interest and
    understanding, not because you are setting vectors so often that 3
    cycles delay here will be a serious issue?

    David
     
    David Brown, Jun 14, 2005
    #3
  4. David R Brooks

    l'indien Guest

    On Tue, 14 Jun 2005 08:59:01 +0200, David Brown wrote:

    >
    > Of course, you will still get pretty much the same "mr" instructions in
    > the stand-alone version of the function (if it is generated) - it is
    > only in in-lined versions that they could be eliminated.


    You won't have any mr in the stand-alone version:
    as the arguments are passed in registers r3 onwards, level is already in
    r3, func in r4 and type in r5.
    As the return value also goes in r3, there won't be any mr at all.
    When I compile this function as a standalone one, I get:
    00000000 <kSetVector>:
    0: 38 00 00 00 li r0,0
    4: 44 00 00 02 sc
    8: 4e 80 00 20 blr

    Which is optimal.

    > And I presume you are only doing this optimisation for interest and
    > understanding, not because you are setting vectors so often that 3
    > cycles delay here will be a serious issue?


    We always want optimal code, don't we ? ;-)
     
    l'indien, Jun 14, 2005
    #4
  5. Many thanks. That works with one addition: you still have to mention
    all the arguments to the "sc" (_level, _func, _type) on the inputs
    line, else GCC will optimise them away.
    I got it down to:

    static inline bool1 kSetVector (uint1 level, pVoid func, int type)
    {
    register uint1 _code __asm__ ("r0") = 0;
    register uint1 _level __asm__ ("r3") = level;
    register pVoid _func __asm__ ("r4") = func;
    register int _type __asm__ ("r5") = type;

    __asm__ __volatile__ (
    "sc \n"
    : "=r" (_level)
    : "rI" (_code), "0" (_level), "r" (_func), "r" (_type)
    : "cc", "memory" );

    return _level;
    }


     
    David R Brooks, Jun 14, 2005
    #5
  6. David R Brooks

    l'indien Guest

    On Tue, 14 Jun 2005 18:12:31 +0800, David R Brooks wrote:

    > Many thanks. That works with one addition: you still have to mention
    > all the arguments to the "sc" (_level, _func, _type) on the inputs
    > line, else GCC will optimise them away.


    You're absolutely right. I have to admit I wrote it down without testing...

    > I got it down to:
    >
    > static inline bool1 kSetVector (uint1 level, pVoid func, int type)
    > {
    > register uint1 _code __asm__ ("r0") = 0;
    > register uint1 _level __asm__ ("r3") = level;
    > register pVoid _func __asm__ ("r4") = func;
    > register int _type __asm__ ("r5") = type;
    >
    > __asm__ __volatile__ (
    > "sc \n"
    > : "=r" (_level)
    > : "rI" (_code), "0" (_level), "r" (_func), "r" (_type)
    > : "cc", "memory" );
    >
    > return _level;
    > }


    I just have two questions/remarks:
    - why don't you directly initialise _code = code ? This would make the code
    even easier to read and won't produce more output code.
    - I would use "+r" constraint for _level, to follow gcc asm constraints
    specifications. But, I'm not a specialist on this point, I must admit...


     
    l'indien, Jun 14, 2005
    #6
  7. Answering your questions:
    1. _code is explicitly a constant: being the function code. There are
    several similar definitions in the header file, having different names
    & corresponding function codes. The number of arguments varies too.
    2. "+r", although legal in pure asm, is not accepted by GCC.
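
    A minimal sketch of what such a family of wrappers might look like, one
    taking a single argument and one taking none. The names kClearVector and
    kGetVersion and their function codes are hypothetical (the thread only
    shows kSetVector with code 0), unsigned and int stand in for the uint1
    and bool1 typedefs, and the non-PowerPC branch is only a stand-in so the
    sketch can be compiled on a host. Plain "r" is used for the function-code
    operand as a cautious choice, so that the operand has to be in a register.

```c
/* Hypothetical function codes: the thread only shows code 0 (kSetVector). */
enum { K_CLEAR_VECTOR = 1, K_GET_VERSION = 2 };

#if defined(__powerpc__) || defined(__PPC__)
#define K_ON_TARGET 1
#else
#define K_ON_TARGET 0   /* host build: use stand-in bodies instead of "sc" */
#endif

/* One argument: level goes in through r3 and the result comes back in r3. */
static inline int kClearVector(unsigned level)
{
#if K_ON_TARGET
    register unsigned code_  __asm__("r0") = K_CLEAR_VECTOR;
    register unsigned level_ __asm__("r3") = level;

    __asm__ __volatile__("sc"
                         : "=r"(level_)
                         : "r"(code_), "0"(level_)
                         : "cc", "memory");
    return (int)level_;
#else
    return (int)level;      /* stand-in: echo the argument */
#endif
}

/* No arguments: r3 is an output only, so no matching input is listed. */
static inline int kGetVersion(void)
{
#if K_ON_TARGET
    register unsigned code_   __asm__("r0") = K_GET_VERSION;
    register int      result_ __asm__("r3");

    __asm__ __volatile__("sc"
                         : "=r"(result_)
                         : "r"(code_)
                         : "cc", "memory");
    return result_;
#else
    return 0;               /* stand-in: fixed value */
#endif
}
```

    Every register the trap reads is listed as an input, which is the point
    made earlier in the thread: operands that are not mentioned may be
    optimised away.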

     
    David R Brooks, Jun 14, 2005
    #7
  8. David R Brooks

    R Adsett Guest

    In article <>,
    says...
    > On Tue, 14 Jun 2005 08:59:01 +0200, David Brown wrote:
    >
    > > l'indien wrote:
    > > And I presume you are only doing this optimisation for interest and
    > > understanding, not because you are setting vectors so often that 3
    > > cycles delay here will be a serious issue?

    >
    > We always want optimal code, don't we ? ;-)


    Actually no. Readable (human readable) and correct first. Optimal is,
    at best, a distant third.

    Robert
     
    R Adsett, Jun 15, 2005
    #8
  9. David R Brooks

    l'indien Guest

    On Wed, 15 Jun 2005 06:39:40 +0800, David R Brooks wrote:

    > Answering your questions:
    > 1. _code is explicitly a constant: being the function code. There are
    > several similar definitions in the header file, having different names
    > & corresponding function codes. The number of arguments varies too.


    OK, sorry, I misread your code...

    > 2. "+r", although legal in pure asm, is not accepted by GCC.


    I did the test, gcc does accept it.
    "+r" is documented in gcc documentation (I'm using gcc 2.95.3 as a PowerPC
    cross compiler).
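
    For reference, a sketch of the wrapper written with "+r", which marks the
    r3 variable as both read and written so the separate "0" matching input
    is no longer needed. This is an illustration rather than code from the
    thread: unsigned and int stand in for the uint1 and bool1 typedefs, plain
    "r" is assumed for the function-code operand, and the non-PowerPC branch
    is only a stand-in so the sketch compiles on a host.

```c
typedef void (*pVoid)(void);

static inline int kSetVector(unsigned level, pVoid func, int type)
{
#if defined(__powerpc__) || defined(__PPC__)
    register unsigned code_  __asm__("r0") = 0;      /* function code */
    register unsigned level_ __asm__("r3") = level;  /* in: level, out: result */
    register pVoid    func_  __asm__("r4") = func;
    register int      type_  __asm__("r5") = type;

    __asm__ __volatile__("sc"
                         : "+r"(level_)              /* read and written */
                         : "r"(code_), "r"(func_), "r"(type_)
                         : "cc", "memory");
    return (int)level_;
#else
    /* Host stand-in (assumption): no trap available, echo the level. */
    (void)func;
    (void)type;
    return (int)level;
#endif
}
```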

    [...]
     
    l'indien, Jun 15, 2005
    #9
  10. On Tue, 14 Jun 2005 22:49:54 -0400, R Adsett
    <> wrote:

    >In article <>,
    > says...
    >> On Tue, 14 Jun 2005 08:59:01 +0200, David Brown wrote:
    >>
    >> > l'indien wrote:
    >> > And I presume you are only doing this optimisation for interest and
    >> > understanding, not because you are setting vectors so often that 3
    >> > cycles delay here will be a serious issue?

    >>
    >> We always want optimal code, don't we ? ;-)

    >
    >Actually no. Readable (human readable) and correct first. Optimal is,
    >at best, a distant third.


    Optimal implies correct code. One cannot describe anything as an
    optimal solution, if it does not do what it is supposed to do.
    Things that are obscure at first, become very "Human Readable" if it
    is the optimum solution to a problem.
    Readable code for even a complete newbie programmer is total black
    magic to the average lay person.

    Regards
    Anton Erasmus
     
    Anton Erasmus, Jun 16, 2005
    #10
  11. David R Brooks

    R Adsett Guest

    In article <1118916402.66792381a7012eb59f1dd4438ce355a4@teranews>,
    says...
    > On Tue, 14 Jun 2005 22:49:54 -0400, R Adsett
    > <> wrote:
    >
    > >In article <>,
    > > says...
    > >> On Tue, 14 Jun 2005 08:59:01 +0200, David Brown wrote:
    > >>
    > >> > l'indien wrote:
    > >> > And I presume you are only doing this optimisation for interest and
    > >> > understanding, not because you are setting vectors so often that 3
    > >> > cycles delay here will be a serious issue?
    > >>
    > >> We always want optimal code, don't we ? ;-)

    > >
    > >Actually no. Readable (human readable) and correct first. Optimal is,
    > >at best, a distant third.

    >
    > Optimal implies correct code.


    Well, yes. The converse is not, I think, true. Unless of course you
    define correct as a synonym for optimal. In this case though the context
    suggests that optimal meant fast.

    To quote Knuth "Premature optimization is the root of all evil". I think
    that was Knuth anyway. Clear, fast enough and small enough are good for
    me. No need to go to the trouble of as small as possible or as fast as
    possible in most cases.

    I've seen attempts to optimize that ended up only optimizing the obvious
    and missed doing the correct thing for the whole set of inputs when the
    clear version worked correctly for all cases. This in a case where the
    clear version was fast enough and small enough.

    > One cannot describe anything as an
    > optimal solution, if it does not do what it is supposed to do.
    > Things that are obscure at first, become very "Human Readable" if it
    > is the optimum solution to a problem.


    On this I will disagree. We've all done clever things at one time or
    another that when we went back to them later were far from clear. If you
    have ever used APL I can guarantee it ;)

    > Readable code for even a complete newbie programmer is total black
    > magic to the average lay person.


    Yes, but so what? If it is necessary to optimize a sequence to fit it
    within tight constraints then sufficient supporting comments must be
    added to make it clear what is being done and why even to someone who is
    encountering it for the first time. Basic knowledge of the
    implementation language and external HW can probably be assumed but when
    you start relying on multiple side effects or delay testing a flag for
    several instructions you had better warn the unwary reader of the traps
    that lie in the code. I don't expect I get this right all the time
    either but I do try.

    Robert
     
    R Adsett, Jun 16, 2005
    #11
  12. On Thu, 16 Jun 2005 10:57:03 -0400, R Adsett
    <> wrote:

    >In article <1118916402.66792381a7012eb59f1dd4438ce355a4@teranews>,
    > says...
    >> On Tue, 14 Jun 2005 22:49:54 -0400, R Adsett
    >> <> wrote:
    >>
    >> >In article <>,
    >> > says...
    >> >> On Tue, 14 Jun 2005 08:59:01 +0200, David Brown wrote:
    >> >>
    >> >> > l'indien wrote:
    >> >> > And I presume you are only doing this optimisation for interest and
    >> >> > understanding, not because you are setting vectors so often that 3
    >> >> > cycles delay here will be a serious issue?
    >> >>
    >> >> We always want optimal code, don't we ? ;-)
    >> >
    >> >Actually no. Readable (human readable) and correct first. Optimal is,
    >> >at best, a distant third.

    >>
    >> Optimal implies correct code.

    >
    >Well, yes. The converse is not, I think, true. Unless of course you
    >define correct as a synonym for optimal. In this case though the context
    >suggests that optimal meant fast.


    No, correct code is not necessarily optimal, but I believe that for
    code to be the optimal code for a specific problem, it should be
    correct under all conditions within the specific problem's domain.
    Optimal code is the smallest and/or fastest set of instructions to
    do the specific thing one wants to do.

    >To quote Knuth "Premature optimization is the root of all evil". I think
    >that was Knuth anyway. Clear, fast enough and small enough are good for
    >me. No need to go to the trouble of as small as possible or as fast as
    >possible in most cases.
    >
    >I've seen attempts to optimize that ended up only optimizing the obvious
    >and missed doing the correct thing for the whole set of inputs when the
    >clear version worked correctly for all cases. This in a case where the
    >clear version was fast enough and small enough.


    If the code is broken when trying to optimize, then the resultant code
    is not optimal, but wrong. And yes I do agree that trying to optimize
    a total application to the point where it is impossible to get it
    faster/smaller is in 99.999% of the cases just a waste of time.

    >> One cannot describe anything as an
    >> optimal solution, if it does not do what it is supposed to do.
    >> Things that are obscure at first, become very "Human Readable" if it
    >> is the optimum solution to a problem.

    >
    >On this I will disagree. We've all done clever things at one time or
    >another that when we went back to them later were far from clear. If you
    >have ever used APL I can guarantee it ;)
    >
    >> Readable code for even a complete newbie programmer is total black
    >> magic to the average lay person.

    >
    >Yes, but so what? If it is necessary to optimize a sequence to fit it
    >within tight constraints then sufficient supporting comments must be
    >added to make it clear what is being done and why even to someone who is
    >encountering it for the first time. Basic knowledge of the
    >implementation language and external HW can probably be assumed but when
    >you start relying on multiple side effects or delay testing a flag for
    >several instructions you had better warn the unwary reader of the traps
    >that lie in the code. I don't expect I get this right all the time
    >either but I do try.


    What I mean is that when confronted with a section of code for the
    first time, it might be quite obscure. If this sequence of code is the
    optimal solution to a specific problem, and many programmers end up
    using this sequence, then it becomes "Human Readable" by the mere fact
    that it is used often, by many people in a well defined context.
    For someone used only to high level code, simple basic assembly can be
    quite obscure and not readable at all. What is obscure to a beginner
    might actually be quite clear to a more experienced person. As in
    almost all things the difference is not Black/White and exactly where
    the line lies is open to debate.

    Regards
    Anton Erasmus
     
    Anton Erasmus, Jun 16, 2005
    #12
  13. David R Brooks

    CBFalconer Guest

    Anton Erasmus wrote:
    >

    .... snip ...
    >
    > What I mean is that when confronted with a section of code for the
    > first time, it might be quite obscure. If this sequence of code is
    > the optimal solution to a specific problem, and many programmers
    > end up using this sequence, then it becomes "Human Readable" by the
    > mere fact that it is used often, by many people in a well defined
    > context. For someone used only to high level code, simple basic
    > assembly can be quite obscure and not readable at all. What is
    > obscure to a beginner might actually be quite clear to a more
    > experienced person. As in almost all things the difference is not
    > Black/White and exactly where the line lies is open to debate.


    I recall an exposition of Knuth's some years ago, in which he
    reworked some fairly normal code into a peculiar monster. It was
    developed step by step to improve efficiency in a perfectly logical
    manner. IIRC it ended up with a goto into the middle of a
    structured statement, which is considered a no-no.

    One of the points he made with it was that such derivations should
    include the original, and the various steps taken to attain the end
    result. Otherwise it has virtually no chance of making sense to
    the later reader.

    --
    Chuck F () ()
    Available for consulting/temporary embedded and systems.
    <http://cbfalconer.home.att.net> USE worldnet address!
     
    CBFalconer, Jun 17, 2005
    #13
  14. David R Brooks

    David Brown Guest

    R Adsett wrote:
    > In article <1118916402.66792381a7012eb59f1dd4438ce355a4@teranews>,
    > says...
    >
    >>On Tue, 14 Jun 2005 22:49:54 -0400, R Adsett
    >><> wrote:


    <snip>

    >>>
    >>>Actually no. Readable (human readable) and correct first. Optimal is,
    >>>at best, a distant third.

    >>
    >>Optimal implies correct code.

    >


    Yes, it is easy to write code that is fast but incorrect!

    >
    > Well, yes. The converse is not, I think, true. Unless of course you
    > define correct as a synonym for optimal. In this case though the context
    > suggests that optimal meant fast.
    >
    > To quote Knuth "Premature optimization is the root of all evil". I think
    > that was Knuth anyway. Clear, fast enough and small enough are good for
    > me. No need to go to the trouble of as small as possible or as fast as
    > possible in most cases.
    >


    Knuth also gave two rules of optimization:

    1) Don't do it.
    2) (For experts only) Don't do it yet.


    Readability goes hand-in-hand with correctness in priority - code that
    is unreadable is unlikely to be correct, and even less likely to be
    checked to be correct (either by testing or proof). Getting optimal
    code, or at least close to optimal, involves two things - thinking about
    your code as you write it, and using a good compiler. The biggest
    difference to the speed and size of code is made when thinking about
    what your code should do (i.e., at the algorithmic stage), then by
    thinking about your implementation (e.g., using integers instead of
    floats, and understanding how your code will fit with the target's
    capabilities). Small things, such as when to use arrays and when to use
    pointers, are best left to the compiler if it has a good optimizer.
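
    As one illustration of the "integers instead of floats" point, here is a
    sketch that scales a raw converter reading to millivolts both ways. The
    10-bit range, the 3300 mV reference and the function names are all
    hypothetical and chosen only for the example; on a target without an FPU
    the integer version avoids pulling in software floating-point routines.

```c
#include <stdint.h>

/* Hypothetical ADC parameters: 10-bit converter, 3300 mV reference. */
#define ADC_MAX     1023u
#define ADC_REF_MV  3300u

/* Floating-point version: simple to write, but costly without an FPU. */
uint32_t adc_to_mv_float(uint32_t raw)
{
    return (uint32_t)((float)raw * (float)ADC_REF_MV / (float)ADC_MAX + 0.5f);
}

/* Integer version: same scaling, rounded to nearest, no floats needed. */
uint32_t adc_to_mv_int(uint32_t raw)
{
    return (raw * ADC_REF_MV + ADC_MAX / 2u) / ADC_MAX;
}
```

    With these numbers a raw reading of 512 comes out as 1652 mV from the
    integer version, and the intermediate product stays well inside 32 bits.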
     
    David Brown, Jun 17, 2005
    #14
  15. On Fri, 17 Jun 2005 09:20:04 +0200, David Brown
    <> wrote:

    >R Adsett wrote:
    >> In article <1118916402.66792381a7012eb59f1dd4438ce355a4@teranews>,
    >> says...
    >>
    >>>On Tue, 14 Jun 2005 22:49:54 -0400, R Adsett
    >>><> wrote:

    >
    ><snip>
    >
    >>>>
    >>>>Actually no. Readable (human readable) and correct first. Optimal is,
    >>>>at best, a distant third.
    >>>
    >>>Optimal implies correct code.

    >>

    >
    >Yes, it is easy to write code that is fast but incorrect!
    >
    >>
    >> Well, yes. The converse is not, I think, true. Unless of course you
    >> define correct as a synonym for optimal. In this case though the context
    >> suggests that optimal meant fast.
    >>
    >> To quote Knuth "Premature optimization is the root of all evil". I think
    >> that was Knuth anyway. Clear, fast enough and small enough are good for
    >> me. No need to go to the trouble of as small as possible or as fast as
    >> possible in most cases.
    >>

    >
    >Knuth also gave two rules of optimization:
    >
    >1) Don't do it.
    >2) (For experts only) Don't do it yet.
    >


    The problem is that many programmers seem to understand this as: "Write
    code as sloppily as possible, and do not even think about whether the
    current approach is easy or difficult for the processor".

    The worst I have seen was an application where a set of different
    routines was called based on configuration data. There was an array of
    function pointers, and obviously for readability they decided that
    calling function[3](args) is not clear. Calling something like
    function[FOO](args) is a lot clearer where FOO gives some indication
    of what the function does. Instead of just having a #defined or
    enumerated list, the programmers put the names in a character array.
    They then had a function that did a string compare on the character
    array every time it was called to decide which function pointer in the
    array to take. Needless to say this was VERY slow. A very simple
    optimization sped the code up by more than 100 times.

    Of course it is debatable whether this was optimization or fixing
    incorrect code.
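
    The difference described above can be sketched roughly as follows. This
    is only an illustration: the handler names, the enum and both call
    helpers are hypothetical, not the code from that application.

    ```c
    #include <string.h>

    typedef int (*handler_fn)(int arg);

    /* Hypothetical handlers, for illustration only. */
    static int handle_foo(int arg) { return arg + 1; }
    static int handle_bar(int arg) { return arg * 2; }
    static int handle_baz(int arg) { return -arg; }

    /* Enumerated indices: readable at the call site, resolved at compile time. */
    enum handler_id { HANDLER_FOO, HANDLER_BAR, HANDLER_BAZ, HANDLER_COUNT };

    static const handler_fn handlers[HANDLER_COUNT] = {
        [HANDLER_FOO] = handle_foo,
        [HANDLER_BAR] = handle_bar,
        [HANDLER_BAZ] = handle_baz,
    };

    /* Fast path: one indexed load and an indirect call. */
    static int call_by_id(enum handler_id id, int arg)
    {
        return handlers[id](arg);
    }

    /* Slow path, as in the application described: names kept in a
       character array and searched with strcmp() on every call. */
    static const char *const handler_names[HANDLER_COUNT] = { "FOO", "BAR", "BAZ" };

    static int call_by_name(const char *name, int arg)
    {
        int i;
        for (i = 0; i < HANDLER_COUNT; i++) {
            if (strcmp(handler_names[i], name) == 0)
                return handlers[i](arg);
        }
        return 0; /* unknown name: treated as a no-op in this sketch */
    }
    ```

    Both call_by_id(HANDLER_BAR, x) and call_by_name("BAR", x) reach the
    same handler; only the second pays for a string search each time.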

    >Readability goes hand-in-hand with correctness in priority - code that
    >is unreadable is unlikely to be correct, and even less likely to be
    >checked to be correct (either by testing or proof). Getting optimal
    >code, or at least close to optimal, involves two things - thinking about
    >your code as you write it, and using a good compiler. The biggest
    >difference to the speed and size of code is made when thinking about
    >what your code should do (i.e., at the algorithmic stage), then by
    >thinking about your implementation (e.g., using integers instead of
    >floats, and understanding how your code will fit with the target's
    >capabilities). Small things, such as when to use arrays and when to use
    >pointers, are best left to the compiler if it has a good optimizer.


    Having now read the rest of your post :) I totally agree with your
    additions to the Knuth optimization rules. A better algorithm is worth
    more than a poor algorithm optimized to the core.

    Regards
    Anton Erasmus
     
    Anton Erasmus, Jun 17, 2005
    #15
  16. David R Brooks

    R Adsett Guest

    In article <1118949838.101c02e5d7c78f654f62183750680f48@teranews>,
    says...
    > On Thu, 16 Jun 2005 10:57:03 -0400, R Adsett
    > <> wrote:
    >
    > >To quote Knuth "Premature optimization is the root of all evil". I think
    > >that was Knuth anyway. Clear, fast enough and small enough are good for
    > >me. No need to go to the trouble of as small as possible or as fast as
    > >possible in most cases.
    > >
    > >I've seen attempts to optimize that ended up only optimizing the obvious
    > >and missed doing the correct thing for the whole set of inputs when the
    > >clear version worked correctly for all cases. This in a case where the
    > >clear version was fast enough and small enough.

    >
    > If the code is broken when trying to optimize, then the resultant code
    > is not optimal, but wrong. And yes I do agree that trying to optimize
    > a total application to the point where it is impossible to get it
    > faster/smaller is in 99.999% of the cases just a waste of time.


    I think we are agreeing here. My only point in the above was that the
    obsession with optimization appears to often result in somewhat faster
    and/or smaller but broken code.

    > >> One cannot describe anything as an
    > >> optimal solution, if it does not do what it is supposed to do.
    > >> Things that are obscure at first, become very "Human Readable" if it
    > >> is the optimum solution to a problem.

    > >
    > >On this I will disagree. We've all done clever things at one time or
    > >another that when we went back to them later were far from clear. If you
    > >have ever used APL I can guarantee it ;)
    > >
    > >> Readable code for even a complete newbie programmer is total black
    > >> magic to the average lay person.

    > >
    > >Yes, but so what? If it is necessary to optimize a sequence to fit it
    > >within tight constraints then sufficient supporting comments must be
    > >added to make it clear what is being done and why even to someone who is
    > >encountering it for the first time. Basic knowledge of the
    > >implementation language and external HW can probably be assumed but when
    > >you start relying on multiple side effects or delay testing a flag for
    > >several instructions you had better warn the unwary reader of the traps
    > >that lie in the code. I don't expect I get this right all the time
    > >either but I do try.

    >
    > What I mean is that when confronted with a section of code for the
    > first time, it might be quite obscure. If this sequence of code is the
    > optimal solution to a specific problem, and many programmers end up
    > using this sequence, then it becomes "Human Readable" by the mere fact
    > that it is used often, by many people in a well defined context.


    Again we will have to disagree on this. Once an implementation goes
    beyond the straightforward it becomes obscure. It may be completely
    transparent while you are working on it, but once you leave it for 6
    months or a year it will no longer be so. I still find myself at least
    occasionally beefing up the comments on code when I revisit it later.

    As far as a frequently used sequence being clear, it seems to me that
    more than once I've read code with a comment that goes something like
    "For some reason everyone does this, I don't know why it works but it
    does". Frequently used obscure code appears to also have a tendency to
    devolve into magic incantations ;)

    > For someone used only to high level code, simple basic assembly can be
    > quite obscure and not readable at all. What is obscure to a beginner
    > might actually be quite clear to a more experienced person. As in
    > most things the difference is not Black/White and exactly where the
    > line lies is open to debate.


    True enough, but there is a difference between an implementation being
    obscure because you are not familiar with the language, processor or
    hardware and it being obscure because you are using non-straightforward
    techniques. It's this latter that demands special care and attention,
    particularly in making sure that whoever follows can figure out what is
    going on without needing to spend a lot of time re-inventing the
    solution. Sometimes that involves a short note, sometimes a long
    description and sometimes a reference to a discussion in a lab book or
    paper.

    An example might be an FFT. Any implementation of that should either
    contain a full description or better a reference to a full description.
    In this case the reference would be better since the explanation is almost
    certainly more complete than one any of us are likely to have the
    patience to complete to accompany the code.

    Robert
     
    R Adsett, Jun 18, 2005
    #16
  17. David R Brooks

    R Adsett Guest

    In article <d8tsve$52v$>,
    says...
    > Knuth also gave two rules of optimization:
    >
    > 1) Don't do it.
    > 2) (For experts only) Don't do it yet.
    >
    >
    > Readability goes hand-in-hand with correctness in priority - code that
    > is unreadable is unlikely to be correct, and even less likely to be
    > checked to be correct (either by testing or proof). Getting optimal
    > code, or at least close to optimal, involves two things - thinking about
    > your code as you write it, and using a good compiler. The biggest
    > difference to the speed and size of code is made when thinking about
    > what your code should do (i.e., at the algorithmic stage), then by
    > thinking about your implementation (e.g., using integers instead of
    > floats, and understanding how your code will fit with the target's
    > capabilities). Small things, such as when to use arrays and when to use
    > pointers, are best left to the compiler if it has a good optimizer.


    Well put.

    Robert
     
    R Adsett, Jun 18, 2005
    #17
  18. David R Brooks

    R Adsett Guest

    In article <1119002425.b7ad46355833ac9e63bd16368286f661@teranews>,
    says...
    > On Fri, 17 Jun 2005 09:20:04 +0200, David Brown
    > <> wrote:
    >
    > >R Adsett wrote:
    > >> In article <1118916402.66792381a7012eb59f1dd4438ce355a4@teranews>,
    > >> says...
    > >>
    > >>>On Tue, 14 Jun 2005 22:49:54 -0400, R Adsett
    > >>><> wrote:
    > >> To quote Knuth "Premature optimization is the root of all evil". I think
    > >> that was Knuth anyway. Clear, fast enough and small enough are good for
    > >> me. No need to go to the trouble of as small as possible or as fast as
    > >> possible in most cases.
    > >>

    > >
    > >Knuth also gave two rules of optimization:
    > >
    > >1) Don't do it.
    > >2) (For experts only) Don't do it yet.
    > >

    >
    > The problem is that many programmers seem to understand this as: "Write
    > code as sloppily as possible, and do not even think about whether the
    > current approach is easy or difficult for the processor."
    >
    > The worst I have seen was an application where a set of different
    > routines was called based on configuration data. There was an array of
    > function pointers, and obviously for readability they decided that
    > calling function[3](args) is not clear. Calling something like
    > function[FOO](args) is a lot clearer where FOO gives some indication
    > of what the function does. Instead of just having a #defined or
    > enumerated list, the programmers put the names in a character array.
    > They then had a function that did a string compare on the character
    > array every time it was called to decide which function pointer in the
    > array to take. Needless to say this was VERY slow. A very simple
    > optimization sped the code up by more than 100 times.
    >
    > Of course it is debatable whether this was optimization or fixing
    > incorrect code.


    Well there is a difference between badly designed code and un-optimized
    code :). OTOH this might have made sense if this was part of a command
    parser or interpreter. In a command parser the lookup might have been
    small compared to the typing time ;). In a straightforward case of
    substituting different run-time routines it seems unnecessarily complex
    though.
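
    One possible middle ground, sketched here under assumptions of my own
    (the command table, struct config and all function names are
    hypothetical): do the string lookup once, when the command or
    configuration entry is read, and cache the resulting function pointer
    so the routine that runs repeatedly never touches a string.

    ```c
    #include <stddef.h>
    #include <string.h>

    typedef int (*routine_fn)(int arg);

    /* Hypothetical run-time routines. */
    static int routine_double(int arg) { return arg * 2; }
    static int routine_negate(int arg) { return -arg; }

    struct command {
        const char *name;
        routine_fn  fn;
    };

    static const struct command commands[] = {
        { "double", routine_double },
        { "negate", routine_negate },
    };

    /* String search: acceptable here because it runs once per command
       or configuration entry, not once per call. */
    static routine_fn resolve_command(const char *name)
    {
        size_t i;
        for (i = 0; i < sizeof commands / sizeof commands[0]; i++) {
            if (strcmp(commands[i].name, name) == 0)
                return commands[i].fn;
        }
        return NULL; /* unknown name */
    }

    struct config {
        routine_fn active; /* cached result of the lookup */
    };

    /* Returns 0 on success, -1 if the name is not recognised
       (in which case the previous selection is left unchanged). */
    static int config_select(struct config *cfg, const char *name)
    {
        routine_fn fn = resolve_command(name);
        if (fn == NULL)
            return -1;
        cfg->active = fn;
        return 0;
    }

    /* Hot path: no string handling, just an indirect call. */
    static int config_run(const struct config *cfg, int arg)
    {
        return cfg->active(arg);
    }
    ```

    config_run() assumes a successful config_select() has happened first;
    a real implementation would want to guard against an unset pointer.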
     
    R Adsett, Jun 18, 2005
    #18