Hello,
Without actually having done any real GPGPU programming or performance
benchmarking yet, I am beginning to see lots of possibilities for
massively parallel algorithms. (Which I will soon give a try on a GPU!)
Since I have a DX9 graphics card I will be focusing on this hardware
architecture for now...
Which mostly means "read from anywhere, write to where you are, no locking,
massively parallel".
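To make that model a bit more concrete, here is a minimal CPU-side sketch in
Python. It is only an illustration of the idea, not real shader code: the
names run_pass and blur3 are made up for this example. Each "kernel" call may
read any input element, but its result is stored only at its own output
position, which is why no locking is needed.

```python
def run_pass(src, kernel):
    """Simulate one pass: call the kernel once per output position.

    The kernel may read any element of src (gather), but its return value
    is stored only at its own index i, so calls never conflict.
    """
    return [kernel(src, i) for i in range(len(src))]


def blur3(src, i):
    # "Read from anywhere": here the two neighbours, clamped at the edges.
    left = src[max(i - 1, 0)]
    right = src[min(i + 1, len(src) - 1)]
    # "Write to where you are": the caller stores this value at index i.
    return (left + src[i] + right) / 3.0


data = [0.0, 3.0, 6.0, 9.0]
result = run_pass(data, blur3)
print(result)
```

Because every call is independent, the calls could in principle all run at
the same time; the list comprehension just does them one after another.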
However maybe NVIDIA abandoned the DX9 hardware architecture and is
continuing with the DX10/DX11 architecture only ?!?
Which makes me a bit nervous... I vaguely remember reading somewhere that
for some things DX10/DX11 might actually be slower ?!?
Lately I also read something about "banking conflicts for memory (?)" <-
that does not sound good ?! Not sure if this applies to DX9 or DX10 or both
kinds of hardware.
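As far as I understand it, these are usually called "bank conflicts" and
concern the on-chip shared memory of CUDA-class hardware, which is split into
banks; DX9 shaders have no programmer-visible shared memory of that kind. The
sketch below is a simplified model only: the bank count of 16, the
word-index-modulo-banks mapping and the names conflict_degree and strided are
all assumptions for illustration, not a description of any specific chip.

```python
from collections import defaultdict


def conflict_degree(addresses, num_banks=16):
    """Return the worst-case number of distinct words hitting one bank.

    1 means conflict-free; N means the accesses to the busiest bank would
    have to be served in N steps in this simplified model.
    """
    banks = defaultdict(set)
    for addr in addresses:
        banks[addr % num_banks].add(addr)  # assumed mapping: word index mod banks
    return max(len(words) for words in banks.values())


def strided(stride, threads=16):
    # Word index read by each of `threads` threads for a given stride.
    return [t * stride for t in range(threads)]


for stride in (1, 2, 3, 16):
    print(stride, conflict_degree(strided(stride)))
```

In this model a stride of 1 or 3 is conflict-free, while a stride of 16 sends
every thread to the same bank.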
For now my advice to NVIDIA would be: "don't throw away the chicken that
lays the golden eggs"
So far DX9 graphics cards have proven to sell well I presume... so it could
be smart for NVIDIA to keep betting on two horses:
1. Develop DX9 hardware further... and maybe its architecture too.
2. Develop the DX10/DX11 architecture and hardware further in case I am a
clueless newbie and it's totally great !
For the DX10/DX11 architecture I would get worried that they start to "dumb
it down" too much and add possibly silly things like locking and asynchronous
execution... which might just prove to be bad... and result in many many
many conflicts slowing the hardware down.
If that's true then my advice would be:
1. Focus on massively parallel execution first: get rid of banking conflicts,
add more cores, more processors, more memory lanes, whatever it takes to make
it massively parallel first.
For algorithms, doing some unnecessary work might prove to be not that bad...
Suppose a chip becomes 1,000,000 times faster (by executing 1,000,000 times
more things in parallel) and only 50% of that work is usable, then that
still means a speed-up of 500,000 over the previous generation.
There are still quite a lot of software and algorithms out there that run on
single core processors and that could be turned into parallel algorithms...
a speed-up of 500,000 would therefore be very attractive !
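The arithmetic behind that estimate, as a tiny Python sketch. The function
name effective_speedup is just made up for this example, and it assumes the
wasted work costs nothing beyond the time it occupies.

```python
def effective_speedup(parallel_factor, useful_fraction):
    """Speed-up that remains when only part of the parallel work is useful."""
    return parallel_factor * useful_fraction


speedup = effective_speedup(1_000_000, 0.5)
print(speedup)
```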
For now this means doubling every two to three months should still be good:
2x 4x 8x 16x 32x 64x 128x 256x 512x 1024x 2048x 4096x etc.
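To get a feel for how long such doubling would take to reach a speed-up of
500,000, here is a rough Python sketch. It simply assumes a perfect doubling
every two or three months, which is my own wishful figure and not a
measurement; the function names are made up for this example.

```python
import math


def doublings_needed(target):
    """Smallest number of doublings n such that 2**n >= target."""
    return math.ceil(math.log2(target))


def months_needed(target, months_per_doubling):
    # Assumes every doubling takes the same fixed number of months.
    return doublings_needed(target) * months_per_doubling


factors = [2 ** n for n in range(1, 13)]  # 2x, 4x, ... up to 4096x
print(factors)
print(doublings_needed(500_000))
print(months_needed(500_000, 2), months_needed(500_000, 3))
```

So under this assumption it takes 19 doublings, which is somewhere between 38
and 57 months.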
Keep on trying to double the massiveness of it and also try to keep down the
heat...
Also one last tip: try to increase the texture limit from 4096x4096 to
something much larger, also try to increase the viewport size to something
much larger, and also keep adding more texture coordinate inputs and outputs
for vertex/pixel shaders... GPGPU programming, and maybe even graphics
programming, could benefit from that to keep more "state" between passes and
such. (Not totally sure about this last tip, but give it a try and see what
happens, methinks.)
Seems like a good plan to me...
What are your thoughts on DX9 hardware/architecture vs DX10
hardware/architecture when it comes to speed and future performance
predictions ?!?
Also any tips for getting the maximum performance out of DX9 cards are
welcome ?!
What would be common pitfalls for DX9 graphics cards ?!
Also I hope NVIDIA will keep providing enough information for the DX9
graphics cards out there.... I understand they want to sell new cards... but
we programmers want to program software preferably just once and get maximum
performance on all cards... not just the latest and greatest generation
of hardware.
Parallel programming might be hot and all the craze right now... but what
happens if the architecture keeps changing and changing and the APIs keep
changing and changing... and the languages keep changing and changing...
could this be bad for software applications ?!?!
Try to imagine the situation for the software developer: "You must have the
latest and greatest cards", he says to end users... because he uses the
latest and greatest tools/APIs etc... Then if backwards compatibility is not
supported he has no way of saying: "yes, this software will work on future
cards as well"... of course it will not, because of the lack of backwards
compatibility.
How many times will programmers invest in re-writing algorithms/software etc
?!?
This could become NVIDIA's Achilles' heel... and this is where Intel might
overtake NVIDIA if NVIDIA is not careful and serious about backwards
compatibility...
Intel/x86 is known for backwards compatibility... old software usually still
runs on the latest processors... and also the instruction set isn't
radically changed, which means software efforts pay off as well.
Something to keep in mind for NVIDIA... Intel might be lurking just around
the corner... trying to snatch software developers away from your precious
graphics cards and technology !
Maybe NVIDIA gets lucky and programmers will develop software for both
technologies... but you wanna bet the company on luck ?!
Lastly there is "DirectCompute/OpenCL" etc... this is supposed to be a
"unifying" platform...
But do you really believe that is achievable ?!? At what price will that be
achievable ?!?
Will performance start to suffer ?!
For big companies with lots of support a "shotgun" strategy might be best.
Focus on developing different products with different support for different
technologies.
Meaning:
1. Have one hardware product line which focuses on maximum performance,
everything else is secondary.
2. Have one hardware product line which focuses on making parallel
performance/computing more accessible to "less skilled programmers and
programmers with less time". This architecture might not have to be
backwards compatible for now because it's still in an experimental phase.
The focus should be on "ease of use".
3. Have one hardware product line which focuses on backwards compatibility
for parallel computing... this might be a more expensive solution/product
line with more resources spent on supporting different and older
technologies.
Speed would be a bit slower, ease of use could be high as well, and it would
offer confidence for the future as well: "return on investment" for software
efforts !
Ultimately the company with the most resources being able to develop all
three product lines might be the company taking the most market share for
all kinds of customers !
Smaller companies might have to keep focussing on one of the three.
Option 3 might be overkill for a small company: supporting many different
technologies might eat too much into the company's resources and this might
delay new products from coming out.
Option 2 is highly experimental and might fail, not a wise decision.
Option 1 is best, stick with what is known to work and scale it up (fast) !
So option 1 means: Keep running away from the competitors... make sure your
products are the fastest ! So the competitor is always playing "catch up".


Bye
Skybuck =D