A New Way to Talk about GPUs (ExtremeTech on ATI's R520 and Nvidia G70)

Discussion in 'ATI' started by Guest, Aug 1, 2005.

  1. Guest


    A New Way to Talk about GPUs
    By Jason Cross

    "I was perusing the forums over at Beyond 3D, marveling at the absolute
    guesswork and rumor-mongering going on about ATI's upcoming R520 graphics
    chip. If the forum's prognosticators are to be believed, it will run at
    around 700MHz, or nowhere close to that. It'll have 32 pipelines, or 24, or
    20, or 16. It'll have some sort of unified shader architecture, though
    nothing like the GPU in the Xbox 360, or it will have a completely
    traditional architecture. It will enable AA together with HDR (a current
    sore spot for Nvidia's cards), or that's just totally unfeasible.

    The random guesswork isn't really a surprise. ATI has been incredibly quiet
    and secretive about its next major GPU architecture: All we really know for
    sure is that it will be Shader Model 3.0-compliant, several revisions have
    taped out by now, and some version of it was running the impressive Alan
    Wake demo at E3 this past May. Some sites post new rumors every week,
    usually contradicting the rumors from the week before."

    "What really struck me is how the fans of 3D graphics are sticking hard and
    fast to a certain way of looking at GPUs. They discuss everything in terms
    of "pipelines," with some even going so far as to say that the GeForce 7800
    GTX isn't a "true" 24-pipeline chip because it only has 16 raster operation
    units (ROPs), and can therefore only really draw 16 pixels per clock, max.
    I've spoken with both ATI and Nvidia on the subject, and they both say 16
    ROPs is plenty. The truth is, the more-advanced 3D games are so limited by
    shader operation speed and texture fetching that the GPU is drawing nowhere
    near 16 pixels or samples per clock, they say. I was told by one engineer
    that the performance benefit of moving from 16 to 24 ROPs would be less than
    5%, but it would come at a considerable cost in transistor count.

    In the grand old days of just three or four years ago, even the most
    advanced 3D engines basically just layered a few textures on top of each
    other with simple blending modes. Every now and then a pixel shader would be
    used to make the water look like bumpy Mylar, but beyond that, shaders were
    mostly used to perform more of these texture blends at once. It was
    appropriate to talk about GPU performance by counting pipelines and how many
    pixels or samples could be drawn per second. You had your fill rate, your
    clock speed, your memory bandwidth, and that was enough.

    The world is changing rapidly. Games that use DirectX 9 level shaders,
    either Shader Model 2.0 or 3.0, are tricky. Some shaders use floating-point
    math, some integer math. The math required to draw a single pixel is
    increasing, not just on spot areas like bumpy and shiny water, but on
    virtually every pixel in the game. And it's not just blending together some
    textures, either. "Data textures" like normal maps or gloss maps are used to
    feed comparatively complex calculations to determine the final color of a
    pixel. Compared with the number of pixels a GPU will effectively output per
    clock cycle, a whole lot of math is going on, and the number of texture
    fetches is going up, too.

    We need a new way to talk about GPUs. Pipelines, clock rates, and fill rate
    were a useful shorthand a couple of years ago, but that's no longer the
    case. What do we do when the same shader units that perform pixel shading
    operations are used for vertex shading operations? What do we do when the
    arithmetic logic units (ALUs) aren't organized into neat little "pipelines"
    or even quads anymore? How do we account for the fact that not all ALUs are
    created equal: some can perform more operations per cycle than others, and
    different GPUs may have ALUs that perform operations of different types? How
    do we account for the increasing value of on-chip caches?

    Before long, the performance of GPUs may hinge on some of the same features
    that make for a good desktop CPU, things like out-of-order instruction
    processing, translation lookaside buffers, or data prefetching logic.

    What do you think the most important metric of next-generation GPUs will be?
    And what simple, understandable terms should we use to compare them?"

    Guest, Aug 1, 2005

  2. I heard that in two generations of cards down the road, they're going to add
    the P.O.F.F. technology. Of course it'll mean an extra fan in your case, but
    hey, it's worth it! It sure will be nice to smell burning rubber in games
    like Need for Speed! P.O.F.F. = potential odor fragrance fan. *snickers*

    /\/\UF/-\S/-\, Aug 2, 2005
