
Question on PCI-express versus Standard PCI performance

Discussion in 'Embedded' started by Benjamin Couillard, Jul 25, 2011.

  1. Hi everyone,

    I'm working on a conversion project where we needed to convert a PCI
    acquisition card to a PCI-express (x1) acquisition card. The design
    is essentially the same, except that the new acquisition card is a
    PCI-express endpoint instead of a standard-PCI endpoint. The project
    is implemented on a Xilinx FPGA, but I don't think my issue is
    Xilinx-specific.

    The conversion has worked fine on all levels except one. The read
    latency of PCI-express is about 4 times higher than standard PCI.
    For example, on the old product it takes about 0.9 us to perform a
    1-DWORD read. With the PCI-express product it takes about 3-4 us to
    perform a 1-DWORD read. I've seen this read latency both in real
    life (with a real board) and in VHDL simulation, so I don't think
    this is a driver issue. Have any of you experienced similar
    performance issues?

    Don't get me wrong, for me PCI-express is a major step ahead; the
    write-burst and read-burst performance is way better than standard
    PCI. Perhaps that is the reason: since most PCI-express cards are
    used mostly for burst transactions, the read latency does not really
    matter, so some read latency was sacrificed in order to obtain
    better burst performance.

    Best regards
     
    Benjamin Couillard, Jul 25, 2011
    #1
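
For reference, here is a minimal sketch of one way such a single-DWORD read
latency can be measured from Linux user space, by mapping the card's BAR0
through sysfs and timing volatile 32-bit reads. The device address, register
offset and iteration count are placeholders, not taken from the poster's
setup, and this is not necessarily how the figures above were obtained.

/* Sketch: time single-DWORD reads of a memory-mapped PCI(e) BAR register.
 * The sysfs path and register offset below are hypothetical placeholders. */
#define _POSIX_C_SOURCE 199309L
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define BAR_PATH   "/sys/bus/pci/devices/0000:03:00.0/resource0"  /* placeholder */
#define REG_OFFSET 0x0u                                           /* placeholder */
#define ITERATIONS 100000

int main(void)
{
    int fd = open(BAR_PATH, O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return EXIT_FAILURE; }

    void *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return EXIT_FAILURE; }
    volatile uint32_t *reg = (volatile uint32_t *)((char *)map + REG_OFFSET);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERATIONS; i++)
        (void)*reg;                        /* one 32-bit (DWORD) read per pass */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("average 1-DWORD read latency: %.1f ns\n", ns / ITERATIONS);

    munmap(map, 4096);
    close(fd);
    return EXIT_SUCCESS;
}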


  2. One lane PCIe 1.x should be able to turn a word read around in about
    250ns assuming not too much else is going on. Of course an excessive
    number of switches (or slow switches) or slow hardware on either end
    are obviously possible issues. But PCIe is certainly much faster than
    3-4us to read a word.
     
    Robert Wessel, Jul 25, 2011
    #2
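
As a rough illustration of where a figure in that range comes from, the wire
time alone for a 1-DWORD read on a x1 Gen 1 link can be estimated as below.
The per-TLP overhead of about 8 bytes (framing, sequence number, LCRC) is an
approximation, and processing time in the endpoint, switches and root complex
comes on top of this.

/* Rough back-of-envelope: serialization time of a 1-DWORD read round trip
 * on a x1 PCIe 1.x link.  Overhead byte counts are approximations and no
 * device, switch or root-complex processing time is included. */
#include <stdio.h>

int main(void)
{
    const double lane_bytes_per_s = 2.5e9 / 10.0; /* 2.5 GT/s, 8b/10b -> 250 MB/s */

    const double mrd_bytes  = 12 + 8;      /* MRd: 3-DW header + ~8 B framing/LCRC */
    const double cpld_bytes = 12 + 4 + 8;  /* CplD: 3-DW header + 1-DW data + ~8 B */

    double t_req = mrd_bytes  / lane_bytes_per_s * 1e9;
    double t_cpl = cpld_bytes / lane_bytes_per_s * 1e9;

    printf("request serialization:    %.0f ns\n", t_req);
    printf("completion serialization: %.0f ns\n", t_cpl);
    printf("wire time, round trip:    %.0f ns\n", t_req + t_cpl);
    return 0;
}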

  3. I have no actual experience of experimenting with this; however, I
    have been interested in a latency-sensitive device that may
    potentially use PCI-E, so I have been looking around for answers.

    Have a look at this write-up comparing HyperTransport and PCI-E.
    The authors claim around 250 nanoseconds (page 9) to read the first
    byte:

    http://www.hypertransport.org/docs/wp/Low_Latency_Final.pdf

    It would be interesting to hear what is causing you to see 3-4 us.
    That would kill off my potential project, so I am hoping to be able
    to match the results in the above paper.

    Could there be some inaccuracy in your measurements? How do you
    measure the latency?

    Rupert
     
    rupertlssmith, Jul 26, 2011
    #3
  4. When designing with PCI or PCIe you should really try to avoid reads
    as much as possible.
    What do you need it for anyway? In a multitasking operating system
    you are going to have microseconds of jitter on the software side in
    kernel mode and tens of milliseconds in user mode anyway. So I am
    wondering what the scenario is that benefits from sub-us latency for
    software reads?

    Kolja
    cronologic.de
     
    Kolja Sulimma, Jul 26, 2011
    #4
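
To illustrate the point about avoiding reads, the fragment below contrasts
the two access patterns: reading a status register over the link (one
non-posted round trip per call) versus polling a word in host memory that the
device updates with posted writes or DMA. All names are illustrative only; in
a real Linux driver the host-memory word would come from something like
dma_alloc_coherent().

/* Sketch of the two access patterns; all names are illustrative. */
#include <stdint.h>
#include <stdio.h>

/* Slow path: every call is a full non-posted PCIe read round trip. */
static inline uint32_t read_status_over_link(volatile uint32_t *bar_status_reg)
{
    return *bar_status_reg;
}

/* Faster path: the device pushes its status into a word in host memory
 * (a posted write / DMA), and the host only polls local memory. */
static inline uint32_t wait_for_status_in_host_mem(volatile uint32_t *host_status_word,
                                                   uint32_t done_mask)
{
    uint32_t v;
    do {
        v = *host_status_word;   /* hits host RAM/cache, not the PCIe link */
    } while ((v & done_mask) == 0);
    return v;
}

int main(void)
{
    /* Stand-in for the host-memory status word; pretend the device has
     * already DMA'd 0x1 into it so the example terminates immediately. */
    volatile uint32_t fake_status = 0x1;
    printf("status = 0x%x\n", wait_for_status_in_host_mem(&fake_status, 0x1));
    return 0;
}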
  5. Generally speaking, PCI Express is much more prone to latency than
    conventional PCI because packets have to be constructed, passed
    through a structure of nodes, and checked at most levels. Data
    checking, and onward transmission, isn't completed until the last
    data arrives and the CRCs are checked.

    If you do a "read", this involves one packet going out and one
    coming back, so it is doubly worse. If you can, do a DMA-like
    operation where data is sent from the data source and your system is
    then interrupted to use the data in memory.

    The latency will also vary from system to system because routing
    structures differ between motherboards. The amount of other things
    going on will also affect latency as different things contend for
    the data pipes. Generally speaking, if you are trying to do anything
    real time it is something of a nightmare if you are planning on
    using the host motherboard's processor for control functions.

    You can try to make the latency smaller by using smaller packet
    sizes, and this sometimes helps. Ultimately, if there is a real-time
    element to this, then putting the processing and/or control on your
    card is probably best for performance and accuracy.

    John Adair
    Home of Raggedstone2. The Spartan-6 PCIe Development Board.
     
    John Adair, Jul 26, 2011
    #5
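
As a rough illustration of the packet-size point: each hop can only check the
LCRC once the last symbol of a TLP has arrived, so store-and-forward delay
grows with the payload. The hop count and overhead bytes below are assumptions
for illustration, not figures for any particular motherboard.

/* Quick calculation: store-and-forward delay per hop for a posted write TLP
 * on a x1 PCIe 1.x link, for a few payload sizes.  Hop count and per-TLP
 * overhead are assumed values. */
#include <stdio.h>

int main(void)
{
    const double lane_bytes_per_s = 2.5e9 / 10.0;  /* x1 PCIe 1.x after 8b/10b */
    const int    overhead_bytes   = 12 + 8;        /* 3-DW header + framing/LCRC */
    const int    hops             = 2;             /* e.g. one switch + root complex */
    const int    payloads[]       = { 64, 128, 256, 512 };

    for (size_t i = 0; i < sizeof payloads / sizeof payloads[0]; i++) {
        double per_hop = (payloads[i] + overhead_bytes) / lane_bytes_per_s * 1e9;
        printf("payload %4d B: ~%5.0f ns per hop, ~%5.0f ns over %d store-and-forward hops\n",
               payloads[i], per_hop, per_hop * hops, hops);
    }
    return 0;
}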
  6. In the paper I posted a link to, I think the times are for an
    interrupt, or for DMA, not a software initiated "read". Thanks for
    explaining the difference.

    Rupert
     
    rupertlssmith, Jul 26, 2011
    #6
  7. Is it possible that time-stamping the data would disconnect you
    somewhat from the latency problem?
    Usually data can't be processed and presented in real time at those
    speeds anyway.
     
    Morten Leikvoll, Jul 26, 2011
    #7
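
A small sketch of that idea, with assumed field widths and an assumed counter
rate: if the acquisition hardware latches a free-running counter into each
sample, the host can reconstruct the exact sample timing later, and the PCIe
read/DMA latency only delays when the data is seen, not when it was taken.

/* Sketch: samples carry a hardware timestamp latched at capture time, so
 * host-side transfer latency does not affect timing accuracy.  Field widths
 * and the counter rate are assumptions. */
#include <stdint.h>
#include <stdio.h>

struct acq_sample {
    uint64_t hw_timestamp;   /* free-running counter latched by the card at capture */
    uint32_t channel;
    uint32_t value;          /* the acquired DWORD */
};

int main(void)
{
    /* Pretend these arrived late via DMA; the timestamps still order them exactly. */
    struct acq_sample s[2] = {
        { 1000, 0, 0x1234 },
        { 1250, 0, 0x5678 },
    };

    const double tick_ns = 8.0;   /* e.g. a 125 MHz counter on the card (assumption) */
    printf("sample spacing: %.0f ns\n",
           (s[1].hw_timestamp - s[0].hw_timestamp) * tick_ns);
    return 0;
}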
