Latency is the time taken for a message to travel from its source to its destination, including all overheads. In networking terms, this would be: sender overhead + time of flight + transmission time + receiver overhead.
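The sum above can be sketched as a small calculation. This is only an illustration: the function name `message_latency`, its parameters, and the example figures are assumptions, as is the decomposition of time of flight into distance over propagation speed and of transmission time into message size over bandwidth.

```python
def message_latency(sender_overhead_s, receiver_overhead_s,
                    distance_m, propagation_speed_m_per_s,
                    message_bits, bandwidth_bits_per_s):
    """Return total latency in seconds as the sum of its four components."""
    # Time of flight: how long the first bit takes to cross the link.
    time_of_flight_s = distance_m / propagation_speed_m_per_s
    # Transmission time: how long it takes to push the whole message onto the link.
    transmission_time_s = message_bits / bandwidth_bits_per_s
    return (sender_overhead_s + time_of_flight_s
            + transmission_time_s + receiver_overhead_s)


# Illustrative figures only: 1 microsecond of overhead at each end,
# a 10 m cable at 2e8 m/s, and a 64-byte message on a 10 Gbit/s link.
example_latency_s = message_latency(
    sender_overhead_s=1e-6,
    receiver_overhead_s=1e-6,
    distance_m=10,
    propagation_speed_m_per_s=2e8,
    message_bits=64 * 8,
    bandwidth_bits_per_s=10e9,
)
print(f"{example_latency_s * 1e6:.4f} microseconds")
```

With these assumed figures the two overhead terms account for most of the total, which is why small-message latency is so sensitive to software overheads.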
Reported latencies are usually obtained from a half round-trip measurement: the total elapsed time to send a small message back and forth between two endpoints is measured, averaged over the number of iterations, and divided by 2. Half round-trip latency is important where application performance depends on the exchange of small messages, particularly in the parallel compute arena.
However, other applications, such as streaming media and the stock-update processing performed within the financial services community, depend rather more critically on the performance of processing streams of messages. For these applications, one-way latency is a more useful measurement. Here the total elapsed time to transfer a large number of messages from a sender to a receiver is measured and averaged over the total number of messages.
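A one-way streaming measurement can be sketched in the same spirit. This version again assumes a local `socket.socketpair()`, with a receiver thread in the same process so that both ends share one clock; between two real hosts the clocks would have to be synchronized or the receiver would have to signal completion. The name `one_way_latency` is hypothetical.

```python
import socket
import threading
import time


def one_way_latency(messages=10000, message_size=8):
    """Stream messages one way and return the mean time per message in seconds."""
    sender, receiver = socket.socketpair()
    total_bytes = messages * message_size
    finish = {}

    def drain():
        # Consume the stream and record when the last byte arrives.
        received = 0
        while received < total_bytes:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            received += len(chunk)
        finish["time"] = time.perf_counter()

    worker = threading.Thread(target=drain)
    worker.start()

    payload = b"x" * message_size
    start = time.perf_counter()
    for _ in range(messages):
        sender.sendall(payload)  # no reply is awaited between messages
    worker.join()

    sender.close()
    receiver.close()
    # Total elapsed time averaged over the number of messages.
    return (finish["time"] - start) / messages


if __name__ == "__main__":
    print(f"{one_way_latency() * 1e6:.2f} microseconds per message")
```

Because the sender never waits for a reply, many messages are in flight at once and the per-message overheads can overlap.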
At a glance, the reader might think that one-way latency and half round-trip latency should be equal. However, this is not so, because the system overheads are very different when processing streams of messages compared with a single message. It's therefore a shame that one-way latency is often not reported in discussions of interconnect performance.
Recently, latency has been used synonymously with the other attributes (by my definition) of QoS, namely jitter and bandwidth. For example, as data sets have grown, the performance of some applications has increasingly become bottlenecked by the time taken to transfer this data back and forth. This time is actually dominated by bandwidth, but the problem is often reported as data latency.
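A rough calculation shows why large transfers are a bandwidth problem rather than a latency problem. The function name `transfer_time` and the figures used (a 1 GB data set, a 10 Gbit/s link, 2 microseconds of latency) are illustrative assumptions, and the model ignores protocol overheads.

```python
def transfer_time(data_bytes, bandwidth_bits_per_s, latency_s):
    """Return (total, bandwidth_part) in seconds for moving data_bytes one way."""
    # Time spent simply pushing the bits through the link.
    bandwidth_part_s = data_bytes * 8 / bandwidth_bits_per_s
    # Latency is paid once; the bandwidth term grows with the data size.
    return latency_s + bandwidth_part_s, bandwidth_part_s


# Illustrative figures only.
total_s, bandwidth_part_s = transfer_time(
    data_bytes=1e9, bandwidth_bits_per_s=10e9, latency_s=2e-6)
bandwidth_share = bandwidth_part_s / total_s
print(f"total {total_s:.6f} s, bandwidth share {bandwidth_share:.6%}")
```

Under these assumptions the latency term is a few millionths of the total, so halving latency would change almost nothing, whereas doubling bandwidth would nearly halve the transfer time.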
Similarly, jitter (a measure of the variance of message transfer times) has been termed application latency. A good example of an application where jitter is a critical metric is the stock ticker feed. Put simply, any human-perceivable jitter in the feed will cause traders to lose confidence in their information source, so it must be prevented at all costs.
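Taking jitter as the variance of transfer times, it can be computed directly from a set of samples. The sketch below uses made-up sample values and a hypothetical function name `jitter`; it also returns the standard deviation, since that has the same units as the samples.

```python
import statistics


def jitter(transfer_times):
    """Return (variance, standard deviation) of a sequence of transfer times."""
    variance = statistics.pvariance(transfer_times)
    std_dev = statistics.pstdev(transfer_times)
    return variance, std_dev


# Made-up samples in microseconds: both feeds have the same mean latency.
steady_feed = [10.0, 10.0, 10.0, 10.0]
uneven_feed = [5.0, 15.0, 5.0, 15.0]

print(statistics.mean(steady_feed), jitter(steady_feed))
print(statistics.mean(uneven_feed), jitter(uneven_feed))
```

The two feeds are indistinguishable by average latency, yet only the second would be perceived as uneven, which is why jitter deserves to be reported as a metric in its own right.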
The low-jitter property (rather than low latency) of interconnects such as InfiniBand is one of the main reasons that they have gained momentum in the financial services community. Ironically, though, much of the jitter seen on other networks has been caused by OS interactions, particularly between the scheduler and the network stack. Recent work in this area has done much to reduce these issues, and it is now possible to achieve low-jitter operation using classic Ethernet networks.