NOTE: this is only a technical report, not a full study of why we need latency measurements, how locality works, etc.

Technical report: Measuring peer latency in P2P networks using passive TCP RTT estimation

Abstract

Most latency estimation techniques are based on active measurements. Such measurements can be quite accurate but are often not feasible, either because of technical problems or because of the high load they generate. Passive measurements, on the other hand, are very cheap: they generate no additional network traffic, do not require both peers to cooperate, and do not require significant resources on either side of the measurement.

In this study we check how passive TCP RTT measurement compares to active measurements such as pings and traceroutes.

Passive vs. active measurements

When measuring the latency between two peers one can use either active or passive techniques.

Active measurements are conducted by generating traffic between two peers that would not be sent otherwise. A good example of an active measurement is an ICMP ping: the sender sends a small ICMP packet to the receiver; the receiver sends the packet back; the sender calculates the time between sending the packet and receiving the reply (the round-trip time, rtt). The measurement is considered active because the main purpose of sending the packet is to measure the rtt between the two hosts.

Passive measurements are done simply by observing normal peer interactions. For latency, instead of measuring the rtt of special ICMP packets, the rtt of normal packets is used: a passive measurement watches the traffic exchanged between two peers and tries to calculate the rtt from the observed activity. An example: if during normal interaction between peers A and B a packet is sent from A to B that B should answer immediately (according to the protocol, for example during the protocol handshake), one can measure the time between sending the packet and receiving the response. The measurement is very similar to the active one, with one exception: it is only a side effect of normal protocol interaction.

Active measurements have several advantages over passive measurements: we have full control over the time of the measurement, little interaction (apart from the measurement itself) is required from the second host, and the processing time on the receiving host is smaller (making the rtt estimate more accurate).
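The handshake example can be sketched in C++17 as follows. This is only an illustration: the class HandshakeTimer and its method names are hypothetical and are not part of libtorrent or of the measurement setup used in this report. Note that a sample obtained this way also contains the processing time of the answering peer.

```cpp
#include <chrono>
#include <optional>

// Tracks one outstanding request that the peer is expected to answer
// immediately (e.g. a protocol handshake) and derives an rtt sample from it.
// Hypothetical helper, written only to illustrate the idea.
class HandshakeTimer {
public:
    using Clock = std::chrono::steady_clock;

    // Call when the request is handed to the socket.
    void on_request_sent(Clock::time_point now = Clock::now()) { sent_at_ = now; }

    // Call when the matching response arrives. Returns the elapsed time in
    // seconds, or std::nullopt if no request was outstanding.
    std::optional<double> on_response_received(Clock::time_point now = Clock::now()) {
        if (!sent_at_) return std::nullopt;
        const std::chrono::duration<double> elapsed = now - *sent_at_;
        sent_at_.reset();
        return elapsed.count();
    }

private:
    std::optional<Clock::time_point> sent_at_;
};
```

The timer would be called from the code paths that already send the handshake and read the reply, so no extra packet is ever sent.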

There are two important disadvantages of active measurements:

  1. the receiving host has to cooperate with the sending host (i.e., be willing to send the ICMP packet back)
  2. additional network traffic is generated

For latency estimation the first point is very important: as we will show later, a very significant percentage of peers do not respond to ICMP pings.

Possible active latency measurements:

  • ICMP ping (this document)
  • traceroute (this document)
  • application level ping

Possible passive latency measurements:

  • tcp rtt (this document)
  • application level events (e.g., bittorrent handshake)

In this study we compare ICMP pings against traceroute and TCP RTT.

Technical details

Approach

For each host we measure or estimate three different things:

  • host rtt: round-trip time of a normal ICMP ping packet between us and the host
  • router rtt: round-trip time of a normal ICMP ping packet between us and the last router before the target host. The biggest difference between host rtt and router rtt is that router rtt does not include last-mile congestion on the host side.
  • tcp rtt: round-trip time between us and the host, passively estimated by the Linux kernel.

For all rtts (host, router and tcp) we always use the minimum of all measurements.
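Taking the minimum is presumably the natural choice here because queuing and congestion can only add delay, so the smallest sample is the one closest to the uncongested rtt. A minimal C++17 sketch of this reduction follows; the function name min_rtt and the use of std::nullopt for an unanswered probe are assumptions made for illustration.

```cpp
#include <optional>
#include <vector>

// Returns the smallest rtt (in seconds) among the samples that got a reply.
// An unanswered probe is represented by std::nullopt (an assumption of this
// sketch, not something taken from the measurement scripts).
std::optional<double> min_rtt(const std::vector<std::optional<double>>& samples) {
    std::optional<double> best;
    for (const auto& s : samples) {
        if (s && (!best || *s < *best)) {
            best = s;
        }
    }
    return best;  // std::nullopt if no probe was answered
}
```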

TCP RTT

TCP needs an rtt estimate mainly to compute its retransmission timeout. To read this estimate from the Linux kernel, one can call (in C/C++):

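#include <iostream>
#include <netinet/tcp.h>  // struct tcp_info, TCP_INFO, SOL_TCP
#include <sys/socket.h>   // getsockopt, socklen_t

// fd must be the descriptor of a connected TCP socket
void print_tcp_rtt(int fd) {
   struct tcp_info tcpinfo;
   socklen_t len = sizeof(tcpinfo);
   int success = getsockopt(fd,SOL_TCP,TCP_INFO,&tcpinfo,&len);
   if (success != -1) {
      // tcpi_rtt and tcpi_rttvar are reported in microseconds; print seconds
      std::cout << "rtt=" << (tcpinfo.tcpi_rtt/1000000.0) << ", var=" << (tcpinfo.tcpi_rttvar/1000000.0) << std::endl;
   }
}

Please note that getsockopt does not require root permissions.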

host rtt and router rtt

mtr was used to do ping and traceroute. Namely, for every IP address a series of 20 pings was sent to every hop on the traceroute path. Only the results for the last hop (the host itself) and for the last router before it were saved. This way, one mtr call gives us both host rtt and router rtt.
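Picking the two values out of one traced path can be sketched as below (C++17). The sketch does not parse mtr output. Hop, PathRtt and extract_rtts are hypothetical names, and it assumes that the path is ordered from us towards the target and that its last entry is the target host, even when the host did not reply.

```cpp
#include <optional>
#include <string>
#include <vector>

// One hop of a traced path. Hypothetical structure for this sketch;
// it is not mtr's own output format.
struct Hop {
    std::string address;             // IP address of the hop
    std::optional<double> best_rtt;  // min rtt in seconds, nullopt if the hop never replied
};

struct PathRtt {
    std::optional<double> host_rtt;    // rtt of the last hop (the host itself)
    std::optional<double> router_rtt;  // rtt of the last router before the host
};

// Assumes the last entry of `path` is the target host.
PathRtt extract_rtts(const std::vector<Hop>& path) {
    PathRtt result;
    if (!path.empty()) {
        result.host_rtt = path.back().best_rtt;
    }
    if (path.size() >= 2) {
        result.router_rtt = path[path.size() - 2].best_rtt;
    }
    return result;
}
```

With this shape a host that ignores pings still yields a router rtt, which matches the case where only the last router responds.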

Measurement

The measurement was done using a modified version of the libtorrent library (and the rtorrent client). The modified library logged information after transferring or receiving every chunk. An example:

1233881141 CHFIN, name=___TORRENT_NAME___, fd=272, ip=___PEER_IP___, s=0, rtt=0.1265, var=0.05

Each line of the log file describes one chunk. In this example the smoothed rtt from the kernel for the peer is 0.1265 seconds (about 126 ms).

While the client is running, a script runs mtr for every new address (one that was not checked before).
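A log line of this shape could be read back with a small parser like the one below (C++17). This is a sketch, not the script used for the analysis: the names field, ChunkRecord and parse_chunk_line are hypothetical, the leading number is assumed to be a timestamp, and fields are assumed to be separated by ", " with no commas inside values (a torrent name containing a comma would break it).

```cpp
#include <cstddef>
#include <exception>
#include <optional>
#include <string>

// Returns the value of ", key=" in a log line, or std::nullopt if absent.
std::optional<std::string> field(const std::string& line, const std::string& key) {
    const std::string needle = ", " + key + "=";
    const std::size_t start = line.find(needle);
    if (start == std::string::npos) return std::nullopt;
    const std::size_t begin = start + needle.size();
    const std::size_t end = line.find(',', begin);  // npos for the last field
    if (end == std::string::npos) return line.substr(begin);
    return line.substr(begin, end - begin);
}

struct ChunkRecord {
    long long timestamp;  // leading number of the line (assumed to be a timestamp)
    std::string ip;       // peer address
    double rtt;           // smoothed rtt in seconds
    double var;           // rtt variance estimate in seconds
};

// Parses one CHFIN line; returns std::nullopt for any other or malformed line.
std::optional<ChunkRecord> parse_chunk_line(const std::string& line) {
    if (line.find(" CHFIN,") == std::string::npos) return std::nullopt;
    const auto ip = field(line, "ip");
    const auto rtt = field(line, "rtt");
    const auto var = field(line, "var");
    if (!ip || !rtt || !var) return std::nullopt;
    try {
        return ChunkRecord{std::stoll(line), *ip, std::stod(*rtt), std::stod(*var)};
    } catch (const std::exception&) {
        return std::nullopt;  // a number could not be parsed
    }
}
```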

After the measurement is finished/stopped we have two log files:

  1. one with many tcp rtt measurements for every peer (one for each transferred chunk)
  2. one with mtr results for host rtt and router rtt

Results analysis

The measurement was conducted during one night, at my place, on a symmetric 20Mbit connection. I used Ubuntu 8.10 running inside a virtual machine on my Mac. The libtorrent client was capped to use at most 500 kbytes input and 500 kbytes output traffic. NOTE: my link was not congested.

  • Duration of measurement: about 10 hours
  • # of TCP RTT measured IPs (all IPs taken into account): 1561
  • # of IPs responding to ICMP pings: 703 (45%) <-- less than 50%!
  • # of IPs for which the last router was responding to pings: 1355 (87%)
  • # of IPs responding to pings for which the last router was also responding to pings: 526 (33%)

router rtt vs host rtt

Results are almost as expected: most of the host rtts match the router rtts. There are virtually no router rtts bigger than host rtts, whereas there is a significant number of IPs with a much bigger host rtt than router rtt. This can be attributed to last-mile congestion.

Virtually all router rtts are < 400ms. 400ms is the expected rtt to the most distant hosts (i.e., Australia and New Zealand). This indirectly supports my feeling that there is no congestion in the backbone (we would observe many router rtts > 400ms otherwise).

host rtt vs tcp rtt

This graph is the one with the most outliers. It is worth mentioning that the outliers go much further than the 1 second interval visible on the graph.

Both tcp rtt and host rtt are heavily affected by congestion, and this might be the reason for the outliers on both sides.

Nevertheless, there is a visible correlation: when the tcp rtt is small, the host rtt is also small.

router rtt vs tcp rtt

This is the most important graph for us. Since the router rtt does not take last-mile congestion into account, we might treat it as the "best measurable rtt" of the connection.

The correlation in this graph is very visible: for IPs with tcp rtt less than about 200ms, the router rtt is also small. What is more, these IPs constitute the large majority of IPs with tcp rtt < 1 second.

We get a lot of false negatives (tcp rtt is big because of congestion, but in fact the real rtt is small), but virtually no false positives (tcp rtt is small while the real rtt is big). This is an important property of kernel TCP RTT measurement that can be used in many good ways (e.g., latency-based locality estimation).
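Counting false positives and false negatives in this sense can be sketched as follows. The 200ms threshold for tcp rtt comes from the observation above; applying the same threshold to router rtt is an assumption of this sketch, as are the names PeerRtt, Counts and classify.

```cpp
#include <vector>

struct PeerRtt {
    double tcp_rtt;     // seconds, min over all chunks of the peer
    double router_rtt;  // seconds, min over all pings to the last router
};

struct Counts {
    int true_positive = 0;   // tcp rtt small, router rtt small
    int false_positive = 0;  // tcp rtt small, router rtt big
    int false_negative = 0;  // tcp rtt big, router rtt small
    int true_negative = 0;   // both big
};

// Treats tcp rtt below `threshold` as the prediction "this peer is close" and
// router rtt below the same threshold as the ground truth (an assumption).
Counts classify(const std::vector<PeerRtt>& peers, double threshold = 0.2) {
    Counts c;
    for (const auto& p : peers) {
        const bool tcp_close = p.tcp_rtt < threshold;
        const bool router_close = p.router_rtt < threshold;
        if (tcp_close && router_close) ++c.true_positive;
        else if (tcp_close) ++c.false_positive;
        else if (router_close) ++c.false_negative;
        else ++c.true_negative;
    }
    return c;
}
```

Under this reading, a locality heuristic that only trusts small tcp rtts would miss some nearby peers (the false negatives) but would rarely pick a distant one.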

Summary/Conclusions

This report shows that neither tcp rtt nor host rtt is an accurate tool for measuring the rtt of the end-to-end connection. There is a heavy concentration in the expected region, but there is no reasonable bound on the values (tcp rtt can get even to 15 seconds, which is nearly 6 times the rtt to the Moon...).

On the other hand, router rtt seems to be quite accurate. There are virtually no results > 400ms. The ping response rate is also much higher for routers than for hosts.

The most important finding of this study is the lack of false positives when estimating router rtt by tcp rtt.

Future

There are a few more things that need to be checked:

  • run the experiment on a congested ADSL link
  • run the experiment on a normal machine (theoretically the virtual machine might influence the results)

As a continuation of this study I also want to look at:

  • stability of latency estimation inside the same IP prefix. This might be quite complicated because peers in the same IP prefix might have different last-mile congestion levels. On the other hand, router rtt should be comparable. Results might be very interesting. Stay tuned!

Attachments