Tuesday, August 4, 2009

Demystifying Latency in a Web Page Request

Since the advent of the Internet and the World Wide Web (WWW) in the last century, trillions upon trillions of page requests have been made across the world. Yet webmasters remain curious about how a web page request is actually served to their users' computers, and how that understanding can be used to speed things up.

Data on the Internet is packaged and transported in small data packets. The regular or irregular flow of these packets shapes the user's Internet experience. When a user sees a continuous flow of data on the screen, the packets are arriving smoothly and on time. When the same packets arrive with large, visible delays, the experience degrades and the user may grow frustrated with the poor network connection and speed.
In this article, I seek to demystify these behaviours and build a clearer picture of a web page request, in particular the latency effects of the network and of low bandwidth.
Let's first develop an understanding of some networking concepts, on which we will build our follow-up understanding of the web page request and latency. Networking was conceived with the expectation that millions and millions of users would be connected through a common network; hence the TCP/IP model.
A key feature of the TCP/IP model is encapsulation, the concept of collecting data and wrapping it in a common container for transmission. That container is called the "IP datagram", also known as the "IP packet" or just the "packet". The IP packet is a simple thing: a header, which carries the information used to route the packet to its destination, followed by the data, which is whatever information is being sent.
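To make encapsulation concrete, here is a minimal Python sketch (my own illustration, not from the original article) that packs a simplified IPv4-style header in front of a payload. The checksum is left at zero, options and fragmentation are ignored, and the addresses are documentation examples.

```python
import struct
import socket

def encapsulate(payload: bytes, src_ip: str, dst_ip: str) -> bytes:
    """Wrap data in a simplified IPv4-style header: routing info first, data after."""
    version_ihl = (4 << 4) | 5            # IPv4, header length of 5 x 32-bit words
    total_length = 20 + len(payload)      # 20-byte header plus the data
    ttl, protocol = 64, 17                # hop limit; 17 = UDP
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_length,     # version/IHL, DSCP, total length
        0, 0,                             # identification, flags/fragment offset
        ttl, protocol, 0,                 # TTL, protocol, checksum (left zero here)
        socket.inet_aton(src_ip),         # source address
        socket.inet_aton(dst_ip),         # destination address
    )
    return header + payload

packet = encapsulate(b"GET / HTTP/1.1\r\n", "192.0.2.1", "198.51.100.7")
print(len(packet), "bytes on the wire")
```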
Let's now concentrate on another important networking concept, the OSI model. It was created to lay out the process of turning application data into something that can be transported across the Internet. The upper layers of the OSI model describe what happens within the applications running on the computer: the human-machine interface, conversion of high-level language into machine language, encryption, authentication and permissions. The lower layers are where data travelling to and from applications is turned into something that can move across the network. This is where data encapsulation occurs and the IP datagram, or "packet", is built.
The transport of data across the network is a three-step process (a small loopback sketch follows the list):
1. Data from the source is passed down through the TCP/IP stack and wrapped into IP datagrams, commonly known as "packets". These packets are then transmitted by the source computer onto the network.
2. The packets are passed along the network until they reach the destination computer.
3. The packets are received by the destination computer and passed back up through the stack.
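As a small illustration (assuming Python with only the standard library), the sketch below sends one UDP datagram between two sockets on the same machine; the operating system's TCP/IP stack performs all three steps for us:

```python
import socket

# Destination: a UDP socket bound to the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))      # port 0 = let the OS pick one
addr = receiver.getsockname()

# Source: sendto() hands the data to the stack, which packetizes and transmits it (step 1).
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", addr)

# The packet crosses the (loopback) network (step 2) and is passed back up the stack (step 3).
data, peer = receiver.recvfrom(1024)
print(data, "from", peer)

sender.close()
receiver.close()
```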
According to Wikipedia, latency is the time delay between the moment something is initiated and the moment one of its effects begins or becomes detectable. The most common experience of latency is the time it takes for web pages to load and for emails to reach the destination inbox. That is one form of it, but here let's treat latency as the time delay contributed by each element involved in the transmission of data, and develop an understanding of what causes it. There are many logical and physical elements involved in networking.
Application Latency
The need to read from and write to disk causes some delay. The processor may be powerful and highly rated, yet it still has limits on how much it can read and write in a given time. It takes a finite amount of time to produce data and present it. There are hardware limitations as well, such as the amount of memory, which affect application performance.
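As a rough illustration of application latency from disk I/O (my own sketch; the absolute numbers depend entirely on the hardware), this times a 1 MB write to disk against touching the same amount of data in memory:

```python
import os
import time

# Time writing 1 MB to disk, forcing it to actually reach the device...
start = time.perf_counter()
with open("latency_probe.tmp", "wb") as f:
    f.write(os.urandom(1024 * 1024))
    f.flush()
    os.fsync(f.fileno())
disk_ms = (time.perf_counter() - start) * 1000

# ...versus touching the same amount of data in memory only.
start = time.perf_counter()
_ = bytes(1024 * 1024)
memory_ms = (time.perf_counter() - start) * 1000

os.remove("latency_probe.tmp")
print(f"disk: {disk_ms:.1f} ms, memory: {memory_ms:.3f} ms")
```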
Serialization Latency
The encapsulation of data (discussed above) is called serialization, and it takes a finite amount of time. It is calculated as follows:
Serialization Delay = Packet Size (bits) / Transmission Rate (bits per second)
Serialization can cause significant delays and latency on links that operate at low transmission rates.
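The formula is easy to try out. A small sketch, using a 1500-byte packet (a typical Ethernet MTU, an assumption for illustration) on a slow dial-up link versus a gigabit link:

```python
def serialization_delay(packet_bytes: int, rate_bps: float) -> float:
    """Serialization delay in seconds = packet size in bits / transmission rate in bits/s."""
    return packet_bytes * 8 / rate_bps

# A 1500-byte packet on a 56 kbps dial-up link versus a 1 Gbps link:
for rate in (56_000, 1_000_000_000):
    print(f"{rate:>13,} bps -> {serialization_delay(1500, rate) * 1000:.3f} ms")
```

The slow link takes over 214 ms just to put one packet on the wire; the gigabit link takes 0.012 ms.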
Routing & Switching Latency
A network request causes data to flow from point A to point B. This would be simple if the network were just two computers, but it is not. In networks like the Internet, packets travel from source to destination through a series of routers and switches connected by circuits; these are the hardware devices needed for transmission through the network.
These machines have to manage Internet traffic, which introduces delays from the routing and switching process: the amount of time a router or switch needs to receive a packet, process it and transmit it onward.
These days, with advances in computer hardware, these delays have been reduced considerably; even so, high-performance routers and switches each typically add up to 200 µs of latency to the link.
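Since this latency accrues per device, a quick back-of-the-envelope sketch, taking the 200 µs figure above and a purely hypothetical 15-hop path:

```python
PER_HOP_US = 200     # per-device figure quoted above, in microseconds
hops = 15            # hypothetical number of routers/switches on the path

total_ms = PER_HOP_US * hops / 1000
print(f"~{total_ms:.0f} ms of routing/switching delay over {hops} hops")
```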
Queuing Latency
Queuing latency refers to the amount of time a packet spends sitting in a queue awaiting transmission because the link is over-utilized. Over-utilization of the high-speed Internet backbone is very rare, but it is easily seen on lower-speed networks. Congestion can push these delays towards infinity, since packets may be dropped when a router's queue becomes full. Routers use various queue-management algorithms to keep latency down; the commonly used WRED algorithm bounds queuing latency at around 20 ms.
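The article does not specify a queueing model, but the textbook M/M/1 queue is a standard way to see how queuing delay explodes as a link approaches full utilization; a sketch under that assumption:

```python
def mm1_queueing_delay(utilization: float, service_time_s: float) -> float:
    """Mean M/M/1 waiting time; it grows without bound as utilization approaches 1."""
    if utilization >= 1.0:
        return float("inf")      # congested: the queue (and the delay) grows unbounded
    return utilization / (1.0 - utilization) * service_time_s

service = 1500 * 8 / 10_000_000  # a 1500-byte packet on a 10 Mbps link: 1.2 ms to serialize
for rho in (0.5, 0.9, 0.99, 1.0):
    print(f"utilization {rho:>4}: {mm1_queueing_delay(rho, service) * 1000:.1f} ms")
```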
Propagation Latency
Propagation latency is the delay imposed by the transmission medium itself. The amount by which a medium slows the signal is expressed as its velocity factor (VF), a percentage of the speed of light. Typically there are three media for transmitting data across networks: copper cables, with a VF in the range of 40%-80% of the speed of light; fibre-optic cables, with a VF of around 70%; and electromagnetic radio waves, which propagate at close to the speed of light but may have to cover very long paths, as on a satellite link. This delay exists regardless of the amount of data being transferred, the transmission rate, the protocol in use or any link problems.
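Propagation delay is simply distance divided by the signal speed in the medium. A sketch, assuming a roughly 5,600 km New York-to-London fibre path at 70% of the speed of light:

```python
C = 299_792_458  # speed of light in a vacuum, metres per second

def propagation_delay(distance_m: float, velocity_factor: float) -> float:
    """One-way propagation delay in seconds for a medium with the given velocity factor."""
    return distance_m / (velocity_factor * C)

# Roughly 5,600 km of fibre (about New York to London) at 70% of c:
print(f"{propagation_delay(5_600_000, 0.70) * 1000:.1f} ms one way")
```

That is about 27 ms one way before any other source of latency is counted.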
Transmission Rate and Bandwidth Latency
Transmission rate is the term used for the number of bits that can be placed onto, and taken off, the medium per unit time, commonly measured in bits per second. The maximum transmission rate defines the fundamental limit of the transmission medium. Copper links generally have a maximum transmission rate of 10, 100 or 1000 Mbps; for fibre-optic links, transmission rates vary from around 50 Mbps to 10 Gbps.
Wireless LANs and satellite links use a modem to convert bits into a modulated wave for transmission, and a demodulator to convert them back into bits on reception. The limiting factor on these links is the radio spectrum available to the signal. The amount of radio spectrum occupied by any given signal is called its bandwidth, and since radio spectrum is a limited resource, the occupied bandwidth is an important limiting factor on wireless and satellite links.
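Whatever the medium, the transmission rate puts a hard floor under transfer time. A sketch with a hypothetical 500 KB web page pushed over the copper and fibre rates mentioned above:

```python
def transfer_time(size_bytes: int, rate_bps: float) -> float:
    """Idealized time to push a payload onto the link; ignores protocol overhead entirely."""
    return size_bytes * 8 / rate_bps

page = 500 * 1024  # a hypothetical 500 KB web page
for name, rate in [("10 Mbps copper", 10e6), ("100 Mbps copper", 100e6), ("1 Gbps fibre", 1e9)]:
    print(f"{name:>16}: {transfer_time(page, rate) * 1000:.2f} ms")
```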
Protocol Latency
Let's now take a look at network data exchanges. A connectionless exchange is one where data is pushed through without any delivery confirmation: each packet traverses the Internet towards its destination, and if something happens to it midway, nothing can be done. This is usually used for streaming music and video, and for VoIP. The protocol used is the User Datagram Protocol (UDP); it has no connection-management overhead, and there is no retransmission of data either.
Connection-based data exchanges, on the other hand, rely on establishing a connection that manages every packet transmitted. The transport protocol used is the Transmission Control Protocol (TCP), which provides error-free delivery of packets, and hence of the data. TCP connections have three phases:
1. Establish the connection
2. Send the data
3. Close the connection
All of this adds to the time taken while the data is transmitted, and hence to the delay and latency.
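The cost of phase 1 alone is easy to measure: the three-way handshake takes at least one full network round trip before any page data can flow. A sketch using Python's standard library (example.com is merely a placeholder host):

```python
import socket
import time

def connect_time(host: str, port: int = 80) -> float:
    """Seconds to establish a TCP connection: roughly one round trip for the handshake."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5):
        return time.perf_counter() - start

# Phase 1 alone costs a full round trip before any page data can flow:
print(f"{connect_time('example.com') * 1000:.1f} ms to open the connection")
```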
This lays the web page request on the table and opens it up threadbare, clearing the air about what goes on behind each of our clicks while we are on the Internet, connected to millions of users and vast amounts of data. We now understand these time delays, and can agree that some latency is inherent and unavoidable.
References:
1. What is network latency? And why does it matter? - http://www.o3bnetworks.com/docs/O3b_latency_white_paper2.pdf
2. Satellite Internet Access - http://www.sisp.net/broadband/satellite.htm
