TCP: The Transmission Control Protocol

Many Internet applications use TCP as their transport-layer protocol (layer 4 in the OSI model). It is a connection-oriented protocol that provides reliable, ordered delivery of data over an underlying, unreliable network. A TCP data unit is called a segment, which consists of the TCP header and the data it encapsulates. TCP was originally specified by RFC 793, which has since been superseded by RFC 9293.

TCP Properties

  • Connections

  • Guaranteed and ordered data transfer

  • Multiplexing

  • Reliability through error recovery

  • Flow control

TCP Connections

TCP is connection-oriented, which means a connection is established between two hosts at the start of a communication and lasts for the duration of the data transfer until it is torn down at the end. Therefore, even if the path through the underlying network at layer 3 changes, TCP provides a continuous service for the application layer to use.

TCP uses flags in the TCP header, such as SYN, ACK, FIN and RST, to keep track of the lifecycle of the connection.

What is the Three-Way Handshake?

The three-way handshake is used to establish TCP connections. A basic summary for a client / server TCP connection is:

  1. The server is listening for new connections.

  2. The client asks to connect. It sends a segment with the SYN flag set to 1, the ACK flag set to 0 and a randomly chosen initial sequence number (x). The segment also carries information such as the destination port it wants to connect to.

  3. If the server is listening on the chosen destination port, it accepts the connection.

  4. The server then replies with the SYN flag set to 1 and the ACK flag set to 1. The server sends back its random sequence number (y) and an acknowledgement number which is (x + 1).

  5. The client replies with the SYN flag set to 0 and the ACK flag set to 1. It uses a sequence number of (x + 1) and an acknowledgement number of (y + 1).

If the server is not listening on the specified destination port, then it replies with the RST (reset) flag set.
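The numbers exchanged in the handshake above can be traced with a small simulation. This is only an illustrative sketch: `Segment` and `three_way_handshake` are hypothetical names rather than part of any real TCP stack, and the initial sequence numbers are passed in as arguments instead of being chosen at random.

```python
from typing import NamedTuple

SEQ_MODULO = 2**32  # sequence numbers are 32 bits wide and wrap around


class Segment(NamedTuple):
    """Hypothetical, simplified view of a segment: flags and numbers only."""
    sender: str
    syn: int
    ack_flag: int
    seq: int
    ack: int


def three_way_handshake(client_isn: int, server_isn: int) -> list[Segment]:
    """Return the three segments exchanged when a connection is opened."""
    syn = Segment("client", 1, 0, client_isn, 0)
    syn_ack = Segment("server", 1, 1, server_isn, (client_isn + 1) % SEQ_MODULO)
    ack = Segment(
        "client", 0, 1,
        (client_isn + 1) % SEQ_MODULO,
        (server_isn + 1) % SEQ_MODULO,
    )
    return [syn, syn_ack, ack]


# Example values for x and y; real stacks choose these unpredictably.
for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

With x = 1000 and y = 5000, the server acknowledges 1001 and the client's final segment carries sequence number 1001 and acknowledgement number 5001.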

Continuing the Connection

Once the connection is established, both parties continue incrementing their sequence numbers when they send data and replying with acknowledgements when they receive data. The sequence number increases by one for every byte sent. The acknowledgement number is the received sequence number plus the number of bytes received, so it tells the sender which sequence number the receiver is expecting next. This process helps provide ordered and guaranteed communication.
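As a worked example of this arithmetic, the sketch below computes the acknowledgement number a receiver would send for each segment. The function name `next_expected` and the payload sizes are assumptions chosen purely for illustration.

```python
SEQ_MODULO = 2**32  # sequence numbers are 32 bits wide and wrap around


def next_expected(seq: int, payload_len: int) -> int:
    """Acknowledgement number for a segment starting at seq with payload_len bytes."""
    return (seq + payload_len) % SEQ_MODULO


seq = 1001  # first data byte after a handshake in which x was 1000
for payload in (b"a" * 500, b"b" * 300):
    ack = next_expected(seq, len(payload))
    print(f"sent seq={seq} len={len(payload)} -> receiver acks {ack}")
    seq = ack  # the next segment starts where the previous one ended
```

Here the receiver acknowledges 1501 after the first segment and 1801 after the second.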

TCP connections are 'full-duplex' – both parties can send data at the same time.

Releasing a TCP Connection

Either party can tear down its half of the conversation at any point by setting the FIN flag. A typical termination of a connection is as follows:

  1. Host A sends a FIN segment.

  2. Host B acknowledges the FIN with an ACK.

  3. Host B sends its FIN.

  4. Host A replies with an ACK.

  5. Both hosts release the connection.
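The idea that each side closes only its own half of the conversation can be observed with Python's standard `socket` module. The following is a minimal sketch over the loopback interface, assuming the operating system sends a FIN on `shutdown(SHUT_WR)` and on `close()`, which is the usual behaviour when no unread data remains; `half_close_demo` and `read_until_eof` are hypothetical helper names.

```python
import socket


def read_until_eof(sock: socket.socket) -> bytes:
    """Read until recv() returns b"", which signals that the peer has sent its FIN."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def half_close_demo() -> tuple[bytes, bytes]:
    # Listen on a loopback port chosen by the operating system (port 0).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    host_a = socket.create_connection(listener.getsockname())
    host_b, _ = listener.accept()

    host_a.sendall(b"request")
    host_a.shutdown(socket.SHUT_WR)        # Host A: "no more data from me"
    received_by_b = read_until_eof(host_b)

    host_b.sendall(b"response")            # Host B's half is still open
    host_b.close()                         # Host B closes its half as well
    received_by_a = read_until_eof(host_a)

    host_a.close()
    listener.close()
    return received_by_b, received_by_a


print(half_close_demo())
```

Host B can still send its response after Host A has finished sending, because only one direction of the connection has been closed at that point.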

Guaranteed, Ordered Delivery

Because each TCP segment has a sequence number, a recipient can deliver data to the application layer in the correct order – even if the IP network delivers them out of order.

If a host doesn’t receive an expected segment, then it won’t send an acknowledgement for that segment. If the sender doesn’t receive an ACK that it’s expecting within a timeout period, then it resends the data.

This is what enables TCP to provide guaranteed, ordered delivery to the application layer.
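The reordering step can be sketched as a small reassembly buffer. This is a simplified illustration, not how any particular TCP implementation works: `deliver_in_order` is a hypothetical function, and it ignores sequence number wrap-around and partially overlapping segments.

```python
def deliver_in_order(segments, initial_seq):
    """Return (bytes delivered in order, next expected sequence number).

    segments is an iterable of (sequence number, payload) pairs in arrival order.
    """
    buffered = {}            # out-of-order segments waiting for a gap to fill
    next_seq = initial_seq   # the sequence number we would acknowledge
    delivered = bytearray()
    for seq, payload in segments:
        if seq < next_seq:
            continue         # duplicate of data already delivered (a retransmission)
        buffered[seq] = payload
        # Hand over every segment that is now contiguous with delivered data.
        while next_seq in buffered:
            data = buffered.pop(next_seq)
            delivered += data
            next_seq += len(data)
    return bytes(delivered), next_seq


# Segments arrive out of order, and one of them arrives twice.
arrivals = [(105, b"world"), (100, b"hello"), (100, b"hello"), (110, b"!")]
print(deliver_in_order(arrivals, initial_seq=100))
```

The application receives `b"helloworld!"` even though the second segment arrived first and the first segment was duplicated.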

Error Recovery

TCP provides basic error recovery using a checksum. If the recipient's checksum verification fails, the segment is discarded. Because no acknowledgement is sent for a discarded segment, the sender eventually resends the data.

Multiplexing with Ports

Multiplexing is used to allow multiple connections simultaneously and to differentiate between the connections for different applications.

When a host receives data, it needs to work out who that data is for – is it HTTP traffic for a web browser, VoIP for Skype or FTP data being transferred? TCP does this using ports. Hosts manage network connections using sockets, which are defined by:

  • IP Address

  • Transport Protocol

  • Port number

Source Ports

When a client sets up a connection for an application, its operating system assigns an unused port as the source port. Ports used in this way are known as ephemeral (or dynamic) ports – they don't relate to a specific protocol and are only assigned for the duration of the connection. All data received on this port (until the connection is terminated) is passed on to that application.
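A source port being assigned automatically can be seen with Python's standard `socket` module. The sketch below is a loopback-only illustration; `connect_and_report` is a hypothetical helper, and the listening port is also chosen by the operating system here purely so the example does not need a fixed port.

```python
import socket


def connect_and_report():
    """Open a loopback connection and return (server port, client port, peer seen by server)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the operating system pick
    server.listen(1)
    server_port = server.getsockname()[1]

    # The client never names a source port; the operating system assigns one.
    client = socket.create_connection(("127.0.0.1", server_port))
    conn, peer = server.accept()
    client_port = client.getsockname()[1]

    conn.close()
    client.close()
    server.close()
    return server_port, client_port, peer


server_port, client_port, peer = connect_and_report()
print(f"destination port {server_port}, source port {client_port}")
print(f"server sees the client as {peer}")
```

The exact port numbers differ on every run and depend on the operating system's ephemeral port range.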

Destination Ports

The destination port that data is sent to is more specific. Common applications are assigned particular port numbers which a client uses as the destination port. For example, HTTP traffic uses port 80.

The server can differentiate between connections using the client's IP address and source port.
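One way to picture this is a lookup table keyed by the addresses and ports of both endpoints. The sketch below is an assumption-laden illustration rather than a real network stack: `open_connection`, `demultiplex` and the handler labels are hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
from typing import Dict, Tuple

# (client IP, client port, server IP, server port)
ConnectionKey = Tuple[str, int, str, int]

connections: Dict[ConnectionKey, str] = {}


def open_connection(client_ip: str, client_port: int,
                    server_ip: str, server_port: int, handler: str) -> None:
    """Record which handler owns the connection identified by this 4-tuple."""
    connections[(client_ip, client_port, server_ip, server_port)] = handler


def demultiplex(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Find the handler for an incoming segment from its addresses and ports."""
    return connections[(src_ip, src_port, dst_ip, dst_port)]


# Three connections to the same web server port.
open_connection("192.0.2.10", 49152, "203.0.113.5", 80, "session A")
open_connection("192.0.2.10", 49153, "203.0.113.5", 80, "session B")
open_connection("198.51.100.7", 49152, "203.0.113.5", 80, "session C")

print(demultiplex("192.0.2.10", 49153, "203.0.113.5", 80))
```

Sessions A and B share a client IP address and a destination port, so only the source port tells them apart. Sessions A and C share a source port, so only the client IP address tells them apart.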

Port Numbers

IANA (the Internet Assigned Numbers Authority) maintains a registry of assigned port numbers. Port numbers are grouped into three ranges:

  • Well known ports (0 – 1023)

  • Registered ports (1024 – 49151)

  • Dynamic / Private Ports - often used as ephemeral ports (49152 – 65535)
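These ranges translate directly into a small lookup. The function name `port_range` and the returned labels are assumptions made for this illustration.

```python
def port_range(port: int) -> str:
    """Classify a port number into one of the three ranges listed above."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    if port <= 1023:
        return "well known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


for port in (80, 8080, 51000):
    print(port, port_range(port))
```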

Important Well Known TCP Port Numbers

21 File Transfer Protocol (FTP)

22 Secure Shell (SSH)

23 Telnet

25 Simple Mail Transfer Protocol (SMTP)

53 Domain Name System (DNS)

80 Hypertext Transfer Protocol (HTTP)

110 Post Office Protocol (POP3)

143 Internet Message Access Protocol (IMAP)

443 HTTPS (HTTP Secure – using TLS/SSL)

Flow Control

TCP lets the receiver specify a ‘window’. The size of the window (in bytes) dictates how many bytes of data the sender can transmit before it should wait for an acknowledgement from the receiver.
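The sender's side of this rule can be expressed as simple arithmetic. The sketch below assumes the sender tracks the last acknowledgement number it received and the next sequence number it will send; `usable_window` is a hypothetical helper that ignores sequence number wrap-around.

```python
def usable_window(last_ack: int, advertised_window: int, next_seq: int) -> int:
    """Bytes the sender may still transmit before it must wait for an ACK."""
    in_flight = next_seq - last_ack   # bytes sent but not yet acknowledged
    return max(0, advertised_window - in_flight)


# The receiver has acknowledged everything up to 1000 and advertised 4000 bytes.
print(usable_window(last_ack=1000, advertised_window=4000, next_seq=1000))  # nothing in flight
print(usable_window(last_ack=1000, advertised_window=4000, next_seq=3500))  # 2500 bytes in flight
print(usable_window(last_ack=1000, advertised_window=4000, next_seq=5000))  # window is full
```

The three calls print 4000, 1500 and 0. In the last case the sender must wait for an acknowledgement before sending more data.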

TCP Header Fields

Source Port (16 bits)

TCP port number which the data is being sent from.

Destination Port (16 bits)

TCP port number which the data is being sent to.

Sequence Number (32 bits)

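Identifies the first byte of data in the segment. It is used for initiating the connection and then keeping track of the order of data.

Acknowledgement Number (32 bits)

If the ACK flag is set, this field holds the next sequence number that the sender of the segment is expecting to receive.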

Data Offset (4 bits)

The data offset field gives the size of the TCP header in 32-bit words, which also indicates where the data begins. The minimum value is 5 (20 bytes). If 'Options' are used, then they may add up to an additional 40 bytes. The maximum value is 15, so the maximum size of a TCP header is 60 bytes.

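Reserved (6 bits)

RFC 793 reserves these 6 bits for future use and requires them to be set to zero. Later RFCs have assigned some of these bits to additional flags, such as those used for Explicit Congestion Notification.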

TCP Control Flags (6 bits)

There are 6 single-bit flags:

URG: Urgent Pointer field significant

ACK: Acknowledgment field significant

PSH: Push Function

RST: Reset the connection

SYN: Synchronize sequence numbers

FIN: No more data from sender
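In the header these six flags occupy the low six bits of the fourteenth byte, with URG as the most significant of the six and FIN as the least. A short sketch of decoding them follows; `FLAG_BITS` and `decode_flags` are hypothetical names.

```python
# Bit masks for the six flags, from most to least significant.
FLAG_BITS = {
    "URG": 0x20,
    "ACK": 0x10,
    "PSH": 0x08,
    "RST": 0x04,
    "SYN": 0x02,
    "FIN": 0x01,
}


def decode_flags(value: int) -> list[str]:
    """Return the names of the flags that are set in a flags byte."""
    return [name for name, bit in FLAG_BITS.items() if value & bit]


print(decode_flags(0x02))  # first segment of a handshake
print(decode_flags(0x12))  # second segment of a handshake
print(decode_flags(0x11))  # a FIN that also acknowledges data
```

These calls print `['SYN']`, `['ACK', 'SYN']` and `['ACK', 'FIN']`.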

Window Size (16 bits)

The window size is the number of bytes, starting from the one indicated by the acknowledgement number, which the sender of this segment is currently willing to accept.

Checksum (16 bits)

The checksum is calculated over the TCP header, the data and a 96-bit ‘pseudo-header’ (for IPv4), which is not itself transmitted and consists of:

  • Source IP Address (32 bits)

  • Destination IP Address (32 bits)

  • Zero Padding (8 bits)

  • Protocol – 6 for TCP (8 bits)

  • TCP length (16 bits)
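The calculation itself is the 16-bit ones' complement of the ones' complement sum of all the 16-bit words covered, as described in RFC 793. The sketch below assumes IPv4, a pseudo-header whose TCP length is packed as a 16-bit value, and a segment whose checksum field is zero while the checksum is computed. The function names and the sample addresses (taken from documentation ranges) are illustrative only.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Sum data as 16-bit big-endian words, folding carries back into the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"   # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the pseudo-header plus the TCP header and data."""
    pseudo_header = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),   # source IP address (32 bits)
        socket.inet_aton(dst_ip),   # destination IP address (32 bits)
        0,                          # zero padding (8 bits)
        6,                          # protocol number for TCP (8 bits)
        len(segment),               # TCP length: header plus data, in bytes
    )
    return ~ones_complement_sum(pseudo_header + segment) & 0xFFFF


# A 20-byte SYN header with the checksum field (bytes 16-17) left as zero.
header = struct.pack("!HHIIBBHHH", 49152, 80, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
checksum = tcp_checksum("192.0.2.10", "203.0.113.5", header)

# A receiver repeats the calculation over the segment with the checksum filled in.
filled = header[:16] + struct.pack("!H", checksum) + header[18:]
print(tcp_checksum("192.0.2.10", "203.0.113.5", filled) == 0)
```

When the segment arrives intact, recomputing over the filled-in segment yields zero, which is how a receiver can verify it.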

Urgent Pointer (16 bits)

If the URG flag is set, this field is an offset from the sequence number that indicates where the urgent data ends.

Options

If the data offset is greater than 5, then the options field occupies the remaining space. Zero padding is included at the end if necessary.
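Putting the fields together, the fixed 20-byte part of the header can be unpacked with Python's `struct` module, and the data offset then shows how many option bytes follow. This is a minimal sketch that assumes the layout from RFC 793 and keeps only the six flags listed above; `TcpHeader` and `parse_tcp_header` are hypothetical names.

```python
import struct
from typing import NamedTuple


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    data_offset: int      # header length in 32-bit words
    flags: int            # the six flag bits (URG, ACK, PSH, RST, SYN, FIN)
    window: int
    checksum: int
    urgent_pointer: int
    options: bytes


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Parse the TCP header at the start of segment."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    data_offset = offset_byte >> 4        # top 4 bits of the 13th byte
    header_len = data_offset * 4          # 32-bit words to bytes
    if data_offset < 5 or len(segment) < header_len:
        raise ValueError("invalid data offset")
    return TcpHeader(src, dst, seq, ack, data_offset, flag_byte & 0x3F,
                     window, checksum, urgent, segment[20:header_len])


# A SYN with data offset 6: 20 fixed bytes plus one 4-byte option
# (maximum segment size, kind 2, length 4, value 1460), followed by no data.
raw = struct.pack("!HHIIBBHHH", 49152, 80, 1000, 0, 6 << 4, 0x02, 65535, 0, 0)
raw += bytes([2, 4, 0x05, 0xB4])

header = parse_tcp_header(raw)
print(header.data_offset * 4, header.options.hex())
```

For this segment the header length is 24 bytes and the options field holds the four bytes `020405b4`.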

Questions

Question 1

What is a connection-oriented protocol?

Question 2

What are the key properties of TCP?

References

RFC-793 Transmission Control Protocol

Internet Engineering Task Force
