1. PART V: TRANSPORT LAYER
TRANSPORT-LAYER PROTOCOLS
Subject: Data Communications and Networking
Tutor: Bilal Munir Mughal
Ch-24
2. Content Outline
- INTRODUCTION
  - Services
  - Port Numbers
- USER DATAGRAM PROTOCOL
  - User Datagram
  - UDP Services
  - UDP Applications
3. Content Outline
- TRANSMISSION CONTROL PROTOCOL
  - TCP Services
  - TCP Features
  - Segment
  - A TCP Connection
  - State Transition Diagram
  - Windows in TCP
  - Flow Control
  - Error Control
  - TCP Congestion Control
  - TCP Timers
  - Options
4. Content Outline
- SCTP
  - SCTP Services
  - SCTP Features
  - Packet Format
  - An SCTP Association
  - Flow Control
  - Error Control
5. INTRODUCTION
- The transport layer in the TCP/IP suite is located between the application layer and the network layer.
- It provides services to the application layer and receives services from the network layer.
- The transport layer acts as a liaison between a client program and a server program, a process-to-process connection.
- The transport layer is the heart of the TCP/IP protocol suite; it is the end-to-end logical vehicle for transferring data from one point to another in the Internet.
7. SERVICES
- UDP (User Datagram Protocol)
  - UDP is an unreliable, connectionless transport-layer protocol used for its simplicity and efficiency in applications where error control can be provided by the application-layer process.
- TCP (Transmission Control Protocol)
  - TCP is a reliable, connection-oriented protocol that can be used in any application where reliability is important.
- SCTP (Stream Control Transmission Protocol)
  - SCTP is a new transport-layer protocol that combines the features of UDP and TCP.
8. PORT NUMBERS
- Port numbers provide end-to-end addresses at the transport layer and allow multiplexing and demultiplexing at this layer.
10. USER DATAGRAM PROTOCOL
- Example 24.1
- The following is the content of a UDP header in hexadecimal format:
  CB84000D001C001C
  a. What is the source port number?
  b. What is the destination port number?
  c. What is the total length of the user datagram?
  d. What is the length of the data?
  e. Is the packet directed from a client to a server or vice versa?
  f. What is the client process?
11. USER DATAGRAM PROTOCOL
- Solution
  a. The source port number is the first four hexadecimal digits, (CB84) base 16, which means that the source port number is 52100.
  b. The destination port number is the second four hexadecimal digits, (000D) base 16, which means that the destination port number is 13.
  c. The third four hexadecimal digits, (001C) base 16, define the length of the whole UDP packet as 28 bytes.
  d. The length of the data is the length of the whole packet minus the length of the header, or 28 − 8 = 20 bytes.
  e. Since the destination port number is 13 (a well-known port), the packet is from the client to the server.
  f. Port 13 is the well-known port of the Daytime service, so the client process is a Daytime client.
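The decoding steps in Example 24.1 can be reproduced with a few lines of Python. This is a minimal sketch: the function name `parse_udp_header` and the dictionary keys are illustrative choices, and the client-to-server test (destination port below 1024, source port at or above it) is only a heuristic assumption, not a rule stated on the slides.

```python
import struct


def parse_udp_header(header_hex: str) -> dict:
    """Decode the four 16-bit fields of an 8-byte UDP header."""
    raw = bytes.fromhex(header_hex)
    if len(raw) != 8:
        raise ValueError("a UDP header is exactly 8 bytes")
    # "!HHHH" = network byte order, four unsigned 16-bit integers
    src, dst, total_len, checksum = struct.unpack("!HHHH", raw)
    return {
        "source_port": src,
        "destination_port": dst,
        "total_length": total_len,
        "data_length": total_len - 8,  # total length minus the 8-byte header
        "checksum": checksum,
        # Heuristic (assumption): a well-known destination port and a
        # higher-numbered source port suggest a client-to-server packet.
        "client_to_server": dst < 1024 <= src,
    }


fields = parse_udp_header("CB84000D001C001C")
print(fields)
```

For the header in the example, the sketch yields source port 52100, destination port 13, total length 28, and data length 20, matching answers a through e.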
12. USER DATAGRAM PROTOCOL: UDP SERVICES
- Process-to-Process Communication
- Connectionless Services
- No Flow Control
- No Error Control except for the checksum
- Checksum
- No Congestion Control
- Encapsulation and Decapsulation
- Queuing
- Multiplexing and Demultiplexing
13. USER DATAGRAM PROTOCOL: UDP SERVICES
- Checksum
  - UDP checksum calculation includes three sections: a pseudoheader, the UDP header, and the data coming from the application layer.
  - The pseudoheader is the part of the header of the IP packet (discussed in Chapter 19) in which the user datagram is to be encapsulated, with some fields filled with 0s.
  - If the checksum does not include the pseudoheader, a user datagram may arrive safe and sound. However, if the IP header is corrupted, it may be delivered to the wrong host.
14. USER DATAGRAM PROTOCOL: UDP SERVICES
  - The protocol field is added to ensure that the packet belongs to UDP, and not to TCP.
  - The value of the protocol field for UDP is 17. If this value is changed during transmission, the checksum calculation at the receiver will detect it and UDP drops the packet. It is not delivered to the wrong protocol.
15. USER DATAGRAM PROTOCOL: UDP SERVICES
- Optional Inclusion of Checksum
  - The sender of a UDP packet can choose not to calculate the checksum. In this case, the checksum field is filled with all 0s before being sent.
  - In the situation where the sender decides to calculate the checksum, but it happens that the result is all 0s, the checksum is changed to all 1s before the packet is sent.
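The checksum rules above (pseudoheader, UDP header, data, and the all-0s to all-1s substitution) can be sketched as follows. This is a hedged illustration that assumes IPv4 addresses; the function names `udp_checksum` and `udp_checksum_ok` and the sample addresses in the usage line are hypothetical.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def _pseudoheader(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    # Source IP, destination IP, a zero byte, protocol 17 (UDP), UDP length
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum(_pseudoheader(src_ip, dst_ip, length)
                                + header + payload)
    checksum = ~total & 0xFFFF
    # A computed value of all 0s is sent as all 1s, because all 0s in the
    # field means "checksum not calculated".
    return checksum or 0xFFFF


def udp_checksum_ok(src_ip, dst_ip, src_port, dst_port, payload, checksum):
    """Receiver-side check: the sum over everything must be all 1s."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    total = ones_complement_sum(_pseudoheader(src_ip, dst_ip, length)
                                + header + payload)
    return total == 0xFFFF


c = udp_checksum("153.18.8.105", "171.2.14.10", 1087, 13, b"TESTING")
print(hex(c), udp_checksum_ok("153.18.8.105", "171.2.14.10", 1087, 13, b"TESTING", c))
```

Because the pseudoheader carries the IP addresses and the protocol value, a datagram delivered to the wrong host or the wrong protocol fails the receiver-side check even when the UDP header and data are intact.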
16. USER DATAGRAM PROTOCOL: UDP APPLICATIONS
- UDP is suitable for a process that requires simple request-response communication with little concern for flow and error control. It is not usually used for a process such as FTP that needs to send bulk data (see Chapter 26).
- UDP is suitable for a process with internal flow- and error-control mechanisms. For example, the Trivial File Transfer Protocol (TFTP) process includes flow and error control. It can easily use UDP.
17. USER DATAGRAM PROTOCOL: UDP APPLICATIONS
- UDP is a suitable transport protocol for multicasting. Multicasting capability is embedded in the UDP software but not in the TCP software.
- UDP is used for management processes such as SNMP (see Chapter 27).
- UDP is used for some route updating protocols such as Routing Information Protocol (RIP) (see Chapter 20).
- UDP is normally used for interactive real-time applications that cannot tolerate uneven delay.
18. TRANSMISSION CONTROL PROTOCOL
- Transmission Control Protocol (TCP) is a connection-oriented, reliable protocol.
- TCP explicitly defines connection establishment, data transfer, and connection teardown phases to provide a connection-oriented service.
- TCP uses a combination of the Go-Back-N (GBN) and Selective-Repeat (SR) protocols to provide reliability.
- To achieve this goal, TCP uses a checksum (for error detection), retransmission of lost or corrupted packets, cumulative and selective acknowledgments, and timers.
19. TRANSMISSION CONTROL PROTOCOL: TCP SERVICES
- Process-to-Process Communication
- Stream Delivery Service
- Full-Duplex Communication
- Multiplexing and Demultiplexing
- Connection-Oriented Service
- Reliable Service
20. TRANSMISSION CONTROL PROTOCOL: TCP SERVICES
- Stream Delivery Service
  - TCP allows the sending process to deliver data as a stream of bytes and allows the receiving process to obtain data as a stream of bytes.
23. TRANSMISSION CONTROL PROTOCOL: TCP FEATURES
- Numbering System
  - Byte Number
    - TCP numbers all data bytes (octets) that are transmitted in a connection. Numbering is independent in each direction.
    - The numbering does not necessarily start from 0. Instead, TCP chooses an arbitrary number between 0 and 2^32 − 1 for the number of the first byte.
  - Sequence Number
    - TCP assigns a sequence number to each segment that is being sent. The sequence number, in each direction, is defined as follows:
      1. The sequence number of the first segment is the ISN (initial sequence number), which is a random number.
      2. The sequence number of any other segment is the sequence number of the previous segment plus the number of bytes (real or imaginary) carried by the previous segment.
24. TRANSMISSION CONTROL PROTOCOL: TCP FEATURES
- Numbering System…
  - Acknowledgment Number
    - The value of the acknowledgment field in a segment defines the number of the next byte a party expects to receive.
    - The acknowledgment number is cumulative, which means that the party takes the number of the last byte that it has received, safe and sound, adds 1 to it, and announces this sum as the acknowledgment number.
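The numbering rules for sequence and acknowledgment numbers amount to simple modular arithmetic. The sketch below is illustrative only; the function names and the sample values (a first byte numbered 10001 and five 1000-byte segments) are assumptions chosen for the example.

```python
MOD = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


def segment_sequence_numbers(first_byte_number: int, segment_sizes: list) -> list:
    """Sequence number of each segment = number of its first data byte."""
    seqs = []
    seq = first_byte_number % MOD
    for size in segment_sizes:
        seqs.append(seq)
        seq = (seq + size) % MOD  # previous sequence number plus bytes carried
    return seqs


def cumulative_ack(last_byte_received: int) -> int:
    """Acknowledgment number = number of the next byte expected."""
    return (last_byte_received + 1) % MOD


print(segment_sequence_numbers(10001, [1000] * 5))
print(cumulative_ack(5642))
```

With these inputs the five segments start at 10001, 11001, 12001, 13001, and 14001, and a receiver that has everything up to byte 5642 announces 5643.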
32. TRANSMISSION CONTROL PROTOCOL: TCP CONNECTION
- Connection Reset
  - TCP at one end may deny a connection request, may abort an existing connection, or may terminate an idle connection.
  - All of these are done with the RST (reset) flag.
39. TRANSMISSION CONTROL PROTOCOL: ERROR CONTROL
- Error control in TCP is achieved through the use of three simple tools:
- Checksum
- Acknowledgment
  - Cumulative Acknowledgment (ACK)
  - Selective Acknowledgment (SACK)
- Retransmission
  - Retransmission after RTO (retransmission time-out)
  - Retransmission after Three Duplicate ACK Segments
40. TRANSMISSION CONTROL PROTOCOL: ERROR CONTROL
- TCP implementations today do not discard out-of-order segments.
- They store them temporarily and flag them as out-of-order segments until the missing segments arrive.
44. TRANSMISSION CONTROL PROTOCOL: CONGESTION CONTROL
- TCP uses different policies to handle the congestion in the network.
- Congestion Window
  - The TCP sender uses the occurrence of two events as signs of congestion in the network: time-out and receiving three duplicate ACKs.
  - The lack of regular, timely receipt of ACKs, which results in a time-out, is the sign of strong congestion; the receipt of three duplicate ACKs is the sign of weak congestion in the network.
45. TRANSMISSION CONTROL PROTOCOL: CONGESTION CONTROL
- Congestion Policies
  - Slow Start: Exponential Increase algorithm
  - Congestion Avoidance: Additive Increase algorithm
  - Fast Recovery algorithm
46. TRANSMISSION CONTROL PROTOCOL: CONGESTION CONTROL
- Congestion Policies…
  - The fast-recovery algorithm is optional in TCP.
  - It starts when three duplicate ACKs arrive, which is interpreted as light congestion in the network.
  - Like congestion avoidance, this algorithm is also an additive increase, but it increases the size of the congestion window when a duplicate ACK arrives (after the three duplicate ACKs that trigger the use of this algorithm).
47. TRANSMISSION CONTROL PROTOCOL: CONGESTION CONTROL
- Three versions of TCP:
  - Tahoe TCP uses only slow start and congestion avoidance.
  - Reno TCP added a fast-recovery state.
  - NewReno TCP added a further optimization to Reno: when three duplicate ACKs arrive, it checks whether more than one segment was lost in the current window.
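The difference between the policies can be made concrete with a coarse, per-round simulation of the congestion window (cwnd), measured in segments. This is a simplified sketch under stated assumptions, not a faithful TCP implementation: one "rtt" event stands for a whole window being acknowledged, the Reno branch skips the temporary window inflation of fast recovery and simply resumes at the halved threshold, and the names `step`, `trace`, and the initial `ssthresh` of 16 are hypothetical.

```python
def step(cwnd: int, ssthresh: int, event: str, variant: str = "tahoe"):
    """Return the new (cwnd, ssthresh) after one event."""
    if event == "rtt":
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: exponential increase
        else:
            cwnd += 1                       # congestion avoidance: additive increase
    elif event == "timeout" or (event == "3dupacks" and variant == "tahoe"):
        ssthresh = max(cwnd // 2, 2)        # strong reaction: restart slow start
        cwnd = 1
    elif event == "3dupacks":
        ssthresh = max(cwnd // 2, 2)        # Reno (simplified): weak congestion,
        cwnd = ssthresh                     # continue from the halved window
    else:
        raise ValueError(f"unknown event: {event}")
    return cwnd, ssthresh


def trace(events, variant, cwnd=1, ssthresh=16):
    """Return the cwnd value after each event, starting with the initial one."""
    history = [cwnd]
    for event in events:
        cwnd, ssthresh = step(cwnd, ssthresh, event, variant)
        history.append(cwnd)
    return history


events = ["rtt"] * 5 + ["3dupacks"] + ["rtt"] * 2
print("tahoe:", trace(events, "tahoe"))
print("reno: ", trace(events, "reno"))
```

In this model Tahoe drops back to a window of one segment after three duplicate ACKs, while Reno continues from half the previous window, which is the practical effect of treating duplicate ACKs as a sign of weak congestion.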
49. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- To perform their operations smoothly, most TCP implementations use at least four timers:
  - Retransmission Timer
  - Persistence Timer
  - Keepalive Timer
  - TIME-WAIT Timer
50. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- Retransmission Timer
  - To retransmit lost segments, TCP employs one retransmission timer (for the whole connection period) that handles the retransmission time-out (RTO), the waiting time for an acknowledgment of a segment. We can define the following rules for the retransmission timer:
    1. When TCP sends the segment in front of the sending queue, it starts the timer.
    2. When the timer expires, TCP resends the first segment in front of the queue, and restarts the timer.
    3. When a segment or segments are cumulatively acknowledged, the segment or segments are purged from the queue.
    4. If the queue is empty, TCP stops the timer; otherwise, TCP restarts the timer.
51. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- Retransmission Timer…
  - Round-Trip Time (RTT)
    - Measured RTT.
      - We need to find how long it takes to send a segment and receive an acknowledgment for it. This is the measured RTT.
      - In TCP, there can be only one RTT measurement in progress at any time.
      - We use the notation RTTM to stand for measured RTT.
    - Smoothed RTT.
      - The measured RTT, RTTM, is likely to change for each round trip.
      - The fluctuation is so high in today's Internet that a single measurement alone cannot be used for retransmission time-out purposes.
      - Most implementations use a smoothed RTT, called RTTS, which is a weighted average of RTTM and the previous RTTS:
          After the first measurement:    RTTS = RTTM
          After each later measurement:   RTTS = (1 − α) × RTTS + α × RTTM
      - The value of α is implementation-dependent, but it is normally set to 1/8. In other words, the new RTTS is calculated as 7/8 of the old RTTS and 1/8 of the current RTTM.
52. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- Retransmission Timer…
  - Round-Trip Time (RTT)…
    - RTT Deviation.
      - Most implementations do not just use RTTS; they also calculate the RTT deviation, called RTTD, based on the RTTS and RTTM:
          After the first measurement:    RTTD = RTTM / 2
          After each later measurement:   RTTD = (1 − β) × RTTD + β × |RTTS − RTTM|
      - (The value of β is also implementation-dependent, but is usually set to 1/4.)
53. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- Retransmission Timer…
  - Retransmission Time-out (RTO)
    - The value of RTO is based on the smoothed round-trip time and its deviation.
    - Most implementations use the following formula to calculate the RTO:
          RTO = RTTS + 4 × RTTD
    - In other words, take the running smoothed average value of RTTS and add four times the running smoothed average value of RTTD (normally a small value).
  - Karn's Algorithm
    - Do not consider the round-trip time of a retransmitted segment in the calculation of RTTS.
    - Do not update the value of RTTS until you send a segment and receive an acknowledgment without the need for retransmission.
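The RTT bookkeeping described on the last three slides can be sketched as a small estimator class. Treat it as an illustration under assumptions: the class name `RtoEstimator`, the initial RTO of 6 seconds, and the first-measurement rule RTTD = RTTM / 2 are not given on the slides, and the deviation is computed from the already-updated RTTS (RFC 6298 updates the deviation before the smoothed value, so real stacks may differ slightly).

```python
class RtoEstimator:
    """Tracks RTTS, RTTD, and RTO from measured round-trip times (seconds)."""

    def __init__(self, alpha: float = 1 / 8, beta: float = 1 / 4,
                 initial_rto: float = 6.0):
        self.alpha = alpha
        self.beta = beta
        self.rtt_s = None       # smoothed RTT, no value before first measurement
        self.rtt_d = None       # RTT deviation
        self.rto = initial_rto  # assumed starting value

    def on_ack(self, rtt_m: float, retransmitted: bool = False) -> float:
        # Karn's algorithm: ignore samples from retransmitted segments,
        # because the ACK cannot be matched to one particular transmission.
        if retransmitted:
            return self.rto
        if self.rtt_s is None:
            self.rtt_s = rtt_m
            self.rtt_d = rtt_m / 2
        else:
            self.rtt_s = (1 - self.alpha) * self.rtt_s + self.alpha * rtt_m
            self.rtt_d = ((1 - self.beta) * self.rtt_d
                          + self.beta * abs(self.rtt_s - rtt_m))
        self.rto = self.rtt_s + 4 * self.rtt_d
        return self.rto


est = RtoEstimator()
print(est.on_ack(1.5))                      # first measurement
print(est.on_ack(2.5))                      # second measurement
print(est.on_ack(9.0, retransmitted=True))  # ignored under Karn's algorithm
```

With measurements of 1.5 s and then 2.5 s, the sketch gives an RTO of 4.5 s and then 4.75 s; the third call leaves the RTO unchanged because the sample belongs to a retransmitted segment.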
54. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- Retransmission Timer…
  - Retransmission Time-out (RTO)
  - Exponential Backoff
    - What is the value of RTO if a retransmission occurs?
    - Most TCP implementations use an exponential backoff strategy.
    - The value of RTO is doubled for each retransmission.
    - So if the segment is retransmitted once, the value is two times the RTO.
    - If it is retransmitted twice, the value is four times the RTO, and so on.
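Exponential backoff is a one-line formula. In the sketch below the function name `backed_off_rto` is illustrative, and the optional `max_rto` cap is an assumption added for practicality; the slides do not mention an upper bound.

```python
def backed_off_rto(base_rto: float, retransmissions: int, max_rto=None) -> float:
    """RTO after a given number of retransmissions of the same segment."""
    rto = base_rto * (2 ** retransmissions)  # doubled for each retransmission
    if max_rto is not None:                  # optional cap (assumption)
        rto = min(rto, max_rto)
    return rto


for n in range(4):
    print(n, backed_off_rto(4.5, n))
```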
55. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- Persistence Timer
  - A deadlock can arise when the receiver advertises a window size of zero and the later acknowledgment that reopens the window is lost: both ends then wait for each other. To correct this deadlock, TCP uses a persistence timer for each connection. When the sending TCP receives an acknowledgment with a window size of zero, it starts a persistence timer.
  - When the persistence timer goes off, the sending TCP sends a special segment called a probe. This segment contains only 1 byte of new data.
  - It has a sequence number, but its sequence number is never acknowledged; it is even ignored in calculating the sequence number for the rest of the data.
  - The probe causes the receiving TCP to resend the acknowledgment.
56. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- Persistence Timer…
  - The value of the persistence timer is set to the value of the retransmission time.
  - However, if a response is not received from the receiver, another probe segment is sent and the value of the persistence timer is doubled and reset.
  - The sender continues sending the probe segments and doubling and resetting the value of the persistence timer until the value reaches a threshold (usually 60 s).
  - After that the sender sends one probe segment every 60 seconds until the window is reopened.
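The probe schedule described above can be written out as a short generator of waiting times. This is a sketch: the function name `persistence_schedule` and the sample starting value of 1.5 s are assumptions, while the 60 s threshold is the usual value quoted on the slide.

```python
def persistence_schedule(rto: float, probes: int, threshold: float = 60.0) -> list:
    """Waiting time (seconds) before each successive probe segment."""
    intervals = []
    timer = min(rto, threshold)            # starts at the retransmission time
    for _ in range(probes):
        intervals.append(timer)
        timer = min(timer * 2, threshold)  # doubled until the threshold is reached
    return intervals


print(persistence_schedule(1.5, 8))
```

Starting from 1.5 s, the waits grow as 1.5, 3, 6, 12, 24, 48 and then stay at 60 s until the window reopens.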
57. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- Keepalive Timer
  - A keepalive timer is used in some implementations to prevent a long idle connection between two TCPs, for example a server and a client.
  - Each time the server hears from a client, it resets this timer.
  - The time-out is usually 2 hours. If the server does not hear from the client after 2 hours, it sends a probe segment.
  - If there is no response after 10 probes, each of which is 75 seconds apart, it assumes that the client is down and terminates the connection.
58. TRANSMISSION CONTROL PROTOCOL: TCP TIMERS
- TIME-WAIT Timer
  - The TIME-WAIT (2MSL) timer is used during connection termination. The maximum segment lifetime (MSL) is the amount of time any segment can exist in a network before being discarded.
  - The implementation needs to choose a value for MSL. Common values are 30 seconds, 1 minute, or even 2 minutes.
  - The 2MSL timer is used when TCP performs an active close and sends the final ACK.
  - The connection must stay open for 2 MSL amount of time to allow TCP to resend the final ACK in case the ACK is lost.
  - This requires that the RTO timer at the other end time out, so that the other end resends its FIN and the final ACK can be sent again.
59. TRANSMISSION CONTROL PROTOCOL: OPTIONS
- The TCP header can have up to 40 bytes of optional information.
- Options convey additional information to the destination or align other options.
- These options are included on the book website for further reference.
60. STREAM CONTROL TRANSMISSION PROTOCOL
- Stream Control Transmission Protocol (SCTP) is a new transport-layer protocol designed to combine some features of UDP and TCP in an effort to create a better protocol for multimedia communication.
61. STREAM CONTROL TRANSMISSION PROTOCOL: SERVICES
- Process-to-Process Communication
- Multiple Streams
- Multihoming
- Full-Duplex Communication
- Connection-Oriented Service
- Reliable Service
62. STREAM CONTROL TRANSMISSION PROTOCOL: SERVICES
- Multiple Streams
  - SCTP allows multistream service in each connection, which is called an association in SCTP terminology.
  - If one of the streams is blocked, the other streams can still deliver their data.
63. STREAM CONTROL TRANSMISSION PROTOCOL: SERVICES
- Multihoming
  - In a TCP connection, even if the sender or receiver is a multihomed host (connected to more than one physical address with multiple IP addresses), only one of these IP addresses per end can be utilized during the connection.
  - An SCTP association supports multihoming service. The sending and receiving hosts can each define multiple IP addresses for an association.
64. STREAM CONTROL TRANSMISSION PROTOCOL: FEATURES
- Transmission Sequence Number (TSN)
  - TSNs are 32 bits long and randomly initialized between 0 and 2^32 − 1. Each data chunk must carry the corresponding TSN in its header.
- Stream Identifier (SI)
  - Each stream in SCTP needs to be identified using a stream identifier (SI). Each data chunk must carry the SI in its header so that when it arrives at the destination, it can be properly placed in its stream. The SI is a 16-bit number starting from 0.
- Stream Sequence Number (SSN)
  - SCTP defines each data chunk in each stream with a stream sequence number (SSN).
69. STREAM CONTROL TRANSMISSION PROTOCOL: AN SCTP ASSOCIATION
- A connection in SCTP is called an association to emphasize multihoming.
- Association establishment in SCTP requires a four-way handshake.
70. STREAM CONTROL TRANSMISSION PROTOCOL: AN SCTP ASSOCIATION
- Data Transfer
  - SCTP recognizes and maintains message boundaries.
  - Each message coming from the process is treated as one unit and inserted into a DATA chunk unless it is fragmented.
  - In SCTP, data chunks are related to each other.
71. STREAM CONTROL TRANSMISSION PROTOCOL: AN SCTP ASSOCIATION
- Data Transfer…
  - Multihoming Data Transfer
    - Multihoming allows both ends to define multiple IP addresses for communication. However, only one of these addresses can be defined as the primary address; the rest are alternative addresses.
    - Data transfer, by default, uses the primary address of the destination. If the primary is not available, one of the alternative addresses is used.
    - The process, however, can always override the primary address and explicitly request that a message be sent to one of the alternative addresses.
    - A process can also explicitly change the primary address of the current association.
72. STREAM CONTROL TRANSMISSION PROTOCOL: AN SCTP ASSOCIATION
- Data Transfer…
  - Multistream Delivery
    - SCTP differentiates between data transfer and data delivery.
    - SCTP uses TSN numbers to handle data transfer, the movement of data chunks between the source and destination.
    - The delivery of the data chunks is controlled by stream identifiers (SIs) and stream sequence numbers (SSNs).
    - SCTP can support multiple streams; a message can belong to one of these streams. Each stream is assigned a unique stream identifier (SI).
    - SCTP supports two types of data delivery in each stream: ordered (default) and unordered. In ordered data delivery, data chunks in a stream use SSNs to define their order in the stream.
73. STREAM CONTROL TRANSMISSION PROTOCOL: AN SCTP ASSOCIATION
- Data Transfer…
  - Fragmentation
    - Fragmentation in IP and in SCTP belong to different levels: the former at the network layer, the latter at the transport layer.
    - SCTP preserves the boundaries of the message from process to process when creating a DATA chunk from a message if the size of the message does not exceed the MTU of the path.
    - If the total size exceeds the MTU, the message needs to be fragmented. (For more details about fragmentation, see the Extra Materials section.)
74. STREAM CONTROL TRANSMISSION PROTOCOL: AN SCTP ASSOCIATION
- Association Termination
  - Unlike TCP, SCTP does not allow a "half-closed" association.
  - If one end closes the association, the other end must stop sending new data.
  - If any data are left over in the queue of the recipient of the termination request, they are sent and the association is closed.
75. STREAM CONTROL TRANSMISSION PROTOCOL: FLOW CONTROL
- Flow control in SCTP is similar to that in TCP.
- In TCP, we need to deal with only one unit of data, the byte. In SCTP, we need to handle two units of data, the byte and the chunk.
- The values of rwnd and cwnd are expressed in bytes; the values of TSN and acknowledgments are expressed in chunks.
76. STREAM CONTROL TRANSMISSION PROTOCOL: FLOW CONTROL
- Receiver Site
  - The receiver has one buffer (queue) and three variables.
    - The first variable, cumTSN, holds the last TSN received.
    - The second variable, winSize, holds the available buffer size.
    - The third variable, lastACK, holds the last cumulative acknowledgment.
77. STREAM CONTROL TRANSMISSION PROTOCOL: FLOW CONTROL
- Sender Site
  - The sender has one buffer (queue) and three variables (Figure 24.48). We assume each chunk is 100 bytes long.
    - The first variable, curTSN, refers to the next chunk to be sent.
    - The second variable, rwnd, holds the last value advertised by the receiver (in bytes).
    - The third variable, inTransit, holds the number of bytes in transit, that is, bytes sent but not yet acknowledged.
78. STREAM CONTROL TRANSMISSION PROTOCOL: ERROR CONTROL
- SCTP, like TCP, is a reliable transport-layer protocol.
- It uses a SACK chunk to report the state of the receiver buffer to the sender.
- Each implementation uses a different set of entities and timers for the receiver and sender sites.
- Here, a very simple design is used to convey the concept to the reader.
80. STREAM CONTROL TRANSMISSION PROTOCOL: ERROR CONTROL
- Sender Site
  - At the sender site, our design demands two buffers (queues): a sending queue and a retransmission queue.
  - We also use three variables: rwnd, inTransit, and curTSN, as described in the previous section.
81. STREAM CONTROL TRANSMISSION PROTOCOL: ERROR CONTROL
- Sender Site…
  - To see how the state of the sender changes, assume that the SACK in Figure 24.49 arrives at the sender site in Figure 24.50. Figure 24.51 shows the new state.
82. STREAM CONTROL TRANSMISSION PROTOCOL: ERROR CONTROL
- Sending Data Chunks
  - An end can send a data packet whenever there are data chunks in the sending queue with a TSN greater than or equal to curTSN or if there are data chunks in the retransmission queue.
  - The retransmission queue has priority.
  - However, the total size of the data chunk or chunks included in the packet must not exceed the (rwnd − inTransit) value and the total size of the frame must not exceed the MTU size, as we discussed in previous sections.
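The sending rule can be sketched with the three sender variables from the flow-control slides. This is a deliberately small model under assumptions: the class name `SctpSender` and the default `mtu` of 1500 bytes are hypothetical, only the sending queue is modeled (no retransmission queue), and the MTU test compares chunk data only, ignoring packet and chunk headers.

```python
from dataclasses import dataclass


@dataclass
class SctpSender:
    cur_tsn: int         # TSN of the next chunk to be sent
    rwnd: int            # last window advertised by the receiver, in bytes
    in_transit: int = 0  # bytes sent but not yet acknowledged
    mtu: int = 1500      # hypothetical path MTU, in bytes

    def can_send(self, chunk_sizes: list) -> bool:
        """True if these chunks fit in the window and in one packet."""
        total = sum(chunk_sizes)
        return total <= self.rwnd - self.in_transit and total <= self.mtu

    def send(self, chunk_sizes: list) -> bool:
        """Send the chunks in one packet if the rule allows it."""
        if not self.can_send(chunk_sizes):
            return False
        self.cur_tsn += len(chunk_sizes)     # TSNs count chunks
        self.in_transit += sum(chunk_sizes)  # windows count bytes
        return True


sender = SctpSender(cur_tsn=1, rwnd=1000)
print(sender.send([100] * 4), sender)  # four 100-byte chunks fit
print(sender.can_send([100] * 7))      # 700 bytes exceed rwnd - inTransit = 600
```

The example also shows the two units at work: curTSN advances by the number of chunks, while rwnd and inTransit are compared in bytes.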
83. STREAM CONTROL TRANSMISSION PROTOCOL: ERROR CONTROL
- Sending Data Chunks…
  - To control a lost or discarded chunk, SCTP, like TCP, employs two strategies: using retransmission timers and receiving four SACKs with the same missing chunks.
  - Retransmission.
    - SCTP uses a retransmission timer, which handles the retransmission time, the waiting time for an acknowledgment of a segment. The procedures for calculating RTO and RTT in SCTP are the same as we described for TCP. SCTP uses a measured RTT (RTTM), a smoothed RTT (RTTS), and an RTT deviation (RTTD) to calculate the RTO. SCTP also uses Karn's algorithm to avoid acknowledgment ambiguity. Note that if a host is using more than one IP address (multihoming), separate RTOs must be calculated and kept for each path.
84. STREAM CONTROL TRANSMISSION PROTOCOL: ERROR CONTROL
- Sending Data Chunks…
  - Four Missing Reports.
    - Whenever a sender receives four SACKs whose gap ACK information indicates one or more specific data chunks are missing, the sender needs to consider those chunks as lost and immediately move them to the retransmission queue. This behavior is analogous to "fast retransmission" in TCP.
85. STREAM CONTROL TRANSMISSION PROTOCOL: ERROR CONTROL
- Generating SACK Chunks
  - Another issue in error control is the generation of SACK chunks. We summarize the rules as listed below.
    1. When an end sends a DATA chunk to the other end, it must include a SACK chunk advertising the receipt of unacknowledged DATA chunks.
    2. When an end receives a packet containing data, but has no data to send, it needs to acknowledge the receipt of the packet within a specified time (usually 500 ms).
    3. An end must send at least one SACK for every other packet it receives. This rule overrides the second rule.
    4. When a packet arrives with out-of-order data chunks, the receiver needs to immediately send a SACK chunk reporting the situation to the sender.
    5. When an end receives a packet with duplicate DATA chunks and no new DATA chunks, the duplicate data chunks must be reported immediately with a SACK chunk.
86. STREAM CONTROL TRANSMISSION PROTOCOL: ERROR CONTROL
- Congestion Control
  - SCTP, like TCP, is a transport-layer protocol with packets subject to congestion in the network. The SCTP designers have used the same strategies for congestion control as those used in TCP.
In other words, the sender complements the sum two times. Note that this does not create confusion because the value of the
checksum is never all 1s in a normal situation.
Because the sending and the receiving processes may not necessarily write or read
data at the same rate, TCP needs buffers for storage. There are two buffers, the sending
buffer and the receiving buffer, one for each direction. We will see later that these
buffers are also necessary for flow- and error-control mechanisms used by TCP. One
way to implement a buffer is to use a circular array of 1-byte locations as shown in
Figure 24.5. For simplicity, we have shown two buffers of 20 bytes each; normally
the buffers are hundreds or thousands of bytes, depending on the implementation. We
also show the buffers as the same size, which is not always the case.
The figure shows the movement of the data in one direction. At the sender, the buffer
has three types of chambers. The white section contains empty chambers that can be
filled by the sending process (producer). The colored area holds bytes that have been
sent but not yet acknowledged. The TCP sender keeps these bytes in the buffer until it
receives an acknowledgment. The shaded area contains bytes to be sent by the sending
TCP. However, as we will see later in this chapter, TCP may be able to send only part of
this shaded section. This could be due to the slowness of the receiving process or to
congestion in the network. Also note that, after the bytes in the colored chambers are
acknowledged, the chambers are recycled and available for use by the sending process.
This is why we show a circular buffer.
The operation of the buffer at the receiver is simpler. The circular buffer is divided
into two areas (shown as white and colored). The white area contains empty chambers
to be filled by bytes received from the network. The colored sections contain received
bytes that can be read by the receiving process. When a byte is read by the receiving
process, the chamber is recycled and added to the pool of empty chambers.
Although buffering handles the disparity between the speed of the producing and consuming
processes, we need one more step before we can send data. The network layer, as a
service provider for TCP, needs to send data in packets, not as a stream of bytes. At the
transport layer, TCP groups a number of bytes together into a packet called a segment.
TCP adds a header to each segment (for control purposes) and delivers the segment to the
network layer for transmission. The segments are encapsulated in an IP datagram and
transmitted. This entire operation is transparent to the receiving process. Later we will see
that segments may be received out of order, lost or corrupted, and resent. All of these are
handled by the TCP receiver with the receiving application process unaware of TCP's
activities. Figure 24.6 shows how segments are created from the bytes in the buffers.
Note that segments are not necessarily all the same size. In the figure, for simplicity,
we show one segment carrying 3 bytes and the other carrying 5 bytes. In reality,
segments carry hundreds, if not thousands, of bytes.
The value in the sequence number field of a segment defines the number assigned to the
first data byte contained in that segment.
When a segment carries a combination of data and control information (piggybacking),
it uses a sequence number. If a segment does not carry user data, it does not
logically define a sequence number. The field is there, but the value is not valid. However,
some segments, when carrying only control information, need a sequence number
to allow an acknowledgment from the receiver. These segments are used for connection
establishment, termination, or abortion. Each of these segments consumes one sequence
number as though it carries one byte, but there are no actual data. We will elaborate on
this issue when we discuss connections.
The term cumulative here means that if a party uses 5643 as an acknowledgment number,
it has received all bytes from the beginning up to 5642. Note that this does not mean
that the party has received 5642 bytes, because the first byte number does not have
to be 0.
- Source port address. This is a 16-bit field that defines the port number of the
  application program in the host that is sending the segment.
- Destination port address. This is a 16-bit field that defines the port number of the
  application program in the host that is receiving the segment.
- Sequence number. This 32-bit field defines the number assigned to the first byte of
  data contained in this segment. As we said before, TCP is a stream transport protocol.
  To ensure connectivity, each byte to be transmitted is numbered. The sequence
  number tells the destination which byte in this sequence is the first byte in the segment.
  During connection establishment (discussed later) each party uses a random
  number generator to create an initial sequence number (ISN), which is usually
  different in each direction.
- Acknowledgment number. This 32-bit field defines the byte number that the
  receiver of the segment is expecting to receive from the other party. If the receiver
  of the segment has successfully received byte number x from the other party, it
  returns x + 1 as the acknowledgment number. Acknowledgment and data can be
  piggybacked together.
- Header length. This 4-bit field indicates the number of 4-byte words in the TCP
  header. The length of the header can be between 20 and 60 bytes. Therefore, the
  value of this field is always between 5 (5 × 4 = 20) and 15 (15 × 4 = 60).
- Control. This field defines 6 different control bits or flags, as shown in Figure 24.8.
  One or more of these bits can be set at a time. These bits enable flow control, connection
  establishment and termination, connection abortion, and the mode of data
  transfer in TCP. A brief description of each bit is shown in the figure. We will discuss
  them further when we study the detailed operation of TCP later in the chapter.
- Window size. This field defines the window size of the sending TCP in bytes. Note
  that the length of this field is 16 bits, which means that the maximum size of the
  window is 65,535 bytes. This value is normally referred to as the receiving window
  (rwnd) and is determined by the receiver. The sender must obey the dictation of the
  receiver in this case.
- Checksum. This 16-bit field contains the checksum. The calculation of the checksum
  for TCP follows the same procedure as the one described for UDP. However, the
  use of the checksum in the UDP datagram is optional, whereas the use of the
  checksum for TCP is mandatory. The same pseudoheader, serving the same
  purpose, is added to the segment. For the TCP pseudoheader, the value for the protocol
  field is 6. See Figure 24.9.
- Urgent pointer. This 16-bit field, which is valid only if the urgent flag is set, is
  used when the segment contains urgent data. It defines a value that must be added
  to the sequence number to obtain the number of the last urgent byte in the data section
  of the segment. This will be discussed later in this chapter.
- Options. There can be up to 40 bytes of optional information in the TCP header.
  We will discuss some of the options used in the TCP header later in the section.
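The field layout just described maps directly onto a fixed 20-byte structure followed by options. The following sketch decodes it with Python's `struct` module; the function name `parse_tcp_header` and the dictionary keys are illustrative, and the sample segment at the end is made up for the demonstration.

```python
import struct

# The six control bits, from least significant bit upward
FLAG_NAMES = ("FIN", "SYN", "RST", "PSH", "ACK", "URG")


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed part of a TCP header and slice out the options."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    (src, dst, seq, ack, hlen_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    hlen_words = hlen_flags >> 12      # top 4 bits: header length in 4-byte words
    if hlen_words < 5:
        raise ValueError("header length must be between 5 and 15 words")
    flag_bits = hlen_flags & 0x3F      # low 6 bits: control flags
    header_bytes = hlen_words * 4
    return {
        "source_port": src,
        "destination_port": dst,
        "sequence_number": seq,
        "ack_number": ack,
        "header_length": header_bytes,  # 20 to 60 bytes
        "flags": [n for bit, n in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)],
        "window_size": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
        "options": segment[20:header_bytes],
    }


# A made-up SYN segment: 5-word header, SYN flag (0x02) set, no options
syn = struct.pack("!HHIIHHHH", 52100, 80, 8000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(syn))
```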
Encapsulation
A TCP segment encapsulates the data received from the application layer. The TCP
segment is encapsulated in an IP datagram, which in turn is encapsulated in a frame at
the data-link layer.
TCP is connection-oriented. As discussed before, a connection-oriented transport protocol
establishes a logical path between the source and destination. All of the segments
belonging to a message are then sent over this logical path. Using a single logical pathway
for the entire message facilitates the acknowledgment process as well as retransmission
of damaged or lost segments. You may wonder how TCP, which uses the services
of IP, a connectionless protocol, can be connection-oriented. The point is that a TCP
connection is logical, not physical. TCP operates at a higher level. TCP uses the services
of IP to deliver individual segments to the receiver, but it controls the connection
itself. If a segment is lost or corrupted, it is retransmitted. Unlike TCP, IP is unaware of
this retransmission. If a segment arrives out of order, TCP holds it until the missing segments
arrive; IP is unaware of this reordering.
A SYN segment cannot carry data, but it consumes one sequence number.
A SYN + ACK segment cannot carry data, but it does consume one sequence number.
An ACK segment, if carrying no data, consumes no sequence number.
SYN Flooding Attack
The connection establishment procedure in TCP is susceptible to a serious security
problem called SYN flooding attack. This happens when one or more malicious attackers
send a large number of SYN segments to a server pretending that each of them is
coming from a different client by faking the source IP addresses in the datagrams. The
server, assuming that the clients are issuing an active open, allocates the necessary
resources, such as creating transmission control block (TCB) tables and setting timers. The
TCP server then sends the SYN + ACK segments to the fake clients, which are lost.
While the server waits for the third leg of the handshaking process, however, these resources
remain allocated without being used. If, during this short period of time, the number of
SYN segments is large, the server eventually runs out of resources and may be unable
to accept connection requests from valid clients. This SYN flooding attack belongs to a
group of security attacks known as denial-of-service attacks, in which an attacker
monopolizes a system with so many service requests that the system overloads and
denies service to valid requests.
Some implementations of TCP have strategies to alleviate the effect of a SYN
attack. Some have imposed a limit of connection requests during a specified period of
time. Others try to filter out datagrams coming from unwanted source addresses. One
recent strategy is to postpone resource allocation until the server can verify that the
connection request is coming from a valid IP address, by using what is called
a cookie. SCTP, the new transport-layer protocol that we discuss later, uses this
strategy.
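The cookie idea can be sketched as follows. This is only an illustration of postponing resource allocation, not the encoding used by any real TCP or SCTP implementation; the names make_cookie, verify_cookie, SECRET, and max_age, and the choice of HMAC-SHA256, are assumptions. The server derives the cookie from the client's address and a timestamp, keeps no per-client state, and allocates resources only when a later packet returns a cookie that verifies.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(32)  # hypothetical per-server secret key

def make_cookie(client_ip: str, client_port: int, timestamp: int,
                key: bytes = SECRET) -> bytes:
    """Compute a cookie from the request alone; nothing is stored on the server."""
    message = f"{client_ip}|{client_port}|{timestamp}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_cookie(cookie: bytes, client_ip: str, client_port: int,
                  timestamp: int, now: int, max_age: int = 60,
                  key: bytes = SECRET) -> bool:
    """Accept the echoed cookie only if it is fresh and matches the sender."""
    if timestamp > now or now - timestamp > max_age:
        return False
    expected = make_cookie(client_ip, client_port, timestamp, key)
    return hmac.compare_digest(cookie, expected)

cookie = make_cookie("192.0.2.7", 52100, timestamp=1000)
print(verify_cookie(cookie, "192.0.2.7", 52100, timestamp=1000, now=1010))
print(verify_cookie(cookie, "198.51.100.9", 52100, timestamp=1000, now=1010))
```

A client that faked its source address never receives the cookie, so it cannot echo it, and the server never allocates resources for that request.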
The application program at the sender can
request a push operation. This means that the sending TCP must not wait for the window
to be filled. It must create a segment and send it immediately. The sending TCP
must also set the push bit (PSH) to let the receiving TCP know that the segment
includes data that must be delivered to the receiving application program as soon as
possible and not to wait for more data to come. In effect, this changes byte-oriented
TCP into chunk-oriented TCP, but TCP can choose whether or not to use this
feature.
There are occasions in which an application program needs to send
urgent bytes, some bytes that need to be treated in a special way by the application at the
other end. The solution is to send a segment with the URG bit set. The sending application
program tells the sending TCP that the piece of data is urgent. The sending TCP
creates a segment and inserts the urgent data at the beginning of the segment. The rest
of the segment can contain normal data from the buffer. The urgent pointer field in the
header defines the end of the urgent data (the last byte of urgent data). For example, if
the segment sequence number is 15000 and the value of the urgent pointer is 200, the
first byte of urgent data is the byte 15000 and the last byte is the byte 15200. The rest of
the bytes in the segment (if present) are nonurgent.
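The arithmetic in this example can be written as a short Python sketch. The function name urgent_range is an assumption; the wraparound at 2^32 reflects the 32-bit sequence number field.

```python
def urgent_range(seq_number: int, urgent_pointer: int):
    """Return (first, last) byte numbers of the urgent data in a segment."""
    first = seq_number                              # urgent data starts the segment
    last = (seq_number + urgent_pointer) % 2**32    # sequence numbers wrap at 2^32
    return first, last

print(urgent_range(15000, 200))
```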
It is important to mention that TCP's urgent data is neither a priority service nor an
out-of-band data service as some people think. Rather, TCP urgent mode is a service by
which the application program at the sender side marks some portion of the byte stream
as needing special treatment by the application program at the receiver side. The
receiving TCP delivers bytes (urgent or nonurgent) to the application program in order,
but informs the application program about the beginning and end of urgent data. It is
left to the application program to decide what to do with the urgent data.
The FIN segment consumes one sequence number if it does not carry data.
The FIN + ACK segment consumes only one sequence number if it does not carry data.
Connection Reset
TCP at one end may deny a connection request, may abort an existing connection, or
may terminate an idle connection. All of these are done with the RST (reset) flag.
In TCP, one end can stop sending data while still receiving data. This is called a half-close. Either the server or the client can issue a half-close request. It can occur when the
server needs all the data before processing can begin.
The state marked ESTABLISHED in the FSM is in fact two different
sets of states that the client and server undergo to transfer data.
Suppose that a segment is not acknowledged during the retransmission time-out period
and is therefore retransmitted. When the sending TCP receives an acknowledgment for
this segment, it does not know if the acknowledgment is for the original segment or for
the retransmitted one. The value of the new RTT is based on the departure of the segment.
However, if the original segment was lost and the acknowledgment is for the
retransmitted one, the value of the current RTT must be calculated from the time the
segment was retransmitted. This ambiguity was solved by Karn.
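Karn's solution is simple: do not use the RTT of a retransmitted segment when updating the estimate. A minimal sketch follows, assuming a smoothed RTT with a weight of 1/8 for each new sample; the class name RttEstimator and its method are hypothetical.

```python
class RttEstimator:
    """Smoothed RTT that skips ambiguous samples (Karn's rule)."""

    def __init__(self, alpha: float = 0.125):
        self.alpha = alpha    # weight of a new sample (assumed value)
        self.srtt = None      # no measurement yet

    def on_ack(self, sent_at: float, acked_at: float, was_retransmitted: bool):
        if was_retransmitted:
            # The ACK may belong to the original or to the retransmission,
            # so the sample is ambiguous and is ignored.
            return self.srtt
        sample = acked_at - sent_at
        if self.srtt is None:
            self.srtt = sample
        else:
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        return self.srtt

est = RttEstimator()
est.on_ack(0.0, 1.0, was_retransmitted=False)   # first sample: 1.0
est.on_ack(1.0, 5.0, was_retransmitted=True)    # ignored
print(est.on_ack(5.0, 7.0, was_retransmitted=False))
```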
To deal with a zero-window-size advertisement, TCP needs another timer. If the receiving
TCP announces a window size of zero, the sending TCP stops transmitting segments
until the receiving TCP sends an ACK segment announcing a nonzero window
size. This ACK segment can be lost. Remember that ACK segments are neither acknowledged
nor retransmitted in TCP. If this acknowledgment is lost, the receiving TCP
thinks that it has done its job and waits for the sending TCP to send more segments.
There is no retransmission timer for a segment containing only an acknowledgment.
The sending TCP has not received an acknowledgment and waits for the other TCP to
send an acknowledgment advertising the size of the window. Both TCPs might continue
to wait for each other forever (a deadlock).
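The extra timer is the persistence timer: when it expires, the sending TCP sends a small probe segment, and the acknowledgment of the probe carries the current window size. The following sketch simulates this in abstract clock ticks. The function resume_tick and its parameters are hypothetical, propagation delay is ignored, and the probe interval is fixed for simplicity, although real implementations increase it between probes.

```python
from typing import Optional

def resume_tick(reopen_tick: int, update_lost: bool,
                persist_interval: Optional[int], horizon: int = 100) -> Optional[int]:
    """Tick at which the sender learns of the nonzero window, or None (deadlock)."""
    if not update_lost:
        return reopen_tick        # the window-update ACK arrives normally
    if persist_interval is None:
        return None               # no persistence timer: both ends wait forever
    # With a persistence timer, the sender probes at each expiry; the first
    # probe sent after the window reopened is answered with the new window.
    for tick in range(persist_interval, horizon + 1, persist_interval):
        if tick >= reopen_tick:
            return tick
    return None

print(resume_tick(7, update_lost=True, persist_interval=None))
print(resume_tick(7, update_lost=True, persist_interval=5))
```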
The client and the server can make an association using four different pairs of IP addresses. However, note
that in the current implementations of SCTP, only one pair of IP addresses can be chosen
for normal communication; the alternative is used if the main choice fails. In other
words, at present, SCTP does not allow load sharing between different paths.
An association may send
many packets, a packet may contain several chunks, and chunks may belong to different
streams. To make the definitions of these terms clear, let us suppose that process A
needs to send 11 messages to process B in three streams. The first four messages are in
the first stream, the second three messages are in the second stream, and the last four
messages are in the third stream. Although a message, if long, can be carried by several
data chunks, we assume that each message fits into one data chunk. Therefore, we have
11 data chunks in three streams.
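One way to picture these 11 data chunks is to number them. In SCTP, each data chunk carries a transmission sequence number (TSN) that is unique in the association, a stream identifier (SI), and a stream sequence number (SSN) that counts chunks within one stream. The sketch below assumes, purely for illustration, that the TSN starts at 1 and that SI and SSN start at 0; the function name number_chunks is hypothetical.

```python
def number_chunks(messages_per_stream, first_tsn=1):
    """Assign (TSN, SI, SSN) to one data chunk per message."""
    chunks = []
    tsn = first_tsn
    for si, count in enumerate(messages_per_stream):
        for ssn in range(count):
            chunks.append({"tsn": tsn, "si": si, "ssn": ssn})
            tsn += 1              # the TSN runs across all streams
    return chunks

chunks = number_chunks([4, 3, 4])   # four, three, and four messages
for chunk in chunks:
    print(chunk)
```

The TSN increases across the whole association, whereas the SSN restarts in each stream; how the chunks are grouped into packets is a separate decision.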
The steps, in a normal situation, are as follows:
1. The client sends the first packet, which contains an INIT chunk. The verification
tag (VT) of this packet (defined in the general header) is 0 because no verification
tag has yet been defined for this direction (client to server). The INIT chunk includes
an initiation tag to be used for packets from the other direction (server to client).
The chunk also defines the initial TSN for this direction and advertises a value for
rwnd. The value of rwnd is normally advertised in a SACK chunk; it is done here
because SCTP allows the inclusion of a DATA chunk in the third and fourth packets;
the server must be aware of the available client buffer size. Note that no other chunks
can be sent with the first packet.
2. The server sends the second packet, which contains an INIT ACK chunk. The verification
tag is the value of the initiation tag field in the INIT chunk. This chunk initiates
the tag to be used in the other direction, defines the initial TSN for data flow
from server to client, and sets the server's rwnd. The value of rwnd is defined to
allow the client to send a DATA chunk with the third packet. The INIT ACK also
sends a cookie that defines the state of the server at this moment. We will discuss
the use of the cookie shortly.
3. The client sends the third packet, which includes a COOKIE ECHO chunk. This is
a very simple chunk that echoes, without change, the cookie sent by the server.
SCTP allows the inclusion of data chunks in this packet.
4. The server sends the fourth packet, which includes the COOKIE ACK chunk that
acknowledges the receipt of the COOKIE ECHO chunk. SCTP allows the inclusion
of data chunks with this packet.
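The four steps can be sketched as a list of packets, each represented by a plain dictionary. The function name four_way_handshake, the dictionary keys, and all numeric values are hypothetical. The verification tags of the third and fourth packets are assumed to follow from the rule above: each side uses the initiation tag announced by the other side.

```python
def four_way_handshake(client_tag, server_tag, client_tsn, server_tsn,
                       client_rwnd, server_rwnd, cookie):
    """Return the four packets of an association setup, in order."""
    init = {"from": "client", "vt": 0, "chunk": "INIT",
            "init_tag": client_tag, "init_tsn": client_tsn, "rwnd": client_rwnd}
    init_ack = {"from": "server", "vt": init["init_tag"], "chunk": "INIT ACK",
                "init_tag": server_tag, "init_tsn": server_tsn,
                "rwnd": server_rwnd, "cookie": cookie}
    # The client echoes the cookie without change.
    cookie_echo = {"from": "client", "vt": init_ack["init_tag"],
                   "chunk": "COOKIE ECHO", "cookie": init_ack["cookie"]}
    cookie_ack = {"from": "server", "vt": init["init_tag"], "chunk": "COOKIE ACK"}
    return [init, init_ack, cookie_echo, cookie_ack]

packets = four_way_handshake(client_tag=1200, server_tag=5000,
                             client_tsn=100, server_tsn=1700,
                             client_rwnd=1000, server_rwnd=2000,
                             cookie=b"server-state")
for packet in packets:
    print(packet["from"], packet["chunk"], "VT =", packet["vt"])
```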
Association termination uses
three packets, as shown in Figure 24.46. Note that although the figure shows the case
in which termination is initiated by the client, it can also be initiated by the server.
To show the concept, we make some unrealistic
assumptions. We assume that there is never congestion in the network and that
the network is error free. In other words, we assume that cwnd is infinite and no packet
is lost, is delayed, or arrives out of order. We also assume that data transfer is unidirectional.
We correct our unrealistic assumptions in later sections. Current SCTP implementations
still use a byte-oriented window for flow control. We, however, show a
buffer in terms of chunks to make the concept easier to understand.
1. When the site receives a data chunk, it stores it at the end of the buffer (queue) and
subtracts the size of the chunk from winSize. The TSN number of the chunk is
stored in the cumTSN variable.
2. When the process reads a chunk, it removes it from the queue and adds the size of
the removed chunk to winSize (recycling).
3. When the receiver decides to send a SACK, it checks the value of lastACK; if it is
less than cumTSN, it sends a SACK with a cumulative TSN number equal to the
cumTSN. It also includes the value of winSize as the advertised window size. The
value of lastACK is then updated to hold the value of cumTSN.
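The three receiver rules can be sketched as a small Python class. The variable names mirror winSize, cumTSN, and lastACK; the class name Receiver, its method names, and the sample sizes are assumptions, and the sketch keeps the simplifying assumption that chunks arrive in order.

```python
from collections import deque

class Receiver:
    def __init__(self, win_size: int, last_ack: int):
        self.queue = deque()      # (tsn, size) pairs waiting for the process
        self.win_size = win_size
        self.cum_tsn = last_ack
        self.last_ack = last_ack

    def receive(self, tsn: int, size: int) -> None:
        """Rule 1: store the chunk, shrink the window, record its TSN."""
        self.queue.append((tsn, size))
        self.win_size -= size
        self.cum_tsn = tsn

    def read(self) -> int:
        """Rule 2: the process reads a chunk and its space is recycled."""
        tsn, size = self.queue.popleft()
        self.win_size += size
        return tsn

    def make_sack(self):
        """Rule 3: send a SACK only if something new can be acknowledged."""
        if self.last_ack < self.cum_tsn:
            self.last_ack = self.cum_tsn
            return {"cum_tsn": self.cum_tsn, "rwnd": self.win_size}
        return None

receiver = Receiver(win_size=2000, last_ack=20)
receiver.receive(21, 100)
receiver.receive(22, 100)
print(receiver.make_sack())
```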
The following is the procedure used by the sender.
1. A chunk pointed to by curTSN can be sent if the size of the data is less than or
equal to the quantity (rwnd − inTransit). After sending the chunk, the value of
curTSN is incremented by one and now points to the next chunk to be sent. The
value of inTransit is incremented by the size of the data in the transmitted chunk.
2. When a SACK is received, the chunks with a TSN less than or equal to the cumulative
TSN in the SACK are removed from the queue and discarded. The sender
does not have to worry about them anymore. The value of inTransit is reduced by
the total size of the discarded chunks. The value of rwnd is updated with the value
of the advertised window in the SACK.
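The two sender rules can be sketched in the same style. The variable names mirror curTSN, rwnd, and inTransit; the class name Sender, its method names, and the sample sizes are assumptions made for illustration.

```python
class Sender:
    def __init__(self, chunks: dict, rwnd: int):
        self.queue = dict(chunks)     # tsn -> size of the data in the chunk
        self.cur_tsn = min(chunks)    # next chunk to be sent
        self.rwnd = rwnd
        self.in_transit = 0

    def try_send(self):
        """Rule 1: send the chunk at curTSN if it fits in (rwnd - inTransit)."""
        size = self.queue.get(self.cur_tsn)
        if size is None or size > self.rwnd - self.in_transit:
            return None
        tsn = self.cur_tsn
        self.cur_tsn += 1
        self.in_transit += size
        return tsn

    def on_sack(self, cum_tsn: int, advertised_rwnd: int) -> None:
        """Rule 2: discard acknowledged chunks and adopt the advertised window."""
        acked = [t for t in self.queue if t <= cum_tsn and t < self.cur_tsn]
        for tsn in acked:
            self.in_transit -= self.queue.pop(tsn)
        self.rwnd = advertised_rwnd

sender = Sender({1: 100, 2: 100, 3: 100}, rwnd=200)
print(sender.try_send(), sender.try_send(), sender.try_send())  # third send is blocked
sender.on_sack(cum_tsn=1, advertised_rwnd=200)
print(sender.try_send())
```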
In our design, the receiver stores all chunks that have arrived in its queue including
the out-of-order ones. However, it leaves spaces for any missing chunks. It discards
duplicate messages, but keeps track of them for reports to the sender. Figure 24.49
shows a typical design for the receiver site and the state of the receiving queue at a
particular point in time.
The last acknowledgment sent was for data chunk 20. The available window size is
1000 bytes. Chunks 21 to 23 have been received in order. The first out-of-order block
contains chunks 26 to 28. The second out-of-order block contains chunks 31 to 34. A
variable holds the value of cumTSN. An array of variables keeps track of the beginning
and the end of each block that is out of order. An array of variables holds the duplicate
chunks received. Note that there is no need to store duplicate chunks in the queue;
they will be discarded. The figure also shows the SACK chunk that will be sent to
report the state of the receiver to the sender. The TSN numbers for out-of-order chunks
are relative (offsets) to the cumulative TSN.
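The construction of this SACK can be sketched as follows. The function name build_sack and the dictionary layout are assumptions; the input values are the ones described above (last acknowledgment 20, chunks 21 to 23, 26 to 28, and 31 to 34 received, window size 1000).

```python
def build_sack(received, last_ack, win_size, duplicates=()):
    """Build a SACK with a cumulative TSN and gap blocks as offsets from it."""
    have = set(received)
    cum = last_ack
    while cum + 1 in have:        # advance over chunks received in order
        cum += 1
    blocks = []
    start = prev = None
    for tsn in sorted(t for t in have if t > cum):
        if start is None:
            start = prev = tsn
        elif tsn == prev + 1:
            prev = tsn            # still inside the same out-of-order block
        else:
            blocks.append((start - cum, prev - cum))
            start = prev = tsn
    if start is not None:
        blocks.append((start - cum, prev - cum))
    return {"cum_tsn": cum, "rwnd": win_size,
            "gap_blocks": blocks, "duplicates": list(duplicates)}

sack = build_sack([21, 22, 23, 26, 27, 28, 31, 32, 33, 34],
                  last_ack=20, win_size=1000)
print(sack)
```

With a cumulative TSN of 23, the block of chunks 26 to 28 is reported as offsets 3 to 5 and the block of chunks 31 to 34 as offsets 8 to 11.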
Figure 24.50 shows a typical design.
The sending queue holds chunks 23 to 40. The chunks 23 to 36 have already
been sent, but not acknowledged; they are outstanding chunks. The curTSN points to
the next chunk to be sent (37). We assume that each chunk is 100 bytes, which means
that 1400 bytes of data (chunks 23 to 36) are in transit. The sender at this moment has
a retransmission queue. When a packet is sent, a retransmission timer starts for that
packet. Some implementations use a single timer for the entire association, but we
continue with our tradition of one timer for each packet for simplification. When the
retransmission timer for a packet expires, or three SACKs arrive that declare a packet
as missing (fast retransmission was discussed for TCP), the chunks in that packet are
moved to the retransmission queue to be resent. These chunks are considered lost,
rather than outstanding. The chunks in the retransmission queue have priority. In
other words, the next time the sender sends a chunk, it would be chunk 21 from the
retransmission queue.
1. All chunks having a TSN equal to or less than the cumTSN in the SACK are
removed from the sending or retransmission queue. They are no longer outstanding
or marked for retransmission. Chunks 21 and 22 are removed from the retransmission
queue and 23 is removed from the sending queue.
2. Our design also removes all chunks from the sending queue that are declared in the
gap blocks; some conservative implementations, however, save these chunks until
a cumTSN arrives that includes them. This precaution is needed for the rare occasion
when the receiver finds some problem with these out-of-order chunks. We
ignore these rare occasions. Chunks 26 to 28 and chunks 31 to 34, therefore, are
removed from the sending queue.
3. The list of duplicate chunks does not have any effect.
4. The value of rwnd is changed to 1000 as advertised in the SACK chunk.
5. We also assume that the retransmission timer for the packet that carried chunks 24
and 25 has expired. These move to the retransmission queue and a new retransmission
timer is set according to the exponential backoff rule discussed for TCP.
6. The value of inTransit becomes 400 because only 4 chunks are now in transit. The
chunks in the retransmission queue are not counted because they are assumed lost,
not in transit.
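The six steps can be checked with a short sketch. The function name process_sack and the list-based queues are assumptions; the input state is the one described above (chunks 23 to 36 outstanding, chunks 21 and 22 in the retransmission queue, 100 bytes per chunk).

```python
def process_sack(outstanding, retransmit, sack, expired, chunk_size=100):
    """Apply a SACK and a timer expiry to the sender's two queues."""
    cum = sack["cum_tsn"]
    gap_acked = set()
    for start, end in sack["gap_blocks"]:          # offsets relative to cumTSN
        gap_acked.update(range(cum + start, cum + end + 1))
    # Steps 1 and 2: drop chunks covered by cumTSN or by a gap block.
    retransmit = [t for t in retransmit if t > cum]
    outstanding = [t for t in outstanding if t > cum and t not in gap_acked]
    # Step 5: chunks whose timer expired move to the retransmission queue.
    for tsn in expired:
        if tsn in outstanding:
            outstanding.remove(tsn)
            retransmit.append(tsn)
    # Step 6: only the chunks still outstanding count as in transit.
    in_transit = len(outstanding) * chunk_size
    return outstanding, retransmit, in_transit, sack["rwnd"]   # step 4: new rwnd

sack = {"cum_tsn": 23, "rwnd": 1000, "gap_blocks": [(3, 5), (8, 11)]}
state = process_sack(list(range(23, 37)), [21, 22], sack, expired=[24, 25])
print(state)
```

The chunks left outstanding are 29, 30, 35, and 36, which gives the 400 bytes in transit stated in step 6.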
If
we assume, in our previous scenario, that our packet can take 3 chunks (due to the MTU
restriction), then chunks 24 and 25 from the retransmission queue and chunk 37, the next
chunk ready to be sent in the sending queue, can be sent. Note that the outstanding
chunks in the sending queue cannot be sent; they are assumed to be in transit. Note also
that any chunk sent from the retransmission queue is also timed for retransmission again.
The new timer affects chunks 24, 25, and 37. We need to mention here that some implementations
may not allow mixing chunks from the retransmission queue and the sending
queue. In this case, only chunks 24 and 25 can be sent in the packet. (The format of the
data chunk is on the book website.)
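The choice of chunks for the next packet can be sketched as follows. The function name next_packet and its parameters are hypothetical, and the sketch ignores the flow-control check for brevity.

```python
def next_packet(retransmit, unsent, cur_tsn, max_chunks=3, allow_mixing=True):
    """Pick the chunks for the next packet; the retransmission queue has priority."""
    chunks = list(retransmit[:max_chunks])
    if allow_mixing or not chunks:
        tsn = cur_tsn
        # Fill the remaining room with new chunks, starting at curTSN.
        while len(chunks) < max_chunks and tsn in unsent:
            chunks.append(tsn)
            tsn += 1
    return chunks

unsent = set(range(37, 41))     # chunks 37 to 40 have not been sent yet
print(next_packet([24, 25], unsent, cur_tsn=37))
print(next_packet([24, 25], unsent, cur_tsn=37, allow_mixing=False))
```

With mixing allowed, the packet carries chunks 24, 25, and 37; without it, only chunks 24 and 25 are sent.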
The rules for generating SCTP SACK chunks are similar to the rules used for acknowledgment with the TCP ACK flag.