TCP acceleration


TCP acceleration is the name of a series of techniques for achieving better throughput on a network connection than standard TCP achieves, without modifying the end applications. It is an alternative or a supplement to TCP tuning.
Commonly used approaches include checksum offloading, TCP segmentation and reassembly offloading, DMA offloading, ACK pacing, TCP transparent proxies in two or more middleboxes, and TCP offload engines.

TCP transparent proxies

TCP transparent proxies break up long end-to-end control loops into several smaller control loops by intercepting and relaying TCP connections within the network. This lets TCP flows react more quickly to packet losses that occur within the network, which in turn yields higher throughput.
The idea of a TCP accelerator is to terminate TCP connections inside the network processor and then relay the data over a second connection toward the end system. The data packets that originate from the sender are buffered at the accelerator node, which is responsible for performing local retransmissions in the event of packet loss. Thus, in case of losses, the feedback loop between the sender and the receiver is shortened to the one between the acceleration node and the receiver, which allows faster delivery of data to the receiver.
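The buffering and local retransmission described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how any particular accelerator is implemented: the class name `AcceleratorBuffer` and its methods are hypothetical, sequence numbers count whole segments rather than bytes, and ACKs from the receiver are cumulative.

```python
from collections import OrderedDict


class AcceleratorBuffer:
    """Toy model of the segment buffer on an accelerator node (hypothetical)."""

    def __init__(self):
        # seq -> payload for segments relayed but not yet ACKed by the receiver
        self._unacked = OrderedDict()

    def on_segment_from_sender(self, seq, payload):
        """Keep a copy of the segment and return the number to ACK to the sender."""
        self._unacked[seq] = payload
        # The sender is ACKed here, before the receiver has the data.
        return seq

    def on_ack_from_receiver(self, ack_seq):
        """Cumulative ACK: drop every buffered segment up to and including ack_seq."""
        for seq in [s for s in self._unacked if s <= ack_seq]:
            del self._unacked[seq]

    def segments_to_retransmit(self):
        """Segments the accelerator would resend locally after a loss or timeout."""
        return list(self._unacked.items())
```

If segments 1 to 3 are relayed and the receiver acknowledges only segment 1, `segments_to_retransmit()` returns segments 2 and 3, so a loss is repaired from the accelerator rather than from the original sender.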
Since TCP is a rate-adaptive protocol, the rate at which the TCP sender injects packets into the network depends on the prevailing load condition within the network as well as the processing capacity of the receiver. The sender judges the prevailing conditions within the network on the basis of the acknowledgments it receives. The acceleration node splits the feedback loop between the sender and the receiver and thus provides a shorter round-trip time (RTT) on each segment. A shorter RTT is beneficial because it lets the sender detect changes in the network sooner and adapt to them faster.
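To give a rough sense of why a shorter RTT helps, the sketch below uses the well-known Mathis et al. approximation for steady-state TCP throughput, roughly (MSS / RTT) × (√1.5 / √p) for loss probability p. This model is not part of the description above; it is an assumption brought in for illustration, and it only applies to Reno-style congestion avoidance under random loss. The function names and the example figures are hypothetical.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_prob):
    """Approximate steady-state TCP throughput in bits/s (Mathis et al. model)."""
    return (mss_bytes * 8 / rtt_s) * (math.sqrt(1.5) / math.sqrt(loss_prob))


def end_to_end_throughput_bps(mss_bytes, segments):
    """One TCP loop over all segments: RTTs add up, losses compound."""
    total_rtt = sum(rtt for rtt, _ in segments)
    delivered = 1.0
    for _, p in segments:
        delivered *= 1.0 - p  # assumes independent loss on each segment
    return mathis_throughput_bps(mss_bytes, total_rtt, 1.0 - delivered)


def split_throughput_bps(mss_bytes, segments):
    """One TCP loop per segment: the slowest segment limits the flow."""
    return min(mathis_throughput_bps(mss_bytes, rtt, p) for rtt, p in segments)


# Hypothetical path: two segments, each with 50 ms RTT and 0.1% loss.
path = [(0.05, 0.001), (0.05, 0.001)]
gain = split_throughput_bps(1460, path) / end_to_end_throughput_bps(1460, path)
```

Under these assumptions the split connection comes out a little under three times faster than the single end-to-end loop, because each loop sees half the RTT and about half the loss. Real gains depend on the congestion control algorithm in use and on the buffering available at the accelerator.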
Disadvantages of the method include the fact that the TCP session has to be directed through the accelerator: if routing changes so that the accelerator is no longer in the path, the connection will be broken. It also destroys the end-to-end property of the TCP ACK mechanism: when the sender receives an ACK, the packet has been stored by the accelerator but not necessarily delivered to the receiver.

Asymmetric TCP acceleration

TCP proxies require devices to be deployed at both ends of the communication because the protocol running between the proxies is usually proprietary. Asymmetric TCP acceleration, by contrast, is able to boost network performance with unilateral deployment, i.e., only one of the two peers is required to deploy the device or software.
Asymmetric TCP acceleration implies that the WAN-side protocol has to be TCP with the same 5-tuple and state. The implementations typically terminate the TCP flows on the LAN side as the TCP proxies do. On the WAN side, however, they mirror the TCP state machines and establish the TCP flows to the peers. To accelerate, they usually run a compatible version of TCP with performance improvements on the WAN side. While most of the improvements, such as FAST TCP and Zeta-TCP, focus on the TCP congestion avoidance algorithm, some also attempt to improve other aspects of the protocol. For instance, Zeta-TCP provides more accurate loss detection and heuristic download acceleration in addition to its congestion avoidance algorithm.
Compared with symmetric TCP proxies, asymmetric TCP acceleration is more flexible to deploy. A typical setup is to deploy the asymmetric acceleration device on the server side only. All accessing clients then benefit from it without having to install any extra software. Performance-wise, leaving compression aside, asymmetric TCP acceleration is capable of offering the same level of improvement as the symmetric approach.
However, with symmetric deployment, the proxies are able to perform data compression and caching operations, which further boost performance by a factor of the compression ratio. The drawback of compression and caching, though, is added latency and burstiness on the receiver side.
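The trade-off between the compression gain and the added latency can be illustrated with a small back-of-the-envelope calculation. This is a simplified sketch: the function `transfer_time_s`, its parameters, and the example figures are hypothetical, and it assumes that compression adds a fixed delay and that the link is otherwise fully used.

```python
def transfer_time_s(payload_bytes, link_bps, compression_ratio=1.0, added_latency_s=0.0):
    """Time to move a payload over a link, with optional compression.

    compression_ratio is original size divided by compressed size, so a
    ratio of 2.0 halves the number of bits that cross the link.
    """
    bits_on_wire = payload_bytes * 8 / compression_ratio
    return bits_on_wire / link_bps + added_latency_s


# Hypothetical 10 Mbit/s link, 2:1 compression, 50 ms of added latency.
large_plain = transfer_time_s(1_000_000, 10e6)             # about 0.8 s
large_compressed = transfer_time_s(1_000_000, 10e6, 2.0, 0.05)  # about 0.45 s
small_plain = transfer_time_s(10_000, 10e6)                # about 0.008 s
small_compressed = transfer_time_s(10_000, 10e6, 2.0, 0.05)     # about 0.054 s
```

With these assumed numbers, the large transfer finishes sooner with compression, while the small transfer is slower because the fixed added latency outweighs the savings on the wire.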