A new feature in OpenBSD enables TCP_NODELAY mode system-wide

Job Snijders, a veteran OpenBSD developer, head of one of the IETF (Internet Engineering Task Force) committees, author of 11 RFCs related to routing and RPKI, and creator of IRRd (Internet Routing Registry Daemon), has published a set of patches for OpenBSD that add a new sysctl parameter, “net.inet.tcp.nodelay”, to disable Nagle's algorithm system-wide. The parameter relieves application developers of having to set the TCP_NODELAY flag on individual sockets.

Nagle's algorithm aggregates small messages to reduce traffic. It holds back new small TCP segments until the previously sent data has been acknowledged or until enough data has accumulated to fill a full-sized segment. For example, without aggregation, sending 1 byte of data costs an additional 40 bytes of TCP and IP headers, whereas with Nagle's algorithm the messages written before the acknowledgment from the remote side arrives are accumulated and sent in one packet. In practice, the “delayed ACK” optimization, which postpones the sending of ACK packets, undermines this signaling through acknowledgments, so the accumulated messages often go out only after the receiver's delayed-ACK timeout expires.
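
The header overhead described above can be illustrated with a small calculation. This is only a sketch: the 40-byte figure is taken from the text (IPv4 and TCP headers without options), the function name bytes_on_wire is made up for this example, and it assumes all the coalesced writes fit into a single segment.

```python
# Sketch: bytes sent on the wire for a series of small writes,
# with and without coalescing them into one packet.
# Assumption: 40 bytes of headers per packet (20 IPv4 + 20 TCP, no options).
HEADER_BYTES = 40


def bytes_on_wire(payload_sizes, coalesce):
    """Return total bytes sent for the given application writes.

    coalesce=False: every write goes out as its own packet.
    coalesce=True:  all writes are merged into one packet
                    (assumes the total fits in a single segment).
    """
    if coalesce:
        return HEADER_BYTES + sum(payload_sizes)
    return sum(HEADER_BYTES + size for size in payload_sizes)


writes = [1] * 10  # ten 1-byte messages
print(bytes_on_wire(writes, coalesce=False))  # 410
print(bytes_on_wire(writes, coalesce=True))   # 50
```

Ten 1-byte messages cost 410 bytes when sent one by one and 50 bytes when merged. That is the saving Nagle's algorithm aims for, paid for with added latency.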


Snijders argues that Nagle's algorithm, developed at a time when several users competed for 1200 baud of bandwidth, is outdated today and does more harm than good on high-speed networks. Marc Brooker of Amazon Web Services (AWS) recently expressed a similar position. The arguments for disabling Nagle's algorithm by default are laid out in a note published a few days ago.

Nagle's algorithm can be disabled with the TCP_NODELAY option, which is set on individual network sockets. TCP_NODELAY has long been enabled in many OpenBSD applications, including openssh, httpd, iscsid, relayd, bgpd and unwind, and Snijders believes the time has come to provide a system-wide option that enables it for all TCP sockets. Snijders also proposes discussing whether to enable TCP_NODELAY by default and make Nagle's algorithm a separate opt-in option.
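
For reference, this is roughly what the per-socket setting looks like today, and what a system-wide parameter would make unnecessary. It is a minimal sketch using Python's standard socket module. The helper name make_nodelay_socket is invented for illustration, and the listed OpenBSD applications do the equivalent in C with setsockopt().

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY is set at the IPPROTO_TCP level; a non-zero value
    # disables Nagle's algorithm for this socket only.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back to confirm it took effect (non-zero means enabled).
enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(bool(enabled))
sock.close()
```

Every application that wants low-latency small writes has to repeat this call for each TCP socket it creates. This is the duplication the proposed sysctl is meant to remove.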
