
How to Optimize TCP Traffic on Windows and Linux

FastSox Team · 2026-03-27 · 14 min read

When you route your traffic through a VPN or proxy, you are adding at least one extra network hop, an encryption layer, and potentially a tunnel encapsulation overhead. If your underlying TCP stack is not tuned for these conditions, you are leaving significant throughput and latency on the table — even with a fast, low-latency gateway on the other end.

This guide covers the kernel-level and OS-level TCP tuning that matters most for VPN and proxy users, with runnable commands for both Linux and Windows. The final section explains how FastSox automates these optimizations inside its tunnel stack so you get the benefits without manually touching system settings.


Why TCP Performance Matters for VPN and Proxy Users

TCP was designed in 1981, long before gigabit consumer broadband, high-latency intercontinental links, or encrypted tunnels existed. The default settings shipped in most operating systems are conservative compromises — reasonable for a 1990s LAN, but suboptimal for a modern encrypted proxy connection.

The problems compound when tunneling is involved:

  • RTT amplification. Your TCP connection now traverses the client → proxy → destination path. Every ACK round-trip is longer than a direct connection. Congestion control algorithms that assume short RTTs under-utilise the available bandwidth.
  • Buffer bloat. Overly large buffers in routers and modems add latency without improving throughput. Proper buffer sizing minimises queuing delay.
  • MTU mismatch. Every tunnel protocol adds header bytes. If your inner packets are sized for a raw 1500-byte Ethernet MTU, the tunnel will silently fragment them — doubling the number of packets and adding unpredictable latency spikes.
  • Head-of-line blocking. When TCP is used as the transport for a proxy (e.g. SOCKS5 over TCP), a single lost packet stalls all multiplexed streams until retransmission completes.

Getting these settings right can double effective throughput on high-latency intercontinental routes, and can reduce latency tail percentiles by 30–50%.


Key TCP Concepts

Before running commands, it helps to understand what each knob actually controls.

Congestion Control

TCP's congestion control algorithm decides how fast to send data based on observed network conditions. The classic algorithm (Reno/CUBIC) is loss-based: it increases the send rate until a packet is dropped, then backs off. This works well on local networks but performs poorly over high-latency or high-bandwidth-delay product (BDP) links, because it takes many round trips to fill the available pipe after a back-off.

BBR (Bottleneck Bandwidth and Round-trip propagation time), developed by Google and released in Linux 4.9, is model-based: it estimates the available bandwidth and the minimum RTT independently, and targets a send rate that saturates the bottleneck without filling queues. On long-distance links, BBR typically achieves 2–5× the throughput of CUBIC.

Window Scaling and Buffer Sizes

TCP's receive window determines how much unacknowledged data can be in flight at once. The theoretical maximum throughput is:

max_throughput = window_size / RTT

With a default 64 KB window and a 100 ms RTT, you are capped at ~5 Mbps — regardless of your actual link speed. Window scaling (RFC 1323) extends the maximum window to 1 GB, but the OS must allocate memory to back it. The rmem and wmem parameters control those limits.
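
The cap is easy to compute for any window/RTT pair; a quick awk one-liner (the numbers below match the 64 KB / 100 ms example):

```shell
# Throughput ceiling imposed by a TCP window: window_bytes * 8 / RTT_seconds
awk 'BEGIN { window = 65536; rtt = 0.1; printf "%.1f Mbps\n", window * 8 / rtt / 1e6 }'
# → 5.2 Mbps
```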

MTU and MSS

MTU (Maximum Transmission Unit) is the largest Layer 2 frame your network can carry. On Ethernet, the standard is 1500 bytes. MSS (Maximum Segment Size) is the TCP payload that fits inside an MTU frame after subtracting IP (20 bytes) and TCP (20 bytes) headers: 1500 - 40 = 1460 bytes.

Tunnel protocols add their own headers on top of the original IP/TCP headers, reducing the available payload. If the MSS is not adjusted to account for tunnel overhead, packets will be fragmented (or dropped if the DF bit is set), causing retransmissions and unpredictable latency.
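
The arithmetic is simple enough to sanity-check in the shell. The overhead figure below assumes a WireGuard-style tunnel over IPv4 (60 bytes); substitute your own protocol's overhead:

```shell
# Effective MSS inside a tunnel: link MTU minus tunnel, IP, and TCP headers
mtu=1500; tunnel_overhead=60; ip_hdr=20; tcp_hdr=20
inner_mtu=$((mtu - tunnel_overhead))
mss=$((inner_mtu - ip_hdr - tcp_hdr))
echo "inner MTU: $inner_mtu bytes, MSS: $mss bytes"
# → inner MTU: 1440 bytes, MSS: 1400 bytes
```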

TCP Fast Open

TCP Fast Open (TFO) allows data to be sent in the SYN packet during the TCP handshake, saving one round-trip for connection establishment. For short-lived proxy connections — which are opened and closed frequently — TFO can meaningfully reduce latency.
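
On Linux, TFO is controlled by a bitmask (bit 0 = client, bit 1 = server). A small helper makes the values easier to read — `decode_tfo` is a hypothetical name for illustration, not part of any tool:

```shell
# Decode the net.ipv4.tcp_fastopen bitmask (bit 0 = client, bit 1 = server)
decode_tfo() {
  v=$1; out=""
  [ $((v & 1)) -ne 0 ] && out="client"
  [ $((v & 2)) -ne 0 ] && out="${out:+$out+}server"
  echo "${out:-disabled}"
}
decode_tfo 3   # → client+server
decode_tfo 0   # → disabled
```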


Linux Optimizations

Checking Current Settings

Before making changes, record your baseline:

# Show all TCP-related kernel parameters
sysctl -a | grep -E "tcp|rmem|wmem|qdisc"

# Show current congestion control algorithm
sysctl net.ipv4.tcp_congestion_control

# Show available congestion control algorithms
sysctl net.ipv4.tcp_available_congestion_control

# Show current queue discipline
sysctl net.core.default_qdisc

Enabling BBR Congestion Control

BBR requires Linux kernel 4.9 or later (check with uname -r). Most modern distributions ship a kernel that supports it, but the module may not be loaded.

# Load the BBR module
sudo modprobe tcp_bbr

# Verify it loaded
lsmod | grep bbr

# Apply BBR as the active congestion control algorithm
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr

# Set the Fair Queuing (fq) scheduler — BBR works best with fq
sudo sysctl -w net.core.default_qdisc=fq

# Confirm both settings took effect
sysctl net.ipv4.tcp_congestion_control net.core.default_qdisc

TCP Buffer Sizes

The default buffer sizes on most Linux distributions cap throughput at well below 1 Gbps on any link with meaningful RTT. The values below are appropriate for a host with 4 GB+ RAM on a modern broadband or data-centre link.

# Maximum receive socket buffer (bytes)
sudo sysctl -w net.core.rmem_max=134217728

# Maximum send socket buffer (bytes)
sudo sysctl -w net.core.wmem_max=134217728

# TCP read buffer: min, default, max (bytes)
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 134217728"

# TCP write buffer: min, default, max (bytes)
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"

# Allow auto-tuning of socket buffer sizes
sudo sysctl -w net.ipv4.tcp_moderate_rcvbuf=1

The 134217728 value is 128 MB. For a 1 Gbps link with 100 ms RTT, the bandwidth-delay product is 1e9 * 0.1 / 8 = ~12.5 MB, so 128 MB gives more than enough headroom.
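
To size buffers for your own link, compute the BDP directly. The `bdp` function below is just an illustrative helper, taking bandwidth in Mbps and RTT in ms:

```shell
# Bandwidth-delay product: bytes that must be in flight to fill the pipe
bdp() { awk -v mbps="$1" -v ms="$2" 'BEGIN { printf "%.1f MB\n", mbps * 1e6 / 8 * ms / 1000 / 1e6 }'; }
bdp 1000 100   # 1 Gbps at 100 ms → 12.5 MB
bdp 100 300    # 100 Mbps at 300 ms
```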

TCP Fast Open

TFO saves a full RTT on connection establishment. Enable it for both outgoing and incoming connections (value 3 = both directions):

sudo sysctl -w net.ipv4.tcp_fastopen=3

Note that some middleboxes (firewalls, NAT devices) drop SYN packets with data. If you observe connection failures after enabling TFO, set the value back to 1 (client-only) or 0 (disabled).

Additional Tuning

# Reduce the number of SYN retransmissions — fail fast on broken paths
sudo sysctl -w net.ipv4.tcp_syn_retries=3

# Enable selective ACK — faster recovery from packet loss
sudo sysctl -w net.ipv4.tcp_sack=1

# Enable TCP timestamps — better RTT estimation and PAWS wrap protection
sudo sysctl -w net.ipv4.tcp_timestamps=1

# Shorten the FIN-WAIT-2 timeout for orphaned sockets
sudo sysctl -w net.ipv4.tcp_fin_timeout=15

Making Settings Persistent with sysctl.conf

sysctl -w changes are lost on reboot. To persist them, write the settings to /etc/sysctl.d/99-tcp-tuning.conf:

sudo tee /etc/sysctl.d/99-tcp-tuning.conf > /dev/null <<'EOF'
# Congestion control
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr

# Socket buffer sizes
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.ipv4.tcp_moderate_rcvbuf = 1

# TCP Fast Open (both directions)
net.ipv4.tcp_fastopen = 3

# Reliability
net.ipv4.tcp_sack = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_syn_retries = 3
net.ipv4.tcp_fin_timeout = 15
EOF

# Apply immediately without rebooting
sudo sysctl --system

To load only your new file:

sudo sysctl -p /etc/sysctl.d/99-tcp-tuning.conf

Windows Optimizations

Windows uses a different TCP stack (TCPIP.sys) with its own tuning interface. All commands below use netsh, which ships with every version of Windows since Vista. Run them in an elevated Command Prompt or PowerShell session (Run as Administrator).

Checking Current TCP Settings

# Show all global TCP settings
netsh interface tcp show global

# Show supplemental TCP settings (Windows 8.1 / Server 2012 R2+)
netsh interface tcp show supplemental

# Show the window scaling heuristics state
netsh interface tcp show heuristics

Key fields to note:

  • Congestion Provider — currently active algorithm (default is CUBIC on Windows 10/11)
  • Receive Window Auto-Tuning Level — controls buffer growth behaviour
  • ECN Capability — Explicit Congestion Notification
  • Timestamps — RFC 1323 timestamps

Congestion Control: Enabling CTCP or CUBIC

Windows 10 and later ship with CUBIC as the default congestion provider. Earlier versions default to NewReno. Compound TCP (CTCP) is a good option on older Windows versions; on Windows 10/11 stick with CUBIC or leave the provider at its default.

# Windows 10 / 11 — CUBIC is already the default; confirm it
netsh interface tcp show global | Select-String "Congestion"

# If you see "None" or "NewReno", upgrade to CUBIC
netsh int tcp set global congestionprovider=cubic

# On older systems (Windows 7 / Server 2008 R2), use CTCP
netsh int tcp set global congestionprovider=ctcp

Receive Window Auto-Tuning

Windows auto-tuning adjusts the TCP receive window dynamically. The default level is normal, which is appropriate for most users. If you are behind a transparent proxy or a corporate firewall that interferes with window scaling, try highlyrestricted or disabled to diagnose, but normal is the correct production value.

# Set auto-tuning to the recommended level
netsh int tcp set global autotuninglevel=normal

# Available levels: disabled | highlyrestricted | restricted | normal | experimental
# Use 'experimental' on very high-BDP links (e.g., satellite, trans-Pacific fibre)
netsh int tcp set global autotuninglevel=experimental

Explicit Congestion Notification (ECN)

ECN allows routers to signal congestion by marking packets rather than dropping them. When a marked packet arrives, TCP backs off — avoiding the retransmission cost of actual loss. Many modern cloud and CDN networks support ECN, though end-to-end support is not universal; if you see connection problems after enabling it, revert to the default.

# Enable ECN
netsh int tcp set global ecncapability=enabled

# Verify
netsh interface tcp show global | Select-String "ECN"

Timestamps

TCP timestamps (RFC 1323) improve RTT estimation and enable PAWS (Protection Against Wrapped Sequence numbers), which is required for correctness on high-speed links. Unlike Linux, Windows has historically shipped with timestamps disabled by default, so enable them explicitly:

netsh int tcp set global timestamps=enabled

Checking and Setting MTU per Adapter

Windows sets the MTU per network adapter. To check current values:

# List all adapters with their MTU
netsh interface ipv4 show subinterfaces

# Or via PowerShell
Get-NetIPInterface | Select-Object InterfaceAlias, NlMtu

To set the MTU on a specific adapter (replace "Ethernet" with your adapter name):

# Set MTU to 1420 for WireGuard tunnels
netsh interface ipv4 set subinterface "Ethernet" mtu=1420 store=persistent

# Verify
netsh interface ipv4 show subinterfaces

MTU Optimization for VPN Tunnels

MTU misconfiguration is one of the most common causes of degraded performance inside VPN tunnels. The symptoms are subtle: large file transfers work but feel slow, video streaming stutters, or SSH sessions lag intermittently.

How Tunnel Encapsulation Reduces Effective MTU

Each tunnel protocol wraps the original IP packet in additional headers:

| Protocol | Header Overhead | Recommended MTU |
|----------|-----------------|-----------------|
| WireGuard (UDP/IPv4) | 60 bytes | 1420 |
| WireGuard (UDP/IPv6) | 80 bytes | 1400 |
| OpenVPN (UDP) | 70–100 bytes | 1400 |
| SOCKS5 over TCP | TCP/IP stack handles it | — |

The standard Ethernet MTU is 1500 bytes. Over IPv4, WireGuard adds 20 (outer IPv4) + 8 (UDP) + 32 (WireGuard framing) = 60 bytes of overhead, leaving 1440 bytes for the inner packet. WireGuard's recommended MTU of 1420 deliberately leaves extra headroom so the same setting also works when the outer connection runs over IPv6 (40-byte outer header, 80 bytes of total overhead). With a 1420-byte tunnel MTU, subtracting 20 bytes for the inner IP header and 20 for TCP leaves an MSS of 1380 bytes.
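
If you cannot guarantee a correct MTU on every device, clamping the MSS at the gateway is a robust fallback. A sketch with iptables — `wg0` is an example interface name, and the rule requires root and an existing WireGuard setup:

```shell
# Rewrite the MSS option on forwarded SYN packets to match the discovered path MTU
sudo iptables -t mangle -A FORWARD -o wg0 -p tcp --tcp-flags SYN,RST SYN \
    -j TCPMSS --clamp-mss-to-pmtu
```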

Setting MTU on Linux WireGuard Interface

# Set MTU on the WireGuard interface at creation time (in wg0.conf)
[Interface]
MTU = 1420

# Or set it live
sudo ip link set dev wg0 mtu 1420

# Verify
ip link show wg0

Path MTU Discovery

If you cannot manually set the MTU, ensure Path MTU Discovery (PMTUD) is working. PMTUD relies on ICMP "Fragmentation Needed" messages being able to traverse the network. Firewalls that block all ICMP will break PMTUD, causing the "black hole" problem where large packets silently fail.

# Linux: check PMTUD behaviour
sysctl net.ipv4.ip_no_pmtu_disc   # should be 0 (PMTUD enabled)

# Test manually: send a large packet with DF bit set
ping -M do -s 1400 8.8.8.8

On Windows:

# Send a large ICMP packet with DF bit set (no fragmentation)
ping -f -l 1400 8.8.8.8

If the ping fails at a 1400-byte payload but succeeds at 1372 bytes, the path MTU is 1372 + 28 (20-byte IP header + 8-byte ICMP header) = 1400 bytes — roughly 100 bytes of tunnel overhead relative to a 1500-byte link.
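
Turning the largest successful payload into a path-MTU estimate is just addition, since ping's payload size excludes the IP and ICMP headers:

```shell
# Largest payload that succeeded with DF set (example value from the text)
payload=1372
echo "path MTU: $((payload + 20 + 8)) bytes"   # + 20-byte IP + 8-byte ICMP header
# → path MTU: 1400 bytes
```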


Validating Improvements with iperf3

After applying optimizations, measure the actual effect with iperf3. You need an iperf3 server reachable through (or outside) your VPN tunnel.

Installing iperf3

# Debian / Ubuntu
sudo apt install iperf3

# RHEL / CentOS / Fedora
sudo dnf install iperf3

# macOS
brew install iperf3

On Windows, download the official binary from https://iperf.fr/iperf-download.php and run it from PowerShell.

Running a Baseline Test

On the server side:

iperf3 -s

On the client side (replace <server-ip> with the iperf3 server address):

# TCP throughput test, 10 seconds
iperf3 -c <server-ip> -t 10

# With parallel streams (better utilises multi-core and multiple paths)
iperf3 -c <server-ip> -t 10 -P 4

# Reverse test (server sends to client — tests your download path)
iperf3 -c <server-ip> -t 10 -R

# JSON output for scripting
iperf3 -c <server-ip> -t 10 -J | python3 -m json.tool

What to Look For

  • Sender bitrate and Receiver bitrate should be close. A large gap suggests packet loss or buffer exhaustion.
  • Retransmits should be near zero. High retransmit counts indicate congestion, MTU fragmentation, or buffer overflow.
  • Run the test before and after applying sysctl changes. On a trans-Pacific link with default settings, it is common to see throughput increase from 50–100 Mbps to 400–600 Mbps after enabling BBR and increasing buffer sizes.

# Example: extract throughput and retransmits from a JSON run
echo "=== Before ===" && iperf3 -c <server-ip> -t 10 -J | python3 -c "
import sys, json
d = json.load(sys.stdin)
s = d['end']['sum_sent']
print(f'Throughput: {s[\"bits_per_second\"]/1e6:.1f} Mbps, Retransmits: {s[\"retransmits\"]}')
"
# Re-run the same command after applying the sysctl changes for the "After" figure

How FastSox Handles These Optimizations Automatically

Manually tuning sysctl settings on every device you own is tedious and easy to get wrong. FastSox, developed by OneDotNet Ltd, addresses this at the infrastructure level so end users do not need to touch system settings.

Inside the tunnel stack:

  • Gateway MTU is set precisely. Every FastSox gateway configures its WireGuard interface MTU to 1420 (IPv4) or 1400 (IPv6), and clamps the MSS on outbound TCP connections via iptables/nftables rules. Inner packets never get fragmented.
  • BBR is the default congestion control on all FastSox gateway hosts. Combined with the fq queue discipline, this means the gateway-side TCP stack is already optimised for high-BDP paths.
  • Per-user buffer sizing. The FastSox gateway platform is tuned with the same rmem/wmem settings described in this guide. Your traffic does not compete with an under-tuned kernel buffer.
  • ECN is enabled end-to-end. FastSox gateways signal and honour ECN marks, so backpressure from congested links is communicated without packet drops.

On the client side, the FastSox application on Linux applies the relevant sysctl settings to the tunnel interface at connection time. On Windows, the application sets the interface MTU for the virtual adapter used by the tunnel, avoiding fragmentation without requiring administrative intervention.

The result is that the performance gap between a naively configured connection and a fully tuned one is largely eliminated — out of the box.

For a technical overview of the FastSox tunnel architecture, see What is HyperSox Protocol.


Summary

TCP performance tuning for VPN and proxy users comes down to four levers:

| Setting | Linux | Windows |
|---------|-------|---------|
| Congestion control | BBR via net.ipv4.tcp_congestion_control=bbr | CUBIC via netsh int tcp set global congestionprovider=cubic |
| Buffer sizes | net.core.rmem_max / wmem_max = 128 MB | Auto-tuning level = normal or experimental |
| TCP Fast Open | net.ipv4.tcp_fastopen=3 | Enabled by default on Windows 10+ |
| MTU / MSS | WireGuard interface MTU = 1420, verify with ping -M do | Per-adapter MTU via netsh interface ipv4 set subinterface |

Apply these changes, measure with iperf3, and you will see the difference — especially on intercontinental or high-latency paths where the gap between default and tuned performance is widest.

If you would rather skip the manual tuning entirely, FastSox handles all of this at the infrastructure level, giving you optimised TCP performance on every connection without touching a single system file.

#tcp #linux #windows #networking #performance #bbr #optimization
