mptcp: sysctl: add syn_retrans_before_tcp_fallback
The number of SYN + MPC retransmissions before falling back to TCP was
fixed to 2. This is certainly a good default value, but having a fixed
number can be a problem in some environments.

The current behaviour means that if all packets are dropped, there will
be:

- The initial SYN + MPC

- 2 retransmissions with MPC

- The next ones will be without MPTCP.

So typically ~3 seconds before falling back to TCP. In some networks
where temporary blackholes are unfortunately frequent, or when a
client tries to initiate connections while the network is not ready yet,
this can cause new connections to be established without MPTCP.

In such environments, it is now possible to increase the number of SYN
retransmissions with MPTCP options to make sure MPTCP is used.

Interesting values are:

- 0: the first retransmission will be done without MPTCP options: quite
     aggressive, but also a higher risk of detecting false-positive
     MPTCP blackholes.

- >= 128: all SYN retransmissions will keep the MPTCP options: back to
          the < 6.12 behaviour.

The default behaviour is not changed here.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
matttbe authored and intel-lab-lkp committed Jan 14, 2025
1 parent d646e05 commit de60348
Showing 2 changed files with 33 additions and 4 deletions.
16 changes: 16 additions & 0 deletions Documentation/networking/mptcp-sysctl.rst
@@ -108,3 +108,19 @@ stale_loss_cnt - INTEGER
 	This is a per-namespace sysctl.
 
 	Default: 4
+
+syn_retrans_before_tcp_fallback - INTEGER
+	The number of SYN + MP_CAPABLE retransmissions before falling back to
+	TCP, i.e. dropping the MPTCP options. In other words, if all the packets
+	are dropped on the way, there will be:
+
+	* The initial SYN with MPTCP support
+	* This number of SYN retransmitted with MPTCP support
+	* The next SYN retransmissions will be without MPTCP support
+
+	0 means the first retransmission will be done without MPTCP options.
+	>= 128 means that all SYN retransmissions will keep the MPTCP options. A
+	lower number might increase false-positive MPTCP blackholes detections.
+	This is a per-namespace sysctl.
+
+	Default: 2
21 changes: 17 additions & 4 deletions net/mptcp/ctrl.c
@@ -32,6 +32,7 @@ struct mptcp_pernet {
 	unsigned int close_timeout;
 	unsigned int stale_loss_cnt;
 	atomic_t active_disable_times;
+	u8 syn_retrans_before_tcp_fallback;
 	unsigned long active_disable_stamp;
 	u8 mptcp_enabled;
 	u8 checksum_enabled;
@@ -92,6 +93,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 	pernet->mptcp_enabled = 1;
 	pernet->add_addr_timeout = TCP_RTO_MAX;
 	pernet->blackhole_timeout = 3600;
+	pernet->syn_retrans_before_tcp_fallback = 2;
 	atomic_set(&pernet->active_disable_times, 0);
 	pernet->close_timeout = TCP_TIMEWAIT_LEN;
 	pernet->checksum_enabled = 0;
@@ -245,6 +247,12 @@ static struct ctl_table mptcp_sysctl_table[] = {
 		.proc_handler = proc_blackhole_detect_timeout,
 		.extra1 = SYSCTL_ZERO,
 	},
+	{
+		.procname = "syn_retrans_before_tcp_fallback",
+		.maxlen = sizeof(u8),
+		.mode = 0644,
+		.proc_handler = proc_dou8vec_minmax,
+	},
 };
 
 static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
@@ -269,6 +277,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
 	/* table[7] is for available_schedulers which is read-only info */
 	table[8].data = &pernet->close_timeout;
 	table[9].data = &pernet->blackhole_timeout;
+	table[10].data = &pernet->syn_retrans_before_tcp_fallback;
 
 	hdr = register_net_sysctl_sz(net, MPTCP_SYSCTL_PATH, table,
 				     ARRAY_SIZE(mptcp_sysctl_table));
@@ -392,17 +401,21 @@ void mptcp_active_enable(struct sock *sk)
 void mptcp_active_detect_blackhole(struct sock *ssk, bool expired)
 {
 	struct mptcp_subflow_context *subflow;
-	u32 timeouts;
 
 	if (!sk_is_mptcp(ssk))
 		return;
 
-	timeouts = inet_csk(ssk)->icsk_retransmits;
 	subflow = mptcp_subflow_ctx(ssk);
 
 	if (subflow->request_mptcp && ssk->sk_state == TCP_SYN_SENT) {
-		if (timeouts == 2 || (timeouts < 2 && expired)) {
-			MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_MPCAPABLEACTIVEDROP);
+		struct net *net = sock_net(ssk);
+		u8 timeouts, to_max;
+
+		timeouts = inet_csk(ssk)->icsk_retransmits;
+		to_max = mptcp_get_pernet(net)->syn_retrans_before_tcp_fallback;
+
+		if (timeouts == to_max || (timeouts < to_max && expired)) {
+			MPTCP_INC_STATS(net, MPTCP_MIB_MPCAPABLEACTIVEDROP);
 			subflow->mpc_drop = 1;
 			mptcp_subflow_early_fallback(mptcp_sk(subflow->conn), subflow);
 		} else {