Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gatekeeper: rewrite initialization of network interfaces #688

Merged
merged 9 commits into from
May 24, 2024

Conversation

AltraMayor
Copy link
Owner

@AltraMayor AltraMayor commented May 8, 2024

The original version of the bonding driver required applications to set many parameters of the members of a bonded interface directly. This is no longer the case, and this pull request reviews the initialization code of network interfaces to accomplish the following:

  1. Simplifying the initialization code by leveraging the current bonding driver;
  2. Fixing a LACP bug when a bonded interface has two or more physical interfaces and all support multicast;
  3. Improving transmission capacity of bonded interfaces by adding IPv4 and IPv6 addresses into the balancing hash;
  4. Reporting MAC addresses of the interfaces to enable diagnoses;
  5. Monitoring the state of the links through interruptions;
  6. Updating the calculation of the bandwidth of the request channel;
  7. Fixing miscalculations in the size of mbuf pools.

@AltraMayor AltraMayor added the Production requirement Either the issue is solved, or Gatekeeper doesn't work in production label May 8, 2024
@AltraMayor AltraMayor added this to the Version 1.2 milestone May 8, 2024
@AltraMayor AltraMayor force-pushed the multi_port branch 2 times, most recently from 4cf4833 to 53fabfb Compare May 13, 2024 23:00
Copy link
Owner Author

@AltraMayor AltraMayor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lib/net: handle bonded interfaces as any other

lib/net.c Outdated Show resolved Hide resolved
Copy link
Owner Author

@AltraMayor AltraMayor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lib/net: report the MAC addresses of all interfaces

lib/net.c Show resolved Hide resolved
lib/net.c Outdated Show resolved Hide resolved
lib/net.c Outdated Show resolved Hide resolved
Copy link
Owner Author

@AltraMayor AltraMayor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lib/net: monitor state of the links

include/gatekeeper_log_ratelimit.h Outdated Show resolved Hide resolved
lib/net.c Outdated Show resolved Hide resolved
Copy link
Owner Author

@AltraMayor AltraMayor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sol: require parameter req_channel_bw_mbps

sol/main.c Show resolved Hide resolved
Copy link
Owner Author

@AltraMayor AltraMayor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sol: require parameter req_channel_bw_mbps

Replace the example in the description of the patch with the following because it better represents the problem:

For example, Gatekeeper servers typically operate in networks with higher speeds (e.g., 40Gbps) than the protected destination (e.g., 10Gbps).

The original version of the bonding driver required applications
to directly set many parameters of the members of a bonded interface.
Since this is no longer the case, this commit simplifies
the initialization code by leveraging the new interface.
This commit implements the workaround discussed in issue #686.
By default, bonded interfaces load balance the outgoing packets by
hashing only the MAC addresses. This commit changes this hash to
include the source and destination addresses of IPv4 and IPv6
headers as well.
Knowing the MAC addresses of all interfaces is useful to diagnose
issues in production; especially issues related to bonded interfaces.

Not only are the primary MAC addresses of the interfaces reported,
but also the secondary MAC addresses of the interfaces.
Waiting for links to come up is no longer helpful in the current
version of DPDK.
Having log entries reporting changes in the state of the links
eases diagnoses in production. This commit adds these log entries
through the Link State Change (LSC) interuption that most NICs
support. The information that a given NIC does not support
the LSC interuption is logged, but its link state is not monitored.

Since the callbacks of interuptions run in threads that are not
logical cores, this commit introduces the macro MAIN_LOG() to
Gatekeeper's log library to be used in these contexts.
If req_queue_init() fails before _all_ request queues are
initiated, sol_stage2() calls cleanup_sol(), which, in turn,
triggers a segmentation fault.

This commit solves this problem by adding list_initiated() to
include/list.h, and using it to test if the request queues are
initiated in cleanup_sol() before freeing the associated resources.
The original configuration of SOL blocks assumes that
the bandwidth of the back interface of a Gatekeeper server
is representative of the bandwidth of the destination network.
This assumption does not hold in production deployments of
Gatekeeper.  For example, Gatekeeper servers typically operate
in networks with higher speeds (e.g., 40Gbps) than the protected
destination (e.g., 10Gbps).

This commit introduces the parameter destination_bw_gbps to address
this reality.
This commit fixes two miscalculations in the size of mbuf pools:

1. calculate_mempool_config_para() was not accounting for
multiple members in bonded interfaces; and

2. Although unlikely, a GK or GT instance might receive all
the packets, so the instances must account for this worst case.
@AltraMayor AltraMayor marked this pull request as ready for review May 21, 2024 22:45
@AltraMayor
Copy link
Owner Author

This pull request was successfully tested in production.

@AltraMayor AltraMayor merged commit cc1d98d into v1.2.0-dev May 24, 2024
1 check passed
@AltraMayor AltraMayor deleted the multi_port branch May 24, 2024 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Production requirement Either the issue is solved, or Gatekeeper doesn't work in production
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant