Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade go-ipfix to v0.13.0 #6998

Merged
merged 3 commits into from
Feb 21, 2025

Conversation

antoninbas
Copy link
Contributor

@antoninbas antoninbas commented Feb 14, 2025

As part of this upgrade, we make the following changes:

  • Switch to buffered IPFIX exporter in the Flow Aggregator. This
    exporter has better performance for UDP IPFIX messages, by ensuring
    that multiple data records can be batched together in a single
    message.
  • Provide Path MTU (PMTU) when creating the IPFIX exporter in the Flow
    Aggregator. The value is used by the new buffered exporter to
    determine how many IPFIX records can fit in a single message while
    avoiding IP fragmentation. In our case, we "approximate" the Path MTU
    by looking up the MTU of the Flow Aggregator Pod's eth0 interface.
  • Add a MaxMsgSize configuration parameter to the Flow Aggregator as a
    way to override the default behavior, which is to use the MTU (minus
    header overhead) when the UDP protocol is used.
  • Add periodic flushing when exporting IPFIX records, which is necessary
    after switching to the buffered exporter. In Aggregation mode,
    flushing happens after processing a given batch of expired records. In
    Proxy mode, flushing happens every second.
  • Use updated reference IPFIX collector in e2e tests. The updated
    collector handles the case where multiple data records are included in
    the same IPFIX message more gracefully, which leads to some
    simplification in the test code.

@antoninbas antoninbas added this to the Antrea v2.3 release milestone Feb 14, 2025
@antoninbas antoninbas added the area/flow-visibility Issues or PRs related to flow visibility support in Antrea label Feb 14, 2025
@antoninbas antoninbas force-pushed the upgrade-go-ipfix-to-0.13.0 branch 4 times, most recently from 09b8c16 to 31bbde9 Compare February 19, 2025 20:54
@antoninbas antoninbas marked this pull request as ready for review February 19, 2025 20:54
@@ -144,7 +150,8 @@ func (e *IPFIXExporter) UpdateOptions(opt *options.Options) {
e.observationDomainID = genObservationDomainID(e.clusterUUID)
}
e.templateRefreshTimeout = opt.TemplateRefreshTimeout
klog.InfoS("New IPFIXExporter configuration", "collectorAddress", e.externalFlowCollectorAddr, "collectorProtocol", e.externalFlowCollectorProto, "sendJSON", e.sendJSONRecord, "domainID", e.observationDomainID, "templateRefreshTimeout", e.templateRefreshTimeout)
e.maxIPFIXMsgSize = int(opt.Config.FlowCollector.MaxIPFIXMsgSize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any range for this? should we check if the value provided by the user is a reasonable value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added validation code, using the same limits as in the go-ipfix library

Comment on lines 172 to 174
if klog.V(7).Enabled() {
klog.InfoS("Data set sent successfully", "bytes sent", sentBytes)
klog.InfoS("Data record added successfully")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it just call klog.V(7).InfoS("Data record added successfully"), getting rid of Enabled check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you are suggesting this because the "cost" is the same in the absence of fields to log?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it's not common to have a dedicated Enabled check when there is no big overhead to serialize the values as it could have been written in one line.

Comment on lines 156 to 157
if e.exportingProcess != nil {
e.exportingProcess.CloseConnToCollector()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it flush bufferedExporter first like Stop()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say it's reasonable not to flush in case of error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is when user changes the configuration, right? If not flushed, could some records received in the last second be lost. My understanding this is similar to Stop + Start, so I wonder why we flush upon Stop but not here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right actually, I thought you left that comment in a different place. I'll make the change

As part of this upgrade, we make the following changes:
* Switch to buffered IPFIX exporter in the Flow Aggregator. This
  exporter has better performance for UDP IPFIX messages, by ensuring
  that multiple data records can be batched together in a single
  message.
* Provide Path MTU (PMTU) when creating the IPFIX exporter in the Flow
  Aggregator. The value is used by the new buffered exporter to
  determine how many IPFIX records can fit in a single message while
  avoiding IP fragmentation. In our case, we "approximate" the Path MTU
  by looking up the MTU of the Flow Aggregator Pod's eth0 interface.
* Add a MaxMsgSize configuration parameter to the Flow Aggregator as a
  way to override the default behavior, which is to use the MTU (minus
  header overhead) when the UDP protocol is used.
* Add periodic flushing when exporting IPFIX records, which is necessary
  after switching to the buffered exporter. In Aggregation mode,
  flushing happens after processing a given batch of expired records. In
  Proxy mode, flushing happens every second.
* Use updated reference IPFIX collector in e2e tests. The updated
  collector handles the case where multiple data records are included in
  the same IPFIX message more gracefully, which leads to some
  simplification in the test code.

Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
@antoninbas antoninbas force-pushed the upgrade-go-ipfix-to-0.13.0 branch from 31bbde9 to 8942f97 Compare February 21, 2025 00:00
Copy link
Contributor

@luolanzone luolanzone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, one nit.

@@ -34,6 +34,7 @@ Kubernetes: `>= 1.19.0-0`
| flowAggregatorAddress | string | `""` | Provide an extra DNS name or IP address of flow aggregator for generating TLS certificate. |
| flowCollector.address | string | `""` | Provide the flow collector address as string with format <IP>:<port>[:<proto>], where proto is tcp or udp. If no L4 transport proto is given, we consider tcp as default. |
| flowCollector.enable | bool | `false` | Determine whether to enable exporting flow records to external flow collector. |
| flowCollector.maxIPFIXMsgSize | int | `0` | Maximum message size to use for IPFIX records. If set to 0 (recommended), a reasonable default value will be used based on the protocol (tcp or udp) used to connect to the collector. |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| flowCollector.maxIPFIXMsgSize | int | `0` | Maximum message size to use for IPFIX records. If set to 0 (recommended), a reasonable default value will be used based on the protocol (tcp or udp) used to connect to the collector. |
| flowCollector.maxIPFIXMsgSize | int | `0` | Maximum message size to use for IPFIX records. If set to 0 (recommended), a reasonable default value will be used based on the protocol (tcp or udp) used to connect to the collector. Min valid value is 512 and max valid value is 65535. |

@antoninbas
Copy link
Contributor Author

/test-all

@antoninbas antoninbas closed this Feb 21, 2025
@antoninbas antoninbas reopened this Feb 21, 2025
@antoninbas
Copy link
Contributor Author

closed by mistake...

@antoninbas
Copy link
Contributor Author

/test-all

@antoninbas antoninbas force-pushed the upgrade-go-ipfix-to-0.13.0 branch from 17e2d9d to cb16327 Compare February 21, 2025 03:19
tnqn
tnqn previously approved these changes Feb 21, 2025
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one minor comment about inconsistent error messages

Comment on lines 119 to 122
return nil, fmt.Errorf("maxIPFIXMsgSize cannot be smaller than minimum valid IPFIX mesage size %d", flowaggregatorconfig.MinValidIPFIXMsgSize)
}
if opt.Config.FlowCollector.MaxIPFIXMsgSize > flowaggregatorconfig.MaxValidIPFIXMsgSize {
return nil, fmt.Errorf("maxIPFIXMsgSize cannot be greater than the maximum valid IPFIX mesage size %d", flowaggregatorconfig.MaxValidIPFIXMsgSize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the error message have "the" in front of minimum/maximum consistently?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks

Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
@antoninbas
Copy link
Contributor Author

/test-all

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas antoninbas merged commit cb1f0de into antrea-io:main Feb 21, 2025
56 of 63 checks passed
@antoninbas antoninbas deleted the upgrade-go-ipfix-to-0.13.0 branch February 21, 2025 06:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/flow-visibility Issues or PRs related to flow visibility support in Antrea
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants