Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output/TLS: Allow logging of client/server handshake parameters - V4 #12650

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

rmcconnell-r7
Copy link
Contributor

@rmcconnell-r7 rmcconnell-r7 commented Feb 21, 2025

Make sure these boxes are checked accordingly before submitting your Pull Request -- thank you.

Contribution style:

Our Contribution agreements:

Changes (if applicable):

Link to ticket: https://redmine.openinfosecfoundation.org/issues/6695

Describe changes:

  • Introduce JSON logging of client handshake parameters
  • Introduce JSON logging of server handshake parameters
  • Enable tracking JA4 fields without ja4-fingerprints being explicitly enabled

Provide values to any of the below to override the defaults.

  • To use an LibHTP, Suricata-Verify or Suricata-Update pull request,
    link to the pull request in the respective _BRANCH variable.
  • Leave unused overrides blank or remove.

SV_REPO=
SV_BRANCH=OISF/suricata-verify#2313
SU_REPO=
SU_BRANCH=
LIBHTP_REPO=
LIBHTP_BRANCH=

Add new custom log fields:

"client_handshake" which logs the following:
1. TLS version used during handshake
2. TLS extensions, excluding GREASE, SNI and ALPN
3. All cipher suites, excluding GREASE
4. All signature algorithms, excluding GREASE

The use-case is for logging TLS handshake parameters in order to survey
them, and so that JA4 hashe can be computed offline (in the case that
they're not already computed for the purposes of rule matching).
"server_handshake" which logs the following:
1. TLS version used during handshake
2. The chosen cipher suite, excluding GREASE
3. TLS extensions, excluding GREASE
The JA4 object can now 'track' the data from each TLS conversation
without the user having to explicitly enabling ja4-fingerprint. This is
to allow for other fields to be output without the JA4, for example
client_handshake.
@rmcconnell-r7
Copy link
Contributor Author

Replaces: #12526

@victorjulien
Copy link
Member

Ci failures for rust clippy are expected currently.

Copy link

codecov bot commented Feb 21, 2025

Codecov Report

Attention: Patch coverage is 98.24561% with 2 lines in your changes missing coverage. Please review.

Project coverage is 80.78%. Comparing base (d61f36c) to head (f0f22e6).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12650      +/-   ##
==========================================
+ Coverage   80.76%   80.78%   +0.01%     
==========================================
  Files         932      931       -1     
  Lines      259381   259448      +67     
==========================================
+ Hits       209484   209589     +105     
+ Misses      49897    49859      -38     
Flag Coverage Δ
fuzzcorpus 56.99% <51.75%> (-0.02%) ⬇️
livemode 19.35% <15.78%> (-0.01%) ⬇️
pcap 44.17% <50.00%> (+0.01%) ⬆️
suricata-verify 63.52% <98.24%> (+0.03%) ⬆️
unittests 58.32% <13.15%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Copy link
Contributor

@catenacyber catenacyber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this work

CI : not fully green, but a rebase should do it
Code : looking
Commits segmentation : ok
Commit messages : ⚠️ missing a ticket number in these
Also sv- and cl- could be srv and cli for more clarity : sv is suricata-verify for us ;-)

typo JA4 hashe missing s

Git ID set : looks fine for me
CLA : you already contributed
Doc update : ok but ⚠️ there should be an update to suricata.yaml.in also right ?
Redmine ticket : ok
Rustfmt : nit: could you add a commit that does rustfmt rust/src/ja4.rs
Tests : ok thanks :-)
Dependencies added: none

jb_open_object(js, "client_handshake");

const uint16_t vers = SCJA4GetVersion(ssl_state->client_connp.ja4);
JsonTlsLogVersion(js, vers);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use a different version than ssl_state->server_connp.version ? Can it have different values ?

Copy link
Contributor Author

@rmcconnell-r7 rmcconnell-r7 Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

 {"timestamp":"2024-05-08T15:10:21.211207+0100","flow_id":1452419569489363,"pcap_cnt":6764,"event_type":"tls","src_ip":"172.17.56.188","src_port":61461,"dest_ip":"172.17.0.130","dest_port":8080,"proto":"TCP","pkt_src":"wire/pcap","tls":{"sni":"collector.azure.microsoft.scloud","version":"UNDETERMINED","ja3":{"hash":"66eaff235d22f1390d518dda0a0adb10","string":"771,49196-49195-49188-49187-49162-49161-49192-49191-49172-49171-157-156-61-60-53-47-10,0-5-10-11-13-35-23-65281,25-24-23,0"},"ja4":"t12d170800_699f1e34060c_7af1ed941c26","client_handshake":{"version":"TLS 1.2","ciphers":[49196,49195,49188,49187,49162,49161,49192,49191,49172,49171,157,156,61,60,53,47,10],"exts":[0,5,10,11,13,35,23,65281],"sig_algs":[2052,2053,2054,1025,1281,513,1027,1283,515,514,1537,1539]},"from_proto":"http"}}

If you see a snippet here, you will see the version from the output is UNDETERMINED whereas the version in the client_handshake, is: TLS 1.2, which is used to generate the JA4.

The version in this case is the server-version, as you've suggested, so if there hasn't been any server-side communication, this won't be set whilst the JA4 can be generated (as this is client-side). So the idea was to expose the 'raw fields' that are used to generate the hash.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, deserves a comment in the code ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, will be done in the next iteration :)

#ifndef SURICATA_UTIL_JA4_H
#define SURICATA_UTIL_JA4_H

#define JA4_HEX_LEN 36
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice removal :-)

@@ -1553,8 +1553,7 @@ static int TLSDecodeHandshakeHello(SSLState *ssl_state,
/* Ensure that we have a JA4 state defined by now, we are in a client/server hello
and we don't have such a state yet (to avoid leaking memory in case this function
is entered more than once). */
if (SC_ATOMIC_GET(ssl_config.enable_ja4) &&
ssl_state->current_flags &
if (ssl_state->current_flags &
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasonish what do you think about this commit ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems the if is still relevant, as we're testing for it as part of the if statement?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean it removes the config check ssl_config.enable_ja4 for ja4 computation, and moves it just before logging it...
Does that look ok ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the original intention was to avoid computing it when disabled, not simply avoid logging it. Something else that I'm realizing is just how much JA4 is still in the code with the --disable-ja4 compile time option. I wonder how this plays with that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, its just stubs. But the original intention stands.

Copy link
Member

@jasonish jasonish Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to disassociate the fields that go into computing JA4, from JA4 itself. That way they could be tracked and logged even if ja4 is disabled at compile time? Thoughts @victorjulien @catenacyber ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be cool indeed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find that a little confusing, sorry! So the fields that are used to make up a JA4 are to be unrelated to a JA4? Or are you meaning for it to be a handshake object that can produce a JA4? With this change, it really is just that, only it's called JA4. Maybe calling it Handshake would be more suited? So they are tracked without JA$ enablement but we only produce the hash when required? Really, that is what I have produced. So I'm a little uncertain of what you're meaning..

Copy link
Member

@jasonish jasonish Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, what I'm getting at is that the fields used to compute JA4 not be tied to JA4 code, and tracked independent of JA4 being enabled or not, compiled in or not. Then JA4 is just a user of these fields, if that makes sense.

What I'm not positive about is if that makes sense for our reasoning of making JA4 a compile time option, which is a clear separation from patents (even though JA4 has an exception).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could that be a big change to the design of many components within the system? It's hard for me to be sure if my changes will break something else. With this design, it was aiming to allow for extraction of the handshake data as raw fields so that the JA4 can be computed offline. The addition of this part i.e. making it work without having to ja4-fingerprint=yes was something discussed in a previous PR and I thought I'd provide the code for proposal. But now, it seems much more effort for the output we're getting. Maybe it's best to remove that, and force the user to enable JA4/fingerprint in order to get the handshake data, too? (so as-is in the current release) If/when this is finalised we can see about introducing the above proposal?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants