-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hyperscan: add caching mechanism for hyperscan contexts v11 #12658
base: master
Are you sure you want to change the base?
Conversation
This variant of hashlittle2() ensures that it avoids accesses beyond the last byte of the string, which will cause warnings from tools like Valgrind or Address Sanitizer.
Cache Hyperscan serialized databases to disk to prevent compilation of the same databases when Suricata is run again with the same ruleset. Hyperscan binary files are stored per rulegroup in the designated folder, by default in the cached library folder. Since caching is per signature group heads, some chunk of the ruleset can change and it still can reuse part of the unchanged signature groups. Loading *fresh* ET Open ruleset: 19 seconds Loading *cached* ET Open ruleset: 07 seconds Ticket: 7170
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #12658 +/- ##
==========================================
- Coverage 80.77% 80.65% -0.13%
==========================================
Files 932 934 +2
Lines 259517 259935 +418
==========================================
+ Hits 209629 209641 +12
- Misses 49888 50294 +406
Flags with carried forward coverage won't be shown. Click here to find out more. |
@victorjulien wrote in #12394 :
The cache files are created by running the |
Information: QA ran without warnings. Pipeline 24865 |
Side note: this PR broke on one of my machines, but it was due to its hyperscan version being very old (4.6). We don't seem to check, or the check doesn't work well. |
Looks like there is no check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See inline.
Additionally, I think it would be good to count the successfully loaded caches, and output that number. Right now I don't know from the output if the cache works.
# Cache MPM contexts to the disk to avoid rule compilation at the startup. | ||
# Cache files are created in the standard library directory. | ||
sgh-mpm-caching: yes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not clear that this is under detect, so
detect:
sgh-mpm-caching....
FILE *db_cache_out = fopen(hash_file_static, "w"); | ||
if (!db_cache_out) { | ||
if (!notified) { | ||
SCLogWarning("Failed to create Hyperscan cache file, make sure the folder exist and is " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to also have the strerror(errno) output as it's very unclear to me why I get this failure
Feb 25 19:57:17 c2758 suricata[231574]: Warning: mpm-hs-cache: Failed to create Hyperscan cache file, make sure the folder exist and is writable or adjust sgh-mpm-caching-path setting (/var/lib/suricata/cache/sgh/ids/00445770507689860353_v1.hs) [HSSaveCache:util-mpm-hs-cache.c:197]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path exists and is owned by the same user as the process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it looks like it was landlock getting in the way. Wonder if we should automatically add this path to the read and write allow list.
@@ -1297,7 +1297,7 @@ static int AppLayerProtoDetectPMPrepareMpm(AppLayerProtoDetectPMCtx *ctx) | |||
int ret = 0; | |||
MpmCtx *mpm_ctx = &ctx->mpm_ctx; | |||
|
|||
if (mpm_table[mpm_ctx->mpm_type].Prepare(mpm_ctx) < 0) | |||
if (mpm_table[mpm_ctx->mpm_type].Prepare(mpm_ctx, false) < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too excited about this bool everywhere, that isn't really clear. Wonder if it should be a flags field with named flags, like MPM_PREPARE_FLAG_NONE, MPM_PREPARE_FLAG_USE_CACHE
, etc or similar.
@@ -162,7 +162,8 @@ typedef struct MpmTableElmt_ { | |||
*/ | |||
int (*AddPattern)(struct MpmCtx_ *, uint8_t *, uint16_t, uint16_t, uint16_t, uint32_t, SigIntId, uint8_t); | |||
int (*AddPatternNocase)(struct MpmCtx_ *, uint8_t *, uint16_t, uint16_t, uint16_t, uint32_t, SigIntId, uint8_t); | |||
int (*Prepare)(struct MpmCtx_ *); | |||
int (*Prepare)(struct MpmCtx_ *, bool); | |||
int (*CacheRuleset)(void); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's odd that this doesn't take any args for a ruleset or detect engine, for which to write the cache
Should I make it a "Notice" level? |
Followup of #12394
Cache Hyperscan serialized databases to disk to prevent compilation of the same databases when Suricata is run again with the same ruleset.
Hyperscan binary files are stored per rulegroup in the designated folder, by default in the cached library folder.
Since caching is per signature group heads, some chunk of the ruleset can change and it still can reuse part of
the unchanged signature groups.
Loading fresh ET Open ruleset: 19 seconds
Loading cached ET Open ruleset: 07 seconds
Link to ticket: https://redmine.openinfosecfoundation.org/issues/7170
Describe changes:
v11:
v10:
v9:
v7: (v6 was private)
v5:
v4:
v3
v2
v1