[Detection Engine] Expose EQL partial results #205277

Open
3 tasks
yctercero opened this issue Dec 30, 2024 · 10 comments
Labels
bug, Feature:Event Correlation (EQL) Rule, Team:Detection Engine

Comments

@yctercero
Contributor

yctercero commented Dec 30, 2024

Summary

Prior to this update, EQL disallowed partial results: any index unavailability, including shard failures, caused the search to fail. We realized that there are use cases in which partial results would be helpful, and that this is how the feature should have worked from the start for non-sequence queries. We are treating this as a mix of bug fix and enhancement.

Feature

We need to update our rule failure logic to account for partial results being available for non-sequence queries and ensure we are properly exposing any shard failure errors to the user.

  • Update rule execution logic so that the rule ends in a partial failure state (successful run, with a warning about shard failures)
  • Telemetry - update Kseniia's error dashboard to track this error? cc @approksiu
  • Docs - we don't need to document this as a new feature, but we should add it to the release notes and check whether the EQL rule docs mention that partial results are not available.
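
As a rough illustration of the first task, here is a minimal sketch of how a rule run could be classified once partial results are allowed. Every name in it (`EqlSearchResponse`, `ShardFailure`, `summarizeEqlRun`, the `shard_failures` field layout) is an assumption made for the example, not Kibana's actual types or the confirmed Elasticsearch response shape.

```typescript
// Hypothetical shapes for illustration only; not Kibana's real types.
interface ShardFailure {
  index: string;
  shard: number;
  reason: { type: string; reason: string };
}

interface EqlSearchResponse {
  hits: { events?: unknown[] };
  // Assumed to be present only when some shards failed.
  shard_failures?: ShardFailure[];
}

type RuleStatus = 'succeeded' | 'partial failure';

interface RuleRunOutcome {
  status: RuleStatus;
  alertCount: number;
  warnings: string[];
}

// Classify a non-sequence EQL run: alerts are still generated from the
// events that were returned, and shard failures surface as warnings.
function summarizeEqlRun(response: EqlSearchResponse): RuleRunOutcome {
  const failures = response.shard_failures ?? [];
  const alertCount = (response.hits.events ?? []).length;

  // Deduplicate messages so repeated shard errors produce one warning each.
  const warnings = failures
    .map((f) => `${f.index}[${f.shard}]: ${f.reason.type}: ${f.reason.reason}`)
    .filter((message, i, all) => all.indexOf(message) === i);

  return {
    status: warnings.length > 0 ? 'partial failure' : 'succeeded',
    alertCount,
    warnings,
  };
}
```

The point of the sketch is only that shard failures change the reported status and add warnings, while the alert count still reflects whatever events came back.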
@botelastic (bot) added the needs-team label on Dec 30, 2024
@yctercero added the Team:Detection Engine label and removed the needs-team label on Jan 2, 2025
@elasticmachine
Contributor

Pinging @elastic/security-detection-engine (Team:Detection Engine)

@yctercero
Contributor Author

Note: currently, partial results are available for custom query and indicator match rules; we set the rule to a partial failure state but still generate alerts.

@dhurley14
Contributor

Somewhat related to this work: rules used to be defined entirely by the configuration of the rule saved object. That changed when we added exceptions, and now that rules also depend on these "global rule configuration options" via the Kibana advanced settings, the output of the rules export API is no longer the single source of truth. I'm wondering if we should incorporate the values from the Kibana advanced settings into the output of the rules export API (and, likewise, the import API).

This is probably a separate conversation but wanted to see what others thought of this before opening a ticket.

@yctercero
Contributor Author

That's a great point, @dhurley14.

@approksiu what are users' expectations of the rule object being the source of truth/needing to be a representation of what's run during execution? Would we want to reflect anywhere what additional settings influenced the execution (data tier settings, and now this setting)?

We can address this separately from this particular ticket, but it's good to discuss.

@yctercero added the enhancement and Feature:Event Correlation (EQL) Rule labels on Jan 3, 2025
@yctercero
Contributor Author

Discussion from team sync:

  • Is this considered a bug fix? If so, we are just matching behavior with other rule types. Other rule types don't have an advanced setting.
  • Do we want to expose this setting if we are fixing behavior?

cc @approksiu

@approksiu

approksiu commented Jan 7, 2025

@yctercero @dhurley14

Do we want to expose this setting if we are fixing behavior?

Excellent point, team. Let's change the behavior to allow partial results but not introduce the new setting. This is aligned with customer expectations and would not require additional setup from users.

@approksiu

approksiu commented Jan 7, 2025

Also good point about

what are users' expectations of the rule object being the source of truth/needing to be a representation of what's run during execution? Would we want to reflect anywhere what additional settings influenced the execution (data tier settings, and now this setting)?

I opened a ticket for this: https://github.com/elastic/security-team/issues/11494. Other settings like "snooze" also came up in customer discussions.

@yctercero changed the title from "[Detection Engine] Expose EQL partial results setting" to "[Detection Engine] Expose EQL partial results" on Jan 13, 2025
@yctercero added the bug label and removed the enhancement label on Jan 13, 2025
@dhurley14
Contributor

Reading through that PR from Elasticsearch, they call out a possible false-positive result for sequence queries.

Sequence queries are particularly problematic, because their result does not depend on a single record, but by the correlation of multiple records, so the result of a query can be slightly incorrect in case of partial shards (consider the case where a missing event clause matches records that are in missing shards, it will return false positives).
For this reason, we also provide an additional parameter, allow_partial_sequence_results to fine-tune the behavior of sequence queries:

Just reiterating my interpretation: if an event in the sequence matches a record in an unavailable shard, the sequence query would return a successful hit whether or not the values in that particular sequence matched the data in the missing shard, hence a potential false positive. Am I interpreting this correctly? If so, this makes me wonder if we should open this property up as a rule configuration option, rather than just defaulting it to true for all EQL queries.
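
To make the quoted missing-event scenario concrete, here is a toy TypeScript simulation (not real EQL execution). It assumes a hypothetical rule along the lines of "login, then no MFA event, then download" for the same user, with events spread over two made-up shards. All types, names, and data are invented for illustration.

```typescript
// Toy model of events spread across shards; everything here is hypothetical.
interface ToyEvent {
  shard: number;
  time: number;
  kind: 'login' | 'mfa' | 'download';
  user: string;
}

// Emulates a sequence with a missing-event clause:
// a login, then NO mfa event, then a download, all for the same user.
function usersDownloadingWithoutMfa(events: ToyEvent[]): string[] {
  const matched: string[] = [];
  for (const login of events.filter((e) => e.kind === 'login')) {
    // First download by the same user after this login.
    const download = events
      .filter((e) => e.kind === 'download' && e.user === login.user && e.time > login.time)
      .sort((a, b) => a.time - b.time)[0];
    if (download === undefined) continue;

    // The sequence only matches if no mfa event is visible in between.
    const mfaBetween = events.some(
      (e) => e.kind === 'mfa' && e.user === login.user && e.time > login.time && e.time < download.time
    );
    if (!mfaBetween && matched.indexOf(login.user) === -1) {
      matched.push(login.user);
    }
  }
  return matched;
}

const allEvents: ToyEvent[] = [
  { shard: 0, time: 1, kind: 'login', user: 'alice' },
  { shard: 1, time: 2, kind: 'mfa', user: 'alice' }, // lives on the shard that will go missing
  { shard: 0, time: 3, kind: 'download', user: 'alice' },
];

// With every shard available, the mfa event blocks the match.
const withAllShards = usersDownloadingWithoutMfa(allEvents);

// With shard 1 unavailable, the mfa event is invisible and the sequence matches.
const withShardMissing = usersDownloadingWithoutMfa(allEvents.filter((e) => e.shard !== 1));
```

In this toy data, `withAllShards` is empty while `withShardMissing` contains `alice`: the hit appears only because the event that should have ruled it out sits on the unavailable shard.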

What are your thoughts @approksiu @yctercero @marshallmain ?

@approksiu

Just reiterating my interpretation: if an event in the sequence matches a record in an unavailable shard, the sequence query would return a successful hit whether or not the values in that particular sequence matched the data in the missing shard, hence a potential false positive. Am I interpreting this correctly? If so, this makes me wonder if we should open this property up as a rule configuration option, rather than just defaulting it to true for all EQL queries.

@dhurley14 we only want to change the behavior and apply the new setting for non-sequence EQL rules. Sequence rules should still fail, precisely to prevent false positives.
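
A minimal sketch of that split, assuming the request accepts an `allow_partial_search_results` flag alongside the `allow_partial_sequence_results` parameter quoted above. The helper names and the naive sequence check are hypothetical, not the actual Kibana implementation.

```typescript
// Hypothetical option bag; the first flag name is an assumption for this sketch.
interface EqlPartialResultOptions {
  allow_partial_search_results: boolean;
  allow_partial_sequence_results: boolean;
}

// Naive check for illustration: treats any query starting with the
// `sequence` keyword as a sequence query. A real parser would be stricter.
function isSequenceQuery(query: string): boolean {
  return /^\s*sequence\b/.test(query);
}

// Non-sequence queries tolerate unavailable shards; sequence queries keep
// failing outright so that missing data cannot produce false positives.
function buildPartialResultOptions(query: string): EqlPartialResultOptions {
  return {
    allow_partial_search_results: !isSequenceQuery(query),
    allow_partial_sequence_results: false,
  };
}

const eventOptions = buildPartialResultOptions('process where process.name == "cmd.exe"');
const sequenceOptions = buildPartialResultOptions(
  'sequence by host.id [process where true] [network where true]'
);
```

Here `eventOptions` allows partial search results and `sequenceOptions` does not; partial sequence results stay disabled in both cases.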

@dhurley14
Contributor

Ah! Apologies, I missed that. Thank you!


4 participants