Search must return at most a single document

Hi,

I'm working on an enrich index to power an ENRICH statement in ES|QL search. This is what I have so far:

PUT /_enrich/policy/fortinet-fortimail-session-details
{
  "match": {
    "indices": "logs-fortinet_fortimail.log-*",
    "match_field": "fortinet_fortimail.log.session_id",
    "enrich_fields": ["source.ip"],
    "query": {
      "term": {
        "fortinet_fortimail.log.type": "statistics"
      }
    }
  }
}

This works fine as long as only a single document is found. However, there are some edge cases where multiple documents share the same session_id, and in this case the ENRICH matches more than one document. Unfortunately, the documents are mostly identical, so there is no other field to easily filter on.

I'm wondering how I can limit the results returned by the query to at most a single document. I've thought about calculating a score and using min_score to filter, but my knowledge of this approach is too limited.

Any ideas?

Thank you.

Hi @tokcum,

Can you submit some results with fake data for both cases: single and multiple documents?

Thanks,

Sure. Thank you for your support, Alex.

Let me explain the background of this question. There might be a completely different solution to this.

I've installed the FortiMail integration and ingest the FortiMail logs. The log contains different document types. I'm focusing on the documents related to virus infection. Virus-related details are included in those documents, but some data essential for my SIEM use case is missing, e.g. the source.ip.

This is what I get in a Kibana search:

The missing data is available in other documents of the same index. The session_id is the key to finding those other documents.
The problem is that there is not just one other document with that session_id, but several. That's why I filter fortinet_fortimail.log.type for statistics in the enrich policy. In most cases there is just one document of that type.

With ES|QL and the enrich index in place this looks like this:

This is my search:

from logs-fortinet_fortimail.log-siem
| where fortinet_fortimail.log.type == "virus"
| enrich fortinet-fortimail-session-details
| sort @timestamp desc

The result is in the next post. As a new user, I'm not allowed to embed two pictures in a single post.

This is the result. It looks pretty good, except when there are multiple "statistics" documents for a session_id. This case is marked in yellow. It occurs when a mail was sent to multiple recipients.

I would like to "resolve" a session_id into a single source.ip.

Any ideas appreciated.

Hi @tokcum,

Just to clarify, could you send the results of a query with all data from the same session ID? For example, the last one in the last picture.

Hi Alex,
I'm not sure I fully understand your question. I think that if I could limit the search result in the enrich index to a single document, this would be fine, because all the data I need to enrich the current document would be there.

Thank you.
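
For anyone who finds this thread later: one possible way to end up with a single value is to collapse the enriched column after the ENRICH step, rather than limiting the enrich index itself. This is only a sketch. It assumes that ENRICH returns source.ip as a multivalued column when several enrich documents share a session_id, and that picking any one of those values is acceptable because the statistics documents are mostly identical. The ON and WITH clauses are spelled out for clarity, and mv_min is an ES|QL multivalue function that reduces a column to its smallest value.

```esql
from logs-fortinet_fortimail.log-siem
| where fortinet_fortimail.log.type == "virus"
// Assumption: with several matching "statistics" documents, source.ip comes back multivalued.
| enrich fortinet-fortimail-session-details on fortinet_fortimail.log.session_id with source.ip
// Reduce the (possibly multivalued) column to one value per row.
| eval source.ip = mv_min(source.ip)
| sort @timestamp desc
```

mv_min is used here because its result does not depend on the order of the values. mv_first would also return a single value, but the order of values in a multivalued column is generally not something to rely on.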
