Advice on user correlation / enrichment

Hey all.

I am currently onboarding new log sources onto our new Elastic instance, and I am trying to figure out whether the following would work in Elastic and, if so, how.

The issue is that many of the security solutions we work with do not natively support IP -> user correlation, or, if they do, they do it poorly because they can gather only limited information from the environment to make those decisions.

Assuming:

  • There is an index that tracks wireless authentication and can provide a timestamp, a username, and an IP address assigned to the device.
  • There is an index that tracks when a user obtains a Kerberos ticket from Active Directory and provides a timestamp, a username, and an IP address of the device requesting the ticket.
  • There are other miscellaneous indices that contain log-on information from an IP address that would have similar information (timestamp, username, IP).
  • IP addresses are dynamically assigned and have very short lease times.

and

  • There is an index containing firewall/EDR type information that has an associated IP address in the log/document.

Can Elastic:

  • Enrich/correlate the IP address against the different indices above following a specific order of precedence (e.g. try wireless authentication first, then Kerberos, then fall back to the other log types) and add the resulting username to a field?
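
To make the logic concrete, here is a rough sketch of what I have in mind, written in plain Python over in-memory lists rather than against Elastic. The function names, the tuple layout, the sample data, and the 30-minute maximum age are all assumptions made up for illustration; they are not Elastic features.

```python
from datetime import datetime, timedelta


def latest_user(events, ip, event_time, max_age):
    """Return the username of the newest event for `ip` that happened at or
    before `event_time` and is no older than `max_age`; None if there is none."""
    best = None
    for ts, src_ip, user in events:
        if src_ip != ip:
            continue
        if ts > event_time or event_time - ts > max_age:
            continue
        if best is None or ts > best[0]:
            best = (ts, user)
    return best[1] if best else None


def correlate_user(sources, ip, event_time, max_age=timedelta(minutes=30)):
    """Try each source in priority order and return (username, source_name)
    from the first source that has a match, or (None, None)."""
    for name, events in sources:
        user = latest_user(events, ip, event_time, max_age)
        if user is not None:
            return user, name
    return None, None


# Hypothetical stand-ins for the indices: (timestamp, ip, username)
day = datetime(2024, 1, 1)
wireless = [
    (day.replace(hour=10, minute=0), "10.0.0.5", "alice"),
    (day.replace(hour=10, minute=20), "10.0.0.5", "bob"),
]
kerberos = [(day.replace(hour=10, minute=5), "10.0.0.9", "carol")]

# Priority order: wireless first, then Kerberos.
sources = [("wireless", wireless), ("kerberos", kerberos)]

# A firewall event from 10.0.0.5 at 10:12 should resolve to alice, not bob.
print(correlate_user(sources, "10.0.0.5", day.replace(hour=10, minute=12)))
# -> ('alice', 'wireless')
```

The same IP resolves to a different user depending on the event time, which is the behaviour I would need the enrichment to reproduce.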

The only ways I can envision this being possible are enrichment policies or runtime field mappings. However, neither of them would really work because:

  • Enrichment policies are not real-time: the policy has to be re-executed so that a new enrichment index can be built. That is not feasible for this type of data, since the source indices are updated in near real-time and constantly rebuilding the enrichment indices probably would not work well.
  • Runtime mappings wouldn't work either, since searching on an IP address alone would match multiple documents and return an arbitrary one of them. I would need the lookup to also take into account the timestamp at which the event was generated, since the user associated with the IP address could change as little as 10 minutes later and would most likely be different by the time the search is conducted.
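
For reference, the time-aware lookup I would want for each event looks roughly like the search body below, built here as a plain Python dict. This is only a sketch: the field names (`source.ip`, `user.name`, `@timestamp`) assume ECS-style mappings, and `build_lookup_query` and the 30-minute window are hypothetical names and values I made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def build_lookup_query(ip, event_time, max_age=timedelta(minutes=30)):
    """Build a search body that returns the single most recent authentication
    document for `ip` at or before `event_time`, within `max_age`."""
    return {
        "size": 1,  # only the newest matching document is wanted
        "query": {
            "bool": {
                "filter": [
                    {"term": {"source.ip": ip}},
                    {
                        "range": {
                            "@timestamp": {
                                "gte": (event_time - max_age).isoformat(),
                                "lte": event_time.isoformat(),
                            }
                        }
                    },
                ]
            }
        },
        # Newest first, so the one hit is the closest preceding log-on.
        "sort": [{"@timestamp": {"order": "desc"}}],
        "_source": ["user.name", "@timestamp"],
    }


event_time = datetime(2024, 1, 1, 10, 12, tzinfo=timezone.utc)
query = build_lookup_query("10.0.0.5", event_time)
```

Running something like this once per firewall/EDR event and per source index is what I am hoping to avoid doing outside of Elastic.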

Has anyone dealt with such a use case? If so, how did you manage to solve this issue?

Thanks!
