Optimal structure for identifying users in log records

garbetjie · September 19, 2018, 6:03am

Hi there.

I'm very new to Elasticsearch, so please forgive me if this is a question that has been answered elsewhere.

In our application, it is important to be able to look up logs by a specific user ID. However, the user might only be identified much later in a request (each request is assigned a unique correlation ID), so a number of log messages will already have been recorded without a user ID associated with them.

What would be the optimal way to structure an Elasticsearch document that allows for easy searching of log records by a user ID? My proposed document structure is provided below:

{
    "correlation_id": "111",
    "users": ["000", "222", "333"],
    "records": [
        { "message": "something happened", "logged_at": "2018-01-01T00:00:00Z" },
        { "message": "some other event", "logged_at": "2018-01-01T00:00:00Z" },
        { "message": "user identified:222", "logged_at": "2018-01-01T00:00:00Z" },
        { "message": "some other event", "logged_at": "2018-01-01T00:00:00Z" }
    ]
}

So, when I search for log records involving user 222, I want to get back all the records for every correlation_id in which that user appears.
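To make the lookup concrete, here is a minimal sketch of the search I have in mind, written as a Python dict that would be sent as the search request body. It assumes the `users` field is mapped as a keyword so that a `term` query matches exact IDs; the helper names `build_user_query` and `matches` are just for illustration.

```python
def build_user_query(user_id):
    """Build a search body matching documents whose `users` array contains user_id."""
    # Assumption: `users` is mapped as a keyword field, so `term` is an exact match.
    return {"query": {"term": {"users": user_id}}}


def matches(document, user_id):
    """In-memory equivalent of the term query above, for illustration only."""
    return user_id in document.get("users", [])


# The proposed document from above, trimmed to two records.
doc = {
    "correlation_id": "111",
    "users": ["000", "222", "333"],
    "records": [
        {"message": "something happened", "logged_at": "2018-01-01T00:00:00Z"},
        {"message": "user identified:222", "logged_at": "2018-01-01T00:00:00Z"},
    ],
}

body = build_user_query("222")
```

Because every record for the request lives in the same document, a single hit on `users` would return the whole request history, including the messages logged before the user was known.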

However, in order to create a structure like this, I'll need to use upserts from within Logstash. My concern is that the appending of records to the records property might start to cause hot spots - is that even a concern with Elasticsearch?
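For reference, this is roughly the update I expect each log message to turn into. It is a sketch of a request body for the Elasticsearch Update API using a Painless script plus an `upsert` document; the function name `build_append_update` is hypothetical, and the equivalent Logstash output configuration is not shown.

```python
def build_append_update(correlation_id, record, user_id=None):
    """Build an Update API body that appends one log record to a request document.

    If the document does not exist yet, the `upsert` document is indexed as-is
    and the script is not run.
    """
    # Painless script: always append the record.
    source = "ctx._source.records.add(params.record);"
    params = {"record": record}

    if user_id is not None:
        # Only add the user if it is not already listed for this request.
        source += (
            " if (!ctx._source.users.contains(params.user))"
            " { ctx._source.users.add(params.user); }"
        )
        params["user"] = user_id

    return {
        "script": {"lang": "painless", "source": source, "params": params},
        "upsert": {
            "correlation_id": correlation_id,
            "users": [user_id] if user_id is not None else [],
            "records": [record],
        },
    }


record = {"message": "user identified:222", "logged_at": "2018-01-01T00:00:00Z"}
update_body = build_append_update("111", record, user_id="222")
```

As far as I understand, an update like this re-indexes the whole document each time, so every append gets more expensive as the `records` array grows. That is the part I am unsure about.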

If appending to the same array is not the right approach, what would you recommend for structuring and linking multiple log documents together (i.e., ensuring that a search for a specific user brings back all the documents that share the same correlation_id)?
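One alternative I can imagine is one document per log message, with a two-step lookup: first find the correlation IDs where the user appears, then fetch everything with those correlation IDs. The sketch below simulates that in memory and also builds the two query bodies. It assumes each document carries `correlation_id` and, only when known, a `user_id` field (both keyword-mapped); the field layout and function names are illustrative, not something I have running.

```python
def correlation_ids_query(user_id):
    """Step 1: find log documents that mention the user."""
    return {"query": {"term": {"user_id": user_id}}}


def records_query(correlation_ids):
    """Step 2: fetch every log document belonging to those requests."""
    return {"query": {"terms": {"correlation_id": sorted(correlation_ids)}}}


def records_for_user(log_docs, user_id):
    """In-memory version of the two-step lookup, for illustration only."""
    # Step 1: correlation IDs of requests in which the user appears.
    ids = {d["correlation_id"] for d in log_docs if d.get("user_id") == user_id}
    # Step 2: all documents for those requests, with or without a user_id.
    return [d for d in log_docs if d["correlation_id"] in ids]


log_docs = [
    {"correlation_id": "111", "message": "something happened"},
    {"correlation_id": "111", "message": "user identified:222", "user_id": "222"},
    {"correlation_id": "112", "message": "unrelated request"},
]

found = records_for_user(log_docs, "222")
```

This avoids repeatedly updating one growing document, at the cost of a second query for each search.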

Apologies for the wall of text. As mentioned, I'm very new to Elasticsearch, and would really appreciate any input or guidance!

system · October 17, 2018, 6:03am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
