Entity-centric indexing with Transforms

Hello, we're working with Elasticsearch for the first time and are currently deciding on the best solution for the problem at hand.

We are receiving event-based logs (in JSON form) from our applications directly into an Elasticsearch index. These logs are highly interconnected (they share a common unique ID), and we therefore need to convert/aggregate them in an entity-centric fashion.

Each event usually records a status change in the target field; there are more statuses than just start/end. Each document also carries additional data that could be used to create more than one entity-centric index. For example:

```json
{
  "uniqueID": "ain123in145512kn",
  "name": "Bob",
  "target": {
    "eventStart": { "timestamp": "2020-06-01T13:50:55.000Z" }
  }
}
{
  "uniqueID": "ain123in145512kn",
  "name": "Bob",
  "target": {
    "eventStop": { "timestamp": "2021-06-01T13:50:55.000Z" }
  }
}
```

We were already able to join these documents using Python or Logstash. We basically created an index that contains the following documents:

```json
{
  "uniqueID": "ain123in145512kn",
  "name": "Bob",
  "target": {
    "eventStart": { "timestamp": "2020-06-01T13:50:55.000Z" },
    "eventStop": { "timestamp": "2021-06-01T13:50:55.000Z" },
    "time_dif_Start_Stop": xxxx
  }
}
```

We assigned each event a document ID equal to its uniqueID, so indexing the second event automatically updated the existing document. The next step simply calculated the difference between the eventStart and eventStop timestamps.
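As a rough sketch of that approach (the index name `entity-index` is our own placeholder; the semantics are standard Elasticsearch partial-document upserts), each event can be written keyed by its uniqueID so that subsequent events merge into the same document:

```json
POST entity-index/_update/ain123in145512kn
{
  "doc": {
    "uniqueID": "ain123in145512kn",
    "name": "Bob",
    "target": {
      "eventStop": { "timestamp": "2021-06-01T13:50:55.000Z" }
    }
  },
  "doc_as_upsert": true
}
```

Because partial updates merge nested objects recursively, the `eventStart` written by the first event survives when the `eventStop` event arrives.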

We have certain requirements for our pipeline, so we would prefer that the data never leaves Elasticsearch. We are therefore wondering whether this can be done with any of the tools that already exist in the ELK stack or are hosted on Elastic Cloud. We tried using Transforms, but we were only able to calculate aggregated fields in a new index. Is it also possible to merge/update all the documents into a single one with this tool, or any other? That would be ideal for us, as Transforms run on a schedule and we would not need any external tools to modify documents.

Any other suggestions or help would also be greatly appreciated.

Hi,

It seems like transforms should fit your needs, but it would be good to know more details.

If you tried using transforms, could you show the config you were using for that?
What do you mean by "only able to calculate aggregated fields"? eventStart should be the result of a min aggregation; similarly, eventStop should be the result of a max aggregation.
Is it time_dif_Start_Stop that is problematic for you? It looks like it could be calculated by an ingest pipeline attached to your destination (entity-centric) index.
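For example (the pipeline name is illustrative; field names are taken from your documents), a script processor can compute the difference once both timestamps are present:

```json
PUT _ingest/pipeline/time-dif-pipeline
{
  "processors": [
    {
      "script": {
        "description": "Compute eventStop - eventStart in milliseconds",
        "if": "ctx.target?.eventStart != null && ctx.target?.eventStop != null",
        "source": """
          ZonedDateTime start = ZonedDateTime.parse(ctx.target.eventStart.timestamp);
          ZonedDateTime stop  = ZonedDateTime.parse(ctx.target.eventStop.timestamp);
          ctx.target.time_dif_Start_Stop = ChronoUnit.MILLIS.between(start, stop);
        """
      }
    }
  ]
}
```

The pipeline can then be referenced in the transform's `dest.pipeline` setting so it runs on every document the transform writes.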

No, the time_dif_Start_Stop is not problematic; we are able to calculate it with scripted metrics and write it to the destination index. What we are wondering is how to also "transfer" some of the existing fields (that are not part of the aggregations and calculations) from the source index to the destination index, based on the shared ID (uniqueID).

If there are not many such fields, you can put them in the group_by section of the transform config.
Of course, in that case they are not meant to be used for grouping (grouping is already achieved by uniqueID), but they will be present in the destination index.
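A minimal sketch of such a config (the transform and index names, and the `name.keyword` field, are assumptions based on the example documents):

```json
PUT _transform/entity-centric-events
{
  "source": { "index": "event-logs" },
  "dest": { "index": "entity-centric-index" },
  "pivot": {
    "group_by": {
      "uniqueID": { "terms": { "field": "uniqueID" } },
      "name":     { "terms": { "field": "name.keyword" } }
    },
    "aggregations": {
      "target.eventStart.timestamp": { "min": { "field": "target.eventStart.timestamp" } },
      "target.eventStop.timestamp":  { "max": { "field": "target.eventStop.timestamp" } }
    }
  }
}
```

Here `name` is a pass-through field: since all events sharing a uniqueID have the same name, adding it to group_by does not create extra buckets, but it does appear in each destination document.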

Please note, however, that if you have many such fields, it can impact performance of the transform.


Thanks for your answer, that is what we needed.