I have three types of logs, A, B and C, where A is the parent of both B and C.
The A log arrives first and is saved into Elasticsearch, but B and C can arrive minutes or hours later. B and C each contain a reference to A (their parent).
I want to update the A document in Elasticsearch with B and C when they arrive, so that the document looks like:
{
  dataA: A,
  dataB: B,
  dataC: C
}
I tried using a parent/child relationship in Elasticsearch, but since the relationships are one-to-one, it does
not make sense to use it (the querying is also slow). I would rather have a denormalized structure.
Is there any way to partially update an Elasticsearch document from Logstash? Or have I gone about this
all wrong? I would love some suggestions.
You could define your own _id for the event when you send it to Elasticsearch; that would be for the parent document, A.
Then for document B, you could use an Elasticsearch filter to look up A and merge the two docs, making sure the output uses the same _id as in the step above.
Then repeat for document C.
Basically, you store the first doc under a known _id and then update it with the next two.
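
To make the "same _id, then update" idea concrete, here is a rough Python sketch of what such a partial update could look like as an Elasticsearch _bulk request body, plus a simplified local stand-in for the merge. The index name "logs", the id "a-1", the field names and the helper functions are all made up for illustration. The sketch also assumes you use an update with doc_as_upsert instead of the lookup-and-merge filter described above, so a child that arrives before its parent still creates the document. In Logstash itself, the elasticsearch output has document_id, action => "update" and doc_as_upsert options that should give the same effect, but check the plugin docs for your version.

```python
import json


def bulk_update_lines(index, parent_id, field, payload):
    """Build the two NDJSON lines of a _bulk partial update for one log.

    The document id is always the parent's id, so A, B and C all land
    in the same document. doc_as_upsert creates it if it is missing.
    """
    action = {"update": {"_index": index, "_id": parent_id}}
    body = {"doc": {field: payload}, "doc_as_upsert": True}
    return json.dumps(action) + "\n" + json.dumps(body) + "\n"


def apply_partial_update(store, parent_id, partial):
    """Local stand-in for the update: merge top-level fields only.

    Simplification: real Elasticsearch merges nested objects recursively.
    Here each log type owns its own top-level field, so that is enough.
    """
    doc = store.setdefault(parent_id, {})
    doc.update(partial)
    return doc


if __name__ == "__main__":
    store = {}  # hypothetical in-memory replacement for the index
    apply_partial_update(store, "a-1", {"dataA": {"msg": "parent"}})
    apply_partial_update(store, "a-1", {"dataC": {"msg": "child C"}})
    apply_partial_update(store, "a-1", {"dataB": {"msg": "child B"}})
    print(json.dumps(store["a-1"], indent=2))
    print(bulk_update_lines("logs", "a-1", "dataB", {"msg": "child B"}), end="")
```

Because every log writes only its own field (dataA, dataB or dataC), the order in which B and C arrive does not matter, and no lookup of A is needed before writing.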