This is obviously a contrived minimal example, so while you may think the data could be refactored into one document, please for the sake of this question assume you can't (unless there is a really good reason not to).
I want to search bar for all documents that have a total greater than 10. I then want to use the ids of these documents to get the corresponding documents from foo (so I can use the type).
For example: I search for all documents in bar with a total greater than 10, and get ids 1 and 2. I then use these to get the corresponding documents from foo: {id: 1, type: "aaa"}, {id: 2, type: "bbb"}, {id: 2, type: "ccc"}.
Is there a way to do this using Kibana/Elasticsearch?
You are asking for an expensive lookup operation: using the results of one query to construct a new query by ID. You have some options, but the best one is to change the shape of your data to better fit the index-oriented nature of Elasticsearch. If that is impossible, it is theoretically possible to construct this kind of lookup using a series of functions in Canvas, but I haven't tested this functionality.
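To make the two-step lookup concrete, here is a minimal sketch in Python. It only builds the two request bodies (a `range` query against bar, then a `terms` query against foo) and evaluates them against in-memory lists, so it runs without a cluster. The contents of `bar_docs`, the extra foo document with id 3, and the helper names `range_query`, `terms_query` and `run_locally` are assumptions made up for illustration. With a real client you would send the same bodies to each index instead of calling `run_locally`.

```python
# In-memory stand-ins for the two indices. The bar totals and the foo
# document with id 3 are invented for this example.
bar_docs = [
    {"id": 1, "total": 12},
    {"id": 2, "total": 15},
    {"id": 3, "total": 4},
]
foo_docs = [
    {"id": 1, "type": "aaa"},
    {"id": 2, "type": "bbb"},
    {"id": 2, "type": "ccc"},
    {"id": 3, "type": "ddd"},
]


def range_query(field, gt):
    """Step 1 body: match documents whose field is greater than gt."""
    return {"query": {"range": {field: {"gt": gt}}}}


def terms_query(field, values):
    """Step 2 body: match documents whose field equals any collected value."""
    return {"query": {"terms": {field: sorted(set(values))}}}


def run_locally(body, docs):
    """Toy evaluator for the two query shapes above (not a real client)."""
    kind, clause = next(iter(body["query"].items()))
    field, cond = next(iter(clause.items()))
    if kind == "range":
        return [d for d in docs if field in d and d[field] > cond["gt"]]
    if kind == "terms":
        wanted = set(cond)
        return [d for d in docs if d.get(field) in wanted]
    raise ValueError(f"unsupported query type: {kind}")


# Step 1: find the ids in bar with total > 10.
step1 = range_query("total", 10)
ids = [d["id"] for d in run_locally(step1, bar_docs)]

# Step 2: use those ids to fetch the matching documents from foo.
step2 = terms_query("id", ids)
result = run_locally(step2, foo_docs)

for doc in result:
    print(doc)
```

The cost the answer refers to shows up in step 2: every id returned by step 1 has to be carried back to the application and sent again as a term, which scales poorly when the first query matches many documents.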
Do you have any suggestions to go about doing this?
To give some more info, I have data coming from multiple sources (i.e. it cannot be added at the same time to a single document) and at some point in the future, somebody is going to ask me "please get us a list of X that has Y above Z" (or some other obscure query). And this information is going to be spread across separate indices. In the day-to-day case, the data we need for our visualisations is all within a single index.
Would the best option be to re-index data from the separate indices into one, if that is possible, and then perform the query on that?
The problem is when someone asks for something really specific, e.g. in the past month, get me a list of all Y for people who have X higher than Z. Multiple visualisations wouldn't give that data. It's usually someone in a specific department who needs a list to do something with.
Putting the data into single documents and using update could potentially work, but it would be quite complicated. The data isn't really related (it comes from very separate sources, but is generated by the same user) and is updated at different times. And, as with my example using foo and bar, there may be multiple documents, or no documents, in foo for a given id that already exists thousands of times in bar, or vice versa.
However (please correct me if I'm wrong), I think I could re-index the data from all the relevant indices (that contain the data I need), somehow merging all the connected documents into one document, on the case-by-case basis that one of these specific queries needs to be done?
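As a rough illustration of what "merging all the connected documents into one document" could look like, here is a sketch in Python that groups documents from several sources by id. It works on in-memory lists only. The sample data, the `merge_by_id` helper and the shape of the merged document (one list per source) are assumptions for illustration, not something Elasticsearch produces by itself. Keeping a list per source handles the case where an id has several documents in one index and none in another.

```python
from collections import defaultdict

# Sample data invented for this example; id 4 exists only in bar
# and id 2 has two documents in foo.
foo_docs = [
    {"id": 1, "type": "aaa"},
    {"id": 2, "type": "bbb"},
    {"id": 2, "type": "ccc"},
]
bar_docs = [
    {"id": 1, "total": 12},
    {"id": 2, "total": 15},
    {"id": 4, "total": 30},
]


def merge_by_id(sources):
    """Combine documents from several sources into one document per id.

    sources maps a source name (e.g. an index name) to its documents.
    Each merged document keeps a list per source, so zero, one or many
    documents per id are all represented.
    """
    merged = defaultdict(dict)
    for name, docs in sources.items():
        for doc in docs:
            entry = merged[doc["id"]]
            entry["id"] = doc["id"]
            fields = {k: v for k, v in doc.items() if k != "id"}
            entry.setdefault(name, []).append(fields)
    return [merged[key] for key in sorted(merged)]


merged_docs = merge_by_id({"foo": foo_docs, "bar": bar_docs})

# The one-off question can now be answered from a single collection:
# the types of every id that has at least one bar total above 10.
types_over_10 = [
    f["type"]
    for doc in merged_docs
    if any(b["total"] > 10 for b in doc.get("bar", []))
    for f in doc.get("foo", [])
]
print(types_over_10)
```

Once documents of this shape are indexed into a temporary index, the specific question becomes a single query against that index rather than a lookup across two.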