I have started working with significant terms aggregations and would like to understand whether I can produce multi-dimensional results. For example, let's say I get the following terms back:
iPhone
iOS
Bend
Apple Watch
Smashed
Small
Android
I would like to get results that cross-pollinate these terms. I could build a multidimensional array in my application code and then search on each combination, but this seems very inefficient. Ideally, results would come back as follows:
iPhone | Smashed
iPhone | Bend
iPhone | iOS | Android
Apple Watch | Smashed
Apple Watch | Small
I am not sure of the best way to achieve this, but thought I would see if anyone had any thoughts.
Could you post a small set of sample documents, the query you are currently sending, and the results you would like to see, so it's easier to reproduce what you want to do?
You might be able to chain significant terms aggregations. I would have to try it on your sample data, though, to verify that it works the way I think it does and to see whether it actually solves your problem.
If you want to start tinkering yourself, I'd suggest installing Kibana, which makes it dead simple to play around with aggregations.
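As a rough illustration of what chaining could look like, here is a minimal sketch that builds a request body with one significant_terms aggregation nested inside another. The field name "body", the aggregation names "top_terms" and "related", the example query, and the sizes are all assumptions made for illustration; nothing here has been run against the poster's data.

```python
import json


def build_chained_significant_terms(query, field, outer_size=50, inner_size=5):
    """Build a search body with a significant_terms aggregation nested in another.

    `query` selects the foreground set (the documents of interest).
    The aggregation names "top_terms" and "related" are hypothetical.
    """
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "query": query,
        "aggs": {
            "top_terms": {
                "significant_terms": {"field": field, "size": outer_size},
                "aggs": {
                    # evaluated once per outer bucket, i.e. per significant term
                    "related": {
                        "significant_terms": {"field": field, "size": inner_size}
                    }
                },
            }
        },
    }


# Hypothetical foreground query and field name.
body = build_chained_significant_terms({"match": {"body": "broken screen"}}, "body")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of a search request. Because both levels use the same field, each inner bucket list may well contain the parent term itself, so that entry would need to be filtered out afterwards.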
I will have to build up some test data so I can share it.
I thought about chaining, but if the first aggregation gives me, let's say, 50 results and I chain them together to find the combinations, would this not become inefficient?
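To put rough numbers on that concern, the sketch below compares how many separate searches a client-side approach would need for every combination of 50 terms with how many buckets a single chained request would return. The inner size of 5 is an assumed value, and the comparison says nothing about how much work the server does per bucket.

```python
from math import comb


def client_side_queries(n_terms, combo_size):
    """Number of separate searches if every combination is queried from the application."""
    return comb(n_terms, combo_size)


def chained_buckets(n_terms, inner_size):
    """Upper bound on inner buckets returned by one chained aggregation request."""
    return n_terms * inner_size


print(client_side_queries(50, 2))  # 1225 pair queries
print(client_side_queries(50, 3))  # 19600 triple queries
print(chained_buckets(50, 5))      # at most 250 buckets in a single response
```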
The following information may help.
My documents are around 500 characters long. They cover similar subject matter and may come in sets of around 200. I currently get around 50 significant terms and would like to find the most significant combinations of those terms.
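If a two-level significant terms aggregation were used for this, its nested buckets could be flattened into ranked "term | term" rows along these lines. This is only a sketch: the aggregation names "top_terms" and "related" are hypothetical, the sample response is invented to show the shape, and ranking pairs by the inner bucket's score is one possible choice rather than an established method.

```python
def flatten_pairs(response, outer="top_terms", inner="related"):
    """Turn nested significant_terms buckets into 'A | B' rows, best score first."""
    best = {}
    for parent in response["aggregations"][outer]["buckets"]:
        for child in parent[inner]["buckets"]:
            if child["key"] == parent["key"]:
                continue  # a term trivially co-occurs with itself
            pair = frozenset((parent["key"], child["key"]))  # order-independent
            label = f'{parent["key"]} | {child["key"]}'
            # keep only the highest-scoring direction of each pair
            if pair not in best or child["score"] > best[pair][0]:
                best[pair] = (child["score"], label)
    ranked = sorted(best.values(), key=lambda item: item[0], reverse=True)
    return [label for _, label in ranked]


# Invented sample response, trimmed to the fields used above.
sample = {"aggregations": {"top_terms": {"buckets": [
    {"key": "iPhone", "score": 0.9, "related": {"buckets": [
        {"key": "iPhone", "score": 1.0},
        {"key": "Smashed", "score": 0.8},
        {"key": "Bend", "score": 0.6}]}},
    {"key": "Apple Watch", "score": 0.7, "related": {"buckets": [
        {"key": "Smashed", "score": 0.5},
        {"key": "Small", "score": 0.4}]}},
    {"key": "Smashed", "score": 0.6, "related": {"buckets": [
        {"key": "iPhone", "score": 0.3}]}},
]}}}

for row in flatten_pairs(sample):
    print(row)
```

Combinations of three or more terms would need another nesting level or a different approach; this sketch only covers pairs.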