The title says it all really:
I need to write a query which searches multiple indices and returns the top 5 results for each index.
What can I use to do this?
Thx
There are two options: field collapsing, or a top hits aggregation nested under a terms aggregation on the _index field.
The downside of field collapsing is that it won't work on the _index metadata field. So you will need to add an additional field to your documents that indicates which index each document is in, so you can collapse on that field.
For example, given these docs in three indexes a, b and c:
POST _bulk
{ "index" : { "_index": "a", "_type": "doc", "_id" : "1"}}
{ "foo" : "bar a", "my_index": "a"}
{ "index" : { "_index": "a", "_type": "doc", "_id" : "2"}}
{ "foo" : "bar a", "my_index": "a"}
{ "index" : { "_index": "b", "_type": "doc", "_id" : "3"}}
{ "foo" : "bar b", "my_index": "b"}
{ "index" : { "_index": "b", "_type": "doc", "_id" : "4"}}
{ "foo" : "bar b", "my_index": "b"}
{ "index" : { "_index": "c", "_type": "doc", "_id" : "5"}}
{ "foo" : "bar c", "my_index": "c"}
{ "index" : { "_index": "c", "_type": "doc", "_id" : "6"}}
{ "foo" : "bar c", "my_index": "c"}
You could run a collapse on the my_index.keyword field (the keyword sub-field that default dynamic mapping creates for string values):
GET a,b,c/_search
{
  "query": {
    "match": {
      "foo": "bar"
    }
  },
  "collapse": {
    "field": "my_index.keyword",
    "inner_hits": {
      "name": "my_top_5",
      "size": 5
    }
  }
}
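Each collapsed hit in the response carries its group's documents under inner_hits, keyed by the name given in the request (my_top_5 here). As a rough sketch of how a client might regroup them per index, assuming the response has already been parsed into a Python dict: the helper name group_inner_hits and the sample response below are made up for illustration and are not output from a real cluster.

```python
def group_inner_hits(response, field="my_index.keyword", name="my_top_5"):
    """Map each collapsed field value to the _source docs of its inner hits."""
    grouped = {}
    for hit in response["hits"]["hits"]:
        # Assumption: the collapse value comes back under "fields" as a list.
        key = hit["fields"][field][0]
        inner = hit["inner_hits"][name]["hits"]["hits"]
        grouped[key] = [doc["_source"] for doc in inner]
    return grouped


# Hand-written, trimmed sample shaped like a collapse response (hypothetical).
sample_collapse_response = {
    "hits": {
        "hits": [
            {
                "_index": "a",
                "_id": "1",
                "fields": {"my_index.keyword": ["a"]},
                "inner_hits": {
                    "my_top_5": {
                        "hits": {
                            "hits": [
                                {"_id": "1", "_source": {"foo": "bar a", "my_index": "a"}},
                                {"_id": "2", "_source": {"foo": "bar a", "my_index": "a"}},
                            ]
                        }
                    }
                },
            },
            {
                "_index": "b",
                "_id": "3",
                "fields": {"my_index.keyword": ["b"]},
                "inner_hits": {
                    "my_top_5": {
                        "hits": {
                            "hits": [
                                {"_id": "3", "_source": {"foo": "bar b", "my_index": "b"}},
                            ]
                        }
                    }
                },
            },
        ]
    }
}

grouped = group_inner_hits(sample_collapse_response)
for index_name, docs in grouped.items():
    print(index_name, len(docs))
```

The top-level hits list holds one entry per index, so the per-index results live only in inner_hits; the "size": 5 inside inner_hits is what caps each group at five documents.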
Probably easier (because it doesn't require that additional my_index field) is the top hits aggregation. Given the docs above, the following aggregation request gives you what you're looking for:
GET a,b,c/_search
{
  "query": {
    "match": {
      "foo": "bar"
    }
  },
  "size": 0,
  "aggs": {
    "indices": {
      "terms": {
        "field": "_index"
      },
      "aggs": {
        "my_top_hits": {
          "top_hits": {
            "size": 5
          }
        }
      }
    }
  }
}
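With "size": 0 the regular hits list comes back empty and the documents arrive inside the aggregation buckets instead, one bucket per index. A minimal sketch of reading them out, again assuming the response is already a parsed Python dict: the helper name top_hits_per_index, the sample response, and its scores are hypothetical and only mirror the shape of the request above.

```python
def top_hits_per_index(response, terms_agg="indices", hits_agg="my_top_hits"):
    """Map each index name to a list of (_id, _score) pairs from its bucket."""
    result = {}
    for bucket in response["aggregations"][terms_agg]["buckets"]:
        # Each terms bucket nests the top_hits result under the sub-agg name.
        hits = bucket[hits_agg]["hits"]["hits"]
        result[bucket["key"]] = [(h["_id"], h["_score"]) for h in hits]
    return result


# Hand-written, trimmed sample shaped like the aggregation response
# (hypothetical; the scores are invented).
sample_agg_response = {
    "hits": {"hits": []},
    "aggregations": {
        "indices": {
            "buckets": [
                {
                    "key": "a",
                    "doc_count": 2,
                    "my_top_hits": {
                        "hits": {
                            "hits": [
                                {"_index": "a", "_id": "1", "_score": 0.3},
                                {"_index": "a", "_id": "2", "_score": 0.3},
                            ]
                        }
                    },
                },
                {
                    "key": "b",
                    "doc_count": 2,
                    "my_top_hits": {
                        "hits": {
                            "hits": [
                                {"_index": "b", "_id": "3", "_score": 0.2},
                                {"_index": "b", "_id": "4", "_score": 0.2},
                            ]
                        }
                    },
                },
            ]
        }
    },
}

per_index = top_hits_per_index(sample_agg_response)
for index_name, hits in per_index.items():
    print(index_name, hits)
```

One thing to keep in mind: a terms aggregation returns only the ten largest buckets by default, so when searching more than ten indices the "size" of the terms aggregation would need to be raised as well.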
I figured I would need to use a top hits aggregation, and I found something similar to your suggestion;
this approach seems to be working just fine!
Field collapsing seems interesting too, but as you said, it's probably better to use the top hits aggregation because of the _index field limitation.
Thanks for the extensive reply.