Why is my matchAll query so slow?

Hi all, I want to use matchAll to run a query equivalent to the SQL 'select top xxx from tab' in an RDBMS, but I find it is slow. Is there another, more appropriate function for this?

This is the fastest query.
Can you share your exact query please? And the full response as well?

Your query is probably using a lot of time sorting results.

Have a look at the Field stats API. It should return the minimum and maximum values for your field much faster than a matchAll query, which would have to sort all documents first.

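Here is my code:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHits;

// Es_Utils is my own helper class that returns the client.
QueryBuilder qb = QueryBuilders.matchAllQuery();
SearchResponse response = Es_Utils.getClient().prepareSearch(indexName)
        .setTypes(mappingName)
        .setQuery(qb)
        .setSize(100)   // return the first 100 hits
        .setFrom(0)
        .execute()
        .actionGet();
SearchHits hits = response.getHits();

But I am not using any sorting.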

Can you print the response object please?

It's not the response object

How do I print it?

I'm not clear what "xxx" is here. Is it documents? Is it values e.g. top "author names"?
If the latter then you want to be using aggregations, not retrieving individual documents.
Can you be more specific about the data you have and the question you want answered?

This is my document. The xxx is an integer value; I just replaced the real values. I use matchAll to get the data and then put it on the homepage.

And when you say you want the top values do you mean the largest-seen values or the most-frequently used?

response.toString()

{
  "took" : 478,
  "timed_out" : false,
  "_shards" : {
    "total" : 12,
    "successful" : 12,
    "failed" : 0
  },
  "hits" : {
    "total" : 196680688,
    "max_score" : 1.0,
    "hits" : [ {
      ....
    }, {
      ....
    } ]
  }
}

Here is the code that prints the elapsed time:

long startTime = System.currentTimeMillis();
QueryBuilder qb = QueryBuilders.matchAllQuery();
SearchResponse response = Es_Utils.getClient().prepareSearch(indexName)
        .setTypes(mappingName)
        .setQuery(qb)
        .setSize(100)
        .setFrom(0)
        .execute()
        .actionGet();

System.out.println(response.toString());
long endTime = System.currentTimeMillis();

endTime - startTime = 1924 ms

Thanks.

Two things here.

Elasticsearch takes around 500 ms to collect everything on the coordinating node (the "took" value).
Then it takes around 1500 ms to send that data over the wire to your client.

What is your Elasticsearch version, BTW?

Can you test the same with size = 10?
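
The timing code above also includes the System.out.println call inside the measured window, so it may help to time the search call and the printing separately. Below is a minimal sketch of that idea, assuming a hypothetical helper class PhaseTimer (not part of the Elasticsearch client) and a stand-in string in place of the real search call:

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of the Elasticsearch client.
final class PhaseTimer {

    /** A result together with the time it took to produce it, in milliseconds. */
    static final class Timed<T> {
        final T value;
        final long millis;

        Timed(T value, long millis) {
            this.value = value;
            this.millis = millis;
        }
    }

    /** Runs the given work once and records how long it took. */
    static <T> Timed<T> time(Supplier<T> work) {
        long start = System.nanoTime();
        T value = work.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMillis);
    }

    public static void main(String[] args) {
        // Stand-in for the real search; in the code from this thread it would be
        // the prepareSearch(...).execute().actionGet() chain.
        Timed<String> search = time(() -> "{\"took\": 472}");

        // Time the printing on its own so it is not counted as query time.
        Timed<Boolean> print = time(() -> {
            System.out.println(search.value);
            return true;
        });

        System.out.println("search call: " + search.millis + " ms");
        System.out.println("printing:    " + print.millis + " ms");
    }
}
```

With the real client call in place of the stand-in, the two numbers show how much of the client-side time is the request itself and how much is writing the response to the console.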

The ES version is 2.3.5. I set size = 10, but the query takes the same amount of time.

Can you print again the full response and your endTime-startTime?

{
  "took" : 472,
  "timed_out" : false,
  "_shards" : {
    "total" : 12,
    "successful" : 12,
    "failed" : 0
  },
  "hits" : {
    "total" : 196680688,
    "max_score" : 1.0,
    "hits" : [
      ......
    ]
  }
}

(endTime-startTime)=1887

So you are spending most of the time (75%) outside Elasticsearch.

I know that 5.0 has improved the match all query, so you may be able to reduce the 25% part, but you will still have to fix the remaining 75%.
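
For reference, the 25% / 75% split comes straight from the two numbers posted above: "took" (472 ms) and endTime - startTime (1887 ms). A small sketch of the arithmetic, where the class and method names are made up for illustration:

```java
// Hypothetical helper for illustration; the names are not from any library.
final class TimingBreakdown {

    /** Milliseconds spent outside Elasticsearch: wall-clock time minus "took". */
    static long outsideMillis(long tookMillis, long wallMillis) {
        return wallMillis - tookMillis;
    }

    /** Share of the wall-clock time spent outside Elasticsearch, rounded to a whole percent. */
    static long outsidePercent(long tookMillis, long wallMillis) {
        return Math.round(100.0 * (wallMillis - tookMillis) / wallMillis);
    }

    public static void main(String[] args) {
        long took = 472;   // "took" from the second response in this thread
        long wall = 1887;  // endTime - startTime measured on the client

        System.out.println("outside Elasticsearch: "
                + outsideMillis(took, wall) + " ms ("
                + outsidePercent(took, wall) + "%)");
        // prints: outside Elasticsearch: 1415 ms (75%)
    }
}
```

The first measurement (478 ms out of 1924 ms) gives the same 75% share.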
