Why is the matchAll query so slow?


(Hubo3085632) #1

Hi all, I want to use matchAll to run a query equivalent to the SQL 'select top xxx from tab' in an RDBMS, but I find that it runs slowly. Is there another, more appropriate function for this purpose?


(David Pilato) #2

This is the fastest query.
Can you share your exact query please? And the full response as well?


(Daniel Penning) #3

Your query is probably using a lot of time sorting results.

Have a look at the Field stats API. It should return the minimum and maximum values for your field much faster than a matchAll query, which would have to sort all documents first.


(Hubo3085632) #4

Here is my code:

    QueryBuilder qb = QueryBuilders.matchAllQuery();
    SearchResponse response = Es_Utils.getClient().prepareSearch(indexName)
        .setTypes(mappingName)
        .setQuery(qb)
        .setSize(100)
        .setFrom(0)
        .execute()
        .actionGet();
    SearchHits hits = response.getHits();


(Hubo3085632) #5

But I am not using any sorting.


(David Pilato) #6

Can you print the response object please?


(David Pilato) #8

It's not the response object


(Hubo3085632) #9

How do I print it?


(Mark Harwood) #10

I'm not clear what "xxx" is here. Is it documents? Is it values e.g. top "author names"?
If the latter then you want to be using aggregations, not retrieving individual documents.
Can you be more specific about the data you have and the question you want answered?


(Hubo3085632) #11

This is my document. The xxx is an integer value; I just replaced the real values. I use matchAll to get the data and then display it on the homepage.


(Mark Harwood) #12

And when you say you want the top values do you mean the largest-seen values or the most-frequently used?


(David Pilato) #13

response.toString()


(Hubo3085632) #14

{
  "took" : 478,
  "timed_out" : false,
  "_shards" : {
    "total" : 12,
    "successful" : 12,
    "failed" : 0
  },
  "hits" : {
    "total" : 196680688,
    "max_score" : 1.0,
    "hits" : [ {
      ....
    },
    {
      ....
    } ]
  }
}

Here is the code that prints the elapsed time:

    startTime = System.currentTimeMillis();
    QueryBuilder qb = QueryBuilders.matchAllQuery();
    SearchResponse response = Es_Utils.getClient().prepareSearch(indexName)
        .setTypes(mappingName)
        .setQuery(qb)
        .setSize(100)
        .setFrom(0)
        .execute()
        .actionGet();

    System.out.println(response.toString());
    endTime = System.currentTimeMillis();

Result: endTime - startTime = 1924 ms


(David Pilato) #15

Thanks.

2 things here.

It takes around 500 ms for Elasticsearch to collect everything on the coordinating node.
Then it takes around 1500 ms to send that data over the wire to your client.

Which Elasticsearch version are you using, BTW?

Can you run the same test with size = 10?
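
One detail worth noting about the timing code in post #14: endTime is read after System.out.println(response.toString()), so the measured interval also includes serializing and printing the response on the client. Below is a minimal sketch of a timing helper that stops the clock before any printing happens. The names TimedCall, Timed and time are hypothetical and only for illustration; the lambda in main is a stand-in for the real search call so that the example runs without a cluster.

```java
import java.util.function.Supplier;

final class TimedCall {

    /** Holds the result of a call together with its wall-clock duration. */
    static final class Timed<T> {
        final T value;
        final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the call and measures only the call itself, nothing after it. */
    static <T> Timed<T> time(Supplier<T> call) {
        long start = System.nanoTime();
        T value = call.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        return new Timed<>(value, elapsedMillis);
    }

    public static void main(String[] args) {
        // Stand-in for the real call from the thread, e.g.
        // Es_Utils.getClient().prepareSearch(indexName)...execute().actionGet()
        Timed<String> timed = time(() -> "{ \"took\" : 472 }");

        // Printing happens after the clock has stopped.
        System.out.println(timed.value);
        System.out.println("elapsed ms: " + timed.elapsedMillis);
    }
}
```

With the real client, the supplier would wrap the search call, and the printed response would no longer count toward the measured time.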


(Hubo3085632) #16

The ES version is 2.3.5. I set size = 10, but the query takes the same amount of time.


(David Pilato) #17

Can you print again the full response and your endTime-startTime?


(Hubo3085632) #18

{
  "took" : 472,
  "timed_out" : false,
  "_shards" : {
    "total" : 12,
    "successful" : 12,
    "failed" : 0
  },
  "hits" : {
    "total" : 196680688,
    "max_score" : 1.0,
    "hits" : [
      ......
    ]
  }
}

(endTime-startTime)=1887


(David Pilato) #19

So most of the time (about 75%) is being spent outside Elasticsearch.

I know that 5.0 has improved the match all query, so you may be able to reduce the 25% part, but you will still have to fix the remaining 75%.
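
The 75% figure follows from the numbers posted above: "took" is the time reported by Elasticsearch itself, and the rest of the measured wall-clock time is spent elsewhere (network transfer, client-side deserialization, and in this measurement also printing the response). A small sketch of that arithmetic, with the hypothetical class name TimingSplit and the values taken from posts #14 and #18:

```java
import java.util.Locale;

final class TimingSplit {

    /** Milliseconds of the measured time not accounted for by "took". */
    static long outsideMillis(long tookMillis, long elapsedMillis) {
        return elapsedMillis - tookMillis;
    }

    /** Share of the measured time spent outside Elasticsearch, in percent. */
    static double outsidePercent(long tookMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return 100.0 * outsideMillis(tookMillis, elapsedMillis) / elapsedMillis;
    }

    public static void main(String[] args) {
        // { took, endTime - startTime } for size = 100 and size = 10
        long[][] runs = { { 478, 1924 }, { 472, 1887 } };
        for (long[] run : runs) {
            System.out.println(String.format(Locale.ROOT,
                "took=%d ms, elapsed=%d ms, outside=%d ms (%.1f%%)",
                run[0], run[1],
                outsideMillis(run[0], run[1]),
                outsidePercent(run[0], run[1])));
        }
    }
}
```

Both runs come out at roughly three quarters of the time outside Elasticsearch, which is why changing size from 100 to 10 made little difference to the total.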

