Re: really bad post_filter performance

spancer_ray · March 31, 2014, 11:47pm

Too many shards may result in poor query performance.

I just upgraded to ES 1.0.1 from ES 0.9.2 and am seeing huge performance problems.

I traced them to what I think is the post_filter.

Here is the query that we used to run against ES 0.9.2

{
  "filter": {
    "and": [
      {
        "terms": {
          "index_ids": [2134616789944]
        }
      },
      {
        "or": [
          {"term": {"trashed_at": 0}},
          {"not": {"exists": {"field": "trashed_at"}}}
        ]
      }
    ]
  }
}
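
Read as plain boolean logic, the filter above keeps a document when its index_ids contain the requested ID and the document is not trashed, meaning trashed_at is either 0 or absent. The sketch below restates that predicate in Python over in-memory dictionaries. It only illustrates the intended semantics, not how Elasticsearch evaluates filters; the function names and sample documents are made up for the example.

```python
def is_not_trashed(doc):
    """Mirror of the "or" clause: trashed_at equals 0, or the field is missing/null."""
    value = doc.get("trashed_at")
    return value is None or value == 0


def matches(doc, wanted_index_id):
    """Mirror of the outer "and": index_ids contains the ID and the doc is not trashed."""
    return wanted_index_id in doc.get("index_ids", []) and is_not_trashed(doc)


# Hypothetical sample documents, not taken from the real index.
docs = [
    {"id": 1, "index_ids": [2134616789944], "trashed_at": 0},
    {"id": 2, "index_ids": [2134616789944]},
    {"id": 3, "index_ids": [2134616789944], "trashed_at": 1396309621},
    {"id": 4, "index_ids": [42], "trashed_at": 0},
]

kept_ids = [d["id"] for d in docs if matches(d, 2134616789944)]
print(kept_ids)
```

Documents 1 and 2 pass both conditions; document 3 is trashed and document 4 belongs to a different index ID.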

This used to take the 0.9 cluster about 150 ms to execute.

The same query takes about 2.5 s on the 1.0 cluster.

I rewrote it to conform to my understanding of the changes in 1.0, using a filtered query; however, that didn't help.

I then tried to figure out which parts were slow. I now have the following query:

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "terms": {
          "index_ids": [2134616789944]
        }
      }
    }
  },
  "post_filter": {
    "or": [
      {"term": {"trashed_at": 0}},
      {"not": {"exists": {"field": "trashed_at"}}}
    ]
  }
}

It takes 2.5 s and returns 34 hits. However, removing the "post_filter" clause:

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "terms": {
          "index_ids": [2134616789944]
        }
      }
    }
  }
}

Makes it take 50ms and return 34 results.

My conclusion is that it's taking 2.5 seconds to filter 34 results, and that's confusing.

The cluster uses 3 machines, 50 shards, 2 replicas per shard. This means that each machine has the entire copy of the index. We use the ?routing= parameter, and are always hitting a single shard for the query.

Help?
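
For reference, the cluster layout described above works out as follows. This is just the arithmetic, assuming that "50 shards" means 50 primary shards, that shard copies are spread evenly across the machines, and that no machine holds two copies of the same shard; `shard_layout` is a made-up name for the example.

```python
def shard_layout(primaries, replicas_per_shard, nodes):
    """Count shard copies for a cluster, assuming an even spread across nodes."""
    copies_per_shard = 1 + replicas_per_shard      # primary plus its replicas
    total_copies = primaries * copies_per_shard
    copies_per_node = total_copies // nodes
    # Every node holds the whole index when each shard has one copy per node.
    full_copy_on_each_node = copies_per_shard == nodes
    return {
        "total_copies": total_copies,
        "copies_per_node": copies_per_node,
        "full_copy_on_each_node": full_copy_on_each_node,
    }


layout = shard_layout(primaries=50, replicas_per_shard=2, nodes=3)
print(layout)
```

That gives 150 shard copies in total and 50 per machine, which matches the statement that each machine has a full copy of the index. With ?routing=, a search is served by one copy of one of those 50 shards.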

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/cd5b6bb1-7fce-4688-84cb-4ec6d0db8f93%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Binh_Ly_2 · April 1, 2014, 3:14pm

I'd probably just collapse everything into a filtered query. Something like this:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "index_ids": ["2134616789944"]
              }
            }
          ],
          "should": [
            {
              "term": {
                "trashed_at": 0
              }
            },
            {
              "not": {
                "exists": {
                  "field": "trashed_at"
                }
              }
            }
          ]
        }
      }
    }
  }
}
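
If the request is assembled in application code, the collapsed query could be built along these lines. This is a sketch using only the Python standard library; `build_query` is a hypothetical helper, and it uses a `term` filter with the numeric value 0 for trashed_at, as in the original poster's query. It only produces the request body and does not contact a cluster.

```python
import json


def build_query(index_id):
    """Return an ES 1.x-style filtered query body as a dict (hypothetical helper)."""
    return {
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        # The document must belong to the requested index ID ...
                        "must": [
                            {"terms": {"index_ids": [index_id]}},
                        ],
                        # ... and be untrashed: trashed_at is 0 or the field is absent.
                        "should": [
                            {"term": {"trashed_at": 0}},
                            {"not": {"exists": {"field": "trashed_at"}}},
                        ],
                    }
                }
            }
        }
    }


body = json.dumps(build_query(2134616789944), indent=2)
print(body)
```

Building the body as a dict and serializing it with `json.dumps` avoids hand-written quoting mistakes in the JSON.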
