Diagnosing a slow query

Christopher_Ambler · July 31, 2014, 2:51pm

Okay, let's attack this directly. We have a cluster of 6 machines (6
nodes). We have an index of just under 3.5 million documents. Each document
represents an Internet domain name. We are performing queries against this
index to see names that exist in our index. Most queries are coming back in
the sub-50ms range. But a bunch are taking 600ms to 900ms and, thus,
showing up in our slow query log. If they ALL were performing at this
speed, I'd wouldn't be nearly as confused, but it looks like only about 10%
to 20% of the queries are "slow." That's clearly too much.

Head reports that this index looks like this:

aftermarket-2014-07-31_02-38-19
size: 424Mi (2.47Gi)
docs: 3,428,471 (3,428,471)

Here is the configuration for a typical node (they're all pretty-much the
same). We have 2 machines in a dev data center, 2 machines in a mesa data
center and 2 machines in a phx data center. Each of the two machines in a
data center has a "node.zone" tag set, and, as you can see, I have the
cluster routing awareness set to see "zone" as its marching orders. The
data pipes between the data centers are beefy, and while I acknowledge that
cross-DC isn't something that's generally smiled-upon, it appears to work
fine.

Each machine has 96G of RAM. We start ES giving it 30G for the heap size.
File descriptors are set at 64,000. Note that I've selected the memory
mapped file system.

Server-specific settings for cluster domainiq-es

cluster.name: domainiq-es
node.name: "Mesa-03"
node.zone: es-mesa-prod
discovery.zen.ping.unicast.hosts: ["dev2.glbt1.gdg", "m1p1.mesa1.gdg",
"m1p4.mesa1.gdg", "p3p3.phx3.gdg", "p3p4.phx3.gdg"]

The following configuration items should be the same for all ES servers

node.master: true
node.data: true
index.number_of_shards: 5
index.number_of_replicas: 5
index.store.type: mmapfs
index.memory.index_buffer_size: 30%
index.translog.flush_threshold_ops: 25000
index.refresh_interval: 30s
bootstrap.mlockall: true
cluster.routing.allocation.awareness.attributes: zone
gateway.recover_after_nodes: 4
gateway.recover_after_time: 2m
gateway.expected_nodes: 6
discovery.zen.minimum_master_nodes: 3
discovery.zen.ping.timeout: 10s
discovery.zen.ping.retries: 3
discovery.zen.ping.interval: 15s
discovery.zen.ping.multicast.enabled: false

And here is a typical slow query:

[2014-07-31 07:35:31,530][WARN ][index.search.slowlog.query] [Mesa-03]
[aftermarket-2014-07-31_02-38-19][2] took[707.6ms], took_millis[707],
types[premium], stats[], search_type[QUERY_THEN_FETCH], total_shards[5],
source[],
extra_source[{"size":35,"query":{"query_string":{"query":"sld:petusies^20.0
OR tokens:(((pet^1.2 pets^1.0 *^1.0)AND(us^1.2 *^0.8)AND(ie^1.2
*^0.6)AND(s^1.2 *^0.4)) OR((pet^1.2 pets^1.0)AND(us^1.2)AND(ie^1.2))^3.0)
AND tld:(com^1.001 OR in^0.99 OR co.in^0.941174367459617 OR
net.in^0.8848832474555992 OR us^0.85 OR org.in^0.8397882862729736 OR
gen.in^0.785829669672289 OR firm.in^0.7414549824163524 OR ind.in^0.7 OR
org^0.6) OR
_id:petusi.es^5.0-domaintype:partner","lowercase_expanded_terms":true,"analyze_wildcard":false}}}],

So note that I create 5 shards and 5 replicas, so that each node has all 5
shards at all times. I THOUGHT THIS MEANT BETTER PERFORMANCE. That is, I
thought having all 5 shards on every node meant that a query to a node
didn't have to ask another node for data. IS THIS NOT TRUE?

Here's where it also gets interesting: I tried setting the number of shards
to 2 (with 5 replicas) and my slow queries went to almost 2 seconds
(2000ms). This is also terribly counter-intuitive! I thought fewer shards
meant less lookup time.

Clearly, I want to optimize for read here. I don't care if indexing is
three times as slow, we need our queries to be sub-100ms.

Any help is SERIOUSLY appreciated (and if you're in the Bay Area, I'm not
above bribes of beer :-))

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/43b8bd8a-b20f-49de-a99d-825168095d6a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christopher_Ambler · July 31, 2014, 5:49pm

It has been suggested that what I'm seeing is a CPU-bound issue in that the
large number of OR directives in our query could make many of these queries
take a long time.

As I'm not an expert on crafting queries, any expert opinions?

Because I'm feeling pretty good about my configuration about now...

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ca810684-5fa3-4f0e-92ca-38b3fe359fd7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

otisg · August 1, 2014, 12:27am

ORs certainly tend to be slower than simpler/shorter term queries, but I'd
suspect that cross-DC part because your index is tinny and servers are
beefy and plentiful. Maybe you can look at your network and query metrics
and correlate a drop or spike in traffic or packet loss with slow queries?

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

tel: +1 347 480 1610 fax: +1 718 679 9190

On Thursday, July 31, 2014 7:49:43 PM UTC+2, Christopher Ambler wrote:

It has been suggested that what I'm seeing is a CPU-bound issue in that
the large number of OR directives in our query could make many of these
queries take a long time.

As I'm not an expert on crafting queries, any expert opinions?

Because I'm feeling pretty good about my configuration about now...

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/18093462-aa38-4f6b-835f-0f6578dd69dc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christopher_Ambler · August 1, 2014, 12:56am

Suspecting this, we tried taking things down to a single server and still
have the exact same response.

That said, optimizing the query to get rid of some of those ORs has helped,
so I think that's the path we're taking.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/17cdcddd-fa02-40bf-b59c-dfee613c3a84%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

ryan_chou · August 1, 2014, 6:56am

i would rather suggest you separate the nodes roles. you could divide them
into 3 node roles.
1.Index node : for index data
2.load balance : for query , you could maximize load balance node memory
for avoiding the GC consume too much memory.
3.data node : just for store the data.
and optimize the routing strategy for query.

Those echo node has 96G, what a high allocations.

在 2014年7月31日星期四UTC+8下午10时51分17秒，Christopher Ambler写道：

Okay, let's attack this directly. We have a cluster of 6 machines (6
nodes). We have an index of just under 3.5 million documents. Each document
represents an Internet domain name. We are performing queries against this
index to see names that exist in our index. Most queries are coming back in
the sub-50ms range. But a bunch are taking 600ms to 900ms and, thus,
showing up in our slow query log. If they ALL were performing at this
speed, I'd wouldn't be nearly as confused, but it looks like only about 10%
to 20% of the queries are "slow." That's clearly too much.

Head reports that this index looks like this:

aftermarket-2014-07-31_02-38-19
size: 424Mi (2.47Gi)
docs: 3,428,471 (3,428,471)

Here is the configuration for a typical node (they're all pretty-much the
same). We have 2 machines in a dev data center, 2 machines in a mesa data
center and 2 machines in a phx data center. Each of the two machines in a
data center has a "node.zone" tag set, and, as you can see, I have the
cluster routing awareness set to see "zone" as its marching orders. The
data pipes between the data centers are beefy, and while I acknowledge that
cross-DC isn't something that's generally smiled-upon, it appears to work
fine.

Each machine has 96G of RAM. We start ES giving it 30G for the heap size.
File descriptors are set at 64,000. Note that I've selected the memory
mapped file system.

Server-specific settings for cluster domainiq-es

cluster.name: domainiq-es
node.name: "Mesa-03"
node.zone: es-mesa-prod
discovery.zen.ping.unicast.hosts: ["dev2.glbt1.gdg", "m1p1.mesa1.gdg",
"m1p4.mesa1.gdg", "p3p3.phx3.gdg", "p3p4.phx3.gdg"]

The following configuration items should be the same for all ES servers

node.master: true
node.data: true
index.number_of_shards: 5
index.number_of_replicas: 5
index.store.type: mmapfs
index.memory.index_buffer_size: 30%
index.translog.flush_threshold_ops: 25000
index.refresh_interval: 30s
bootstrap.mlockall: true
cluster.routing.allocation.awareness.attributes: zone
gateway.recover_after_nodes: 4
gateway.recover_after_time: 2m
gateway.expected_nodes: 6
discovery.zen.minimum_master_nodes: 3
discovery.zen.ping.timeout: 10s
discovery.zen.ping.retries: 3
discovery.zen.ping.interval: 15s
discovery.zen.ping.multicast.enabled: false

And here is a typical slow query:

[2014-07-31 07:35:31,530][WARN ][index.search.slowlog.query] [Mesa-03]
[aftermarket-2014-07-31_02-38-19][2] took[707.6ms], took_millis[707],
types[premium], stats, search_type[QUERY_THEN_FETCH], total_shards[5],
source,
extra_source[{"size":35,"query":{"query_string":{"query":"sld:petusies^20.0
OR tokens:(((pet^1.2 pets^1.0 *^1.0)AND(us^1.2 *^0.8)AND(ie^1.2
*^0.6)AND(s^1.2 *^0.4)) OR((pet^1.2 pets^1.0)AND(us^1.2)AND(ie^1.2))^3.0)
AND tld:(com^1.001 OR in^0.99 OR co.in^0.941174367459617 OR net.in^0.8848832474555992
OR us^0.85 OR org.in^0.8397882862729736 OR gen.in^0.785829669672289 OR
firm.in^0.7414549824163524 OR ind.in^0.7 OR org^0.6) OR _id:petusi.es^5.0-domaintype:partner","lowercase_expanded_terms":true,"analyze_wildcard":false}}}],

So note that I create 5 shards and 5 replicas, so that each node has all 5
shards at all times. I THOUGHT THIS MEANT BETTER PERFORMANCE. That is, I
thought having all 5 shards on every node meant that a query to a node
didn't have to ask another node for data. IS THIS NOT TRUE?

Here's where it also gets interesting: I tried setting the number of
shards to 2 (with 5 replicas) and my slow queries went to almost 2 seconds
(2000ms). This is also terribly counter-intuitive! I thought fewer shards
meant less lookup time.

Clearly, I want to optimize for read here. I don't care if indexing is
three times as slow, we need our queries to be sub-100ms.

Any help is SERIOUSLY appreciated (and if you're in the Bay Area, I'm not
above bribes of beer :-))

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/5591aa45-b933-43c9-a13a-168deb53bdf1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

smonasco_2 · August 2, 2014, 4:24pm

10 to 20% of the time queries take the better part of a second when only on one node and you indicate the queries are very similar in nature.

For that second do a lot of queries show up all at once or are they spaced out?
Do you see a spike in query requests?
Do you see the CPU max out?
Do you see queries get queued?
Can you run the same exact query regularly and still see the problem or only in a heterogeneous set do you get the issue?
Is there IO disk contention when they happen?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e427b15b-510c-4e28-8aa1-5af5dce30265%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.