Elasticsearch Performance Issues

I am encountering a performance issue with Elasticsearch on Amazon EC2.
Currently I am getting at most 500 requests per second (rps) using ab, the
Apache HTTP server benchmarking tool, with a very simple query. Our goal is
to get to at least 1000 rps, but it seems unlikely unless we throw more
hardware at it. Any advice would be greatly appreciated.
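
For reference, the benchmark is invoked along these lines (the index name
and query string below are placeholders, not our actual ones):

ab -n 10000 -c 100 "http://$ES_HOST:9200/myindex/_search?q=title:foo"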

Configuration:
8-node cluster - m1.xlarge - 4 are EBS-optimized
Default memory settings (5 GB heap)
Java version: 1.7.0_03
OS: Ubuntu 12.04.1

Active shards: 4
Replicas: 1

Number of documents: 1.5 million
Each document has about 400 searchable attributes of varying data types.
Here are the analyzers applied to all string fields:

{
  "settings" : {
    "index" : {
      "number_of_shards" : 4,
      "number_of_replicas" : 1
    },
    "analysis" : {
      "char_filter" : {
        "my_mapping" : {
          "type" : "mapping",
          "mappings" : ["(=>", ")=>", "[=>", "]=>", "™=>", "®=>"]
        }
      },
      "analyzer" : {
        "default" : {
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : ["lowercase", "pattern_replace"],
          "char_filter" : ["my_mapping"],
          "stopwords" : "none"
        },
        "lowercase_only_alphanum" : {
          "type" : "custom",
          "tokenizer" : "keyword",
          "filter" : ["lowercase", "pattern_replace"],
          "char_filter" : ["my_mapping"]
        }
      },
      "filter" : {
        "pattern_replace" : {
          "type" : "pattern_replace",
          "pattern" : "\u00A0",
          "replacement" : " "
        }
      }
    }
  }
}
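
(As a sanity check of the analysis chain, the _analyze API shows the tokens
an analyzer produces; "myindex" below stands in for our index name:)

curl -XGET "http://$ES_HOST:9200/myindex/_analyze?analyzer=default" -d 'Some (Sample) Text®'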

Thanks,

JM

--

Since you are primarily interested in query performance, you should optimize
your index, e.g.:

curl -XPOST "http://$ES_HOST:9200/_optimize?max_num_segments=1"

Have you tried this?
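
Note that _optimize forces each shard's Lucene segments to be merged down to
at most the given number; it is I/O-heavy, so it is best run once after bulk
indexing rather than routinely. You can also limit it to a single index;
assuming yours is called "myindex":

curl -XPOST "http://$ES_HOST:9200/myindex/_optimize?max_num_segments=1"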

--

Thanks, RKM, but yes, I have applied this to all nodes.

JM

--

Yes, well, as you speculate, you may simply end up needing more memory. But
first, here is a list of "levers". Try them one by one and see which of them
work for you (rough sketches of each follow the list):

  1. mlockall : true
  2. restrict the number of docs returned to 5, or even 1 if possible
  3. mmapFS
  4. raise replicas from 4/1 to 4/2, or even 4/3. This allows each query to be
    answered by fewer nodes and (if your nodes have spare capacity) it will
    increase throughput
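
Rough sketches of each lever; the YAML keys and the index name "myindex" are
from memory, so verify them against the docs for your version:

# 1. in config/elasticsearch.yml: lock the JVM heap into RAM so it never swaps
#    (the elasticsearch user also needs "ulimit -l unlimited")
bootstrap.mlockall: true

# 2. in the search request itself: return only a handful of documents
curl -XPOST "http://$ES_HOST:9200/myindex/_search" -d '{
  "size" : 5,
  "query" : { "match_all" : { } }
}'

# 3. also in config/elasticsearch.yml: memory-map index files
index.store.type: mmapfs

# 4. raise the replica count on the live index (no reindexing needed)
curl -XPUT "http://$ES_HOST:9200/myindex/_settings" -d '{
  "index" : { "number_of_replicas" : 2 }
}'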

Use bigdesk to understand what your current constraint (memory / CPU / disk
I/O) is likely to be. Make sure that all nodes are at roughly the same CPU
utilization, i.e. that the query workload is balanced across your cluster.
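
If you prefer raw JSON over bigdesk, the same numbers are exposed by the
nodes stats API (the path below is the pre-1.0 one used by these releases):

curl -XGET "http://$ES_HOST:9200/_cluster/nodes/stats?pretty=true"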

--

It would be nice if you could give us (a quick way to pull some of these
numbers is sketched after the list):

  • the size of your index, since maybe it does not fit into memory
  • the queries you are executing, and how distinct your field values are,
    since faceting, sorting, and caching contribute a lot to performance
  • information about what client you are using, since throughput depends on
    how fast your client can process results
  • whether all nodes are participating in responding to the client, or just
    a single node
  • some monitoring facts about how much of your CPU, RAM, and network
    bandwidth is used, to find out which resource is saturated
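
For the first and last points, the stats API is a quick start; a sketch
(double-check the path against your version's docs):

curl -XGET "http://$ES_HOST:9200/_stats?pretty=true"

It reports per-index store size and document counts, which tells you whether
the index can fit in the filesystem cache.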

I assume you set the JVM heap to 5 GB? Did you change any other JVM settings?
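
(For reference, the usual way is the ES_HEAP_SIZE environment variable,
which the startup script turns into matching -Xms/-Xmx flags:)

# set in the environment before starting each node
export ES_HEAP_SIZE=5g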

Thanks,

Jörg

--