Elasticsearch 2.3.0 Performance

(Mahendra) #1

I am using Elasticsearch 2.3.0 hosted on AWS.

I recently changed my code to use the bulk indexing option provided by Elasticsearch.

When I tested this change in my PERF environment, I was happy with the performance.
But when I pushed the code to the PROD environment, performance was very poor compared to what I observed in PERF.

  • Both PERF and PROD environments have the same hardware configuration (m4.2xlarge EC2 server).
  • The data volume in PERF is about 170 million documents stored in a single index, whereas the data volume in PROD is about 800 million documents stored in a single index.

All other settings in elasticsearch.yml are the same in both PERF and PROD.

I am unable to figure out the cause of the poor performance. Please give me some pointers.
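
For readers who have not used the bulk option mentioned above, here is a minimal sketch of how a `_bulk` request body is laid out when the client supplies its own document IDs. It only builds the newline-delimited body and does not send anything. The index name `my_index`, the type name `my_type`, the helper `build_bulk_body`, and the sample documents are all hypothetical and are not taken from the poster's code.

```python
import json


def build_bulk_body(docs, index, doc_type):
    """Build a newline-delimited _bulk body from (doc_id, source) pairs.

    Each document becomes two lines: an action line and a source line.
    The body must end with a trailing newline.
    """
    lines = []
    for doc_id, source in docs:
        # Hypothetical index/type names; the ID is assigned by the client.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


# Made-up sample data for illustration only.
sample_docs = [("1", {"title": "first"}), ("2", {"title": "second"})]
body = build_bulk_body(sample_docs, "my_index", "my_type")
print(body)
```

The resulting string would be sent as the body of a POST to the `_bulk` endpoint; batch size and concurrency are separate tuning choices that this sketch does not cover.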


(Mark Walkom) #2

Are you running the AWS ES service?

(Christian Dahlqvist) #3

The main difference seems to be the amount of data in each cluster. As you have a single index, are you assigning your own document IDs and/or performing document updates? Does the shard size differ between the environments?

(Mahendra) #4

The shard size is the same in both environments. We assign our own document IDs in both environments.

(Christian Dahlqvist) #5

If the shard size is the same, does that mean that you have 4-5 times more shards in the production environment, as the number of documents is considerably larger? How many shards do you have? What is the average shard size?

(Mahendra) #6

I meant to say that the number of shards is the same.

In both PERF and PROD, the index has 50 shards.

In production, each shard is currently at 6 GB.
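
As a rough check of what these figures imply, the arithmetic can be sketched as below. It assumes the 50 shards are evenly filled and that the 6 GB figure applies to each of them; whether replicas are included in the count is not stated in the thread, so they are ignored here.

```python
# Figures quoted in the thread; an even spread across shards is an assumption.
SHARDS = 50
docs = {"PERF": 170_000_000, "PROD": 800_000_000}

# Approximate documents per shard in each environment.
docs_per_shard = {env: count // SHARDS for env, count in docs.items()}

# How much more data each PROD shard holds compared with PERF.
ratio = docs["PROD"] / docs["PERF"]

# Approximate PROD index size at 6 GB per shard (replicas not considered).
prod_total_gb = SHARDS * 6

print(docs_per_shard)   # {'PERF': 3400000, 'PROD': 16000000}
print(round(ratio, 1))  # 4.7
print(prod_total_gb)    # 300
```

So with the same shard count, each PROD shard holds roughly 4.7 times as many documents as a PERF shard.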

I found this using the command GET /_cat/shards?v
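
To answer the average shard size question from that output without reading it by eye, a small parser along these lines could be used. This is only a sketch: it assumes the default column names (`index`, `shard`, `prirep`, `state`, `docs`, `store`, ...) printed by the `?v` header, and the sample text, index name `myidx`, and node names are invented for illustration.

```python
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def size_to_bytes(value):
    """Convert a human-readable size such as '6gb' or '512mb' to bytes."""
    value = value.lower()
    # Check two-letter suffixes before the bare 'b' suffix.
    for suffix in ("kb", "mb", "gb", "tb", "b"):
        if value.endswith(suffix):
            return float(value[: -len(suffix)]) * UNITS[suffix]
    return float(value)


def parse_cat_shards(text):
    """Parse the whitespace-separated table printed by GET /_cat/shards?v."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        cells = ln.split()
        # Unassigned shards print fewer columns; this sketch skips them.
        if len(cells) < len(header):
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def primary_summary(rows):
    """Return (primary shard count, average docs, average store size in bytes)."""
    primaries = [r for r in rows if r["prirep"] == "p"]
    count = len(primaries)
    avg_docs = sum(int(r["docs"]) for r in primaries) / count
    avg_store = sum(size_to_bytes(r["store"]) for r in primaries) / count
    return count, avg_docs, avg_store


# Invented sample output, not taken from the poster's cluster.
sample = """
index shard prirep state      docs     store ip       node
myidx 0     p      STARTED    16000000 6gb   10.0.0.1 node-1
myidx 1     p      STARTED    16000000 6gb   10.0.0.1 node-1
myidx 0     r      UNASSIGNED
"""

summary = primary_summary(parse_cat_shards(sample))
print(summary)
```

Running the same functions against the real output from both environments would show directly whether shard sizes and document counts per shard differ between PERF and PROD.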

(Christian Dahlqvist) #7

When you say that performance is poor, are you referring to indexing and/or query performance? What performance are you seeing?

(Mahendra) #8

Both indexing and querying.

Actually, querying is fine on PROD as well when no indexing is happening.

When indexing is running in parallel, query responses are slow.
