EC2 Perfomance problems, advice needed

Another thing that really worries me is the difference in system load
between the nodes, when i'm running the tests, it's always the master
node, that has a spike on system load.
Even if the requests are sent to all of the nodes.

Regards,
Ariel

On Jul 7, 2:07 pm, Ariel Amato ariel.amato...@gmail.com wrote:

Hi again,
Here's a sample query just the age search like you
suggested:Example query · GitHub.
Right now i'm running a 3 server cluster, with that config
that's why I tested with 3 replicas.
I've put the configuration for the thread pool, but
still.... the same problems happen.

Thanks for your assistance, and any extra help would be appreciated!

Regards,
Ariel

On Jul 7, 11:32 am, Shay Banon shay.ba...@elasticsearch.com wrote:

Hey,

What I would like to see are those queries that you execute, and possibly the script.

A note on shards and replicas. You can increase teh number of replicas to a high value, but as long as you don't have enough nodes to allocate them, it does not matter. For example, if you have 3 shards and 1 replica, and 2 nodes, and then you increase the number of replicas, it would not have affect unless you bring more nodes into the picture.

Another option that you can try and play with is the search thread pool. Can you try and configure it to be a fixed sized thread pool? The configuration is:

threadpool.search.type: fixed
threadpool.search.size: 10

-shay.banon

On Thursday, July 7, 2011 at 2:27 AM, Ariel Amato wrote:

Hi again,

I created the cluster you suggested, but with no luck. Same results as
before, an extremely high load on one of the nodes and almost nothing
on the other and a lot of timeouts.
After that I added a new node to the cluster, with a configuration
like this:

  • 3 shards / 1 replica, at first, and then
  • 3 shards / 3 replicas

Results were similar, then I removed almost every term from the query
(leaving just 2 terms, which are simple integer value properties), and
ran the stress test again, with a big improvement on the amount of
requests per second, with just a few timeouts, but there were still
timeouts.
But again, system load skyrocketed, here are some captures from the
servers:

Master node:
top:Main node, top · GitHub
vmstat:vmstat master · GitHub

Slave node 1:
top:top slave · GitHub
vmstat;vmstat slave · GitHub

Slave node 2:
top:Top slave 2 · GitHub

Total GC time (ParNew + CMS) on all nodes does not exceed 300ms.

We're using the Sun jvm, Java HotSpot(TM) 64-Bit Server VM, version
19.1-b02, Java version 1.6.0_24 on all nodes and the OS is Ubuntu
10.10.

Would a thread dump of the nodes help?
Right now i'm using the dynamic mapping, except for a geo location
point, would it help to create the mappings before indexing? At the
moment i'm not seeing any problems at indexing time.
I'm actually running out of questions to make, tomorrow i'll work a
little on replicating the queries with your suggestion, but even
without the search query the system load is way too high.

Thank you in advance for any help you can give me on this.

Best regards
Ariel

On Jul 6, 4:33 pm, Shay Banon <shay.ba...@elasticsearch.com (http://elasticsearch.com)> wrote:

Heya,

Lets start with the query, try and replace the range query you add as a should clause with a ConstantScore one wrapping a range filter (and set the boost on it). This will work only on memory and should speed things up.

The suggested filtering will use more memory (you should see in the stats the filter cache being used), but not by much, since age is easily cacheable.

I sugget you start with 2 nodes, and test, lets see what we get. Improving search perf is simple since once if you don't have full utilization of nodes (for example, 2 shard with 1 replica will max out on 4 nodes), you can dynamically increase the number of replicas.

-shay.banon

On Wednesday, July 6, 2011 at 8:43 PM, Ariel Amato wrote:

Hey,

Here they are:
Stats when idle:When idle · GitHub
Stats under load:Under load · GitHub
Top:System load during test · GitHub

Sample code for generating the query:Sample query · GitHub

What's the amount of m1.xlarge servers you suggest?

Regards,
Ariel

On Jul 6, 1:52 pm, Shay Banon <shay.ba...@elasticsearch.com (http://elasticsearch.com)> wrote:

Hey,

I would suggest using m1.xlarge, and allocate to ES ~5gb. When you run the test, can you take a node.stats sample, and gist it?

As for your queries and using filter, it feels like filters will really help here. Can you gist a sample query so we can work on how it will look like?

-shay.banon

On Wednesday, July 6, 2011 at 7:46 PM, Ariel Amato wrote:

Hi Shay, I am allocating 5gb of ram for ES on each server. When it's
not under load, heap usage is really low ~350mb with the 5gb
allocated.
Under load, the amount threads increased to around 150 (when I stopped
the test).
The problem with the uneven load between the master and slave node
seems to be fixed if I specify a routing key when indexing.
The main cause for the load seems to be cpu usage, here is a "top"
output:

top - 16:24:07 up 17:45, 2 users, load average: 31.69, 20.89, 11.02
Tasks: 158 total, 1 running, 157 sleeping, 0 stopped, 0 zombie
Cpu(s): 88.3%us, 3.3%sy, 0.0%ni, 2.7%id, 0.0%wa, 0.0%hi,
0.1%si, 5.6%st
Mem: 7132040k total, 7042560k used, 89480k free, 178336k
buffers
Swap: 0k total, 0k used, 0k free, 1078180k
cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
28055 root 20 0 5689m 5.3g 16m S 789 77.7
21:45.30
java
221 root 20 0 0 0 0 S
0 0.0 0:28.83
kjournald
28314 root 20 0 19276 1352 968 R 0
0.0 0:00.06
top
1 root 20 0 23860 1364 656 S 0
0.0 0:00.83
init

Regarding the queries, for the range terms, I need to downgrade
gracefully the quality of the results that do not match.
For example, when searching for an age field, I need to specify a
range of 25 to 31, so the users on that range get the greatest score,
but I also need to include on the results users with 32 with a lower
score, and 35 with an even lower score. Please let me know if there is
a way to do that with filters.

I'm in the process of expanding the cluster to 4 m1.large instances,
with a 2 shards / 2 replicas configuration.
Please let me know what you think about that.

Thanks in advance
Regards,

Ariel

On Jul 6, 1:07 pm, Shay Banon <shay.ba...@elasticsearch.com (http://elasticsearch.com)> wrote:

On ec2, I would suggest going with m1.xlarge, as more RAM means more file system cache. Also, having more shards on 2 nodes can slow down search, as it needs to execute across all of them, maybe you can do with 2-4 shards (with 1 replica). Increasing the number of replicas will not help, since you only run on 2 nodes (and you can always increase it dynamically later if oyu add more nodes).

Also, in terms of queries, maybe some of your range queries, especially the numeric ones, can benefit from being used as filters instead of queries (so they can be cached).

How much memory are you allocating to ES?

Master nodes play no part when searching, so the fact that it uses more resources does not imply because its the master node.

Can you tell where the load is coming from? Is it CPU, IO?

On Wednesday, July 6, 2011 at 7:33 AM, jp.lora...@cfyar.com (mailto:jp.lora...@cfyar.com (http://cfyar.com)) wrote:

How much min and max ram for ES -- you only mention the image's RAM. Is there anything else running on the box?

Also a high shard/replica ratio like you have increases indexing (esp. bulk) performance, but the inverse ratio will boost search speed, so enlarging the amount of replicas will speed up the system (have to be more replicas than shards though for optimal results).

I would try with smaller machines -- less cores, less ram, 2 shards, 8 replicas, 8 nodes total. Shay suggests in the README on github a 1/10 ratio even.

Espero que ayude,
JP

-------- Original Message --------
Subject: EC2 Perfomance problems, advice needed
From: Ariel Amato <ariel.amato...@gmail.com (http://ariel.amato...@gmail.com (http://gmail.com))>
Date: Tue, July 05, 2011 11:32 pm
To: users <us...@elasticsearch.com (mailto:us...@elasticsearch.com (http://elasticsearch.com))>

Hi,
I have been trying to use ES for a search service for my site,
which is running on EC2 servers, and I am having a lot of performance
problems.
My data set is roughly 1.2M records of 30 fields each, mostly
Integers of Long numbers, a few short Strings and a Geo location
point.
Most of the search queries use a native custom score script.
I need to query the ES cluster with around 300 concurrent
requests, and I roughly need a performance of 40 requests per second.
I have got a staging

...

read more »