Unbalanced load on a 2-node cluster of identical machines

I have set up a 2-node cluster, but one node is showing high CPU usage and a
high GC count while the other looks normal.

Let me describe the basic details of the setup:

  • 2 separate, "identical" machines, each with 32 GB RAM
  • each machine hosts one Elasticsearch instance with the same configuration;
    the nodes discover each other via multicast
  • index setup is 5 shards + 1 replica (i.e. the default)

I have another Java process, running on a separate machine, which keeps
submitting bulk index requests to this 2-node cluster (the indexing load is
not high, just around 1000~2000 requests every 5 minutes).
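
Concretely, one bulk submission looks roughly like this (sketched in Python against the HTTP _bulk API; the real client is Java, and the host, index, and type names here are placeholders, not the actual setup):

```python
import json
import requests

ES_BULK_URL = "http://localhost:9200/_bulk"   # placeholder host

def bulk_index(docs):
    # The _bulk body is newline-delimited JSON: an action line, then a source line.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": "myindex", "_type": "mytype"}}))
        lines.append(json.dumps(doc))
    body = "\n".join(lines) + "\n"            # trailing newline is required
    resp = requests.post(ES_BULK_URL, data=body,
                         headers={"Content-Type": "application/x-ndjson"})  # newer versions require ndjson
    resp.raise_for_status()
    return resp.json()

# e.g. a batch on the order of 100-2000 documents every few minutes
result = bulk_index([{"field": "value %d" % i} for i in range(100)])
print(result["took"], result.get("errors"))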

There are no search requests against this cluster at all; it is only handling
bulk index requests.

By "elasticsearch head", node 1 are hosting all the "active" 5 shards while
node 2 are hosting all the "non-active" 5 shards. Now the index size is
about 4.3gb/docs: 233774

By "elasticsearch bigdesk", we observe that the node 2 are having much
higher cpu usage and the GC numbers are much higher and node 2 are showing
"higher" loading. I have attached the screen shots of the 2 nodes (1_.jpg
refer to node 1, 2_
.jpg refer to node 2)

This seems to make our bulk index requests "unstable": sometimes indexing 1000
records finishes within 1 second, but sometimes it takes over 5 minutes, and
the variation appears random and independent of the number of records in the
bulk request.

Another thing that seems strange to me is that the "doc counts" reported for
the two nodes are not the same. Is that normal?
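
For checking this, the per-shard doc counts can be compared with the index stats API. A hedged sketch: the ?level=shards parameter and the response layout below are taken from recent Elasticsearch versions and "myindex" is a placeholder, so treat it as illustrative only (a primary and its replica can briefly report different counts while refreshes and merges catch up, so a small difference on its own may not mean much):

```python
import requests

stats = requests.get("http://localhost:9200/myindex/_stats?level=shards").json()
shards = stats["indices"]["myindex"]["shards"]
for shard_id, copies in sorted(shards.items()):
    for copy in copies:
        role = "primary" if copy["routing"]["primary"] else "replica"
        # which node holds this copy, and how many live docs it reports
        print(shard_id, role, copy["routing"]["node"], copy["docs"]["count"])
```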

Do you have any hints for troubleshooting the cause of the above?

Thanks,
Wing

I observe this "higher cpu" pattern for the 2 nodes. (say "master
node" is the one holding on the active shards, "slave node" is the one
holding all the replicated shards)

  1. when an index request arrives, the master node's CPU shows no real
    change, but the slave node's CPU usage jumps from under 1% to about 10%

  2. even after the index request has finished and returned to the client,
    the slave node's CPU usage stays elevated for a while, maybe 1 to 2
    minutes; it varies

  3. sometimes this CPU spike does not settle before the next index request
    arrives, and the slave node's CPU usage climbs into the 10-19% range; at
    that point I can see that an index request (just a bulk index of 100
    items) needs several minutes to complete.

Index requests seem to behave better if I shut down the slave node, i.e. only
the master node does the indexing. But some time later, an index request still
needs several minutes to finish even though it only indexes 100 items.
Normally a 100-item index request completes well below 1 second.

My index is around 110.7 GB with 2,725,536 docs. I use a bulk request for
these 100 items and turn off the refresh interval when the bulk request
starts.
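
For reference, the refresh toggling looks roughly like this through the index settings API (a sketch; "myindex" and the restored "1s" value are assumptions, and the bulk submission itself is omitted):

```python
import requests

SETTINGS_URL = "http://localhost:9200/myindex/_settings"

def set_refresh(interval):
    # Update index.refresh_interval; "-1" disables the periodic refresh.
    resp = requests.put(SETTINGS_URL, json={"index": {"refresh_interval": interval}})
    resp.raise_for_status()

set_refresh("-1")              # turn refresh off before the bulk run
try:
    pass                       # ... submit the bulk requests here ...
finally:
    set_refresh("1s")          # restore a normal interval afterwards
```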

So, is indexing performance random, or what factors does it depend on?

Wing

I think I have found the cause of the long indexing time.

I turned on debug logging on the ES servers and found that they keep printing
out the "mapping", which looks like the cause of the problem.

Here is the mapping:

(be careful, it is very large)

I have 2 "dynamic" fields: attributeAssociatesMap and gridAssociatesMap
which are java map

these 2 fields in fact will grow depending on the input data.

For example, for attributeAssociatesMap, if an entry contains different
"attribute id" values, it generates multiple keys, e.g. 97/name/1, 98/name/1,
etc. The same logic applies to gridAssociatesMap.

Different entries carry their own sets of attribute ids and grid ids, so the
mapping keeps changing and growing continuously.
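
One way to verify this is to fetch the mapping periodically and watch the number of mapped fields grow. A rough sketch (the counting walk is illustrative, "myindex" is a placeholder, and the exact _mapping response layout varies slightly between Elasticsearch versions):

```python
import requests

def count_leaf_fields(node):
    # Walk the mapping JSON and count leaf fields under every "properties" block.
    total = 0
    if isinstance(node, dict):
        props = node.get("properties")
        if isinstance(props, dict):
            for child in props.values():
                sub = count_leaf_fields(child)
                total += sub if sub else 1   # objects count their children, plain fields count 1
        else:
            for child in node.values():
                total += count_leaf_fields(child)
    return total

mapping = requests.get("http://localhost:9200/myindex/_mapping").json()
print(count_leaf_fields(mapping), "mapped leaf fields")
```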

From observing the log, I guess the following happens during indexing:

  1. the client submits an index request to an ES node
  2. if the index request introduces new fields, the mapping of the
    index changes
  3. the changed mapping "updates" the existing records in the index, which
    is why my index requests take longer and longer (from several minutes to
    over 30 minutes) as the index size grows

Do you think this map mapping is the cause of the problem?

The reason I use a map is that I can index the field values directly, without
having to aggregate all the field values into one field with a custom
separator (and so without needing to escape the separator either).

If this map mapping is not appropriate, is there any other suggestion?

Thanks,
Wing

There are probably more efficient ways to do this. I suggest gisting a few
example docs and explaining how you want to be able to query them. (Also,
perhaps explain what the 27/1/1 numbers actually mean, and whether those
values are relevant at all.)

clint

Both maps are just for "searching" and "retrieving" values; we do not need to
facet or filter on these 2 maps.

And I simply map each map entry to a row of my database table, so the key of
each map entry has "placeholders":

{gridId}/{langId}/{displaySeq}/label/{labelId}

which resolves to something like: 8/2/1/label/8

For this case, would it be better to map the field as a single "JSON string"
and read the field back from the _source field directly?

Also, does the size of the mapping (or the number of fields in a document)
affect indexing performance?

I ask because we have some "dynamic" fields that do need facets and filters, e.g.

I have a "price" field for different region and I map this "price" field as

region/{regionId}/price

For now, some entries have regionId 1, 2, 3, while other entries have regionId
10, 11, 12, so they resolve to different sets of price fields for different
entries:

entry 1:
region/1/price
region/2/price
region/3/price

entry 2:
region/10/price
region/11/price
region/12/price

But I expect we can add new regions later, and so the number of resolved
fields will keep increasing. Is this kind of dynamic field generation
appropriate?
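
For comparison, an alternative that keeps the mapping fixed no matter how many regions are added later is to model the prices as a nested array and filter/facet on regions.region_id and regions.price. This is only an illustrative sketch; the field names and endpoint paths are assumptions and vary by Elasticsearch version:

```python
import requests

BASE = "http://localhost:9200/myindex"    # placeholder index

# Fixed mapping: one "regions" nested field instead of one field per region id.
mapping = {
    "properties": {
        "regions": {
            "type": "nested",
            "properties": {
                "region_id": {"type": "integer"},
                "price":     {"type": "double"},
            },
        }
    }
}
requests.put(BASE + "/_mapping", json=mapping)

# An entry can then carry any number of regions without adding new fields.
doc = {
    "regions": [
        {"region_id": 1, "price": 9.99},
        {"region_id": 2, "price": 10.49},
    ]
}
requests.post(BASE + "/_doc", json=doc)
```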

Thanks,
Wing

Just a quick update:

after avoiding the generation of so many fields in a document (instead I store
those fields as a single JSON string and let the reader of the result parse
the JSON string back into objects), the indexing speed has returned to normal.
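
For completeness, the client-side change boils down to something like this (sketched in Python with placeholder names; the real client is Java, and the bulk submission itself is as before):

```python
import json
import requests

def to_document(entry):
    # Collapse the open-ended maps into single string fields so the index
    # mapping never gains new fields when new keys appear.
    doc = dict(entry)
    doc["attributeAssociatesMap"] = json.dumps(entry["attributeAssociatesMap"])
    doc["gridAssociatesMap"] = json.dumps(entry["gridAssociatesMap"])
    return doc

def from_source(source):
    # Readers parse the strings back into maps when loading _source.
    out = dict(source)
    out["attributeAssociatesMap"] = json.loads(source["attributeAssociatesMap"])
    out["gridAssociatesMap"] = json.loads(source["gridAssociatesMap"])
    return out

entry = {
    "name": "example entry",
    "attributeAssociatesMap": {"97/name/1": "foo", "98/name/1": "bar"},
    "gridAssociatesMap": {"8/2/1/label/8": "baz"},
}
requests.post("http://localhost:9200/myindex/_doc", json=to_document(entry))
```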

Thanks,
Wing
