I've been dealing with some ES issues on and off, and I think I've resolved a LOT of them, but there's still one thing that bugs me.
My structure:
40gb index
two nodes, ten shards, one replica.
each node is a quad-core machine with 32gb of ram. min=max=16gb heap.
our data is extremely uniform and extremely small, but it comes in at a high rate of around 100 index actions per second.
ES is at 18.6 (I know, we'll be updating this week)
currently, all of the shards are marked as primary on node one.
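For reference, a minimal sketch (not part of the original post) of how the primary/replica layout can be checked from the cluster state API. It assumes a 0.18/0.19-era cluster reachable on localhost:9200, the Python requests library, and the cluster-state JSON layout of that era; adjust for your setup.

import requests

# fetch cluster state and map node ids to readable names
state = requests.get("http://localhost:9200/_cluster/state").json()
node_names = {nid: n["name"] for nid, n in state["nodes"].items()}

# walk the routing table and print which node holds each primary/replica copy
for index, data in state["routing_table"]["indices"].items():
    for shard_id, copies in data["shards"].items():
        for copy in copies:
            kind = "primary" if copy["primary"] else "replica"
            print(index, shard_id, kind, node_names.get(copy["node"], copy["node"]))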
node 1:
uptime is 9 hours.
bigdesk shows 11399 garbage collection actions and 2000 merges.
cpu has been uniform at under 5% pretty much forever.
memory is at used=30gb, actual=14gb
node 2:
uptime is 3 hours.
bigdesk shows 104000 gc actions and 800 merges.
cpu has been uniform at 80-90% for pretty much forever.
memory is at used=21gb, actual=13gb
note:
node 1 is also running a few python scripts (server density, mongo
sync tool, geckoboard agent)
my questions:
why is cpu so high on node 2?
I was thinking that the 'primary' shards are the ones that do the merging and the 'slaves' just receive the complete index, but in both cases the lucene indices have to do a merge. not so much a question as making sure I'm understanding everything.
my data is routed, which in theory means that some shards will be doing a lot more work than others. does ES account for 'work' and move shards from machine to machine, or does it only account for shard size?
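As a side note on the routing question, here is a hypothetical sketch of routed indexing and searching (index name, type, and routing values are made up, not from the thread). Every document that shares a routing value lands on the same shard, which is exactly why routed data can load some shards much more heavily than others.

import requests

NODE = "http://node1:9200"   # placeholder host

# index with an explicit routing value; all docs routed on "customer_42"
# end up on the same shard
requests.post(NODE + "/myindex/doc?routing=customer_42",
              json={"customer": "customer_42", "message": "hello"})

# search only the shard(s) that routing value maps to
r = requests.post(NODE + "/myindex/_search?routing=customer_42",
                  json={"query": {"term": {"customer": "customer_42"}}})
print(r.json()["hits"]["total"])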
I probably do not have answers to all your questions, but let me try to address some of them.
As for the high CPU: could this be caused by the fact that you connect the ES client only to the second node? That would mean this node needs to handle query execution as well. In other words, you did not share with us how you query the cluster; it could be one of the reasons.
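A minimal sketch of what spreading the client over both nodes could look like, assuming the client talks plain HTTP to ES (host names and index/type names are placeholders):

import itertools
import requests

NODES = itertools.cycle(["http://node1:9200", "http://node2:9200"])

def index_doc(index, doc_type, body):
    # alternate between the two nodes on every request instead of pinning
    # all traffic to a single node
    base = next(NODES)
    return requests.post("%s/%s/%s" % (base, index, doc_type), json=body)

index_doc("myindex", "doc", {"message": "spread the load"})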
Because the number of replicas is 1 and you have 2 nodes, it is strange that one node has a high CPU load compared to the other... Is the OS the same on both? Are you running in a virtualized / AWS env? Is the JVM the same on both? Can you gist a couple of jstack results from the node with the high load? (jstack: http://docs.oracle.com/javase/1.5.0/docs/tooldocs/share/jstack.html)
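For what it's worth, a rough helper (a sketch, not from the thread) for grabbing a few jstack dumps from the busy node so they can be gisted; it assumes the JDK's jps and jstack tools are on that machine's PATH:

import subprocess
import time

# find the elasticsearch JVM pid via jps
jps = subprocess.check_output(["jps", "-l"]).decode()
pid = next(line.split()[0] for line in jps.splitlines()
           if "elasticsearch" in line.lower())

# take a few samples spaced a bit apart and write them to files
for i in range(3):
    dump = subprocess.check_output(["jstack", pid]).decode()
    with open("jstack-%d.txt" % i, "w") as f:
        f.write(dump)
    time.sleep(10)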
these are two RS machines, identical in every way, except that the busy one (node 2) holds all the non-primary shards. all of the queries go via node 1, the one with the low cpu. node 2 (high cpu) is never actually queried, which makes this even weirder.
if replication is basically just mirroring of requests, then the merge
activity should be identical on both, no?
jstack with no special flags:
On Thu, Apr 5, 2012 at 8:05 AM, Oren Mazor oren@wildbit.com wrote:
these are two RS machines, identical in every way, except that the busy one (node 2) holds all the non-primary shards. all of the queries go via node 1, the one with the low cpu. node 2 (high cpu) is never actually queried, which makes this even weirder.
Even if you don't send explicit search requests to them, they are still being searched, because elasticsearch will distribute the data between them.
if replication is basically just mirroring of requests, then the merge
activity should be identical on both, no?
First, note that the merge stats only cover the time since the node was started. And no, even if both are started at the same time, they will differ because they are not completely in sync (doc-wise they are, but the timing of when things kick off, like merges, might differ).
Can you gist another set of jstack? Can you also include one from node1?
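A small sketch for pulling per-node stats so the GC and merge counters can be compared with each node's own uptime in mind; the endpoint path and JSON layout are assumed from 0.19-era docs and may need adjusting for the version in use:

import json
import requests

stats = requests.get("http://node1:9200/_cluster/nodes/stats").json()
for node_id, node in stats["nodes"].items():
    print("====", node.get("name", node_id))
    # eyeball the jvm (GC counts) and indices (merge totals) sections;
    # remember each node only counts from its own start time
    sections = {k: node[k] for k in ("jvm", "indices") if k in node}
    print(json.dumps(sections, indent=2))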
we migrated to 19.2 and I'm seeing the same thing. the node that takes the "point" of the requests has its cpu pegged again.
but with the new version of ES comes a new version of bigdesk (I can't love this tool enough), and in this one I noticed that that node has over 100k http connections, while the second node has only 20. going to look into balancing this out first.
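If the 100k connections come from opening a fresh HTTP connection per index action, keeping one pooled keep-alive session per node and round-robining between them might help; a minimal sketch (host and index names are placeholders):

import itertools
import requests

# one pooled keep-alive session per node, reused for every request
backends = itertools.cycle([
    (requests.Session(), "http://node1:9200"),
    (requests.Session(), "http://node2:9200"),
])

def index_doc(body):
    session, base = next(backends)
    return session.post(base + "/myindex/doc", json=body)

index_doc({"message": "reuse pooled connections instead of opening new ones"})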
I've noticed another thing: the nodes have different data. is this a function of re-play of index/delete actions? is there a way to modify the frequency at which that happens? it looks like it's a day or two behind…
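One way to quantify "a day or two behind" might be to compare the document count seen by the primaries with the count seen by whatever copies are local to each node; a sketch, assuming the preference search parameter is supported by the version in use and "myindex" stands in for the real index name:

import requests

def count(base, preference):
    # search_type=count returns only the hit count, no documents
    url = base + "/myindex/_search?search_type=count&preference=" + preference
    r = requests.post(url, json={"query": {"match_all": {}}})
    return r.json()["hits"]["total"]

print("primaries:  ", count("http://node1:9200", "_primary"))
print("node1 local:", count("http://node1:9200", "_local"))
print("node2 local:", count("http://node2:9200", "_local"))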
Let's say I index a new document every minute. with replicas turned on, it means that one of my nodes is going to be missing an entire day's worth of indexing activity.
I'm currently running without replicas (but with multiple nodes) and it's running great. as soon as I turn on replicas, I can get two different results for the same search query.
let me know what debug info I can get for you on this.
Replication is synchronous: once you index a document, it will also be indexed on the replica. Are you playing with the refresh interval by any chance?
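To follow up on the refresh question, a minimal sketch for checking the index refresh interval, setting it explicitly, and forcing a refresh so both copies expose the same documents to search (index name is a placeholder; endpoints as in 0.19-era ES):

import requests

BASE = "http://node1:9200/myindex"

# check current index settings (refresh_interval defaults to 1s unless changed)
print(requests.get(BASE + "/_settings").json())

# set the refresh interval explicitly via the live settings update API
requests.put(BASE + "/_settings", json={"index": {"refresh_interval": "1s"}})

# force an immediate refresh so newly indexed docs become searchable
requests.post(BASE + "/_refresh")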