Offloading Indexing CPU load

Debayan_Banerjee · April 10, 2015, 5:56am

Hi,

We have an ES cluster setup with 5 nodes. There is one index divided into 5
shards, and each shard has 4 replicas. Hence all 5 nodes have 5 shards
each, and all of them have the whole index.
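
The shard layout described above can be checked with a little arithmetic. This is only a sketch: the variable names are illustrative, and the numbers are the ones stated in the post.

```python
# Shard-copy arithmetic for the cluster described in the post.
nodes = 5
primary_shards = 5
replicas_per_shard = 4

# Each shard exists as one primary plus its replicas.
copies_per_shard = 1 + replicas_per_shard
total_shard_copies = primary_shards * copies_per_shard
shard_copies_per_node = total_shard_copies // nodes

# Elasticsearch does not allocate two copies of the same shard to one node,
# so when copies_per_shard equals the node count, every node holds one copy
# of every shard, i.e. the whole index.
holds_full_index = copies_per_shard == nodes

print(total_shard_copies, shard_copies_per_node, holds_full_index)
```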

From what I have seen, indexing is a CPU-intensive operation. I would like
the indexing to happen on only one machine (which I would not put behind
my production load balancer that serves read queries) and then have the
index replicated to the other machines.

Is this possible? Can I limit indexing to just one machine and specify
which machine that should be?

--

Debayan Banerjee
devOps engineer

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAOvawCs_E%2BwYUU-o1FMAFKsBckgh1eV%2BsA0Av1af4zJjwy8HXg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

warkolm · April 10, 2015, 6:31am

You need to have all the primaries on the same node for this to happen.
I'd also suggest you reduce your replica count. Even though replicas are
handled in parallel, you have an excessive number of them, and each one
adds overhead.
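
For reference, the replica count of an existing index can be changed through the index settings API (`number_of_replicas`). A minimal sketch that only builds the request, assuming a node at `http://localhost:9200` and a hypothetical index name `my_index`; the target value of 1 is an example, not a figure recommended in this thread:

```python
import json
from urllib.request import Request


def replica_settings_request(base_url, index, replicas):
    """Build (but do not send) a PUT request that sets number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Host, index name, and replica count are assumptions for illustration.
req = replica_settings_request("http://localhost:9200", "my_index", 1)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to the cluster. Try this against a test cluster first, since dropping replicas removes shard copies from nodes.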

Debayan_Banerjee · April 10, 2015, 6:37am

Hi Mark,

Is there a way to have all primaries on one machine?

As for having too many replicas, I currently have a setup where any machine
can serve any read query without having to go to another machine to look
for missing shards. My index size is 4 GB and stays in RAM.
My data size on disk is 1.1 GB. Do you still recommend reducing replicas?
Would that not add overhead, since read queries would sometimes have to
wait for network IO to fetch records from other machines?
