Elasticsearch version (bin/elasticsearch --version): 6.3.0
Plugins installed: [ingest-geoip]
JVM version (java -version): openjdk version "1.8.0_171"
OS version (uname -a if on a Unix-like system): Linux elk-ela-1f 4.15.0-22-generic #24-Ubuntu SMP Wed May 16 12:15:17 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:
I have set up 4 nodes on one machine, elk-ela-1f, using elastic/ansible-elasticsearch.
Each node sets cluster.routing.allocation.same_shard.host: true to prevent copies of the same shard from being allocated on the same host. But when I check the shard allocation with head, I can see that copies of the same shard have been allocated on the same host, on different nodes.
Below is the response for GET _nodes/elk-ela-1f-data-1,elk-ela-1f-data-2,elk-ela-1f-data-3,elk-ela-1f-data-4/stats, to confirm that the nodes share the same host: https://pastebin.com/TtTYBE3D
And here is the shard allocation in written form:
ngx-2018.06.20 4 p STARTED 29733 16.9mb 10.10.10.45 elk-ela-1f-data-2
ngx-2018.06.20 4 r STARTED 29740 16.9mb 10.10.10.43 elk-ela-1d-data-2
ngx-2018.06.20 7 p STARTED 29633 16.8mb 10.10.10.44 elk-ela-1e-data-1
ngx-2018.06.20 7 r STARTED 29615 16.8mb 10.10.10.46 elk-ela-1g-data-2
ngx-2018.06.20 5 p STARTED 29749 17mb 10.10.10.47 elk-ela-1h-data-1
ngx-2018.06.20 5 r STARTED 29762 16.9mb 10.10.10.44 elk-ela-1e-data-4
ngx-2018.06.20 3 p STARTED 29641 17mb 10.10.10.47 elk-ela-1h-data-3
ngx-2018.06.20 3 r STARTED 29638 16.8mb 10.10.10.45 elk-ela-1f-data-1
ngx-2018.06.20 9 p STARTED 29728 16.9mb 10.10.10.43 elk-ela-1d-data-4
ngx-2018.06.20 9 r STARTED 29726 17mb 10.10.10.44 elk-ela-1e-data-3
ngx-2018.06.20 6 p STARTED 29655 16.9mb 10.10.10.47 elk-ela-1h-data-2
ngx-2018.06.20 6 r STARTED 29650 17mb 10.10.10.43 elk-ela-1d-data-1
ngx-2018.06.20 2 p STARTED 29596 17mb 10.10.10.43 elk-ela-1d-data-3
ngx-2018.06.20 2 r STARTED 29575 17mb 10.10.10.44 elk-ela-1e-data-2
ngx-2018.06.20 8 p STARTED 29748 17mb 10.10.10.46 elk-ela-1g-data-2
ngx-2018.06.20 8 r STARTED 29748 33.9mb 10.10.10.46 elk-ela-1g-data-3
ngx-2018.06.20 1 p STARTED 29612 17mb 10.10.10.45 elk-ela-1f-data-4
ngx-2018.06.20 1 r STARTED 29609 17mb 10.10.10.45 elk-ela-1f-data-3
ngx-2018.06.20 0 p STARTED 29312 16.9mb 10.10.10.46 elk-ela-1g-data-1
ngx-2018.06.20 0 r STARTED 29313 16.7mb 10.10.10.44 elk-ela-1e-data-3
As you can see, the following shards are allocated on the same host:
ngx-2018.06.20 1 p STARTED 29612 17mb 10.10.10.45 elk-ela-1f-data-4
ngx-2018.06.20 1 r STARTED 29609 17mb 10.10.10.45 elk-ela-1f-data-3
ngx-2018.06.20 8 p STARTED 29748 17mb 10.10.10.46 elk-ela-1g-data-2
ngx-2018.06.20 8 r STARTED 29748 33.9mb 10.10.10.46 elk-ela-1g-data-3
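For anyone who wants to check a listing like this without reading it by eye, here is a minimal sketch in Python. It assumes the text has the eight whitespace-separated columns shown above (index, shard, p/r, state, docs, store, IP, node), and it treats the IP column as the host identity, which is a simplification. The function name find_same_host_copies is made up for this example.

```python
from collections import defaultdict


def find_same_host_copies(cat_shards_text):
    """Group shard copies by (index, shard, ip) and keep groups with more than one copy.

    Assumes eight whitespace-separated columns per line, as in the listing above.
    """
    by_shard_and_ip = defaultdict(list)
    for line in cat_shards_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip unassigned shards or malformed lines
        index, shard, prirep, _state, _docs, _store, ip, node = fields[:8]
        by_shard_and_ip[(index, shard, ip)].append((prirep, node))
    # Only groups with two or more copies on the same IP are collisions.
    return {key: copies for key, copies in by_shard_and_ip.items() if len(copies) > 1}


# Four lines taken from the listing in this issue.
sample = """\
ngx-2018.06.20 1 p STARTED 29612 17mb 10.10.10.45 elk-ela-1f-data-4
ngx-2018.06.20 1 r STARTED 29609 17mb 10.10.10.45 elk-ela-1f-data-3
ngx-2018.06.20 4 p STARTED 29733 16.9mb 10.10.10.45 elk-ela-1f-data-2
ngx-2018.06.20 4 r STARTED 29740 16.9mb 10.10.10.43 elk-ela-1d-data-2
"""

collisions = find_same_host_copies(sample)
for (index, shard, ip), copies in collisions.items():
    print(index, shard, ip, copies)
```

Shard 4 is not reported because its two copies sit on different IPs; shard 1 is reported because both copies are on 10.10.10.45.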
Looking at GET _nodes - it looks like all 4 nodes are data and ingest. How is that possible? I thought you had to have at least one master node - or am I misunderstanding/reading something incorrectly, or is there additional information missing?
Well, that's very embarrassing. It turns out that in my playbook all the data nodes had the necessary configuration for same_shard allocation, but the master didn't. I shouldn't have trusted my memory and should have double-checked the master configs. Very sorry for such a dumb debugging experience, and a big thank you for pointing that out. Case closed.
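Since shard allocation is decided by the elected master, a setting that is present only on the data nodes is easy to miss. A rough way to catch this kind of drift is sketched below in Python. It assumes the input is the parsed JSON from GET _nodes/settings?flat_settings=true, with a top-level "nodes" object whose entries carry "name" and flat "settings" keys; verify that shape against your own cluster. The function name and the node name hypothetical-master-1 are invented for illustration.

```python
SETTING = "cluster.routing.allocation.same_shard.host"


def nodes_missing_setting(nodes_settings, setting=SETTING):
    """Return the sorted names of nodes whose flat settings do not set `setting` to true.

    Assumes the flat-settings response shape described in the lead-in.
    """
    missing = []
    for node in nodes_settings.get("nodes", {}).values():
        value = node.get("settings", {}).get(setting)
        # Settings come back as strings; treat anything but "true" as missing.
        if str(value).lower() != "true":
            missing.append(node.get("name", "<unnamed>"))
    return sorted(missing)


# Hand-written example input, not real cluster output.
example = {
    "nodes": {
        "id-1": {"name": "elk-ela-1f-data-1", "settings": {SETTING: "true"}},
        "id-2": {"name": "hypothetical-master-1", "settings": {}},
    }
}

print(nodes_missing_setting(example))
```

With the example input, only the node without the setting is listed, which mirrors the situation described above.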