I want to try two different clusters sharing the same data path (the same index). The reason is that one cluster will keep indexing the streaming data, while the other cluster will take care of all the searches and aggregations. That way, if the search cluster goes down due to a memory spike on heavy aggregations, the data indexing will not be affected.
I tried the same on a single server with two nodes. Each node has a different "cluster.name" value.
I also set "node.max_local_storage_nodes: 2" and pointed "path.data" to the same path in both nodes.
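Roughly, the two elasticsearch.yml files looked like this (the cluster names and the data path here are just placeholders for what I actually used):

# node 1, meant for the indexing cluster
cluster.name: indexing_cluster
path.data: /data/es
node.max_local_storage_nodes: 2

# node 2, meant for the search cluster, same data path
cluster.name: search_cluster
path.data: /data/es
node.max_local_storage_nodes: 2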
But the problem is that the first node to start creates a folder "0" inside the configured "path.data", and the second node to start creates folder "1". So obviously the data indexed by the first node is not available to the second node. I also tried running the second node with "node.data: false", but the problem still persists.
Is this approach possible? If so, what should the configuration be so that both nodes point to the same index directory? Is there any other way to handle this scenario besides the one mentioned above?
You can't really do that. It's better to have 2 real data nodes that belong to the same cluster.
Then you can direct your index operations to the first node and run your search requests against the second node with the ?preference=_local parameter, which will try to execute the search only on local shards.
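For example, a search sent to the second node could look like this (the index name and the match_all query are just placeholders):

GET my_index/_search?preference=_local
{
  "query": {
    "match_all": {}
  }
}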
Thanks for the reply. If we run 2 data nodes in the same cluster and the search node goes down, the shards held by that node will be unavailable, right? That affects both writing and searching.
Actually I am facing an issue with indexing. I was running the ES cluster on two servers. Around 30,000 documents per second are streaming in and being indexed continuously into the cluster. Meanwhile, other batch jobs trigger periodically (every 10 minutes, 30 minutes, etc.), doing ES search and aggregation operations.
Once the cluster is up, the entire flow runs for 3-4 days without any issues. After that I see a lot of timeout exceptions in the ES logs. Finally the nodes (either one, sometimes both) crash due to a heap error (a heap dump gets created in the configured path) and I have to reboot the cluster. After the reboot, again another 3-4 days with no problems.
Recently I added one more node, so 3 nodes in the same cluster, at the same data rate. For at least 12+ days there were no issues, but after that the data writing stopped again due to timeout errors. I restarted the indexing programs, but they still were not able to index. This time, however, none of the nodes crashed, and the ES search operations were still intact.
I thought that if we separate the periodic searches and aggregations from the writers, the load on the writers will be lower. So I was just considering the possibility of maintaining two different clusters on the same data.
I am using ES version 6.3.1. Each server has 256 GB RAM and a 72-core CPU. I still wonder why the writing performance degrades after some days.
The heap size is set to 32 GB on each node.
Since the data is streaming continuously, I am creating a daily index. Every day at midnight the data starts indexing into a new index.
Each index has 8 shards. Yes, as you mentioned, the number of replicas is set to "0".
One daily index is around 900 GB in size. But the last time ES threw timeout errors during the day, there was only around 70 GB of data in that day's index.
Also, I am running 5 instances of the indexer programs per server, indexing in parallel to the local node. Bulk indexing is being used with a bulk size of 4000 (a rough sketch of one such bulk request is shown after the index listing below).
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open wr_in_2019_04_01 OHJtB0dqR8uQt0JZSD7XGg 8 0 786844659 0 772.2gb 772.2gb
green open wr_in_2019_02_26 HaQVUt5sSuaFMItfgEVzrQ 8 0 717735762 0 688gb 688gb
green open wr_in_2019_03_06 XcRYQEgkRgGQLbeakQ3dHA 8 0 171001764 0 170.3gb 170.3gb
green open wr_in_2019_03_16 cgFaU3MpRjOPhk8grgwg1A 8 0 848441559 0 788.5gb 788.5gb
green open wr_in_2019_02_25 fdNoz015RjSFIwJ8rxC2DQ 8 0 585444957 0 546.1gb 546.1gb
green open wr_in_2019_03_04 I8vDUW68T8Cmr2GqPHmJBg 8 0 98716602 0 87.7gb 87.7gb
green open wr_in_2019_02_28 ji67B_rASLaFqtsywj_dJg 8 0 368071956 0 339.2gb 339.2gb
green open wr_in_2019_03_24 c8KKkYkTR_6Ow_j8_74lvg 8 0 525759407 0 501.2gb 501.2gb
green open wr_in_2019_02_16 Tg-q0OyMRgSAnEL07vQ0Yg 8 0 733893550 0 658.1gb 658.1gb
green open wr_in_2019_03_07 E7BVFB07RbGjWqVQbkKHqQ 8 0 612399945 0 611.6gb 611.6gb
green open wr_in_2019_03_15 OKvgR_DaQ_62tEQ--xJ-2g 8 0 855747046 0 829.7gb 829.7gb
green open wr_in_2019_03_08 J714RSU-RR-5Djl4oj6i3Q 8 0 612603229 0 576.8gb 576.8gb
green open wr_in_2019_03_28 x7_2dJAMRfGPio44nPwTPg 8 0 1103147718 0 1tb 1tb
green open wr_in_2019_03_05 PH4MqkuJTKyqQF7EDYNROw 8 0 68814398 0 65.5gb 65.5gb
green open wr_in_2019_03_01 iPYs8lc6TjCKhZWVIbNijQ 8 0 426661793 0 406.6gb 406.6gb
green open wr_in_2019_03_20 Ix2ucLB1Q0iq6JkL5ADEvw 8 0 139402584 0 131.9gb 131.9gb
green open wr_in_2019_02_22 nVA0sDfJRPmLIiVGg1vMRA 8 0 638825042 0 599.8gb 599.8gb
green open wr_in_2019_03_23 1faz8CNmR-Web9O7_gzIOw 8 0 595548849 0 578.2gb 578.2gb
green open wr_in_2019_03_22 i5CmKaW6T2yr6sD4zt-Dfg 8 0 662460573 0 619.1gb 619.1gb
green open wr_in_2019_03_26 birtPXX6S9-HYLrtAmHqGw 8 0 1120588076 0 1tb 1tb
green open wr_in_2019_03_03 Qx4RX3rLRPyxTP_8NXvDcQ 8 0 152337877 0 135.5gb 135.5gb
green open wr_in_2019_03_11 daKfzRw7R4WIxrmFeA3mjg 8 0 874401017 0 801.9gb 801.9gb
green open wr_in_2019_02_24 PULbAJJaQsyyyYMRojN7zQ 8 0 447485224 0 405.7gb 405.7gb
green open wr_in_2019_02_27 dCu7wvlzRhyhMLDsAmI60A 8 0 394166451 0 369.7gb 369.7gb
green open wr_in_2019_03_18 AuKVNNDGQcuw-MFcCXdVwQ 8 0 785600530 0 756.2gb 756.2gb
green open wr_in_2019_03_12 _RhoX3sLTp2EOqg2bTTR1A 8 0 846456862 0 776.6gb 776.6gb
green open wr_in_2019_03_14 9XgW8k3gSiidL6W1H-fnRA 8 0 799785523 0 770.8gb 770.8gb
green open wr_in_2019_03_17 f4Fu-JraS9-UIlSxyYpUlg 8 0 491927793 0 473gb 473gb
green open wr_in_2019_04_06 7AIetiQzSgivyr6s-r0-tA 8 0 251721353 0 241.1gb 241.1gb
green open wr_in_2019_02_18 1hwKwqaNQ5KIRYkf5cNEWg 8 0 598246946 0 559.6gb 559.6gb
green open wr_in_2019_03_27 ILugbua7Tu698sRrn10gEg 8 0 1086803633 0 1tb 1tb
green open wr_in_2019_02_17 Q8LUG6FlR1aYSq9SANRWlA 8 0 371035051 0 330.6gb 330.6gb
green open wr_in_2019_02_20 RK2vxu6OTR6Kota_EJpmvg 8 0 613265355 0 573.8gb 573.8gb
green open wr_in_2019_03_13 2i9c7bemQnG0ggJRk2_2-g 8 0 923463353 0 875.4gb 875.4gb
green open wr_in_2019_03_29 gdsAaCgkTrKGagzBE_AkKw 8 0 1111865897 0 1tb 1tb
green open wr_in_2019_02_19 QUI01vc7SX-0Ep48Zpu7gg 8 0 480594121 0 432.7gb 432.7gb
green open wr_in_2019_02_23 TPukM9ayTPGkBWiau1eQmw 8 0 494078092 0 447.8gb 447.8gb
green open wr_in_2019_03_31 P3uJiSvvRUOySV_9tnCuMQ 8 0 596493086 0 567.7gb 567.7gb
green open wr_in_2019_03_09 87joIoqORX-77kfvX9rCXA 8 0 745600720 0 704.2gb 704.2gb
green open wr_in_2019_03_02 krbpnDYrSaWUWfphpz1Fdw 8 0 332391416 0 311.7gb 311.7gb
green open wr_in_2019_04_04 6YQPvKGJR8-uDfWW2egSBQ 8 0 612276383 0 578.3gb 578.3gb
green open wr_in_2019_04_03 lne1EmWiSXus2Eofo--mUg 8 0 47735900 0 47.3gb 47.3gb
green open wr_in_2019_03_30 vk4YCNSLSi2H_mgsGw1oxw 8 0 1093685353 0 1tb 1tb
green open wr_in_2019_03_10 4kdeQDLkRpmfGvRyIS9vBw 8 0 658052089 0 624.7gb 624.7gb
green open wr_in_2019_03_19 -5tSTjylST6QcwCyyYMA8Q 8 0 259249061 0 240.8gb 240.8gb
green open wr_in_2019_03_25 XSsN_pBZQKC3oREZUhLaLw 8 0 991991936 0 922.2gb 922.2gb
green open wr_in_2019_04_05 Y9AFZOa6TNim4NVEHc9AgQ 8 0 925187237 0 884.3gb 884.3gb
green open wr_in_2019_04_02 1d45Iv1-R6KemOgLCfZqyw 8 0 631105735 0 621.3gb 621.3gb
green open wr_in_2019_02_21 Q_VkBmpJRQ-NsdS8kFREfQ 8 0 532769275 0 491gb 491gb
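For reference, each bulk request sent by the indexer programs is shaped roughly like this. The index name just follows the daily pattern, and the type "_doc", the field names, and the values are placeholders; a real request carries about 4000 action/document line pairs:

POST wr_in_2019_04_07/_bulk
{ "index": { "_type": "_doc" } }
{ "timestamp": "2019-04-07T00:00:01Z", "field1": "value1" }
{ "index": { "_type": "_doc" } }
{ "timestamp": "2019-04-07T00:00:02Z", "field1": "value2" }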
Currently I have set "refresh_interval": "60s" in the template, so the daily index is created with this configuration. Is this OK? Because by default the value is 1 second.
Will increasing it to a higher number, say "180s", help with stable indexing?
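For context, the relevant part of my template looks roughly like this (the template name is just a placeholder; the shard and replica counts are as described above):

PUT _template/wr_in_daily
{
  "index_patterns": ["wr_in_*"],
  "settings": {
    "number_of_shards": 8,
    "number_of_replicas": 0,
    "refresh_interval": "60s"
  }
}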
Increasing that number is always good.
Note that in 7.0 you don't really need to care about it anymore, as Elasticsearch is smart enough to refresh only when it's really needed.
Maybe you should consider rollover indices for your use case?
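A minimal sketch of how that could look with the rollover API; the alias name, index name, and conditions below are only examples to adapt:

PUT wr_in-000001
{
  "aliases": {
    "wr_in_write": {}
  }
}

POST wr_in_write/_rollover
{
  "conditions": {
    "max_age": "1d",
    "max_size": "50gb"
  }
}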
May I suggest you look at the following resources about sizing: