We have 10 data nodes and 1 master node, and we ingest 20,000 log
lines/sec at peak.
Right now our program sends the logs with bulk inserts (BI) to the 1
master node.
I'm wondering whether I could send the bulk inserts to other nodes (data or
master), because we will be adding more logs.
Or should I add more master nodes and send the bulk inserts to those
master nodes?
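You should spin up a client node (node.master: false and node.data: false)
and send your bulk requests to it. That way the bulk load stays off your
master and you prevent it from running out of memory (OOM).
Otherwise, you can send the bulk requests to any of your data nodes.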
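
If you go the data-node route, one way to avoid overloading a single node is to rotate bulk requests across several of them. Here is a minimal sketch of that idea, not a definitive implementation: the node addresses, the index name "logs", and the helper names build_bulk_body and plan_bulk_requests are all made up for illustration. It only builds the request bodies and picks target URLs; it does not send anything.

```python
import itertools
import json

# Hypothetical node addresses; replace with your client node or data nodes.
NODES = ["http://data-node-1:9200", "http://data-node-2:9200"]


def build_bulk_body(index, docs):
    """Serialize docs into the newline-delimited body the _bulk endpoint expects."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))  # source line
    return "\n".join(lines) + "\n"  # the body must end with a newline


def plan_bulk_requests(nodes, index, docs, batch_size):
    """Split docs into batches and assign each batch to a node, round-robin."""
    targets = itertools.cycle(nodes)
    plan = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        plan.append((next(targets) + "/_bulk", build_bulk_body(index, batch)))
    return plan
```

Each (url, body) pair in the returned plan can then be POSTed with whatever HTTP client you already use. Depending on your Elasticsearch version, the action line may also need a "_type" field, so check that against your cluster before relying on this. If you run a client node instead, the same code works with a one-element node list.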