Some basic questions regarding ES

kartikelastic · March 15, 2016, 4:27pm

I am trying to learn Elasticsearch and have some questions, please answer if and when you get a chance:

The scenario:
3 master nodes (2 for quorum)
3 data nodes

The questions:
Let’s say I need to forward Apache httpd logs using Logstash Forwarder. Do I forward from all web servers to one data node, or do I distribute them across the data nodes?

Are indices created for each log type per data node and are they replicated on the other data nodes?
What are shards? (A simple explanation would suffice.)

If I use Kibana for visualization, do I connect it to “any one” data node or to “any one” master node?

Is it not bad practice to not have the ES nodes on public network?

What framework do you recommend for Hadoop/ES integration? I have Hadoop+Hive configured, so any starting points would be helpful. (I have gone through the elastic.co docs and am unable to find how to actually send Hive data to ES.)

Thanks a LOT in advance. Regards,

warkolm · March 16, 2016, 12:49am

Use Filebeat instead; LSF (Logstash Forwarder) is deprecated.

One index for it all. Its shards are replicated to other data nodes by default.

Basically, partitions of an index to allow distribution.
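
To make the "partitions" idea a bit more concrete, here is a minimal sketch of how a document could be assigned to one shard of an index. It assumes the commonly documented rule of hashing the routing value (by default the document ID) modulo the number of primary shards. The function name `pick_shard`, the shard count of 5, and the use of `zlib.crc32` are assumptions for illustration only; Elasticsearch uses its own hash internally, so the shard numbers below will not match a real cluster.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (e.g. a document ID) to a shard number.

    Stand-in hash: crc32 is used only to keep this sketch stdlib-only;
    it is NOT the hash function Elasticsearch uses.
    """
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


if __name__ == "__main__":
    # Hypothetical document IDs spread over a hypothetical 5-shard index.
    for doc_id in ["log-0001", "log-0002", "log-0003"]:
        print(doc_id, "-> shard", pick_shard(doc_id, 5))
```

The same ID always lands on the same shard, which is why a lookup by ID only needs to ask one shard, and why documents end up spread across the data nodes holding those shards.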

Not the masters.

That's a lot of negatives. The takeaway is: don't expose ES to the internet, and even on a private network make sure it's protected, just like any other datastore.
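
As a rough illustration of the "don't expose it" advice, here is a small sketch that checks whether an address you plan to bind a node to (for example, the value you would put in `network.host`) is loopback or in a private range. The helper name `is_safe_bind_address` is hypothetical, and this is only a sanity check on the address itself; it says nothing about firewalls, proxies, or authentication.

```python
import ipaddress


def is_safe_bind_address(host: str) -> bool:
    """Return True if `host` is a loopback or private IP address.

    The unspecified address (0.0.0.0 or ::) listens on every interface,
    including public ones, so it is treated as unsafe here.
    """
    addr = ipaddress.ip_address(host)
    if addr.is_unspecified:
        return False
    return addr.is_loopback or addr.is_private


if __name__ == "__main__":
    # Example addresses, chosen only for illustration.
    for host in ["127.0.0.1", "10.0.0.5", "0.0.0.0", "8.8.8.8"]:
        print(host, "->", "ok" if is_safe_bind_address(host) else "exposed")
```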

kartikelastic · March 16, 2016, 1:08am

Thank you very much
