Where is the data stored? ElasticSearch YARN

Rafael_Pellon · December 16, 2014, 4:06pm

Hi

We are testing Elasticsearch in an HDP environment using YARN.

We followed the instructions at
http://www.elasticsearch.org/blog/elasticsearch-yarn-and-ssl/ and uploaded a
lot of data, but...

Where is the data stored? Is it on the local file system or in HDFS? Is it
persisted? What is the default configuration of the ES-YARN version? In the
standalone version, without YARN, you can configure all of this in
the config file.

Any information about this would be useful.

Thanks in advance,
Rafa

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/28fc11fb-b55b-4664-9767-f893a6af0738%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

costin · December 16, 2014, 4:35pm

I recommend reading the project documentation [1]; there's a dedicated section that covers
storage [2].

[1] Elasticsearch Platform — Find real-time answers at scale | Elastic
[2] Elasticsearch Platform — Find real-time answers at scale | Elastic

--
Costin

Dan_Cieslak · January 17, 2015, 4:35pm

The storage section of the documentation does
not really describe how to configure the different options, just that
they are available. From the page:

Each container can currently access its local storage - with proper
configuration this can be kept outside the disposable container folder thus
allowing the data to live between restarts. This is the recommended
approach as it offers the best performance and due to Elasticsearch itself,
redundancy as well (through replicas).

But below it says:

If no storage is configured, out of the box Elasticsearch will use its
container storage which means when the container is disposed, so is its
data. In other words, between restarts any existing data is destroyed.

What is not described is how to configure storage, especially for the
"recommended approach" where data would live between restarts.

So how would one configure elasticsearch-yarn for the recommended approach?
Does one make changes in the elasticsearch.zip's config files? If so, what
settings?

Thanks
Dan

costin · January 19, 2015, 3:11pm

Installing a plugin or changing a configuration means restarting each node, and since YARN is not persistent, one would
have to handle this outside of YARN.
es-yarn could potentially address that; however, at that point it becomes more of a Puppet/Chef feature, which is outside the
scope of the project.
The simplest solution is to modify the elasticsearch.zip that you are using, since that archive is installed
on each node. Whether it's a configuration change or a plugin, as long as it's part of the zip, it will be
distributed to each node.
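
As an illustration of the "modify the zip" approach, here is a minimal sketch in Python that repacks the archive with a data path added to config/elasticsearch.yml. It is not taken from the es-yarn documentation. `path.data` is the standard Elasticsearch setting for the data directory; the function name `add_data_path`, the directory `/var/lib/elasticsearch/data`, and the assumption that the zip contains a `config/elasticsearch.yml` in which `path.data` is not already set are all hypothetical and should be checked against your own archive.

```python
import zipfile

# Hypothetical directory; it must exist on every YARN node, outside the
# disposable container folder, and be writable by the Elasticsearch process.
DATA_DIR = "/var/lib/elasticsearch/data"


def add_data_path(src_zip, dst_zip, data_dir=DATA_DIR):
    """Copy src_zip to dst_zip, appending a path.data line to elasticsearch.yml.

    Returns the name of the archive member that was patched.
    Assumes path.data is not already set (uncommented) in that file.
    """
    patched = None
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info)
            if info.filename.endswith("config/elasticsearch.yml"):
                # Append the setting on its own line at the end of the file.
                data = data.rstrip(b"\n") + f"\npath.data: {data_dir}\n".encode()
                patched = info.filename
            # Reusing the ZipInfo keeps names and file permissions intact.
            dst.writestr(info, data)
    if patched is None:
        raise ValueError("no config/elasticsearch.yml found in archive")
    return patched
```

Calling `add_data_path("elasticsearch.zip", "elasticsearch-patched.zip")` would produce a new archive to use in place of the original one. The same repacking idea applies to any other setting or plugin that has to travel with the zip.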

--
Costin

tushar.hadoop · April 20, 2016, 7:53am

Hi Costin,

Do you have a tutorial on elasticsearch-YARN?
I need to know how Kibana can connect to ES-YARN.
The documentation is not clear to me!

Regards,
Tushar
