Using doc fields streamable facility

Atharva_Patel · November 9, 2012, 8:19am

I have a use case where I need to append a certain amount of text to an
existing field of a document that has already been indexed. After
appending, I will reindex the document. I expect the amount of data stored
in that single field to be quite large, so I decided not to store the
field and also disabled the _source feature for that document type.

I also expect several indexing/reindexing operations to run simultaneously
in the JVM (I am currently using the Java API), so it would be highly
memory-inefficient to load all of those long String objects into memory.

I am wondering whether the 'streamability' of a document field in the Java
API can be used in some way to make my use case memory-efficient. If yes, I
would like to see pseudocode or an example of the stream operations to
follow to achieve the whole use case in a memory-efficient way.

--

Ivan · November 9, 2012, 5:27pm

If the field is not stored and source is disabled, then there is nothing to
go on. No data to append to.

Even if things are stored, updating documents in place in Lucene (and
therefore Elasticsearch) is a well-known limitation. Elasticsearch works
around it with the update API, which is essentially a re-index on the
server side, but it still revolves around the Lucene storage format.
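
To illustrate why the original text has to be kept somewhere, here is a
minimal sketch in plain Java. It is a toy model and not the Elasticsearch
API: the class AppendSketch, its maps, and its methods are all hypothetical
names made up for this example. It only assumes that an "append" amounts to
fetching the original text, merging, and re-indexing the whole document.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Toy model only, NOT the Elasticsearch API. It mimics an index that may or
 * may not keep the original text of a document next to its searchable terms.
 */
class AppendSketch {
    /** Hypothetical stand-in for kept originals (the role _source plays). */
    private final Map<String, String> storedSource = new HashMap<>();
    /** Hypothetical stand-in for the searchable index: id -> term count. */
    private final Map<String, Integer> indexedTermCounts = new HashMap<>();
    private final boolean keepSource;

    AppendSketch(boolean keepSource) {
        this.keepSource = keepSource;
    }

    /** Indexes the full text; the original is kept only if keepSource is true. */
    void index(String id, String text) {
        String trimmed = text.trim();
        int terms = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        indexedTermCounts.put(id, terms);
        if (keepSource) {
            storedSource.put(id, text);
        }
    }

    /**
     * "Append" modelled as read, merge, re-index. Returns the merged text,
     * or empty when there is no original text to append to.
     */
    Optional<String> append(String id, String extra) {
        String original = storedSource.get(id);
        if (original == null) {
            return Optional.empty(); // nothing to go on
        }
        String merged = original + " " + extra;
        index(id, merged); // the whole document is re-indexed
        return Optional.of(merged);
    }

    int termCount(String id) {
        return indexedTermCounts.getOrDefault(id, 0);
    }
}
```

With new AppendSketch(false), append returns an empty Optional and the
indexed terms stay as they were, which mirrors the "no data to append to"
case above. With new AppendSketch(true), the full merged string is built in
memory before it is re-indexed, which is the cost the original question is
worried about.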


