Adding "additional/external index" at runtime for an existing index

raghu83sn · November 7, 2016, 12:18pm

Hi,
Can we create an index at runtime for a field that was not marked as indexed in the mappings?
Something like this:
a) An existing index has 4 fields: a, b, c, d. Let us say 'a' and 'b' are indexed, and 'c' and 'd' are not.
b) Down the line I get a requirement where I need field 'c' to be indexed.
c) Can Elasticsearch add a feature to create an external inverted index for field 'c' (external to the immutable segments)? Something that can be created at runtime, like "create ext Index for index". This inverted index should span all the data in the shard at the point in time the create-index instruction is executed.
d) Then we can specify the external index name in a query so that the results are optimized.
e) Update it when data is modified, by explicitly running an update/refresh of the index (or configure it to be updated automatically when data changes).
f) Also a provision to delete it once we have finished using it and no longer need it (since it will be external to the immutable segments).

Thanks and Regards,
Raghavendra

spinscale · November 7, 2016, 2:06pm

Hey,

Sorry, but I absolutely don't understand the requirement here. First, you cannot create an external index in Elasticsearch. All data is part of the segment when it gets indexed.

Can you maybe step back a little and explain your actual requirement - completely without Elasticsearch internals in mind?

--Alex

raghu83sn · November 7, 2016, 3:14pm

Hi Alex,
Thanks for your response. Let me try to explain what I intend to achieve.

a) To avoid re-indexing data into a new index in the following possible scenarios:

  We have some fields in the index that were initially marked ("index" : false), and we now figure out that they need to be indexed. Since we would have accumulated data over a period of time, re-indexing is costly.

  New fields start coming into the index (unstructured/semi-structured data sources) and may be indexed (analysed) by default. At some point I figure out that I need an indexed (not analysed) version of the inverted index for the same field, or vice versa. If this keeps happening across multiple indices and multiple times, re-indexing looks costly.

b) I got this question by looking into Hive indices, which are external to the stored data. They can be created and dropped at any time.

c) Elasticsearch does link different indices with nested objects (where it may be creating a different index for storing the nested objects). Nested indices cannot be queried directly though; we have to go through the parent.

d) Similarly, Elasticsearch could provide a provision to create an index for specific fields at runtime by reading the existing data (it may ultimately mean creating a new segment with an inverted index).

This is just a thought, I'm not sure.
Is there already a better way to address this scenario?
Or is re-indexing the data a better and simpler approach than the above?
Please advise.
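
For comparison, the re-indexing route asked about above can be sketched roughly as follows. This is only a minimal sketch, not a recommendation from the thread: it builds the two request bodies (create a new index whose mapping indexes field 'c' both analysed and not analysed via a multi-field, then copy the data with the `_reindex` API) without sending anything. The index names `events_v1` and `events_v2`, the sub-field name `raw`, and the helper `build_reindex_plan` are hypothetical, and the typeless mapping layout is an assumption that matches recent Elasticsearch versions; older versions nest `properties` under a mapping type name.

```python
import json


def build_reindex_plan(source_index, dest_index, field):
    """Return the two HTTP requests (as plain dicts) for the re-indexing route.

    Nothing is sent to a cluster; the dicts only describe the calls.
    """
    # Step 1: create the destination index with the desired mapping.
    # The field is indexed as analysed text, with a not-analysed
    # keyword sub-field (hypothetical name "raw") next to it.
    create_index = {
        "method": "PUT",
        "path": "/" + dest_index,
        "body": {
            "mappings": {
                "properties": {
                    field: {
                        "type": "text",
                        "fields": {"raw": {"type": "keyword"}},
                    }
                }
            }
        },
    }
    # Step 2: copy every document from the old index into the new one.
    reindex = {
        "method": "POST",
        "path": "/_reindex",
        "body": {
            "source": {"index": source_index},
            "dest": {"index": dest_index},
        },
    }
    return [create_index, reindex]


if __name__ == "__main__":
    for request in build_reindex_plan("events_v1", "events_v2", "c"):
        print(request["method"], request["path"])
        print(json.dumps(request["body"], indent=2))
```

Each dict could be sent with any HTTP client. The cost raised in (a) comes from the second request, which reads and re-writes every document in the source index.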

Thanks and Regards,
Raghu

system · December 5, 2016, 3:14pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.