Supporting queries on dynamic columns in Elasticsearch

Harish_Kommaraju · January 1, 2016, 2:30pm

I need to support queries on dynamic (not predefined) tags in
Elasticsearch. Let's say I have a blog document and want to support
queries on different sets of columns, e.g. tagTypeA=valueX &
tagTypeB=valueY, where these tagTypeX columns are not known beforehand.
There will be only one value for each of these tags. The user will pass
this additional data to my API as a map of String to String (no strict
model / structure).

I am thinking of three ways to support this feature.

1. Declare that I can support only a maximum of N types of dynamic tags
per document (say 10) and create internal columns Tag1, Tag2 ... Tag10.
Then keep a config that maintains the mapping TagTypeA=Tag1,
TagTypeB=Tag2, etc. In the code, iterate over the input key-value pairs
and generate the ES search query dynamically using the key-to-column-name
mapping.
Pros : Simple to implement.
Cons : Overhead of maintaining the mapping. It has to be modified every
time a new type of document/client is onboarded or a new field has to be
added for an existing client.

2. Create a non-analyzed field in ES holding an array of strings. When
storing the data, store each tag in the concatenated format
key+"Delimiter"+value. So if the input map has TagTypeA=Good &
TagTypeB=High, this will be stored as ["TagTypeA-Good","TagTypeB-High"]
in ES. When the user queries, reconstruct the concatenated strings and
search for them.
Pros : No code changes required to onboard new clients or to add or
update fields.
Cons : First of all, it doesn't sound clean. The key must not contain the
delimiter. Changing the format at a later point in time is very tedious,
as we would have to change all existing stored string values.

3. Don't define any schema and let the JSON key-value pairs of the tags
pass through to the Elasticsearch PUT call. For any new keys which are
not already present, Elasticsearch will automatically add them to the
index mapping with default type inference (which can be controlled using
dynamic templates).
Pros : No configuration or manual concatenation of input. Any addition of
columns is handled transparently without any manual effort.
Cons : We are relying on the default index creation settings of ES, which
may not always suit the requirement. I feel there will be more cons to
this, but I can't think of any. Please suggest.
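
To make the three options easier to compare, here is a rough Python sketch of the search request body each one would produce for the same input map. It only builds plain dicts and does not call Elasticsearch. The names TAG_COLUMNS, DELIMITER, the `tags` field and the `tags.` prefix are all hypothetical, and the dynamic template uses the typeless `keyword` mapping syntax of newer Elasticsearch versions, which is an assumption (older versions would use a not_analyzed string instead).

```python
# Option 1: hypothetical config mapping each tag type to a fixed internal column.
TAG_COLUMNS = {"TagTypeA": "Tag1", "TagTypeB": "Tag2"}

# Option 2: assumed delimiter; tag keys must not contain it.
DELIMITER = "-"


def option1_query(tags):
    """Option 1: translate each key to its internal column via the config.

    Raises KeyError for a tag type that is missing from TAG_COLUMNS.
    """
    filters = [{"term": {TAG_COLUMNS[key]: value}} for key, value in tags.items()]
    return {"query": {"bool": {"filter": filters}}}


def option2_query(tags, field="tags"):
    """Option 2: match the concatenated key+delimiter+value strings."""
    filters = []
    for key, value in tags.items():
        if DELIMITER in key:
            raise ValueError(f"tag key {key!r} contains the delimiter")
        filters.append({"term": {field: f"{key}{DELIMITER}{value}"}})
    return {"query": {"bool": {"filter": filters}}}


def option3_query(tags, prefix="tags"):
    """Option 3: every key is its own dynamically mapped field under `prefix`."""
    filters = [{"term": {f"{prefix}.{key}": value}} for key, value in tags.items()]
    return {"query": {"bool": {"filter": filters}}}


# Option 3: a dynamic template so that new string fields under `tags.` are
# mapped as exact-match keywords rather than with the default inference.
OPTION3_MAPPING = {
    "mappings": {
        "dynamic_templates": [
            {
                "tags_as_keywords": {
                    "path_match": "tags.*",
                    "match_mapping_type": "string",
                    "mapping": {"type": "keyword"},
                }
            }
        ]
    }
}
```

For the input {"TagTypeA": "Good", "TagTypeB": "High"}, Option 1 filters on Tag1 and Tag2, Option 2 filters the single `tags` field on "TagTypeA-Good" and "TagTypeB-High", and Option 3 filters on tags.TagTypeA and tags.TagTypeB.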

Personally, I am leaning towards Option #3.

Can anyone please share their views on the above three approaches, and on whether there is a better way to solve this?

Thanks,
Harish

Christian_Dahlqvist · January 2, 2016, 5:46am

Another option worth considering might be to store them as nested data types, each column stored as a document with a 'key' and 'value' field. This avoids having the mappings explode while still giving you a fair amount of flexibility.
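
A minimal sketch of what this could look like, written as Python dicts. The field name `tags`, the sub-field names `key` and `value`, and the `keyword` type (newer typeless mapping syntax) are assumptions for illustration, not something fixed by the suggestion above.

```python
# Assumed mapping: one nested object per tag, with exact-match key and value.
NESTED_MAPPING = {
    "mappings": {
        "properties": {
            "tags": {
                "type": "nested",
                "properties": {
                    "key": {"type": "keyword"},
                    "value": {"type": "keyword"},
                },
            }
        }
    }
}


def to_nested_tags(tags):
    """Convert the user's map into the list of {key, value} objects to index."""
    return [{"key": key, "value": value} for key, value in tags.items()]


def nested_query(tags, path="tags"):
    """Build one nested clause per pair, so that key and value must match
    inside the same nested object."""
    clauses = []
    for key, value in tags.items():
        clauses.append(
            {
                "nested": {
                    "path": path,
                    "query": {
                        "bool": {
                            "filter": [
                                {"term": {f"{path}.key": key}},
                                {"term": {f"{path}.value": value}},
                            ]
                        }
                    },
                }
            }
        )
    return {"query": {"bool": {"filter": clauses}}}
```

The mapping stays the same size no matter how many distinct tag types clients send, because new tag types become new nested objects rather than new fields.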

Harish_Kommaraju · January 2, 2016, 12:40pm

Thanks for your suggestion. I also need to do aggregations on the keys i.e. queries like No. of entries having TagTypeA=23 & TagTypeB=45. Will the performance be impacted by having nested structure for these aggregations? (I understand that Option #2 is no longer a valid one when we need aggregations on these column names)
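
For reference, a sketch of the kind of requests this would mean with a nested structure, again as Python dicts. It assumes a nested field named `tags` with keyword sub-fields `key` and `value` (hypothetical names), and it says nothing about how these requests perform.

```python
def nested_pair(key, value, path="tags"):
    """One nested clause requiring key and value in the same nested object."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "filter": [
                        {"term": {f"{path}.key": key}},
                        {"term": {f"{path}.value": value}},
                    ]
                }
            },
        }
    }


def count_body(tags, path="tags"):
    """Body for a count request: number of entries matching all given pairs."""
    clauses = [nested_pair(key, value, path) for key, value in tags.items()]
    return {"query": {"bool": {"filter": clauses}}}


def values_per_key_agg(key, path="tags"):
    """Aggregation body: bucket the values seen for one tag key.

    The nested aggregation steps into the tag objects, the filter keeps only
    the requested key, and the terms aggregation buckets its values.
    """
    return {
        "size": 0,
        "aggs": {
            "tags": {
                "nested": {"path": path},
                "aggs": {
                    "only_key": {
                        "filter": {"term": {f"{path}.key": key}},
                        "aggs": {"values": {"terms": {"field": f"{path}.value"}}},
                    }
                },
            }
        },
    }
```

Here count_body({"TagTypeA": "23", "TagTypeB": "45"}) corresponds to the "No. of entries having TagTypeA=23 & TagTypeB=45" case, while values_per_key_agg("TagTypeA") buckets the values stored under a single key.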
