I want to store time-series data in Elasticsearch, but I am not sure how I should model the document.
Let's say the document looks like this:
{
"timestamp": "iso 8601 date time",
"common_property1": "value",
"common_property2": "value",
"common_property3": "value",
"common_property4": "value",
"common_property5": "value",
"individual_properties": {
"property1": "value",
"property2": {
"nesting_element": "value"
},
"property3": {
"nesting_element1": "value",
"nesting_element2": "value",
"nesting_element3": "value",
"nesting_element4": "value",
"nesting_element5": "value",
... (this can go up to 70-80 nesting elements)
},
"property4": {
"nesting_element": "value"
},
"property5": {
"nesting_element": "value"
},
"property6": {
"nesting_element": "value"
},
"property7": "value",
"property8": "value",
"property9": "value",
"property10": "value",
... (this can go up to 80-90 properties)
}
}
Right now, we store each individual property that has nesting elements as a string in the main document, and we also maintain a separate index for the individual properties that we want to query by nesting_element.
Now I am thinking of remodeling the document as follows:
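To make the current approach concrete, here is a rough Python sketch of how one source document could be split under it. The function name, the `property` field on the side-index documents, and the JSON serialization are illustrative assumptions, not our exact code.

```python
import json

def split_for_indexing(doc, field="individual_properties"):
    """Split one source document into (main_doc, property_docs).

    Assumption: properties that hold nesting elements are stored as a JSON
    string in the main document and also emitted as separate documents for
    a second index. All names here are hypothetical.
    """
    main = {k: v for k, v in doc.items() if k != field}
    main[field] = {}
    property_docs = []
    for name, value in doc[field].items():
        if isinstance(value, dict):
            # Nesting elements are flattened to a string in the main document...
            main[field][name] = json.dumps(value, sort_keys=True)
            # ...and indexed separately so they can be queried by nesting element.
            property_docs.append(
                {"timestamp": doc["timestamp"], "property": name, **value}
            )
        else:
            main[field][name] = value
    return main, property_docs
```

Each source document therefore produces one main document plus one extra document per property that has nesting elements.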
{
"timestamp": "iso 8601 date time",
"common_property1": "value",
"common_property2": "value",
"common_property3": "value",
"common_property4": "value",
"common_property5": "value",
"individual_properties": [
{
"name": "value",
"value": "value"
},
{
"name": "value",
"value": "value"
},
{
"name": "value",
"value": "value",
"nesting_element": "value"
},
{
"name": "value",
"value": "value"
},
{
"name": "value",
"value": "value",
"nesting_element": "value"
},
{
"name": "value",
"value": "value"
}
... and probably 100-150 more properties
]
}
I am also thinking of using the nested field type with a nested mapping, so that we can query the individual properties by name and nesting_element.
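A minimal sketch of the mapping and query I have in mind, written as plain Python dicts so that it does not depend on any client library. The `keyword` types, the helper names, and the spelling of the array field (`individual_properties`) are assumptions for illustration; the real field types would depend on the data.

```python
FIELD = "individual_properties"  # assumed name of the nested array field

def nested_mapping():
    """Index body that maps the property array as a nested field."""
    return {
        "mappings": {
            "properties": {
                "timestamp": {"type": "date"},
                FIELD: {
                    "type": "nested",
                    "properties": {
                        # keyword is an assumption; exact-match lookups only
                        "name": {"type": "keyword"},
                        "value": {"type": "keyword"},
                        "nesting_element": {"type": "keyword"},
                    },
                },
            }
        }
    }

def nested_query(name, nesting_element):
    """Search body matching one nested object by name and nesting element."""
    return {
        "query": {
            "nested": {
                "path": FIELD,
                "query": {
                    "bool": {
                        "filter": [
                            {"term": {f"{FIELD}.name": name}},
                            {"term": {f"{FIELD}.nesting_element": nesting_element}},
                        ]
                    }
                },
            }
        }
    }
```

Because both `term` clauses sit inside a single `nested` query, they have to match within the same nested object rather than across different entries of the array.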
However, I am aware that the latter model will require Lucene to index a hundred or more documents for each source document (the number of nested documents + 1, to be exact).
In this scenario, which data model is more performant/efficient? How would each of them impact indexing and search performance? Is there an alternative way to make it more efficient?
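As a quick illustration of that count, here is a small Python sketch. It assumes a single level of nesting and that the array field is called `individual_properties`; the function name is hypothetical.

```python
def lucene_doc_count(doc, nested_field="individual_properties"):
    """Lucene documents written for one source document.

    Each object in the nested array becomes its own hidden Lucene document,
    plus one for the parent (single level of nesting assumed).
    """
    return len(doc.get(nested_field, [])) + 1

# A source document carrying 150 nested properties
sample = {
    "timestamp": "2024-01-01T00:00:00Z",
    "individual_properties": [
        {"name": f"property{i}", "value": "value"} for i in range(150)
    ],
}
print(lucene_doc_count(sample))
```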