I currently use Elasticsearch with a normalized data model. But now I am presented with a project, with other characteristics, and I can not see without it or how to work with it.
The problem is that there is no fixed format in the data of millions of documents, in which according to their category one of the fields, features, is a json that can contain different data on "features" .
Example (only put data of "features" rest of data is same.
{
"transaction": "Rent"
"state": "Good state"
"rooms": 2,
"size": 250,
"bathrooms": 2,
"energy_class": "C"
"appointments": "furnished"
"floor": 'first'
"banks": false,
"features": [
"air conditioning", "gasoil heating", "stoneware", "washing machine", "dishwasher"
]
}
{
"transaction": "Sold"
"state": "Good state"
"size": 250000,
"fence": "cinegetic",
"houses": [
"0" => [
"type" => "Main house",
"size" => 350
"rooms" => 5
"bathrooms" => 2
],
"1" => [
"type" => "House service",
"size" => 100
"rooms" => 2
"bathrooms" => 1
],
]
"banks": false,
"features": [
"water sources", "river", "solar panels"
]
}
Of course, I need, search by keywords content in "features" array object.
On my latest experience, if declare a index, all data must be with same compositions. (data types)
Any ideas, or post with some doc for this question?