Schemaless data

sheep · May 16, 2013, 1:12am

Is there any known workaround that allows a truly schemaless index shape, at least from a querying point of view (using Lucene syntax)?
For example, suppose the "logical" shape of the JSON data is {fruits: ["orange", {name: "lemon", genus: "citrus"}]}.
In the Elasticsearch index, I want both "fruits:orange" and "fruits.genus:citrus" to work as queries.

Unfortunately, I do not have control over the consistency of the data shape. I can, however, transform or massage the JSON in any way I like, as long as the result produces the desired index shape when users query it.

I'm wondering if there's any mapping trick that produces that effect. For instance, does Elasticsearch have a mapping transformation that can unwrap an object property?
That way I could wrap each string in a faux object, e.g. {fruits: [{_text: "orange"}, {name: "lemon", genus: "citrus"}]},
while still allowing the _text property to be queried through its parent, e.g. "fruits:orange" in lieu of "fruits._text:orange".
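
To make the wrapping step concrete, here is a minimal Python sketch of the client-side transformation I have in mind. The function name wrap_scalars and the text_key parameter are just names I made up for illustration; only the _text property comes from the example above. It does nothing on the Elasticsearch side, it only reshapes the array before indexing.

```python
def wrap_scalars(items, text_key="_text"):
    """Wrap every non-object array item in a faux object.

    Objects (dicts) pass through unchanged; anything else, such as a
    bare string, becomes {text_key: item}. The helper name and the
    text_key parameter are hypothetical, chosen for this example.
    """
    return [
        item if isinstance(item, dict) else {text_key: item}
        for item in items
    ]


doc = {"fruits": ["orange", {"name": "lemon", "genus": "citrus"}]}
wrapped = {"fruits": wrap_scalars(doc["fruits"])}
```

After this step every item under fruits is an object, so the array has a consistent shape, but the string is then only reachable as "fruits._text:orange", which is exactly the part I would like to avoid.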

Is this an achievable goal?

Thanks

sheep · May 17, 2013, 1:19am

I ended up solving this by flattening my JSON payload (of any shape) into a flat, one-dimensional object. In my particular example, the JSON I end up storing in ES is:

{
"fruits": ["orange", "foo", "whatever"],
"fruits.name": ["lemon", "chilli", "etcetera"],
"fruits.genus": ["citrus", "herbs"]
}

... and so on to any number of nested properties.
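
For anyone wanting to try the same thing, the flattening can be sketched roughly as below. This is a minimal Python illustration rather than the exact code I run; the function name flatten and its parameters are made up for the example, and it assumes the top-level payload is a JSON object.

```python
def flatten(value, prefix="", out=None):
    """Flatten nested JSON-like data into {dotted.path: [scalars]}.

    Arrays do not add a path segment, so a string sitting directly in
    an array is stored under the array's own key. Hypothetical helper;
    assumes the top-level value is a dict.
    """
    if out is None:
        out = {}
    if isinstance(value, dict):
        for key, child in value.items():
            # Extend the dotted path with this property name.
            path = f"{prefix}.{key}" if prefix else key
            flatten(child, path, out)
    elif isinstance(value, list):
        for item in value:
            # Array items share the parent's path.
            flatten(item, prefix, out)
    else:
        # Scalar leaf: collect it under its dotted path.
        out.setdefault(prefix, []).append(value)
    return out


doc = {"fruits": ["orange", {"name": "lemon", "genus": "citrus"}]}
flat = flatten(doc)
```

With the flattened document, "fruits:orange" and "fruits.genus:citrus" both address a plain top-level field. Note that this sketch does not record which name and genus values came from the same original object.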

Is there any drawback to this approach? If not, is there any reason Elasticsearch does not structure its index this way internally (thus allowing completely schemaless data)?

Cheers
