Multi_field field versus creating 2 separate fields?

datadev · October 18, 2011, 6:53am

While trying to resolve an issue with querying on multi_field fields
(https://groups.google.com/group/elasticsearch/t/1246b63c8a867d), I
implemented a workaround by 'flattening' the sub-fields of the
multi_field into top-level fields. E.g., instead of:
{
  "tweet" : {
    "properties" : {
      "name" : {
        "type" : "multi_field",
        "fields" : {
          "name" : {"type" : "string", "index" : "analyzed"},
          "untouched" : {"type" : "string", "index" : "not_analyzed"}
        }
      }
    }
  }
}

I now have 2 top-level fields:

{
  "tweet" : {
    "properties" : {
      "name" : {"type" : "string", "index" : "analyzed"},
      "name_untouched" : {"type" : "string", "index" : "not_analyzed"}
    }
  }
}

My question: does it actually make a difference, in terms of underlying
storage efficiency in elasticsearch or runtime query performance,
whether I use the multi_field representation or the 2 separate fields
representation? E.g., does ES perform any optimizations that make
multi_field preferable when it is semantically appropriate? And if the
answer is no (there is no difference in performance/efficiency), then
under what circumstances should multi_field be used?

kimchy · October 18, 2011, 5:30pm

The main difference with what you specified, with two explicit mappings for
name and name_untouched, is that you need to repeat the "name" value twice
in the JSON document you index, which makes it bigger. The multi_field type
reuses the same name value in the JSON.
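
To make the point above concrete, here is a minimal sketch of the two document shapes a client would send for indexing. The sample value is made up for illustration, and the size comparison only covers the JSON payload, not what ends up in the index.

```python
import json

# Hypothetical sample value, chosen only for illustration.
name_value = "some example name"

# With the multi_field mapping, the document carries the value once;
# both "name" and "name.untouched" are indexed from it.
multi_field_doc = {"name": name_value}

# With two top-level fields, the client has to send the value twice.
flattened_doc = {"name": name_value, "name_untouched": name_value}

multi_field_size = len(json.dumps(multi_field_doc))
flattened_size = len(json.dumps(flattened_doc))

# The flattened document is larger by roughly the length of the repeated
# value plus the extra key.
print("multi_field document bytes:", multi_field_size)
print("flattened document bytes:  ", flattened_size)
```

The difference grows with the length of the field value, so it matters most for long strings or for documents with many such duplicated fields.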

