We have a mapping in which one of the fields is an integer, and we want to
change it to a double. We would like to avoid re-indexing, since there will
be a lot of documents at migration time. We are therefore considering a
"multi_field" (now apparently deprecated, but I assume the same applies to
the "fields" of a property), so that the field is indexed both as an
integer and as a double. On the day of migration, all old documents will
only have the integer value set, and all new documents will have the double
value set. In our code, we will only treat this value as a double. We have
done some testing and it seems to work, but I would like to confirm that
our findings are expected behavior rather than something that works "by
chance". Let's call the field "the_field" and the double sub-field
"double", so that the double value is addressed as "the_field.double".
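The mapping snippet itself is missing from this copy of the post. As a rough sketch of what the description implies, here it is as a Python dict that can be serialized to a JSON mapping body. The type name "my_type" is only a placeholder (an assumption, not taken from our setup); "the_field" and its "double" sub-field are the names used in this post.

```python
import json

# Sketch of the mapping described above. "my_type" is a placeholder type
# name; only "the_field" and its "double" sub-field come from the post.
mapping = {
    "my_type": {
        "properties": {
            "the_field": {
                "type": "integer",  # existing main field, left unchanged
                "fields": {
                    # sub-field, addressed in queries as "the_field.double"
                    "double": {"type": "double"}
                },
            }
        }
    }
}

# Serialize to the JSON body that would be sent as the mapping.
mapping_json = json.dumps(mapping, indent=2)
print(mapping_json)
```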
Without changing our indexing code, when writing a double value to
"the_field", the value is automatically written to the field as a double.
When fetching the document back, the value of "the_field" is a double. New
and old documents look the same: old documents have integer values and new
documents have double values for "the_field". I would have expected new
documents to have "the_field.double" in the result instead, but this does
not seem to be the case (which is good for us, if that is intended).
When querying "the_field", say with a range query, both old and new
documents appear in the result, but the fractional part of the value in the
new documents is ignored. So 2.534 is treated as the value 2 in the range
(or in sorting). This means that if the range is "lte: 2", then even double
values up to, but not including, 3 are included in the range.
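To make this observation concrete, here is the kind of range query body we mean, together with a small pure-Python model of the behavior we see. The model is an assumption based only on our tests (the fractional part is dropped before comparing); it does not claim to show how Elasticsearch implements this internally.

```python
import json

# Range query body against the integer main field.
query = {"query": {"range": {"the_field": {"lte": 2}}}}


def as_integer_field(value):
    """Model of what we observe: the fractional part is dropped."""
    return int(value)


def matches_lte(value, bound):
    """True if the value, seen as an integer, is within the upper bound."""
    return as_integer_field(value) <= bound


bound = query["query"]["range"]["the_field"]["lte"]
values = [1, 2, 2.534, 2.999, 3, 3.1]
matched = [v for v in values if matches_lte(v, bound)]

print(json.dumps(query))
print(matched)  # [1, 2, 2.534, 2.999]
```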
When querying "the_field.double", say with a range query, both old and new
documents appear in the result, and the values from both old and new
documents are treated as double values, as opposed to the previous example.
So if the range is "lte: 2", then only integer and double values of at most
2 are included in the range.
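The same query against the sub-field, again with a small model of what we observe (an assumption drawn from our tests, not a description of the internals): here values are compared as doubles, so nothing above the bound matches.

```python
import json

# Range query body against the double sub-field.
query = {"query": {"range": {"the_field.double": {"lte": 2}}}}


def matches_lte(value, bound):
    """Model of what we observe: the value is compared as a double."""
    return float(value) <= bound


bound = query["query"]["range"]["the_field.double"]["lte"]
values = [1, 2, 2.534, 2.999, 3]
matched = [v for v in values if matches_lte(v, bound)]

print(json.dumps(query))
print(matched)  # [1, 2]
```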
Are these observations correct and expected? Or are they a "side effect"
of some kind that we should not rely on? I also assume that the rules for
queries apply to aggregations as well? If this is in fact expected
behavior, is it possible to "alias" "the_field.double" to "the_field" in
queries, so that it is treated as the double value by default?
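In case no such alias exists on the server side, one workaround we could imagine is rewriting request bodies in our own client code before sending them. The sketch below is only that: alias_field is a hypothetical helper of ours, not an Elasticsearch feature, and the aggregation name "avg_value" is made up for the example.

```python
def alias_field(body, old="the_field", new="the_field.double"):
    """Return a copy of a request body with `old` replaced by `new`
    wherever it appears as a dict key or as a plain string value."""
    if isinstance(body, dict):
        return {
            (new if key == old else key): alias_field(value, old, new)
            for key, value in body.items()
        }
    if isinstance(body, list):
        return [alias_field(item, old, new) for item in body]
    if body == old:
        return new
    return body


# Example request using the field in a query, an aggregation and a sort.
request = {
    "query": {"range": {"the_field": {"lte": 2}}},
    "aggs": {"avg_value": {"avg": {"field": "the_field"}}},
    "sort": [{"the_field": "asc"}],
}

rewritten = alias_field(request)
print(rewritten)
```

A limitation of this approach is that it replaces every string equal to "the_field", so it would also rewrite an unrelated term value that happens to match the field name.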
I hate to bump this post, but I would really appreciate it if anyone has
any input regarding this.
Regards,
Nils-Helge Garli Hegvik
(Reply to the original post of Tuesday, November 25, 2014 4:20:49 PM UTC+1
by nil...@gmail.com, which appears in full above.)