Storing a floating-point value in a long field: what, no mapping error?

If I attempt to store a string value, such as "hello", in a field with an existing data type (mapping) of long, Elasticsearch unsurprisingly reports an error (number format exception).

However, I can store a floating-point value, such as 1.5, in a long field with—apparently, as far as I can tell (hence this topic)—no problem. The Elasticsearch log contains no error for this; I can get the ingested document, and the field has the original floating-point value (it hasn’t, for instance, been silently rounded or truncated to an integer).

Why doesn’t Elasticsearch report an error when I store a floating-point value in a long field?

What are the potential repercussions (in Elasticsearch or Kibana) of storing in Elasticsearch a floating-point value in a field that is mapped to the long data type?

Hi,

By default, Elasticsearch coerces floating-point values to integers when the field is mapped to an integer type such as long. You can turn this off by setting index.mapping.coerce to false. Internally the value is truncated, which you can check by requesting the doc values for that field in your index:

GET your_index/_search
{ 
    ... match your document ...,
    "docvalue_fields": ["fieldName"]
}

You see the "original" value in the "_source" because we store the whole document string before any analysis takes place.
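
For reference, here is a minimal sketch of turning coercion off for a single field. It assumes the typeless mapping syntax of Elasticsearch 7 and later, and reuses the placeholder names your_index and fieldName from the request above. With coerce disabled on a long field, indexing a value that has a fractional part, such as 1.5, should be rejected instead of silently truncated:

```console
PUT your_index
{
  "mappings": {
    "properties": {
      "fieldName": {
        "type": "long",
        "coerce": false
      }
    }
  }
}

PUT your_index/_doc/1
{
  "fieldName": 1.5
}
```

The field-level coerce parameter overrides the index-level index.mapping.coerce setting, so you can be strict on one field and leave the default in place for the rest.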

Thank you for the reply, much appreciated.

As you say, if I use the search that you cite, the response contains the truncated doc value. For example, for a field that is mapped to long:

"_source" : {
  ...
  "myfield" : 3.14159
  ...
},
"fields" : {
  "myfield" : [3]
}

Are the Elastic docs wrong?

From the Elastic docs for doc_values:

Doc values [...] store the same values as the _source

That statement is demonstrably false, as shown in the example response above. The doc value is 3, but the _source value is 3.14159.

Perhaps I’m still 🙂 missing something; or perhaps I’m applying a too-literal interpretation to the phrase “same values”.

Do you think that statement in the docs is correct? Or do you think it requires rewording, or qualification?

An example

To confirm that I’ve understood what you’ve told me...

In an index where the field myfield is mapped to the data type long, I have two documents: in one document, the _source value of myfield is the integer value 12; in the other document, the _source value of myfield is the floating-point value 12.5.

If I send the following request:

GET my_index/_search
{
  "query": {
    "range" : {
      "myfield" : {
        "gte" : 12.2,
        "lte" : 12.8
      }
    }
  }
}

I get zero hits. That is, it doesn’t find the 12.5.

If I’ve understood you correctly, this is because Elasticsearch is applying the query to the doc values that have been truncated due to the long mapping. Am I right?
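
One way to test that explanation, sketched here on the assumption that both documents hold 12 for myfield once the 12.5 has been truncated: query a range that brackets the truncated value rather than the original. If the reasoning holds, this should return both documents, including the one whose _source shows 12.5, and the docvalue_fields entry should show 12 for each:

```console
GET my_index/_search
{
  "query": {
    "range": {
      "myfield": {
        "gte": 12,
        "lte": 12
      }
    }
  },
  "docvalue_fields": ["myfield"]
}
```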

Kibana

For an index containing _source values for myfield of 3.14159, 12, and 12.5, a Kibana search with myfield added as a column (that is, the Kibana URL specifies columns:!(myfield)) shows the following column values:

3.142
12.5
12

So, it seems that Kibana is displaying values from _source (albeit rounded to the precision specified by the default pattern for the number format), not the doc values.

However, entering the following in the search field of the Discover page:

myfield:(>12 AND <13)

gets “No results found”, because that search is based on the doc values.

I don’t mean to try your patience—you’ve already been very helpful; thanks again—but can you shed any light on why Kibana does not show doc values? Because it seems to me there’s a disconnect between the (_source) values that Kibana shows and the (doc) values that are used to determine what Kibana shows.

Regarding the docs: I think they are assuming that you are indexing into a field with the correct type and are not relying on coercion (which can be disabled by setting coerce to false, as mentioned above).

Regarding the range query: that is correct.

Regarding Kibana: I'm afraid I don't know; that question might be better answered in the Kibana forum.
