Note: in your example I think there is a typo, as the second PUT goes to a different index, my-index-2. Most likely my-index-2 has auto-created a text + keyword field, unless you also explicitly created it there as type long.
That being said, when you run a search, "_source" will return the exact JSON payload that you sent. Both 2 and "2" are acceptable, and the field will store a number in either case since it is already declared as a long.
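For example (a minimal sketch reusing the my-index/test/testId setup from your steps; the document ids 1 and 2 are just placeholders):

PUT my-index/test/1
{
  "testId": 2
}

PUT my-index/test/2
{
  "testId": "2"
}

# both documents are accepted and indexed as long values
GET my-index/_search
# in the hits, _source shows "testId": 2 for the first document and
# "testId": "2" for the second, exactly as they were sent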
If you try this though, the document will be rejected:
PUT my-index/test/3
{
  "testId": "three"
}
# returns an error like: failed to parse field [testId] of type [long] in document with id '3'. Preview of field's value: 'three'
You can also check the mapping and run a max aggregation to confirm that the field is still a long and that the indexed value is numeric (the datatype cannot change on an existing field):
# returns long
GET my-index/_mapping/test/field/testId

# returns 2.0
GET my-index/_search?size=0
{
  "aggs": {
    "max_testid": {
      "max": {
        "field": "testId"
      }
    }
  }
}
Yes, v6.x is clear, as declaring a mapping with a document type is deprecated in v6, and yes, I did suspect a typo. Let me know if you need clarification; the previous answer should cover that the "2" is just coming from the JSON payload you sent, and the indexed value is a long, not a string, since the field has the long datatype (you cannot change the datatype of an existing field without recreating the index and reindexing).
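If the datatype really does need to change, the usual path is to create a new index with the desired mapping and reindex into it. A rough sketch, assuming 6.x-style mappings with a type name (the index name my-index-v2 is made up for illustration):

PUT my-index-v2
{
  "mappings": {
    "test": {
      "properties": {
        "testId": { "type": "long" }
      }
    }
  }
}

POST _reindex
{
  "source": { "index": "my-index" },
  "dest": { "index": "my-index-v2" }
}
# note: _reindex copies the original _source as-is, so if you also want to
# normalize stored values like "2" to 2, a reindex script or an ingest
# pipeline would be needed (not shown here)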
Thanks
I'm not sure you got me right (or I didn't get you right). The typo I mentioned was only in the example I posted. I then fixed it, and in my second comment I shared the correct reproduction steps and the responses I get from Elasticsearch.
Elasticsearch accepts both the value 1 and "2" for the same testId field in the same index my-index (which is mapped as long), and stores the first one as a long and the second one as a string.
As you mentioned in your first response, it won't accept a value like "three", which proves that it does perform a validation check, but at the end of the day it stores the "2" as the string "2".
I would expect either to get an exception thrown or for it to be stored as 2 (a long value, without the quotes).
I know 6.8 is going end of life soon. I haven't tested this in 7.x yet to see if it behaves the same.
When a normal search is performed, Elasticsearch returns the source document as _source, i.e. the actual source document that was indexed. The indexed values themselves are stored elsewhere (for example in doc_values), so, as you can see below, a single search can return both. In this case, when you send in a "2", Elasticsearch can properly convert it to a number upon indexing, but it does not alter the source; that is all that is happening. If you want the source to be correct, then the correct source needs to be indexed. Elasticsearch does not alter the source document; it is not a compiler like C++. If it can index the value it will, and it has some leniency to ease compatibility with JSON and to ease ingestion. I think that is perhaps considered a feature, not a bug.
However, when you look at the actual doc_values, you can see that both values are in fact numeric.
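For example, a search that requests docvalue_fields alongside _source (a sketch, assuming the my-index/testId setup above) shows both representations side by side:

GET my-index/_search
{
  "_source": ["testId"],
  "docvalue_fields": ["testId"],
  "query": {
    "match_all": {}
  }
}
# for the document that was sent as "2", the hit contains
#   "_source": { "testId": "2" }   <- exactly the payload that was sent
#   "fields":  { "testId": [2] }   <- the numeric doc_value that was indexed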
Thanks @stephenb.
We came to notice this behavior when we accidentally started indexing the numbers as strings and, as with the aggregation in your example, the aggs results (which came back as numbers) failed to equal the values from _source (which were coming back as strings).
IMO this behavior is tricky and confusing. But I agree with you that it gives some sort of tolerance.