Elasticsearch returns different values from DSL and SQL query

pkshara · January 1, 2021, 8:08am

I am trying to query index using dsl and sql method, while comparing the results i found that the values are different.Below are the query and required details.

DSL query and result

SQL query and result

Result from Kibana

Values of chargeInUSD from both the result set differs. I am trying to understand the logic behind the value changes for it.

TIA

ylasri · January 1, 2021, 8:43am

May be i'm blind but i see exactly the same results

pkshara · January 1, 2021, 10:02am

Value of ChargeInUSD is different.

docId:ad2a5fb4bf83f9b71947d8acf1a3a28b58d9115a
value in SQL query = 472.4606018066406
dsl query = 472.4606038057

docID:c806e6213793f13726ed2cdb95b439b97e5011ec
value in SQL = 195.50082397460938
dsl query = 195.5008179837

highlighted the differed value .

pkshara · January 5, 2021, 6:51am

@ylasri Values are different.. have highlighted them

warkolm · January 5, 2021, 7:10am

Please don't post pictures of text, they are difficult to read, impossible to search and replicate (if it's code), and some people may not be even able to see them

It'd help a lot if you copied the queries and responses and posted them in the topic.

pkshara · January 5, 2021, 9:17am

Query in DSL

GET prv_charge_data_extract/_search
{
"_source": ["chargeInUSD","chargeInCntryCurr","exchangeRate"],
"query": {
"terms": {
"docID": [
"c806e6213793f13726ed2cdb95b439b97e5011ec",
"ad2a5fb4bf83f9b71947d8acf1a3a28b58d9115a"
]
}
}
}

Result -:
{
"_index" : "prv_charge_data_extract",
"_type" : "_doc",
"_id" : "ad2a5fb4bf83f9b71947d8acf1a3a28b58d9115a",
"_score" : 1.0,
"_source" : {
"chargeInUSD" : 472.4606038057,
"exchangeRate" : 0.817084,
"chargeInCntryCurr" : 386.04
}
},
{
"_index" : "prv_charge_data_extract",
"_type" : "_doc",
"_id" : "c806e6213793f13726ed2cdb95b439b97e5011ec",
"_score" : 1.0,
"_source" : {
"chargeInUSD" : 195.5008179837,
"exchangeRate" : 0.735956,
"chargeInCntryCurr" : 143.88
}
}

Query IN SQL -:
GET _sql?format=txt
{
"query":"""
SELECT docID, chargeInUSD, chargeInCntryCurr,exchangeRate FROM prv_charge_data_extract
WHERE docID in ('c806e6213793f13726ed2cdb95b439b97e5011ec','ad2a5fb4bf83f9b71947d8acf1a3a28b58d9115a')
"""
}

Result -:
docID | chargeInUSD |chargeInCntryCurr| exchangeRate
----------------------------------------+------------------+-----------------+------------------
ad2a5fb4bf83f9b71947d8acf1a3a28b58d9115a|472.4606018066406 |386.0400085449219|0.817084014415741
c806e6213793f13726ed2cdb95b439b97e5011ec|195.50082397460938|143.8800048828125|0.7359560132026672

the bold highlighted values differs in SQL and DSL results. After i tried it many queries for different indices, i observed that the values for float datatype filed gives different result.

warkolm · January 5, 2021, 9:19am

What is the mapping on the chargeInUSD field.

Please format your code/logs/config using the </> button, or markdown style back ticks. It helps to make things easy to read which helps us help you

pkshara · January 5, 2021, 9:21am

chargeInUSD is an float datatype field

Andrei_Stefan · January 6, 2021, 10:56am

What you get from Elasticsearch in query DSL is the _source meaning the exact "text" those documents were indexed with. What you get from SQL is very likely to be the value that was actually indexed. Try testing the same query DSL but adding docvalue_fields to it and put there your chargeInUSD field and see what you get back. Documentation about docvalue_fields can be checked here.

bogdan.pintea · January 11, 2021, 10:53am

@pkshara,
The difference stems from the fact that the floating point types have a discrete resolution: not all numbers within their value range can be represented with them. 472.4606038057 is such a number and the closest value that Elasticsearch JVM can assign to it is the one you got back.

You can improve the precision of your values by mapping these fields as double instead of float and that should likely cover your precision needs (but still does't rule out issues like this towards the end of the precision range, after some 15 decimal digits).

system · February 8, 2021, 10:53am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
The query results is different via kibana console and python api Elasticsearch	4	886	April 4, 2019
Mismatch results between kibana and python dsl Elasticsearch	1	466	January 30, 2019
Different output on query Elasticsearch language-clients	2	386	March 24, 2021
Why i got different result from these two DSL? Elasticsearch	2	359	January 17, 2017
SQL Vs. Query DSL Elasticsearch	2	694	March 28, 2022

Elasticsearch returns different values from DSL and SQL query

Related Topics