Hi
Problem Description
We want to migrate from Elasticsearch 7.x to 8. As a first step we upgraded to 7.17.5 and resolved the issues.
One of them was the location field, which has a geo_shape mapping with strategy: recursive. To resolve it, we took a backup (backup-of-index-surat, which also has an explicit geo_shape mapping with strategy: recursive) and reindexed from it into an index with an explicit mapping (dynamic: false) in which location is a plain geo_shape field without any strategy parameters. All documents were reindexed (the counts matched).
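For reference, a minimal sketch of the setup described above, in console syntax. The new index name `index-surat` is a placeholder (only the backup name is given here), and the `keyword` and `date` types for `id` and `observationDateTime` are assumptions. With dynamic: false, any field used in a query has to be mapped explicitly, so they are included.

```console
# Backup index: prefix-tree based geo_shape (the strategy parameter is deprecated in 7.x)
PUT backup-of-index-surat
{
  "mappings": {
    "properties": {
      "location": { "type": "geo_shape", "strategy": "recursive" }
    }
  }
}

# New index: plain geo_shape without strategy params (index name is a placeholder)
PUT index-surat
{
  "mappings": {
    "dynamic": false,
    "properties": {
      "id": { "type": "keyword" },
      "observationDateTime": { "type": "date" },
      "location": { "type": "geo_shape" }
    }
  }
}

# Copy all documents from the backup into the new index
POST _reindex
{
  "source": { "index": "backup-of-index-surat" },
  "dest": { "index": "index-surat" }
}
```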
The number of documents returned by geo_shape (circle, bbox, polygon) queries differs between the older mapping and the newer mapping for the same data. The following two queries show this:
- Linestring query: on our data this returns around 40K docs with the older mapping and zero docs with the newer one.
GET backup-of-index-surat/_count
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "id": [
              "iisc.ac.in/89a36273d77dac4cf38114fca1bbe64392547f86/rs.iudx.io/surat-itms-realtime-information/surat-itms-live-eta"
            ],
            "boost": 1
          }
        },
        {
          "geo_shape": {
            "location": {
              "shape": {
                "type": "linestring",
                "coordinates": [
                  [72.842, 21.2],
                  [72.923, 20.8]
                ]
              },
              "relation": "intersects"
            }
          }
        },
        {
          "range": {
            "observationDateTime": {
              "from": "2020-10-12T00:00:00.000Z",
              "to": "2020-10-22T00:00:00Z",
              "include_lower": true,
              "include_upper": true,
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}
- geo_shape circle query: this returns around 4K docs with the older mapping and only 2K with the newer mapping.
GET backup-of-index-surat/_count
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "id": [
              "iisc.ac.in/89a36273d77dac4cf38114fca1bbe64392547f86/rs.iudx.io/surat-itms-realtime-information/surat-itms-live-eta"
            ],
            "boost": 1
          }
        },
        {
          "geo_shape": {
            "location": {
              "shape": {
                "radius": "10.0m",
                "type": "Circle",
                "coordinates": [72.834, 21.178]
              },
              "relation": "within"
            }
          }
        },
        {
          "range": {
            "observationDateTime": {
              "from": "2020-10-12T00:00:00.000Z",
              "to": "2020-10-22T00:00:00Z",
              "include_lower": true,
              "include_upper": true,
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}
What do we expect/Questions
- Is reindexing from the prefix-tree-based geo_shape to the BKD-based geo_shape the right way to make the index compatible with version 8?
- Why does the same geo_shape query return drastically different numbers of docs for the same data, when the only change is how it is stored in Elasticsearch? Isn't the precision guaranteed by the BKD tree (7 decimal places) higher than that of the stored data values (6 decimal places)? Do we need to change the query to work with the newer mapping?
- We expect the geo_shape queries to return the same number of docs after migrating to the BKD-tree-based representation.
Some relevant details:
Elasticsearch version: 7.17.5, running as a Docker container
Data: all docs in the indexed dataset above have Point-type geo_shape location data. The latitude and longitude values have a precision of 6 decimal places.
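A single-document check along the following lines might help isolate the difference. This is only a sketch: the index name `geo-test-bkd` is hypothetical, and the point simply reuses the circle centre from the query above rather than a value taken from the dataset. Repeating the same steps against an index whose location mapping has strategy: recursive would give a side-by-side comparison.

```console
# Hypothetical test index with the plain (BKD-based) geo_shape mapping
PUT geo-test-bkd
{
  "mappings": {
    "dynamic": false,
    "properties": {
      "location": { "type": "geo_shape" }
    }
  }
}

# One illustrative Point document (coordinates are [lon, lat])
PUT geo-test-bkd/_doc/1?refresh=true
{
  "location": { "type": "Point", "coordinates": [72.834, 21.178] }
}

# Same circle shape and relation as in the query above
GET geo-test-bkd/_count
{
  "query": {
    "geo_shape": {
      "location": {
        "shape": {
          "type": "Circle",
          "radius": "10.0m",
          "coordinates": [72.834, 21.178]
        },
        "relation": "within"
      }
    }
  }
}
```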
Please let me know if anything else is needed.