The _ignored field not working?

Summary
I'm using ignore_malformed, but get no results when searching for the _ignored field.

Details

I've isolated the problem to a small example. I:

  1. Create the index with ignore_malformed set to true.
  2. Insert a couple of documents with invalid values.
  3. Observe that both documents are in the index as expected.
  4. Search for the _ignored field. I expected some results but got none.

Here's the test code:

PUT /test_points
{
  "settings": {
    "index.mapping.ignore_malformed": true
  },
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "start_point": { "type": "geo_point" }
    }
  }
}

POST /test_points/_doc/
{
  "timestamp": "2022-07-23T11:42:00",
  "start_point": "120.0,120.0"
}

POST /test_points/_doc/
{
  "timestamp": "2022-07-23T11:43:00",
  "start_point": [130.0, 130.0]
}

Here are the unexpected results mentioned above:

GET /test_points/_search
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_points",
        "_id" : "p2QvLIIBVKX1Q3v1B-HR",
        "_score" : 1.0,
        "_source" : {
          "timestamp" : "2022-07-23T11:42:00",
          "start_point" : "120.0,120.0"
        }
      },
      {
        "_index" : "test_points",
        "_id" : "qGQvLIIBVKX1Q3v1EOGC",
        "_score" : 1.0,
        "_source" : {
          "timestamp" : "2022-07-23T11:43:00",
          "start_point" : [
            130.0,
            130.0
          ]
        }
      }
    ]
  }
}

The start_point is returned as "120.0,120.0" for one document and as [130.0, 130.0] for the other. That is exactly what I sent, but shouldn't Elasticsearch normalize the values?

That's not a big deal, though. The bigger problem is this:

GET /test_points/_search
{
  "query": {
    "exists": {
      "field": "_ignored"
    }
  }
}
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

No hits. Shouldn't I get something as mentioned in the docs?

Hi @rokcarl

It is not working because both documents use formats that start_point, as a geo_point, supports: the string and the coordinate array.

"start_point": "120.0,120.0"

Geopoint expressed as a string with the format: "lat,lon".

"start_point": [130.0, 130.0]

Geopoint expressed as an array with the format: [lon, lat].
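
The ordering difference between the two formats is easy to miss, so here is a minimal Python sketch of it. The helper name to_lat_lon is made up for illustration and is not part of Elasticsearch; it only handles the two formats shown above and does no range checking.

```python
from typing import Sequence, Tuple, Union


def to_lat_lon(value: Union[str, Sequence[float]]) -> Tuple[float, float]:
    """Return (lat, lon) for the two geo_point formats shown above.

    Hypothetical helper for illustration only; it does not validate ranges.
    """
    if isinstance(value, str):
        # String form is "lat,lon".
        lat_text, lon_text = value.split(",")
        return float(lat_text), float(lon_text)
    # Array form is [lon, lat], the reverse of the string order.
    lon, lat = value
    return float(lat), float(lon)


print(to_lat_lon("41.12,-71.34"))   # (41.12, -71.34)
print(to_lat_lon([-71.34, 41.12]))  # (41.12, -71.34)
```

Both calls describe the same point, even though the numbers appear in opposite order in the request body.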

Try indexing this document:

POST /test_points/_doc/
{
  "timestamp": "2022-07-23T11:42:00",
  "start_point": 10
}

The query below will then work:

GET test_points/_search
{
  "_source": false, 
  "query": {
    "exists": {
      "field": "_ignored"
    }
  }
}
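
If you build the request body from application code, a rough Python sketch could look like the one below. The function names are my own for illustration. The second function uses a term query on _ignored, which that field also supports, to narrow the search to documents where one specific field was ignored.

```python
import json


def ignored_exists_query() -> dict:
    # Matches documents in which at least one field was ignored.
    return {"_source": False, "query": {"exists": {"field": "_ignored"}}}


def ignored_term_query(field_name: str) -> dict:
    # Matches documents in which the named field was ignored.
    return {"_source": False, "query": {"term": {"_ignored": field_name}}}


# Print the bodies so they can be pasted into a console request.
print(json.dumps(ignored_exists_query(), indent=2))
print(json.dumps(ignored_term_query("start_point"), indent=2))
```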

Read more in Geopoint Type.


Hmm, I'm not sure we're on the same page.

I wasn't surprised that Elasticsearch allowed me to index both documents. I was surprised that the search results returned start_point in different formats and, more importantly, that the values were saved at all and that the documents didn't show up in the _ignored search. The value is not a valid point, because 120.0 is not a valid latitude.
See here, with an index that does not set ignore_malformed:

DELETE /test_points_reject

PUT /test_points_reject
{
  "settings": {
  },
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "start_point": { "type": "geo_point" }
    }
  }
}

POST /test_points_reject/_doc/
{
  "timestamp": "2022-07-22T15:31:00",
  "start_point": "120.0,120.0"
}

Result:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse field [start_point] of type [geo_point]"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse field [start_point] of type [geo_point]",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "illegal latitude value [120.0] for start_point"
    }
  },
  "status" : 400
}
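
To spell out what I mean by "not a valid point", here is a minimal Python sketch of the range check that the error message implies. The function name check_geo_point is my own and the code is an assumption about the rule (latitude within -90..90, longitude within -180..180), not Elasticsearch's actual implementation.

```python
def check_geo_point(lat: float, lon: float) -> None:
    """Raise ValueError if the coordinates fall outside the usual ranges.

    Illustrative only: a sketch of the rule, not Elasticsearch's code.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"illegal latitude value [{lat}]")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"illegal longitude value [{lon}]")


# The second pair is the "120.0,120.0" value from my test documents.
for lat, lon in [(41.12, -71.34), (120.0, 120.0)]:
    try:
        check_geo_point(lat, lon)
        print(lat, lon, "ok")
    except ValueError as err:
        print(lat, lon, "rejected:", err)
```

By this rule both of my test documents have an out-of-range latitude, which is why I expected them to be flagged in _ignored.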

Do you know what might be happening here?

Anyone?