Ignore_malformed broken for geo_shape

corndog · November 22, 2021, 1:02am

I have set mappings for my index as such:

"geometry": {
    "type": "geo_shape",
    "ignore_malformed": true
}

When I bulk POST documents, one fails due to a malformed geometry. I have set the "ignore_malformed" parameter to true, as mentioned in the docs.

Well I still get an exception thrown:

elasticsearch.helpers.errors.BulkIndexError: ('1 document(s) failed to index.', [{'index': {'_index': 'buildings', '_type': '_doc', '_id': 'VPSlEX0BWuwsi5mPMOb1', 'status': 400, 'error': {'type': 'mapper_parsing_exception', 'reason': 'failed to parse', 'caused_by': {'type': 'invalid_shape_exception', 'reason': 'Cannot determine orientation: signed area equal to 0. Points are collinear or polygon self-intersects.'}}, 'data': {'fid': 1761, 'OBJECTID': 280, 'Latitude': etc; etc

warkolm · November 22, 2021, 1:05am

What does the document that failed look like?

corndog · November 22, 2021, 1:07am

{"fid": 1761, "OBJECTID": 280, "Latitude": -####, "Longitude": ####, "searchable_building": "Building Name", "geometry": {"type": "MultiPolygon", "coordinates": [[[[#####]]]]}}

Sorry, I can't post the actual coordinates etc.; it's sensitive data. Nothing stands out to me about the geo_shape, though.
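One way to look for the offending ring locally, without sharing the coordinates, is to compute each ring's signed area with the shoelace formula and flag any ring where it comes out as 0, which is the condition the error message names. This is only a rough sketch: the function names are made up for illustration, the sample coordinates are invented, and Elasticsearch's own check may differ in detail (for instance in how it treats floating-point values).

```python
def ring_signed_area(ring):
    """Signed area of a closed ring of [x, y, ...] points (shoelace formula)."""
    total = 0.0
    for a, b in zip(ring, ring[1:]):
        # Only the first two ordinates (x, y) are used; any z value is ignored.
        total += a[0] * b[1] - b[0] * a[1]
    return total / 2.0


def degenerate_rings(geometry):
    """Return (polygon_index, ring_index) pairs whose signed area is exactly 0."""
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        return []  # other geometry types are not checked in this sketch
    found = []
    for p, polygon in enumerate(polygons):
        for r, ring in enumerate(polygon):
            if ring_signed_area(ring) == 0:
                found.append((p, r))
    return found


# Invented sample: one valid square and one ring whose points are collinear.
sample = {
    "type": "MultiPolygon",
    "coordinates": [
        [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
        [[[0, 0], [1, 1], [2, 2], [0, 0]]],
    ],
}
print(degenerate_rings(sample))
```

A ring flagged this way is either collinear or self-intersecting in a way that cancels its area, so it is a candidate for repair or removal before indexing.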

corndog · November 22, 2021, 1:32am

I figure this may be due to how the Python library integrates with Elasticsearch. I'd expect the mapping "ignore_malformed: true" to keep bulk indexing going even when it hits fields with geo_shape errors.

But it seems the Python library raises an exception by default, crashing your program regardless of your mapping, which is counterintuitive.

I've worked around this by passing the parameter "raise_on_error=False" in my Python code. It now runs to completion even with problem documents.

Eg:

helpers.bulk(client=ES, actions=actions, raise_on_error=False)
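With raise_on_error=False the failed documents are still rejected; they are just reported back instead of raised. By default the bulk helper returns a tuple of the success count and a list of error items, so the failures can be logged rather than silently dropped. Below is a minimal sketch of that; summarize_bulk_errors is a hypothetical helper name, and the item layout is assumed to match the error shown in the traceback above.

```python
def summarize_bulk_errors(errors):
    """Flatten bulk error items into (doc_id, error_type, reason) tuples."""
    summary = []
    for item in errors:
        # Each item is keyed by the action type, e.g. {"index": {...}}.
        for info in item.values():
            error = info.get("error", {})
            if not isinstance(error, dict):
                # Be defensive in case the error is reported as a plain string.
                summary.append((info.get("_id"), None, str(error)))
                continue
            cause = error.get("caused_by", {})
            reason = cause.get("reason", error.get("reason"))
            summary.append((info.get("_id"), error.get("type"), reason))
    return summary


# Intended use against a live cluster (not executed here):
#   success, errors = helpers.bulk(client=ES, actions=actions, raise_on_error=False)
#   for doc_id, error_type, reason in summarize_bulk_errors(errors):
#       print(doc_id, error_type, reason)
```

This keeps the run going while still leaving a record of which documents were not indexed and why.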

stephenb · November 22, 2021, 3:48am

Just to add a little bit of detail.

In an ingest pipeline, ignoring malformed input means the pipeline keeps processing. There could be many steps after that one, so it means: don't fail on that step and stop the whole processing.

Ingest pipelines run before the actual document is written into the index.

A mapping exception takes place when the document is actually written to the index.

My suspicion is that the document is malformed in such a way that, when Elasticsearch went to write it, there was a mapping exception. I think you may have already figured that out. I just thought I would remind anyone reading this that pipelines run before writing, and mapping exceptions happen on writing.

system · December 20, 2021, 3:49am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
