Indexing MultiPolygon with Elasticsearch

Hi there,

I'm using Elasticsearch 7.15 with GDAL 3.3.1 and trying to index the UK Land Registry Dataset.

I currently have these data stored in a PostgreSQL table and am trying to use ogr2ogr to create the ES index.

I've defined a mapping

"mappings" : {
        "_meta" : {
          "fid" : "ogc_fid"
        },
        "properties" : {
          "additional_proprietor_indicator" : {
            "type" : "text"
          },
          "address" : {
            "type" : "text"
          },
          "area" : {
            "type" : "double"
          },
          "change_indicator" : {
            "type" : "text"
          },
          "class_of_title" : {
            "type" : "text"
          },
          "company_registration_no_1" : {
            "type" : "text"
          },
          "company_registration_no_2" : {
            "type" : "text"
          },
          "company_registration_no_3" : {
            "type" : "text"
          },
          "company_registration_no_4" : {
            "type" : "text"
          },
          "country_incorporated_1" : {
            "type" : "text"
          },
          "country_incorporated_2" : {
            "type" : "text"
          },
          "country_incorporated_3" : {
            "type" : "text"
          },
          "country_incorporated_4" : {
            "type" : "text"
          },
          "county" : {
            "type" : "text"
          },
          "created_at" : {
            "type" : "text"
          },
          "date_proprietor_added" : {
            "type" : "text"
          },
          "district" : {
            "type" : "text"
          },
          "estate_interest" : {
            "type" : "text"
          },
          "multiple_address_indicator" : {
            "type" : "text"
          },
          "ogc_fid" : {
            "type" : "long"
          },
          "poly_id" : {
            "type" : "integer"
          },
          "polygons" : {
            "type" : "geo_shape"
          },
          "postcode" : {
            "type" : "text"
          },
          "price_paid" : {
            "type" : "text"
          },
          "proprietor_1_address_1" : {
            "type" : "text"
          },
          "proprietor_1_address_2" : {
            "type" : "text"
          },
          "proprietor_1_address_3" : {
            "type" : "text"
          },
          "proprietor_2_address_1" : {
            "type" : "text"
          },
          "proprietor_2_address_2" : {
            "type" : "text"
          },
          "proprietor_2_address_3" : {
            "type" : "text"
          },
          "proprietor_3_address_1" : {
            "type" : "text"
          },
          "proprietor_3_address_2" : {
            "type" : "text"
          },
          "proprietor_3_address_3" : {
            "type" : "text"
          },
          "proprietor_4_address_1" : {
            "type" : "text"
          },
          "proprietor_4_address_2" : {
            "type" : "text"
          },
          "proprietor_4_address_3" : {
            "type" : "text"
          },
          "proprietor_name_1" : {
            "type" : "text"
          },
          "proprietor_name_2" : {
            "type" : "text"
          },
          "proprietor_name_3" : {
            "type" : "text"
          },
          "proprietor_name_4" : {
            "type" : "text"
          },
          "proprietorship_category_1" : {
            "type" : "text"
          },
          "proprietorship_category_2" : {
            "type" : "text"
          },
          "proprietorship_category_3" : {
            "type" : "text"
          },
          "proprietorship_category_4" : {
            "type" : "text"
          },
          "recent_status" : {
            "type" : "text"
          },
          "region" : {
            "type" : "text"
          },
          "registered_status" : {
            "type" : "text"
          },
          "tenure" : {
            "type" : "text"
          },
          "title_number" : {
            "type" : "text"
          },
          "updated_at" : {
            "type" : "text"
          },
          "uprn" : {
            "type" : "text"
          }
        }
}

and am using the command

ogr2ogr -progress --config ES_OVERWRITE 1 -f "Elasticsearch" {ES_HOST} PG:"{DB_CONN_STR}" "{DB_TABLE}" -nln {INDEX_NAME}

but keep getting the error

{"index":{"_index":"core_landregistry","_type":"FeatureCollection","_id":"Fwx-YHwByMdODXOcCpWC","status":400,"error":{"type":"illegal_argument_exception","reason":"Invalid type: expecting [_doc] but got [FeatureCollection]"}}}

I'm not quite sure where to start to try to solve this... A colleague ran the same command with the same mapping and didn't get the error. We have identical versions of ES and GDAL installed and the database table we're indexing was created from the same SQL dump.

Has anyone experienced this error before?

Welcome to our community! :smiley:

I don't know what ogr2ogr is, but does it have an option to define the Elasticsearch _type? If it does then use _doc.

Hi Mark,

ogr2ogr is a very powerful tool for converting geospatial data between formats, and it comes packaged with GDAL. It does have a MAPPING_NAME option that I can use to set the mapping type, so I'll give that a go.

I am still confused about this solution though, I thought mapping types were removed in 7.15? My understanding was that the _type field was deprecated in 6.0.0 so how would it make any difference? Am I missing something?

Thanks!
Andy

Hi Andy,

I think you might want to add this flag

-lco MAPPING_NAME=_doc

to your GDAL command.
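
For reference, here is a rough sketch of the original command with the layer creation option appended. It passes `_doc` as the value because that is the type name the error message asks for. The connection values are hypothetical placeholders, and the `DRY_RUN` switch is only an illustration added here so the command can be reviewed before it touches the cluster; it is not an ogr2ogr feature.

```shell
# Hypothetical placeholder values (assumptions); replace with your own.
ES_HOST="${ES_HOST:-http://localhost:9200}"
DB_CONN_STR="${DB_CONN_STR:-dbname=landregistry host=localhost user=postgres}"
DB_TABLE="${DB_TABLE:-land_registry}"
INDEX_NAME="${INDEX_NAME:-core_landregistry}"

# DRY_RUN=1 (the default in this sketch) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

index_table() {
  # In dry-run mode, prefix the command with "echo" so nothing is executed.
  if [ "$DRY_RUN" = "1" ]; then runner="echo"; else runner=""; fi
  $runner ogr2ogr -progress \
    --config ES_OVERWRITE 1 \
    -f "Elasticsearch" "$ES_HOST" \
    "PG:$DB_CONN_STR" "$DB_TABLE" \
    -nln "$INDEX_NAME" \
    -lco MAPPING_NAME=_doc
}

index_table
```

Setting `DRY_RUN=0` before calling `index_table` runs the import for real.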

There are more details about using GDAL with Elasticsearch in this blog post.

Hi Nick,

Yeah, this is exactly the flag I was going to try (I don't have access today, so I will have to try tomorrow). I'm still confused about why it would work, though. The blog post you've linked to says:

The MAPPING_NAME parameter has no effect on Elasticsearch 7 and later because mapping types have been removed.

and we're using Elasticsearch 7.15 so not sure why I'm even getting a mapping type error in the first place!

Andy

Perhaps something has changed in GDAL 3.2 or 3.3. I wrote that blog post based on GDAL 3.1. It might be a regression that was missed. I don't believe GDAL runs functional tests against an actual Elasticsearch instance.
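
Since the behaviour seems to depend on versions, it may be worth confirming what each machine actually reports before digging further. The sketch below is one possible way to do that, assuming `curl` is available and the cluster answers unauthenticated requests on `ES_HOST` (a placeholder). The helper names `es_version_from_json` and `report_versions` are made up for this example.

```shell
# Hypothetical placeholder; replace with your own cluster URL.
ES_HOST="${ES_HOST:-http://localhost:9200}"

# Pull the "number" field out of the JSON that the cluster root endpoint
# returns, reading that JSON from standard input.
es_version_from_json() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Print the GDAL version and the Elasticsearch version side by side.
# Not called automatically, because it needs a reachable cluster.
report_versions() {
  echo "GDAL: $(ogr2ogr --version)"
  echo "Elasticsearch: $(curl -s "$ES_HOST" | es_version_from_json)"
}
```

Running `report_versions` on both machines and comparing the output would show whether the two setups really match.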