String to geo_point for kafka connect


(mark teehan) #1

I'm using the elastic kafka-connector to push topic contents to an elastic index. As the Confluent schema registry doesn't contain a "geo_point" type, I'm passing the lat,long as a string; but I've been unable to get it to automatically convert to a geo_point.

I'm unclear whether a template is needed, or whether a correctly formatted location string such as "location":"-1.2,53.1333" will be auto-detected and converted. I believe that this should happen automatically.

If I pre-create this template and then stream the events, it fails with the error below.

> curl --header "content-type: application/JSON" -XPUT localhost:9200/_template/st_location -d '
> {
>   "template": "st_location*",
>   "settings": {},
>   "mappings": {
>     "kafka-connect": {
>       "properties": {
>         "COL_LOCATION": {
>           "type": "geo_point"
>         }
>       }
>     }
>   }
> }'

This is the error. It implies that it is searching for a simple string of [?,?].

Caused by: java.lang.NumberFormatException: For input string: ""location":"8"
	at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043) ~[?:?]
	at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110) ~[?:?]
	at java.lang.Double.parseDouble(Double.java:538) ~[?:1.8.0_171]
	at org.elasticsearch.common.geo.GeoPoint.resetFromString(GeoPoint.java:93) ~[elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.index.mapper.GeoPointFieldMapper.parseGeoPointStringIgnoringMalformed(GeoPointFieldMapper.java:365) ~[elasticsearch-6.4.2.jar:6.4.2]

...

So maybe the template is the problem. I'd prefer the geo_point to be auto-converted without a template. After deleting the template, loading of the events succeeds, but as a string, not a geo_point. The logfile contains:
[2018-11-19T23:04:38,777][INFO ][o.e.c.m.MetaDataMappingService] [7wULmP3] [st_location/FqQs8i1_Thuc8H3M_ubQTg] create_mapping [kafka-connect]

but the mapping is a string:

> {
>   "mapping": {
>     "kafka-connect": {
>       "properties": {
>         "COL_LOCATION": {
>           "type": "text",
>           "fields": {
>             "keyword": {
>               "type": "keyword",
>               "ignore_above": 256
>             }
>           }
>         }
>       }
>     }
>   }
> }

This is a sample row pattern after loading an event without the template (as a string)

COL_LOCATION:"location":"-1.2,53.1333"
_id:ST_LOCATION+7+4
_type:kafka-connect
_index:st_location
_score:1

I can emit the location string in any format (I'm using kSQL to build the string).

My question: what format should the string be in so that it is parsed and auto-converted to a geo_point?


(Igor Motov) #2

geo_point cannot be dynamically mapped based on the format of the input. You have to use either an index template or a dynamic mapping template.
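
For illustration, here is a rough sketch of what an index template with a dynamic mapping template could look like, written as a small Python script that just builds and prints the JSON body. The helper name `build_geo_point_template`, the template name `location_strings_as_geo_point`, and the `*_LOCATION` field pattern are assumptions made up for this example. It also uses `index_patterns`, which is the 6.x name for the `template` key in the original request. Treat it as a starting point, not a tested configuration.

```python
import json


def build_geo_point_template(index_pattern, doc_type, field_pattern):
    """Build an index template body (hypothetical helper, not an Elasticsearch API).

    Any string field whose name matches `field_pattern` is mapped as geo_point.
    """
    return {
        "index_patterns": [index_pattern],
        "mappings": {
            doc_type: {
                "dynamic_templates": [
                    {
                        # The template name is arbitrary; this one is an assumption.
                        "location_strings_as_geo_point": {
                            "match_mapping_type": "string",
                            "match": field_pattern,
                            "mapping": {"type": "geo_point"},
                        }
                    }
                ]
            }
        },
    }


# Values taken from the original post; the "*_LOCATION" pattern is an assumption.
template = build_geo_point_template("st_location*", "kafka-connect", "*_LOCATION")
print(json.dumps(template, indent=2))
```

The printed JSON could then be sent with the same `curl -XPUT localhost:9200/_template/st_location` call shown in the question.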

I think the error implies that you were trying to index, as a geo_point, a string that contained the word "location" and the number 8.

Basically, instead of indexing "location":"-1.2,53.1333", you need to index just -1.2,53.1333.
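
To make the difference concrete, here is a minimal Python sketch that strips the `"location":` wrapper and checks that what remains is a bare "lat,lon" string, which is the string form geo_point accepts. The function `to_geo_point_string` is a hypothetical helper written for this example; it is not part of Elasticsearch, Kafka Connect, or KSQL, and the same cleanup would more naturally be done where the string is built.

```python
import re

# Matches an optional wrapper such as "location":"<value>" (assumed input shape).
_WRAPPED = re.compile(r'^\s*"?location"?\s*:\s*"?([^"]*)"?\s*$')


def to_geo_point_string(raw):
    """Return a bare "lat,lon" string, or raise ValueError if it cannot be parsed."""
    match = _WRAPPED.match(raw)
    value = match.group(1) if match else raw.strip().strip('"')

    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'lat,lon', got {raw!r}")

    # float() raises ValueError on non-numeric text, much like the
    # NumberFormatException in the stack trace above.
    lat, lon = float(parts[0]), float(parts[1])
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return f"{lat},{lon}"


print(to_geo_point_string('"location":"-1.2,53.1333"'))
```

Note that the string form is latitude first, then longitude, so -1.2,53.1333 is read as latitude -1.2 and longitude 53.1333.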