Problems with Geo_point type and doc_values

Hi all,

I have a very strange problem with geo_point type and "script_fields" or "fielddata_fields" search queries.

What I want to do is retrieve the latitude and longitude from a geo_point field created by a custom plugin that computes and indexes the centroid of a shape (https://github.com/opendatasoft/elasticsearch-plugin-geoshape). By definition, this geo_point is not in the document source, so to get it in the result I need to use fielddata_fields or script_fields.

1 ) fielddata_fields
It works fine with one node (or one shard), but since the geo_point type can't be serialized, an exception is raised with two or more nodes. Is this actually a bug? A working patch could be: https://gist.github.com/clement-tourriere/2aa7219bd1da96393cbd.
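For readers following along, here is a minimal sketch of the kind of request body meant in case 1. It only builds the JSON body and sends nothing. The field name `my_shape.centroid` is taken from the script example in this post; the `match_all` query and the helper name are assumptions made for illustration.

```python
import json

# Field name as used elsewhere in this post.
CENTROID_FIELD = "my_shape.centroid"


def fielddata_fields_body(field):
    """Build a search body asking for a field's fielddata values.

    Hypothetical helper: the match_all query is an assumption, any
    query could be used in its place.
    """
    return {
        "query": {"match_all": {}},
        "fielddata_fields": [field],
    }


if __name__ == "__main__":
    print(json.dumps(fielddata_fields_body(CENTROID_FIELD), indent=2))
```

The exception described above would appear only when the fetch phase runs on a remote node, which is why a single-node test does not show it.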

2 ) Using script_fields
When using doc['my_shape.centroid'].value, I have the same problem as above (same code used internally with getScriptValue()).
When using doc['my_shape.centroid'].lat and doc['my_shape.centroid'].lon, it works fine with a simple request. When performing parallel queries on at least 10000 documents, it raises an exception: https://gist.github.com/clement-tourriere/855a24f150efb06e8fec.
I have run a lot of tests to be able to reproduce it, and here is my conclusion.

I started by indexing a geo_point field with doc_values set to true. A full example can be found here (written in Python, with Celery for parallel indexing/requests): https://gist.github.com/clement-tourriere/82b27c25d5876ae53eef

Everything is working just fine.

The next example uses a geo_point defined as an external_value with doc_values set to true. (I have written a simple class for that, or you can use the geo_shape plugin to test this case): https://gist.github.com/clement-tourriere/1ffe9eba5bfa0070e5d2.

With the same use case (10 parallel requests on 100000 documents), it is now raising the groovy bufferOverflow exception.
If I remove doc_values from the external geo_point definition, it works fine again.
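To make the setup above concrete, here is a rough sketch of the mapping and of the script_fields request involved. It only builds the JSON bodies. The type name `my_type`, the output names `centroid_lat` / `centroid_lon`, the `match_all` query and both helper names are assumptions for illustration. The external_value part lives in the plugin and is not reproduced here.

```python
import json


def geo_point_mapping(doc_type, field, doc_values):
    """Build a mapping body for a plain geo_point field.

    doc_values is the toggle discussed above; doc_type and field are
    hypothetical names chosen by the caller.
    """
    return {
        doc_type: {
            "properties": {
                field: {"type": "geo_point", "doc_values": doc_values}
            }
        }
    }


def lat_lon_script_fields_body(field):
    """Build a search body that reads .lat and .lon through script_fields."""
    return {
        "query": {"match_all": {}},  # assumption: any query would do
        "script_fields": {
            "centroid_lat": {"script": "doc['%s'].lat" % field},
            "centroid_lon": {"script": "doc['%s'].lon" % field},
        },
    }


if __name__ == "__main__":
    print(json.dumps(geo_point_mapping("my_type", "centroid", True), indent=2))
    print(json.dumps(lat_lon_script_fields_body("my_shape.centroid"), indent=2))
```

In my tests the exception shows up only when several of these searches run in parallel against the external geo_point with doc_values enabled.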

Could you please help me with that?

Thank you again for your incredible work.

Regards,
Clément

Regarding #1, can you provide the error you see?

Sorry, I didn't put a gist to reproduce it (for the #1).

The error is :

"RemoteTransportException[[Yukon Jack][inet[/192.168.1.43:9301]][indices:data/read/search[phase/fetch/id]]]; nested: IOException[Can't write type [class org.elasticsearch.common.geo.GeoPoint]]; "

Here is the gist: https://gist.github.com/clement-tourriere/8d06a0985f859666ae53

I just did that with 3 points, but no exception.

What version are you on?

I have tried with elasticsearch 1.6.2 and 2.0 beta1.

You need a cluster with at least two nodes to see the exception.

Do you still get it with ES 1.7.1?

Yes, I still get it with ES 1.7.1.

The full stack is :

org.elasticsearch.transport.RemoteTransportException: [Scream][inet[/192.168.1.43:9301]][indices:data/read/search[phase/fetch/id]]
Caused by: java.io.IOException: Can't write type [class org.elasticsearch.common.geo.GeoPoint]
	at org.elasticsearch.common.io.stream.StreamOutput.writeGenericValue(StreamOutput.java:415)
	at org.elasticsearch.search.internal.InternalSearchHitField.writeTo(InternalSearchHitField.java:110)
	at org.elasticsearch.search.internal.InternalSearchHit.writeTo(InternalSearchHit.java:726)
	at org.elasticsearch.search.internal.InternalSearchHits.writeTo(InternalSearchHits.java:259)
	at org.elasticsearch.search.fetch.FetchSearchResult.writeTo(FetchSearchResult.java:104)
	at org.elasticsearch.transport.netty.NettyTransportChannel.sendResponse(NettyTransportChannel.java:99)
	at org.elasticsearch.transport.netty.NettyTransportChannel.sendResponse(NettyTransportChannel.java:76)
	at org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:869)
	at org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:862)
	at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

I can reproduce problem #1 every time; it is really strange that it doesn't happen for you.

From scratch :

And for the #2 error, any hint ?

I haven't had time to test this with 2 nodes sorry, I'll try tomorrow.

I replicated this on 1.7.1.

I'll ask one of the geo team to take a gander and comment.

A workaround for those needing one:

Add lat_lon: true for your geo_point field mapping and retrieve them using field_name.lat / field_name.lon as normal fields.
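A rough sketch of what that workaround might look like as request bodies, for illustration only. The type name `my_type`, the field name `location`, the `match_all` query and the helper names are hypothetical. Which request key returns the sub-fields (`fields` or `fielddata_fields`) is an assumption that may depend on whether the sub-fields are stored, so it is left as a parameter.

```python
import json


def lat_lon_mapping(doc_type, field):
    """Build a mapping body for a geo_point field with lat_lon enabled."""
    return {
        doc_type: {
            "properties": {
                field: {"type": "geo_point", "lat_lon": True}
            }
        }
    }


def lat_lon_search_body(field, request_key="fielddata_fields"):
    """Build a search body asking for the .lat and .lon sub-fields.

    request_key is an assumption: "fields" may be the right choice
    if the sub-fields are stored in your mapping.
    """
    return {
        "query": {"match_all": {}},
        request_key: [field + ".lat", field + ".lon"],
    }


if __name__ == "__main__":
    print(json.dumps(lat_lon_mapping("my_type", "location"), indent=2))
    print(json.dumps(lat_lon_search_body("location"), indent=2))
```

The sub-fields hold plain numbers rather than GeoPoint objects, which is presumably why they avoid the serialization error above.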

I can see you've already raised this - https://github.com/elastic/elasticsearch/issues/13340