Elasticsearch indexing is slow when there are area fields


(joel) #1

Elasticsearch version : 6.4.0
Logstash version: 6.4.0

I'm using Logstash to import data into Elasticsearch, and indexing slows down considerably when the documents contain area (geo_shape) fields.

I'm importing 58,000 documents, and each of them has a nested object with an area field of type geo_shape:
"stores": {
  "type": "nested",
  "properties": {
    "id": {"type": "long"},
    "normalized_name": {"type": "keyword"},
    "area": {"type": "geo_shape", "tree": "quadtree", "precision": "2km"},
    "location": {"type": "geo_point"}
  }
}
The shape I'm using for the area field is an envelope, and it is imported in Logstash using the json function.
Indexing takes half an hour to complete; if I remove this nested object, it takes only 2 minutes.
Is there a reason for this, or am I doing something wrong?
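
For reference, a geo_shape envelope is written as two corner points, upper left then lower right, i.e. [[minLon, maxLat], [maxLon, minLat]]. Below is a minimal Python sketch of what one nested "stores" entry could look like. The envelope helper, the store values, and the coordinates are all made up for illustration and are not taken from the actual data.

```python
def envelope(min_lon, min_lat, max_lon, max_lat):
    """Build a geo_shape envelope value.

    Elasticsearch expects the upper-left corner first and the
    lower-right corner second: [[minLon, maxLat], [maxLon, minLat]].
    """
    return {
        "type": "envelope",
        "coordinates": [[min_lon, max_lat], [max_lon, min_lat]],
    }


# Hypothetical store entry; id, name, and coordinates are illustrative only.
store = {
    "id": 1,
    "normalized_name": "example-store",
    "area": envelope(2.10, 41.35, 2.20, 41.45),
    "location": {"lat": 41.40, "lon": 2.15},
}
```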

Thanks


(Ignacio Vera) #2

This is a known issue with geo_shape fields. It is the reason a new indexing strategy based on the BKD tree was added in Elasticsearch 6.6.

https://www.elastic.co/guide/en/elasticsearch/reference/6.7/geo-shape.html

If you want to speed up indexing, you can upgrade to version 6.6 or later. Is that an option for you?
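
One thing to check after upgrading: the geo_shape documentation for 6.6+ marks the prefix-tree parameters (tree, precision, tree_levels, strategy, distance_error_pct, points_only) as deprecated. This sketch assumes that a mapping which still sets any of them keeps using the old prefix-tree indexing, so the field would have to be declared as a plain geo_shape to get the BKD-based strategy. The helper name strip_prefix_tree_params is hypothetical; it only rewrites a mapping held as a Python dict and does not talk to a cluster.

```python
import copy

# Deprecated prefix-tree parameters, as listed in the 6.6+ geo_shape docs.
# Assumption: setting any of them keeps the field on the legacy strategy.
PREFIX_TREE_PARAMS = (
    "tree", "precision", "tree_levels",
    "strategy", "distance_error_pct", "points_only",
)


def strip_prefix_tree_params(mapping):
    """Return a copy of the mapping with prefix-tree settings removed
    from every geo_shape field, including those in nested objects."""
    result = copy.deepcopy(mapping)

    def visit(node):
        if not isinstance(node, dict):
            return
        if node.get("type") == "geo_shape":
            for param in PREFIX_TREE_PARAMS:
                node.pop(param, None)
        for child in node.values():
            visit(child)

    visit(result)
    return result


# The mapping from the question above.
old_mapping = {
    "stores": {
        "type": "nested",
        "properties": {
            "id": {"type": "long"},
            "normalized_name": {"type": "keyword"},
            "area": {"type": "geo_shape", "tree": "quadtree", "precision": "2km"},
            "location": {"type": "geo_point"},
        },
    }
}

new_mapping = strip_prefix_tree_params(old_mapping)
```

With this input, the area field in new_mapping is reduced to {"type": "geo_shape"} and the other fields are left untouched.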


(joel) #3

Thanks Ignacio,

Yes, I'll try it.