2193 documents have returned and it took 6.5 seconds! Isn't it too slow for this case? I mean 1M isn't a big number. My index has 185 MB size and system is running on single node. Index has 1 shard. JVM is running on 3072 MB memory and server has 16GB memory.
I've just included "VariantIds" field to improve performance but no luck.
Hi @Ignacio_Vera ,
my mapping is here.
This index is for testing purpose- we're trying to see the elasticsearch performance. So this is a bulk data, I'm creating VariantIds and VariantValues arrays randomly.
I've also a prodlist index but it isn't important for now.
So that's probably the reason.
You need to fetch 10000 documents per shard, send all of them to the coordinating node, then sort them and filter out the fields to send back.
There will be lots of response documents and I need all VariantIds values of those documents. Actually I'll use NEST and use ScrollAll method for this. But this is necessary.
Hi @Christian_Dahlqvist, thanks for the reply. I've used "includes" for not gathering all other fields but isn't that enough? And I saw that stored fields is not recommended in the reference. What does it do actually?
Ok, let me explain our case. I want to filter all variant types and values when a variant value selected in an E-Commerce catalog page. So I should filter variations and get all variant keys and values of those variations. Then I'll regroup them by types eg. Color, Size etc.
That's why i need all of the VariantIds.
We've this mechanism already in our Sql Server based system. But I thought that we can query this in ElasticSearch more faster than sql. But I couldn't prove it yet.
What struck me here is that in the mapping the variantIds are mapped to long. This datatype is optimise for range searches, but you are using a terms query. I think you should try to map those fields as keyword if you don't need range searches.
I've tested your suggestion, I have created another index which have VariantIds field with keyword type. I've also removed VariantValues field for a smaller document size. Took measurement of Elasticsearch has been decreased dramatically (it's about 200-300ms). But response time was still about 4 seconds because of the downloaded data size. So I've also changed my query for removing the repeating records:
I am glad to hear that this approach improve the query performance.
My understanding is that you are only interested in the variantIds that match the query. Therefore I think you can set the size of the request to 0. Aggregations will run always in all documents that match the query regardless of the values in size.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.