I am trying to speed up a query and am not sure if I am going about it the right way (being a newcomer to ES).
I am trying to find an address in an index (that contains 200 million records) by its constituent parts, and I am looking for specific values. For instance, I have street number (1118), street name (Tower), street type (Rd) and zip code (90210), and I am trying to find the property with that address. So my query:
The cluster is healthy. The indices are green. And the shards are started.
Your question reminded me that I omitted an important piece of information. I can't profile it because we are on an ancient version of Elastic (1.7.6), which doesn't seem to have a profiler from what I can tell. So the only hope here is that I screwed up the query somehow.
Running the same query gives inconsistent timings. The majority of responses take around 400-800 ms, while some take only 6 to 20 ms. There is probably caching at play here that rapidly expires. But it doesn't matter - I don't run the same query over and over. The address lookups are different every time.
That was not my question. I was asking for the output of those commands.
A lot, like A LOT, of improvements have been made over the past years. You should really upgrade to 7.0. Specifically, if you don't need the exact total number of hits, you can benefit from many improvements. Also a newer JVM can help.
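To illustrate the point about not needing the exact total: in 7.0 a search body can set `track_total_hits`, and turning it off lets Elasticsearch stop counting matches it does not need to return. Below is a minimal sketch, assuming hypothetical field names (`street_number`, `street_name`, `street_type`, `zip`) that would have to be replaced with the ones in the real mapping.

```python
import json


def build_lookup_7x(filters, track_total_hits=False):
    """Build a 7.x search body that filters on exact values.

    `filters` maps field name -> exact value. The field names used in the
    example below are hypothetical, not taken from the thread.
    """
    return {
        # False: do not compute the total hit count at all.
        "track_total_hits": track_total_hits,
        "size": 1,
        "query": {
            "bool": {
                # Filter context: no scoring, results are cacheable.
                "filter": [{"term": {field: value}} for field, value in filters.items()]
            }
        },
    }


body_7x = build_lookup_7x(
    {"street_number": "1118", "street_name": "tower", "street_type": "rd", "zip": "90210"}
)
print(json.dumps(body_7x, indent=2))
```

The body would be sent to `POST /<index>/_search`; whether it helps here depends on how many documents match each filter.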
It does matter a lot even if you are searching for other terms.
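One way caching can still help when every lookup is different, offered as an assumption about what is meant here rather than something stated in the thread: on 1.x, `term` filters are cached per field and value, so a filter on a common zip code or street type can be reused by later lookups for other addresses. A minimal sketch of a filter-only lookup for 1.x, assuming hypothetical field names:

```python
import json


def build_address_lookup(street_number, street_name, street_type, zip_code):
    """Build an ES 1.x search body that matches exact values using filters only."""
    # Field names are hypothetical; substitute the ones from the real mapping.
    terms = [
        ("street_number", street_number),
        ("street_name", street_name),
        ("street_type", street_type),
        ("zip", zip_code),
    ]
    return {
        "query": {
            # `filtered` with no `query` part behaves like match_all plus filter.
            "filtered": {
                "filter": {
                    "bool": {
                        "must": [{"term": {field: value}} for field, value in terms]
                    }
                }
            }
        },
        "size": 1,
    }


body = build_address_lookup("1118", "tower", "rd", "90210")
print(json.dumps(body, indent=2))
```

The values are lowercased in the example because a `term` filter compares against the indexed token; if the fields are `not_analyzed`, the original casing would be needed instead.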
BTW what kind of hardware do you have? SSD drives?
3 boxes. Each one has 252GB of RAM and 2 CPUs with 8 cores each, and each core is hyperthreaded, giving it a poor man's 32 cores per box.
Each box has 252GB of RAM, but the usage is around 30-40GB. The hard drives are all SSD.
As you have plenty of resources on your machines, would it be possible to start a 7.0.1 cluster on the same hardware and use reindex from remote to read from the 1.7 cluster?
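A rough sketch of what the reindex-from-remote request body could look like, to be sent to `POST /_reindex` on the new cluster. The host and index names are placeholders, and the old cluster's `host:port` also has to be listed under `reindex.remote.whitelist` in the new cluster's `elasticsearch.yml`.

```python
import json


def build_remote_reindex(remote_host, source_index, dest_index, batch_size=1000):
    """Build a _reindex body that pulls documents from a remote cluster.

    `remote_host` must include scheme and port, e.g. "http://oldhost:9200".
    All names passed in below are placeholders, not values from the thread.
    """
    return {
        "source": {
            "remote": {"host": remote_host},
            "index": source_index,
            # Number of documents fetched per scroll batch from the remote.
            "size": batch_size,
        },
        "dest": {"index": dest_index},
    }


reindex_body = build_remote_reindex("http://oldhost:9200", "addresses", "addresses_v7")
print(json.dumps(reindex_body, indent=2))
```

With 200 million records the copy will take a while, so it may be worth trying it on a subset first to compare query times.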
Also, maybe running the hot threads API while a "long" query is running could tell us something?
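For reference, the hot threads API is `GET /_nodes/hot_threads` and accepts `threads`, `interval` and `type` parameters. A small sketch of calling it from Python using only the standard library; the base URL is an assumption and the helper names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base_url, threads=3, interval="500ms", thread_type="cpu"):
    """Build the hot threads URL for a cluster reachable at `base_url`."""
    params = urlencode({"threads": threads, "interval": interval, "type": thread_type})
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{params}"


def fetch_hot_threads(base_url, timeout=10):
    """Return the plain-text hot threads report (requires a reachable cluster)."""
    with urlopen(hot_threads_url(base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Assumed address; replace with one of the three nodes.
print(hot_threads_url("http://localhost:9200"))
```

Calling `fetch_hot_threads(...)` in a loop while the slow lookups are being issued gives a few samples to compare.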