It's taking approximately 1944 milliseconds for a size of 250 documents. If I remove the highlighter from the query I get 977, so the highlighter accounts for an extra 967 milliseconds, which tells me something is causing it to run slowly. I'm using the FVH highlighter. Here is the output from the hot threads, which also seems to indicate it's the highlighter.
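For scale, the timings quoted above can be broken down per document. This is only arithmetic on the two figures in the post; it assumes the cost is spread evenly across the 250 hits, which may not hold if a few documents are much larger than the rest.

```python
# Timings quoted in the post above, in milliseconds.
with_highlighter_ms = 1944
without_highlighter_ms = 977
page_size = 250  # number of hits requested

# Extra time attributable to the highlighter for the whole page.
overhead_ms = with_highlighter_ms - without_highlighter_ms

# Average extra time per highlighted document (assumes an even spread).
per_doc_ms = overhead_ms / page_size

print(f"highlighter overhead: {overhead_ms} ms total, "
      f"{per_doc_ms:.2f} ms per document")
```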
With the CPU usage being high, could it simply mean that we need more CPU power?
I've observed something interesting. Why is it that when I set the default operator to OR, the query is a lot quicker? Shouldn't the OR clause make it slower?
The hot threads suggest time is spent decompressing the document. So I suspect that maybe with AND you are getting larger top hits on average than if you run with OR?
They are quite large: we index content from PDFs, in a lot of cases more than 1 MB. It's not absolutely necessary. By default we offer the user 10 results per page, but they do have the option of selecting a maximum of 250.
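One way to narrow this down is to time the same search with the page size, the operator, and the highlighter varied one at a time. The sketch below only builds the request bodies. It assumes a `query_string` query and a field called `content`, neither of which is confirmed in this thread, and `build_search_body` is a hypothetical helper, not part of any client library.

```python
import json


def build_search_body(text, size=10, operator="AND", highlight=True,
                      field="content"):
    """Build an Elasticsearch search request body as a plain dict.

    `field` is a hypothetical field name; the FVH highlighter needs that
    field mapped with term_vector set to with_positions_offsets.
    """
    body = {
        "size": size,
        "query": {
            "query_string": {
                "query": text,
                "default_field": field,
                "default_operator": operator,
            }
        },
    }
    if highlight:
        body["highlight"] = {"fields": {field: {"type": "fvh"}}}
    return body


# Variants to send and time separately: default page vs. the 250 maximum,
# AND vs. OR, each with and without highlighting.
variants = [
    build_search_body("example terms", size=size, operator=op, highlight=hl)
    for size in (10, 250)
    for op in ("AND", "OR")
    for hl in (True, False)
]

print(json.dumps(variants[0], indent=2))
```

Comparing the `took` value of each response should show whether the extra time tracks the number of hits, the operator, or the highlighter itself.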