I need to query an index with tens of millions of short documents.
The result set may contain > 100,000 documents, and I need to process a
single field from each document. If those are simple stored fields in the
*.fdt file, it will take more or less forever.
I thought doc values would answer my need to read a single field
from each document, but I cannot make it work.
Is there a way to make a query return a single field that is stored as a doc
value in the *.dvd file, as opposed to slowly digging it out of the *.fdt
file?
I imagine some aggregations use doc values, though I haven't looked at the
code to be sure.
Nik
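For what it is worth, here is a minimal sketch of one way this could be approached, assuming the index is served by Elasticsearch and the field has doc values enabled. Recent Elasticsearch versions accept a `docvalue_fields` list in the search body (the 1.x releases current at the time of this thread called it `fielddata_fields`), and setting `_source` to false skips the stored `_source` lookup. The field name `price`, the helper names, and the page size are hypothetical, chosen only for illustration. The sketch builds the request body and parses a response dict; it does not contact a cluster.

```python
import json


def build_doc_values_query(query, field, page_size=1000):
    """Build a search body that asks for one field from doc values only."""
    return {
        "query": query,
        "size": page_size,
        "_source": False,            # do not load _source from stored fields
        "docvalue_fields": [field],  # read the value from doc values instead
    }


def extract_values(response, field):
    """Collect the values returned for `field` from a search response dict."""
    values = []
    for hit in response.get("hits", {}).get("hits", []):
        # Doc-value fields come back under "fields", always as a list.
        values.extend(hit.get("fields", {}).get(field, []))
    return values


if __name__ == "__main__":
    # "price" is a hypothetical field name used only for this example.
    body = build_doc_values_query({"match_all": {}}, "price")
    print(json.dumps(body, indent=2))
```

For a result set of more than 100,000 hits, the same body would normally be paged through with the scroll API rather than fetched in a single request.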
On Tue, Nov 25, 2014 at 4:31 PM, Tzahi jakubovitz tzahij@hotmail.com
wrote the message quoted in full above.