Hi all,
I have a test index with 43 million documents. Each document has a string
document value (about 5-10 characters per document).
The mapping is:
{
  "myindex" : {
    "mappings" : {
      "num_type" : {
        "_type" : {
          "store" : true
        },
        "properties" : {
          "doc_value" : {
            "type" : "string",
            "doc_values_format" : "default"
          },
          "int1" : {
            "type" : "integer",
            "index" : "analyzed",
            "store" : true
          },
          "int2" : {
          .
          .
          .
I need to retrieve only the document values, for queries whose result sets
may contain about 100,000 documents. I do not need ranking or anything else
that would slow this down.
My understanding is that if the query is only a filter, ranking is not
computed and the query is faster.
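To illustrate the filter-only idea, here is a minimal sketch of a request body that wraps the term filter in `constant_score` rather than `filtered` plus `match_all`. The helper name `filter_only_body` is hypothetical, and the sketch assumes that the `constant_score` query with a `filter` clause is available in the Elasticsearch version in use. It only builds the body and does not contact a cluster.

```python
def filter_only_body(field, value, size=1000):
    """Build a search body whose only clause is a term filter.

    Hypothetical helper: constant_score gives every hit the same score,
    so no relevance ranking should be needed for the matches.
    """
    return {
        "fields": ["doc_value"],
        "size": size,
        "query": {
            "constant_score": {
                "filter": {"term": {field: value}}
            }
        },
    }


body = filter_only_body("r_int3", 929)
```

The resulting `body` could be passed to `es.search` in place of the `filtered` query; whether it is measurably faster is an assumption that would need to be checked.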
Here is the small Python program I use to test the query:
import elasticsearch

es = elasticsearch.Elasticsearch()

# Open a scan-type scroll over the filtered query.
results = es.search("myindex", "num_type",
    {
        "fields": ["doc_value"],
        "size": 1000,
        "query": {"filtered": {
            "query": {"match_all": {}},
            "filter": {"term": {"r_int3": 929}}
        }}
    }, scroll="10s", search_type="scan")

# Keep fetching batches until one comes back empty.
while True:
    results = es.scroll(results["_scroll_id"], scroll="10s")
    if len(results["hits"]["hits"]) <= 0:
        break
The query runs pretty slowly, and I see a huge number of accesses to the
*.fdt (field data) file.
But I am asking for a document value field, so why does ES access the *.fdt file?
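For comparison, a request that asks for the value through field data instead of stored fields might look like the sketch below. It assumes the `fielddata_fields` and `_source` search options are supported by the Elasticsearch version in use. The helper name `fielddata_body` is hypothetical, and whether this actually avoids the *.fdt reads is not verified here.

```python
def fielddata_body(field, value, size=1000):
    """Build a search body that requests doc_value via fielddata_fields.

    Hypothetical helper; assumes the "fielddata_fields" option exists in
    the target Elasticsearch version. "_source": False asks ES not to
    return the stored _source document.
    """
    return {
        "_source": False,
        "fielddata_fields": ["doc_value"],
        "size": size,
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"term": {field: value}},
            }
        },
    }


alt_body = fielddata_body("r_int3", 929)
```

Comparing I/O on the *.fdt file between this body and the `fields`-based one would show whether the stored-fields lookup is the cause of the slowdown.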
Thanks a lot in advance.