Search Optimization and FileDownload


(suresh) #1

I have data loaded into Elasticsearch using Logstash, and I am using elasticsearch.js in my app to query and fetch it. I am looking for the best way to make the search fast and reduce the size of the data returned.

In the present setup the search reports took: 647, hits total: 45806, data/file size: 14.1 MB, browser download time: 6.85 s. Originally it was took: 3153, hits total: 45806, data/file size: 31.9 MB, browser download time: 14.77 s.

I have tried to optimize the JSON search request as shown below. I would appreciate suggestions if there is something better than Ver1.3. I suspect the problem in my app is on the client side, in the file download option, where the data/file size is huge.

Ver1.0
GET k00125_car/_search
{"query":{"filtered":{"query":{"query_string":{"analyze_wildcard":true,"query":"*"}},"filter":{"bool":{"must":[{"range":{"@timestamp":{"gte":1286952143643}}}],"must_not":[]}}}},"highlight":{"pre_tags":["@kibana-highlighted-field@"],"post_tags":["@/kibana-highlighted-field@"],"fields":{"*":{}},"fragment_size":2147483647},"size":1000000,"sort":[{"focus_tier":{"order":"desc","unmapped_type":"boolean"}}],"aggs":{"2":{"date_histogram":{"field":"@timestamp","interval":"1M","pre_zone":"+05:30","pre_zone_adjust_large_interval":true,"min_doc_count":0,"extended_bounds":{"min":1286952143643,"max":1444718543643}}}},"fields":["*","_source"],"script_fields":{},"fielddata_fields":["@timestamp"]}

Ver1.1
GET k00125_car/_search
{
"query": { "match_all": {} },
"size":1000000,
"_source": ["bunit","company_code","customer_number","focus_tier","name","contact_phone","service_address","sum_svchrg"]
}

Ver1.2
GET k00125_car/_search
{
"size":1000000
}

Ver1.3
GET k00125_car/_search
{
"fields": ["bunit","company_code","customer_number","focus_tier","name","contact_phone","service_address","sum_svchrg"],
"size":1000000

}


(Jimferenczi) #2

If you want to retrieve a lot of results (size: 1000000) you should use the scroll API:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html


(suresh) #3

Thank you.
I am using a combination of scroll and fields, because when I use { "sort": ["_doc"] } I feel that the amount of data returned in the response is huge. Or would using a filter be more useful?

GET k00125_car/_search?scroll=1m
{
"fields": ["bunit","company_code","customer_number","focus_tier","name","contact_phone","service_address","sum_svchrg"],
"size":1000000
}


(Jimferenczi) #4

When using scroll you should not set size, or at least not a size of 1000000. The initial query returns a scroll_id, which must be passed to the scroll API to retrieve the next batch of results, so it is not a one-shot query. Using { "sort": ["_doc"] } does not change the amount of data returned; it is an optimization that makes the query faster.
Please read this part of the documentation carefully:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html#search-request-scroll
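
For illustration, the scroll flow described above could look roughly like this with elasticsearch.js. This is only a sketch under assumptions: `client` is taken to be a legacy elasticsearch.js client whose `search`, `scroll` and `clearScroll` calls return promises resolving to the response body, and the function name `fetchAllWithScroll` and the batch size of 1000 are hypothetical choices, not something from this thread.

```javascript
// Sketch: page through all hits with the scroll API instead of size: 1000000.
// Assumption: `client` is an elasticsearch.js (legacy client) instance.
// `fetchAllWithScroll` and the default batch size are hypothetical.
async function fetchAllWithScroll(client, index, fields, batchSize = 1000) {
  const rows = [];

  // Initial request: opens the scroll context and returns the first batch.
  let response = await client.search({
    index,
    scroll: '1m',        // keep the scroll context alive for one minute
    size: batchSize,     // hits per batch, not the total number wanted
    body: {
      query: { match_all: {} },
      sort: ['_doc'],    // cheapest ordering, speeds up scrolling
      _source: fields,   // return only the listed fields
    },
  });
  let scrollId = response._scroll_id;

  try {
    // Keep asking for the next batch until an empty one comes back.
    while (response.hits.hits.length > 0) {
      for (const hit of response.hits.hits) {
        rows.push(hit._source);
      }
      response = await client.scroll({ scrollId, scroll: '1m' });
      scrollId = response._scroll_id; // always use the most recent id
    }
  } finally {
    // Release the scroll context on the server once done.
    if (scrollId) {
      await client.clearScroll({ scrollId });
    }
  }
  return rows;
}
```

It would be called as something like `fetchAllWithScroll(client, 'k00125_car', ['bunit', 'company_code', 'name'])`. Each batch stays small, so the browser never has to receive one multi-megabyte response, although the total amount of data transferred is the same.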

