How to get results in a custom sort order in Elasticsearch?

We have a use case that requires results from Elasticsearch in a custom sort order.
I am using Elasticsearch v5.1.2.

Mapping

client_obj.indices.create(
  :index => 'test',
  :body => {
    :mappings => {
      :texts => {
        :properties => {
          :number => { :type => "integer" },
          :text => { :type => "text", :term_vector => "with_positions_offsets_payloads" }
        }
      }
    }
  }
)

arr = [1,3,200,100,2,10 ...] # 1 million entries

I expect Elasticsearch to return the results in the same order as the numbers appear in the array (arr). I used the API call below to get the results. It works for a small set of numbers, but when the array holds more than 500k entries, the functions block in the request grows with it and my ES server goes down.

The from and size values vary with the page number and page size.
GET /test/_search
{
  "query": {
    "function_score": {
      "boost_mode": "replace",
      "query": {
        "constant_score": {
          "query": {
            "bool": {
              "must": [
                { "terms": { "number": [1,3,200,100,2] } },
                { "query_string": { "query": "#{keyword}", "default_field": "text" } }
              ]
            }
          }
        }
      },
      "functions": [
        { "filter": { "term": { "number": 1 } }, "weight": 4 },
        { "filter": { "term": { "number": 3 } }, "weight": 3 },
        { "filter": { "term": { "number": 200 } }, "weight": 2 },
        { "filter": { "term": { "number": 100 } }, "weight": 1 },
        { "filter": { "term": { "number": 2 } }, "weight": 0 }
      ]
    }
  },
  "_source": ["number"],
  "size": 3,
  "from": 0
}
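
For reference, the request body is generated from the array roughly as follows. This is a simplified sketch rather than the exact code: the helper name build_custom_sort_query, the page and per_page parameters, and the 1-based page numbering are assumptions made for illustration. It shows that each number in arr produces one entry in functions, with the weight decreasing by position.

```ruby
require 'json'

# Hypothetical helper (name and parameters are illustrative assumptions).
# arr      - numbers in the desired result order
# keyword  - text passed to query_string
# page     - 1-based page number (assumption)
# per_page - number of hits per page
def build_custom_sort_query(arr, keyword, page: 1, per_page: 3)
  # One function_score entry per number; earlier positions get higher weights,
  # so the first number gets arr.size - 1 and the last one gets 0.
  functions = arr.each_with_index.map do |number, index|
    { filter: { term: { number: number } }, weight: arr.size - 1 - index }
  end

  {
    query: {
      function_score: {
        boost_mode: 'replace',
        query: {
          constant_score: {
            query: {
              bool: {
                must: [
                  { terms: { number: arr } },
                  { query_string: { query: keyword, default_field: 'text' } }
                ]
              }
            }
          }
        },
        functions: functions
      }
    },
    _source: ['number'],
    size: per_page,
    from: (page - 1) * per_page
  }
end

body = build_custom_sort_query([1, 3, 200, 100, 2], 'some keyword', page: 1, per_page: 3)
puts JSON.pretty_generate(body)
```

The resulting hash is what gets sent as the search body for the test index. With 500k numbers, functions also contains 500k entries, which is the part of the request that grows with the array.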

I get the error below when the same API is called with 500k numbers:

[2017-03-30T09:01:08,898][INFO ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][169] overhead, spent [269ms] collecting in the last [1s]
[2017-03-30T09:01:14,604][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][172] overhead, spent [3s] collecting in the last [3.6s]
[2017-03-30T09:01:21,718][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][176] overhead, spent [3.3s] collecting in the last [3.8s]
[2017-03-30T09:01:30,561][INFO ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][old][179][6] duration [6.1s], collections [1]/[6.5s], total [6.1s]/[12.5s], memory [2.9gb]->[2.8gb]/[2.9gb], all_pools {[young] [266.2mb]->[214mb]/[266.2mb]}{[survivor] [21.2mb]->[0b]/[33.2mb]}{[old] [2.6gb]->[2.6gb]/[2.6gb]}
[2017-03-30T09:01:30,565][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][179] overhead, spent [6.1s] collecting in the last [6.5s]
[2017-03-30T09:01:37,033][INFO ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][old][180][7] duration [5s], collections [1]/[5.5s], total [5s]/[17.6s], memory [2.8gb]->[2.9gb]/[2.9gb], all_pools {[young] [214mb]->[266.2mb]/[266.2mb]}{[survivor] [0b]->[1.2mb]/[33.2mb]}{[old] [2.6gb]->[2.6gb]/[2.6gb]}
[2017-03-30T09:01:47,708][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][183] overhead, spent [3s] collecting in the last [3s]
[2017-03-30T09:01:49,939][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][184] overhead, spent [2.2s] collecting in the last [2.2s]
[2017-03-30T09:02:04,746][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][185] overhead, spent [5.5s] collecting in the last [5.5s]
[2017-03-30T09:02:24,145][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [] fatal error in thread [elasticsearch[BYOiXkA][search][T#4]], exiting
java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.FixedBitSet.<init>(FixedBitSet.java:115) ~[lucene-core-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:11]
        at org.apache.lucene.util.DocIdSetBuilder.upgradeToBitSet(DocIdSetBuilder.java:235) ~[lucene-core-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:11]
        at org.apache.lucene.util.DocIdSetBuilder.grow(DocIdSetBuilder.java:178) ~[lucene-core-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:11]

I have the following questions:

Is there any other way to solve this problem?
Can we use a script in the ES API to solve this problem? If yes, how?

Please help me solve this problem.
