nicom
(nicolas)
April 30, 2014, 9:27am
1
I have documents with a field containing a large array.
I would like to score according to the value of the nth element of that
array, but I get a very slow response (5 s) with only 10K documents indexed.
my mapping:
document {
id: value,
field2: string,
field3: [ int_1,int_2, ... , int_10k] <- large array of 10K integers
}
Assume I generated and indexed 10K documents with 1K random integer values
in the field 'field3'.
I then use the following search query:
GET /test/document/_search
{
  "query": {
    "function_score": {
      "script_score": {
        "script": "_source.field3[12] * _source.field3[11]"
      }
    }
  }
}
=> got 5000 ms
However, with plain Java objects and a simple nested loop:
for all the documents
score[i] = doc[i].fields[12] * doc[i].fields[11]
sort by score
=> got < 50 ms
ES is about 100 times slower than a simple loop.
How can I get similar performance with ES?
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com .
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/db53da70-4f75-4088-b9a6-2cde3caef062%40googlegroups.com .
For more options, visit https://groups.google.com/d/optout .
Hello,
Using _source in scripts is typically slow, because ES has to load the
stored source of each document and extract the fields from it. A faster
approach is to use something like doc['field3'].values[12], which will use
the field data cache (already loaded in memory, at least after the first run).
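As a rough sketch, the query from the original post could be rewritten along these lines. It assumes the same index, type and field names as in the question (test, document, field3) and the inline string form of script used there. One caveat to verify on a small document first: as far as I know, field data does not keep the original array order (values tend to come back sorted), so values[12] may not be the 13th element as it was indexed.

```
# Sketch only: same request as in the question, with _source replaced by doc[...] access
GET /test/document/_search
{
  "query": {
    "function_score": {
      "script_score": {
        "script": "doc['field3'].values[12] * doc['field3'].values[11]"
      }
    }
  }
}
```

If the position in the array matters, one option is to index the elements you score on as separate single-valued fields, so that field data lookups do not depend on ordering.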
More details about field data can be found here:
Best regards,
Radu
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/