OK, I've stripped the problem down and narrowed it to a minimal
reproduction:
Problem
Elasticsearch sorting with "size" and "from" does not return the expected results after an update.
Create some records (using Ruby to create 100 records)
# Index 100 documents, one curl call each; "user" is stored as a string.
1.upto(100) do |i|
  system(%Q(curl -XPUT 'http://localhost:9200/twitter/tweet/#{i}' -d '{ "user" : "#{i}"}'))
end
View initial search/sort results
curl -X GET 'http://localhost:9200/twitter/tweet/_search?pretty' -d '
{
"sort": [
{
"user": "asc"
}
],
"size": 10,
"from": 0
}
'
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 100,
"max_score" : null,
"hits" : [ {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "1",
"_score" : null, "_source" : { "user" : "1"},
"sort" : [ "1" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "10",
"_score" : null, "_source" : { "user" : "10"},
"sort" : [ "10" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "100",
"_score" : null, "_source" : { "user" : "100"},
"sort" : [ "100" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "11",
"_score" : null, "_source" : { "user" : "11"},
"sort" : [ "11" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "12",
"_score" : null, "_source" : { "user" : "12"},
"sort" : [ "12" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "13",
"_score" : null, "_source" : { "user" : "13"},
"sort" : [ "13" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "14",
"_score" : null, "_source" : { "user" : "14"},
"sort" : [ "14" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "15",
"_score" : null, "_source" : { "user" : "15"},
"sort" : [ "15" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "16",
"_score" : null, "_source" : { "user" : "16"},
"sort" : [ "16" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "17",
"_score" : null, "_source" : { "user" : "17"},
"sort" : [ "17" ]
} ]
}
}
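The order in this first page (1, 10, 100, 11, ...) is what a string sort produces, because the "user" values were indexed as strings. As a sanity check of what the first page should contain, here is a minimal Ruby sketch that does not talk to Elasticsearch at all; the variable names are hypothetical and only serve the illustration:

```ruby
# Assumption: "user" holds the strings "1".."100", as created by the loop above.
# Strings sort lexicographically, so "10" and "100" come before "11".
users = (1..100).map(&:to_s)

# The first page of an ascending sort with size 10 and from 0.
expected_first_page = users.sort.first(10)

puts expected_first_page.inspect
# prints ["1", "10", "100", "11", "12", "13", "14", "15", "16", "17"]
```

This matches the hits returned before the update, so the initial result is the expected one.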
Now let's update the first record
curl -X POST "http://localhost:9200/twitter/tweet/1" -d '
{"user":"1"}
'
And now the problem: the same search no longer returns the record that was previously first (_id 1).
curl -X GET 'http://localhost:9200/twitter/tweet/_search?pretty' -d '
{
"sort": [
{
"user": "asc"
}
],
"size": 10,
"from": 0
}
'
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 100,
"max_score" : null,
"hits" : [ {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "10",
"_score" : null, "_source" : { "user" : "10"},
"sort" : [ "10" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "100",
"_score" : null, "_source" : { "user" : "100"},
"sort" : [ "100" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "11",
"_score" : null, "_source" : { "user" : "11"},
"sort" : [ "11" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "12",
"_score" : null, "_source" : { "user" : "12"},
"sort" : [ "12" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "13",
"_score" : null, "_source" : { "user" : "13"},
"sort" : [ "13" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "14",
"_score" : null, "_source" : { "user" : "14"},
"sort" : [ "14" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "15",
"_score" : null, "_source" : { "user" : "15"},
"sort" : [ "15" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "16",
"_score" : null, "_source" : { "user" : "16"},
"sort" : [ "16" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "17",
"_score" : null, "_source" : { "user" : "17"},
"sort" : [ "17" ]
}, {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "18",
"_score" : null, "_source" : { "user" : "18"},
"sort" : [ "18" ]
} ]
}
}
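To make the difference between the two result pages explicit, the _id values can be compared directly. This is only a sketch: the two arrays are copied by hand from the responses in this post, and the variable names are hypothetical.

```ruby
# _id values of the first page before the update (copied from the first response).
before_ids = %w[1 10 100 11 12 13 14 15 16 17]

# _id values of the first page after re-indexing document 1 (second response).
after_ids = %w[10 100 11 12 13 14 15 16 17 18]

# Array difference: ids that dropped off the page, and ids that took their place.
missing = before_ids - after_ids
added   = after_ids - before_ids

puts "missing: #{missing.inspect}"  # prints missing: ["1"]
puts "added: #{added.inspect}"      # prints added: ["18"]
```

Only document 1 disappears and document 18 slides in to fill the page, even though "total" is still 100 in both responses.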
On Thursday, May 23, 2013 1:08:50 AM UTC-7, Aldo Sarmiento wrote:
I've updated my query to be:
curl -X GET 'http://localhost:9200/development::application-contacts/contact/_search?pretty' -d '
{
"sort": [
{
"first_name_sort": "asc"
},
"_id"
],
"size": 20,
"from": 0
}
'
However, the updated record still isn't returned. I wonder if there is
something wrong with the mapping, though I'm not sure that would affect the
sort order.
On Thursday, May 23, 2013 12:24:39 AM UTC-7, simonw wrote:
Hey, does this also happen if you use the _id as a secondary sort? If you
update the document and all docs have the same sort value, the order might
depend on which shard returns first. Can you try _id as a tie-breaker?
On Wednesday, May 22, 2013 11:48:53 PM UTC+2, Aldo Sarmiento wrote:
Using a simple {"sort":[{"first_name":"asc"}], "size":20, "from":0} is
returning inconsistent results.
When I rebuild the index, everything is returned correctly. However, as
soon as I update a document, it no longer gets returned (using 0.90.0
stable).
Here is a full gist of problem and mapping:
gist:d945848fd683f39d212c · GitHub