Duplicated records returned using pagination after update

I'm using v0.90.0 of ES, and I just implemented pagination.

All was going fine until I updated one of my records; then I noticed that
the returned results had a record duplicated. Updating another record
produced another duplicate.

The number of records did not increase; one record was simply dropped and
replaced with a duplicate (a perfect duplicate, including the ID of the
original).

The duplicate only appears when querying with from/size. A Google search
turns up a few people with a similar problem, but no suggested
answers or solutions: http://goo.gl/Y989w6

Asking for size 10 returns all 10 records correctly:

➜ chimp git:(array-tags) ✗ curl -X GET 'http://localhost:9200/people/person/_search?pretty' -d '{"sort":[{"first_name":"asc"}],"filter":{"term":{"machine_id":58}},"size":10,"from":0}'
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 10,
"max_score" : null,
"hits" : [ {
"_index" : "people",
"_type" : "person",
"_id" : "107",
"_score" : null, "_source" :
{"id":107,"first_name":"BANANA","last_name":"asdasd","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "banana" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "106",
"_score" : null, "_source" :
{"id":106,"first_name":"CABBAGE","last_name":"asd","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "cabbage" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "114",
"_score" : null, "_source" :
{"id":114,"first_name":"DAVE","last_name":"JONES","machine_id":58,"email":"","website":"","telephone_number":"","tags":["asdasd","asdasdad","red","HELLO","HELLO","blue fruits"]},
"sort" : [ "dave" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "115",
"_score" : null, "_source" :
{"id":115,"first_name":"ERIC","last_name":"","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "eric" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "103",
"_score" : null, "_source" :
{"id":103,"first_name":"GRETA","last_name":"Man","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "greta" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "102",
"_score" : null, "_source" :
{"id":102,"first_name":"Jack","last_name":"Jones","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "jack" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "100",
"_score" : null, "_source" :
{"id":100,"first_name":"NAME","last_name":"asd","machine_id":58,"email":"test@testme.com","website":"","telephone_number":"","tags":[]},
"sort" : [ "name" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "111",
"_score" : null, "_source" :
{"id":111,"first_name":"NAME","last_name":"asdasdasd","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "name" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "108",
"_score" : null, "_source" :
{"id":108,"first_name":"PEANUT","last_name":"asdsd","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "peanut" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "104",
"_score" : null, "_source" :
{"id":104,"first_name":"WillX","last_name":"SmithX","machine_id":58,"email":"","website":"","telephone_number":"","tags":["hello","red"]},
"sort" : [ "willx" ]
} ]
}
}%

Asking for 5 records returns 5 records, but the updated record ("GRETA"),
which appears in the query above, is missing (the results are sorted by
first_name, so it should be there):

➜ chimp git:(array-tags) ✗ curl -X GET 'http://localhost:9200/people/person/_search?pretty' -d '{"sort":[{"first_name":"asc"}],"filter":{"term":{"machine_id":58}},"size":5,"from":0}'
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 10,
"max_score" : null,
"hits" : [ {
"_index" : "people",
"_type" : "person",
"_id" : "107",
"_score" : null, "_source" :
{"id":107,"first_name":"BANANA","last_name":"asdasd","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "banana" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "106",
"_score" : null, "_source" :
{"id":106,"first_name":"CABBAGE","last_name":"asd","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "cabbage" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "114",
"_score" : null, "_source" :
{"id":114,"first_name":"DAVE","last_name":"JONES","machine_id":58,"email":"","website":"","telephone_number":"","tags":["asdasd","asdasdad","red","HELLO","HELLO","blue fruits"]},
"sort" : [ "dave" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "115",
"_score" : null, "_source" :
{"id":115,"first_name":"ERIC","last_name":"","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "eric" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "102",
"_score" : null, "_source" :
{"id":102,"first_name":"Jack","last_name":"Jones","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "jack" ]
} ]
}
}%

Asking for the next page has "Jack Jones" effectively duplicated (he was
pulled into the first page of results incorrectly because GRETA was missing,
but he is correctly in this page):

chimp git:(array-tags) ✗ curl -X GET 'http://localhost:9200/people/person/_search?pretty' -d '{"sort":[{"first_name":"asc"}],"filter":{"term":{"machine_id":58}},"size":5,"from":4}'
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 10,
"max_score" : null,
"hits" : [ {
"_index" : "people",
"_type" : "person",
"_id" : "102",
"_score" : null, "_source" :
{"id":102,"first_name":"Jack","last_name":"Jones","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "jack" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "100",
"_score" : null, "_source" :
{"id":100,"first_name":"NAME","last_name":"asd","machine_id":58,"email":"test@testme.com","website":"","telephone_number":"","tags":[]},
"sort" : [ "name" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "111",
"_score" : null, "_source" :
{"id":111,"first_name":"NAME","last_name":"asdasdasd","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "name" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "108",
"_score" : null, "_source" :
{"id":108,"first_name":"PEANUT","last_name":"asdsd","machine_id":58,"email":"","website":"","telephone_number":"","tags":[]},
"sort" : [ "peanut" ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "104",
"_score" : null, "_source" :
{"id":104,"first_name":"WillX","last_name":"SmithX","machine_id":58,"email":"","website":"","telephone_number":"","tags":["hello","red"]},
"sort" : [ "willx" ]
} ]
}
}%
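
To make the symptom easy to check, here is a minimal Python sketch that compares the `_id` values copied from the three responses above. The variable names are hypothetical; only the IDs come from the output shown.

```python
# IDs in the order returned by the size=10 query above.
full = ["107", "106", "114", "115", "103", "102", "100", "111", "108", "104"]

# IDs returned by the two size=5 queries above.
page1 = ["107", "106", "114", "115", "102"]
page2 = ["102", "100", "111", "108", "104"]

# An ID that shows up on both pages is a duplicate.
page2_ids = set(page2)
duplicates = [doc_id for doc_id in page1 if doc_id in page2_ids]

# An ID from the full result that is on neither page is missing.
seen = set(page1) | set(page2)
missing = [doc_id for doc_id in full if doc_id not in seen]

print("duplicated:", duplicates)
print("missing:", missing)
```

With the IDs above this reports 102 ("Jack") as duplicated and 103 ("GRETA") as missing.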

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

With a page size of 5, the second page should start at offset 5, not 4.
This should be correct (changed "from" from 4 to 5):
{"sort":[{"first_name":"asc"}],"filter":{"term":{"machine_id":58}},"size":5,"from":5}
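
The offset arithmetic can be sketched in Python as follows. This is only an illustration: `page_body` is a hypothetical helper, and it assumes zero-based page numbers; the sort, filter, and `machine_id` value are copied from the queries in this thread.

```python
import json


def page_body(page, size, machine_id=58):
    """Build the search body for a zero-based page number (hypothetical helper)."""
    return {
        "sort": [{"first_name": "asc"}],
        "filter": {"term": {"machine_id": machine_id}},
        "size": size,
        # Each page starts exactly where the previous one ended.
        "from": page * size,
    }


first = page_body(0, 5)
second = page_body(1, 5)
print(json.dumps(second))
```

Here the first page uses "from": 0 and the second uses "from": 5, so the two pages cannot overlap as long as the sort order itself is stable.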


Sorry, I think I missed the point before. Is the mapping correct?
You should be able to sort without duplicates if you use a mapping like:

......
"firstname_sort": {
    "type": "string",
    "store": "yes",
    "index": "not_analyzed"
}
......

A duplicated field for sorting and faceting is easy to implement, but
there are other possibilities.
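
A rough Python sketch of the duplicated-field approach, under a few assumptions: `with_sort_field` and `sorted_page_body` are hypothetical helpers, the `firstname_sort` field is filled by the application before indexing, and the secondary sort on `id` is my own addition to break ties between records that share a first name (such as the two "NAME" records).

```python
def with_sort_field(doc):
    """Return a copy of the document with first_name duplicated into firstname_sort."""
    copy = dict(doc)
    copy["firstname_sort"] = doc["first_name"]
    return copy


def sorted_page_body(page, size, machine_id=58):
    """Build a search body that sorts on the not_analyzed copy of the name."""
    return {
        # Assumption: "id" as a tie-breaker keeps equal names in a fixed order.
        "sort": [{"firstname_sort": "asc"}, {"id": "asc"}],
        "filter": {"term": {"machine_id": machine_id}},
        "size": size,
        "from": page * size,
    }


person = {"id": 103, "first_name": "GRETA", "last_name": "Man", "machine_id": 58}
indexed = with_sort_field(person)
body = sorted_page_body(1, 5)
```

The document would then be indexed with the extra `firstname_sort` value, and every page request would sort on that field instead of the analyzed `first_name`.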
