The Search API doesn't return consistent results for the same query. The first query returns 2 hits, the second 3 hits, the third 2 hits, and so on.

Example, first query:

curl -XGET 'http://localhost:9200/testolav/node/_search' -d'
{
  "from": 0,
  "size": 0,
  "fields": [],
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "parent": "%2Ftestolav"
          }
        }
      ]
    }
  }
}'
{"took":2,"timed_out":false,"_shards":{"total":2,"successful":2,"failed":0},"hits":{"total":2,"max_score":3.014903,"hits":[]}}

_index   | _type | _id                     | _score   | size    | type   | name  | parent      | date_modified
testolav | node  | %2Ftestolav%2F%E2%80%93 | 3.014903 | 0 bytes | file   |       | %2Ftestolav | 2013-01-29 16:07
testolav | node  | %2Ftestolav%2Ftruls     | 3.014903 |         | FOLDER | truls | %2Ftestolav | 2013-01-29 16:10

Second query:
curl -XGET 'http://localhost:9200/testolav/node/_search' -d'
{
  "from": 0,
  "size": 0,
  "fields": [],
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "parent": "%2Ftestolav"
          }
        }
      ]
    }
  }
}'
{"took":3,"timed_out":false,"_shards":{"total":2,"successful":2,"failed":0},"hits":{"total":3,"max_score":3.74084,"hits":[]}}

_index   | _type | _id                     | _score   | size    | type   | name  | parent      | date_modified
testolav | node  | %2Ftestolav%2FMusic     | 3.74084  |         | folder | Music | %2Ftestolav | 2013-01-21 13:50
testolav | node  | %2Ftestolav%2F%E2%80%93 | 3.014903 | 0 bytes | file   |       | %2Ftestolav | 2013-01-29 16:07
testolav | node  | %2Ftestolav%2Ftruls     | 3.014903 |         | FOLDER | truls | %2Ftestolav | 2013-01-29 16:10

Is this supposed to happen? Is there a way to return a consistent search
result?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I changed the index mapping on the name property from

'name': { 'type': 'string', 'index': 'analyzed' }

to

'name': { 'type': 'string', 'index': 'not_analyzed' }

And now the returned documents are always consistent.
But isn't it better to have the name field analyzed if you want it to be
searchable?


On Tue, 2013-01-29 at 08:21 -0800, Olav Grønås Gjerde wrote:

> I changed the index mapping on the name property from
>
> 'name': { 'type': 'string', 'index': 'analyzed' } to
> 'name': { 'type': 'string', 'index': 'not_analyzed' }
>
> And now the returned documents are always consistent.

I think that this was coincidental. I think the real issue was that
somehow you managed to end up with more docs on the primaries than on
the replicas. Searches are spread across the primary and replica copies
of each shard, so the hit count depends on which copy answers. When you
changed the mapping (i.e. reindexed), that sorted out the doc
inconsistency.

I don't know how you ended up with replicas being different from
primaries. Did you have any OOM or other errors while indexing?
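
One way to check the primary/replica theory is to pin the same count query to the primaries and compare it with unpinned runs. The following is only a sketch under assumptions: it assumes the Elasticsearch version in use accepts the `preference=_primary` search parameter, it uses a simplified form of the query from the first post, and `extract_total` and `count_with_preference` are hypothetical helper names, not part of any tool.

```shell
# Assumption: same node, index, and type as in the original post.
ES_SEARCH='http://localhost:9200/testolav/node/_search'
# Simplified version of the posted query (single term clause, count only).
QUERY='{"size": 0, "query": {"term": {"parent": "%2Ftestolav"}}}'

# Pull hits.total out of a search response read from stdin.
extract_total() {
  sed -n 's/.*"hits":{"total":\([0-9]*\).*/\1/p'
}

# Hypothetical helper: run the count query once with the given
# preference value (assumed to be supported, e.g. _primary).
count_with_preference() {
  curl -s -XGET "$ES_SEARCH?preference=$1" -d "$QUERY" | extract_total
}
```

Calling `count_with_preference _primary` several times in a row should print the same number each time. If it does, while the unpinned query keeps alternating between 2 and 3, that would point to the shard copies holding different documents.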

> But isn't it better to have the name field analyzed if you want it to
> be searchable?

That depends on what the name field represents in your data. A
not_analyzed field is still searchable, but e.g. if your name is
"John Smith" and it is analyzed, then you can search on "john". If it is
not_analyzed, then you can only search on the exact value "John Smith".
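
The "John Smith" case can be imitated with a toy model in shell. This is not how Elasticsearch is implemented; `analyze` is a hypothetical stand-in for an analyzer (lowercase, then split on whitespace), and `term_matches` is a hypothetical stand-in for an exact term lookup against the indexed tokens.

```shell
# Toy stand-in for an analyzer: lowercase, then one token per line.
analyze() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -s '[:space:]' '\n'
}

# Toy term lookup: succeeds only if the search term equals one
# indexed token exactly (whole line, case-sensitive).
# $1 = newline-separated indexed tokens, $2 = search term
term_matches() {
  printf '%s\n' "$1" | grep -qxF -- "$2"
}

analyzed_tokens=$(analyze "John Smith")   # two tokens: john, smith
not_analyzed_tokens="John Smith"          # one token, kept verbatim
```

With these definitions, `term_matches "$analyzed_tokens" "john"` succeeds, while `term_matches "$not_analyzed_tokens" "john"` fails and only the exact value "John Smith" matches the not_analyzed variant.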

clint
