Inconsistent results (again)


(George Sakkis) #1

Hi,

I'm starting a new thread as I can't reply to a previous one [1]
about the same (most likely) issue on a 2-node cluster:

$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=app_id:14956'
{"count":13589,"_shards":{"total":5,"successful":5,"failed":0}}

$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=app_id:14956'
{"count":0,"_shards":{"total":5,"successful":5,"failed":0}}

$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=app_id:14956'
{"count":13589,"_shards":{"total":5,"successful":5,"failed":0}}

$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=app_id:14956'
{"count":0,"_shards":{"total":5,"successful":5,"failed":0}}

There is one replica per shard and the num_docs of each primary matches the
respective replica. What could be the problem?

Thanks,
George

[1]
https://groups.google.com/group/elasticsearch/browse_frm/thread/7e1257f52f0e7de1
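
The alternating counts above can be reproduced in a loop rather than by hand. What follows is only a minimal sketch: the helper names fetch_count and tally_counts are made up for illustration, and the base URL is assumed to be the same localhost:9200 endpoint used in the curl calls.

```python
import json
from collections import Counter
from urllib.parse import quote
from urllib.request import urlopen


def fetch_count(base_url, index, query):
    """Issue one _count request and return the "count" value from the reply."""
    url = f"{base_url}/{index}/_count?q={quote(query)}"
    with urlopen(url) as resp:
        return json.load(resp)["count"]


def tally_counts(fetch, attempts=10):
    """Call fetch() `attempts` times and tally how often each count comes back.

    A healthy index should yield a single key; two or more keys mean the
    same query is being answered differently from one request to the next.
    """
    return Counter(fetch() for _ in range(attempts))


# Example call (needs a reachable node, so it is left commented out):
# tally_counts(lambda: fetch_count("http://localhost:9200",
#                                  "index_2012-03-30_00-10-07",
#                                  "app_id:14956"))
```

Passing the fetch step in as a callable keeps the tallying logic independent of the HTTP call, so it can be tried against canned values first.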


(Shay Banon) #2

Can you do a get mapping on the index, and see if you have app_id field in
several mapping types?
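
A rough sketch of that check, assuming the get mapping response has already been parsed into a dict keyed by mapping type name, each with a "properties" object. The exact response layout differs between versions, so treat the shape as an assumption; the function names are hypothetical.

```python
def field_types(mapping, field):
    """Return {mapping_type: core_type} for each mapping type defining `field`.

    `mapping` is assumed to look like:
        {"type_a": {"properties": {"app_id": {"type": "integer"}}}, ...}
    """
    found = {}
    for type_name, type_mapping in mapping.items():
        props = type_mapping.get("properties", {})
        if field in props:
            # Fields without an explicit "type" are treated as objects here.
            found[type_name] = props[field].get("type", "object")
    return found


def has_conflicting_types(mapping, field):
    """True if `field` is declared with more than one core type."""
    return len(set(field_types(mapping, field).values())) > 1
```

If has_conflicting_types returns True for app_id, the same field name is mapped differently in different types, which is the situation being asked about.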


(George Sakkis) #3

Hi Shay,

Yes, multiple types have an app_id field. Digging deeper into it, I found the
problem seems to happen for array fields (like app_id), not scalar
ones. Here's another example:

location is scalar

$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=location:London'
{"count":575,"_shards":{"total":5,"successful":5,"failed":0}}
$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=location:London'
{"count":575,"_shards":{"total":5,"successful":5,"failed":0}}
$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=location:London'
{"count":575,"_shards":{"total":5,"successful":5,"failed":0}}
$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=location:London'
{"count":575,"_shards":{"total":5,"successful":5,"failed":0}}

mail is an array

$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=mail:george.sakkis@gmail.com'
{"count":20,"_shards":{"total":5,"successful":5,"failed":0}}
$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=mail:george.sakkis@gmail.com'
{"count":1,"_shards":{"total":5,"successful":5,"failed":0}}
$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=mail:george.sakkis@gmail.com'
{"count":20,"_shards":{"total":5,"successful":5,"failed":0}}
$ curl -s -XGET 'localhost:9200/index_2012-03-30_00-10-07/_count?q=mail:george.sakkis@gmail.com'
{"count":1,"_shards":{"total":5,"successful":5,"failed":0}}

Just to clarify, this happens only for the one apparently corrupted
index (index_2012-03-30_00-10-07). Both previous indices and one I
built from scratch after that look stable.

Thanks,
George


(Shay Banon) #4

Strange, fields with multiple values do not really affect this. My
thought was that app_id may have a different core type in different
mapping types (string in one, numeric in another). In 0.19, the
resolution of the correct mapping is better when searching explicitly
against a mapping type. When searching across all types, you might get
strange results in this case.


(George Sakkis) #5

Ah no, app_id is integer in all mapping types. This is on 0.18.7.

Do you have any input on that comment from the other thread? Without
knowing much about the internals it sounds the most relevant so far:

"I've seen the mismatched counts issue on a 4-node cluster, although it
was only after some reconfiguration work that introduced temporary
network partitions. Reindexing resolved the issue, and it hasn't
recurred. Recovery from losing a node might eliminate the imbalance."

Can I somehow check whether there was a temporary network partition
during indexing and, if so, is it possible to recover from it (without
having to do a full reindexing, that is)?


(Shay Banon) #6

If you had a network partition, you would end up with two clusters. You
can check the logs to see if it happened (you will see messages about
disconnected nodes).

Can you try and recreate the problem?
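
For the log check, something along these lines might help. It is only a sketch: the exact wording of the log messages is not given in this thread, so the default pattern "disconnect" is a guess to adjust, and find_disconnects is a made-up helper name.

```python
import re


def find_disconnects(lines, pattern=r"disconnect"):
    """Return (line_number, text) for every log line matching `pattern`.

    The default pattern is an assumption; replace it with whatever your
    node logs actually print when a node drops out of the cluster.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if regex.search(line)
    ]


# Example call (the path is a placeholder, so it is left commented out):
# with open("/path/to/elasticsearch.log") as log:
#     for number, text in find_disconnects(log):
#         print(number, text)
```

Matches whose timestamps fall inside the indexing window of the affected index would be the ones worth a closer look.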

