Can get document by ID but not find it in query all?

Hi all,
Why would we be able to find a document by its ID but not see it if we use
a q=*:* search, or a search with no criteria?

This must be a newbie question because we must be missing something but at
the moment we don't know what to look at.

The index contains 99 events with ids like Id0, Id1, etc. The event we're
looking for has an id of 56505330-8b6c-11e2-894d-24be05270b5c.

We can put a document in the index and retrieve it directly. Here's the
retrieval:
curl localhost:9200/audit-events-sal_ci-2013-03-14/log/56505330-8b6c-11e2-894d-24be05270b5c

{"_index":"audit-events-sal_ci-2013-03-14","_type":"log","_id":"56505330-8b6c-11e2-894d-24be05270b5c","_version":1,"exists":true,
"_source" :
{"eventType":"ADD_USER","eventTime":"1363219822000","_id":"56505330-8b6c-11e2-894d-24be05270b5c"}}

However, if we look for it using q=*:* we can't find it.
curl 'localhost:9200/audit-events-sal_ci-2013-03-14/_search?pretty=true&q=*:*&size=200'

{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 99,
"max_score" : 1.0,
"hits" : [ {
"_index" : "audit-events-sal_ci-2013-03-14",
"_type" : "log",
"_id" : "Id3",
"_score" : 1.0, "_source" :
{"eventType":"ADD_USER","eventTime":"1363262403000","_id":"Id3"}
}, ...

If we request the index stats we see the following (we should see 100 events):
curl 'localhost:9200/audit-events-sal_ci-2013-03-14/_stats?pretty=true'

{
"ok" : true,
"_shards" : {
"total" : 10,
"successful" : 5,
"failed" : 0
},
"_all" : {
"primaries" : {
"docs" : {
"count" : 99,
"deleted" : 1
}, ...

If it's relevant to this problem, previous versions of this problem
involved deleting and re-adding a document with the same ID. We've now
removed that part of the test and are using a completely new ID in order to
exclude that.

We have Java code that does the following:
final SearchResponse response = client
    .prepareSearch(indexNames.toArray(new String[] {}))
    .setTypes("log")
    .setIgnoreIndices(IgnoreIndices.MISSING)
    .execute()
    .actionGet();

It too only returns 99 records - and not the one we're looking for.

So, please put us out of our dire misery and tell us what nugget of
knowledge we have overlooked in our search for enlightenment.

Many thanks,
Edward

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

You still have a deletion, so it appears you are indexing a document with
the same id twice. When you reindex a document, Lucene will behind the
scenes delete the original document and index the new one. Can you double
check your indexing code to see if your id generation is valid?
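
One way to see this effect in isolation is to index the same id twice into a scratch index and then look at the stats. The sketch below assumes a node on localhost:9200 running a 0.90-era release (typed URLs, no Content-Type header needed); the index name "dupcheck" and the helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: index one id twice and look at the index stats afterwards.
# ES and INDEX are assumptions; "dupcheck" is a hypothetical scratch index.
ES=${ES:-localhost:9200}
INDEX=${INDEX:-dupcheck}

# doc_url <id>: URL of a single document of type "log"
doc_url() {
  printf '%s/%s/log/%s\n' "$ES" "$INDEX" "$1"
}

# index_twice <id>: the second PUT should come back with "_version":2,
# and the stats should then report the replaced copy under docs.deleted
# (until a segment merge cleans it up).
index_twice() {
  curl -s -XPUT "$(doc_url "$1")?refresh=true" -d '{"eventType":"ADD_USER"}'; echo
  curl -s -XPUT "$(doc_url "$1")?refresh=true" -d '{"eventType":"ADD_USER"}'; echo
  curl -s "$ES/$INDEX/_stats?pretty=true"
}
```

Calling `index_twice Id7` against a throwaway index is enough to compare the "_version" values and the docs.deleted counter with what your own index reports.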

--
Ivan

On Mon, Jun 24, 2013 at 5:03 PM, Edward Sargisson esarge@pobox.com wrote: [...]

Hi Ivan,
Thanks for the reply.
Our original tests were a replacement of a document with the same id (hence
giving us a deletion). However, the last test was a brand new id - and it
still did not appear in the response.

Cheers,
Edward

On Monday, June 24, 2013 5:15:06 PM UTC-7, Ivan Brusic wrote: [...]

Any chance you can gist a full curl recreation of what you are doing?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 25 June 2013 at 05:18, Edward Sargisson ejsarge@gmail.com wrote: [...]

Hi everyone. My name is Jeremy; I'm looking at this with Edward.

Here's a curl recreation:


Verify the index "test" doesn't exist.

curl localhost:9200/test/_stats
{"error":"IndexMissingException[[test] missing]","status":404}

Create the log item "test1" in this index.

curl -XPUT localhost:9200/test/log/test1 -d '{"eventType":"ADD_USER","eventTime":"1363219822000","_id":"test1"}'
{"ok":true,"_index":"test","_type":"log","_id":"test1","_version":1}

Check out the index stats. Hmm... Docs count is zero? Odd?

curl localhost:9200/test/_stats?pretty=true
{
"ok" : true,
"_shards" : {
"total" : 10,
"successful" : 5,
"failed" : 0
},
"_all" : {
"primaries" : {
"docs" : {
"count" : 0,
"deleted" : 0
}, ...

Look the item up by ID... Yep, it's there.

curl localhost:9200/test/log/test1
{"_index":"test","_type":"log","_id":"test1","_version":1,"exists":true,
"_source" :
{"eventType":"ADD_USER","eventTime":"1363219822000","_id":"test1"}}

Try a few different queries. I'm no query pro, but I think these are
right. Nothing seems to find it.

curl -XGET 'localhost:9200/test/log/_search?q=_id:test1' -d '{"query":{"match_all":{}}}'
{"took":10,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

curl 'localhost:9200/test/_search?q=*:*&size=10'
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

curl -XGET localhost:9200/test/log/_search -d '{"query":{"match_all":{}}}'
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}


Additionally, I've tried creating a second index (test2) and its mapping
manually before inserting the log item, with similar results. Like this:

curl -XPUT localhost:9200/test2
{"ok":true,"acknowledged":true}

curl -XPUT localhost:9200/test2/log/_mapping -d '{
"log" : {
"properties" : {
"_id" : {"type" : "string", "store" : "yes"},
"eventType" : {"type" : "string", "store" : "yes"},
"eventTime" : {"type" : "long", "store" : "yes"}
}
}
}'
{"ok":true,"acknowledged":true}


I haven't got as far as the deletion / reinsertion part that Edward was
talking about... I'd like to understand what's going on here first.
Perhaps when we understand this, the reinsert part might solve itself.
Like he said, I think we're missing something fundamental in our
understanding, so no answer is too small!

Thanks a lot.

-- Jeremy

On Monday, 24 June 2013 23:03:17 UTC-7, David Pilato wrote: [...]

Further to my previous message: I decided to turn up the logging to DEBUG
and restarted ElasticSearch to see what I could see. After restarting, it
took quite some time to settle down (a couple of minutes), with a lot of
logging about recovery. This may or may not be normal, and I don't want to
spam the group. If you think it's relevant and / or interesting, let me
know what to look for.

Eventually, it did settle, and running the stats on the index returned
docs.count = 1 (like it should), and all of the queries returned a result.

The only thing that I can think of that may be related here is that this
machine was upgraded from ElasticSearch 0.20.4 to 0.90.1 a few weeks ago.

Thanks again.

-- Jeremy

On Tuesday, 25 June 2013 10:37:00 UTC-7, Jeremy Karlson wrote: [...]

Hey,

I've extracted the curl calls from your last mail and added
'refresh=true' to the index call (to make the indexed document
available for search immediately instead of possibly having to
wait up to one second), and it seems to work now:

curl -X DELETE localhost:9200/test/_stats
curl -X PUT localhost:9200/test/_stats
curl -XPUT 'localhost:9200/test/log/test1?refresh=true' -d '{"eventType":"ADD_USER","eventTime":"1363219822000","_id":"test1"}'
curl 'localhost:9200/test/_stats?pretty=true'
curl localhost:9200/test/log/test1
curl -XGET 'localhost:9200/test/log/_search?q=_id:test1' -d '{"query":{"match_all":{}}}'
curl 'localhost:9200/test/_search?q=*:*&size=10'
curl -XGET localhost:9200/test/log/_search -d '{"query":{"match_all":{}}}'

Long story short: A document can be retrieved with GET on its ID
immediately after it has been put into elasticsearch. However, it can
take up to one second after indexing before the document is searchable,
because the searchable 'view' of the index has to be refreshed before
freshly indexed data shows up in it. This happens automatically every
second in the background by default and can be forced by adding
'refresh=true' to the index operation (don't do this on a live system;
it is a costly operation and should not be done with every index
operation).
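
To watch the difference between the two lookups without refresh=true, something like the following sketch can help. It assumes a scratch index on localhost:9200 and a 0.90-style compact search response where hits.total is a plain number; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: compare GET-by-id with search before and after a refresh.
# ES and INDEX are assumptions; point them at a scratch index.
ES=${ES:-localhost:9200}
INDEX=${INDEX:-test}

# search_total: read a compact (non-pretty) 0.90-style search response
# on stdin and print the number found in hits.total
search_total() {
  sed -n 's/.*"hits":{"total":\([0-9][0-9]*\).*/\1/p'
}

# refresh_demo: index without refresh=true, then look the document up both ways
refresh_demo() {
  curl -s -XPUT "$ES/$INDEX/log/test1" -d '{"eventType":"ADD_USER","eventTime":"1363219822000"}'; echo
  curl -s "$ES/$INDEX/log/test1"; echo                 # GET by id: found straight away
  curl -s "$ES/$INDEX/_search?q=*:*" | search_total    # may not count the new document yet
  curl -s -XPOST "$ES/$INDEX/_refresh"; echo           # explicit refresh of this one index
  curl -s "$ES/$INDEX/_search?q=*:*" | search_total    # should now count it
}
```

Run `refresh_demo` once; if the second count is still missing the document after the explicit refresh, the problem is something other than the normal refresh delay.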

Hope this helps.

--Alex

On Tue, Jun 25, 2013 at 8:12 PM, Jeremy Karlson jeremykarlson@gmail.com wrote: [...]

Hey Alex,

Thanks for your reply and taking the time to give it a try.

Those curl statements that I wrote were run several minutes apart (or
longer), as I had to poke around through the ElasticSearch API to figure
them out. They didn't run back-to-back-to-back like my presentation
suggested. So assuming the default refresh time is 1 minute, it shouldn't
have been a factor.

Any other thoughts?

-- Jeremy

On Tuesday, 25 June 2013 23:42:19 UTC-7, Alexander Reelsen wrote: [...]

Hey,

if you run my sample, do you encounter the same errors (maybe I made a
mistake when converting your queries)?
Also, what version of elasticsearch are you using (I tried with 0.90.1)?

--Alex

On Wed, Jun 26, 2013 at 11:40 PM, Jeremy Karlson jeremykarlson@gmail.com wrote: [...]

I ran your sample. No, I don't get the same errors I did earlier. But, I
didn't expect to... I mentioned earlier that the problem "went away" when
I restarted ElasticSearch to turn up the logging. Since then (a day or two
ago) things have been running well.

And yeah, I'm running 0.90.1.

Thanks.

-- Jeremy

On Wednesday, 26 June 2013 23:42:28 UTC-7, Alexander Reelsen wrote: [...]

I experienced this behaviour again. It doesn't happen to just one
document; it appears that after a while ES continues to receive changes
(new docs / updates / deletes), but the changes never become
visible. If you restart ES, they all appear.

Still running 0.90.1.

-- Jeremy

On Thursday, 27 June 2013 14:28:47 UTC-7, Jeremy Karlson wrote: [...]

Hi,

this sounds strange. Can you execute a refresh or a flush manually next
time it happens? See
http://www.elasticsearch.org/guide/reference/api/admin-indices-refresh/
http://www.elasticsearch.org/guide/reference/api/admin-indices-flush/

It sounds as if your data is stored in the translog (which makes it durable
and accessible by ID) but never gets committed to the index (which makes it
searchable). Also, your long recovery times might come from the fact that
the heavily filled translog has to be written into the index during recovery
(I'm speculating right now, but it might explain your recovery behaviour).
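
As a rough sketch of the manual check, the calls could look like this. It assumes a node on localhost:9200; the helper names are made up for illustration, while _refresh, _flush and _stats are the index-level endpoints already used or linked in this thread.

```shell
#!/bin/sh
# Sketch only: force a refresh and a flush on one index, then re-check stats.
ES=${ES:-localhost:9200}

# admin_url <index> <action>: URL of an index-level admin endpoint,
# e.g. admin_url test refresh -> localhost:9200/test/_refresh
admin_url() {
  printf '%s/%s/_%s\n' "$ES" "$1" "$2"
}

# force_visible <index>
force_visible() {
  curl -s -XPOST "$(admin_url "$1" refresh)"; echo   # make recently indexed documents searchable
  curl -s -XPOST "$(admin_url "$1" flush)"; echo     # commit to the Lucene index and clear the translog
  curl -s "$(admin_url "$1" stats)?pretty=true"      # docs.count should now include the new documents
}
```

If `force_visible test` makes the missing documents show up in searches, that points at the automatic refresh not running rather than at the documents being lost.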

Did you change any refresh settings, any translog settings?

--Alex

On Tue, Jul 16, 2013 at 2:51 AM, Jeremy Karlson jeremykarlson@gmail.com wrote: [...]

Hi Alex,

I apologize for the delay in getting back to you. We haven't seen this
behaviour in a while, so I had nothing to report. It came back again
today, so now I have something new to say.

What we're seeing could be explained by what you're describing. We can
still insert documents without problem, and look them up directly by ID,
but doing a query does not find them, even if you wait 10 seconds or more.
After we force the index to refresh, the documents appear in query results.

At the time, we hadn't tinkered with any refresh or translog settings. But
since then we have tried changing the index.refresh_interval setting. We
tried changing it to "2s" just to see if it made any difference or somehow
woke up the indexer. It didn't have any effect.
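
For anyone following along, reading and changing that setting can be sketched as below. This assumes a node on localhost:9200 and uses the index update-settings endpoint; the helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: read and change index.refresh_interval on one index.
ES=${ES:-localhost:9200}

# refresh_interval_body <value>: JSON body for the update-settings call
refresh_interval_body() {
  printf '{"index":{"refresh_interval":"%s"}}\n' "$1"
}

# show_settings <index>: print the current index settings
show_settings() {
  curl -s "$ES/$1/_settings?pretty=true"
}

# set_refresh_interval <index> <value>, e.g. set_refresh_interval test 2s
set_refresh_interval() {
  curl -s -XPUT "$ES/$1/_settings" -d "$(refresh_interval_body "$2")"; echo
}
```

It is worth running `show_settings` on the affected index first: a refresh_interval of -1 there would mean automatic refresh is switched off for that index.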

-- Jeremy

On Monday, July 15, 2013 11:40:51 PM UTC-7, Alexander Reelsen wrote: [...]

Hey,

did you have a huge number of updates shortly before this strange
behaviour happened? One reason your refresh might not be executing is
that the refresh which started a second earlier has not yet finished.
But the refresh you are executing manually should not be any
different.

If you force the refresh, how long does it take until the documents are
searchable? Immediately?

--Alex

Hm. I suppose "huge" is a subjective term, but no, we didn't have a large
number of updates. I would say this machine is under a low-to-moderate
load generally, with indexes and documents being deleted and recreated
fairly frequently. It's the target of a number of automated tests, so
things tend to get torn down and rebuilt with some regularity. It has
perhaps a dozen indexes, with about 35,000 docs in the most populated one I
can see. Most indexes are much less populated, with a few hundred docs. Our
documents are probably 1-2k each.

I have since restarted the machine and things have returned to "normal" (up
for debate - we're looking at some different weird behaviour now, but that
problem appears to have gone away), but as I recall forcing the manual
refresh was a quick operation that took almost no time running it from a
CURL command line.

-- Jeremy

Back again!

At some point over the weekend, it appears Elasticsearch decided to stop
refreshing again. This time, I was unable to force refreshes to occur by
manually triggering them. Without any better ideas, I've turned on TRACE
debugging and enabled the debugger, in case it happens again.

It is currently in the process of starting up, and it's taking quite a
while. If I query the cluster health, I see this:

{
"cluster_name" : "MyCluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 5,
"number_of_data_nodes" : 1,
"active_primary_shards" : 27,
"active_shards" : 27,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 148
}

And then if I query again a minute later, I see this:

{
"cluster_name" : "MyCluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 5,
"number_of_data_nodes" : 1,
"active_primary_shards" : 31,
"active_shards" : 31,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 144
}

So it's certainly taking its time assigning shards. To me (with my limited ES knowledge) that does seem a little odd, since this is a cluster of just one node. I don't know what's normal or not in the log at this point, but I see a lot of talk about recovering shards with a reason of "post recovery." ES was shut down cleanly and restarted, so I'm unclear on what "recovery" might mean in this situation.

This is certainly starting to become very frustrating and concerning. Anyone else have any ideas?

-- Jeremy

Your situation is extremely odd. You state that your cluster has only one
node, but the cluster API is showing that number_of_nodes is 5. You can use
the cluster API to get all the nodes in the cluster:
http://localhost:9200/_cluster/nodes
Change localhost to the node that is running elasticsearch. Either there are
other elasticsearch clusters in your network, or you started several
processes on a single server. If elasticsearch cannot bind to the default
ports (9200/9300), it will choose the next available port.
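
As a rough illustration of what to look for in the nodes listing, the Python sketch below splits nodes into data and non-data nodes. The response shape (a "nodes" map whose entries may carry an "attributes" map with "data" : "false") is an assumption about what a 0.90-era node reports, and the node IDs and names in the sample are hypothetical.

```python
def classify_nodes(nodes_response):
    """Split node names into data-holding nodes and non-data nodes."""
    data_nodes, non_data_nodes = [], []
    for node in nodes_response.get("nodes", {}).values():
        attrs = node.get("attributes", {})
        # Assumption: a node holds data unless it reports data == "false".
        if attrs.get("data", "true") == "false":
            non_data_nodes.append(node["name"])
        else:
            data_nodes.append(node["name"])
    return sorted(data_nodes), sorted(non_data_nodes)


# Hypothetical sample; paste the real output of _cluster/nodes here instead.
sample = {
    "nodes": {
        "node-id-1": {
            "name": "client-node",
            "attributes": {"client": "true", "data": "false"},
        },
        "node-id-2": {"name": "data-node"},
    }
}

data_nodes, non_data_nodes = classify_nodes(sample)
print(data_nodes, non_data_nodes)
```

A number_of_nodes larger than the number of servers you started usually means some of the entries are client nodes or extra processes, which this split makes easy to spot.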

You can also use a management plugin such as head or bigdesk to quickly
visualize your cluster.

--
Ivan

Missed that number_of_data_nodes is 1. Your indices obviously think that
they have replicas. Scan the cluster state
(http://localhost:9200/_cluster/state).
For each index, does index.number_of_replicas equal anything other than 0?
The lack of the setting means the default of 1 (I might be wrong on this
number).

Do you have perhaps index.auto_expand_replicas enabled?
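
Scanning a large cluster state by eye is tedious, so here is a small Python sketch that lists every index whose replica count is not 0. It assumes the state has a "metadata" -> "indices" -> name -> "settings" layout with string values, and it treats a missing setting as 1, as suggested above. The index names in the sample are hypothetical.

```python
def indices_with_replicas(cluster_state):
    """Return {index_name: replica_count} for indices configured with replicas."""
    result = {}
    indices = cluster_state.get("metadata", {}).get("indices", {})
    for name, meta in indices.items():
        settings = meta.get("settings", {})
        # Assumption: a missing setting means the default of 1 replica.
        replicas = int(settings.get("index.number_of_replicas", 1))
        if replicas != 0:
            result[name] = replicas
    return result


# Hypothetical sample; load the real output of _cluster/state instead.
sample_state = {
    "metadata": {
        "indices": {
            "events-a": {"settings": {"index.number_of_replicas": "1"}},
            "events-b": {"settings": {"index.number_of_replicas": "0"}},
            "events-c": {"settings": {}},
        }
    }
}

result = indices_with_replicas(sample_state)
print(result)
```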

--
Ivan

Okay, now it's occurring on a different machine. Whatever it is, it's not
isolated. Auto_expand_replicas is not in our config file, and we haven't
changed it at runtime. I assume that means it's off.

The logging is set lower and there isn't much there. For example, here is
the whole log from 2013-08-06. This machine isn't very busy:

2013-08-06 16:24:05,892 INFO |elasticsearch[Mephisto][clusterService#updateTask][T#1]| c.metadata [Mephisto] [events-2013-08-06] creating index, cause [auto(bulk api)], shards [5]/[1], mappings []
2013-08-06 16:24:07,808 INFO |elasticsearch[Mephisto][clusterService#updateTask][T#1]| c.metadata [Mephisto] [events-2013-08-06] update_mapping [log] (dynamic)
2013-08-06 16:24:12,873 INFO |elasticsearch[Mephisto][clusterService#updateTask][T#1]| c.metadata [Mephisto] [events-2013-08-07] creating index, cause [auto(bulk api)], shards [5]/[1], mappings []
2013-08-06 16:24:14,884 INFO |elasticsearch[Mephisto][clusterService#updateTask][T#1]| c.metadata [Mephisto] [events-2013-08-07] update_mapping [log] (dynamic)
2013-08-06 16:46:49,844 INFO |elasticsearch[Mephisto][clusterService#updateTask][T#1]| c.metadata [Mephisto] [events-2013-08-06] update_mapping [log] (dynamic)
2013-08-06 22:48:58,019 INFO |elasticsearch[Mephisto][clusterService#updateTask][T#1]| c.metadata [Mephisto] [events-2013-08-10] creating index, cause [api], shards [5]/[1], mappings []
2013-08-06 22:50:08,738 INFO |elasticsearch[Mephisto][clusterService#updateTask][T#1]| c.metadata [Mephisto] [events-2013-08-11] creating index, cause [api], shards [5]/[1], mappings []

Here is _cluster/health:

{
"cluster_name" : "mycluster",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 1,
"active_primary_shards" : 85,
"active_shards" : 85,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 85
}

And _cluster/nodes:

{
"ok" : true,
"cluster_name" : "mycluster",
"nodes" : {
"Sz9SdW45QlGz0UvHQxqkCw" : {
"name" : "Dreamqueen",
"transport_address" : "inet[/192.168.1.90:9300]",
"hostname" : "writer",
"version" : "0.90.1",
"http_address" : "inet[/192.168.1.90:9200]",
"attributes" : {
"client" : "true",
"data" : "false"
}
},
"WImBPVMdQEOloVyO_zLCGg" : {
"name" : "Mephisto",
"transport_address" : "inet[/192.168.1.91:9300]",
"hostname" : "es1",
"version" : "0.90.1",
"http_address" : "inet[/192.168.1.91:9200]"
}
}
}

Interesting bits from _cluster/state:

... cluster info, including nodes...

"metadata" : {
"templates" : { },
"indices" : {
"events-2013-07-20" : {
"state" : "open",
"settings" : {
"index.number_of_shards" : "5",
"index.number_of_replicas" : "1",
"index.version.created" : "900199"
},
"mappings" : {
"log" : {
"properties" : {
"id" : {
"type" : "string"
},
"eventType" : {
"type" : "string"
},
"eventTime" : {
"type" : "string"
}
}
}
},
"aliases" : [ ]
},

  ... lots of index info, all basically looks the same...

"routing_table" : {
"indices" : {
"events-2013-07-20" : {
"shards" : {
"0" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 0,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "events-2013-07-20"
} ],
"1" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 1,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 1,
"index" : "events-2013-07-20"
} ],
"2" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 2,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 2,
"index" : "events-2013-07-20"
} ],
"3" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 3,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 3,
"index" : "events-2013-07-20"
} ],
"4" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 4,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 4,
"index" : "events-2013-07-20"
} ]
}
},

  ... more of the same for each index...

"routing_nodes" : {
"unassigned" : [ {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 1,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 2,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 3,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 4,
"index" : "events-2013-07-20"
},

Anything seem unusual there? Should number_of_replicas be 0, since this is
the only machine in the cluster? (The second machine is a client only.)
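
For what it's worth, the health numbers above are what you would expect from the replica setting alone. A replica is never allocated on the same node as its primary, so with a single data node every replica stays unassigned. The Python sketch below does that arithmetic; it assumes every index uses the same replica count, which the excerpt only shows for one index.

```python
def expected_unassigned_replicas(active_primaries, replicas_per_primary, data_nodes):
    """Estimate how many replica shards cannot be placed anywhere."""
    # Each shard copy needs its own node, so at most (data_nodes - 1)
    # replicas of any one shard can ever be assigned.
    assignable = min(replicas_per_primary, max(data_nodes - 1, 0))
    return active_primaries * (replicas_per_primary - assignable)


# Values taken from the _cluster/health output in this message.
one_data_node = expected_unassigned_replicas(85, 1, 1)
# Same indices if a second data node were added.
two_data_nodes = expected_unassigned_replicas(85, 1, 2)
print(one_data_node, two_data_nodes)
```

The first result matches the 85 unassigned shards and the yellow status reported, so the unassigned shards by themselves do not point to a fault.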

-- Jeremy

Clients should not be part of the cluster. If you have only one data node,
then you should have no replicas. Are the other nodes used for load
balancing? How are you initializing them (node.client, node.data,
node.master)?
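
If you do decide to drop the replicas, the index update settings API accepts a number_of_replicas change on a live index. The Python sketch below only builds the request so it can be inspected without a running cluster; the host and the build_replica_update helper are assumptions for illustration, and the index name is the one from the cluster state in this thread.

```python
import json
import urllib.request


def build_replica_update(index, replicas, host="http://localhost:9200"):
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_replica_update("events-2013-07-20", 0)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
# urllib.request.urlopen(req) would send it; it is left out here so the
# sketch runs without a cluster.
```

Indices created later would still pick up the default unless index.number_of_replicas is also set in the node configuration or an index template.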

--
Ivan

On Wed, Aug 7, 2013 at 1:50 PM, Jeremy Karlson jeremykarlson@gmail.comwrote:

Okay, now it's occurring on a different machine. Whatever it is, it's not
isolated. Auto_expand_replicas is not in our config file, and we haven't
changed it at runtime. I assume that means it's off.

The logging is set lower and there isn't much there. For example, here is
the whole log from 2013-08-06. This machine isn't very busy:

2013-08-06 16:24:05,892 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-06] creating index, cause [auto(bulk
api)], shards [5]/[1], mappings []
2013-08-06 16:24:07,808 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-06] update_mapping [log] (dynamic)
2013-08-06 16:24:12,873 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-07] creating index, cause [auto(bulk
api)], shards [5]/[1], mappings []
2013-08-06 16:24:14,884 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-07] update_mapping [log] (dynamic)
2013-08-06 16:46:49,844 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-06] update_mapping [log] (dynamic)
2013-08-06 22:48:58,019 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-10] creating index, cause [api],
shards [5]/[1], mappings []
2013-08-06 22:50:08,738 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-11] creating index, cause [api],
shards [5]/[1], mappings []

Here is _cluster/health:

{
"cluster_name" : "mycluster",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 1,
"active_primary_shards" : 85,
"active_shards" : 85,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 85
}

And _cluster/nodes:

{
"ok" : true,
"cluster_name" : "mycluster",
"nodes" : {
"Sz9SdW45QlGz0UvHQxqkCw" : {
"name" : "Dreamqueen",
"transport_address" : "inet[/192.168.1.90:9300]",
"hostname" : "writer",
"version" : "0.90.1",
"http_address" : "inet[/192.168.1.90:9200]",
"attributes" : {
"client" : "true",
"data" : "false"
}
},
"WImBPVMdQEOloVyO_zLCGg" : {
"name" : "Mephisto",
"transport_address" : "inet[/192.168.1.91:9300]",
"hostname" : "es1",
"version" : "0.90.1",
"http_address" : "inet[/192.168.1.91:9200]"
}
}
}

Interesting bits from _cluster/state:

... cluster info, including nodes...

"metadata" : {
"templates" : { },
"indices" : {
"events-2013-07-20" : {
"state" : "open",
"settings" : {
"index.number_of_shards" : "5",
"index.number_of_replicas" : "1",
"index.version.created" : "900199"
},
"mappings" : {
"log" : {
"properties" : {
"id" : {
"type" : "string"
},
"eventType" : {
"type" : "string"
},
"eventTime" : {
"type" : "string"
}
}
}
},
"aliases" : [ ]
},

  ... lots of index info, all basically looks the same...

"routing_table" : {
"indices" : {
"events-2013-07-20" : {
"shards" : {
"0" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 0,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "events-2013-07-20"
} ],
"1" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 1,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 1,
"index" : "events-2013-07-20"
} ],
"2" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 2,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 2,
"index" : "events-2013-07-20"
} ],
"3" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 3,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 3,
"index" : "events-2013-07-20"
} ],
"4" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 4,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 4,
"index" : "events-2013-07-20"
} ]
}
},

  ... more of the same for each index...

"routing_nodes" : {
"unassigned" : [ {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 1,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 2,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 3,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 4,
"index" : "events-2013-07-20"
},

Anything seem unusual there? Should number_of_replicas be 0, since this
is the only machine in the cluster? (The second machine is a client only.)
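
If dropping the replicas turns out to be the right call, a minimal sketch of the settings update might look like the following. It only builds the request for the index settings endpoint; the base URL and the helper name `replicas_request` are assumptions for illustration, and nothing is sent unless the last line is uncommented.

```python
import json
import urllib.request


def replicas_request(base_url, index, replicas):
    """Build (but do not send) a PUT to /<index>/_settings.

    `base_url` and this helper's name are illustrative assumptions.
    """
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    url = "{}/{}/_settings".format(base_url.rstrip("/"), index)
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = replicas_request("http://localhost:9200", "events-2013-07-20", 0)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually apply the change against a running node:
# urllib.request.urlopen(req)
```

The same body could be sent with curl; the Python form is only meant to make the URL and payload explicit.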

-- Jeremy

On Wednesday, 7 August 2013 09:14:10 UTC-7, Ivan Brusic wrote:

Missed that number_of_data_nodes is 1. Your indices obviously think that
they have replicas. Scan the cluster state
(http://localhost:9200/_cluster/state). For each index, does
index.number_of_replicas equal anything other than 0? If the setting is
absent, the default of 1 applies (I might be wrong on this number).

Do you have perhaps index.auto_expand_replicas enabled?
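
The per-index check described above can be sketched roughly as follows. This assumes the cluster state has the `metadata.indices.<name>.settings` layout quoted elsewhere in this thread, and that a missing `index.number_of_replicas` means 1 replica. The function name `replica_report` and the second sample index are hypothetical.

```python
def replica_report(cluster_state, data_nodes):
    """Return {index_name: replicas} for indices that cannot be fully assigned."""
    report = {}
    indices = cluster_state.get("metadata", {}).get("indices", {})
    for name, meta in indices.items():
        settings = meta.get("settings", {})
        # Assumption: an absent setting means the default of 1 replica.
        replicas = int(settings.get("index.number_of_replicas", 1))
        # A replica is never allocated on the node that holds its primary,
        # so an index needs at least replicas + 1 data nodes.
        if replicas + 1 > data_nodes:
            report[name] = replicas
    return report


# Trimmed sample shaped like the cluster state in this thread.
# "example-no-replicas" is a hypothetical index added for contrast.
sample_state = {
    "metadata": {
        "indices": {
            "events-2013-07-20": {
                "settings": {
                    "index.number_of_shards": "5",
                    "index.number_of_replicas": "1",
                }
            },
            "example-no-replicas": {
                "settings": {
                    "index.number_of_shards": "5",
                    "index.number_of_replicas": "0",
                }
            },
        }
    }
}

print(replica_report(sample_state, data_nodes=1))
```

Any index that shows up in the report will keep its replica shards unassigned (and the cluster yellow) until either another data node joins or the replica count is lowered.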

--
Ivan

On Wed, Aug 7, 2013 at 9:04 AM, Ivan Brusic iv...@brusic.com wrote:

Your situation is extremely odd. You state that your cluster has only
one node, but the cluster API is showing that number_of_nodes is 5. You can
use the cluster API to get all the nodes in the cluster:
http://localhost:9200/_cluster/nodes (change localhost to the node that is
running elasticsearch). Either there are other elasticsearch clusters in your
network, or you started several processes on a single server. If
elasticsearch cannot bind to the default ports (9200/9300), it will choose
the next available port.
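
As a rough sketch of how the nodes listing could be read, the snippet below separates data nodes from non-data nodes. It assumes a response shaped like the `_cluster/nodes` output posted elsewhere in this thread, where a client-only node carries `"data" : "false"` in its attributes. The function name `split_nodes` is hypothetical.

```python
def split_nodes(nodes_response):
    """Return (data_node_names, non_data_node_names), each sorted."""
    data, non_data = [], []
    for node in nodes_response.get("nodes", {}).values():
        attrs = node.get("attributes", {})
        # Assumption: a node holds data unless it explicitly sets data=false.
        if attrs.get("data", "true") == "false":
            non_data.append(node["name"])
        else:
            data.append(node["name"])
    return sorted(data), sorted(non_data)


# Sample trimmed from the _cluster/nodes output in this thread.
sample_nodes = {
    "cluster_name": "mycluster",
    "nodes": {
        "Sz9SdW45QlGz0UvHQxqkCw": {
            "name": "Dreamqueen",
            "attributes": {"client": "true", "data": "false"},
        },
        "WImBPVMdQEOloVyO_zLCGg": {"name": "Mephisto"},
    },
}

data_nodes, other_nodes = split_nodes(sample_nodes)
print("data nodes:", data_nodes)
print("non-data nodes:", other_nodes)
```

Comparing the two lists against number_of_nodes and number_of_data_nodes from the health output shows quickly whether unexpected processes have joined the cluster.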

You can also use a management plugin such as head or bigdesk to quickly
visualize your cluster.

--
Ivan

On Tue, Aug 6, 2013 at 11:23 AM, Jeremy Karlson jeremy...@gmail.com wrote:

Back again!

At some point over the weekend, it appears Elasticsearch decided to
stop refreshing again. This time, I was unable to force refreshes to occur
by manually triggering them. Without any better ideas, I've turned on
TRACE debugging and enabled the debugger, in case it happens again.

It is currently in the process of starting up, and it's taking quite a
while. If I query the cluster health, I see this:

{
"cluster_name" : "MyCluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 5,
"number_of_data_nodes" : 1,
"active_primary_shards" : 27,
"active_shards" : 27,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 148
}

And then if I query again a minute later, I see this:

{
"cluster_name" : "MyCluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 5,
"number_of_data_nodes" : 1,
"active_primary_shards" : 31,
"active_shards" : 31,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 144
}

So it's certainly taking its time in assigning shards. To me (with my limited ES knowledge) that does seem a little odd since this is a cluster of just one node. I don't know what's normal or not in the log at this point, but I see a lot of talk about recovering shards with a reason of "post recovery." ES was shut down cleanly and restarted, so I'm unclear on what "recovery" might mean in this situation.
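
To put a number on the pace, two health snapshots can be compared with a small helper like the one below. This is only a sketch: the function name `recovery_progress` is hypothetical, and it uses just the fields shown in the two outputs above.

```python
def recovery_progress(before, after):
    """Compare two _cluster/health snapshots taken some time apart."""
    total = after["active_shards"] + after["unassigned_shards"]
    return {
        "newly_active": after["active_shards"] - before["active_shards"],
        "still_unassigned": after["unassigned_shards"],
        "percent_active": 100.0 * after["active_shards"] / total,
    }


# Values copied from the two health outputs above (about a minute apart).
first = {"active_shards": 27, "unassigned_shards": 148}
second = {"active_shards": 31, "unassigned_shards": 144}

progress = recovery_progress(first, second)
print(progress["newly_active"], "shards became active")
print(progress["still_unassigned"], "shards still unassigned")
print("{:.1f}% of shards active".format(progress["percent_active"]))
```

Note that the unassigned count includes any replica shards, which cannot be assigned at all while there is only one data node, so it will not necessarily reach zero.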

This is certainly starting to become very frustrating and concerning. Anyone else have any ideas?

-- Jeremy

On Wednesday, 31 July 2013 12:34:12 UTC-7, Jeremy Karlson wrote:

Hm. I suppose "huge" is a subjective term, but no, we didn't have a
large number of updates. I would say this machine is under a
low-to-moderate load generally, with indexes and documents being deleted
and recreated fairly frequently. It's the target of a number of automated
tests, so things tend to get torn down and rebuilt with some regularity.
It has perhaps a dozen indexes with 35,000 docs in the most populated I
see. Most indexes are much less populated with a few hundred docs. Our
documents are probably 1 - 2 k each.

I have since restarted the machine and things have returned to
"normal" (up for debate - we're looking at some different weird behaviour
now, but that problem appears to have gone away), but as I recall forcing
the manual refresh was a quick operation that took almost no time running
it from a CURL command line.

-- Jeremy

On Wednesday, 31 July 2013 01:54:04 UTC-7, Alexander Reelsen wrote:

Hey,

did you have a huge number of updates shortly before this strange
behaviour happened? One reason your refresh might not be executing is that
the refresh which started a second earlier has not yet finished. But the
refresh you are executing manually should not be different.

If you force the refresh, how long does it take until the documents
are searchable? Immediately?

--Alex

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.
