Can get document by ID but not find it in query all?

Jeremy_Karlson · August 7, 2013, 9:44pm

Someone smarter than I just pointed that out. All of our other nodes are
clients only - node.client - no data, no master. So, number_of_replicas
should always be zero. I've changed that for all of the indexes we've
currently have, and I've set the default that way as well for any new
indexes that are created.

-- Jeremy

On Wednesday, 7 August 2013 14:27:14 UTC-7, Ivan Brusic wrote:

Clients should not be part of the cluster. If you have only one data node,
then you should have no replicas. Are the other nodes used for load
balancing? How are you initializing them (node.client, node.data,
node.master)?

--
Ivan

On Wed, Aug 7, 2013 at 1:50 PM, Jeremy Karlson <jeremy...@gmail.com<javascript:>

wrote:
Okay, now it's occurring on a different machine. Whatever it is, it's
not isolated. Auto_expand_replicas is not in our config file, and we
haven't changed it at runtime. I assume that means it's off.

The logging is set lower and there isn't much there. For example, here
is the whole log from 2013-08-06. This machine isn't very busy:

2013-08-06 16:24:05,892 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-06] creating index, cause [auto(bulk
api)], shards [5]/[1], mappings
2013-08-06 16:24:07,808 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-06] update_mapping [log] (dynamic)
2013-08-06 16:24:12,873 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-07] creating index, cause [auto(bulk
api)], shards [5]/[1], mappings
2013-08-06 16:24:14,884 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-07] update_mapping [log] (dynamic)
2013-08-06 16:46:49,844 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-06] update_mapping [log] (dynamic)
2013-08-06 22:48:58,019 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-10] creating index, cause [api],
shards [5]/[1], mappings
2013-08-06 22:50:08,738 INFO
|elasticsearch[Mephisto][clusterService#updateTask][T#1]|
c.metadata [Mephisto] [events-2013-08-11] creating index, cause [api],
shards [5]/[1], mappings

Here is _cluster/health:

{
"cluster_name" : "mycluster",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 1,
"active_primary_shards" : 85,
"active_shards" : 85,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 85
}

And _cluster/nodes:

{
"ok" : true,
"cluster_name" : "mycluster",
"nodes" : {
"Sz9SdW45QlGz0UvHQxqkCw" : {
"name" : "Dreamqueen",
"transport_address" : "inet[/192.168.1.90:9300]",
"hostname" : "writer",
"version" : "0.90.1",
"http_address" : "inet[/192.168.1.90:9200]",
"attributes" : {
"client" : "true",
"data" : "false"
}
},
"WImBPVMdQEOloVyO_zLCGg" : {
"name" : "Mephisto",
"transport_address" : "inet[/192.168.1.91:9300]",
"hostname" : "es1",
"version" : "0.90.1",
"http_address" : "inet[/192.168.1.91:9200]"
}
}
}

Interesting bits from _cluster/state:

... cluster info, including nodes...

"metadata" : {
"templates" : { },
"indices" : {
"events-2013-07-20" : {
"state" : "open",
"settings" : {
"index.number_of_shards" : "5",
"index.number_of_replicas" : "1",
"index.version.created" : "900199"
},
"mappings" : {
"log" : {
"properties" : {
"id" : {
"type" : "string"
},
"eventType" : {
"type" : "string"
},
"eventTime" : {
"type" : "string"
}
}
}
},
"aliases" :
},
  ... lots of index info, all basically looks the same...
"routing_table" : {
"indices" : {
"events-2013-07-20" : {
"shards" : {
"0" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 0,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "events-2013-07-20"
} ],
"1" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 1,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 1,
"index" : "events-2013-07-20"
} ],
"2" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 2,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 2,
"index" : "events-2013-07-20"
} ],
"3" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 3,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 3,
"index" : "events-2013-07-20"
} ],
"4" : [ {
"state" : "STARTED",
"primary" : true,
"node" : "WImBPVMdQEOloVyO_zLCGg",
"relocating_node" : null,
"shard" : 4,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 4,
"index" : "events-2013-07-20"
} ]
}
},
  ... more of the same for each index...
"routing_nodes" : {
"unassigned" : [ {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 1,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 2,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 3,
"index" : "events-2013-07-20"
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 4,
"index" : "events-2013-07-20"
},

Anything seem unusual there? Should number_of_replicas be 0, since this
is the only machine in the cluster? (The second machine is a client only.)

-- Jeremy

On Wednesday, 7 August 2013 09:14:10 UTC-7, Ivan Brusic wrote:

Missed that number_of_data_nodes is 1. Your indices obviously think that
they have replicas. Scan the cluster state (http://localhost:9200/_**
cluster/state http://192.168.52.155:9200/_cluster/state) For each
index, does index.number_of_replicas equal anything else but 0? The lack
of the setting means the default of 1 (I might be wrong on this number).

Do you have perhaps index.auto_expand_**replicas enabled?

--
Ivan

On Wed, Aug 7, 2013 at 9:04 AM, Ivan Brusic iv...@brusic.com wrote:

Your situation is extremely odd. You state that your cluster has only
one node, but the cluster API is showing that number_of_nodes is 5. You can
use the cluster API to get all the nodes in the cluster:
http://localhost:**9200/_cluster/nodeshttp://192.168.52.155:9200/_cluster/nodes Change
localhost to the node that is running elasticsearch. Either there are other
elasticsearch clusters in your network, or you started several process on a
single server. If elasticsearch cannot bind to the default ports
(9200/9300). It will choose the next available port.

You can also use a management plugin such as head or bigdesk to quickly
visualize your cluster.

--
Ivan

On Tue, Aug 6, 2013 at 11:23 AM, Jeremy Karlson jeremy...@gmail.comwrote:

Back again!

At some point over the weekend, it appears Elasticsearch decided to
stop refreshing again. This time, I was unable to force refreshes to occur
by manually triggering them. Without any better ideas, I've turned on
TRACE debugging and enabled the debugger, in case it happens again.

It is currently in the process of starting up, and it's taking quite a
while. If I query the cluster health, I see this:

{
"cluster_name" : "MyCluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 5,
"number_of_data_nodes" : 1,
"active_primary_shards" : 27,
"active_shards" : 27,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 148
}

And then if I query again a minute later, I see this:

{
"cluster_name" : "MyCluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 5,
"number_of_data_nodes" : 1,
"active_primary_shards" : 31,
"active_shards" : 31,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 144
}

So it's certainly taking it's time in assigning shards. To me (with my limited ES knowledge) that does seem a little odd since this is a cluster of just one node. I don't know what's normal or not in the log at this point, but I see a lot of talk about recovering shards with a reason of "post recovery." ES was cleanly shutdown and restarted, so I'm unclear on what "recovery" might mean in this situation.

This is certainly starting to become very frustrating and concerning. Anyone else have any ideas?

-- Jeremy

On Wednesday, 31 July 2013 12:34:12 UTC-7, Jeremy Karlson wrote:

Hm. I suppose "huge" is a subjective term, but no, we didn't have a

large number of updates. I would say this machine is under a
low-to-moderate load generally, with indexes and documents being deleted
and recreated fairly frequently. It's the target of a number of automated
tests, so things tend to get torn down and rebuilt with some regularity.
It has perhaps a dozen indexes with 35,000 docs in the most populated I
see. Most indexes are much less populated with a few hundred docs. Our
documents are probably 1 - 2 k each.

I have since restarted the machine and things have returned to
"normal" (up for debate - we're looking at some different weird behaviour
now, but that problem appears to have gone away), but as I recall forcing
the manual refresh was a quick operation that took almost no time running
it from a CURL command line.

-- Jeremy

On Wednesday, 31 July 2013 01:54:04 UTC-7, Alexander Reelsen wrote:

Hey,

did you have a huge amount of updates shortly before, when this
strange behaviour happens? One of the reasons you refresh is not executing,
might be because the refresh, which started a second earlier has not yet
been finished. But the refresh you are executing manually should not be
different.

If you force the refresh, how long does it take until the documents
are searchable? Immediately?

--Alex

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@**googlegroups.com.

For more options, visit https://groups.google.com/**groups/opt_out https://groups.google.com/groups/opt_out
.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com <javascript:>.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Topic		Replies	Views
I search same thing, but once can get and once can not get? Elasticsearch	7	407	July 6, 2017
Search Periodically Failing ... Even After Many Refresh Intervals Elasticsearch	8	559	July 6, 2017
Index not refreshed. Query count result not accurate Elasticsearch	7	1913	July 6, 2017
Documents missing after indexing and refreshing Elasticsearch	16	2119	July 6, 2017
Refresh + child documents Elasticsearch	7	365	July 6, 2017

Can get document by ID but not find it in query all?

Related topics