Marvel.agent: failed to flush exporter bulks


(Sayakiss) #1

Installed marvel plugin to my elasticsearch, but the log gives:

[2015-11-29 15:59:52,514][ERROR][marvel.agent             ] [crawler_service_001] background thread had an uncaught exception
ElasticsearchException[failed to flush exporter bulks]
	at org.elasticsearch.marvel.agent.exporter.ExportBulk$Compound.flush(ExportBulk.java:104)
	at org.elasticsearch.marvel.agent.exporter.ExportBulk.close(ExportBulk.java:53)
	at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.run(AgentService.java:201)
	at java.lang.Thread.run(Thread.java:745)
	Suppressed: ElasticsearchException[failed to flush [default_local] exporter bulk]; nested: ElasticsearchException[failure in bulk execution, only the first 100 failures are printed:
[0]: index [.marvel-es-2015.11.29], type [indices_stats], id [AVFSQGxfYu-NU9LOfmGT], message [UnavailableShardsException[[.marvel-es-2015.11.29][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@333878cf]]
[1]: index [.marvel-es-2015.11.29], type [cluster_stats], id [AVFSQGxfYu-NU9LOfmGU], message [UnavailableShardsException[[.marvel-es-2015.11.29][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@333878cf]]
[2]: index [.marvel-es-2015.11.29], type [cluster_state], id [AVFSQGxfYu-NU9LOfmGV], message [UnavailableShardsException[[.marvel-es-2015.11.29][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@333878cf]]
[3]: index [.marvel-es-2015.11.29], type [nodes], id [AVFSQGxfYu-NU9LOfmGW], message [UnavailableShardsException[[.marvel-es-2015.11.29][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@333878cf]]
[4]: index [.marvel-es-data], type [node], id [Ax7rXEaMRjmHXT0wKn3BtA], message [UnavailableShardsException[[.marvel-es-data][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@29126d6d]]
[5]: index [.marvel-es-2015.11.29], type [shards], id [n5sRlr1tTiGh0B65lgPPOw:Ax7rXEaMRjmHXT0wKn3BtA:sfs-2015.11.12:2:p], message [UnavailableShardsException[[.marvel-es-2015.11.29][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@333878cf]]
[6]: index [.marvel-es-2015.11.29], type [shards], id [n5sRlr1tTiGh0B65lgPPOw:_na:sfs-2015.11.12:2:r], message [UnavailableShardsException[[.marvel-es-2015.11.29][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@333878cf]]
//ignore some...
[99]: index [.marvel-es-2015.11.29], type [shards], id [n5sRlr1tTiGh0B65lgPPOw:Ax7rXEaMRjmHXT0wKn3BtA:.watch_history-2015.11.25:0:p], message [UnavailableShardsException[[.marvel-es-2015.11.29][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@333878cf]]]
		at org.elasticsearch.marvel.agent.exporter.local.LocalBulk.flush(LocalBulk.java:114)
		at org.elasticsearch.marvel.agent.exporter.ExportBulk$Compound.flush(ExportBulk.java:101)
		... 3 more

When I ran curl -XGET http://localhost:9200/_cat/indices?v | grep marvel, I saw:

red    open   .marvel-es-2015.11.29       1   1                                                   
red    open   .marvel-es-data             1   1                                                   

Then curl -XGET http://localhost:9200/_cat/shards | grep marvel:

.marvel-es-2015.11.29     0 r UNASSIGNED                                              
.marvel-es-data           0 p UNASSIGNED                                              
.marvel-es-data           0 r UNASSIGNED  
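To see why a shard is stuck UNASSIGNED on ES 2.x, the cat shards API can be asked for the unassigned.reason column. A diagnostic sketch, assuming the same localhost:9200 endpoint as above:

```shell
# List each Marvel shard with its state and the reason it is unassigned
# (unassigned.reason is a standard _cat/shards column in ES 2.x).
curl -XGET 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' \
  | grep marvel
```

Common reasons include NODE_LEFT and CLUSTER_RECOVERED, which narrow down whether the shard data was lost or just never re-allocated.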

It seems Elasticsearch didn't assign a node for the Marvel indices, but I don't know why.

I tried to manually assign the shard with:

curl -XPOST -d '{ "commands" : [ { "allocate" : { "index" : ".marvel-es-data", "shard" : 0, "node" :"crawler_service_001" } } ] }' http://172.16.11.17:9200/_cluster/reroute?pretty

results:

{
   "error" : {
     "root_cause" : [ {
        "type" : "illegal_argument_exception",
        "reason" : "[allocate] trying to allocate a primary shard [.marvel-es-data][0], which is disabled"
    } ],
    "type" : "illegal_argument_exception",
    "reason" : "[allocate] trying to allocate a primary shard [.marvel-es-data][0], which is disabled"
  },
  "status" : 400
}
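That error is the reroute API protecting you: in ES 2.x the allocate command refuses to force a primary unless you explicitly allow it, because forcing an empty primary discards whatever data the lost shard copy held. A hedged sketch of the forced variant, reusing the host and node name from the command above; only do this if the data is expendable, as Marvel monitoring data usually is:

```shell
# WARNING: allow_primary allocates an EMPTY primary; any data that was in
# the lost shard copy is gone for good. Acceptable for throwaway
# monitoring indices, dangerous for anything else.
curl -XPOST 'http://172.16.11.17:9200/_cluster/reroute?pretty' -d '{
  "commands" : [ {
    "allocate" : {
      "index" : ".marvel-es-data",
      "shard" : 0,
      "node" : "crawler_service_001",
      "allow_primary" : true
    }
  } ]
}'
```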

Environment:

  • Elasticsearch - 2.1.0
  • marvel-agent - 2.1.0

(Sayakiss) #2

Okay, I moved all my indices to another place, and it worked...

I will reimport the indices via Logstash...

But I'm still curious about what happened to my Elasticsearch cluster to lead it into such a state...


(Elain) #3

I've had the same problem.

elasticsearch 2.1.1
kibana 4.3.1


(Clay Gorman) #4

I have the same issue here.

curl -XGET http://eth1:9200/_cat/shards -s | grep marvel
.marvel-es-2016.01.07 0 r STARTED 2077820  977.6mb 192.241.221.72  NODE_6 
.marvel-es-2016.01.07 0 p STARTED 2077820  965.8mb 192.241.208.110 NODE_2 
.marvel-es-data       0 p STARTED      15    8.9kb 192.241.221.72  NODE_6 
.marvel-es-data       0 r STARTED      15   11.3kb 192.241.208.110 NODE_2 

Mine seems to load only intermittently. I think it's the shard activity at the bottom that is having issues.

I have 4200 shards and 420 indices. Any thoughts?


(flyingrabbit) #6

[2016-03-16 09:52:57,944][ERROR][marvel.agent ] [cs-mv1-agtfs] background thread had an uncaught exception
ElasticsearchException[failed to flush exporter bulks]
at org.elasticsearch.marvel.agent.exporter.ExportBulk$Compound.flush(ExportBulk.java:104)
at org.elasticsearch.marvel.agent.exporter.ExportBulk.close(ExportBulk.java:53)
at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.run(AgentService.java:201)
at java.lang.Thread.run(Thread.java:745)
Suppressed: ElasticsearchException[failed to flush [default_local] exporter bulk]; nested: ElasticsearchException[failure in bulk execution:
[0]: index [.marvel-es-2016.03.16], type [indices_stats], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[1]: index [.marvel-es-2016.03.16], type [index_stats], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[2]: index [.marvel-es-2016.03.16], type [index_stats], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[3]: index [.marvel-es-2016.03.16], type [index_stats], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[4]: index [.marvel-es-2016.03.16], type [index_stats], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[5]: index [.marvel-es-2016.03.16], type [index_stats], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[6]: index [.marvel-es-2016.03.16], type [cluster_state], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[7]: index [.marvel-es-2016.03.16], type [nodes], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[9]: index [.marvel-es-2016.03.16], type [nodes], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[11]: index [.marvel-es-2016.03.16], type [nodes], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[13]: index [.marvel-es-2016.03.16], type [nodes], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[15]: index [.marvel-es-2016.03.16], type [index_recovery], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]
[16]: index [.marvel-es-2016.03.16], type [cluster_stats], id [null], message [[.marvel-es-2016.03.16] IndexNotFoundException[no such index]]


(flyingrabbit) #7

The weird thing is that I tried exporting locally and it worked. It seems like a communication issue between the production cluster and the monitoring cluster. This is Marvel 2.1.2.
I have tried pointing to the monitoring cluster by FQDN, IP, and hostname.

I can pull up the head plugin from any server in the production cluster pointing at the monitoring node, and vice versa, so I don't think it is a firewall issue.

Anyone seen anything like this?


(flyingrabbit) #8

I'm just wondering why this entry appears:

Suppressed: ElasticsearchException[failed to flush [default_local] exporter bulk]; nested: ElasticsearchException[failure in bulk execution:

It seems like it's attempting to write locally. My setting is as follows:

marvel.agent.exporters:
  id1:
    type: http
    host: ["http://NAMEOFTHEMONITORNODE:9200"]


(flyingrabbit) #9

I also tried creating the Marvel indices manually on the monitoring server, and they don't get populated. If I create the Marvel indices on the production cluster, they do get populated. It seems to be ignoring the marvel.agent exporter setting completely.
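One way to check whether the exporter setting was picked up at all is to ask each node for the settings it actually started with. A diagnostic sketch; the localhost endpoint is an assumption, substitute any production node:

```shell
# Dump per-node settings and look for the marvel exporter keys.
# If nothing matches, elasticsearch.yml on that node doesn't carry the
# config (check YAML indentation, then restart the node): settings under
# marvel.agent.exporters are read from elasticsearch.yml at startup.
curl -XGET 'http://localhost:9200/_nodes/settings?pretty' | grep -A 3 'marvel'
```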


(Chris Earle) #10

I saw this occur recently for a user who had messed up their cluster. If this occurs, please verify that your Marvel indices are not red:

$ curl -XGET host:9200/_cat/indices/.marvel*?v

In the user's case, their .marvel-data-1 index was red because they had lost all of its shards, which was causing this exception to be triggered over and over again.
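If the cat output does show red Marvel indices and the monitoring history is expendable (it usually is), deleting them lets Marvel recreate them at the next collection interval. A sketch, assuming the host and the index names reported earlier in the thread:

```shell
# Delete the broken monitoring indices; Marvel recreates them
# automatically the next time the agent exports data.
curl -XDELETE 'http://localhost:9200/.marvel-es-2015.11.29'
curl -XDELETE 'http://localhost:9200/.marvel-es-data'
```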


(Barry Kaplan) #11

We just got these across our cluster. 7 out of 10 data nodes dropped out of the cluster shortly after these exceptions.

(We are running marvel on the same cluster. ES 2.3.3)


(Dfaropennetwork) #12

Hi, I had the same problem. My solution:

~$ curl -u admin -XGET http://hostname:9200/_cat/indices?v | grep marvel (or grep red or yellow)
red open .marvel-es-data-1 1 1
yellow open .marvel-es-1-2016.08.25 1 1 10837 1380 3.8mb 3.8mb
yellow open .marvel-es-1-2016.08.04 1 1 24934 1592 5.9mb 5.9mb
red open .marvel-es-1-2016.08.03 1 1

This index was not necessary for me, so I deleted it:

curl -u admin -XDELETE 'http://hostname:9200/.marvel-es-1-2016.08.03/'
{"acknowledged":true}

Then I removed the replicas:

$ curl -u admin -XPUT 'hostname:9200/_settings' -d '
{
  "index" : {
    "number_of_replicas" : 0
  }
}'

Now my cluster status is green.
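Note that a PUT to _settings with no index name changes number_of_replicas on every index in the cluster. If the goal is only to get the monitoring indices green (e.g. on a single-node setup with nowhere to place replicas), the change can be scoped with a wildcard. A sketch, reusing the same assumed hostname:

```shell
# Drop replicas only for the Marvel indices instead of cluster-wide.
curl -u admin -XPUT 'hostname:9200/.marvel-es-*/_settings' -d '
{
  "index" : {
    "number_of_replicas" : 0
  }
}'
```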


(Seth S) #13

I've had the same issue. The fix for me was, as mentioned earlier, to move the data to another location. Could it be a permissions issue?


(Jules Huang) #14

I have the same issue. I had three indices marked as red: .marvel-es-1-2017.03.04, .marvel-es-1-2017.03.05, and .marvel-es-1-2017.03.06. I deleted the first two with no issues. But when I deleted the last one, which is for today, a new one was auto-created and is still red. Any idea how I can fix this?
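Deleting the current day's index alone doesn't help, because the agent recreates it at the next collection. One approach, based on the Marvel 2.x marvel.agent.interval setting (setting it to -1 is documented as temporarily disabling collection), is to pause collection first, delete the red index, then resume. A sketch with an assumed localhost endpoint:

```shell
# Pause Marvel data collection (interval -1 disables it temporarily).
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '
{ "transient" : { "marvel.agent.interval" : -1 } }'

# Delete the red index while nothing is writing to it.
curl -XDELETE 'http://localhost:9200/.marvel-es-1-2017.03.06'

# Resume collection; the index is recreated cleanly.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '
{ "transient" : { "marvel.agent.interval" : "10s" } }'
```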


(system) #15