How to debug org.elasticsearch.action.bulk .BulkRequestBuilder?


(Weldpua2008) #1

Hello,
Our developers are using elasticsearch Java bindings to query a Bulk elasticsearch API (org.elasticsearch.action.bulk .BulkRequestBuilder).
The 5-15% of the requests into a bulk request aren't processing on the production (stagging is ok).
How I can found the source of the problem? I tried to put the es.logger.level to INFO but it doesn't helps.

The code is similar to the following:

void executeBuilder(BulkRequestBuilder bulkRequestBuilder)
{
	if(bulkRequestBuilder.numberOfActions() == 0)
		return;
	
	++stats.flushes;
	BulkResponse response = bulkRequestBuilder.execute().actionGet();
	if (response.hasFailures()) {
              logger.warn("failed to execute" + response.buildFailureMessage());
          }
	
	for(BulkItemResponse resp : response){
		if(resp.isFailed()){
			stats.failed++;
		}else{
			stats.succeeded++;
		}
	}
}

the counter of stats.failed ~= 5-15% per bulk request.

PS:

curl http://localhost:9200/_nodes/_local?pretty=true

{
  "cluster_name" : "cluster",
  "nodes" : {
    "xxx" : {      
      "version" : "2.3.3",
      "build" : "218bdf1",      
      "settings" : {       
        "node" : {
          "data" : "true",        
          "master" : "true"
        },        
        "indices" : {"cache" : {"query" : {"size" : "5%"}}},                
        "bootstrap" : {
          "mlockall" : "true"
        },
        "foreground" : "false",
        "config" : {"ignore_system_properties" : "true"},        
      },
      "os" : { "refresh_interval_in_millis" : 1000, "allocated_processors" : 16 },
      "process" : {
        "refresh_interval_in_millis" : 1000,"id" : 45184,"mlockall" : true },
      "jvm" : {
        "pid" : 45184,
        "version" : "1.8.0_72-internal",
        "vm_name" : "OpenJDK 64-Bit Server VM",
        "vm_version" : "25.72-b05",
        "vm_vendor" : "Oracle Corporation",
        "start_time_in_millis" : 1470557125542,
        "mem" : {
          "heap_init_in_bytes" : 34359738368,
          "heap_max_in_bytes" : 34246361088,
          "non_heap_init_in_bytes" : 2555904,
          "non_heap_max_in_bytes" : 0,
          "direct_max_in_bytes" : 34246361088
        },
        "gc_collectors" : [ "ParNew", "ConcurrentMarkSweep" ],
        "memory_pools" : [ "Code Cache", "Metaspace", "Par Eden Space", "Par Survivor Space", "CMS Old Gen" ],
        "using_compressed_ordinary_object_pointers" : "false"
      },
      "thread_pool" : {        
        "bulk" : {
          "type" : "fixed",
          "min" : 16,
          "max" : 16,
          "queue_size" : 50
        }
      },     
      "plugins" : [ {
        "name" : "cloud-aws",
        "version" : "2.3.3",
        "description" : "The Amazon Web Service (AWS) Cloud plugin allows to use AWS API for the unicast discovery mechanism and add S3 repositories.",
        "jvm" : true,
        "classname" : "org.elasticsearch.plugin.cloud.aws.CloudAwsPlugin",
        "isolated" : true,
        "site" : false
      } ],
      "modules" : [ {
        "name" : "lang-expression",
        "version" : "2.3.3",
        "description" : "Lucene expressions integration for Elasticsearch",
        "jvm" : true,
        "classname" : "org.elasticsearch.script.expression.ExpressionPlugin",
        "isolated" : true,
        "site" : false
      }, {
        "name" : "lang-groovy",
        "version" : "2.3.3",
        "description" : "Groovy scripting integration for Elasticsearch",
        "jvm" : true,
        "classname" : "org.elasticsearch.script.groovy.GroovyPlugin",
        "isolated" : true,
        "site" : false
      }, {
        "name" : "reindex",
        "version" : "2.3.3",
        "description" : "_reindex and _update_by_query APIs",
        "jvm" : true,
        "classname" : "org.elasticsearch.index.reindex.ReindexPlugin",
        "isolated" : true,
        "site" : false
      } ]
    }
  }
}

(Jörg Prante) #2

Can you show the content of response.buildFailureMessage()? There you can read the error messages.

Also, check the ES cluster log files.


(Weldpua2008) #4

I can't see the response.buildFailureMessage() because the application code doesn't log it. It's a reason why I am searching another way to debug.

ES cluster log files
Are not helpful. I increased logvelel to INFO (?):

curl -XPUT http://Data01:9200/_cluster/settings -d '{
    "transient" : {
        "logger.level" : "INFO"
    }
}'

(Jörg Prante) #5

That is the only way to gain insight into failure messages from the client application perspective.


(Weldpua2008) #6

Thanks


(Weldpua2008) #7

Hi,
I faced with the following issues:

[2016-09-28 13:30:12,165][WARN ][action.bulk              ] [cl-v2ElasticData02] unexpected error during the primary phase for action [indices:data/write/bulk[s]], request [BulkShardRequest to [x-stats-agg-8b3c562a-5bb2-4680-8dcf-42f83dce30c7-569] containing [1999] requests]
[x-stats-agg-8b3c562a-5bb2-4680-8dcf-42f83dce30c7-569] IndexNotFoundException[no such index]
        at org.elasticsearch.cluster.routing.RoutingTable.shardRoutingTable(RoutingTable.java:108)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:461)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onNewClusterState(TransportReplicationAction.java:547)
        at org.elasticsearch.cluster.ClusterStateObserver.waitForNextChange(ClusterStateObserver.java:146)
        at org.elasticsearch.cluster.ClusterStateObserver.waitForNextChange(ClusterStateObserver.java:98)
        at org.elasticsearch.cluster.ClusterStateObserver.waitForNextChange(ClusterStateObserver.java:90)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.retry(TransportReplicationAction.java:544)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase$1.handleException(TransportReplicationAction.java:525)
        at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:855)
        at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:833)
        at org.elasticsearch.transport.DelegatingTransportChannel.sendResponse(DelegatingTransportChannel.java:68)
        at org.elasticsearch.transport.RequestHandlerRegistry$TransportChannelWrapper.sendResponse(RequestHandlerRegistry.java:146)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.finishAsFailed(TransportReplicationAction.java:730)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.doRun(TransportReplicationAction.java:656)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:279)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:271)
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
        at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

(system) #8