Hadoop and ES connectivity

Hi all,

ES is installed on one of my nodes and I am running a trial version of ES, but I am unable to connect it to Hadoop.
My question is: can we establish connectivity between ES and Hadoop with the trial version?

ES-Hadoop should be able to connect to ES regardless of trial status or licensing, as long as it is a supported version of ES. Make sure that the ES-Hadoop version matches the version of Elasticsearch you are using as closely as possible, and that any authentication features you use are configured properly in ES-Hadoop. Additionally, if you have any error messages or exceptions to share, please post them here and we can take a look!
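
As a rough illustration of the authentication point, credentials can be passed to ES-Hadoop through the Hive table properties. This is only a sketch assuming HTTP basic authentication is enabled on the cluster; the table name, index name, host, user, and password are placeholders I made up, not values from your setup:

```sql
-- Hypothetical table and index names, for illustration only.
-- es.net.http.auth.user / es.net.http.auth.pass are the ES-Hadoop
-- settings for HTTP basic authentication.
CREATE EXTERNAL TABLE es_example (
  id INT, name STRING )
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.resource' = 'example/doc',
  'es.nodes' = 'es-host.example.com',   -- placeholder host
  'es.port' = '9200',
  'es.net.http.auth.user' = 'hive_user',  -- placeholder user
  'es.net.http.auth.pass' = 'changeme');  -- placeholder password
```

If security is not enabled on your cluster, the two auth properties can simply be left out.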

Hi James, thanks for replying.

These are my queries. They all run fine, but I am unable to run the select statement, and the index was not created in Elasticsearch.

elasticsearch version - 6.3.2
Kibana Version - 6.3.2
es-hadoop Version - 6.3.2

es node - x.x.x.113
hive node - x.x.x.62

-- Source table backed by the CSV files in HDFS
CREATE TABLE source (
POLID INT, NAME STRING )
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/impu/csv/';

-- External table backed by the Elasticsearch resource source/line
CREATE EXTERNAL TABLE source_test (
POLID INT, NAME STRING )
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.resource' = 'source/line',
'es.index.auto.create' = 'TRUE',
'es.nodes' = 'x.x.x.113',
'es.port' = '9200',
'es.nodes.discovery' = 'true',
'es.nodes.wan.only' = 'false');

-- Copy the rows from the CSV-backed table into Elasticsearch
INSERT OVERWRITE TABLE source_test
SELECT POLID, NAME FROM source;

SELECT * FROM source_test;

Error after running the select statement:

Bad status for request TFetchResultsReq(fetchType=0, operationHandle=TOperationHandle(hasResultSet=True, modifiedRowCount=None, operationType=0, operationId=THandleIdentifier(secret='\xce\x8aP\x16\xf7\xf8K\x9e\xb3\x14\xf4"\xc2K\x13-', guid='\xfeF\x92\xae\x86\xe6F\xff\x8d.A\x15@X\xbc\xf9')), orientation=4, maxRows=100): TFetchResultsResp(status=TStatus(errorCode=0, errorMessage='java.io.IOException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Index [source/line] missing and settings [es.index.read.missing.as.empty] is set to false', sqlState=None, infoMessages=['*org.apache.hive.service.cli.HiveSQLException:java.io.IOException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Index [source/line] missing and settings [es.index.read.missing.as.empty] is set to false:25:24', 'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:463', 'org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:294', 'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:769', 'sun.reflect.GeneratedMethodAccessor37:invoke::-1', 'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43', 'java.lang.reflect.Method:invoke:Method.java:498', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78', 'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36', 'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63', 'java.security.AccessController:doPrivileged:AccessController.java:-2', 'javax.security.auth.Subject:doAs:Subject.java:422', 'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1917', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59', 'com.sun.proxy.$Proxy23:fetchResults::-1', 'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:462', 
'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:694', 'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553', 'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538', 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39', 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39', 'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56', 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286', 'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149', 'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624', 'java.lang.Thread:run:Thread.java:748', '*java.io.IOException:org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Index [source/line] missing and settings [es.index.read.missing.as.empty] is set to false:29:4', 'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:508', 'org.apache.hadoop.hive.ql.exec.FetchOperator:pushRow:FetchOperator.java:415', 'org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:140', 'org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:2069', 'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:458', '*org.elasticsearch.hadoop.EsHadoopIllegalArgumentException:Index [source/line] missing and settings [es.index.read.missing.as.empty] is set to false:35:6', 'org.elasticsearch.hadoop.rest.RestService:findPartitions:RestService.java:238', 'org.elasticsearch.hadoop.mr.EsInputFormat:getSplits:EsInputFormat.java:412', 'org.elasticsearch.hadoop.hive.EsHiveInputFormat:getSplits:EsHiveInputFormat.java:113', 'org.elasticsearch.hadoop.hive.EsHiveInputFormat:getSplits:EsHiveInputFormat.java:50', 'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextSplits:FetchOperator.java:363', 
'org.apache.hadoop.hive.ql.exec.FetchOperator:getRecordReader:FetchOperator.java:295', 'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:446'], statusCode=3), results=None, hasMoreRows=None)

Is that insert operation actually sending any data to Elasticsearch? Does it have counters populated from the connector in the job output in Hadoop?
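
One way to narrow this down from the Hive side is sketched below. It reuses the names from your post, except for `source_test_debug`, which is a hypothetical debug-only table I am introducing here. The idea is to enable the `es.index.read.missing.as.empty` setting named in the exception, so that a missing index reads as zero rows instead of failing. This only makes the symptom easier to inspect; it does not fix the write.

```sql
-- 1. Confirm the Hive-side source table actually returns rows.
SELECT COUNT(*) FROM source;

-- 2. Debug-only copy of the ES-backed table (source_test_debug is a
--    hypothetical name). With es.index.read.missing.as.empty enabled,
--    a missing index is read as an empty result set.
CREATE EXTERNAL TABLE source_test_debug (
  POLID INT, NAME STRING )
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.resource' = 'source/line',
  'es.nodes' = 'x.x.x.113',
  'es.port' = '9200',
  'es.index.read.missing.as.empty' = 'true');

-- 3. A count of 0 here suggests the insert has not written any
--    documents to source/line (the index is missing or empty).
SELECT COUNT(*) FROM source_test_debug;
```

If the first count is non-zero and the last one is zero, the data is being lost somewhere between the insert job and Elasticsearch.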

Yes, the source table has the data that is present in that CSV file.

No, I mean, on the Hive job that is executed to perform the insert, were there any job counters that were made available? Can you turn on trace logging in the org.elasticsearch.hadoop.rest.commonshttp package to see what is being sent to the Elasticsearch server?

Hi James,

Thank you, the issue is solved.
