Can Elasticsearch Transport Client work without ping


(Manju) #1

Hello all,

I have a use case where the Elasticsearch Transport Client Java API is used to connect a Hadoop cluster to a remote ES cluster. However, pinging is disabled on the Hadoop cluster as a security feature. Is this why I am getting a NoNodeAvailableException while trying to establish the connection? Is it required to enable ping, or is there another way?
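For reference, the client is created roughly like this (a minimal sketch assuming the 2.x Transport Client; the cluster name, host, and timeout values are placeholders — the client.transport.* settings shown are the ones that control how the client pings and samples nodes):

```java
import java.net.InetAddress;

import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class ClientSketch {
    public static Client build() throws Exception {
        // Sketch only: host and cluster names are placeholders.
        Settings settings = Settings.settingsBuilder()
                .put("cluster.name", "my-es-cluster")        // must match the remote cluster
                .put("client.transport.sniff", false)        // don't try to discover other nodes
                .put("client.transport.ping_timeout", "10s") // how long to wait for a ping reply
                .build();

        return TransportClient.builder().settings(settings).build()
                .addTransportAddress(new InetSocketTransportAddress(
                        InetAddress.getByName("es-host.example.com"), 9300));
    }
}
```

If ICMP ping is what the security policy blocks, note that the Transport Client's "ping" is its own TCP-level liveness check on port 9300, not ICMP, so plain port connectivity is what matters.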

Thanks for your help.

Manju


(Costin Leau) #2

This forum is mainly about the ES-Hadoop project; in your case it looks like you are using custom code. I'm not sure what you mean by 'pinging', but a quick inspection of the network connectivity and ports between your client and ES should sort things out.


(Manju) #3

I am using custom code within an ES-Hadoop project to retrieve index names and types, because the indices and types are created dynamically on a daily basis. I went for this solution as dynamic resolution of index names is not supported by the Cascading API.


(Costin Leau) #4

What do you mean by "dynamic reading"? How does this work in your code? Can you provide a quick, high-level example?


(Manju) #5

I have a function that uses the Transport Client to connect to the ES cluster and retrieve all the existing indices, and the types in each index, for a specific interval.

List<String> listOfIndices = new ArrayList<String>();
String[] allIndices = client.admin().cluster().prepareState().execute().actionGet()
        .getState().getMetaData().concreteAllIndices();
Collections.addAll(listOfIndices, allIndices);

Map<String, List<String>> indexWithTypes = new HashMap<String, List<String>>();

for (String eachIndex : listOfIndices) {
    List<String> typeNames = new ArrayList<String>();
    // Fetch the mappings for this index from the cluster state
    ImmutableOpenMap<String, MappingMetaData> indexMapping = client.admin().cluster().prepareState()
            .execute().actionGet().getState().getMetaData().index(eachIndex).getMappings();
    Iterator<String> indexMappingIterator = indexMapping.keysIt();

    while (indexMappingIterator.hasNext()) {
        String mappingsResponseKey = indexMappingIterator.next();
        String typeName = indexMapping.get(mappingsResponseKey).type();
        typeNames.add(typeName);
    }
    indexWithTypes.put(eachIndex, typeNames);
}
return indexWithTypes;

I then iterate through this map using the Cascading API to get data from ES and store it in corresponding folders in Hadoop:

for (Map.Entry<String, List<String>> indexWithType : indexWithTypes.entrySet()) {
    String indexName = indexWithType.getKey();
    List<String> indexTypes = indexWithType.getValue();

    for (String indexType : indexTypes) {
        String hdfsPath = hdfsDir + "/" + indexName + "/" + indexType;
        // ... build the taps below and run the flow for this index/type
    }
}

The Tap definitions are:

Tap esInTap = new EsTap(indexName + "/" + indexType, Fields.ALL);

Tap hdfsOutTap = new Hfs(new TextLine(new Fields("line")), hdfsPath, SinkMode.UPDATE);

(Costin Leau) #6

Why don't you use an alias instead? You can run a cron job that, every X hours/days, takes your interval and updates the alias. Your job would then simply point to the alias - only the alias would change, and your Cascading job would remain the same.
From your code I can't tell what your specific 'interval' is, but ES-Hadoop can read from multiple or even all indices.
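As a sketch of the alias approach (the alias name daily-current and the index names are placeholders, and this assumes the same Transport Client API as in the code above), the cron job would swap the alias in a single atomic request:

```java
// Hypothetical alias name; add and remove happen in one request, so readers
// never see a moment where the alias points at nothing.
client.admin().indices().prepareAliases()
        .removeAlias("index-2015.12.12", "daily-current")
        .addAlias("index-2015.12.13", "daily-current")
        .execute().actionGet();
```

The Cascading job would then always read from the alias. Separately, ES-Hadoop resource strings also accept index patterns (e.g. index-2015.12.*/type1), which may cover the multi-index case without any client code at all.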


(Manju) #7

Found this in the Elasticsearch-Hadoop Cascading documentation:

My requirement is exactly that, runtime resolution of index names and types. That is why I have implemented the Transport client to retrieve the index names and types and then used them in the cascading workflow to save data into HDFS.

I am not sure how an alias would help me, because although the index name prefixes remain the same, the types in each index are different. Moreover, I need to store them in correspondingly named folders in HDFS.

For example, if index-2015.12.12 has type1 and type2, the folder structure in HDFS should be hdfspath/index-2015.12.12/type1/docs and hdfspath/index-2015.12.12/type2/docs.

However, if index-2015.12.13 has type1, type2, and type3, then they should be saved under hdfspath/index-2015.12.13/type1/docs, hdfspath/index-2015.12.13/type2/docs, and hdfspath/index-2015.12.13/type3/docs.
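The naming scheme above can be sketched as a small path-construction helper (HdfsLayout is a hypothetical class, not part of the original job; only the hdfspath/&lt;index&gt;/&lt;type&gt;/docs convention is taken from the example):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class HdfsLayout {
    // Builds one HDFS target path per index/type pair, following the
    // hdfspath/<index>/<type>/docs layout described above.
    static List<String> targetPaths(String hdfsDir, Map<String, List<String>> indexWithTypes) {
        List<String> paths = new ArrayList<String>();
        for (Map.Entry<String, List<String>> entry : indexWithTypes.entrySet()) {
            for (String type : entry.getValue()) {
                paths.add(hdfsDir + "/" + entry.getKey() + "/" + type + "/docs");
            }
        }
        return paths;
    }
}
```

Because each day's index can carry a different set of types, the helper simply emits however many paths that index's entry contains.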

