Interacting with Elasticsearch on EC2 via the Java API


#1

Hi, I wrote a program that deletes records by timestamp from an Elasticsearch node running on an EC2 machine. When I run the program from my local machine, I connect to the ES node via its public IP address and it works as expected. When I run the same program from another EC2 instance in the same availability zone and security group, the search returns no results.

This is how I connect to ES:

TransportClient transportClient = new TransportClient();
this.esClient = transportClient.addTransportAddress(new InetSocketTransportAddress(es_address, 9300));

This is the relevant part of the code:

FilterBuilder fb = new IndicesFilterBuilder(new RangeFilterBuilder("timestamp")
			.from(params.get("start_date"))
			.to(params.get("end_date"))
			.includeLower(false)
			.includeUpper(true), "test").noMatchFilter("none");	
		
		//search
		SearchResponse sr = this.esClient.prepareSearch()
				.setQuery(new MatchAllQueryBuilder())
				.setPostFilter(fb)
				.setScroll(new TimeValue(10000)) //keep the scroll context alive for 10 seconds
				.setSize(10000)
				.get();
		
		LOG.info("Search request completed in : " + sr.getTook()  + " "+ sr.getHits().getTotalHits());
		
		//scroll the result
		boolean scroll = true;
		while(scroll)
		{
			for(SearchHit hit : sr.getHits().getHits())
			{
				bp.add(new DeleteRequest(hit.getIndex(), hit.getType(), hit.getId()));
			}
			sr = this.esClient.prepareSearchScroll(sr.getScrollId())
				.setScroll(new TimeValue(10000)) //keep the scroll context alive for 10 seconds
				.get();
			LOG.info("Search request completed in : " + sr.getTook());
			if (sr.getHits().getHits().length == 0) { //no more results
		        scroll = false; 
		    }
		}

According to the program output, I can connect to the node in both cases. From my local machine the search request returns hits and the program deletes them; from another EC2 instance the same search returns 0 documents. I am not using the AWS plugin.
PS. I have a Spark and a Storm cluster running in the same VPC, both correctly feeding the same node.


(David Pilato) #2

I'm wondering if you actually hit the exact same cluster or by any chance another one?

So I would try printing the cluster state from the application, for example, and check whether it's the same in both cases.


#3

No, it is the same node.
This is from my local computer:

08:50:53.823 [main] DEBUG org.elasticsearch.client.transport - [Anne-Marie Cortez] adding address [[#transport#-1][Matteos-MBP.local][inet[/52.18.78.197:9300]]]
08:50:53.950 [main] DEBUG org.elasticsearch.transport.netty - [Anne-Marie Cortez] connected to node [[#transport#-1][Matteos-MBP.local][inet[/52.18.78.197:9300]]]
08:50:54.121 [main] DEBUG org.elasticsearch.transport.netty - [Anne-Marie Cortez] connected to node [[analytics_node001][5z2etnq6QlCwzENDN9MoxQ][ip-172-31-23-179][inet[/52.18.78.197:9300]]{master=true}]

And this is from another EC2 instance:

07:54:03.133 [main] DEBUG org.elasticsearch.client.transport - [Dragonwing] adding address [[#transport#-1][ip-172-31-24-41][inet[/52.18.78.197:9300]]]
07:54:03.204 [main] DEBUG org.elasticsearch.transport.netty - [Dragonwing] connected to node [[#transport#-1][ip-172-31-24-41][inet[/52.18.78.197:9300]]]
07:54:03.285 [main] DEBUG org.elasticsearch.transport.netty - [Dragonwing] connected to node [[analytics_node001][5z2etnq6QlCwzENDN9MoxQ][ip-172-31-23-179][inet[/52.18.78.197:9300]]{master=true}]

From within the EC2 VPC I used the private IP of the ES node, but I also get no hits when using the public IP.


#4

OK, it was a very stupid mistake, caused by the system time differing between my PC and the remote EC2 instances.
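
For anyone who hits the same symptom, here is a minimal sketch of how a clock or time zone difference can shift a timestamp range. It assumes the range bounds are derived from the machine's local clock, which the thread does not actually state; the class name ClockCheck, the format helper, and the date pattern are all hypothetical and only there for illustration. Running it on both machines shows whether they agree on the current time and default zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the original program.
class ClockCheck {

    // Assumed date pattern without a zone suffix; the real format of
    // start_date / end_date is not shown in the thread.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");

    // Renders the same instant as wall-clock time in the given zone.
    static String format(Instant instant, ZoneId zone) {
        return LocalDateTime.ofInstant(instant, zone).format(FMT);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        // Compare these three lines between the local machine and the EC2 instance.
        System.out.println("JVM default zone : " + ZoneId.systemDefault());
        System.out.println("local wall clock : " + format(now, ZoneId.systemDefault()));
        System.out.println("UTC              : " + format(now, ZoneOffset.UTC));
    }
}
```

If the two machines print different wall-clock values for the same moment, building the range bounds in one explicit zone (for example UTC) on every machine is one way to keep the filter consistent.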

