Fetching data from Elasticsearch using Java code, but performance is slow


(Pravin) #1

Hi,

I implemented the code below to fetch data from Elasticsearch, but performance is very slow.
Please review the code and suggest how to improve it.

package com.cemciq.test;

import java.io.BufferedInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.http.HttpEntity;
import org.apache.http.HttpHost;
import org.apache.http.entity.ContentType;
import org.apache.http.nio.entity.NStringEntity;
import org.apache.lucene.search.Query;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;

import com.cemciq.Constents;
import com.cemciq.solr.client.CemSolrClient;
import com.cemciq.util.CemException;

public class TestClass {
    public static void main(String[] args) throws CemException, IOException, JSONException {
        TestClass.getCIQMatchedCompanyListForSorted1();
    }

    public static void getCIQMatchedCompanyListForSorted1() throws CemException, IOException, JSONException {
        Map<String, String> paramMap = new HashMap<String, String>();
        List testList = new ArrayList();
        SearchResponse response1;
        //Client client;
        JSONArray hitsArray;
        Query query;
        RestClient restClient = RestClient.builder(new HttpHost("172.21.153.176", 9200, "http")).build();
        String coalitionId = "";

        String abc = "12 WEST CAPITAL MANAGEMENT LP";

        //paramMap.put("minimum_should_match", "90");
        //paramMap.put("pretty", "true");

        HttpEntity entity = new NStringEntity(
                "{\n" +
                "  \"query\" : {\n" +
                "    \"bool\" : {\n" +
                "      \"minimum_should_match\" : \"10%\",\n" +
                "      \"must\" : [\n" +
                "        {\n" +
                "          \"match_phrase\" : {\n" +
                "            \"bank_client_entity_name.keyword\" : {\n" +
                "              \"query\" : \"12 WEST CAPITAL MANAGEMENT LP\"\n" +
                "            }\n" +
                "          }\n" +
                "        }\n" +
                "      ]\n" +
                "    }\n" +
                "  },\n" +
                "  \"size\" : 10\n" +
                "}", ContentType.APPLICATION_JSON);

        Response response = restClient.performRequest("GET", "/hist_latest_5.5.0/_search",
                Collections.<String, String>emptyMap(), entity);
        BufferedInputStream br = new BufferedInputStream(response.getEntity().getContent());

        String res = "";
        while (br.available() > 0) {
            res += (char) br.read();
        }
        br.close();

        JSONObject json = new JSONObject(res);
        JSONObject hits = json.getJSONObject("hits");
        hitsArray = hits.getJSONArray("hits");
        ArrayList<Object> coalitionList = new ArrayList<Object>();
        for (int i = 0; i < hitsArray.length(); i++) {
            ArrayList<String> fileNameIdArrList = new ArrayList<String>();
            JSONObject h = hitsArray.getJSONObject(i);
            JSONObject sourceJObj = h.getJSONObject("_source");

            System.out.println("sourceJObj :" + i + " " + sourceJObj);
        }
    }
}
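
A note on the read loop above, independent of query speed: `InputStream.available()` only estimates how many bytes can be read without blocking, so it can return 0 before the whole response has arrived, and `res += (char) br.read()` copies the string for every byte and mangles multi-byte UTF-8 characters. A minimal sketch of a safer read using only the JDK follows; the names `ResponseReader` and `readFully` are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ResponseReader {
    /** Reads the stream until end-of-stream (read() returns -1) and decodes it as UTF-8. */
    static String readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // Loop on the return value of read(), not on available().
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        // Decode once at the end so multi-byte characters stay intact.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Since the Apache HttpCore classes are already on the classpath here, `EntityUtils.toString(response.getEntity())` should do a similar job in one call.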

(David Pilato) #2

Please format your code using the </> icon as explained in this guide. It will make your post more readable.

Or use markdown style like:

```
CODE
```

I'm editing your code...

How do you know it's "slow"?

Can you add a trace before and after this line and print the current time?

Response response = restClient.performRequest("GET", "/hist_latest_5.5.0/_search",Collections.<String, String>emptyMap(),
			entity);
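
A minimal sketch of such a trace, using only the JDK. `RequestTrace.traced` is a hypothetical helper, and the lambda in `main` merely stands in for the `restClient.performRequest(...)` call; in real code the checked `IOException` would have to be handled inside the lambda.

```java
import java.util.function.Supplier;

class RequestTrace {
    /** Prints a timestamp before and after the call and returns the call's result. */
    static <T> T traced(String label, Supplier<T> call) {
        System.out.println(label + " start: " + System.currentTimeMillis());
        long start = System.nanoTime();
        T result = call.get();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(label + " end:   " + System.currentTimeMillis() + " (" + elapsedMs + " ms)");
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the HTTP request; only this call is inside the timed region.
        String body = traced("search", () -> "{\"took\":5}");
        System.out.println(body);
    }
}
```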

(Pravin) #3

Hi,
Please check the code below. We get the data in 790 ms, but I want it in 300 to 400 ms.
Please tell me whether that is possible, or whether there is another solution.

Long startTimeForSolrSearch = 0L;
Long endTimeForSolrSearch = 0L;
startTimeForSolrSearch = Calendar.getInstance().getTimeInMillis();
TestClass.getCIQMatchedCompanyListForSorted1();
endTimeForSolrSearch = Calendar.getInstance().getTimeInMillis();

System.out.println("Time Taken for HIS by bankClientEntityId from Solr: " + (endTimeForSolrSearch - startTimeForSolrSearch) + " ms.");
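
Note that this measurement covers the whole method (client creation, the request, JSON parsing, and printing) and only a single cold run, so one-time costs such as class loading and opening the connection are included. A rough sketch of a steadier measurement, assuming the call can be repeated safely: run a few warm-up iterations, then report the median of several timed runs. `LatencySampler` and its parameters are hypothetical names, and the `Runnable` would wrap just the search call.

```java
import java.util.Arrays;

class LatencySampler {
    /**
     * Runs the task `warmup` times untimed, then `samples` times timed,
     * and returns the (upper) median of the timed runs in milliseconds.
     */
    static double medianMillis(Runnable task, int warmup, int samples) {
        if (samples < 1) {
            throw new IllegalArgumentException("samples must be >= 1");
        }
        for (int i = 0; i < warmup; i++) {
            task.run(); // warm-up: results are discarded
        }
        long[] nanos = new long[samples];
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos[samples / 2] / 1_000_000.0;
    }
}
```

The median is less sensitive than a single reading to an occasional slow run, for example one caused by garbage collection.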

(Christian Dahlqvist) #4

Why are you creating the client within the code you are benchmarking? Typically you would reuse an existing client and not do this per request.
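
A minimal sketch of that reuse pattern, with a hypothetical `FakeClient` standing in for `RestClient` so the example runs without Elasticsearch: the client is built once, on first use, and every request shares it. The counter exists only to show that a single instance is created.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedClientDemo {
    /** Hypothetical stand-in for a client that is expensive to build, such as RestClient. */
    static final class FakeClient {
        static final AtomicInteger CREATED = new AtomicInteger();

        FakeClient() {
            CREATED.incrementAndGet();
        }

        String search(String name) {
            return "result for " + name;
        }
    }

    // Initialization-on-demand holder: the JVM builds INSTANCE once, on first access.
    private static final class Holder {
        static final FakeClient INSTANCE = new FakeClient();
    }

    static FakeClient client() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"A", "B", "C"}) {
            System.out.println(client().search(name)); // same client for every request
        }
        System.out.println("clients created: " + FakeClient.CREATED.get());
    }
}
```

The Elasticsearch low-level `RestClient` is documented as thread-safe and intended to live as long as the application, so one shared instance that is closed on shutdown is the usual setup.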


(David Pilato) #5

Again, please format your code.
Also, can you print the full JSON response? I'd like to see the took value.


(Pravin) #6

Hi,

The code above is for demo purposes only.
In the actual application, I create one client and reuse it for all records.


(Pravin) #7

Hi,
Please check the JSON output below.

OUTPUT:

sourceJObj :0 {"parent2_entity_sector":null,"period_name":"FY16","parent4_entity_region":null,"coalition_id":"342938","parent1_entity_country":null,"user_confirmation":"Verified","bank_id":12,"product_id":null,"cbcdid":null,"bank_client_entity_sector":null,"parent4_entity_country":null,"id":136008771,"qc_source":null,"parent2_entity_country":null,"parent3_entity_sector":null,"parent1_entity_region":null,"parent4_client_entity_id":null,"hdid":null,"region_id":null,"ra_comments":"this is RA","bank_client_entity_country":null,"listsource":"INST","parent1_client_entity_id":null,"parent3_entity_country":null,"coalition_name":"J. Goldman & Co","parent5_entity_region":null,"ultimate_id":"158358908","ciq_name":"12 West Capital Management LP","parent2_entity_region":null,"parent2_client_entity_id":null,"parent1_entity_name":null,"parent3_entity_name":null,"parent5_entity_country":null,"parent4_entity_name":null,"bank_client_entity_region":null,"parent5_entity_name":null,"parent4_entity_sector":null,"parent5_entity_sector":null,"user_verified_date":"2017-09-18T10:24:13.000Z","parent3_client_entity_id":null,"qc_comments":"test qc comment","bank_name":null,"@version":"1","parent2_entity_name":null,"ra_source":null,"ciq_id":"158358908","date_status":"2017-09-17T22:24:13.000Z","ultimate_parent":"12 West Capital Management LP","bank_client_entity_name":"12 WEST CAPITAL MANAGEMENT LP","parent5_client_entity_id":null,"bank_client_entity_id":null,"record_type":null,"parent3_entity_region":null,"@timestamp":"2017-09-20T10:53:10.068Z","timeperiod_id":16,"parent1_entity_sector":null,"ciq_status":"Perfect - UP Shortlisted - 1H17","ultimatepercent":null,"companycountry":null}
Time Taken for HIS by bankClientEntityId from Solr: 720 ms.


(David Pilato) #8

That is not a full search response, sorry.

Can you print the full content of the response please?
The took field should be there.

And please format your code.
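
For context: the output posted above is only the `_source` of each hit, because the loop prints `sourceJObj` rather than the whole body. `took` sits at the top level of the search response, next to `hits`, and reports the server-side search time in milliseconds. A dependency-free sketch of pulling it out of the raw response string follows. `TookExtractor` and the sample body are made up for illustration; with org.json, as already used in this thread, `json.getLong("took")` on the parsed response would be the more natural route.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TookExtractor {
    // Matches the first "took" : <digits> pair in the response body.
    private static final Pattern TOOK = Pattern.compile("\"took\"\\s*:\\s*(\\d+)");

    /** Returns the server-side "took" value in milliseconds, or -1 if it is absent. */
    static long extractTook(String responseBody) {
        Matcher m = TOOK.matcher(responseBody);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    public static void main(String[] args) {
        // Made-up response body, trimmed to the fields that matter here.
        String sample = "{\"took\":12,\"timed_out\":false,\"hits\":{\"total\":1,\"hits\":[]}}";
        System.out.println("took = " + extractTook(sample) + " ms");
    }
}
```

Comparing this server-side figure with the client-side elapsed time shows how much of the 720 ms is spent outside Elasticsearch (network, client setup, parsing).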

