Read Excel File and Push All Data To Elasticsearch

I have created program to read excel file using java and aspose. I am creating column as key and row as value in json object but if when I read all file row wise and then my json values gets overwrite by last row values. To overcome this I am reading excel row wise and after each row I am sending data to elasticsearch but this process takes around 2hrs for 50000 records. I want code which will read all excel file and push data to elasticsearch in single trip.

Below is my code for reference.

JSONObject obj = new JSONObject();
FileInputStream file = new FileInputStream(new File("C:\Users\c-ajitb\Desktop\TempFiles\op\TestExcel.xlsx"));

 	Workbook workbook = new Workbook(file);
	Worksheet worksheet = workbook.getWorksheets().get(0);
	Cells cells = worksheet.getCells();
	
	Range range = cells.getMaxDisplayRange();
	int totalColumns = range.getColumnCount();
	int totalRows = range.getRowCount();
	RowCollection rows = cells.getRows();
    
	int maxColumn = cells.getMaxDataColumn();
	int maxRows = cells.getMaxDataRow();
			
	SimpleDateFormat format = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss");
	Date d1 = new Date();
			
	List<Object> colList = new ArrayList<Object>();
	for(int i = 0; i<=0 ; i++)
	{
		for (int j = 0; j < totalColumns; j++) 
	    {
			colList.add(cells.get(i, j).getValue());
	    }
	}
			
	for (int i = 1; i < rows.getCount(); i++) 
	{
		
	    for (int j = 0; j < totalColumns; j++) 
	    {
	      	obj.put(colList.get(j), cells.get(i, j).getValue());
	        System.out.print(cells.get(i, j).getValue() + "\t");
	    }
	    System.out.println("");
	    Date d = new Date();
	    obj.put("Action_Date", dateFormat.format(d));
	    
	    String jsonString  = obj.toJSONString();
	    HttpEntity entity = new NStringEntity(jsonString, ContentType.APPLICATION_JSON);
		
		Response indexResponse = restClient.performRequest(
				"PUT",
				"/indexname/indextype/"+i,
				Collections.<String, String>emptyMap(),
				entity);
				System.out.println(EntityUtils.toString(indexResponse.getEntity()));
				
				obj.clear();
	}

You should use the bulk API instead.

That will be much faster!

Thank you for providing solution. Got solution with bulk API.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.