Read Excel File and Push All Data To Elasticsearch


(ajit) #1

I have created program to read excel file using java and aspose. I am creating column as key and row as value in json object but if when I read all file row wise and then my json values gets overwrite by last row values. To overcome this I am reading excel row wise and after each row I am sending data to elasticsearch but this process takes around 2hrs for 50000 records. I want code which will read all excel file and push data to elasticsearch in single trip.

Below is my code for reference.

JSONObject obj = new JSONObject();
FileInputStream file = new FileInputStream(new File("C:\Users\c-ajitb\Desktop\TempFiles\op\TestExcel.xlsx"));

 	Workbook workbook = new Workbook(file);
	Worksheet worksheet = workbook.getWorksheets().get(0);
	Cells cells = worksheet.getCells();
	
	Range range = cells.getMaxDisplayRange();
	int totalColumns = range.getColumnCount();
	int totalRows = range.getRowCount();
	RowCollection rows = cells.getRows();
    
	int maxColumn = cells.getMaxDataColumn();
	int maxRows = cells.getMaxDataRow();
			
	SimpleDateFormat format = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss");
	Date d1 = new Date();
			
	List<Object> colList = new ArrayList<Object>();
	for(int i = 0; i<=0 ; i++)
	{
		for (int j = 0; j < totalColumns; j++) 
	    {
			colList.add(cells.get(i, j).getValue());
	    }
	}
			
	for (int i = 1; i < rows.getCount(); i++) 
	{
		
	    for (int j = 0; j < totalColumns; j++) 
	    {
	      	obj.put(colList.get(j), cells.get(i, j).getValue());
	        System.out.print(cells.get(i, j).getValue() + "\t");
	    }
	    System.out.println("");
	    Date d = new Date();
	    obj.put("Action_Date", dateFormat.format(d));
	    
	    String jsonString  = obj.toJSONString();
	    HttpEntity entity = new NStringEntity(jsonString, ContentType.APPLICATION_JSON);
		
		Response indexResponse = restClient.performRequest(
				"PUT",
				"/indexname/indextype/"+i,
				Collections.<String, String>emptyMap(),
				entity);
				System.out.println(EntityUtils.toString(indexResponse.getEntity()));
				
				obj.clear();
	}

How to push large excel data to elasticsearch using bulk API
(David Pilato) #2

You should use the bulk API instead.

That will be much faster!


(ajit) #3

Thank you for providing solution. Got solution with bulk API.


(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.