How to index and store a PDF file in Elasticsearch using Spring Boot?

Hi David,
I am trying to extract the content that is stored under _source.
I came across GetSourceRequest in https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-document-get-source.html
But I cannot find it in
org.elasticsearch.client.core
Where can I import GetSourceRequest from?

I believe this is a new class which will become available in version 7.7.
In the meantime you can use the GET document API and get the source with it.

See https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.x/java-rest-high-document-get.html#java-rest-high-document-get-response

Hi David,
I tried using the Get API. When I write the result to a file and try to open it, I get the error below and cannot open the file.

Error
Failed to load PDF document.

Here is what I have tried:

@GetMapping("/getfile")
public void getdata() throws IOException
{
    String result = "C:\\Users\\Documents\\pdfs\\x.pdf";
    GetRequest getRequest = new GetRequest("twitter", "_doc", "56");
    GetResponse getResponse = client().get(getRequest, RequestOptions.DEFAULT);
    byte[] sourceAsBytes = getResponse.getSourceAsBytes();
    File resultfile = new File(result);
    FileOutputStream fos = new FileOutputStream(resultfile);
    fos.write(sourceAsBytes);
    fos.flush();
    fos.close();
}

I don't know where it is going wrong.

String sourceAsString = getResponse.getSourceAsString();

When I try to fetch the source as a string, the program terminates here.

You need to read the source, then access the field where you stored the document in BASE64, then decode the BASE64 and generate the binary from it.
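The decoding step can be sketched in plain Java as below. This is a minimal sketch, not a definitive implementation: the class and method names (PdfFieldDecoder, decodeField, writePdf) are hypothetical, the map stands in for whatever getResponse.getSourceAsMap() returns, and it assumes the field was encoded with the standard Base64 alphabet without line breaks.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;

public class PdfFieldDecoder {

    // Decodes the Base64 text held in one field of a document source.
    // "source" is a stand-in for the map returned by getResponse.getSourceAsMap().
    static byte[] decodeField(Map<String, Object> source, String fieldName) {
        Object value = source.get(fieldName);
        if (!(value instanceof String)) {
            throw new IllegalArgumentException(
                    "Field '" + fieldName + "' is missing or not a string");
        }
        return Base64.getDecoder().decode((String) value);
    }

    // Writes the decoded bytes to the target path, creating or overwriting the file.
    static void writePdf(Map<String, Object> source, String fieldName, Path target)
            throws IOException {
        Files.write(target, decodeField(source, fieldName));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in content, not a real PDF: only the bytes "%PDF-1.7" in Base64.
        String encoded = Base64.getEncoder()
                .encodeToString("%PDF-1.7".getBytes(StandardCharsets.US_ASCII));
        Map<String, Object> source =
                Collections.<String, Object>singletonMap("message", encoded);

        Path target = Files.createTempFile("decoded", ".pdf");
        writePdf(source, "message", target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Writing the raw _source bytes to disk instead would produce a JSON document, not a PDF, which is why a viewer refuses to open it.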

How do I access the field?

Does this mean reading the source?

From https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-get.html

If you stored the field itself (in the mapping) the easiest way is probably (assuming a 7.6 version):

GetRequest getRequest = new GetRequest("twitter", "56");
getRequest.fetchSourceContext(FetchSourceContext.DO_NOT_FETCH_SOURCE);
getRequest.storedFields("message");
GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);
String message = getResponse.getField("message").getValue();

If you did not:

GetRequest getRequest = new GetRequest("twitter", "56");
GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);
if (getResponse.isExists()) {
    Map<String, Object> sourceAsMap = getResponse.getSourceAsMap();
    // This will probably contain your BASE64 content
    String message = (String) sourceAsMap.get("message");
}

Hi David,
I am using version 7.4. When I tried the above code

Map<String, Object> sourceAsMap = getResponse.getSourceAsMap(); 

I don't know why, but whenever I try to fetch anything as a string, the program doesn't return anything.

Since you mentioned a different approach for 7.6, I installed version 7.6, but when I try to use the new version it shows

master not discovered or elected yet, an election requires a node with id 
[eJcn6jOSRw65wy4liDC1VA], have discovered [{10INLPC0SP3TF}
{1Farf6X-R1u0LM6KNgh4Kw}{4zM2-RhjSwaQxt15w3To9A}{127.0.0.1}{127.0.0.1:9300}

How do I fix the above problem?

7.4 or 7.6 should behave the same.

It's a totally different problem. Please open a new question to fix this specific problem. And share details about what you did exactly, including configuration files, logs...

If I use the above function to fetch the data in the "message" field, it shows

{
    "timestamp": "2020-03-24T05:46:40.284+0000",
    "status": 500,
    "error": "Internal Server Error",
    "message": "No message available",
    "path": "/es/getfile"
}

In Elasticsearch it is stored in the format below

{
  "_index": "twitter",
  "_type": "_doc",
  "_id": "56",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "message": "JVBERi0xLjcNCg0KNCAwIG9iag0KPDwNCi9FIDYwNDQyDQovSCBbIDM......"
  }
}

The PDF is stored under the "message" field, but when I try to fetch that field, it returns null.
Does this have anything to do with the version?
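
One way to confirm that the "message" value really is a Base64-encoded PDF is to decode its first few characters and look for the "%PDF-" marker that PDF files start with. The sketch below is only an illustration: the class and method names (PdfHeaderCheck, looksLikeBase64Pdf) are hypothetical, and the sample input is just the first 16 characters of the value shown above, not the full document.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class PdfHeaderCheck {

    // PDF files begin with the ASCII marker "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true when the Base64 text decodes to bytes that begin with "%PDF-".
    // Only the first 8 Base64 characters (6 decoded bytes) are inspected.
    static boolean looksLikeBase64Pdf(String base64) {
        if (base64 == null || base64.length() < 8) {
            return false;
        }
        try {
            byte[] head = Base64.getDecoder().decode(base64.substring(0, 8));
            return Arrays.equals(Arrays.copyOf(head, PDF_MAGIC.length), PDF_MAGIC);
        } catch (IllegalArgumentException e) {
            // Characters outside the Base64 alphabet: not Base64 at all.
            return false;
        }
    }

    public static void main(String[] args) {
        // First 16 characters of the "message" value shown above.
        System.out.println(looksLikeBase64Pdf("JVBERi0xLjcNCg0K")); // prints true
        // Raw JSON source text is not a Base64 PDF.
        System.out.println(looksLikeBase64Pdf("{\"message\":\"JVBERi0x\"}")); // prints false
    }
}
```

If the check passes on the string read from the source map, the remaining step is to decode the whole value and write the bytes to a file.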

ok sure
