I have a use case where I need to compare resumes (MS Word, PDF) against a job description. Since resumes are highly unstructured documents, I am struggling to clean them up, remove invalid characters, and produce JSON from them.
Then I came across the Ingest Attachment plugin. My questions are:
- Is it possible to ingest the attachment directly from a physical drive location?
- Does the attachment to be ingested always have to be Base64-encoded? If so, how do I query the encoded attachment data with a plain (non-encoded) query string?
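For context, here is a minimal sketch of the workflow as I currently understand it from the plugin documentation: the client reads the file, Base64-encodes it, and sends it through a pipeline with an `attachment` processor, which puts the extracted text in `attachment.content`. The field name `data` and the pipeline description are my own assumptions for illustration, and the code only builds the request bodies; it does not talk to a cluster.

```python
import base64
from pathlib import Path

# Pipeline body for the attachment processor. The source field name "data"
# is an assumption for this sketch; any field name could be used.
PIPELINE = {
    "description": "Extract text from resumes",
    "processors": [{"attachment": {"field": "data"}}],
}


def build_attachment_doc(path: str, field: str = "data") -> dict:
    """Read a local file and return a document body holding its Base64 text."""
    raw = Path(path).read_bytes()
    return {field: base64.b64encode(raw).decode("ascii")}


def build_content_query(text: str) -> dict:
    """Build a plain-text match query against the extracted content field."""
    return {"query": {"match": {"attachment.content": text}}}
```

If this is right, the file has to be read and encoded on the client side, and the search would run against the extracted text rather than the Base64 string, so the query text itself would not need encoding.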
Any help is appreciated.