Searching for content in PDF and Word documents

I have many Word and PDF documents, and I want to search their contents. Are Elasticsearch and Kibana suitable for this kind of application? If so, and given that I have the free trial (signed up yesterday), how do I ingest my documents, which are stored in Google Drive, into Elasticsearch and then search their contents?

If you want to use Logstash, your files need to be in a source that one of these input plugins can read; otherwise you would need to move them to a local drive.

https://www.elastic.co/guide/en/logstash/current/input-plugins.html

Then you would need to install and use the ingest attachment processor to index the documents.

https://www.elastic.co/guide/en/elasticsearch/plugins/current/ingest-attachment.html
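As a rough sketch of what using the attachment processor involves: you define an ingest pipeline containing an `attachment` processor, then index each file as a base64-encoded string through that pipeline. The snippet below only builds the two JSON request bodies and does not contact a cluster. The pipeline name `attachment`, the index name `procedures`, the field name `data`, and the file name are all hypothetical choices for illustration.

```python
import base64
import json


def build_attachment_pipeline(source_field="data"):
    """Request body for PUT _ingest/pipeline/<name> with one attachment processor."""
    return {
        "description": "Extract text from base64-encoded files",
        "processors": [{"attachment": {"field": source_field}}],
    }


def build_document(file_bytes, filename, source_field="data"):
    """Request body for indexing one file; the raw bytes must be base64-encoded."""
    return {
        "filename": filename,
        source_field: base64.b64encode(file_bytes).decode("ascii"),
    }


pipeline_body = build_attachment_pipeline()
# Placeholder bytes stand in for a real file read with open(path, "rb").read()
doc_body = build_document(b"%PDF-1.4 placeholder bytes", "front-desk-procedures.pdf")

# These bodies would be sent with requests along the lines of (names are assumptions):
#   PUT _ingest/pipeline/attachment             <- pipeline_body
#   PUT procedures/_doc/1?pipeline=attachment   <- doc_body
print(json.dumps(pipeline_body, indent=2))
print(json.dumps(doc_body, indent=2))
```

The requests themselves can be pasted into Kibana's Dev Tools console or sent with any HTTP client. The extracted text should end up under `attachment.content` in the indexed document, assuming the processor's default target field.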

Thanks @aaron_nimocks for a quick response. I have very basic questions.

  • Should I use Logstash for this?
  • Using Kibana, can I not point directly to the Google Docs location where my PDF and Word documents are stored?
  • If I were to use Logstash, which plugin should I use if my documents are in Google Docs?
  • What does indexing documents mean? How do I get to a CLI to install the attachment processor? I have the 14-day cloud free trial and can see the Kibana dashboard.

I am a first-time user (a complete newbie) and got access to the 14-day trial just yesterday.

@tnkumar Indexing PDFs for search isn't the easiest thing to start learning with. I am not even sure you can do this with a free trial of Elastic Cloud.

Can you explain what you are trying to accomplish? How many documents and what do you want to do with them? There might be easier solutions.

@aaron_nimocks - Thanks for the quick response and questions.
The use case I am considering is as follows:

  • There are 20 PDF documents and 20 Word documents that describe procedures for office tasks, e.g. for a medical receptionist at the front desk: "What should I collect from patients when they come to the front desk?"
  • When she types this question into the search box, it shows her the particular document, and the chapter in that document, where the procedure is described.
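To make the search side of this use case concrete, here is a minimal sketch of a query body for it, assuming the documents were indexed with the ingest attachment processor so that the extracted text lives in `attachment.content`, and that a hypothetical `filename` field was stored alongside it. It only builds the JSON and does not contact a cluster. Note that highlighting returns matching text snippets, not chapter names; pointing to a chapter would need extra structure in the indexed data.

```python
import json


def build_search_body(question, content_field="attachment.content"):
    """Request body for a full-text search that also returns highlighted snippets."""
    return {
        # match analyzes the question and scores documents by the terms they share
        "query": {"match": {content_field: question}},
        # highlight returns the fragments of text around the matching terms
        "highlight": {"fields": {content_field: {}}},
        # only return the (hypothetical) filename field, not the whole document
        "_source": ["filename"],
    }


search_body = build_search_body(
    "What should I collect from patients when they come to the front desk?"
)

# This body would be sent as: GET <index>/_search   (index name is up to you)
print(json.dumps(search_body, indent=2))
```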

I just came across Elastic Workplace Search. I am wondering whether it might be appropriate for such a use case.

@tnkumar Both products would work for this, but Workplace Search is probably your best option since it can connect directly to Google Drive and requires little configuration and setup compared with using ELK.

https://www.elastic.co/guide/en/workplace-search/current/workplace-search-google-drive-connector.html

Thanks @aaron_nimocks - Is it possible to get started with Workplace Search using the 14-day trial account? I do not see a thread specific to Enterprise Search or Workplace Search. If such a thread exists, I can post my questions there.