Index PDF in Elastic App Search

Hello,

I am using Elastic App Search, and I want to index some PDF documents along with metadata such as title, writer, etc.

Is there any way to index them directly? (I know about FSCrawler for Elasticsearch.)

My second option is to write a Ruby program, using gems, that transforms the PDF data into a JSON document.

I am thinking of using Apache Tika, but I do not know if there are better options for doing it.
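
To make that second option concrete, here is a minimal sketch of the "PDF data to JSON" step. It assumes the text and metadata have already been extracted (for example by Apache Tika) into plain Ruby values; the `build_document` helper, the field names, and the sample values are all hypothetical and not part of any App Search client.

```ruby
require "json"
require "digest/sha1"

# Hypothetical helper: turns metadata already extracted from a PDF
# (e.g. by Apache Tika) into a flat hash that serializes to JSON.
# The field names (title, writer, label, body) are assumptions.
def build_document(path, metadata, body_text, labels = [])
  {
    "id"     => Digest::SHA1.hexdigest(path), # stable id derived from the file path
    "title"  => metadata.fetch("title", File.basename(path, ".pdf")), # fall back to the file name
    "writer" => metadata["writer"],
    "label"  => labels,
    "body"   => body_text.strip
  }
end

# Sample values for illustration only.
doc = build_document(
  "library/frogs.pdf",
  { "title" => "Frogs of the World", "writer" => "A. Author" },
  "  Frogs are amphibians...  ",
  ["science", "frogs"]
)

puts JSON.pretty_generate(doc)
```

The resulting JSON could then be sent to whatever indexing endpoint is in use; that part is left out here because it depends on the client and version.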

Thank you for your help.

There's a PR coming to connect to Workplace Search. See

It should not be super hard to then implement a similar thing for App Search.

But do you really want App Search or Workplace Search?

My idea is to index some PDFs and be able to search for one using a label that describes its topic.

For example, take a library with science books: if I am interested in learning about frogs, I just want to look for documents with that label.

So the search would run over this library, looking for documents that have a specific label.
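
As a rough sketch of what such a label search might look like: as far as I understand it, the App Search search API accepts a JSON body with a `query` and optional `filters`. The snippet below only builds that body and does not send it anywhere; the `label` field name and the `label_search_body` helper are assumptions for illustration.

```ruby
require "json"

# Builds a search request body that restricts results to documents
# carrying the given label. "label" is a hypothetical field name;
# an empty query string means "filter only, no full-text match".
def label_search_body(label, query = "")
  { "query" => query, "filters" => { "label" => [label] } }
end

body = label_search_body("frogs")
puts JSON.generate(body)
```

Keeping the label as a filter rather than part of the query text means only documents explicitly tagged with it come back, which matches the library example above.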

Thank you very much.

I feel Workplace Search fits this use case more naturally.

Yes, I think so, but Workplace Search needs a monthly subscription, which is why I chose Elastic App Search.
I see your point and yes, I think Workplace Search could be more useful in this case.

Some Workplace Search capabilities are included in the Basic / Free license.

Hello,
I am trying to install Workplace Search and it shows me the following message: "Workplace Search requires Platinum features of the Elastic Stack. Starting a trial enables the full product functionality for 30 days. Learn more about Elastic Stack licenses."
So I understand it is not included in the free license.
I am installing it on Ubuntu 20.04, and I have version 7.8.1.

Thank you so much

How did you install the stack and Enterprise Search and where does that message come from?

I have installed it locally on Ubuntu 20.04.
I run Elasticsearch, and then I run Enterprise Search.
When I log in to Enterprise Search, it shows me the two options:

That is a link to a screenshot of my screen.

Thank you very much

Version 7.8.1... hmmm. Any chance you can try a clean install with 7.9.2?

I will try that.

Thank you, and I will write up my experience with it then, hehe.

Hello,

I figured out how to run Workplace Search on Ubuntu, so now I want to index some PDFs. If I use FSCrawler, I will need a Windows machine, so I would need to reach my Ubuntu machine from outside.

My second option is to use another source such as Google Drive or OneDrive.

Thank you very much

Why this?

Because as far as I know, FSCrawler doesn't work on Linux.

I was able to run it on Windows.

What I saw yesterday was workplace-search-ruby-master, which I can use to index some documents, but from what I see I can't index PDFs directly; I need to parse them into JSON first.

The best solution for my project would be to index PDFs directly and create a "label" field. I don't know if that is possible.

Thank you very much

Source?

Sorry but I think this is wrong.

OK, I will try to use it and tell you what I get.

Thank you very much
