Integrate Elasticsearch, Logstash and FSCrawler

akshaybhuradia · June 23, 2020, 5:26am

I have metadata for my documents in a database.
Example: filename, file description, file_tags, etc.

Earlier I was using this data for search: Elasticsearch for the search itself, and Logstash for reading the data source (db --> table record).

Now I also want to search for words within the documents.

Can I merge FSCrawler with Logstash?

For example: add one more field, content, to the ES template, and have FSCrawler populate that field.

dadoonet · June 23, 2020, 7:04am

You can use the FSCrawler REST API and the _simulate endpoint from any application you could write. Then merge the result with the data coming from your database.
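A minimal sketch of what such a call could look like from Python, using only the standard library. It assumes FSCrawler was started with its REST service enabled on the default address, that the simulate option is exposed as `_upload?simulate` (newer FSCrawler releases may name the endpoint differently), and that the extracted text comes back under `doc.content`; check these against the documentation for your version. The function names are made up for this illustration.

```python
import json
import urllib.request
import uuid

# Assumption: FSCrawler REST service on its default host and port.
FSCRAWLER_URL = "http://127.0.0.1:8080/fscrawler"


def build_simulate_request(filename, data, base_url=FSCRAWLER_URL):
    """Build a multipart POST asking FSCrawler to extract text without indexing."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        b"--" + boundary.encode() + b"\r\n",
        b'Content-Disposition: form-data; name="file"; filename="'
        + filename.encode() + b'"\r\n',
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        b"\r\n--" + boundary.encode() + b"--\r\n",
    ])
    return urllib.request.Request(
        base_url + "/_upload?simulate",  # assumed endpoint name, see note above
        data=body,
        headers={"Content-Type": "multipart/form-data; boundary=" + boundary},
        method="POST",
    )


def extract_text(filename, data):
    """Send the file to FSCrawler and return the extracted text (network call)."""
    with urllib.request.urlopen(build_simulate_request(filename, data)) as resp:
        payload = json.load(resp)
    # Assumption: the extracted text sits under doc.content in the response.
    return payload["doc"]["content"]
```

The upload service could call `extract_text(doc_name, file_bytes)` right after storing a file, so the text is available at the same time as the database record.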

Not sure if this is doable with LS though.

akshaybhuradia · June 23, 2020, 8:43am

I am using my own upload server for docs.

For example, the metadata records for the docs (doc_id, user_id, doc_name, file_description and tags) are in one index, "elsearch".

The second index, fscrawler, has different fields. How do I relate it to the elsearch index,

so that each file's metadata record is combined with its corresponding file content?

Query: a query string like "xyz" (this string may appear in file_description or in the file content).

dadoonet · June 23, 2020, 9:00am

I'm saying that you can use the FSCrawler simulate endpoint to get the text from the binary. Once you have this content, you can update the existing JSON documents.

You will end up with documents looking like:

{
  "doc_id": "coming from your upload service",
  "user_id": "coming from your upload service",
  "doc_name": "coming from your upload service",
  "file_description": "coming from your upload service",
  "tags": ["coming from your upload service", "coming from your upload service"],
  "content": "This content is coming from FSCrawler simulate endpoint"
}
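The merge step and the resulting search can be sketched as follows. This is an illustration only: the helper names are hypothetical, the index name "elsearch" and the field names are taken from the posts above, and it assumes Elasticsearch 7.x, where a partial update is sent as `POST /<index>/_update/<id>` with a `doc` body. The functions only build the request path and bodies; sending them is left to whatever HTTP or Elasticsearch client you already use.

```python
import json


def merge_content(metadata, content):
    """Return a copy of the database record with the extracted text added."""
    merged = dict(metadata)
    merged["content"] = content
    return merged


def build_update(index, doc_id, content):
    """Return (path, body) for a partial update that adds `content` to a document."""
    path = "/{}/_update/{}".format(index, doc_id)
    body = json.dumps({"doc": {"content": content}})
    return path, body


def build_search(text):
    """Return a query body that looks for `text` in the description or the content."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["file_description", "content"],
            }
        }
    }


record = {"doc_id": "42", "doc_name": "report.pdf", "file_description": "xyz report"}
merged = merge_content(record, "text extracted by FSCrawler")
path, body = build_update("elsearch", record["doc_id"], merged["content"])
search_body = build_search("xyz")
```

With everything in one document, a single query against the "elsearch" index covers both the description and the file text, so no join with a separate fscrawler index is needed.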

system · July 21, 2020, 9:00am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
