I'm trying to ingest files with multiple extensions (.pdf, .doc, .docx, .csv) into Elasticsearch using FSCrawler. After ingesting, all the details inside the files are mapped under one key-value pair called content.
Is there any option, or any other tool, to ingest files with multiple extensions so that the content is mapped into multiple key-value pairs based on a delimiter (:, ;)?
Hi David, Thanks for the reply!!!
Here is my use case: resume analytics using full-text search queries in Elasticsearch.
I'm trying to import resumes (in different formats: .pdf, .doc, .docx) into Elasticsearch using FSCrawler.
I have successfully ingested all the resumes into Elasticsearch with FSCrawler.
Example: while checking the XXYY.pdf resume, I noticed that all the details (name, mail ID, mobile number, experience, summary) inside XXYY.pdf are mapped under one key, content.
I'm looking to parse the details into separate keys for name, mail ID, mobile, experience, and summary, instead of having all the details in one key (content).
Is it possible to parse the details that way using FSCrawler?
Please suggest: am I missing something, or do I need to parse the resumes before ingesting them with FSCrawler?
No. FSCrawler cannot recognize and extract entities from text.
That's a process you'd need to run on the content field.
Maybe you can use something like:
in an ingest pipeline, and configure this pipeline in FSCrawler.
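For illustration only, here is a minimal sketch of the ingest-pipeline idea, assuming Elasticsearch is reachable on localhost:9200 and that the extracted resume text contains lines shaped like `Name: ...` or `Mail-Id: ...`. The pipeline name `resume_fields`, the choice of the kv processor, and the Python `requests` call are my assumptions, not something taken from this thread.

```python
# Hedged sketch: create an ingest pipeline that splits "Key: value" lines
# out of the "content" field produced by FSCrawler. Host, pipeline name,
# and the "Key: value" line format are assumptions for illustration.
import requests

pipeline = {
    "description": "Split resume content into separate fields on ':' delimiters",
    "processors": [
        {
            # kv processor: treat each line as one key/value pair,
            # splitting key from value on the first ':'
            "kv": {
                "field": "content",
                "field_split": "\n",
                "value_split": ":",
                "target_field": "resume",
                "ignore_missing": True,
                "ignore_failure": True
            }
        }
    ]
}

resp = requests.put(
    "http://localhost:9200/_ingest/pipeline/resume_fields",
    json=pipeline,
)
resp.raise_for_status()
print(resp.json())
```

FSCrawler can then be told to send documents through that pipeline; in recent versions this is the `elasticsearch.pipeline` setting in the job's `_settings.yaml` (check the FSCrawler documentation for your version). Note that free-form resumes rarely follow a clean `key: value` layout, so extracting entities such as name, experience, or summary would still need an NLP step on the content field, as mentioned above.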