I am new to Elasticsearch and am trying to build a text search feature with it. I have over 100 documents, and every line in each document starts with a timestamp range.
For example:
00:00:00 - 00:01:00 This is first line
00:01:01 - 00:02:30 This is second line
00:02:30 - 00:03:45 This is third line
...
And so on.
I split each of these lines into a separate paragraph and run a text search over the documents.
Now I want to search by keyword, where one or more keywords are defined for a block of lines, for example the lines between 00:00:00 and 00:05:00. When a search matches one of those keywords, the entire block should be returned, i.e. all the lines between those timestamps.
Can you please help me understand how to achieve this functionality using Elasticsearch?
The text documents we receive contain only timestamps and text. The files are speech transcripts, so each line has a timestamp range (time without date) followed by space-separated text.
So I am splitting my data into time and message fields.
For example, a document whose first line is:
00:00:00 - 00:01:00 This is first line
is split into 2 fields i.e.
time: 00:00:00 - 00:01:00
message: This is first line
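To make the split concrete, here is a minimal sketch in Python of how one line could be turned into those two fields. The regular expression and the helper name `split_line` are assumptions for illustration only; it assumes every line has the layout `HH:MM:SS - HH:MM:SS <text>` and it does not talk to Elasticsearch at all.

```python
import re

# Assumed line layout: "HH:MM:SS - HH:MM:SS <text>"
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s*-\s*(\d{2}:\d{2}:\d{2})\s+(.*)$")


def split_line(line):
    """Split one transcript line into the time and message fields.

    Returns None when the line does not start with a timestamp range.
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    start, end, message = match.groups()
    return {"time": f"{start} - {end}", "message": message}


doc = split_line("00:00:00 - 00:01:00 This is first line")
print(doc)
```

Each returned dictionary corresponds to one time/message pair as described above.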
This is exactly how the data looks in my documents:
Keyword: cricket, football
00:00:00 - 00:01:00 This is first line
00:01:01 - 00:02:30 This is second line
00:02:30 - 00:03:45 This is third line
00:03:45 - 00:05:00 This is fourth line
Keyword: tennis
00:05:00 - 00:06:55 This is fifth line
00:06:55 - 00:07:45 This is sixth line
...
So I have one or more keywords (cricket, football) for the paragraph covering the time range 00:00:00 - 00:05:00, and a search for one of them (e.g. cricket) should return the entire paragraph.
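To illustrate the result I expect, the grouping can be sketched in plain Python. This is only a sketch of the desired behaviour under my own assumptions, not an Elasticsearch solution: the names `group_by_keyword` and `search` and the field names `keywords`, `start`, `end` and `lines` are hypothetical, and it assumes each block begins with a `Keyword:` line followed by its timestamped lines.

```python
import re

# Assumed line layout: "HH:MM:SS - HH:MM:SS <text>"
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s*-\s*(\d{2}:\d{2}:\d{2})\s+(.*)$")
KEYWORD_PREFIX = "Keyword:"


def group_by_keyword(lines):
    """Group timestamped lines under the preceding 'Keyword:' header."""
    segments = []
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith(KEYWORD_PREFIX):
            names = line[len(KEYWORD_PREFIX):].split(",")
            keywords = [k.strip().lower() for k in names if k.strip()]
            current = {"keywords": keywords, "start": None, "end": None, "lines": []}
            segments.append(current)
            continue
        match = LINE_RE.match(line)
        if match is None or current is None:
            continue  # skip lines that are not timestamped or have no header yet
        start, end, message = match.groups()
        if current["start"] is None:
            current["start"] = start  # first line sets the block start
        current["end"] = end  # last line seen sets the block end
        current["lines"].append({"time": f"{start} - {end}", "message": message})
    return segments


def search(segments, keyword):
    """Return every block whose keyword list contains the given keyword."""
    wanted = keyword.strip().lower()
    return [seg for seg in segments if wanted in seg["keywords"]]


SAMPLE = [
    "Keyword: cricket, football",
    "00:00:00 - 00:01:00 This is first line",
    "00:01:01 - 00:02:30 This is second line",
    "00:02:30 - 00:03:45 This is third line",
    "00:03:45 - 00:05:00 This is fourth line",
    "Keyword: tennis",
    "00:05:00 - 00:06:55 This is fifth line",
    "00:06:55 - 00:07:45 This is sixth line",
]

segments = group_by_keyword(SAMPLE)
hits = search(segments, "cricket")
for hit in hits:
    print(hit["start"], "-", hit["end"])
    for entry in hit["lines"]:
        print(" ", entry["time"], entry["message"])
```

Searching for cricket here yields the whole first block, which is the behaviour I would like to reproduce with Elasticsearch.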
Also, I am not sure how the keywords should be stored. Would they go in a separate table? If so, how do I define the relation?
We need to search the text by keyword and return the data within the matching time range.