Keyword search using Elasticsearch


I am new to Elasticsearch and I am trying to implement text search with it. I have over 100 documents, and every document consists of lines that start with a timestamp range:
00:00:00 - 00:01:00 This is first line
00:01:01 - 00:02:30 This is second line
00:02:30 - 00:03:45 This is third line
And so on.
I am splitting each of these lines into different paragraphs and performing a text search over the documents.

Now I want to search by keyword, where one or more keywords are defined for a block of lines, say those between timestamps 00:00:00 and 00:05:00. A keyword match should then return the entire block, that is, every line between those timestamps.

Can you please help me understand how to achieve this functionality using Elasticsearch?

Thanks in advance!!


What do your JSON documents look like?
I mean, are you splitting each line into multiple fields like start_date, end_date and message?

If you could share a sample JSON document, that would help me guide you.

Another question: do you have a full timestamp including the date, or only the time of day?

Hi David,

We receive text documents that contain only timestamps and text. The files are speech transcripts, so each line has a timestamp (time without a date) followed by space-separated text.

So, I am splitting my data into time and message fields.
For example, line 1 of my document:

00:00:00 - 00:01:00 This is first line

is split into two fields:

time: 00:00:00 - 00:01:00 
message: This is first line  
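
The split described above could be sketched in Python roughly as follows. This is only an illustration, not the code actually in use: the function names and the extra start_seconds / end_seconds fields are assumptions added here, on the guess that numeric times would be easier to filter on than the raw "time" string.

```python
import re

# Matches lines of the form "HH:MM:SS - HH:MM:SS message".
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s*-\s*(\d{2}:\d{2}:\d{2})\s+(.*)$")


def to_seconds(hms):
    """Convert "HH:MM:SS" to a number of seconds from the start of the recording."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_line(line):
    """Split one transcript line into time and message fields.

    Returns None when the line does not start with a timestamp range.
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    start, end, message = match.groups()
    return {
        "time": f"{start} - {end}",
        "start_seconds": to_seconds(start),  # hypothetical helper field
        "end_seconds": to_seconds(end),      # hypothetical helper field
        "message": message.strip(),
    }


doc = parse_line("00:00:00 - 00:01:00 This is first line")
print(doc["time"], "|", doc["message"])
```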

This is exactly how I have the data in my documents.

Keyword: cricket, football

00:00:00 - 00:01:00 This is first line

00:01:01 - 00:02:30 This is second line

00:02:30 - 00:03:45 This is third line

00:03:45 - 00:05:00 This is fourth line

Keyword: tennis

00:05:00 - 00:06:55 This is fifth line

00:06:55 - 00:07:45 This is sixth line


So I have one or more keywords (cricket, football) for the paragraph covering the time range 00:00:00 - 00:05:00, and a search for one of them (e.g. cricket) should return the entire paragraph.

Also, I am not sure how the keywords should be stored. Would they go in a separate table? How would I define the relation?
We need to search the text by keyword and return the data within the matching time range.
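
One possible way to model this, sketched under assumptions that nobody in this thread has confirmed: instead of a separate table, each keyword-tagged time range becomes a single Elasticsearch document that carries its keywords, its overall start and end time, and all of its lines. A term query on the keywords field would then return the whole paragraph in one hit. The field names (keywords, start, end, lines) and the helper functions are hypothetical. The query body uses the standard term query and assumes keywords is mapped as a keyword field; sending it to a cluster is left out.

```python
import re

LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s*-\s*(\d{2}:\d{2}:\d{2})\s+(.*)$")

SAMPLE = """\
Keyword: cricket, football
00:00:00 - 00:01:00 This is first line
00:01:01 - 00:02:30 This is second line
00:02:30 - 00:03:45 This is third line
00:03:45 - 00:05:00 This is fourth line
Keyword: tennis
00:05:00 - 00:06:55 This is fifth line
00:06:55 - 00:07:45 This is sixth line
"""


def build_segments(lines):
    """Group timestamped lines under the preceding "Keyword:" header.

    Each returned dict is one candidate Elasticsearch document.
    """
    segments = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.lower().startswith("keyword:"):
            names = line.split(":", 1)[1].split(",")
            keywords = [name.strip().lower() for name in names if name.strip()]
            current = {"keywords": keywords, "start": None, "end": None, "lines": []}
            segments.append(current)
            continue
        match = LINE_RE.match(line)
        if match is None or current is None:
            continue  # skip lines that fit neither pattern
        start, end, message = match.groups()
        if current["start"] is None:
            current["start"] = start  # start of the first line in the segment
        current["end"] = end          # end of the most recent line
        current["lines"].append({"time": f"{start} - {end}", "message": message})
    return segments


def keyword_query(keyword):
    """Search body: exact match of one keyword against the keywords array."""
    return {"query": {"term": {"keywords": keyword.lower()}}}


segments = build_segments(SAMPLE.splitlines())
# Local stand-in for the search, to show which paragraph "cricket" selects.
hits = [seg for seg in segments if "cricket" in seg["keywords"]]
for line in hits[0]["lines"]:
    print(line["time"], line["message"])
```

With this layout there is no relation to maintain: the keywords travel with the paragraph they describe, and the start and end fields still allow filtering by time range if that is needed later.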

