I need a query that finds documents in which one phrase occurs before (or after) another. The order is determined by the start and end fields, which hold each sentence's position, in seconds, on a timeline.
The index mapping:
{
  "thetext": {
    "mappings": {
      "thetext": {
        "properties": {
          "sentences": {
            "type": "nested",
            "properties": {
              "end": {
                "type": "double"
              },
              "speaker": {
                "type": "integer"
              },
              "start": {
                "type": "double"
              },
              "text": {
                "type": "text"
              }
            }
          }
        }
      }
    }
  }
}
Example of a doc:
{
  "sentences": [
    {
      "sentenceId": 1,
      "speaker": 1,
      "start": 0.2,
      "end": 0.9,
      "text": "Hi"
    },
    {
      "sentenceId": 2,
      "speaker": 2,
      "start": 1.5,
      "end": 2.46,
      "text": "Hello, good morning"
    },
    {
      "sentenceId": 3,
      "speaker": 1,
      "start": 3.32,
      "end": 6.88,
      "text": "I wish to talk to your manager"
    },
    {
      "sentenceId": 4,
      "speaker": 2,
      "start": 8.18,
      "end": 23.27,
      "text": "It won't be possible. He's not here."
    },
    {
      "sentenceId": 5,
      "speaker": 1,
      "start": 23.51,
      "end": 24.79,
      "text": "Okay, thank you."
    },
    {
      "sentenceId": 6,
      "speaker": 2,
      "start": 25.07,
      "end": 25.51,
      "text": "Bye bye"
    }
  ]
}
What I want to search for is, for example, docs in which the phrase "thank you" occurs after "bye bye". The two phrases can be in different nested docs (which is why I must consider the start and end fields) or in the same nested doc.
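To make the intended semantics concrete, here is a rough client-side sketch in Python of the check I would like Elasticsearch to perform per document. It is not an Elasticsearch feature; the function name `phrase_occurs_after` is made up for illustration, and it uses naive lowercase substring matching rather than the analyzed tokens a real `text` field would use.

```python
def phrase_occurs_after(doc, later_phrase, earlier_phrase):
    """Return True if later_phrase occurs after earlier_phrase in doc.

    Hypothetical helper: "after" means either later in the same sentence's
    text, or in a different sentence that starts at or after the end of the
    sentence containing earlier_phrase.
    """
    later = later_phrase.lower()
    earlier = earlier_phrase.lower()
    sentences = doc["sentences"]

    for a in sentences:
        pos = a["text"].lower().find(earlier)
        if pos == -1:
            continue  # this sentence does not contain the earlier phrase
        for b in sentences:
            text_b = b["text"].lower()
            if b is a:
                # Same nested doc: compare positions inside the text.
                if text_b.find(later, pos + len(earlier)) != -1:
                    return True
            elif b["start"] >= a["end"] and later in text_b:
                # Different nested docs: compare positions on the timeline.
                return True
    return False


# Trimmed version of the example doc above.
doc = {
    "sentences": [
        {"start": 23.51, "end": 24.79, "text": "Okay, thank you."},
        {"start": 25.07, "end": 25.51, "text": "Bye bye"},
    ]
}

print(phrase_occurs_after(doc, "bye bye", "thank you"))
print(phrase_occurs_after(doc, "thank you", "bye bye"))
```

In the example doc, "bye bye" comes after "thank you", so the doc should match that search but not the reverse one.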
Is this possible using standard Elasticsearch features? If not, would it be possible with a custom plugin?