I'm trying to work out how to get XML into Elasticsearch. Putting aside the namespace issue for a moment, I want to understand what it would take to search for a phrase that runs through an element, or one that runs around an element.
For example, this is the XML:
...on a cold winter day at the Times Institute see Section 32 - Work History, the defendant...
I need to be able to search for "cold winter day" and "Times Institute" w/3 "defendant", where w/N means "within N words". Additionally, I need to be able to find timePeriod:winter w/5 "Times Institute".
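To make the proximity requirement concrete, here is a rough sketch of how "Times Institute" w/3 "defendant" might be written with Elasticsearch span queries. It is only a sketch under assumptions: the field name body is hypothetical, the passage is assumed to be indexed as a single text field, and the terms are lowercased on the assumption that the field uses the standard analyzer. The helpers phrase and within are made up for illustration.

```python
import json

FIELD = "body"  # hypothetical field name, not from any real mapping


def phrase(field, words):
    """Exact phrase: terms must be adjacent (slop 0) and in order."""
    return {
        "span_near": {
            "clauses": [{"span_term": {field: w}} for w in words],
            "slop": 0,
            "in_order": True,
        }
    }


def within(left, right, distance):
    """Two span clauses at most `distance` unmatched positions apart, either order."""
    return {
        "query": {
            "span_near": {
                "clauses": [left, right],
                "slop": distance,
                "in_order": False,
            }
        }
    }


# "Times Institute" w/3 "defendant"; terms lowercased to match analyzed tokens
query = within(
    phrase(FIELD, ["times", "institute"]),
    {"span_term": {FIELD: "defendant"}},
    3,
)

print(json.dumps(query, indent=2))
```

Note that slop counts intervening token positions, so any element text indexed inline in the same field between the two clauses would count toward the distance. Whether that text can be skipped is exactly the open part of the question.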
So I'm trying to figure out what can and can't be done in Elasticsearch, and what it will take to accomplish these things. I've read about the nested datatype, but those examples don't cover having text before and after the nested object; they're self-contained:
    "user" : [
      {
        "first" : "John",
        "last" : "Smith"
      },
      {
        "first" : "Alice",
        "last" : "White"
      }
    ]
I'd like to figure out what JSON structure (or structures) I need to transform the XML into to accomplish this, and what the queries would look like. Thanks!