Hey, I was wondering if you guys could help me out here.
The data I'm currently querying has a nested field, nodes, e.g.:
{
  "document_name": "Regulation 6966/76",
  "nodes": [
    {
      "content": {
        "industry": "finance",
        "tags": ["TAG 01", "TAG 02", "TAG 03"],
        ...
      }
    }
  ]
  ...
}
I want to query the tags field by comparing it against a supplied list of query_tags.
If the intersection of the two lists is not empty, that node should be returned.
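To make the expected semantics concrete, here is a minimal Python sketch of the check I have in mind, applied to a document shaped like the example above. The function names node_matches and matching_nodes are made up for this post, and the comparison is an exact, case-sensitive string match.

```python
def node_matches(node_tags, query_tags):
    """Return True when the two tag lists share at least one exact value."""
    # isdisjoint is True when there is no common element, so negate it.
    return not set(node_tags).isdisjoint(query_tags)


def matching_nodes(document, query_tags):
    """Return the nodes of a document whose content.tags intersect query_tags."""
    # Hypothetical helper: assumes the document layout shown in the example.
    return [
        node
        for node in document.get("nodes", [])
        if node_matches(node.get("content", {}).get("tags", []), query_tags)
    ]
```

With the example document and query_tags of ["TAG 01", "TAG 02"], the single node should be returned; with ["TAG 99"] nothing should be.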
I wanted an exact match, so in the mappings I have:
"tags": {
  "type": "keyword",
  "index": "not_analyzed",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
Query:
GET documents/documents/_search
{
  "query": {
    "nested": {
      "path": "nodes",
      "query": {
        "bool": {
          "filter": [
            {
              "terms": {
                "nodes.content.tags": ["TAG 01", "TAG 02"]
              }
            }
          ]
        }
      },
      "inner_hits": {
        "_source": "nodes.content.*"
      }
    }
  }
}
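In case it helps to reproduce this, below is a minimal sketch of how the same request body could be built from a list of query_tags in Python. The function name build_nested_tags_query is hypothetical, and the sketch only builds the dictionary; sending it to the cluster is left out.

```python
import json


def build_nested_tags_query(query_tags):
    """Build the nested terms query shown above for an arbitrary tag list."""
    return {
        "query": {
            "nested": {
                "path": "nodes",
                "query": {
                    "bool": {
                        "filter": [
                            # terms matches when the field holds ANY of the values
                            {"terms": {"nodes.content.tags": list(query_tags)}}
                        ]
                    }
                },
                "inner_hits": {"_source": "nodes.content.*"},
            }
        }
    }


if __name__ == "__main__":
    # Print the body so it can be pasted after the GET line above.
    print(json.dumps(build_nested_tags_query(["TAG 01", "TAG 02"]), indent=2))
```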
The behavior of this query is highly unpredictable to an untrained eye like mine.
Sometimes it does what I expect, sometimes not.
I've tried to investigate whether the inconsistency was due to the white space in the tag values, but that does not seem to be the cause.
I'm looking for two things:
- Is there something obviously wrong here?
- Suggestions on how to perform this task in any other way.
Thanks