Any tips? Span_near with nested documents


(Dusty OBrien) #1

Hi, I've tried searching, to no avail.

I want to find documents whose nested documents, each with different properties, appear in a particular order.

E.g. I would like to find documents where nestedthings.myid=nested1 comes before nestedthings.myid=nested2. In the example below, this would match document c1 but NOT c2.

NOTE: Eventually the nested documents will be complex, each with perhaps a different schema. For now they're trivially simple.

PUT span-test/container/c1
{
  "myid": "c1",
  "nestedthings": [
    {
      "myid": "nested1"
    },
    {
      "myid": "nested2"
    }
  ]
}

PUT span-test/container/c2
{
  "myid": "c2",
  "nestedthings": [
    {
      "myid": "nested2"
    },
    {
      "myid": "nested1"
    }
  ]
}
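
To pin down the intended semantics, here is a minimal client-side sketch in Python. It is not an Elasticsearch query, and the function name `ids_in_order` is made up for illustration. It treats "in order" as "the wanted ids appear in this sequence within nestedthings, with other items allowed in between", which is an assumption about the requirement.

```python
def ids_in_order(container, wanted_ids):
    """Return True if wanted_ids occur in this order (gaps allowed)
    among the myid values of container['nestedthings']."""
    remaining = iter(item.get("myid") for item in container.get("nestedthings", []))
    # `wanted in remaining` consumes the iterator up to the first match,
    # so each id has to be found after the previous one.
    return all(wanted in remaining for wanted in wanted_ids)


# The two example documents from this post.
c1 = {"myid": "c1", "nestedthings": [{"myid": "nested1"}, {"myid": "nested2"}]}
c2 = {"myid": "c2", "nestedthings": [{"myid": "nested2"}, {"myid": "nested1"}]}

print(ids_in_order(c1, ["nested1", "nested2"]))  # True
print(ids_in_order(c2, ["nested1", "nested2"]))  # False
```

Any server-side query should agree with this check on c1 and c2.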

My understanding is that a span_near query is the most efficient way to accomplish this, but I'm not wed to any particular approach.

NOTE: I'm using nested documents because they exist independently, and eventually each will have multiple properties; I want each nested doc to act as a coherent whole. However, all of the span_near examples I've found operate on single strings, and I can't really serialize my nested documents into strings because of their eventual complexity.

Please help.

I'll reply here with the index, data, and queries.


(Dusty OBrien) #2

INDEX

PUT span-test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "container": {
      "properties": {
        "myid": {
          "type": "string",
          "index": "not_analyzed",
          "index_options": "positions"
        },
        "nestedthings": {
          "type": "nested",
          "properties": {
            "myid": {
              "type": "string",
              "index": "not_analyzed",
              "index_options": "positions"
            }
          }
        }
      },
      "_all": {
        "enabled": true,
        "index": "not_analyzed",
        "analyzer": "keyword"
      }
    }
  }
}

DATA

PUT span-test/container/c1
{
  "myid": "c1",
  "nestedthings": [
    {
      "myid": "nested1"
    },
    {
      "myid": "nested2"
    }
  ]
}

PUT span-test/container/c2
{
  "myid": "c2",
  "nestedthings": [
    {
      "myid": "nested2"
    },
    {
      "myid": "nested1"
    }
  ]
}

BASIC QUERIES

GET span-test/container/_search
{
  "query": {"match_all": {}}
}
^^^^^ WORKS

GET span-test/container/_search
{
  "query": {
    "nested": {
      "path": "nestedthings",
      "query": {
        "span_term": {
          "nestedthings.myid": "nested1"
        }
      }
    }
  }
}
^^^^^ WORKS

GET span-test/container/_search
{
  "query": {
    "nested": {
      "path": "nestedthings",
      "query": {
        "span_term": {
          "nestedthings.myid": "nested2"
        }
      }
    }
  }
}
^^^^^ WORKS

GET span-test/container/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "nestedthings",
            "query": {
              "span_term": {
                "nestedthings.myid": "nested1"
              }
            }
          }
        },
        {
          "nested": {
            "path": "nestedthings",
            "query": {
              "span_term": {
                "nestedthings.myid": "nested2"
              }
            }
          }
        }
      ]
    }
  }
}
^^^^^ KIND OF WORKS BUT RETURNS c2 (and c1) when I only want c1 (because of order)
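
One possible fallback, sketched here in Python, is to keep the bool query above and drop the wrongly ordered hits on the client. This assumes each hit carries its `_source` (where the nestedthings array keeps the order it was indexed with). The helper name `hits_with_order` is hypothetical, and `sample_response` is a hand-written, trimmed stand-in for a search response, not real output. Filtering after the search also means hit totals and paging no longer reflect the final result.

```python
def hits_with_order(response, first_id, second_id):
    """Return the _id of every hit in which some first_id item
    precedes some second_id item inside _source.nestedthings."""
    kept = []
    for hit in response["hits"]["hits"]:
        ids = [item.get("myid") for item in hit["_source"].get("nestedthings", [])]
        if first_id in ids and second_id in ids:
            first_pos = ids.index(first_id)                       # earliest first_id
            last_pos = len(ids) - 1 - ids[::-1].index(second_id)  # latest second_id
            if first_pos < last_pos:
                kept.append(hit["_id"])
    return kept


# Hand-written stand-in for what the bool query returns (both c1 and c2).
sample_response = {
    "hits": {
        "hits": [
            {"_id": "c1", "_source": {"myid": "c1", "nestedthings": [
                {"myid": "nested1"}, {"myid": "nested2"}]}},
            {"_id": "c2", "_source": {"myid": "c2", "nestedthings": [
                {"myid": "nested2"}, {"myid": "nested1"}]}},
        ]
    }
}

print(hits_with_order(sample_response, "nested1", "nested2"))  # ['c1']
```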

GET span-test/container/_search
{
  "query": {
    "span_near": {
      "clauses": [
        {
          "nested": {
            "path": "nestedthings",
            "query": {
              "span_term": {
                "nestedthings.myid": "nested1"
              }
            }
          }
        },
        {
          "nested": {
            "path": "nestedthings",
            "query": {
              "span_term": {
                "nestedthings.myid": "nested2"
              }
            }
          }
        }
      ],
      "slop": 12,
      "in_order": false
    }
  }
}
^^^^ DOESN'T WORK -- GIVES ERROR:
"error": {
  "root_cause": [
    {
      "type": "query_parsing_exception",
      "reason": "[nested] nested object under path [nestedthings] is not of nested type",
      "index": "span-test",
      "line": 4,
      "col": 15
    }
  ],

Reading the documentation, I get that this isn't syntactically allowed, so I've tried various workarounds, such as wrapping the nested query in span_multi, which took me to some really weird places. Nothing really works.

Anyways, the above should illustrate what I really want to do.

Thanks!
Dusty

