Query Multiple/All Arbitrary-Named Fields


#1

Imagine this 'hypothetical' relational table:

doc | sentence | words
1   | 1        | a,b,c
1   | 2        | w,e,a,s
2   | 1        | o,w,s,a,q,e,r
2   | 2        | p,w,e,x,c,z
2   | 3        | o,m,b,v,c,x

My aim is to run exact-match keyword queries against the words field and retrieve the doc number and sentence number. E.g. retrieving entries that contain the word "o" should return doc 2, sentence 1 and doc 2, sentence 3. The table is only hypothetical because the number of words per sentence and of sentences per document varies, which is why I am looking into 'schema-less' alternatives.

From my limited ES knowledge, I thought the following structure might be a good way to organize the data, but I'm not sure how query-friendly it is. When querying you need to specify the field, and here the document and sentence names are variable (a doc can have 3 sentences, or 100, etc.), so I don't think I can query it as a nested object(?). I came across the _all field for query_string. Is this appropriate for what I'm after? Are there better alternatives?

    Document1:
        Sentence1:
            Words: [a,b,c]
        Sentence2:
            Words: [w,e,a,s]
    Document2:
        Sentence1:
            Words: [o,w,s,a,q,e,r]
        Sentence2:
            Words: [p,w,e,x,c,z]
        Sentence3:
            Words: [o,m,b,v,c,x]
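
To make the difficulty with variable field names concrete, here is a minimal Python sketch (plain dictionaries, no Elasticsearch involved) that flattens the structure above into one record per sentence, so that the document and sentence names become values rather than keys. The names `NESTED`, `flatten`, and `find_word` are hypothetical and exist only for this illustration.

```python
# Nested layout from the post: document and sentence names are dictionary keys.
NESTED = {
    "Document1": {
        "Sentence1": {"Words": ["a", "b", "c"]},
        "Sentence2": {"Words": ["w", "e", "a", "s"]},
    },
    "Document2": {
        "Sentence1": {"Words": ["o", "w", "s", "a", "q", "e", "r"]},
        "Sentence2": {"Words": ["p", "w", "e", "x", "c", "z"]},
        "Sentence3": {"Words": ["o", "m", "b", "v", "c", "x"]},
    },
}


def flatten(nested):
    """Return one record per sentence, moving the variable key names into values."""
    rows = []
    for doc_name, sentences in nested.items():
        for sentence_name, body in sentences.items():
            rows.append(
                {"doc": doc_name, "sentence": sentence_name, "words": body["Words"]}
            )
    return rows


def find_word(rows, word):
    """Exact match: return (doc, sentence) pairs whose word list contains `word`."""
    return [(row["doc"], row["sentence"]) for row in rows if word in row["words"]]


if __name__ == "__main__":
    # Prints the two Document2 sentences that contain "o".
    print(find_word(flatten(NESTED), "o"))
```

With the flattened records, every record has the same three fixed field names regardless of how many sentences a document has, so a query only ever needs to name `words`.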

(Isabel Drost-Fromm) #2

There are several ways of dealing with these kinds of relationships in Elasticsearch. You might be better off going through

https://www.elastic.co/guide/en/elasticsearch/guide/current/modeling-your-data.html

to figure out which is best for your problem and query needs.
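
As one illustration only, here is a rough sketch of what a flattened, one-document-per-sentence layout could look like as request bodies, built as plain Python dictionaries. It assumes a recent Elasticsearch version in which the `keyword` field type and typeless mappings are available; the index name `sentences`, the field names, and the helper functions are hypothetical, and nothing here is sent to a cluster.

```python
import json

# Assumed layout: one Elasticsearch document per sentence (not the only option).
ROWS = [
    {"doc": 1, "sentence": 1, "words": ["a", "b", "c"]},
    {"doc": 2, "sentence": 1, "words": ["o", "w", "s", "a", "q", "e", "r"]},
    {"doc": 2, "sentence": 3, "words": ["o", "m", "b", "v", "c", "x"]},
]

# Assumed mapping: `words` is a keyword field, so values are matched exactly
# instead of being analyzed as full text.
MAPPING = {
    "mappings": {
        "properties": {
            "doc": {"type": "integer"},
            "sentence": {"type": "integer"},
            "words": {"type": "keyword"},
        }
    }
}


def exact_word_query(word):
    """Build a search body that matches sentences containing exactly `word`."""
    return {
        "_source": ["doc", "sentence"],  # return only the identifying fields
        "query": {"term": {"words": word}},
    }


def bulk_body(index_name, rows):
    """Build a newline-delimited bulk request body that indexes each row."""
    lines = []
    for row in rows:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(json.dumps(MAPPING, indent=2))
    print(bulk_body("sentences", ROWS))
    print(json.dumps(exact_word_query("o"), indent=2))
```

Because every sentence is its own document with fixed field names, the query names only the `words` field and does not need to know how many sentences a given doc has.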

Hope this helps,
Isabel

