Distinct key value pairs


(atoomkern) #1

I would like to search all fields for a given string and get all distinct
matching key-value pairs as a result. It should also be possible to add
filters/queries to constrain the results. The original data consists of
millions of documents and a few thousand possible keys, so I simplified the
data into the example below.

Mapping

{
  "example_index": {
    "example_type": {
      "properties": {
        "name": {"type": "string"},
        "job": {"type": "string"},
        "gender": {"type": "string"}
      }
    }
  }
}

Data

{
  "name": "John",
  "job": "pilot",
  "gender": "male"
},
{
  "name": "Eric",
  "job": "pilot",
  "gender": "male"
},
{
  "name": "Marie",
  "job": "ceo",
  "gender": "female"
}

Needed results

For example, searching all fields for “ma” should return:

{
  "gender": "male", (1x)
  "gender": "female",
  "name": "Marie"
}

Searching all fields for “ma” in combination with the query "job": "pilot"
should return only:

{
  "gender": "male" (1x)
}
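To pin down the expected behaviour, here is a small client-side sketch in plain Python (no Elasticsearch involved) that produces the two result sets above from the three example documents. The function name `distinct_pairs`, the case-insensitive substring match, and the exact-match filter are assumptions made for illustration only.

```python
def distinct_pairs(docs, needle, filters=None):
    """Return the set of (key, value) pairs whose value contains `needle`."""
    needle = needle.lower()
    filters = filters or {}
    pairs = set()
    for doc in docs:
        # Skip documents that do not satisfy every exact-match filter.
        if any(doc.get(key) != wanted for key, wanted in filters.items()):
            continue
        for key, value in doc.items():
            # Case-insensitive substring match on the field value (assumption).
            if needle in str(value).lower():
                pairs.add((key, value))
    return pairs


docs = [
    {"name": "John", "job": "pilot", "gender": "male"},
    {"name": "Eric", "job": "pilot", "gender": "male"},
    {"name": "Marie", "job": "ceo", "gender": "female"},
]

print(sorted(distinct_pairs(docs, "ma")))
print(sorted(distinct_pairs(docs, "ma", {"job": "pilot"})))
```

Using a set is what gives the "(1x)" behaviour: "gender": "male" occurs in two documents but is reported once.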

Aggregation

With the aggregation framework this is partly possible with the following
request:

{
  "from": 0,
  "size": 0,
  "query": {
    "match": {
      "job": "pilot"
    }
  },
  "aggs": {
    "test": {
      "terms": {
        "field": "_all",
        "include": ".*ma.*"
      }
    }
  }
}

But it only returns the unique values without the keys.
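
The keys are lost because `_all` combines the values of every field into one analyzed field, so a term no longer records which field it came from. A rough imitation in Python, assuming a simple lowercasing word tokenizer (a simplification, not the real analyzer) and hypothetical helper names:

```python
import re


def all_field_terms(doc):
    """Collect lowercased word tokens from every field value, as `_all` roughly would."""
    terms = set()
    for value in doc.values():
        terms.update(re.findall(r"\w+", str(value).lower()))
    return terms


def matching_all_terms(docs, pattern):
    """Return the distinct terms that match `pattern` as a whole (like `include`)."""
    regex = re.compile(pattern)
    found = set()
    for doc in docs:
        found.update(t for t in all_field_terms(doc) if regex.fullmatch(t))
    return found


docs = [
    {"name": "John", "job": "pilot", "gender": "male"},
    {"name": "Eric", "job": "pilot", "gender": "male"},
    {"name": "Marie", "job": "ceo", "gender": "female"},
]

pilots = [d for d in docs if d["job"] == "pilot"]
print(sorted(matching_all_terms(docs, ".*ma.*")))    # values only, no field names
print(sorted(matching_all_terms(pilots, ".*ma.*")))
```

The output contains terms such as "male" or "marie" but nothing that says whether they came from "gender" or "name".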

Does anyone have a suggestion for getting the needed distinct key-value pairs
with the aggregation framework, or another option in Elasticsearch?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/33dd2dda-ddb3-4b41-b7f1-fa62aee76ef9%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly) #2

Not that I would recommend this for millions of documents, but you can
script the term values for the terms aggregation. For example, the request
below produces a count for each distinct combination of gender and job.

{
  "aggs": {
    "t1": {
      "terms": {
        "script": "doc['gender'].value + doc['job'].value"
      }
    }
  }
}
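
To see what buckets such a script would yield for the three example documents from the question, the concatenation can be imitated in plain Python. This is only an illustration of the idea, not Elasticsearch code. The helper `combination_counts` and its `sep` parameter are hypothetical; a separator is worth considering because plain concatenation can make different combinations collide.

```python
from collections import Counter


def combination_counts(docs, fields, sep=""):
    """Count documents per distinct concatenation of the given field values."""
    return Counter(sep.join(str(doc[f]) for f in fields) for doc in docs)


docs = [
    {"name": "John", "job": "pilot", "gender": "male"},
    {"name": "Eric", "job": "pilot", "gender": "male"},
    {"name": "Marie", "job": "ceo", "gender": "female"},
]

# Same shape as the script above: gender followed directly by job.
print(combination_counts(docs, ["gender", "job"]))
# With a separator the two parts stay distinguishable.
print(combination_counts(docs, ["gender", "job"], sep="|"))
```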



(Jörg Prante) #3

Check whether nested docs match your requirements:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-nested-type.html

Jörg
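
One way to read this suggestion, sketched in Python under stated assumptions: each document is re-indexed as a list of nested key/value objects, and a nested aggregation groups matching values under their keys. The field names `attributes`, `key` and `value`, the helper names, and the premise that both sub-fields are indexed unanalyzed are all hypothetical. The request body is untested against a cluster.

```python
def to_nested(doc):
    """Turn a flat document into a list of nested key/value objects."""
    return {"attributes": [{"key": k, "value": v} for k, v in doc.items()]}


def nested_pairs_request(pattern):
    """Build a request body: per key, the distinct values matching `pattern`."""
    return {
        "size": 0,
        "aggs": {
            "attrs": {
                # Step into the nested objects so key and value stay paired.
                "nested": {"path": "attributes"},
                "aggs": {
                    "keys": {
                        "terms": {"field": "attributes.key"},
                        "aggs": {
                            "values": {
                                "terms": {
                                    "field": "attributes.value",
                                    "include": pattern,
                                }
                            }
                        },
                    }
                },
            }
        },
    }


print(to_nested({"name": "Marie", "job": "ceo", "gender": "female"}))
print(nested_pairs_request(".*ma.*"))
```

Because the values sub-aggregation sits under each key bucket, every returned value carries its key. Constraining the results (for example to "job": "pilot") would additionally need a query on the nested objects, which is left out here.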


