Here is a simple but difficult question. I want to run an aggregation over
query results that are filtered with "NOT IN" functionality, as in any RDBMS's
SQL.
For example, I want to do something like the following.
curl -XGET http://localhost:9200/my_index/my_type/_search?pretty -d '{
"query": {
"filtered": {
"filter": {
!!! Documents whose 'user_id' field value is 'NOT IN' distinct user_ids where the 'action' field value is 'signup' !!!
}
}
},
"aggregations": {
"distinct_users":{
"cardinality": {
"field": "user_id",
"precision_threshold": 1000000
}
}
}
}'
Is it possible to get the results I want in Elasticsearch?
It depends on how your documents are modeled. If both the action and user id
fields are in the same document, then this could be a simple "not" filter
wrapping a "term" filter, but I'm afraid that what you are looking for is a
join, which Elasticsearch does not support (well, it does, but only in specific
cases via parent/child and nested docs, not general-purpose joins).
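For the simple case described above, where action and user_id live in the same document, the request body might look like the sketch below. It only builds and prints the JSON (it does not contact a cluster), it uses the 1.x-era "filtered" query from the question, and the helper name not_signup_query is made up for illustration. Note that it excludes only the documents whose own action is "signup"; it does not exclude other documents belonging to users who signed up, which is the join problem.

```python
import json


def not_signup_query(action_value="signup"):
    """Build a search body that filters out documents whose 'action'
    equals action_value, then counts distinct user_id values.

    Hypothetical helper; field names are taken from the question.
    """
    return {
        "query": {
            "filtered": {
                "filter": {
                    # "not" wrapping "term": keep documents whose action differs
                    "not": {"term": {"action": action_value}}
                }
            }
        },
        "aggregations": {
            "distinct_users": {
                "cardinality": {"field": "user_id"}
            }
        },
    }


# Print the body so it can be pasted after `-d` in a curl call
print(json.dumps(not_signup_query(), indent=2))
```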
On Wed, Jan 7, 2015 at 11:43 AM, Ho-sang Jeon jhsbeat@gmail.com wrote:
What I really want to get is the documents whose user_id has not signed up,
according to these log data. So, documents [4, 5, 9, 10] are the final
results I want to get.
Is it possible to get these results in Elasticsearch?
Thanks in advance.
I'm not sure this really helps, but it might be easier and more reliable to
do this as two separate queries: the first would be an aggregation listing all
distinct users, and the second an aggregation listing users who have an
action of "signup". You would then subtract the second list from the first.
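A rough client-side sketch of that two-query idea follows. It assumes each query uses a "terms" aggregation on user_id (named "users" here, a made-up name) and that the responses have the usual buckets layout. The helper names and the sample responses are invented for illustration, and nothing is sent to a cluster.

```python
def users_agg_body(action=None, size=10000):
    """Build a body listing distinct user_id values via a terms aggregation.

    'size' is an assumed upper bound on the number of users; it must be
    large enough to cover every distinct user_id, or the list is truncated.
    """
    body = {
        "size": 0,  # no hits needed, only the aggregation
        "aggregations": {"users": {"terms": {"field": "user_id", "size": size}}},
    }
    if action is not None:
        # Restrict to documents with the given action (1.x "filtered" query)
        body["query"] = {"filtered": {"filter": {"term": {"action": action}}}}
    return body


def bucket_keys(response, agg_name="users"):
    """Collect the bucket keys of a terms aggregation into a set."""
    return {b["key"] for b in response["aggregations"][agg_name]["buckets"]}


def users_without_action(all_users_response, action_response):
    """Users in the first response that are absent from the second."""
    return bucket_keys(all_users_response) - bucket_keys(action_response)


# Made-up responses standing in for the two search results
all_users = {"aggregations": {"users": {"buckets": [
    {"key": "u1", "doc_count": 3},
    {"key": "u2", "doc_count": 2},
    {"key": "u3", "doc_count": 1},
]}}}
signed_up = {"aggregations": {"users": {"buckets": [
    {"key": "u1", "doc_count": 1},
    {"key": "u3", "doc_count": 1},
]}}}

print(sorted(users_without_action(all_users, signed_up)))
```

The resulting ids could then be passed to a "terms" filter in a third query to fetch the matching documents, although that may become unwieldy when the list of users is very large.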
On Wednesday, January 7, 2015 6:18:34 AM UTC-5, Ho-sang Jeon wrote:
Thanks, Adrien Grand.
To clarify my question, I have added some example data below.