Is there a way to get “NOT IN” functionality in Elasticsearch?

Here is a simple but difficult question. I want to run an aggregation over
query results that are filtered with *"NOT IN"* semantics, as in SQL on any
RDBMS.

For example, I want to do something like the following.

curl -XGET 'http://localhost:9200/my_index/my_type/_search?pretty' -d '{
  "query": {
    "filtered": {
      "filter": {
        !!! Documents whose 'user_id' field value is 'NOT IN' distinct user_ids where the 'action' field value is 'signup' !!!
      }
    }
  },
  "aggregations": {
    "distinct_users": {
      "cardinality": {
        "field": "user_id",
        "precision_threshold": 1000000
      }
    }
  }
}'

Is it possible to get the results I want in Elasticsearch?

Thanks in advance.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4aa88ff2-e0ae-45f3-85f4-9c6afba3842c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

It depends on how your documents are modeled. If both the action and user id
fields are in the same document, then this could be a simple not filter
wrapping a term filter. But I'm afraid that what you are looking for is a
join, which Elasticsearch does not support (well, it does, but only in
specific cases via parent/child and nested docs, not general-purpose joins).
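
To illustrate the first case, here is a minimal sketch of a not filter
wrapping a term filter, written in the same filtered-query syntax as the
question. It assumes an Elasticsearch 1.x node on localhost:9200 and the
my_index/my_type names from the question; QUERY and run_query are
hypothetical names used only for this example. Note that it only drops the
documents whose own action is "signup". Other documents from the same users
still match, which is why the original question amounts to a join.

```shell
# Query body: keep every document except those whose own action is "signup".
# Assumes the 1.x "filtered" query and "not" filter syntax.
QUERY='{
  "query": {
    "filtered": {
      "filter": {
        "not": {
          "term": { "action": "signup" }
        }
      }
    }
  },
  "aggregations": {
    "distinct_users": {
      "cardinality": { "field": "user_id" }
    }
  }
}'

# Hypothetical wrapper; call it only when a node is actually running.
run_query() {
  curl -s -XGET 'http://localhost:9200/my_index/my_type/_search?pretty' -d "$QUERY"
}
```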

On Wed, Jan 7, 2015 at 11:43 AM, Ho-sang Jeon jhsbeat@gmail.com wrote:

--
Adrien Grand

Thanks, Adrien Grand.

To clarify my question, I have added some example data below.

Here is some example data.

curl -s -XPOST 'localhost:9200/my_index/my_type/1' -d'{ "user_id": 1234, "action": "signup" }'
curl -s -XPOST 'localhost:9200/my_index/my_type/2' -d'{ "user_id": 1234, "action": "visit" }'
curl -s -XPOST 'localhost:9200/my_index/my_type/3' -d'{ "user_id": 1234, "action": "visit" }'
curl -s -XPOST 'localhost:9200/my_index/my_type/4' -d'{ "user_id": 5678, "action": "visit" }'
curl -s -XPOST 'localhost:9200/my_index/my_type/5' -d'{ "user_id": 5678, "action": "visit" }'
curl -s -XPOST 'localhost:9200/my_index/my_type/6' -d'{ "user_id": 9012, "action": "signup" }'
curl -s -XPOST 'localhost:9200/my_index/my_type/7' -d'{ "user_id": 9012, "action": "visit" }'
curl -s -XPOST 'localhost:9200/my_index/my_type/8' -d'{ "user_id": 9012, "action": "visit" }'
curl -s -XPOST 'localhost:9200/my_index/my_type/9' -d'{ "user_id": 3456, "action": "visit" }'
curl -s -XPOST 'localhost:9200/my_index/my_type/10' -d'{ "user_id": 3456, "action": "visit" }'

What I really want is the documents whose user_id has never signed up,
based on this log data. So documents [4, 5, 9, 10] are the final results
I want.

Is it possible to get the results I want in Elasticsearch?
Thanks in advance.

Not sure this really helps, but it might be easier and more reliable to do
this as two separate queries: the first would just be an agg listing all
distinct users, and the second an agg listing users who have an action of
"signup". You would then subtract the second list from the first.
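
The subtraction step could be sketched on the client side roughly as
follows. This is only an illustration: the two ID lists are hard-coded from
the example data in this thread, whereas in practice each would come from
the bucket keys of a terms aggregation on user_id (one over all documents,
one restricted to action "signup"). The file names and variable names are
hypothetical.

```shell
# Scratch directory for the two ID lists (hypothetical file names).
work=$(mktemp -d)

# Stand-ins for the two aggregation results, copied from the example data.
printf '%s\n' 1234 5678 9012 3456 > "$work/all_users.txt"
printf '%s\n' 1234 9012 > "$work/signup_users.txt"

# Sort both lists the same way so comm can compare them line by line.
sort -u "$work/all_users.txt" > "$work/all_sorted.txt"
sort -u "$work/signup_users.txt" > "$work/signup_sorted.txt"

# comm -23 keeps only the lines that are unique to the first file,
# i.e. users that never appear in the signup list.
never_signed_up=$(comm -23 "$work/all_sorted.txt" "$work/signup_sorted.txt")

# Join the remaining IDs with commas and build a terms filter from them.
id_list=$(printf '%s\n' "$never_signed_up" | paste -sd, -)
terms_filter="{ \"terms\": { \"user_id\": [${id_list}] } }"
echo "$terms_filter"

rm -rf "$work"
```

For the example data the remaining IDs are 3456 and 5678, which correspond
to documents 4, 5, 9 and 10. The resulting terms filter could then go into
the filter section of the original query. Holding every user ID on the
client may not be practical for very large user sets.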

On Wednesday, January 7, 2015 6:18:34 AM UTC-5, Ho-sang Jeon wrote:
