Deleting documents that are missing fields


(Jeff Dupont) #1

I can easily query for documents that are missing a particular field,
but I'd like to free up that space by removing those documents. I've
tried this with no luck:

DELETE /my_index/pages/_search
{
  "filter" : {
    "missing" : {
      "field" : "sentences",
      "existence" : true,
      "null_value" : true
    }
  }
}

The filter works fine for finding them, but I can't find an easy way to
remove them, and I have about 2 million to remove.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c2d41bfb-145d-402e-a5aa-2f0329278bd9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Ivan Brusic) #2

I do not use delete by query, but have you tried using a fully formed query
and not just a filter? Perhaps an implicit match_all query is not being
set. Try using a filtered query with a match_all query and your filter.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html

--
Ivan



(Jeff Dupont) #3

I tried putting the filter in a filtered query with match_all, but it doesn't seem to find any results.

DELETE /my_index/pages/_search
{
  "filtered": {
    "match_all": {},
    "filter": {
      "missing": {
        "field": "sentences",
        "existence": true,
        "null_value": true
      }
    }
  }
}



(Jeff Dupont) #4

I finally got it!! Thanks to this
thread: https://groups.google.com/forum/#!topic/elasticsearch/Eb5ERjHXp4Y.
The syntax has changed: the whole request body now has to be wrapped in a
top-level query property, and match_all goes inside the filtered query's own
query property.

"query": {
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "missing": {
        "field": "sentences",
        "existence": true,
        "null_value": true
      }
    }
  }
}
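
For anyone assembling this from a script, here is a minimal sketch of how the body above could be built and paired with a request path. The thread does not show which endpoint the final working request used; the sketch assumes the `_query` delete-by-query endpoint from Elasticsearch 1.x rather than `_search`. The helper names `build_delete_body` and `build_request` are made up for illustration, and nothing is actually sent to a cluster.

```python
import json


def build_delete_body(field):
    """Build a body matching documents that lack `field`.

    The filtered query sits under a top-level "query" key, and
    match_all sits under the filtered query's own "query" key.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "missing": {
                        "field": field,
                        "existence": True,
                        "null_value": True,
                    }
                },
            }
        }
    }


def build_request(index, doc_type, field):
    """Return (method, path, json_body) without sending anything.

    Assumption: the ES 1.x delete-by-query endpoint /{index}/{type}/_query.
    """
    path = "/{}/{}/_query".format(index, doc_type)
    body = json.dumps(build_delete_body(field), indent=2)
    return "DELETE", path, body


if __name__ == "__main__":
    method, path, body = build_request("my_index", "pages", "sentences")
    print(method, path)
    print(body)
```

Running the same body against `_search` first (as a GET or POST) is a cheap way to check the hit count before deleting anything.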



(Ivan Brusic) #5

Yes, that was what I meant by a fully formed query. I leave nothing to chance.

--
Ivan

