_delete_by_query not working

Due to a bad setting, we have some docs in ES that were loaded from gz-compressed files. We fixed the file, and the info we need now loads fine, but we want to delete the docs that came from the gz-compressed files.

I can identify the docs I want to delete, no problem. Their sources are messages .gz files, so we can key on a "source" value matching "*gz$".

GET filebeat-/_search
{
  "query": {
    "match": {
      "source:" : ".gz$"
    }
  }
}

But when we try to delete the gz docs, it does not work. The response shows:

"total": 0,
"deleted": 0,

So is the query bad, or are we doing something wrong with delete by query? Help? The delete request:

POST filebeat-/_delete_by_query
{
  "query": {
    "match": {
      "source:" : ".gz$"
    }
  }
}

Which versions of ES and Kibana are you using? I'm curious whether the query actually matches anything. It seems to be "working" in your case, just without returning any hits? Can you share the response of the search? Judging by the partial response shown here, delete by query does appear to execute.

Cheers
Rashmi

Hi,
My Elasticsearch is version 5.6.4, and my Kibana is 5.6.4.

The search:

GET filebeat-/_search
{
  "query": {
    "match": {
      "source:" : ".gz$"
    }
  }
}

Returns:

{
  "took": 46,
  "timed_out": false,
  "num_reduce_phases": 3,
  "_shards": {
    "total": 1161,
    "successful": 1161,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

The delete by query:

POST filebeat-/_delete_by_query
{
  "query": {
    "match": {
      "source:" : ".gz$"
    }
  }
}

Returns the following. It's the zero "total" and "deleted" values that make us think it did not work.

{
  "took": 66,
  "timed_out": false,
  "total": 0,
  "deleted": 0,
  "batches": 0,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}

Example of a source field value. We have a number of sources with the YYYYMMDD format:
/var/log/archive/messages-20180228.gz

Kibana search:
source:"*gz$"

And yes, the exclude is in place and working in Filebeat at this time:
exclude_files: [".gz$"]

Thanks JB

I think this is more of an ES discuss post. Will move it there.

Cheers
Rashmi

I see you have this:

POST filebeat-/_delete_by_query
{
  "query": {
    "match": {
      "source:" : ".gz$"
    }
  }
}

In my database the indices follow a pattern like filebeat-YYYYMMDD, which is common. Can you try this:

POST filebeat-*/_delete_by_query
{
  "query": {
    "match": {
      "source:" : ".gz$"
    }
  }
}

NB: notice the star after filebeat-

I see the star. The star is in my Dev Console; leaving it out was a paste error (bad JB, bad JB).
I double-checked and ran it again just now.
Zero results.
=== My try:
POST filebeat-*/_delete_by_query
{
  "query": {
    "match": {
      "source:" : "/var/log/archive/message-20180302-1519961523-20180303-1520048583.gz"
    }
  }
}

==== Results:
{
  "took": 63,
  "timed_out": false,
  "total": 0,
  "deleted": 0,
  "batches": 0,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}

I feel like I am missing something very easy; all comments are welcome. JB

I found the error.
The field name should be "source", not "source:" (with a trailing colon inside the quotes). A basic query error.
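
For anyone hitting the same problem, a corrected pair of requests might look like the sketch below. The field name is "source" with no colon. The wildcard query is an assumption and not something confirmed in this thread: if "source" is mapped as a keyword field (which I believe is the Filebeat 5.x template default), a match query only hits on the exact full path, so a pattern such as *.gz needs a wildcard or regexp query instead. Run the search first and look at hits.total before running the delete.

```
# Preview what would be deleted; check hits.total first
GET filebeat-*/_search
{
  "query": {
    "wildcard": {
      "source": "*.gz"
    }
  }
}

# Same query, this time deleting the matching docs
POST filebeat-*/_delete_by_query
{
  "query": {
    "wildcard": {
      "source": "*.gz"
    }
  }
}
```

A leading wildcard can be slow on large indices, so narrowing the index pattern to the affected days may help.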

One side note: if a nonexistent field is used, the query simply matches nothing and no error is reported, which is very confusing. Double-check your field names with care.
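
One possible way to check a field name before running a delete is sketched here. The exists query counts the docs that actually contain the field, and the field mapping request shows how the field is indexed. The index pattern and the field name "source" are taken from this thread; adjust them for your own setup.

```
# Counts docs that have a "source" field; a count of 0 suggests a wrong field name
GET filebeat-*/_count
{
  "query": {
    "exists": {
      "field": "source"
    }
  }
}

# Shows how the field is mapped (for example keyword vs text)
GET filebeat-*/_mapping/field/source
```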

Two useful commands to see whether the delete is working; edit the index name for your system.

GET _tasks?detailed=true&actions=*/delete/byquery

GET _cat/indices/filebeat-2018.04.04?v&s=docs.count:desc

JB
