ES 7.3 - Delete by Query - Deletes more than match clause

Hi all, I am using Elastic Stack to process some logs. There are some logs I dont need so decided to delete them. I deleted them using delete by query and setting the match for log.file.path.

Here is the curl setup I used:

curl -X POST "localhost:9200/filebeat-7.3.0-2020.02.02/_delete_by_query?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "log.file.path": "/var/log/nginx/Webserver1--website-45563-2-access.log"
    }
  }
}
'

I also tried:

curl -X POST "localhost:9200/filebeat-7.3.0-2020.02.02/_delete_by_query?q=log.file.path:/var/log/nginx/Webserver1--website-45563-2-access.log&pretty" 

I expected the documents with file.log.path = /var/log/nginx/Webserver1--website-45563-2-access.log to be deleted but instead it also deleted documents with log.file.path:

/var/log/nginx/Webserver1--website-45563-3-access.log
/var/log/nginx/Webserver1--website-45563-4-access.log
/var/log/nginx/Webserver1--website-45563-5-access.log
/var/log/nginx/Webserver1--website-45563-6-access.log
/var/log/nginx/Webserver1--website-45563-7-access.log

Not sure if I am doing something wrong or if im expecting the wrong behavior.

Any help is much appreciated.

Thank You.

Probably related to the analyzer used for the field.
Which one is set on that field?

Please format your code, logs or configuration files using </> icon as explained in this guide and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

This is the icon to use if you are not using markdown format:

There's a live preview panel for exactly this reasons.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.

Thank you for the reply and feedback about the formatting. I have now fixed the fomatting issues.

Please can you tell me how to find out which one is used for the field?

Thank You.

Have a look at the mapping.

Thank you for the reply. Here are the mappings:

first half:

{
  "filebeat-7.3.0-2020.02.02" : {
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "agent" : {
          "properties" : {
            "ephemeral_id" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "hostname" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "id" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "type" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "version" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "cloud" : {
          "properties" : {
            "account" : {
              "properties" : {
                "id" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "availability_zone" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "image" : {
              "properties" : {
                "id" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "instance" : {
              "properties" : {
                "id" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "machine" : {
              "properties" : {
                "type" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "provider" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "region" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "cm" : {
          "properties" : {
            "client" : {
              "properties" : {
                "ip" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "date" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "http" : {
              "properties" : {
                "referer" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "request" : {
              "properties" : {
                "bytes" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "method" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "time" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "status" : {
              "properties" : {
                "code" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "user" : {
              "properties" : {
                "agent" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            }
          }
        },
        "ecs" : {
          "properties" : {
            "version" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "event" : {
          "properties" : {
            "dataset" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }

second half:

},
"module" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
},
"timezone" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}
  }
},
"fileset" : {
  "properties" : {
"name" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}
  }
},
"host" : {
  "properties" : {
"architecture" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
},
"containerized" : {
  "type" : "boolean"
},
"hostname" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
},
"id" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
},
"name" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
},
"os" : {
  "properties" : {
    "codename" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    },
    "family" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    },
    "kernel" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    },
    "name" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    },
    "platform" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    },
    "version" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    }
  }
}
  }
},
"input" : {
  "properties" : {
"type" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}
  }
},
"log" : {
  "properties" : {
"file" : {
  "properties" : {
    "path" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    }
  }
},
"offset" : {
  "type" : "long"
}
  }
},
"message" : {
  "type" : "text",
  "fields" : {
"keyword" : {
  "type" : "keyword",
  "ignore_above" : 256
}
  }
},
"service" : {
  "properties" : {
"type" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}
  }
},
"tags" : {
  "type" : "text",
  "fields" : {
"keyword" : {
  "type" : "keyword",
  "ignore_above" : 256
}
  }
}
  }
}
  }
}

so log.file.path is a text field using the default analyzer.
At index time and search time, the text is analyzed and broken into several pieces of text. That's why it had match other files.
If you want to understand more about this, use the _analyze API.

If you want to do exact match, you can use instead log.file.path.keyword which is indexed without any analysis.

Hope this helps.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.