Find fields containing an empty string and delete them

Igor_Motov · September 26, 2019, 2:27pm

From the index's point of view, a string made up entirely of stop words and an empty string are the same thing. So unless the field is also indexed in some other way, the index simply has no information about the difference. That means we have to fetch the source for documents whose field is empty or contains only stop words and check it anyway. Something like this:

DELETE test


PUT test
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "analyzer": "english"
      }
    }
  }
}

PUT test/_doc/only_stop_words
{
  "text": "to be or not to be"
}

PUT test/_doc/not_empty
{
  "text": "that is the question"
}

PUT test/_doc/empty
{
  "text": ""
}

PUT test/_doc/null
{
  "text": null
}

PUT test/_doc/does_not_exist
{

}


POST test/_update_by_query
{
  "script": {
    "source": """
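      // The query matches both empty strings and stop-word-only text, so check _source here.
      if (ctx._source.text.length() > 0) {
        // Only stop words: the field is not actually empty, leave the document untouched.
        ctx.op = "noop";
      } else {
        // Truly empty string: remove the field from the document.
        ctx._source.remove('text');
      }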
      """,
    "lang": "painless"
  },
  "query": {
    "bool": {
      "filter": {
        "exists": {
          "field": "text"
        }
      },
      "must_not": {
        "wildcard": {
          "text": "*"
        }
      }
    }
  }
}

POST test/_search
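
To see why the index cannot tell these two cases apart, one option is to run the `_analyze` API against the same `text` field. This is only an illustrative sketch that assumes the `test` index created above still exists. With the `english` analyzer, the first request should return an empty `tokens` list, because every word in it is a stop word. The second should return a token for "question".

```console
GET test/_analyze
{
  "field": "text",
  "text": "to be or not to be"
}

GET test/_analyze
{
  "field": "text",
  "text": "that is the question"
}
```

A field that produces no tokens has nothing for `wildcard` to match. That is why the `must_not` clause keeps both the `empty` and `only_stop_words` documents, and the script then has to look at `_source` to separate them.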
