Ingest pipeline with update by query updating only 999 documents?

POST /logdata/_update_by_query?pipeline=delete-referrer

Why is this pipeline updating only 999 documents at a time,
and how can I change it to process the whole index at once?

Thanks in advance

Update by query uses a scroll query behind the scenes and processes the results in batches. You cannot perform the update in a single batch unless you have a very small index or a small number of matching documents. How is this causing a problem?
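
As a rough sketch of what tuning the batches could look like: `scroll_size` sets how many documents each scroll batch pulls (it defaults to 1000), and a query in the request body limits the run to matching documents. The value 5000 and the `exists` filter on `referrer` are assumptions for illustration, not settings taken from this thread.

```
# scroll_size=5000 is an arbitrary example value (the default is 1000)
# the exists query assumes the pipeline's job is to remove the referrer field
POST /logdata/_update_by_query?pipeline=delete-referrer&scroll_size=5000&conflicts=proceed
{
  "query": {
    "exists": { "field": "referrer" }
  }
}
```

With the filter, a rerun only touches documents that still have the field. The request still works through the index batch by batch; a larger `scroll_size` just means fewer, bigger batches.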

OK, thanks :slight_smile:

One more thing: I have around 1 million records in my index. On the first run this pipeline deleted around 0.7 million records and then showed a timeout error. Since then it is scrolling 999 records per execution :thinking:

Then I guess you may need to tune the job and increase timeouts. Have a look in the docs for the options available. If you are deleting a majority of the data it can sometimes be easier to use the reindex API to copy the data you want to keep to a different index and then simply delete the original index.
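
One option that may help with the timeout, sketched here under the assumption that the error came from the client giving up on a long-running request: start the job with `wait_for_completion=false`, so Elasticsearch returns a task ID immediately and keeps working in the background. `<task_id>` below is a placeholder for whatever ID the first call returns.

```
# Returns {"task": "..."} right away instead of blocking until the job finishes
POST /logdata/_update_by_query?pipeline=delete-referrer&wait_for_completion=false

# Check progress; replace <task_id> with the value returned above
GET /_tasks/<task_id>
```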

I tried creating a new index (for Apache log data) with the JSON below:

PUT /logdata
{
  "settings" : {
    "index" : {
      "number_of_shards" : 3,
      "number_of_replicas" : 0
    }
  },
  "mappings" : {
    "doc" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "agent" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "auth" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "bytes" : {
          "type" : "long"
        },
        "clientip" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "geoip" : {
          "properties" : {
            "city_name" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "continent_code" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "country_code2" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "country_code3" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "country_name" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "dma_code" : {
              "type" : "long"
            },
            "ip" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "latitude" : {
              "type" : "float"
            },
            "location" : {
              "properties" : {
                "lat" : {
                  "type" : "float"
                },
                "lon" : {
                  "type" : "float"
                }
              }
            },
            "longitude" : {
              "type" : "float"
            },
            "postal_code" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "region_code" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "region_name" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "timezone" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "host" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "httpversion" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "ident" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "message" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "path" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "request" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "response" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "timestamp" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "verb" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

I left the referrer field out of the mapping when creating the new index, and then I reindexed into it using the command below:

POST _reindex
{
  "source": {
    "index": "logs"
  },
  "dest": {
    "index": "logdata"
  }
}

In this case the new index gets populated, but the documents still contain the referrer field, just as they do in the logs index.

Where am I going wrong? Here is a sample document from the new index:

    "_index" : "logdata",
    "_type" : "doc",
    "_id" : "Hxh_amsBBJi38DQcDRuN",
    "_score" : 1.0,
    "_source" : {
      "request" : "/livesupport/index.php/chat/usertyping/113/c0ce00de8006ed5057014f402b18fcaaf298febf/true",
      "agent" : """"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36"""",
      "geoip" : {
        "timezone" : "Asia/Kolkata",
        "latitude" : 21.2333,
        "ip" : "122.175.132.60",
        "continent_code" : "AS",
        "city_name" : "Raipur",
        "country_code2" : "IN",
        "country_name" : "India",
        "country_code3" : "IN",
        "location" : {
          "lon" : 81.6333,
          "lat" : 21.2333
        },
        "region_name" : "Chhattisgarh",
        "postal_code" : "492001",
        "longitude" : 81.6333,
        "region_code" : "CT"
      },
      "auth" : "-",
      "ident" : "-",
      "verb" : "POST",
      "referrer" : """"https://orderhealth.in/livesupport/index.php/chat/readoperatormessage/(department)/1/(vid)/55fe5b63af565e94d762/(fullheight)/false/(vid)/55fe5b63af565e94d762?URLReferer=%2F%2Fwww.orderhealth.in%2Findex.php%3Froute%3Daccount%2Faccount&r=%2F%2Fwww.orderhealth.in%2Findex.php%3Froute%3Dcommon%2Fpopup_login&dt=My%20Account"""",
      "path" : "/usr/share/logstash/logs-data/log-file",
      "@timestamp" : "2019-06-18T12:10:39.705Z",
      "response" : "200",
      "bytes" : 22,
      "clientip" : "122.175.132.60",
      "@version" : "1",
      "host" : "ubu1604elk",
      "httpversion" : "1.1",
      "timestamp" : "12/Apr/2018:17:23:25 +0000"
    }
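
A note on why `referrer` may still show up: a mapping describes how fields are indexed, it does not filter `_source`. `_reindex` copies each document's `_source` as-is, and with dynamic mapping left at its default the missing field is simply added to the `logdata` mapping again. A minimal sketch of one way to drop the field during the copy, using a Painless script (the script is an illustration, not something taken from this thread):

```
# Assumption: referrer is a top-level field in every source document
POST _reindex
{
  "source": { "index": "logs" },
  "dest": { "index": "logdata" },
  "script": {
    "lang": "painless",
    "source": "ctx._source.remove('referrer')"
  }
}
```

If the existing `delete-referrer` pipeline removes the field, setting `"pipeline": "delete-referrer"` inside `dest` should have the same effect.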
