Hello,
I'm lost with this question.
I'm playing with Elasticsearch: I loaded Apache logs through Logstash, and now I want to remove those docs and index them again with a new Logstash configuration.
To remove all the log docs I'm using the following bash script:
function formar_bulk_delete
{
    # Turn each hit in a search response into a bulk "delete" action line
    jq -M -c -r '{delete: .hits.hits[] | {_index: ._index, _id: ._id, _type: ._type}}' "$1"
}

i=$(date +%s)
output_search="search_$i.txt"
output_bulk="bulk_$i.txt"

# Open a scroll and fetch the first page of matching docs
curl -s -o "$output_search" -x '' \
"http://192.168.2.192:9200/_search?scroll=1m&size=100" -d '
{
  "query": {
    "match": {
      "source": "httpd"
    }
  }
}
'

# Extract the scroll id from the response; without this, $id below stays empty
id=$(jq -r '._scroll_id' "$output_search")

formar_bulk_delete "$output_search" > "bulk_data_$i.txt"
curl -s -o "$output_bulk" -x '' -XPOST "http://192.168.2.192:9200/_bulk" --data-binary "@bulk_data_$i.txt"

# Switch to a fresh set of files before fetching the next page
let i=$i+1
output_search="search_$i.txt"
output_bulk="bulk_$i.txt"

# Fetch the next page of the scroll
curl -s -x '' -XGET http://192.168.2.192:9200/_search/scroll -d '
{
  "scroll": "1m",
  "scroll_id": "'"$id"'"
}
' > "$output_search"

formar_bulk_delete "$output_search" > "bulk_data_$i.txt"
curl -o "$output_bulk" -s -x '' -XPOST "http://192.168.2.192:9200/_bulk" --data-binary "@bulk_data_$i.txt"
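Just to show what I mean, here is a minimal sketch of wrapping the scroll in a loop until a page comes back empty (same endpoints and function as above; the jq emptiness check is my own assumption about how to detect the last page):

# Sketch: keep scrolling and bulk-deleting until a page has no hits
while true; do
    # Grab the scroll id returned with the previous page
    id=$(jq -r '._scroll_id' "$output_search")
    curl -s -x '' -XGET "http://192.168.2.192:9200/_search/scroll" -d '
{
  "scroll": "1m",
  "scroll_id": "'"$id"'"
}
' > "$output_search"
    # Stop once a page comes back with no hits
    if [ "$(jq '.hits.hits | length' "$output_search")" -eq 0 ]; then
        break
    fi
    formar_bulk_delete "$output_search" > bulk_data.txt
    curl -s -x '' -XPOST "http://192.168.2.192:9200/_bulk" --data-binary @bulk_data.txt > /dev/null
done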
I'm not sure why it is not removing all the docs I want, but after executing the script I get an error when I search for an item that was removed:
curl -s -x '' http://192.168.2.192:9200/_search -d '{ "query": { "match": { "_id" : "AVMdWYmDtHY6VebGk5W5" } } }' > fallo_al_buscar_item_eliminado-resultado.json
jq '{"_shards failed": ._shards.failed,"_shards sucessful": ._shards.successful,"shard failure": ._shards.failures[0]}' fallo_al_buscar_item_eliminado-resultado.json
{
"_shards failed": 198,
"_shards sucessful": 1928,
"shard failure": {
"shard": 2,
"index": "logstash-2016.01.15",
"node": "trVvv6kYQUG6MrSagqFUgQ",
"reason": {
"type": "es_rejected_execution_exception",
"reason": "rejected execution of org.elasticsearch.transport.TransportService$4@7195089f on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@24d59183[Running, pool size = 7, active threads = 7, queued tasks = 1000, completed tasks = 567730]]"
}
}
}
After such an error, the same search works fine.
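In case it helps diagnose this, here is the quick check I can run while the script executes to watch the active/queued/rejected counts per thread pool (assuming the _cat API is reachable on the same node):

# Watch thread pool pressure while the scroll + bulk script runs
curl -s -x '' 'http://192.168.2.192:9200/_cat/thread_pool?v'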
Do you have any tips?
I have modified the following parameter so the scroll can go 400 docs at a time; I'm not sure how I can scroll + bulk with larger batches (1000 docs?):
threadpool.bulk.queue_size: 500
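Since the rejection shown above is on the search pool (queue capacity = 1000), I wonder whether the matching search setting is the one that matters here. A guess at the elasticsearch.yml line, assuming it mirrors the syntax of the bulk setting:

# elasticsearch.yml -- assumption: same syntax as the bulk pool setting above
threadpool.search.queue_size: 2000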
Thank you