Failing to search after removing docs

i5513 · February 26, 2016, 12:39pm

Hello,

I'm lost with this question.

I'm experimenting with Elasticsearch. I loaded Apache logs through Logstash, and now I want to remove those logs and load them again with a new Logstash configuration.

To remove all the log docs I'm using the following bash script:

# Turn one page of search results into bulk delete actions, one JSON object per line.
function formar_bulk_delete
{
        jq -M -c -r '{delete: .hits.hits[]|{_index: ._index, _id: ._id, _type: ._type}}' "$1"
}
i=$(date +%s)
output_search="search_$i.txt"
output_bulk="bulk_$i.txt"
# Open the scroll and fetch the first page of matching docs.
curl -s -o "$output_search" -x '' \
        "http://192.168.2.192:9200/_search?scroll=1m&size=100" -d '
{
        "query": {
                "match": {
                        "source": "httpd"
                }
        }
}
'
# Delete the current page, then fetch the next one, until a page has no hits.
while [ "$(jq '.hits.hits | length' "$output_search")" -gt 0 ]
do
        # The scroll id for the next page comes from the previous response.
        id=$(jq -r '._scroll_id' "$output_search")
        formar_bulk_delete "$output_search" > "bulk_data_$i.txt"
        curl -s -o "$output_bulk" -x '' -XPOST "http://192.168.2.192:9200/_bulk" --data-binary "@bulk_data_$i.txt"
        let i=$i+1
        output_search="search_$i.txt"
        output_bulk="bulk_$i.txt"
        curl -s -x '' -XGET http://192.168.2.192:9200/_search/scroll -d '
        {
                "scroll": "1m",
                "scroll_id": "'"$id"'"
        }
' > "$output_search"
done

I'm not sure why it doesn't remove all the docs I want. Also, after running the script, I get an error when I search for an item that was removed:

curl -s -x '' http://192.168.2.192:9200/_search -d '{ "query": { "match": { "_id" : "AVMdWYmDtHY6VebGk5W5" } } }' > fallo_al_buscar_item_eliminado-resultado.json
jq '{"_shards failed": ._shards.failed,"_shards successful": ._shards.successful,"shard failure": ._shards.failures[0]}' fallo_al_buscar_item_eliminado-resultado.json
{
  "_shards failed": 198,
  "_shards successful": 1928,
  "shard failure": {
    "shard": 2,
    "index": "logstash-2016.01.15",
    "node": "trVvv6kYQUG6MrSagqFUgQ",
    "reason": {
      "type": "es_rejected_execution_exception",
      "reason": "rejected execution of org.elasticsearch.transport.TransportService$4@7195089f on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@24d59183[Running, pool size = 7, active threads = 7, queued tasks = 1000, completed tasks = 567730]]"
    }
  }
}

After that error, the same search works fine.

Do you have any tips?

I have changed the following parameter so that I can scroll in batches of 400. I'm not sure how I can scroll + bulk with larger batches (1000 docs?):
threadpool.bulk.queue_size: 500

Thank you

Mark_Harwood · February 26, 2016, 3:37pm

I'm not 100% certain what your script is doing, but I can see that the error means the server is overloaded with pending search requests: 1,000 tasks are queued waiting for a free search thread.
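One possible reading of the numbers in that error, offered as an assumption rather than a diagnosis: a request to /_search with no index name is sent to every shard of every index, and each shard-level query needs a slot in the search thread pool. The response in the question reports 1928 successful and 198 failed shards, so that single search touched 2126 shards, more than the queue capacity of 1000. The sketch below does that arithmetic and builds a search URL restricted to one daily index. The helper names shards_touched and search_url_for_index are hypothetical; the host is the one used in the thread.

```shell
# Host taken from the thread; adjust for your own cluster.
ES_HOST="http://192.168.2.192:9200"

# Shards touched by one search = successful + failed (both from the _shards section).
shards_touched() {
    echo $(( $1 + $2 ))
}

# Build a search URL limited to a single index instead of every index in the cluster.
search_url_for_index() {
    printf '%s/%s/_search\n' "$ES_HOST" "$1"
}
```

For example, `shards_touched 1928 198` prints 2126, and `curl -s -x '' "$(search_url_for_index logstash-2016.01.15)" -d '{ "query": { "match": { "_id" : "AVMdWYmDtHY6VebGk5W5" } } }'` would only queue work for the shards of that one index.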

Generally in logging scenarios mass deletion of content is achieved by dropping whole indices rather than individual document deletes. This means that you organise log records into "time-based indices" (e.g. one per day) and use an alias to control what indices are seen as current. Old indices can then be deleted. Check out https://www.elastic.co/guide/en/elasticsearch/guide/master/time-based.html
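As a rough sketch of that approach: with daily indices named like the logstash-2016.01.15 seen in the error, a whole day of logs can be dropped with one DELETE request on the index. The function names below are hypothetical, and the sketch assumes the logstash-YYYY.MM.dd naming pattern. Note that this removes every document in that index, not only those with source "httpd"; deleting just the Apache logs this way would require routing them to their own indices, which is an assumption about a changed Logstash configuration.

```shell
# Host taken from the thread; adjust for your own cluster.
ES_HOST="http://192.168.2.192:9200"

# Convert an ISO date (2016-01-15) into a daily index name (logstash-2016.01.15).
logstash_index_for_date() {
    printf 'logstash-%s\n' "$(printf '%s' "$1" | sed 's/-/./g')"
}

# Delete the whole daily index for the given date (irreversible).
delete_daily_index() {
    curl -s -x '' -XDELETE "$ES_HOST/$(logstash_index_for_date "$1")"
}
```

Calling `delete_daily_index 2016-01-15` would send a single request, instead of one scroll plus one bulk request per page of documents.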

i5513 · February 26, 2016, 9:40pm

Thank you!!

But how can I see where those searches are coming from?

Currently Elasticsearch is only populated by 5 or 6 servers that send their logs.

Now I'm populating it with new logs and removing them with scroll + bulk (that is what my script does), to test and experiment with the Logstash configuration.

My bulk requests contain only 100 actions each. How can I find out what Elasticsearch is busy with?

Thank you very much. Your help is appreciated

i5513 · February 26, 2016, 10:05pm

I found documentation on how to see what is pending: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-pending.html

I'm not sure whether I can use that API when the queue is full, and I'm not sure why there are 1000 pending tasks.
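A note on that API, as far as I understand it: the pending cluster tasks API lists cluster-level changes (create index, update mapping, shard allocation and so on), not the per-node search queue that appears in the error. The search queue is exposed by the cat thread pool API instead. The sketch below assumes the column names of the 2.x-era _cat/thread_pool output (search.active, search.queue, search.rejected), which changed in later releases; the function names are hypothetical.

```shell
# Host taken from the thread; adjust for your own cluster.
ES_HOST="http://192.168.2.192:9200"

# Print host, active search threads, queued searches and rejected searches per node.
search_pool_status() {
    curl -s -x '' "$ES_HOST/_cat/thread_pool?h=host,search.active,search.queue,search.rejected"
}

# Read lines in the format above on stdin and print hosts with at least one rejection.
# "$4 + 0" forces a numeric comparison, so a header line is skipped as well.
nodes_with_rejections() {
    awk '$4 + 0 > 0 { print $1 }'
}
```

Running `search_pool_status | nodes_with_rejections` while the delete script is active would show which nodes are rejecting searches. The nodes hot threads API (`GET /_nodes/hot_threads`) may also help to see what the busy threads are doing.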

Thank you!
