Cleaning out old data


(Johnathan Phan) #1

Hi There,

I am having issues cleaning out old data from my ES cluster.

I have installed Sense "Chrome plugin" to make requests to my cluster.

I am trying to delete documents between specific dates. I have 3 ES
nodes which all show up correctly in my bigdesk view.

The issue is that when I run the following delete request, the output
suggests that the information was deleted, yet the documents are still there.

DELETE /_all/_query
{
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "and": [
        {
          "range": {
            "Time": {
              "from": "20120101T000000",
              "to": "20130901T000000"
            }
          }
        },
        {
          "term": {
            "Source": 1
          }
        }
      ]
    }
  }
}

I suspect the issue is down to the fact that I am running an ES cluster with
3 nodes. My hypothesis is that the delete request was only performed on the
node the request went to. (The URL I am using is a DNS name that is load
balanced via an Apache proxy to one of the three ES nodes.)

Unless my understanding is incorrect, I thought the Delete API would handle
the request and forward the deletes to all nodes, removing the matching
entries from all indices, since I used the _all index. I arrived at the
hypothesis above because the indices listed in the response are not all
the indices on the system.

Can anyone offer any guidance on the request I need to construct to delete
entries before a specific timestamp?

Regards

John

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Johnathan Phan-2) #2

Correction to the JSON I posted:

{
  "filter": {
    "and": [
      {
        "range": {
          "Time": {
            "from": "20120101T000000",
            "to": "20120731T000000"
          }
        }
      },
      {
        "term": {
          "Source": 1
        }
      }
    ]
  }
}

(In reply to the original post of Mon, Nov 25, 2013, 1:29 PM.)


(Johnathan Phan) #3

To test things, I have changed my filter as follows:

DELETE /logstash-2013.07.31/_query
{
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "and": [
        {
          "range": {
            "Time": {
              "from": "20120101T000000",
              "to": "20130901T000000"
            }
          }
        },
        {
          "term": {
            "Type": "Akamai"
          }
        }
      ]
    }
  }
}

I get the following output:

{
  "ok": true,
  "_indices": {
    "logstash-2013.07.31": {
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      }
    }
  }
}
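
The response shown above contains only per-shard counts and no field for the number of documents removed, so "ok": true does not by itself show that anything matched the filter. As a minimal sketch (the function name `failed_shards` is hypothetical, and the response shape is assumed to be exactly the one pasted above), the shard section can be checked like this:

```python
def failed_shards(response):
    """Return {index_name: failed_count} for every index reporting a failed shard.

    Assumes the response shape pasted in this thread:
    {"ok": ..., "_indices": {"<index>": {"_shards": {"total", "successful", "failed"}}}}
    """
    failures = {}
    for index_name, info in response.get("_indices", {}).items():
        failed = info["_shards"]["failed"]
        if failed > 0:
            failures[index_name] = failed
    return failures


# The response from the post above, copied verbatim.
sample = {
    "ok": True,
    "_indices": {
        "logstash-2013.07.31": {
            "_shards": {"total": 5, "successful": 5, "failed": 0}
        }
    },
}

print(failed_shards(sample))  # {}
```

An empty result only means that every shard accepted the request. Whether any document matched has to be checked separately, for example by running the same filter as a search first.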

There are no errors in the Elasticsearch logs. I don't know what the issue
is or why the documents are not being deleted.

I don't even want to put the type in.

I even tried:

DELETE /logstash-2013.07.31/_query
{
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "range": {
        "Time": {
          "from": "20120101T000000",
          "to": "20130901T000000"
        }
      }
    }
  }
}

and I get:

{
  "ok": true,
  "_indices": {
    "logstash-2013.07.31": {
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      }
    }
  }
}

I can't see why the documents don't get deleted.
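
The tests above target a single daily index, logstash-2013.07.31. To repeat a request over several days instead of using _all, the index names can be generated from a date range. This is only a sketch: the helper name `daily_index_names` is hypothetical, and the "logstash-YYYY.MM.DD" naming is assumed from the index name used in this thread.

```python
from datetime import date, timedelta


def daily_index_names(start, end, prefix="logstash-"):
    """List daily index names from start to end, both inclusive.

    The "<prefix>YYYY.MM.DD" pattern is an assumption based on the
    index name "logstash-2013.07.31" seen in this thread.
    """
    names = []
    day = start
    while day <= end:
        names.append(prefix + day.strftime("%Y.%m.%d"))
        day += timedelta(days=1)
    return names


for name in daily_index_names(date(2013, 7, 30), date(2013, 8, 1)):
    # Each name can be substituted into "DELETE /<name>/_query".
    print(name)
```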



(Johnathan Phan) #4

For future generations to come:

Ignore the examples online that use Time.

Use @timestamp as the field to match, as this is what events are mapped
against in Kibana3/logstash 1.20.1 and ES 0.90.4. Be especially careful with
the time format in your request, as it is case, space and character
sensitive.

DELETE /logstash-2013.07.31/_query
{
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "range": {
        "@timestamp": {
          "from": "2012-01-01T00:00:00.000Z",
          "to": "2013-09-01T00:00:00.000Z"
        }
      }
    }
  }
}
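
Since the timestamp format is the part that is easy to get wrong, it may help to generate it rather than type it. The sketch below builds the same request body as above from Python datetime values. The helper names `to_es_timestamp` and `build_range_query` are hypothetical, and naive datetimes are assumed to already be in UTC.

```python
import json
from datetime import datetime, timezone


def to_es_timestamp(dt):
    """Format a datetime like 2012-01-01T00:00:00.000Z (UTC, milliseconds).

    Assumption: a datetime without tzinfo is already in UTC.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    dt = dt.astimezone(timezone.utc)
    millis = dt.microsecond // 1000
    return dt.strftime("%Y-%m-%dT%H:%M:%S") + ".%03dZ" % millis


def build_range_query(start, end, field="@timestamp"):
    """Build the filtered/range body used in the working request above."""
    return {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {
                "range": {
                    field: {
                        "from": to_es_timestamp(start),
                        "to": to_es_timestamp(end),
                    }
                }
            },
        }
    }


body = build_range_query(datetime(2012, 1, 1), datetime(2013, 9, 1))
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Sense as the body of the DELETE request. The field name is passed through unchanged, so it must match the mapping exactly, including case.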


