How to search indexes that are created per day

Hi,
I plan to create a new index every day, named yyyymmdd, so there will be
about 30 indexes in a month. Which mechanism should be used to search across
indexes? For instance, I need to search for "a" in the indexes for the first
10 days of a month.

  1. Use _all to search all indexes and specify the date range in the query
    string. Will it take more time to look up all the indexes?
  2. Search multiple indexes in the URL, like
    http://ip:9200/20130701,20130702,.... , but it may not be easy to
    list all the index names one by one.
  3. Create an alias for the days, but the search range is not static, so it
    is not easy to create the exact alias.

What is the best way to handle this kind of search?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Depending on the time frame you are searching, you can select multiple
indices as you mentioned in your point number 2. That's how it's done by
default in logstash/Kibana.
The first method you mentioned will be horribly inefficient and largely
defeats the purpose of using rolling indices.
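
To avoid typing the index names one by one, the list can be generated from the date range. The following is only a minimal sketch in Python, assuming the yyyymmdd naming from the question; the helper names `daily_index_names` and `search_url` are made up for illustration, and `http://ip:9200` is the placeholder host used above.

```python
from datetime import date, timedelta


def daily_index_names(start, end):
    """Return one index name (yyyymmdd) per day from start to end, inclusive."""
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days + 1
    return [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(days)]


def search_url(base_url, start, end):
    """Build a multi-index _search URL by joining the index names with commas."""
    names = ",".join(daily_index_names(start, end))
    return "{}/{}/_search".format(base_url.rstrip("/"), names)


# First 10 days of July 2013; the host is a placeholder, not a real address.
print(search_url("http://ip:9200", date(2013, 7, 1), date(2013, 7, 10)))
```

The same helper works for any start and end date, so a non-static search range does not require a pre-built alias.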

(In reply to lijionly@gmail.com, Wed, Jul 3, 2013 at 7:57 AM.)

--
Regards,
Abhijeet Rastogi (shadyabhi)
http://blog.abhijeetr.com

Thanks,

If there are a lot of indexes, like 60 days' worth, will performance be
impacted by searching so many of them?
Is there a better way to search across multiple days?

For example, I need to search from 2013-03-12T15:23:11 to
2013-05-23T13:23:11. Should I add all the indexes to the search URL, like:
http://ip:9200/20130312,20130313....20130523/_search
and then set the JSON body like this:
{
  "query": {
    "query_string": {
      "query": "$1"
    }
  },
  "filter": {
    "range": {
      "LogDate": {
        "from": "2013-03-12T15:25:10",
        "to": "2013-03-12T15:25:13"
      }
    }
  }
}

Is this an efficient way?

Basically, your aim should be to reduce the number of shards you execute
your query on. That's why executing your query on all indices is a bad
idea: you'll be executing it on shards where you know the data surely
doesn't exist. If your indices are daily indices, it's a perfectly fine
way of doing it.
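
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes 5 shards per index (the default mentioned in this thread) and that a search touches one copy of every shard of every index it names; the function name `shards_queried` and the one-year retention figure are hypothetical, chosen only for comparison.

```python
from datetime import date

SHARDS_PER_INDEX = 5  # assumed default; check the settings of your own indices


def shards_queried(num_indices, shards_per_index=SHARDS_PER_INDEX):
    """Estimate how many shards a search touches across the named indices."""
    return num_indices * shards_per_index


# Daily indices covered by the example range 2013-03-12 .. 2013-05-23, inclusive.
days_in_range = (date(2013, 5, 23) - date(2013, 3, 12)).days + 1

print(days_in_range, shards_queried(days_in_range))

# Hypothetical comparison: searching a full year of retained daily indices.
print(shards_queried(365))
```

Under these assumptions the example range covers 73 daily indices, or 365 shards, while searching a hypothetical year of indices would touch 1825.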

(In reply to lijionly@gmail.com, Wed, Jul 3, 2013 at 1:30 PM.)

--
Regards,
Abhijeet Rastogi (shadyabhi)
http://blog.abhijeetr.com

Thank you for your answer.

Actually, I'm confused by the concept of shards:

  1. Why should there be shards? I understand replicas are for failover;
    what are shards for?
  2. Is one index stored in just 1 shard, or is it split across multiple shards?
  3. How much improvement do shards bring to ES, since there are 5 shards
    by default? That may not help much.

An index by default consists of 5 shards and one replica. A shard is a way of
dividing the data into multiple independent units (each shard is a Lucene
index). Shards play an important role in horizontal scaling, as they spread
your index data across multiple nodes. Starting with more shards gives you
the flexibility to add new nodes when required, since the number of shards
cannot be changed for an index once it's created. (Actually, that's not
entirely true: you could also create one more index and a common alias for
both indices, which is effectively equivalent.)

When I said you should try to execute your queries on the fewest possible
shards, I really meant indices. I said shards because they are what
actually stores the indexed data (a Lucene index), so I was being more
explicit.
Having too many shards on too few nodes is also pointless, as it creates
overhead to maintain many Lucene indices on a small number of nodes.
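
As an illustration of the alias workaround, here is a small Python sketch that builds a request body for the `_aliases` endpoint, adding two indices to one alias. The names `logs_v1`, `logs_v2`, and `logs_all` are hypothetical, and the helper `alias_actions` is not part of any client library; it only assembles the JSON you would POST yourself.

```python
import json


def alias_actions(alias, indices):
    """Build an _aliases request body that adds every given index to one alias."""
    return {"actions": [{"add": {"index": name, "alias": alias}} for name in indices]}


# Hypothetical names: an original index, a second one added later, and a shared alias.
body = alias_actions("logs_all", ["logs_v1", "logs_v2"])
print(json.dumps(body, indent=2))
```

Searching the alias then covers the shards of both indices, which is how a second index can stand in for adding shards to the first.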

(In reply to lijionly@gmail.com, Thu, Jul 4, 2013 at 7:30 AM.)

--
Regards,
Abhijeet Rastogi (shadyabhi)
http://blog.abhijeetr.com

So specifying which shard an index operation should go to is beneficial with
multiple nodes, but on a single node it uses 5 shards by default, and it's
not necessary to specify the shard?
