Percolation while indexing to rolling indexes (logs)

Here's an interesting problem.

Our system is built on Elasticsearch and uses the rolling indexing
technique - that is, we have an index per time period. This approach is
also known in this forum as "indexing logs".

We use the percolation feature of Elasticsearch, and are interested in
percolating while indexing. The problem with doing that in this setup is
that percolator queries are registered against a specific index, and that's
not necessarily the index we are indexing to.

The immediate solution is to register all queries against every index we
have in the system, but that's just ridiculous: queries are updated and
removed over time, and each change would have to be repeated on every index.
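
To put a number on that, here is a minimal sketch in Python. It assumes
daily indexes with a hypothetical "logs-" prefix, made-up query names, and
the registration layout the percolator uses as far as I understand it
(PUT /_percolator/{index}/{query name}). It only builds the paths, it does
not talk to a cluster.

```python
from datetime import date, timedelta


def rolling_index_name(day, prefix="logs"):
    """Name of the daily index a document dated `day` is written to.

    The prefix and the daily granularity are assumptions for illustration.
    """
    return f"{prefix}-{day:%Y.%m.%d}"


def registration_paths(query_names, days, prefix="logs"):
    """One registration path per (index, query) pair."""
    return [
        f"/_percolator/{rolling_index_name(day, prefix)}/{name}"
        for day in days
        for name in query_names
    ]


start = date(2013, 1, 28)
week = [start + timedelta(days=i) for i in range(7)]
paths = registration_paths(["errors", "slow-requests"], week)

print(len(paths))  # 14 registrations for 2 queries over 7 daily indexes
print(paths[0])    # /_percolator/logs-2013.01.28/errors
```

Every update or removal of a query has to touch the same number of paths
again, and every new daily index needs the whole set registered before the
first document arrives.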

What we want to have is a pseudo index we can associate all queries with,
and load them from the percolator if an appropriate flag was set in the
percolation / indexing request. A wildcard-based solution could work as
well.

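I'm looking at this line:
https://github.com/elasticsearch/elasticsearch/blob/ea9a4d70cf7140d6cf5c3c12e59fca0718164f4a/src/main/java/org/elasticsearch/index/percolator/PercolatorService.java#L126

indexName there should be set to the pseudo-index that all queries are
registered against.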

There may be better ways to do that - we would appreciate feedback from ES
devs and community. If this is the best way to go, we would be happy to
provide a pull request adding that feature.

Itamar.

Hey,

Since every index has the same mapping, I would suggest creating an empty
index and registering your percolator queries against it.
Instead of adding the percolate parameter to the index request, you could
simply query your percolator index to check whether your document matches
one or more queries.

But anyway, there is no way to percolate and index at the same time with
rolling indexes.
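
A rough sketch of that two-step flow in Python, for illustration only. The
index name "percolator-queries", the type "event" and the "logs-" prefix
are all hypothetical, and the percolate endpoint shape
(GET /{index}/{type}/_percolate with a "doc" body) is what I understand the
current API to be. The function only builds the two requests so the
sequence is visible; sending them is left to whatever HTTP client you use.

```python
import json
from datetime import date

QUERY_INDEX = "percolator-queries"  # hypothetical empty index holding the queries


def two_step_requests(doc, day, doc_type="event", prefix="logs"):
    """Return (method, path, body) for percolate-then-index.

    Step 1 percolates the document against the dedicated query index.
    Step 2 indexes the same document into the rolling index for `day`.
    """
    percolate = (
        "GET",
        f"/{QUERY_INDEX}/{doc_type}/_percolate",
        json.dumps({"doc": doc}, sort_keys=True),
    )
    index = (
        "POST",  # POST without an id lets Elasticsearch generate one
        f"/{prefix}-{day:%Y.%m.%d}/{doc_type}",
        json.dumps(doc, sort_keys=True),
    )
    return [percolate, index]


for method, path, body in two_step_requests({"message": "disk full"}, date(2013, 1, 29)):
    print(method, path, body)
```

The cost is visible in the return value: every document is serialized and
sent twice.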

On Monday, January 28, 2013 at 07:44:22 UTC-5, Itamar Syn-Hershko wrote:

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

This is what we do now, but it's twice the work. Hence my question about the
best way to make this possible while indexing, with rolling indexes as well.

On Tue, Jan 29, 2013 at 5:48 PM, Loïc Bertron <loic.bertron@gmail.com> wrote:

Hi again,

I ended up with what I believe to be a simple and elegant solution,
although I might have gotten some of the naming wrong.

Here's a pull request:

What I did was define a new "system index", which I named _global.
Registering queries against it ensures they participate in all percolation
operations within the cluster, along with the queries registered against
the original index.

To make this work you need to define an index named "_global" and give it
some mapping. Since this is intended to be used in clusters employing the
"rolling indices" pattern, it is safe to assume the mapping of _global
will match the mapping of any other index in the system, and this is what
I did.
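
To illustrate the intended semantics (this is a toy model in Python, not
the code in the pull request), think of the registered queries as a map
from index name to named predicates. The registry contents, index names
and the rule that an index-specific query wins on a name clash are
assumptions made for the example.

```python
GLOBAL_INDEX = "_global"


def percolate(registry, index, doc):
    """Return the sorted names of the queries that match `doc`.

    `registry` maps an index name to {query name: predicate}. Queries
    registered against `index` and against GLOBAL_INDEX both take part.
    """
    candidates = {}
    candidates.update(registry.get(GLOBAL_INDEX, {}))
    # Assumption: on a name clash the index-specific query replaces the global one.
    candidates.update(registry.get(index, {}))
    return sorted(name for name, matches in candidates.items() if matches(doc))


registry = {
    "_global": {"errors": lambda d: d.get("level") == "error"},
    "logs-2013.01.29": {"from-web": lambda d: d.get("source") == "web"},
}
doc = {"level": "error", "source": "web"}

print(percolate(registry, "logs-2013.01.29", doc))  # ['errors', 'from-web']
print(percolate(registry, "logs-2013.01.30", doc))  # ['errors']
```

The second call is the rolling-index case: a brand new daily index with no
queries of its own still gets the globally registered ones.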

Comments welcome.

Itamar.

On Tue, Jan 29, 2013 at 6:10 PM, Itamar Syn-Hershko <itamar@code972.com> wrote:
