Which Request for index level operations?


(Ivan Brusic) #1

I am writing a plugin to refresh index settings (analyzer internals) and was
wondering which type of ActionRequest I should subclass.

BroadcastOperationRequests work at the shard level, so the action would be
executed too many times for my purposes. SingleCustomOperation seems more
appropriate, but I was hoping to find a class with a name more like
IndexOperationRequest.

Cheers,
Ivan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jörg Prante) #2

Have a look at the index refresh action implementation and you will see
that it's implemented by broadcasting the request to the shards of the
index.

What I do not understand is why it would be called "too many times".

You have to refresh each shard, and that is exactly what a broadcast
operation will do. The name "broadcast" may be a little unfortunate; it
just means simultaneous execution on nodes which hold the relevant shards.

Jörg



(Ivan Brusic) #3

Thanks Jörg.

I see two conflicting statements in your response, which I already ran into
when going through the code:
"broadcasting the request to the shards of the index"
"simultaneous execution on nodes which hold the relevant shards"

Is the request executed once per node if that node has more than one shard
for the specific index? Or is the request executed once per shard (my
definition of "too many times")?

Ultimately, I need to rethink my approach. I am converting a plugin that
refreshes analyzers at a configured interval into one that is action-based.
The approach is a bit different (plus, I haven't touched this code in a
while!). Perhaps I need to think at the cluster level.

Cheers,

Ivan

(In reply to joergprante@gmail.com, Tue, Sep 10, 2013 at 2:00 PM)


(Jörg Prante) #4

The broadcast works by enumerating the shards, and yes, if a node hosts
more than one shard, that node has to execute multiple actions, one per
shard. You can't execute fewer actions. Executing actions on a node in
parallel is no real problem, since we have multicore CPUs, thread pools,
I/O bandwidth, etc., plus the opportunity to add more nodes.
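
To make the per-shard versus per-node distinction concrete, here is a minimal
sketch in plain Java. It does not use any Elasticsearch classes; the class
name FanOutSketch, the node names, and the shard layout are all hypothetical
and only illustrate how the execution counts differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class FanOutSketch {

    /** Executions per node when the action runs once per shard (broadcast style). */
    public static Map<String, Integer> perShardExecutions(Map<Integer, String> shardToNode) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String node : shardToNode.values()) {
            counts.merge(node, 1, Integer::sum); // one execution for every shard the node hosts
        }
        return counts;
    }

    /** Executions per node when the action runs once per node that hosts a shard. */
    public static Map<String, Integer> perNodeExecutions(Map<Integer, String> shardToNode) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String node : shardToNode.values()) {
            counts.put(node, 1); // exactly one execution, however many shards the node hosts
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical layout: five shards of one index spread over two nodes.
        Map<Integer, String> shardToNode = new LinkedHashMap<>();
        shardToNode.put(0, "node-a");
        shardToNode.put(1, "node-b");
        shardToNode.put(2, "node-a");
        shardToNode.put(3, "node-b");
        shardToNode.put(4, "node-a");

        System.out.println("per shard: " + perShardExecutions(shardToNode)); // {node-a=3, node-b=2}
        System.out.println("per node:  " + perNodeExecutions(shardToNode));  // {node-a=1, node-b=1}
    }
}
```

For work that really is shard-local (such as a refresh), the per-shard count
is the minimum. For state shared by all shards on a node, the per-node count
is the one that matters.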

Not sure what you intend by refreshing analyzers. What is the challenge?
Reloading config files?

Jörg



(Ivan Brusic) #5

I have several modified token filter factories that instantiate updated
filters. The contents are derived from a database instead of a flat file
and are refreshed at a set interval. I decided to change the update process
since refreshing periodically is not particularly efficient.

As of now, it is the filter factories' responsibility to update themselves,
but I am going to refactor the code so that the refresh is externalized.
After reviewing the code a bit more, I am going to create a new
NodesOperationRequest with the IndicesAnalysisService injected. Via
the IndicesAnalysisService, I should be able to find the appropriate filter
factories and send a refresh notice.

A NodesOperationRequest will send one request per node, correct? I started
down this path, but there is so much boilerplate code!
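
The externalized refresh described above could look roughly like the
following sketch. It is plain Java and deliberately avoids Elasticsearch
classes, so none of the boilerplate is shown. Refreshable,
StopwordFilterFactory, and NodeRefreshHandler are hypothetical names, and the
in-memory set stands in for the database.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class RefreshSketch {

    /** Hypothetical contract for a factory whose contents can be reloaded on demand. */
    public interface Refreshable {
        void refresh();
    }

    /** Stand-in for a token filter factory backed by an external source. */
    public static class StopwordFilterFactory implements Refreshable {
        private final Supplier<Set<String>> source;
        private final AtomicReference<Set<String>> words = new AtomicReference<>();

        public StopwordFilterFactory(Supplier<Set<String>> source) {
            this.source = source;
            refresh(); // load the initial contents
        }

        @Override
        public void refresh() {
            // Swap in an immutable snapshot so readers never see a half-updated set.
            words.set(Collections.unmodifiableSet(new HashSet<>(source.get())));
        }

        public boolean isStopword(String token) {
            return words.get().contains(token);
        }
    }

    /** Stand-in for the node-level handler, invoked once per node. */
    public static class NodeRefreshHandler {
        private final List<Refreshable> factories = new CopyOnWriteArrayList<>();

        public void register(Refreshable factory) {
            factories.add(factory);
        }

        /** Notifies every registered factory and returns how many were refreshed. */
        public int handleRefreshRequest() {
            for (Refreshable factory : factories) {
                factory.refresh();
            }
            return factories.size();
        }
    }

    public static void main(String[] args) {
        Set<String> database = new HashSet<>(Arrays.asList("the")); // pretend database table
        StopwordFilterFactory factory = new StopwordFilterFactory(() -> database);
        NodeRefreshHandler handler = new NodeRefreshHandler();
        handler.register(factory);

        database.add("an");
        System.out.println(factory.isStopword("an")); // false: snapshot not refreshed yet
        handler.handleRefreshRequest();
        System.out.println(factory.isStopword("an")); // true: refreshed on request
    }
}
```

The factories no longer poll on a timer; they reload only when the node-level
handler receives a request, which is the behavior the refactoring aims for.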

--
Ivan

(In reply to joergprante@gmail.com, Tue, Sep 10, 2013 at 3:02 PM)

