I understand you want to connect from a Java application server to a
remote Elasticsearch cluster?
Your "embedded node" startup (I think you refer to the node client, not
the transport client) takes some seconds because the cluster discovery
takes time. After that discovery (the zen discovery has 5 sec timeout), the
node is part of the cluster. In the "client mode" you refer to, the
node is made invisible to the other cluster members.
Quick answers to your questions:
1) Yes and no, it depends on what you want to do. You are right, not every
threadpool is required for client node operation.
2) The number of modules and services is not proportional to the memory
consumption, and therefore you can't avoid gc just by disabling modules or
services. Memory pressure arises when you start indexing and searching.
If I understand your question correctly, you are wondering why
discovery takes place and why a client by default includes a lot of
functionality covered in services and modules, even when all you want to do
There are several options:
- design a Settings parameter for minimal resource setup (not the empty
  setting; unfortunately there is no written guide I know of for the
  most minimal setting, so you have to study the guide)
- connect with a TransportClient. The difference from the NodeClient is that
  a transport client is designed for remote access: it has no direct access
  to the cluster state objects (mapping, indices), it manages cluster
  connections explicitly by network addressing, and it organizes node
  failover in the background
- use HTTP REST from Java, e.g. by using the Jest client
- use my experimental websocket client
  via bulk operations available, it requires the websocket transport
With HTTP REST and the websocket client, there is no startup, no cluster
membership, no discovery, no plugins, no services at all. So you have to
manage the submission of actions, the evaluation of responses, and the
failover of the node connections by yourself. This is how script languages
like Perl/Python/Ruby connect to ES.
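
To give an idea of what "by yourself" means for the HTTP REST option, here is a minimal sketch using only the JDK (java.net.http, so Java 11 or later), not Jest. The class and method names (RestBulkSketch, bulkIndexBody, postWithFailover) are made up for this example. It assumes the index name, type and ids need no JSON escaping, that each document is already a single-line JSON string, and that the target nodes accept a `_type` field in the bulk action line. Failover here is simply "try the next node in the list", nothing more.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;

// Hypothetical helper, not part of Elasticsearch or Jest.
class RestBulkSketch {

    // Builds a bulk body: one action line plus one source line per document,
    // each terminated by a newline. Each entry is {id, single-line JSON source}.
    // Assumption: index, type and id contain no characters that need escaping.
    static String bulkIndexBody(String index, String type, List<String[]> idAndJson) {
        StringBuilder sb = new StringBuilder();
        for (String[] doc : idAndJson) {
            sb.append("{\"index\":{\"_index\":\"").append(index)
              .append("\",\"_type\":\"").append(type)
              .append("\",\"_id\":\"").append(doc[0]).append("\"}}\n");
            sb.append(doc[1]).append('\n');
        }
        return sb.toString();
    }

    // Posts the body to each node in turn and returns the first 2xx response body.
    // If every node fails, the last failure is rethrown.
    static String postWithFailover(List<String> nodes, String path, String body)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        IOException last = null;
        for (String node : nodes) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(node + path))
                    .timeout(Duration.ofSeconds(10))
                    .header("Content-Type", "application/x-ndjson")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();
            try {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() / 100 == 2) {
                    return response.body();
                }
                last = new IOException("HTTP " + response.statusCode() + " from " + node);
            } catch (IOException e) {
                last = e; // connection refused, timeout, etc.: try the next node
            }
        }
        throw last != null ? last : new IOException("no nodes configured");
    }
}
```

You would call it with something like postWithFailover(List.of("http://es1:9200", "http://es2:9200"), "/_bulk", body), where the host names are placeholders. Note that a 2xx status only tells you the bulk request was accepted; the response body still lists a result per item, and evaluating that is up to you.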
On Wednesday, October 31, 2012 9:45:37 AM UTC+1, amjath khan wrote:
As part of my application deployment in an application server, I am
creating an embedded Node. This Node will be used to perform index related
operations from my web application. As per the log, it takes around 5-8
secs to create and start a node, which includes the other entities like
modules, thread pools, etc.
1) Are all these modules and services required for the Node?
2) Is there a way to control the modules being initialized at a Node? For
example, I do not want to start gc; what should I do?
The Node will be used only to perform insert/update/delete operations on
the indices. It is created in client mode (no data will be stored).