Installing alongside Cassandra

We are looking at using Elasticsearch to index our data that we
currently store in Cassandra. I was wondering if there are any
concerns with running Elasticsearch on the same nodes that we use for
Cassandra?

Also which ports are required to be opened for proper communication
from node to node and client to node?

Anthony

On Mon, Oct 17, 2011 at 11:21 PM, Anthony Ikeda <anthony.ikeda.dev@gmail.com> wrote:

We are looking at using Elasticsearch to index our data that we
currently store to Cassandra. I was wondering if there are any
concerns running Elasticsearch on the same nodes that we use for
Cassandra?

Running them on the same machine is possible, but they will affect each
other, since they compete for the same resources (IO, network, CPU).

Also which ports are required to be opened for proper communication
from node to node and client to node?

By default, Elasticsearch uses port 9300 for node-to-node and Java API
communication, and port 9200 for the HTTP endpoint.
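To rule out a blocked port between machines, something like the following could be run from the client host and from each node. It is only a minimal sketch using the JDK's plain sockets: the class name PortCheck, the helper isReachable, and the 2-second timeout are illustrative choices, not part of Elasticsearch, and a successful TCP connect only shows that the port is open, not that the cluster is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    // Any I/O failure (refused, timed out, unknown host) is reported as false.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Host defaults to localhost; pass a node address as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        // 9300 = transport (node to node, Java API), 9200 = HTTP, per the defaults above.
        int[] ports = {9300, 9200};
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
        }
    }
}
```

Running it as `java PortCheck 10.130.202.34` from the client machine would show whether both default ports accept connections from there.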

Anthony

Thanks Shay. As for the ports: I set up a basic cluster in our dev
environment and, as far as I know, there are no blocked ports, but trying to
index any data leaves the client hanging. No errors are reported, hence the
question about which ports need to be opened.

Running locally works fine though.

Anthony

Which client are you using? Have the different nodes found each other and
formed a cluster (you can see that in the logs, or the cluster state API)?

One of the reasons why it might "hang" (not really hang, but wait for a
timeout of 1m) is if there aren't enough active shards for the document to
be indexed. This can happen, for example, if you have 1 node, set
number_of_replicas to 2 (3 copies), and then try to index a doc. By
default, it expects a quorum of shard copies to be active. See write
consistency in the index API docs.
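The arithmetic behind that example can be sketched as follows. This is an illustration, not Elasticsearch code: the class and method names are made up, and it assumes the documented rule for the quorum write consistency level in versions of that era, where the required count is (copies / 2) + 1 and the quorum is only enforced when a shard has more than two copies (number_of_replicas greater than 1).

```java
public class WriteQuorum {

    // Number of active shard copies needed before a write proceeds under
    // "quorum" consistency. copies = 1 primary + numberOfReplicas.
    // Assumption: with 1 or 2 copies no quorum is enforced, so 1 active copy is enough.
    static int requiredActiveCopies(int numberOfReplicas) {
        int copies = numberOfReplicas + 1;
        if (copies <= 2) {
            return 1;
        }
        return copies / 2 + 1;
    }

    // True if a write could proceed with the given number of active copies.
    static boolean canIndex(int activeCopies, int numberOfReplicas) {
        return activeCopies >= requiredActiveCopies(numberOfReplicas);
    }

    public static void main(String[] args) {
        // 1 node, number_of_replicas = 2: only the primary can be active.
        System.out.println("required: " + requiredActiveCopies(2)); // 3 copies need 2 active
        System.out.println("can index: " + canIndex(1, 2));         // false, so the request waits
    }
}
```

With a single node and number_of_replicas set to 2, only one copy can ever be allocated, so the request waits for the timeout instead of indexing.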

I am using the Java API.

I seem to get these RemoteTransportExceptions:

org.elasticsearch.transport.RemoteTransportException: [Georgianna Castleberry][inet[/10.130.202.34:9300]][indices/index/shard/index]
Caused by: org.elasticsearch.action.UnavailableShardsException: [registry][4] [2] shardIt, [0] active : Timeout waiting for [1m], request: index...
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:455)
195297 [main][au.com.ikeda.testing.ground.foundation.service.TestRegistryIndexer.testRegistryIndexer(TestRegistryIndexer.java:44)] DEBUG au.com.ikeda.testing.ground.foundation.service.TestRegistryIndexer - Indexed [REG:e3a05447-d481-406a-a135-627c21d0c903]
    at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:305)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

It just seems to sit there; every now and then it will proceed, but then it
hangs again.

Anthony

How do you construct the client? Are you sure it's connected to the cluster?
When you create the registry index, has it been properly allocated (check
the cluster state or health API)?
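One way to check this without going through the Java client is to query the cluster health endpoint over HTTP on port 9200. The sketch below uses only the JDK; the class name ClusterHealthCheck, the helper names, and the timeouts are illustrative assumptions, and it presumes the HTTP endpoint is enabled on its default port. The JSON it prints should include fields such as the cluster status and the number of nodes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ClusterHealthCheck {

    // Builds the URL of the cluster health endpoint for one node.
    static String healthUrl(String host, int httpPort) {
        return "http://" + host + ":" + httpPort + "/_cluster/health";
    }

    // Performs a GET request and returns the response body as a string.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(5000);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Host defaults to localhost; pass a node address as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(fetch(healthUrl(host, 9200)));
    }
}
```

If the reported number of nodes is lower than expected, the nodes have probably not discovered each other; if there are unassigned shards, the index has not been fully allocated.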

At the moment I have a Client member variable with a private getClient
method:

public class ElasticRegistryIndexer implements RegistryIndexer {

    private Logger mLog = Logger.getLogger(getClass());

    private Client mClient;

    // Lazily creates the transport client on first use and reuses it afterwards.
    private Client getClient() {
        if (mClient == null) {
            Properties settingsMap = new Properties();
            settingsMap.put("cluster.name", "registry_foundation");
            settingsMap.put("client.transport.sniff", "true");

            Settings settings = ImmutableSettings.settingsBuilder().put(settingsMap).build();
            mClient = new TransportClient(settings);

            // Register the transport (port 9300) addresses of the known nodes.
            ((TransportClient) mClient).addTransportAddress(new InetSocketTransportAddress("10.130.202.34", 9300));
            ((TransportClient) mClient).addTransportAddress(new InetSocketTransportAddress("10.130.202.35", 9300));
            // ((TransportClient) mClient).addTransportAddress(new InetSocketTransportAddress("192.168.202.235", 9300));

            // Ask for the cluster health and log the name of each index it reports.
            ClusterHealthRequest healthRequest = new ClusterHealthRequest();
            ClusterHealthResponse response = mClient.admin().cluster().health(healthRequest).actionGet();
            for (ClusterIndexHealth health : response) {
                mLog.debug(health.getIndex());
            }
        }
        return mClient;
    }

    // (remainder of the class not shown)
}

I haven't been able to find any suitable docs on how to manage the client
properly as yet.

Anthony

OK, so you use the transport client (that's what I wanted to know: whether
it's the transport client or the node client). The way you construct it
looks good.

When you create the registry index, how many shards and how many replicas do
you have? How many nodes do you have in your cluster?
