Load cassandra data into elasticsearch

soujanya · January 7, 2016, 10:02am

Hi all,
My requirement is to load data from a Cassandra column family into Elasticsearch. For that, I have successfully installed the plugins (mapper attachments and cassandra-river) required for integrating Cassandra with Elasticsearch. I then tried to load the Cassandra data into Elasticsearch using curl, as below:
curl -XPUT 'localhost:9200/_river/prodinfo/_meta' -d '{
  "type" : "cassandra",
  "cassandra" : {
    "cluster_name" : "test-cluster",
    "keyspace" : "catalogks",
    "column_family" : "info",
    "batch_size" : 1000,
    "hosts" : "host1:9161",
    "username" : "username",
    "password" : "password"
  },
  "index" : {
    "index" : "prodinfo",
    "type" : "product"
  }
}'
The index is created with the Cassandra credentials I gave above, but it contains no data or columns from the Cassandra keyspace and column family. How can I achieve that? Any help would be greatly appreciated.

Thank you in advance.

dadoonet · January 7, 2016, 12:52pm

Rivers have been removed in elasticsearch 2.0 so you should not use them.

I'd try to inject data in Cassandra and Elasticsearch at the same time in your application layer.

soujanya · January 7, 2016, 12:55pm

Hi David,

Thank you for your prompt reply. Is there any other way to load Cassandra data into Elasticsearch? Please help me resolve this.

dadoonet · January 7, 2016, 12:58pm

I don't know any.

I'd read the data from Cassandra using my application layer and send JSON docs to ES then.
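
As a rough illustration of the "send JSON docs" step, here is a minimal Python sketch. It assumes each Cassandra row has already been read into a plain dict by whatever driver you use; the helper names `to_json_safe` and `row_to_doc` are made up for this example. Cassandra columns often hold UUIDs, timestamps, decimals or sets, which `json.dumps` cannot serialize on its own, so the sketch converts them explicitly. Turning decimals into strings is just one possible choice.

```python
import json
import uuid
from datetime import date, datetime
from decimal import Decimal


def to_json_safe(value):
    """Fallback for json.dumps: convert values JSON cannot represent natively."""
    if isinstance(value, (uuid.UUID, Decimal)):
        return str(value)                  # keep full precision as text
    if isinstance(value, (datetime, date)):
        return value.isoformat()           # ISO 8601 string
    if isinstance(value, (set, frozenset)):
        return sorted(value, key=str)      # JSON has no set type; emit a list
    if isinstance(value, bytes):
        return value.hex()
    raise TypeError(f"cannot serialize {type(value).__name__}")


def row_to_doc(row):
    """Serialize one row (dict of column name -> value) to a JSON document string."""
    return json.dumps(row, default=to_json_safe, sort_keys=True)
```

Each resulting string can then be sent to ES as the body of an index request.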

Maybe you can try a JDBC driver for Cassandra and use the JDBC input plugin from Logstash or the jdbc importer?

soujanya · January 7, 2016, 1:02pm

Thank you for the reply. I'll try using a driver for Cassandra.

ara · January 10, 2016, 12:15pm

Hi Soujanya,

You could use the DataStax driver to connect to Cassandra, select the data you want to import, and push it into ES with an HTTP client and the bulk API.
The DataStax Cassandra driver is available for Java (Maven package or link on the DataStax site) and C# (nuget.org package). The Java driver is a native driver, not JDBC, but usage is quite similar.
Both Java and C# work very well. Just be careful to use a driver version that matches your Cassandra install: changes to some system tables broke things between versions.

For ES, use an HTTP client of your choice (Apache HttpComponents in Java or the standard .NET HttpClient in C#) with the bulk API.
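
To make the bulk step concrete, here is a minimal sketch in Python using only the standard library (the same idea applies with the Java or C# HTTP clients). It assumes the rows have already been read from Cassandra into dicts. The function names, the `id_column` parameter and the default URL are placeholders for this example. The `_type` field matches the "type" used earlier in this thread; recent Elasticsearch versions no longer have mapping types, so leave `doc_type` unset there.

```python
import json
import urllib.request


def build_bulk_body(rows, index, id_column, doc_type=None):
    """Build the newline-delimited body for the _bulk endpoint:
    one action line followed by one document line per row."""
    lines = []
    for row in rows:
        meta = {"_index": index, "_id": str(row[id_column])}
        if doc_type is not None:
            meta["_type"] = doc_type       # only for older ES versions with types
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(row, default=str))
    return "\n".join(lines) + "\n"         # the body must end with a newline


def send_bulk(body, es_url="http://localhost:9200"):
    """POST a bulk body to Elasticsearch and return the parsed JSON response."""
    request = urllib.request.Request(
        es_url + "/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Sending rows in batches of a few hundred to a few thousand per request is a common starting point. The bulk response carries an `errors` flag worth checking, since individual items can fail even when the HTTP call itself succeeds.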

A problem you may have is that Cassandra is a distributed key-value store and has strong limitations on SELECT statements, both in the WHERE clause and for table/partition scans.
If you know the Cassandra data model and can work within it, this is fine. If you don't, and want to do a basic 'SELECT *' (or some columns) without WHERE, I suggest you have a look at:
http://www.myhowto.org/bigdata/2013/11/04/scanning-the-entire-cassandra-column-family-with-cql/
which gives hints about paging the SELECT to deal with cluster partitions.
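
One way to scan a whole table in pieces is to split the token ring into sub-ranges and run one SELECT per range. This is a sketch of that general idea, not necessarily the exact approach of the linked article. It assumes the cluster uses Murmur3Partitioner (tokens from -2^63 to 2^63 - 1) and that the partition key is a single column called `id`, which is a guess for illustration; the function names are made up as well.

```python
MIN_TOKEN = -2**63        # lowest Murmur3Partitioner token (assumed partitioner)
MAX_TOKEN = 2**63 - 1     # highest Murmur3Partitioner token


def token_ranges(splits):
    """Split the full token ring into `splits` contiguous (start, end) ranges."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // splits for i in range(splits)]
    bounds.append(MAX_TOKEN)
    return list(zip(bounds[:-1], bounds[1:]))


def range_queries(table, partition_key, splits):
    """Build one CQL SELECT per token range; together they cover every row once."""
    queries = []
    for i, (start, end) in enumerate(token_ranges(splits)):
        lower = ">=" if i == 0 else ">"   # later ranges exclude their start token
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE token({partition_key}) {lower} {start} "
            f"AND token({partition_key}) <= {end}"
        )
    return queries


if __name__ == "__main__":
    for query in range_queries("catalogks.info", "id", 4):
        print(query)
```

Each query can then be run through the driver, and the rows of each range pushed to ES before moving on to the next, which keeps any single scan small.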

HTH

Regards,

Alain

Chiranjith_Rai · November 30, 2016, 7:25am

Hi David,
What is the reason for removing the river feature from Elasticsearch?
Were there any issues with it?
Are there any alternative solutions for the same operation?

dadoonet · December 3, 2016, 10:11pm

You probably want to read

Chiranjith_Rai · December 5, 2016, 5:06am

Thank you, Mr. David.

Alberto_Sao_Marcos · January 26, 2017, 10:41am

I'm also looking for a solid option to index C* into ES.
Any news on the subject?

sumit_gupta_sgt · March 29, 2017, 5:04pm

@soujanya Did you try the jdbc importer?
