We are building a Java application using Elasticsearch. Currently we are evaluating the client alternatives.
The native NodeClient isn't the right fit, because we don't need full node functionality on our client. Instead, CPU and memory usage should be minimized.
A plain HTTP REST client isn't optimal either. We would like to eliminate the HTTP overhead.
The native TransportClient seems to be the right fit. But it seems to be available only by installing the full artifact (org.elasticsearch:elasticsearch:1.7.2). This includes the full Elasticsearch server with all of the Lucene API. That's too much.
Why isn't there a lean artifact holding just the TransportClient? All the Elasticsearch server code seems not to be required, right? Is there a plan to release a TransportClient-only artifact in the future?
To create such a client, the Elasticsearch source code would have to be refactored in many ways.
From your arguments, I suspect the required functionality is something like a write-only, JSON-only Java client using the native protocol on port 9300.
So, the refactoring would have to include:
- refactoring the client code into "write" and "read" methods, where "write" consists of indexing single docs or docs in bulk, and "read" consists of doc get, Lucene queries, filters, aggregations etc.
- refactoring the "write" code into server-dependent code (like scripting, plugins, modules, services) and "client-only" code (Jackson JSON, Netty, threads, pools, buffers, transport protocol)
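To make the first point more concrete, here is a minimal sketch of what a "write"/"read" split could look like at the interface level. All names here (WriteClient, ReadClient, InMemoryCluster) are hypothetical and are not part of the Elasticsearch API; the in-memory map only stands in for a cluster, and the substring "search" is a placeholder for real Lucene queries. It just illustrates that an application depending only on the write interface would never need to see the read side.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical "write" side: the only type a lean indexing client would depend on.
interface WriteClient {
    void index(String id, String json);       // index a single JSON doc
    int bulk(Map<String, String> docsById);   // index many docs, returns the count
}

// Hypothetical "read" side: get and search would stay with the heavier code.
interface ReadClient {
    String get(String id);                    // returns null if the doc is absent
    List<String> search(String term);         // ids of docs whose JSON contains term
}

// In-memory stand-in for a cluster (an assumption for illustration only).
final class InMemoryCluster implements WriteClient, ReadClient {
    private final Map<String, String> docs = new LinkedHashMap<>();

    @Override
    public void index(String id, String json) {
        docs.put(id, json);
    }

    @Override
    public int bulk(Map<String, String> docsById) {
        docs.putAll(docsById);
        return docsById.size();
    }

    @Override
    public String get(String id) {
        return docs.get(id);
    }

    @Override
    public List<String> search(String term) {
        // Naive substring match, a placeholder for real queries.
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            if (e.getValue().contains(term)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }
}

final class ClientSplitDemo {
    public static void main(String[] args) {
        InMemoryCluster cluster = new InMemoryCluster();

        // A write-only application would be handed just this narrow type.
        WriteClient writer = cluster;
        writer.index("1", "{\"user\":\"alice\"}");

        Map<String, String> batch = new LinkedHashMap<>();
        batch.put("2", "{\"user\":\"bob\"}");
        batch.put("3", "{\"user\":\"alice\"}");
        writer.bulk(batch);

        // The read side is a separate type that the writer never references.
        ReadClient reader = cluster;
        System.out.println(reader.search("alice"));
    }
}
```

In a real refactoring the two interfaces would live in separate artifacts, so that the "write" artifact could drop everything the read path pulls in.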
Such a refactoring would no doubt introduce new pivotal points and require an extra legacy API layer so that existing ES applications can still run. One such modularization project was https://github.com/jprante/elasticsearch-client, which I attempted three years ago with an old version of Elasticsearch.
Many of the exciting new developments for ES 2.0 go in the direction of providing more modular code with fewer and cleaner dependencies. I'm quite confident that a small-footprint transport ES client will become a future project. Maybe for ES 2.x, maybe for ES 3.x.