HTTP REST ElasticSearch client without lucene dependencies

Hi,

I'm using ElasticSearch with JEST. But when I add the Maven dependency for elasticsearch:

    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>1.7.1</version>
        <!-- <exclusions> -->
        <!-- <exclusion> -->
        <!-- <groupId>*</groupId> -->
        <!-- <artifactId>*</artifactId> -->
        <!-- </exclusion> -->
        <!-- </exclusions> -->
    </dependency>
    <dependency>
        <groupId>io.searchbox</groupId>
        <artifactId>jest</artifactId>
        <version>0.1.6</version>
    </dependency>

That pulls in 17+ jars (around 19+ MB).

I'm pretty sure that I'm only using the elasticsearch jars for generating the JSON, so I'm assuming I don't need all those jars. Here's the gist of how I use the ES jars.

    String json = null;
     ....
    FilterBuilder filter = FilterBuilders.andFilter(toArray(filters, FilterBuilder.class));
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.sort("shop.chaosEquiv", SortOrder.ASC);
    json = searchSourceBuilder.toString();

Is there a way to only include dependency for the Query/Filter builders?

No.

JEST does not provide a way to create your queries?

I have only followed the JEST usage documentation on their GitHub site:

Based on this documentation (relevant parts copied below), I think the answer is no: JEST, the ES client, does not provide a way to build queries.

I guess I'll have to copy and paste code from ES. Hopefully that's not very hard to do.
Or I might just accept this reality.
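
For what it's worth, if only a handful of query shapes are needed, the JSON can be assembled by hand with nothing but the JDK. Below is a minimal sketch, not code taken from ES. The class `SearchBodySketch`, its methods, and the field `shop.verified` are made up for illustration, and it assumes the ES 1.x query DSL (a `filtered` query with an `and` filter, plus `sort`).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds an ES 1.x search body as a JSON string using only the JDK. */
final class SearchBodySketch {
    private final List<String> filters = new ArrayList<>();
    private final List<String> sorts = new ArrayList<>();

    /** Wraps a value in quotes, escaping quotes, backslashes, and control characters. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Adds a term filter; all added filters are combined with "and". */
    SearchBodySketch termFilter(String field, String value) {
        filters.add("{\"term\":{" + quote(field) + ":" + quote(value) + "}}");
        return this;
    }

    /** Adds an ascending sort on the given field. */
    SearchBodySketch sortAsc(String field) {
        sorts.add("{" + quote(field) + ":{\"order\":\"asc\"}}");
        return this;
    }

    /** Renders the search body; sections with no entries are left out. */
    String toJson() {
        List<String> parts = new ArrayList<>();
        if (!filters.isEmpty()) {
            parts.add("\"query\":{\"filtered\":{\"filter\":{\"and\":["
                    + String.join(",", filters) + "]}}}");
        }
        if (!sorts.isEmpty()) {
            parts.add("\"sort\":[" + String.join(",", sorts) + "]");
        }
        return "{" + String.join(",", parts) + "}";
    }
}
```

Something like `new SearchBodySketch().termFilter("shop.verified", "yes").sortAsc("shop.chaosEquiv").toJson()` would then replace the `SearchSourceBuilder` call, and the resulting string can be handed to JEST as a plain JSON query. The obvious trade-off is that every query type has to be written and kept in sync with the ES DSL by hand.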

Btw, this is just for my little open source project: http://thirdy.github.io/blackmarket/
It is targeted at non-programmers on consumer PCs. This fan-made tool will hopefully get a lot of features implemented, so a very small download size is very desirable for this use case.

Searching Documents

Search queries can be either a JSON String or created by Elasticsearch's SourceBuilder. Jest works with default Elasticsearch queries; it simply keeps things as is.

...
...

Elasticsearch Optional Dependency

If you want to use Elasticsearch's QueryBuilder or Settings classes, make sure to add the Elasticsearch dependency.

    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>${elasticsearch.version}</version>
    </dependency>
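
Since the search body is just a JSON string, it can also be sent to the REST API directly, without any client library. A minimal sketch with the JDK's `HttpURLConnection` follows. The class `PlainHttpSearch` and its methods are hypothetical names, the index name in the usage note is made up, and the code assumes a node reachable over HTTP (9200 is the default port).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: posts a search body to the _search endpoint using only the JDK. */
final class PlainHttpSearch {

    /** Builds the _search URL for one index, tolerating a trailing slash on the base URL. */
    static String searchUrl(String baseUrl, String index) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return base + "/" + index + "/_search";
    }

    /** Sends the JSON body with POST and returns the raw response body as a string. */
    static String post(String url, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            // On HTTP errors the body is on the error stream, which may be null.
            InputStream in = conn.getResponseCode() < 400
                    ? conn.getInputStream()
                    : conn.getErrorStream();
            if (in == null) {
                return "";
            }
            try (InputStream body = in) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = body.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                return new String(buf.toByteArray(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

A call would look like `PlainHttpSearch.post(PlainHttpSearch.searchUrl("http://localhost:9200", "items"), json)`. This skips everything JEST adds on top (connection pooling, result mapping), so it is only a starting point if download size matters more than convenience.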

OK, Eclipse's m2e plugin really helps here. In its dependency hierarchy viewer, you just right-click a transitive dependency and exclude it.

And with the help of my unit test, here's what I came up with:

        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>1.7.1</version>
            <exclusions>
                <exclusion>
                    <artifactId>asm</artifactId>
                    <groupId>org.ow2.asm</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>asm-commons</artifactId>
                    <groupId>org.ow2.asm</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>antlr-runtime</artifactId>
                    <groupId>org.antlr</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>snakeyaml</artifactId>
                    <groupId>org.yaml</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-highlighter</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-suggest</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-join</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-spatial</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-grouping</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-sandbox</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-misc</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-memory</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-analyzers-common</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-queryparser</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>lucene-queries</artifactId>
                    <groupId>org.apache.lucene</groupId>
                </exclusion>
            </exclusions>
        </dependency>

However, apparently the problem wasn't in the transitive dependencies. It was ES itself! elasticsearch-1.7.1.jar is 13.7 MB! And looking inside this jar, I see Apache Lucene and Joda. How come these are included in the jar itself and not as separate dependencies?
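
One way to check what is bundled inside the jar is to total the entry sizes per package prefix. This is a minimal sketch using only `java.util.jar`. The class `JarPackageSizes` and its method are hypothetical names, and the jar path passed on the command line is whatever your local Maven repository holds.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper: sums compressed entry sizes in a jar, grouped by package prefix. */
final class JarPackageSizes {

    /** Groups entries by their first `depth` directory segments and sums compressed bytes. */
    static Map<String, Long> sizesByPrefix(String jarPath, int depth) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                String[] parts = entry.getName().split("/");
                // The last segment is the file name, so only the ones before it count.
                int dirs = Math.min(depth, parts.length - 1);
                String prefix = dirs == 0
                        ? "(root)"
                        : String.join("/", Arrays.copyOf(parts, dirs));
                // getCompressedSize() returns -1 when unknown; treat that as 0.
                long size = Math.max(entry.getCompressedSize(), 0L);
                totals.merge(prefix, size, Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the jar to inspect.
        for (Map.Entry<String, Long> e : sizesByPrefix(args[0], 3).entrySet()) {
            System.out.printf("%10d  %s%n", e.getValue(), e.getKey());
        }
    }
}
```

Running it against the ES jar with a depth of 3 lists prefixes such as `org/apache/lucene` or `org/joda/time` separately from `org/elasticsearch`, which makes it easier to see how much of the size comes from bundled third-party classes.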

Things are changing (a lot) in 2.0!

In ES 2.0.0-rc1, the ES jar is still 9105462 bytes.

About three years ago I refactored ES source code just for fun into server and client code base, with the aim to minimize the jar sizes and the set of dependencies when using ES from remote hosts.

One of the results was a transport client jar of ~2 MB, and a Lucene API client jar that added just 1 MB plus the Lucene jars, ~5 MB or so (I don't remember exactly, sorry).

A lot has happened since then, but the ES source base is still a mix of client and server code, with mixed dependencies. With ES 2.0, I would like to consider picking up my old project again and applying it to refactor the ES 2.0 source, but my time budget is small, so I cannot promise anything. Such a project is of very low priority since there is a functioning official Java client and ES development is moving in the right direction.

Maybe it is more promising to establish an alternative Java client project outside of the ES source base, also to experiment with new features, which is a more thrilling task.

+1
I would love to see Elasticsearch 2.0 with just a small client jar (holding the TransportClient).