I spent a few hours getting a prototype of an Elasticsearch query language (dubbed "eql") up and running [0]. I'm posting here to see if there is any interest at all in a project like this.
Here are some examples that I have working:
query bank return 3 sort on balance asc, lastname desc;
query bank (balance, age, account_number)
filter
balance = 1110
and (age = 31 or account_number=953)
return 1;
index blogposts with post = '{"xyz":"this is a test", "foobar":100}';
get blogposts with post = "AU3Po0OOZX4PYDrqsDN1";
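For readers who think in the Elasticsearch query DSL, the second example above (the one with the filter) could plausibly correspond to a search body like the one built below. This is only a sketch in Python, not what the prototype necessarily emits. It assumes that `return 1` maps to `size`, that the parenthesized field list maps to `_source`, and that `=` maps to a `term` clause. The helper names (`term`, `any_of`, `all_of`, `search_body`) are hypothetical.

```python
import json

def term(field, value):
    """Exact-match clause on one field (hypothetical helper)."""
    return {"term": {field: value}}

def any_of(*clauses):
    """OR: at least one of the clauses must match."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}

def all_of(*clauses):
    """AND: every clause must match; filter context does not affect scoring."""
    return {"bool": {"filter": list(clauses)}}

def search_body(fields, condition, size):
    """Assemble a search request body (assumed mapping, see the text above)."""
    return {"_source": list(fields), "size": size, "query": condition}

# query bank (balance, age, account_number)
# filter balance = 1110 and (age = 31 or account_number=953)
# return 1;
body = search_body(
    fields=("balance", "age", "account_number"),
    condition=all_of(
        term("balance", 1110),
        any_of(term("age", 31), term("account_number", 953)),
    ),
    size=1,
)

print(json.dumps(body, indent=2))
```

The nesting of `all_of` and `any_of` mirrors the parentheses in the eql statement, which is the part a query language would save the user from writing by hand.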
The GitHub repo has a few animated GIFs that demo eql running. If anyone has an interest in language design, I'd love to kick around some ideas on a full query language for Elasticsearch.
Looks like this is for working from a remote command line console?
I'm interested in these points:
- The language parser should also be implemented as a plugin so it can be used over HTTP.
- The "query"/"get" part should be addressable through a separate endpoint from the administrative commands, so it can be used safely without the risk of modifying or deleting data ("read only mode").
- Reuse of query results in subsequent queries (probably by assigning results to variables).
The console lives in the EQL repo (using JLine), but the main ANTLR 4 parser could be packaged separately via Maven. I used a similar approach on a parser at work, and it lets me try out the library from the command line or a script.
Regarding your points (and apologies for the awkward inline format below):
> the language parser should also be implemented as a plugin so it can be used over HTTP.
No problem here; evaluating a query boils down to adding a Java Maven dependency.
> the "query"/"get" part should be addressable through a separate endpoint from the administrative commands so it can be used safely without the risk of modifying/deleting data ("read only mode")
Agreed, but I wonder if something like:
connect readonly foo:9300;
might look nicer to a user. In a previous parser, I made a database connection optional for each query command for exactly this reason.
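One way the read-only idea could be enforced, sketched in Python rather than the prototype's Java: classify each statement by its leading keyword and refuse anything that writes when the connection was opened with `readonly`. The `Connection` class, `parse_connect`, `check`, and `ReadOnlyError` are hypothetical names for illustration. Only the `query`, `get`, `index`, and `connect readonly` keywords come from the examples in this thread.

```python
from dataclasses import dataclass

READ_STATEMENTS = {"query", "get"}   # statements that only read data
WRITE_STATEMENTS = {"index"}         # statements that modify data

class ReadOnlyError(Exception):
    """Raised when a write is attempted on a read-only connection."""

@dataclass
class Connection:
    host: str
    readonly: bool = False

def parse_connect(statement: str) -> Connection:
    """Parse 'connect [readonly] host:port;' into a Connection."""
    words = statement.strip().rstrip(";").split()
    if len(words) == 3 and words[:2] == ["connect", "readonly"]:
        return Connection(host=words[2], readonly=True)
    if len(words) == 2 and words[0] == "connect":
        return Connection(host=words[1], readonly=False)
    raise ValueError(f"not a connect statement: {statement!r}")

def check(conn: Connection, statement: str) -> None:
    """Raise if the statement is not allowed on this connection."""
    words = statement.split(None, 1)
    keyword = words[0].lower() if words else ""
    if keyword in READ_STATEMENTS:
        return
    if keyword in WRITE_STATEMENTS:
        if conn.readonly:
            raise ReadOnlyError(f"'{keyword}' is not allowed on a read-only connection")
        return
    raise ValueError(f"unknown statement: {keyword!r}")

conn = parse_connect("connect readonly foo:9300;")
check(conn, 'get blogposts with post = "AU3Po0OOZX4PYDrqsDN1";')  # allowed
try:
    check(conn, "index blogposts with post = '{\"xyz\": 1}';")
except ReadOnlyError as err:
    print("rejected:", err)
```

Putting the check on the connection rather than on a separate endpoint keeps a single entry point, at the cost of trusting the client to ask for `readonly`. A server-side plugin would still want its own enforcement.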
> reuse of query results in subsequent queries (assigning results to variables probably)
This is doable, but not in my prototype at the moment.
Does eql imply having to learn a new language structure?
Yes. It's specifically designed not to look like SQL. I worked at a NoSQL database company in the past, and mapping SQL onto non-SQL data stores was awkward. The tradeoff was to make a language that reads as naturally as possible.
Nice! Are there any EQL examples that would be equivalent to more complicated ES queries, such as nested aggregations, or a filtered query with non-trivial query and filter parts? I would like to get an idea of what this would look like, and I did not find anything like it in your repo.
Since it's only a prototype, the examples in the repo are all I have at the moment. Do you have any more complex example queries in mind that I could model new statements after?
I'll keep chugging ahead and implement other pieces, time permitting. I won't post progress on this thread, but I'll keep the README updated with new statements here: https://github.com/metadave/eql