Any interest in an Elasticsearch query language (EQL)?

Hello -

I spent a few hours getting a prototype of an Elasticsearch query language (dubbed "eql") up and running [0]. I'm posting here to see if there is any interest at all in a project like this.

Here are some examples that I have working:

query bank return 3 sort on balance asc, lastname desc;

query bank (balance, age, account_number)
  filter
    balance = 1110
    and (age = 31 or account_number=953)
  return 1;

index blogposts with post = '{"xyz":"this is a test", "foobar":100}';

get blogposts with post = "AU3Po0OOZX4PYDrqsDN1";
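For readers who think in terms of the Elasticsearch query DSL, here is a rough Python sketch of the request body that the second example above (the `filter` query on `bank`) could plausibly translate to. Nothing here is taken from the eql source: the helper names (`term`, `all_of`, `any_of`, `build_search_body`) are hypothetical, and mapping the field list to `_source` and `return 1` to `size` is an assumption about the intended semantics.

```python
import json


def term(field, value):
    """Exact-match clause on a single field."""
    return {"term": {field: value}}


def all_of(*clauses):
    """Logical AND: every clause must match."""
    return {"bool": {"must": list(clauses)}}


def any_of(*clauses):
    """Logical OR: at least one clause must match."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


def build_search_body(fields, condition, size):
    """Assemble a search request body (assumed mapping, see lead-in)."""
    return {"_source": list(fields), "query": condition, "size": size}


# balance = 1110 and (age = 31 or account_number = 953), return 1
body = build_search_body(
    fields=["balance", "age", "account_number"],
    condition=all_of(
        term("balance", 1110),
        any_of(term("age", 31), term("account_number", 953)),
    ),
    size=1,
)

print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a search request against the `bank` index; the nesting of `bool` clauses mirrors the parentheses in the eql statement.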

The GitHub repo has a few animated GIFs that demo eql running. If anyone has an interest in language design, I'd love to kick around some ideas on a full query language for Elasticsearch.

Cheers -
Dave

[0] https://github.com/metadave/eql

Cool!

Looks like this is for working from a remote command line console?

I'm interested in these points:

  • the language parser should also be implemented as a plugin so it can be used over HTTP.

  • the "query"/"get" part should be addressable through a separate endpoint from the administrative commands, so it can be used safely without the risk of modifying/deleting data ("read only mode")

  • reuse of query results in subsequent queries (assigning results to variables probably)

  • presenting results in CSV, JSON arrays, or XML, like in my plugins https://github.com/jprante/elasticsearch-xml or https://github.com/jprante/elasticsearch-arrayformat

Thanks for the reply, Jörg.

The console lives in the EQL repo (using JLine), but the main ANTLR4 parser could be packaged separately via Maven. I used a similar approach on a parser at work, and it lets me try out the library from the command line or a script.

Regarding your points (and apologies for the awkward inline format below):

  • the language parser should also be implemented as a plugin so it can be used over HTTP.

No problem here; it boils down to adding a Java Maven dependency to evaluate a query.

  • the "query"/"get" part should be addressable through a separate endpoint from the administrative commands, so it can be used safely without the risk of modifying/deleting data ("read only mode")

Agreed, but I wonder if something like:

connect readonly foo:9300; 

might look nicer to a user. In a previous parser, I made a database connection optional for each query command for exactly this reason.
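To make the read-only idea concrete, here is a minimal Python sketch of how a session opened with `connect readonly` could refuse anything other than `query` and `get`. This is not how eql works today; the `Session` class, `ReadOnlyViolation`, and the keyword-based check are all hypothetical, and a real implementation would presumably decide on the parsed statement rather than on the first word.

```python
# Statements considered safe in read-only mode (assumption: only the
# "query" and "get" statements shown earlier in this thread).
READ_ONLY_KEYWORDS = {"query", "get"}


class ReadOnlyViolation(Exception):
    """Raised when a mutating statement is issued on a read-only session."""


class Session:
    def __init__(self, address, readonly=False):
        self.address = address
        self.readonly = readonly

    def check(self, statement):
        """Return the leading keyword, or raise if it is not allowed."""
        tokens = statement.split()
        if not tokens:
            raise ValueError("empty statement")
        keyword = tokens[0].lower()
        if self.readonly and keyword not in READ_ONLY_KEYWORDS:
            raise ReadOnlyViolation(
                f"'{keyword}' is not allowed on a read-only connection"
            )
        return keyword


def connect(statement):
    """Parse 'connect [readonly] host:port;' into a Session."""
    tokens = statement.strip().rstrip(";").split()
    if tokens and tokens[0].lower() == "connect":
        if len(tokens) == 2 and tokens[1].lower() != "readonly":
            return Session(tokens[1], readonly=False)
        if len(tokens) == 3 and tokens[1].lower() == "readonly":
            return Session(tokens[2], readonly=True)
    raise ValueError(f"cannot parse connect statement: {statement!r}")


session = connect("connect readonly foo:9300;")
session.check("query bank return 3;")  # allowed
try:
    session.check("index blogposts with post = '{}';")
except ReadOnlyViolation as err:
    print(err)
```

Putting the flag on the connection rather than on a separate endpoint keeps the decision in one place, which is the appeal of the `connect readonly` form.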

  • reuse of query results in subsequent queries (assigning results to variables probably)

This is doable, but not in my prototype at the moment.

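  • presenting results in CSV, JSON arrays, or XML, like in my plugins https://github.com/jprante/elasticsearch-xml or https://github.com/jprante/elasticsearch-arrayformat

Excellent idea, I'll take a look at your plugins (woot Apache 2 license!)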

Cheers -
Dave

Very nice idea!

Hi,

Interesting. We have an SQL layer on top of ES, though it's not on GitHub (yet?).
Does eql imply having to learn a new language structure?

Otis


Does eql imply having to learn a new language structure?

Yes. It's specifically designed not to look like SQL. I worked at a NoSQL database company in the past, and mapping SQL to non-SQL data stores was awkward. The tradeoff was to make a language that sounds as natural as possible.

Cheers -
Dave

+1 for not using SQL.

Even for an RDBMS, SQL is flawed, inconsistent, and not easy to use. The "Cobol of the relational world".

For non-SQL data, there is no standard, but maybe something will silently evolve - the community decides what will stand the test of time.

Hi Dave,

Nice! Are there any EQL examples that would be equivalent to more complicated ES queries? Like nested aggregations? Or a filtered query with non-trivial query and filter parts? I would like to get an idea of what this would look like - I did not find anything like this in your repo.

Regards,
Lukas

Hello Lukas -

Since it's only a prototype, the examples in the repo are all I have at the moment. Do you have any more complex example queries in mind that I could use as a model?

Thanks for taking a look!
Dave

Actually, I see some examples in the docs. I'll kick those around a bit and see what I can come up with.

Have a great weekend -
Dave

Ok, here's what simple aggregates look like:

  query bank
    aggregate min_bal = min(balance), max_bal = max(balance);

  (truncated results)
  "aggregations" : {
    "max_bal" : {
      "value" : 49989.0
    },
    "min_bal" : {
      "value" : 1011.0
    }
  }

  // filter + aggregation
  query bank
    filter age = 20
    aggregate foo = min(balance), bar = max(balance);

  (truncated results)
  "aggregations" : {
    "foo" : {
      "value" : 1650.0
    },
    "bar" : {
      "value" : 49568.0
    }
  }
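For comparison with the raw query DSL, the following Python sketch builds a request body that would produce aggregations shaped like the results above. It is only an illustration under assumptions: `metric` and `build_aggregation_body` are hypothetical helpers, treating `filter age = 20` as a `term` query is a guess at the intended meaning, and `size: 0` is my own choice to suppress the hits.

```python
import json

SUPPORTED_METRICS = {"min", "max"}


def metric(kind, field):
    """Single-value metric aggregation on one field."""
    if kind not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported metric: {kind}")
    return {kind: {"field": field}}


def build_aggregation_body(aggregates, filter_term=None):
    """aggregates maps a result name to a (metric kind, field) pair."""
    body = {
        "size": 0,  # return aggregations only, no hits
        "aggs": {
            name: metric(kind, field)
            for name, (kind, field) in aggregates.items()
        },
    }
    if filter_term is not None:
        field, value = filter_term
        body["query"] = {"term": {field: value}}
    return body


# query bank filter age = 20
#   aggregate foo = min(balance), bar = max(balance);
agg_body = build_aggregation_body(
    {"foo": ("min", "balance"), "bar": ("max", "balance")},
    filter_term=("age", 20),
)

print(json.dumps(agg_body, indent=2))
```

The names on the left of each `=` in the eql statement become the keys under `aggs`, which is why they reappear as keys under "aggregations" in the response.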

I'll keep chugging ahead and implement other pieces, time permitting. I won't post progress on this thread, but I'll keep the README updated with new statements here: https://github.com/metadave/eql

Cheers -
Dave

I like the idea. Good luck!