Any interest in an Elasticsearch query language (EQL)?


(Metadave) #1

Hello -

I spent a few hours getting a prototype of an Elasticsearch query language (dubbed "eql") up and running [0]. I'm posting here to see if there is any interest at all in a project like this.

Here are some examples that I have working:

query bank return 3 sort on balance asc, lastname desc;

query bank (balance, age, account_number)
  filter
    balance = 1110
    and (age = 31 or account_number=953)
  return 1;

index blogposts with post = '{"xyz":"this is a test", "foobar":100}';

get blogposts with post = "AU3Po0OOZX4PYDrqsDN1";
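
For readers who want to compare against the JSON query DSL, here is a rough sketch of the request bodies the first two statements might correspond to. The mapping is an assumption for illustration, not what the eql prototype actually generates; the helper `term` and the variable names are made up, and the exact clauses can differ between Elasticsearch versions.

```python
import json


def term(field, value):
    """Build an exact-match term clause for a single field."""
    return {"term": {field: value}}


# Assumed equivalent of:
#   query bank (balance, age, account_number)
#     filter balance = 1110 and (age = 31 or account_number=953)
#     return 1;
search_body = {
    "_source": ["balance", "age", "account_number"],  # fields to return
    "size": 1,                                        # "return 1"
    "query": {
        "bool": {
            "must": [
                term("balance", 1110),
                {
                    "bool": {
                        # "or" expressed as should clauses, one must match
                        "should": [term("age", 31), term("account_number", 953)],
                        "minimum_should_match": 1,
                    }
                },
            ]
        }
    },
}

# Assumed equivalent of:
#   query bank return 3 sort on balance asc, lastname desc;
sorted_body = {
    "size": 3,
    "query": {"match_all": {}},
    "sort": [{"balance": "asc"}, {"lastname": "desc"}],
}

print(json.dumps(search_body, indent=2))
print(json.dumps(sorted_body, indent=2))
```

Either body would be sent to the search endpoint of the `bank` index; the point is only to show how much nesting the one-line statements stand in for.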

The GitHub repo has a few animated GIFs that demo eql running. If anyone has an interest in language design, I'd love to kick around some ideas on a full query language for Elasticsearch.

Cheers -
Dave

[0] https://github.com/metadave/eql


(Jörg Prante) #2

Cool!

Looks like this is for working from a remote command line console?

I'm interested in these points:

  • the language parser should also be implemented as a plugin so it can be used over HTTP.

  • the "query"/"get" part should be addressable by a separate end point than the administrative commands so it can be used safely without the risk of modifying/deleting data ("read only mode")

  • reuse of query results in subsequent queries (assigning results to variables probably)

  • presenting results in CSV, JSON arrays, or XML, like in my plugins https://github.com/jprante/elasticsearch-xml or https://github.com/jprante/elasticsearch-arrayformat
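
As an illustration of the last point, here is a minimal sketch of flattening a search response into CSV. It assumes the usual response shape with documents under `hits.hits[]._source`; the function name `hits_to_csv` and the sample documents are invented for the example and are not taken from either plugin.

```python
import csv
import io


def hits_to_csv(response, fields):
    """Render the _source of each hit as one CSV row, in the given column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=fields,
        extrasaction="ignore",   # skip fields that were not requested
        lineterminator="\n",
    )
    writer.writeheader()
    for hit in response["hits"]["hits"]:
        # Missing fields are written as empty cells (DictWriter's default restval)
        writer.writerow(hit.get("_source", {}))
    return buf.getvalue()


# Made-up sample response, trimmed to the parts the function reads
sample = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"account_number": 953, "age": 31, "balance": 1110}},
            {"_id": "2", "_source": {"account_number": 7, "balance": 2500}},
        ]
    }
}

print(hits_to_csv(sample, ["account_number", "age", "balance"]))
```

Nested objects would need a flattening rule (for example dotted column names), which this sketch leaves out.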


(Metadave) #3

Thanks for the reply, Jörg.

The console lives in the EQL repo (using JLine), but the main ANTLR4 parser could be packaged separately via Maven. I used a similar approach on a parser at work, and it lets me try out the library from the command line or a script.

Regarding your points (and apologies for the awkward inline format below):

  • the language parser should also be implemented as a plugin so it can be used over HTTP.

No problem here; evaluating a query boils down to a Java Maven dependency.

  • the "query"/"get" part should be addressable by a separate end point than the administrative commands so it can be used safely without the risk of modifying/deleting data ("read only mode")

Agreed, but I wonder if something like:

connect readonly foo:9300; 

might look nicer to a user. In a previous parser, I made a database connection optional for each query command for exactly this reason.
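
A rough sketch of how a read-only connection could be enforced on the client side follows. Only the `connect readonly foo:9300;` form and the `query`, `get`, and `index` keywords come from this thread; the `Session` class, `ReadOnlyError`, and the keyword sets are hypothetical and are not how the prototype works.

```python
READ_STATEMENTS = {"query", "get"}
WRITE_STATEMENTS = {"index"}


class ReadOnlyError(Exception):
    """Raised when a mutating statement is sent over a read-only session."""


class Session:
    def __init__(self, address, readonly=False):
        self.address = address
        self.readonly = readonly

    def check(self, statement):
        """Return the leading keyword if the statement may run on this session."""
        words = statement.strip().split(None, 1)
        if not words:
            raise ValueError("empty statement")
        keyword = words[0].lower()
        if keyword not in READ_STATEMENTS | WRITE_STATEMENTS:
            raise ValueError(f"unknown statement: {keyword}")
        if self.readonly and keyword in WRITE_STATEMENTS:
            raise ReadOnlyError(f"'{keyword}' is not allowed on a read-only connection")
        return keyword


def connect(statement):
    """Parse 'connect [readonly] host:port;' into a Session."""
    parts = statement.strip().rstrip(";").split()
    if len(parts) not in (2, 3) or parts[0] != "connect":
        raise ValueError("expected: connect [readonly] host:port;")
    readonly = len(parts) == 3
    if readonly and parts[1] != "readonly":
        raise ValueError(f"unknown connect option: {parts[1]}")
    return Session(parts[-1], readonly=readonly)


session = connect("connect readonly foo:9300;")
print(session.address, session.readonly)
print(session.check("query bank return 3;"))
```

A client-side check like this is only a convenience; a server-side plugin would still need its own separation of read and write endpoints to be safe.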

  • reuse of query results in subsequent queries (assigning results to variables probably)

This is doable, but not in my prototype at the moment.

Presenting results in CSV, JSON arrays, or XML is an excellent idea. I'll take a look at your plugins (woot, Apache 2 license!)

Cheers -
Dave


(Mark Walkom) #4

Very nice idea!


(Otis Gospodnetić) #5

Hi,

Interesting. We have an SQL layer on top of ES, though it's not on GitHub (yet?).
Does eql imply having to learn a new language structure?

Otis

Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


(Metadave) #6

Does eql imply having to learn a new language structure?

Yes. It's specifically designed not to look like SQL. I worked at a NoSQL database company in the past, and mapping SQL to non-SQL data stores was awkward. The tradeoff is/was to make a language that sounds as natural as possible.

Cheers -
Dave


(Jörg Prante) #7

+1 for not using SQL.

Even for RDBMS, SQL is flawed, inconsistent, and not easy to use. The "Cobol of the relational world".

For non-SQL data there is no standard, but maybe something will silently evolve - the community decides what will stand the test of time.


(Lukas Vlcek) #8

Hi Dave,

Nice! Are there any EQL examples that would be equivalent to more complicated ES queries? Like nested aggregations? Or a filtered query with non-trivial query and filter parts? I would like to get an idea of what this would look like - I did not find anything like this in your repo.

Regards,
Lukas


(Metadave) #9

Hello Lukas -

Since it's only a prototype, the examples in the repo are all I have at the moment. Do you have any more complex example queries in mind that I could model?

Thanks for taking a look!
Dave


(Metadave) #10

Actually, I see some examples in the docs. I'll kick those around a bit and see what I can come up with.

Have a great weekend -
Dave


(Metadave) #11

Ok, here's what simple aggregates look like:

  query bank
    aggregate min_bal = min(balance), max_bal = max(balance);

  (truncated results)
  "aggregations" : {
    "max_bal" : {
      "value" : 49989.0
    },
    "min_bal" : {
      "value" : 1011.0
    }
  }

  // filter + aggregation
  query bank
    filter age = 20
    aggregate foo = min(balance), bar = max(balance);

  (truncated results)
  "aggregations" : {
    "foo" : {
      "value" : 1650.0
    },
    "bar" : {
      "value" : 49568.0
    }
  }
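
For comparison, here is a sketch of the request bodies these two statements could plausibly map to, using the standard `min` and `max` metric aggregations. The helper `aggregate_body` and its argument shapes are invented for illustration, and `"size": 0` (suppressing the hits) is an assumption; the prototype may build its requests differently.

```python
import json


def aggregate_body(aggregates, filter_terms=None):
    """Build a search body from {name: (function, field)} plus optional exact-match filters."""
    body = {
        "size": 0,  # assumption: only the aggregation results are wanted
        "aggs": {
            name: {func: {"field": field}}
            for name, (func, field) in aggregates.items()
        },
    }
    if filter_terms:
        body["query"] = {
            "bool": {
                "must": [{"term": {f: v}} for f, v in filter_terms.items()]
            }
        }
    return body


# query bank aggregate min_bal = min(balance), max_bal = max(balance);
simple = aggregate_body({
    "min_bal": ("min", "balance"),
    "max_bal": ("max", "balance"),
})

# query bank filter age = 20 aggregate foo = min(balance), bar = max(balance);
filtered = aggregate_body(
    {"foo": ("min", "balance"), "bar": ("max", "balance")},
    filter_terms={"age": 20},
)

print(json.dumps(simple, indent=2))
print(json.dumps(filtered, indent=2))
```

The names on the left of each `=` in the eql statement become the keys under `"aggregations"` in the response, which matches the truncated results above.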

I'll keep chugging ahead and implement other pieces, time permitting. I won't post progress on this thread, but I'll keep the README updated with new statements here: https://github.com/metadave/eql

Cheers -
Dave


(Patrick Kik) #12

I like the idea. Good luck!

