Elasticsearch-sql-cli Syntax :: "For the Last Five Minutes..." Query?

Hi Elasticsearch Gurus,

I’ve recently spun up an Elasticsearch v7.4.0 Docker container. Within the container, I’ve been playing with the “elasticsearch-sql-cli” script and successfully submitting SQL-like queries against my ES index. Which is pretty great.

Here’s a follow-up problem. My index is really massive, with several million data records, and growing rapidly. I need to find a way to submit an SQL-like query that basically says, “Give me X data only for the last five minutes.” As luck would have it, my index includes a “timestamp” record:

{
  "mapping": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },…

Here’s a sample of what it looks like in a standard query:

Timestamp               | SrcIP      | DstIP      |Protocol|Port| sum
------------------------+------------+------------+--------+----+-------
2019-10-09T20:02:27.770Z|10.10.10.15 |10.10.10.16 |6       |5201|15000.0
2019-10-09T20:02:27.771Z|10.10.10.15 |10.10.10.16 |6       |5201|51000.0
2019-10-09T20:02:27.772Z|10.10.10.15 |10.10.10.16 |6       |5201|19500.0
2019-10-09T20:02:27.785Z|10.10.10.15 |10.10.10.16 |6       |5201|69000.0
2019-10-09T20:02:27.787Z|10.10.10.15 |10.10.10.16 |6       |5201|10500.0
2019-10-09T20:02:27.788Z|10.10.10.15 |10.10.10.16 |6       |5201|10500.0
2019-10-09T20:02:27.789Z|10.10.10.15 |10.10.10.16 |6       |5201|24000.0
2019-10-09T20:02:27.790Z|10.10.10.15 |10.10.10.16 |6       |5201|54000.0
2019-10-09T20:02:27.791Z|10.10.10.15 |10.10.10.16 |6       |5201|55500.0

Okay, so how would you say “Give me records from the last five minutes?” Standard SQL would have you do that like this. But I’ve tried this in my elasticsearch-sql-cli script, and I’m missing something in the syntax. Here’s a few attempts:

sql> select  \"@timestamp\", SrcIP, DstIP, Protocol, Port, sum(Total)
> from \"myindex2019.10.09\"
> where Port = 5201 AND
>       to_date(\"@timestamp\", 'MM/DD/YYYY HH24:MI:SS') >= sysdate - 5/(24*60)
> group by \"@timestamp\", SrcIP, DstIP, Protocol, Port;
> Bad request [Found 2 problem(s)
line 4:7: Unknown function [to_date], did you mean any of [CURDATE, TODAY, LOCATE, TRUNCATE]?
line 4:57: Unknown column [sysdate]]
sql>
sql>
sql> select  \"@timestamp\", SrcIP, DstIP, Protocol, Port, sum(Total)
> from \"myindex2019.10.09\"
> where Port = 5201 AND
>       \"@timestamp\" >= CURDATE - 5/(24*60)
> group by \"@timestamp\", SrcIP, DstIP, Protocol, Port;
> Bad request [Found 1 problem(s)
line 4:23: Unknown column [CURDATE]]
sql>
sql>
sql> select  \"@timestamp\", SrcIP, DstIP, Protocol, Port, sum(Total)
> from \"myindex2019.10.09\"
> where Port = 5201 AND
>       \"@timestamp\" >= CURDATE() - 5/(24*60)
> group by \"@timestamp\", SrcIP, DstIP, Protocol, Port;
> Bad request [Found 1 problem(s)
line 4:23: first argument of [CURDATE() - 5/(24*60)] must be [numeric], found value [CURDATE()] type [date]]
sql>
sql>

Any advice on how to crack this syntax?

And… as a general question… Is there an online guide to the ES/SQL syntax? It will be pretty tedious if I have to post on the forums every time I have a general question like this. Thanks!

Update... I've been researching SQL functions that allow you to manipulate time datatypes. It seems that many of the functions which normally would be helpful here aren't available. DATEADD(), for instance, isn't supported:

sql> select DATEADD( minute, 5, CURRENT_TIMESTAMP());
Bad request [Found 1 problem(s)
line 1:17: Unknown column [minute]]
sql>
sql> select DATEADD( 1, 1, CURRENT_TIMESTAMP );
Bad request [Found 1 problem(s)
line 1:8: Unknown function [DATEADD], did you mean [DATABASE]?]
sql>

I know that the ES implementation of SQL queries is "like SQL," but not actually SQL. Are there any online docs which help illustrate what is and is not supported? Thanks!

DATEADD and DATEDIFF were added recently to es-sql and will be available in 7.5.0.

But, in your specific scenario you should be able to do

SELECT  \"@timestamp\", SrcIP, DstIP, Protocol, Port, SUM(Total) \"myindex2019.10.09\" WHERE Port = 5201 AND \"@timestamp\" >= NOW() - INTERVAL 5 MINUTES GROUP BY \"@timestamp\", SrcIP, DstIP, Protocol, Port

So, basically NOW() - INTERVAL 5 MINUTES.

Though, I would suggest not grouping by @timestamp - it will be a rather intensive aggregation (creating buckets at millisecond level).

Regarding the documentation, I'd say it's pretty good and comprehensive. The language and syntax in general is covered here. While the bit you are interested in is in the INTERVALS section. If you have a any suggestions for improvements, please don't hesitate to ask for it in our github repo (Search/SQL label). Thanks.

YES! This did the trick. Thank you, this is incredibly helpful.

My apologies if I was grumbling about the SQL syntax. I did read through it before upgrading to 7.4.0, but that was weeks ago. I should have rescanned it before submitting this post.

No worries @redapplesonly. Please, don't hesitate to ask further if you encounter any other roadblocks.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.