In our application, our primary use case is to create dynamic queries based on a user's input. Because of the complexity of standard elasticsearch queries, creating a dynamic SQL query is a much better approach.
However, when executing SQL queries, the result that's returned lacks some basic meta data, such as, the number of records returned on the query (thus making pagination nearly impossible, and the fact that we have a requirement were we must display the number of records found for a query).
Which leads me to the point where i have to have hundreds (thousands?) of lines of code to create a dynamic elasticsearch query.
If the xpack SQL plugin simply returned the "Hits" count, that's all we would need. Has someone figured out a workaround for this limitation with SQL queries?
I really don't want to query the database twice (once to get the count, and then a second time to get the rows) -- and i'd really like to avoid creating dynamic elasticsearch queries, and using the official Java client is a non starter in that bringing it into our project is like tossing a hand grenade into our dependencies. And Jest, well that's just writing tons of JSON strings, non starter.
Just a reply to my own thread here, I am thinking of just creating the SQL query in my code, then just call the _xpack/sql/translate endpoint to just translate the query, then execute that query which gives me my record count (hits)
yeah it's two calls to ES, but that first one is really not doing anything massive. How's this sound?
@Andrei_Stefan - so, i was thinking if the response JSON had the same meta data that the standard queries returns (like the hits) then everything you'd need would be there.
So you have your section of the response JSON that contains the rows as it does now, and in the metadata section of the response there would be a Hits object. that wouldn't necessarily break any clients i would think using SQL.
And yes, absolutely COUNT(*) is a way to go, but then you have to query the database twice, once to get your count, then second to get your rows. right?
If we would have hits in there, then this wouldn't be SQL anymore, it would be a differently formatted JSON response from a REST endpoint served by Elasticsearch.
In SQL, there is the notion of a cursor, and this is what we are using as well, in the form of cursor field inside the response to be used afterwards to call the next "batch". Under the hood, this is nothing more than a scroll_id received back from a scroll type of search.
One other thing to keep in mind when thinking about the output of a sql search is that the REST endpoint (/_xpack/sql) is not the only access point to ES-SQL. We also have JDBC and ODBC (in the near future) access and those are standards that need to obey the same rules. And in there you don't have a total hits element inside the response. Those two work with cursors in the same manner as a relational database does.
That's why there is no hits inside a response. You'd have to include a COUNT(*) (if applicable to your specific query) or maybe use the translate API to obtain an ES query to use afterwards. In any case, the translate API should be fast, as it's not querying anything, it's just parsing the query and coming up with an ES query.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.