Elasticsearch SQL ODBC taking more time than previous version

I'm using the SQL ODBC driver to fetch results from Elasticsearch, and I'm seeing a problem with fetch times. The query is as simple as 'select empno, name, address from emp'. In Elasticsearch 7.6.2, the results were fetched and displayed on the console within 24 milliseconds, whereas in Elasticsearch 7.12.0 it takes 200 milliseconds for the results to be displayed. The same query takes 42 milliseconds when fired from the Kibana Dev Console.
Why is the time difference so large when Elasticsearch is known for its fast retrieval speed, and why is it slower than an RDBMS? Are there some settings I need to change?

Are these millisecond values averages? Have the requests been issued repeatedly to get past cache-warming effects?

Then, when trying it out in Kibana's Dev Console, I assume you're using the SQL API, not the _search API, right? Just to make sure the comparison covers the same code path.
Also, I assume there's no scrolling involved; the Dev Console will only display the first page, not the entire result set.

In any case, comparing Dev Console's timings to your application's is not that relevant, since Kibana simply outputs the response. The driver's work is to take a JSON/CBOR object, break it down and copy it into the client app's buffers, as configured through the API. So there's always going to be some overhead, even excluding what the client app does with the data.

All things being equal, you could tweak some ODBC driver settings that might influence the throughput:

  • Max page size: depending on your ES/index settings, this could be upped to 10K.
  • Data encoding: JSON vs CBOR.
  • Data compression: turn it off.
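
As a sketch, such knobs would typically be passed through the ODBC connection string. The attribute names below (MaxFetchSize, Packing, Compression) and the DSN name are assumptions for illustration; verify the exact spellings in your driver's DSN editor:

```python
# Sketch only: attribute names are assumptions -- check them against
# your driver's DSN editor before using.
settings = {
    "DSN": "my-es-dsn",       # hypothetical DSN name
    "MaxFetchSize": "10000",  # max page size, upped to 10K
    "Packing": "CBOR",        # data encoding: JSON vs CBOR
    "Compression": "off",     # data compression turned off
}
conn_str = ";".join(f"{k}={v}" for k, v in settings.items())
print(conn_str)
# pyodbc.connect(conn_str) would then hand these to the driver
```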

The driver should also log (at INFO level) the timings for each page. You could compare those between versions to better understand where the difference comes from.

Finally, if you have any control over how the ODBC API is used: SQLGetData is a slower API than SQLBindCol. But I assume this hasn't changed between your attempts with the different stack versions.

Also interesting would be if testing with a different application supporting ODBC would confirm your measured differences between versions.

Thank you for your reply. Let's set aside the Kibana Dev Console's response time for a while, but I think the fetch time of the Elasticsearch SQL ODBC driver should be close to the Kibana Dev Console's when fetching a single document, and here it shows a big difference. I looked into the ODBC driver settings and found that changing them makes only a negligible difference.

There seem to be two issues in your post:

  • the latest stack (7.12.0) seems slower than the 7.6.2 one;
  • the latest ODBC driver seems slow in your test (compared to Kibana's rendering).

To the first point: in a quick & crude test, I find the latest to actually be a bit faster, though negligibly so. I used pyodbc (x64) and executed a simple SELECT * on the Kibana sample data that is readily available to index. I also enabled INFO logging in the driver[*] and was thus able to calculate that a bit over 60% of the time is spent waiting for the results (from Elasticsearch); I'd guess a large part of the remainder is spent in Python/bindings code.
This allows me to conclude that while moving data through the driver will be slower than what you get in the Kibana Dev Console, it's not going to be anywhere near an order of magnitude slower (as your examples suggest).

I should also mention that I used timing averages, discarding the first 10 searches, which can be more than twice as slow as the remaining average (likely caching). The page size was set to 10K rows and compression was on. JSON vs. CBOR made no difference (likely because the sample data is mostly text).
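
A minimal harness along those lines could look like the sketch below; `averaged_ms` is a hypothetical helper that works for any zero-argument callable, so the same code can time a pyodbc query against either stack version:

```python
import time

def averaged_ms(run_query, iterations=50, warmup=10):
    """Time run_query() repeatedly, discard the first `warmup` runs
    (cache warming), and return the average of the rest in milliseconds."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        timings.append((time.perf_counter() - start) * 1000.0)
    kept = timings[warmup:]
    return sum(kept) / len(kept)

# With pyodbc, run_query could be e.g.:
#   lambda: cursor.execute("SELECT * FROM emp").fetchall()
print(averaged_ms(lambda: sum(range(1000))))
```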

[*] the driver will then log messages like request answered, received code 200 and 1233653 bytes of type 'application/json; charset=UTF-8' back; times(ms): start: 297.000, total: 312.000. start indicates the time delta to the first response byte, total to the last one, which is the value that matters.
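
To compare those per-page timings between driver versions, a log line of that shape can be picked apart with a small script. This is a hypothetical helper that only assumes the times(ms) tail shown in the sample line above:

```python
import re

# Matches the "times(ms): start: ..., total: ..." tail of the log line.
TIMES = re.compile(r"times\(ms\): start: ([\d.]+), total: ([\d.]+)")

def page_times(log_line):
    """Return (start_ms, total_ms) from a driver INFO log line, or None."""
    m = TIMES.search(log_line)
    return (float(m.group(1)), float(m.group(2))) if m else None

line = ("request answered, received code 200 and 1233653 bytes of type "
        "'application/json; charset=UTF-8' back; times(ms): start: 297.000, "
        "total: 312.000")
print(page_times(line))  # -> (297.0, 312.0)
```

Running it over both versions' logs and averaging the totals would show whether the extra time is spent waiting on Elasticsearch or inside the driver.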

Is your use case executing a high rate of queries over many different indices, most of which return a single page? I'm asking to understand the relevance of this one-doc test, both in setup and results.

Nevertheless, assuming your test methodology and results are valid, it'd help to see further details: the timings from the driver's log and those of your app (if you can, for instance, measure the time between SQLExecute/SQLPrepare and SQLGetData, or SQLFetch if binding buffers).


I agree we cannot compare the Kibana Dev Console time with Elasticsearch SQL ODBC, since they perform different operations, but I want to know why Elasticsearch SQL ODBC takes more time to fetch than other RDBMSs when Elasticsearch is known for its fast fetching speed.
See, the time taken by Elasticsearch is 200 milliseconds while the same operation is performed by another RDBMS in 20 to 25 milliseconds; in that case, I think it is not preferable to use Elasticsearch over an RDBMS. Is that so?

Elasticsearch is known for fast searching speed. I just wanted to make sure you did not mix up the two concepts. So depending on the exact query, it can be much faster, or it may be slower...

Could you share the exact requests which are sent to both systems?

200-400ms would be plausible if you're just "downloading"/bulk-exporting a larger index with high field cardinality (vs. a targeted search), but it's not normal for a small test.

I've shared a way to easily validate your results with a 3rd party application. Does this confirm your timings? (There are also many other BI tools that you can trial with.)

General inquiries are difficult to help with: even if the driver is acting up in the context of your app, without more details - queries, logs - it's just stabbing in the dark.

The SQL query was: select name, age, salary from emp where empname = 'Bob'

Could you share the full output of the SQL response?

Is there only one match?

Could you run the following query in Kibana dev console?

GET /emp/_search
{
  "query": {
    "match": {
      "empname": "Bob"
    }
  }
}

And share the full json response?