Doing simple Lucene-ish things

rik · May 8, 2026, 9:46pm

Hi all, first time using ES discuss. (I also posted the same question to the ES Slack, as I don't know where this community mostly hangs out.)

At the moment I'm using trial ES serverless resources and the Python API. I used to know Lucene pretty well, and I'm trying to do things that used to be straightforward there, e.g.: How can I get the full list of keywords/tokens in an index for the fields text and desc, and iterate over them to get document counts? What if it is a named entity recognition index with keywords stored in ner_desc.entities.entity? Thanks for any hints, e.g. pointers to how/if ES reveals lower-level Lucene features.

Christian_Dahlqvist · May 19, 2026, 9:38am

As far as I know, Elasticsearch does not expose lower-level Lucene features.

thecoop · May 20, 2026, 10:02am

Great question - I want to push back on an assumption I think you've made.

Elasticsearch is not a wrapper around Lucene, in the sense of taking Lucene and putting a REST interface on top of it. Elasticsearch is implemented using Lucene. There are many, many aspects of Elasticsearch that extend, complement, or replace functionality provided by Lucene. As such, Lucene doesn't really 'exist' as a separate standalone 'thing' within Elasticsearch. So there are no features of Lucene that can be exposed separately from what Elasticsearch exposes - everything is done by Elasticsearch, using many areas of Lucene functionality to do so, but not as Elasticsearch-wrapping-Lucene.

Some Lucene concepts don't make sense in Elasticsearch due to how we've implemented something. Some other things map pretty well onto underlying Lucene functionality. And some are entirely Elasticsearch-only, not really using Lucene. You can't really split them up from outside Elasticsearch.

Hope that makes sense

rik · May 22, 2026, 12:52am

Simon, thanks very much for your thoughtful response. The more I work with ES, the more I appreciate all the things it is trying to do, far beyond Lucene search. It would be great to get some pointers to how functionality is divided up, according to you:

Some Lucene concepts don't make sense in Elasticsearch due to how we've implemented something.
Some other things map pretty well onto underlying Lucene functionality.
And some are entirely Elasticsearch-only,

I'd really appreciate pointers to design docs that let me know which is which.

Here's a simple example: I was looking for access to what I recall Lucene calling the Vocabulary (no access to the Lucene docs this minute): a full enumeration of all assigned keywords. I wound up writing a script that accomplishes this, but I wonder if I did it right.
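
For illustration, here is a minimal sketch of one way such an enumeration could be done through the public search API. It is not necessarily how the script mentioned above works. It pages through a composite aggregation to yield every distinct value of a field with its document count. The helper name iter_term_counts, the aggregation name "vocab", and the index and field names in the usage note are hypothetical. It also assumes the field is aggregatable (for example a keyword field or a keyword sub-field), since a plain text field normally cannot be used in a terms source.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Tuple


def iter_term_counts(
    search: Callable[..., Any],
    index: str,
    field: str,
    page_size: int = 1000,
) -> Iterator[Tuple[Any, int]]:
    """Yield (value, doc_count) for every distinct value of `field`.

    `search` is any callable with the keyword interface of the Python
    client's `search` method (for example `es.search`). Passing it in
    keeps this sketch independent of a live cluster.
    """
    after: Optional[Dict[str, Any]] = None
    while True:
        # One composite "page": distinct values of `field`, in sorted order.
        composite: Dict[str, Any] = {
            "size": page_size,
            "sources": [{"term": {"terms": {"field": field}}}],
        }
        if after is not None:
            # Resume after the last bucket of the previous page.
            composite["after"] = after

        # size=0: only the aggregation is wanted, not the hits themselves.
        resp = search(index=index, size=0, aggs={"vocab": {"composite": composite}})
        agg = resp["aggregations"]["vocab"]

        for bucket in agg["buckets"]:
            yield bucket["key"]["term"], bucket["doc_count"]

        # No after_key (or an empty page) means everything has been seen.
        after = agg.get("after_key")
        if after is None or not agg["buckets"]:
            break
```

With a real client this might be called as iter_term_counts(es.search, "my-index", "desc.keyword"), where both names are placeholders. Two caveats under these assumptions: values from a keyword field are whole stored values, not the analyzed tokens Lucene would list for a text field, and if ner_desc.entities is mapped as a nested field the composite aggregation would need to be wrapped in a nested aggregation, which this sketch does not do.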
