Problems retrieving term payload in custom query


#1

In elasticsearch 2.1.0 we have a plugin with a custom Lucene TokenFilter that adds a payload to a special term during index-time analysis. We also have a custom QueryParser that produces Lucene Query/Weight/Scorer classes, which compare this payload against some query data to decide whether a document matches a query.

I know that the payloads are being written as desired since I can easily see them via elasticsearch's termvector API.

However, while our Scorer can see all of the documents with this special term, I can never get access to the payload.

The approach to doing this in the Scorer is simply:

docs = leafReaderContext.reader().postings(specialTerm, PostingsEnum.PAYLOADS);

Then I walk through the docs via nextDoc() and try to retrieve each doc's payload via:

BytesRef payloadRef = docs.getPayload();
if (payloadRef != null) {
    byte[] payload = payloadRef.bytes;
}

My logging confirms that every document in the test index is being visited, but for every document payloadRef comes back null.

What am I doing wrong? Is there some setting I need to enable somewhere to allow the postings method to do what it says it can do?

Is there a better way to do this?

Any advice would be appreciated!

Bob


#2

I got a reply on the Lucene users mailing list that solved my problem, and may help someone else, so I will give it here:

Adrien Grand via lucene.apache.org 

Note that payloads are stored per position, not per document. Maybe the
problem is that you never call docs.nextPosition()?

For me, since I know that my special term only appears once, all I had to do was add a call to docs.nextPosition() before trying to retrieve the payload.
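
For anyone who lands here later, this is roughly what the corrected loop looks like. Treat it as a minimal sketch rather than our actual plugin code: it assumes the Lucene 5.x API that ships with elasticsearch 2.1.0, and the class name PayloadReader and method name firstPayloads are hypothetical names made up for illustration. It also copies only the valid slice of the BytesRef (offset and length) instead of using the whole backing array.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper for illustration; not part of Lucene or elasticsearch.
public final class PayloadReader {

    private PayloadReader() {}

    /**
     * Collects the payload stored at the first position of specialTerm
     * for every document in this segment that contains the term.
     */
    public static List<byte[]> firstPayloads(LeafReaderContext context, Term specialTerm)
            throws IOException {
        List<byte[]> payloads = new ArrayList<>();

        PostingsEnum docs = context.reader().postings(specialTerm, PostingsEnum.PAYLOADS);
        if (docs == null) {
            return payloads; // the term does not occur in this segment
        }

        while (docs.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
            // Payloads are stored per position, so a position must be
            // consumed before getPayload() can return anything.
            docs.nextPosition();

            BytesRef payloadRef = docs.getPayload();
            if (payloadRef != null) {
                // Only bytes[offset, offset + length) belong to this payload.
                payloads.add(Arrays.copyOfRange(
                        payloadRef.bytes,
                        payloadRef.offset,
                        payloadRef.offset + payloadRef.length));
            }
        }
        return payloads;
    }
}
```

A single nextPosition() call is enough here only because the special term appears once per document. If a term can occur several times in a document, call nextPosition() up to docs.freq() times and read the payload after each call.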

