2.1.0: NullPointerException, not sure where to start

With Elasticsearch 2.1.0 I'm getting the following errors on 4 of the 5 shards with a pretty simple search. From the error messages alone I'm not even sure where to start.

Command:

curl -XGET 'http://localhost:9200/member_site/attachment/_search?pretty' --data-binary "@/tmp/queries/test4.json"

test4.json:

{
  "size": 20,
  "from": 0,
  "query": {
    "query_string": {
      "query": "attachment",
      "lenient": true,
      "default_operator": "AND"
    }
  }
}

Errors:

[2015-12-16 23:46:29,300][DEBUG][action.search.type       ] [Misty Knight] [member_site][0], node[yrZlGlg5RaiVWiegVhWTUw], [P], v[2], s[STARTED], a[id=_8XTZ5sWQhSreZwfkqTaxg]: Failed to execute [org.elasticsearch.action.search.SearchRequest@410756ce] lastShard [true]
RemoteTransportException[[Misty Knight][127.0.0.1:9300][indices:data/read/search[phase/query]]]; nested: QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: NullPointerException;
Caused by: QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: NullPointerException;
  at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:343)
  at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
  at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
  at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
  at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
  at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
  at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:350)
  at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
  at org.apache.lucene.index.OrdTermState.copyFrom(OrdTermState.java:37)
  at org.apache.lucene.codecs.BlockTermState.copyFrom(BlockTermState.java:56)
  at org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat$IntBlockTermState.copyFrom(Lucene50PostingsFormat.java:475)
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:1014)
  at org.elasticsearch.common.lucene.all.AllTermQuery$1.scorer(AllTermQuery.java:152)
  at org.elasticsearch.common.lucene.all.AllTermQuery$1.scorer(AllTermQuery.java:102)
  at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
  at org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
  at org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
  at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
  ... 10 more

Hmm that's no good: I suspect this is a bug in AllTermQuery. I'll dig...

You must be using index-time boosting ( https://www.elastic.co/guide/en/elasticsearch/reference/2.1/index-boost.html )? AllTermQuery should only be used in that case.

If you use query-time boosting instead, which is better for a number of reasons listed on that page (see the warning), you would side-step this bug.
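For readers unsure what index-time boosting looks like, here is a minimal sketch of a 2.x mapping that uses it. The reporter's real mapping is not shown in this thread, so the field names and the boost value are assumptions for illustration only (the type name "attachment" is taken from the search URL above). It is the "boost" parameter in the mapping that makes this an index-time boost.

```json
{
  "mappings": {
    "attachment": {
      "properties": {
        "title": { "type": "string", "boost": 2.5 },
        "body":  { "type": "string" }
      }
    }
  }
}
```

Dropping "boost" from the field definition and applying the boost in the query instead is the workaround suggested here.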

OK I found the bug: https://github.com/elastic/elasticsearch/pull/15506

Thanks for reporting!

Also, it's not good to pass "lenient": true in your "query_string" request: it can hide what could be serious errors in how you are querying Elasticsearch!

Hey thanks for digging into this, and glad I could help uncover a bug!

You are correct that I'm using index-time boosting. If I wanted to replace that with query boosting, I'd need to enumerate every field I want to search, correct? Let's say I have 10 analyzed string fields. If I understand correctly, those will all be included in the default _all field, which query_string searches by default. So if I remove boost from the mappings, I'd need to change all the query_string searches to include the array of fields with boosting applied (like 'title^2.5').

Just want to make sure there is not an easier syntax (such as just listing all the field boosts generically without having to modify the actual query).

Just to report back: avoiding index-time boosting and switching to query boosting did indeed dodge the bug. Getting rid of lenient will be trickier because it could reveal lots of "bugs"!

If I wanted to replace that with query boosting, I'd need to enumerate every field I want to search, correct?

That's correct, per-query you have to set the query-time boosts.
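As a rough sketch of what that looks like, the original test4.json could be rewritten with an explicit "fields" list, using the field^boost syntax mentioned above. The field name "body" is a placeholder, since the actual fields in this index are not listed in the thread, and "lenient" is left out in line with the earlier advice.

```json
{
  "size": 20,
  "from": 0,
  "query": {
    "query_string": {
      "query": "attachment",
      "fields": ["title^2.5", "body"],
      "default_operator": "AND"
    }
  }
}
```

Every field to be searched has to appear in "fields"; a field listed without a ^boost suffix gets the default boost of 1.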

You could maybe keep using _all for those fields that are not boosted, and then add in additional query clauses only for those fields like title that you need to boost.
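One possible shape for that hybrid approach, assuming the index-time boosts have already been removed from the mapping: keep the query_string against _all as the required clause, and add an optional boosted clause on title. This is only a sketch; the 2.5 boost is carried over from the example in the question, and the scores will not be identical to what index-time boosting produced.

```json
{
  "size": 20,
  "from": 0,
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "attachment",
          "default_operator": "AND"
        }
      },
      "should": {
        "match": {
          "title": { "query": "attachment", "boost": 2.5 }
        }
      }
    }
  }
}
```

The "must" clause decides which documents match, while the "should" clause only raises the score of documents whose title also matches.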

Thanks for confirming that not using index-time boosting sidesteps the bug!