Lucene Wildcard Search taking too much time

Hi Team,

A Lucene wildcard search is taking around 1 minute to return results.

Number of hits : 13000
Lucene Version : 5.5.2

Kindly let me know what the problem might be, or at least offer a few suggestions.

Please find the log below:

at com.apache.lucene.store.DataInput.readVInt(DataInput.java:109)
at com.apache.lucene.index.TermBuffer.read(TermBuffer.java:87)
at com.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:134)
at com.apache.lucene.search.FilteredTermEnum.next(FilteredTermEnum.java:78)
at com.apache.lucene.search.FilteredTermEnum.setEnum(FilteredTermEnum.java:57)
at com.apache.lucene.search.WildcardTermEnum.<init>(WildcardTermEnum.java:65)
at com.apache.lucene.search.WildcardQuery.getEnum(WildcardQuery.java:59)
at com.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:107)
at com.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:139)
at com.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:298)
at com.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:578)
at com.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:383)

Thanks in advance.

Regards,
Sampath.

I think you will need to provide some additional information in order for anyone to be able to answer.

  • What version of Elasticsearch are you on?
  • What is the specification of the cluster in terms of node count, CPUs, RAM and storage?
  • How much data are you querying? How many shards do you have?
  • What does the query you are running look like?

Also, don't use wildcards; as written here: Wildcard Query | Elasticsearch Guide [5.2] | Elastic

Note that this query can be slow, as it needs to iterate over many terms.


Hi Christian,

Please find the information below:

Lucene Version: 5.5.2 (this is plain Lucene, not Elasticsearch)
No of nodes: 4 (Clustered Environment)
Search String : Comcast* AND insattach:false AND pyClassGroup:WB-TO-TOPSRF-Work* AND NOT insexternal:true

Regards,
Sampath.

As David pointed out, wildcard queries can be slow. See if you can index and query your data in a way that avoids wildcard queries.
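
One common way to do that for trailing wildcards such as Comcast* is to index every prefix of each term, so that the prefix search becomes a single exact lookup instead of a walk over the term dictionary. The following is only a minimal, plain-Java sketch of that idea and does not use Lucene at all; the class name PrefixIndexSketch, its methods, and the minPrefix parameter are hypothetical names made up for illustration. In Lucene itself, an analyzer with an edge n-gram filter (for example EdgeNGramTokenFilter) is the usual way to get a similar effect, at the cost of a larger index.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Lucene.
public class PrefixIndexSketch {
    // Maps each indexed prefix to the ids of documents that contain
    // at least one term starting with that prefix.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // Shortest prefix that gets indexed (assumed tuning knob).
    private final int minPrefix;

    public PrefixIndexSketch(int minPrefix) {
        this.minPrefix = minPrefix;
    }

    // Index time: split on whitespace, lowercase, and store every prefix
    // of each term from minPrefix characters up to the full term.
    public void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            int start = Math.min(minPrefix, term.length());
            for (int len = start; len <= term.length(); len++) {
                postings.computeIfAbsent(term.substring(0, len), k -> new TreeSet<>())
                        .add(docId);
            }
        }
    }

    // Query time: "comcast*" becomes one hash lookup for "comcast".
    // Prefixes shorter than minPrefix only match terms of exactly that text.
    public Set<Integer> searchPrefix(String prefix) {
        return postings.getOrDefault(prefix.toLowerCase(Locale.ROOT),
                Collections.emptySet());
    }

    public static void main(String[] args) {
        PrefixIndexSketch index = new PrefixIndexSketch(3);
        index.add(1, "Comcast billing");
        index.add(2, "comcastic offer");
        index.add(3, "Verizon");
        System.out.println(index.searchPrefix("Comcast"));
    }
}
```

The trade-off is more index space and slower indexing in exchange for cheap prefix queries. This does not help with leading wildcards (*cast); those would need a different approach, such as also indexing reversed terms.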
