Indexing and searching URLs with dashes


(Alex P) #1

Hi, here's a gist with a way to reproduce what I mention
here: https://gist.github.com/2029097

I'm trying to index domains (for instance, produktion-naja.de). Before
indexing the URLs, I run them through a lowercase tokenizer and stop, ngram,
and word delimiter filters.
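
For illustration, here is a rough Python approximation of what a lowercase tokenizer does to such a domain: it lowercases the text and keeps only runs of letters. The function name `lowercase_tokenize` and the regex are assumptions made for this sketch, not code from the gist, and the stop, ngram, and word delimiter filters are left out.

```python
import re


def lowercase_tokenize(text):
    """Approximate a lowercase tokenizer: lowercase, then split on non-letters."""
    # [^\W\d_]+ matches maximal runs of letters (word characters minus
    # digits and the underscore), so "-" and "." act as separators.
    return re.findall(r"[^\W\d_]+", text.lower())


if __name__ == "__main__":
    print(lowercase_tokenize("produktion-naja.de"))
    print(lowercase_tokenize("produktion naja.de"))
```

Under this approximation, the dashed and the whitespace-separated forms produce the same tokens, so the dash itself never reaches the index.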

Whenever I search for URLs with a dash in the query, I get no search
results. When I replace the dash with whitespace, results come back.

I'm not entirely sure what I am doing wrong here; everything seems to be
more or less in place.

Thanks


(Shay Banon) #2

Use a text query instead of query_string, or wrap the query in double quotes (""). Don't use wildcards (*); it is almost never a good idea to use them, and you already use ngrams.
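
As a minimal sketch of the two suggestions, the Python below builds the request bodies for a text query and for a query_string query with the value wrapped in double quotes. The field name `url` and both helper functions are hypothetical, since the mapping from the gist is not shown here. In later Elasticsearch releases the text query is named match.

```python
import json

FIELD = "url"  # hypothetical field name; use the field from your own mapping


def text_query(field, value):
    """Build a text query body; the value is analyzed, not parsed as query syntax."""
    return {"query": {"text": {field: value}}}


def quoted_query_string(field, value):
    """Build a query_string body with the value wrapped in double quotes."""
    # Escape backslashes and quotes so the value stays a single quoted phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "query": {
            "query_string": {
                "default_field": field,
                "query": '"%s"' % escaped,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(text_query(FIELD, "produktion-naja.de"), indent=2))
    print(json.dumps(quoted_query_string(FIELD, "produktion-naja.de"), indent=2))
```

Either body can be sent to the index's _search endpoint. Neither uses wildcards.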

In reply to Alex P's post of Tuesday, March 13, 2012 at 4:28 PM.


(system) #3