Analyzer settings for breaking up words on hyphens

Hello,

I have a field that uses the whitespace tokenizer, but I also want to
tokenize on hyphens (-), as the standard analyzer does. I'm having
trouble figuring out which additional custom settings I need in order
to split on hyphens as well.

Thanks,
Mike

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CALdNedLtdAWEiQN%2BoUV17J5e8DowMbDva2pJn1S%3Dr9w1qtP9bA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

You can either use a pattern tokenizer whose pattern matches whitespace
and hyphens, or further decompose your tokens after tokenization with the
word delimiter token filter, which is much harder to use (and might be
overkill for your use case).

Cheers,

Ivan
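
The pattern tokenizer route suggested above might look roughly like the sketch below. The tokenizer name `whitespace_hyphen` and the analyzer name `split_on_hyphen` are made up for illustration, and the regular expression `[\s-]+` is an assumption about which characters should act as separators. The `approximate_tokens` helper only imitates the split locally with Python's `re` module; the real tokenizer uses Java regular expressions, so check the result against your own cluster.

```python
import json
import re

# Assumed separator pattern: one or more whitespace characters or hyphens.
PATTERN = r"[\s-]+"

# Index settings body (sent when creating the index).
# "whitespace_hyphen" and "split_on_hyphen" are hypothetical names.
settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "whitespace_hyphen": {"type": "pattern", "pattern": PATTERN}
            },
            "analyzer": {
                "split_on_hyphen": {
                    "type": "custom",
                    "tokenizer": "whitespace_hyphen",
                }
            },
        }
    }
}


def approximate_tokens(text):
    """Local approximation of the split: break on PATTERN, drop empty tokens."""
    return [token for token in re.split(PATTERN, text) if token]


if __name__ == "__main__":
    print(json.dumps(settings, indent=2))
    print(approximate_tokens("wi-fi router  setup"))
```

The field mapping would then reference `split_on_hyphen` as its analyzer.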

Thanks! I'll go ahead and try the pattern tokenizer route.

Or you could cheat and use a character filter to turn hyphens into
spaces before the whitespace tokenizer runs. Lots of ways to skin a cat.
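
A minimal sketch of the character filter idea, assuming a `pattern_replace` character filter is used for the substitution (a `mapping` character filter would be another option). The names `hyphen_to_space` and `hyphen_aware_whitespace` are hypothetical. The `approximate_tokens` helper only mimics the effect locally in Python and is not what Elasticsearch runs.

```python
import json

# Index settings body: rewrite each hyphen to a space, then let the
# built-in whitespace tokenizer do the splitting.
# "hyphen_to_space" and "hyphen_aware_whitespace" are hypothetical names.
settings = {
    "settings": {
        "analysis": {
            "char_filter": {
                "hyphen_to_space": {
                    "type": "pattern_replace",
                    "pattern": "-",
                    "replacement": " ",
                }
            },
            "analyzer": {
                "hyphen_aware_whitespace": {
                    "type": "custom",
                    "char_filter": ["hyphen_to_space"],
                    "tokenizer": "whitespace",
                }
            },
        }
    }
}


def approximate_tokens(text):
    """Local approximation: replace hyphens with spaces, split on whitespace."""
    return text.replace("-", " ").split()


if __name__ == "__main__":
    print(json.dumps(settings, indent=2))
    print(approximate_tokens("real-time search"))
```

This keeps the existing whitespace tokenizer in place, so only the analyzer definition changes.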
