How to do case insensitive search on terms?


(None) #1

Using ES 1.3.2

The current application I'm building only uses term queries for exact
matches.

Example query:

"query": {
  "term": {
    "logType": "abc"
  }
}

The field logType is pulled from an external DB in all caps, for instance ABC.

If I send the query

"query": {
  "term": {
    "logType": "ABC"
  }
}

I get no results.

If I send the query

"query": {
  "term": {
    "logType": "abc"
  }
}

I get results.

So does this mean I need to lowercase the input before building the query, or
is there an ES way of doing this?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8fec1f56-1a59-4d17-bb01-f2ecd62bfbb4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Jörg Prante) #2

I assume you use the standard analyzer, which by default uses the
"lowercase" token filter.

Just use a custom analyzer without the "lowercase" token filter, and you will
get case-sensitive search.

Jörg



(Nik Everett) #3

Or, if you want case-insensitive search, use a match query:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-match-query.html
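
To illustrate why a match query behaves differently from a term query here, the toy simulation below may help. It assumes logType is an analyzed string field whose analyzer lowercases tokens at index time, which is what the results in the first post suggest. The functions are hypothetical stand-ins, not Elasticsearch APIs.

```python
def toy_analyze(text):
    """Toy stand-in for an analyzer: split on whitespace, lowercase each token."""
    return [token.lower() for token in text.split()]


# Index time: the stored value "ABC" goes through the analyzer,
# so the inverted index ends up holding the token "abc".
indexed_terms = set(toy_analyze("ABC"))


def term_query_hits(value):
    # A term query looks the value up verbatim; the input is not analyzed.
    return value in indexed_terms


def match_query_hits(value):
    # A match query analyzes the input first, then looks up the tokens.
    return any(token in indexed_terms for token in toy_analyze(value))


print(term_query_hits("ABC"))   # input is compared as-is against "abc"
print(term_query_hits("abc"))
print(match_query_hits("ABC"))  # input is lowercased before the lookup
```

Under these assumptions the upper-case term lookup misses while the match lookup hits, which mirrors the behaviour reported above.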



(Jörg Prante) #4

Ugh. Read my post exactly the other way round.

=> standard analyzer is not using token filter "lowercase"

=> you can build a custom analyzer with token filter "lowercase" for
case-insensitive search.

Jörg
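
A custom analyzer with a "lowercase" token filter might be declared roughly as in the sketch below, written as a Python dict that serializes to the index-creation body. The analyzer name lowercase_keyword and the type name log are hypothetical, and the keyword tokenizer is an assumption chosen so the whole field value stays a single token.

```python
import json

# Hypothetical index-creation body for ES 1.x style string fields.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "lowercase_keyword": {          # hypothetical analyzer name
                    "type": "custom",
                    "tokenizer": "keyword",     # keep the value as one token
                    "filter": ["lowercase"],    # fold case at index and search time
                }
            }
        }
    },
    "mappings": {
        "log": {                                # hypothetical type name
            "properties": {
                "logType": {
                    "type": "string",
                    "analyzer": "lowercase_keyword",
                }
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

With a mapping like this, an analyzed query such as match would lowercase its input too; a term query would still need the input lowercased by the caller.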



(None) #5

Thanks!

Yes, I want case-insensitive search, and yes, I'm using the default analyzer.

Just as you guys answered, I tried this:

{
  "query": {
    "query_string": {
      "fields": ["logType"],
      "query": "ABC"
    }
  }
}

and it worked. Since I'm searching for "exact" matches (not wildcards or
anything like that), does it make a difference in performance? So far it
seems not, at least when testing through Sense on the same amount of
data.



(Nik Everett) #6

Use a match instead of a query_string - query_string has funky syntax that
can activate fuzzy matching and search on other fields and stuff - probably
not what you want. Otherwise it shouldn't have any real impact on
performance.
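
As a rough sketch of the difference, the two request bodies could be built like this. The helper names are hypothetical. The point is that the match body passes the user input through as plain text, while the query_string body hands it to a query parser that gives some characters special meaning.

```python
import json


def match_body(field, user_input):
    """Build a match query: the input is treated as plain text and analyzed."""
    return {"query": {"match": {field: user_input}}}


def query_string_body(field, user_input):
    """Build a query_string query: the input is parsed as query syntax first."""
    return {"query": {"query_string": {"fields": [field], "query": user_input}}}


# Same input, two different interpretations on the server side.
print(json.dumps(match_body("logType", "ABC")))
print(json.dumps(query_string_body("logType", "ABC")))
```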



(None) #7

Thanks, match works too.



(Harpreet Bhatia) #8

I have a problem with something similar.

$tjson = '{
    "query": {
        "bool": {
            "should": [
                { "match": { "keywords": "'.$keywords.'" } },
                { "match": { "treatment_name": "'.$keywords.'" } },
                { "wildcard": { "keywords": "'.$keywords.'*" } },
                { "wildcard": { "treatment_name": "'.$keywords.'*" } },
                { "fuzzy": { "keywords": "'.$keywords.'" } },
                { "fuzzy": { "treatment_name": "'.$keywords.'" } }
            ],
            "must": [
                { "match": { "treatment_status": "active" } }
            ],
            "minimum_should_match": 2
        }
    }
}';
$treats['body'] = $tjson;
$results = $app->client->search($treats);

http://example.com/api/search/heart returns 8 results.
http://example.com/api/search/HEART returns only 2 results.
The mapping is as follows.

$indexParams['body']['settings']['analysis']['analyzer'] = array(
    "analyzer_keyword" => array(
        'tokenizer' => 'standard',
        "filter" => "lowercase"
    )
);
...
"keywords": {
    "type": "string",
    "analyzer": "analyzer_keyword"
},
"treatment_name": {
    "type": "string",
    "analyzer": "analyzer_keyword"
},
...
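
One possible explanation is that the wildcard and fuzzy clauses are not analyzed, so "HEART" is compared as-is against lowercased index tokens and only the two match clauses can contribute towards minimum_should_match. The sketch below is in Python rather than PHP and uses a hypothetical helper name. It lowercases the input for the non-analyzed clauses and builds the body as a data structure instead of by string concatenation, which also avoids broken JSON when the input contains quotes.

```python
import json


def build_treatment_query(keywords):
    """Build the bool query, lowercasing input for clauses that skip analysis."""
    lowered = keywords.lower()
    fields = ["keywords", "treatment_name"]

    should = []
    for field in fields:
        # match clauses are analyzed, so the raw input is fine here
        should.append({"match": {field: keywords}})
    for field in fields:
        # wildcard patterns are assumed not to be analyzed: lowercase by hand
        should.append({"wildcard": {field: lowered + "*"}})
    for field in fields:
        # fuzzy terms are assumed not to be analyzed either
        should.append({"fuzzy": {field: lowered}})

    return {
        "query": {
            "bool": {
                "should": should,
                "must": [{"match": {"treatment_status": "active"}}],
                "minimum_should_match": 2,
            }
        }
    }


print(json.dumps(build_treatment_query("HEART"), indent=2))
```

If this assumption holds, "heart" and "HEART" should produce the same wildcard and fuzzy clauses and therefore the same result set.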
