Retaining case in a faceted search

Is there a way to do faceted searches using the Search API and
maintain case? For example...

curl -X POST "http://localhost:9200/automobiles/automobile/_search?pretty=true&q=make:B*" \
  -d '{"size" : "0", "facets" : {"make" : { "terms" : {"field" : "make"} }}}'

...returns...

{
...
"facets" : {
"make" : {
...
"terms" : [ {
"term" : "bmw",
"count" : 1654
}, {
"term" : "buick",
"count" : 362
}, {
...
} ]
}
}

...but I want to retain the case ("BMW", "Buick").

Thanks in advance, Chuck

Hi Chuck,

When faceting on a string field, the field should either be not analyzed
(preferred) or tokenized with a KeywordTokenizer. What is happening in
your case is that the default analyzer lowercases the terms at index
time, and the facet returns the terms as they were indexed.
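
To illustrate the point above, here is a minimal Python sketch of the difference between an analyzed and a not-analyzed field. It is only a rough simulation: `standard_like_terms` and `keyword_terms` are hypothetical helper names, and the real analyzers do more than split on word characters and lowercase.

```python
import re


def standard_like_terms(text):
    """Rough stand-in for a default analyzer: split into words, lowercase each."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def keyword_terms(text):
    """Stand-in for a not-analyzed / keyword-tokenized field: one unchanged term."""
    return [text]


makes = ["BMW", "Buick", "Aston Martin"]

# Terms a facet would see if the field is analyzed.
analyzed = [term for make in makes for term in standard_like_terms(make)]
# Terms a facet would see if the field is not analyzed.
untouched = [term for make in makes for term in keyword_terms(make)]

print(analyzed)   # ['bmw', 'buick', 'aston', 'martin']
print(untouched)  # ['BMW', 'Buick', 'Aston Martin']
```

Note that the analyzed version also splits "Aston Martin" into two separate terms, which is another reason to facet on an untouched field.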

--
Ivan

On Mon, Feb 13, 2012 at 9:14 AM, csh chuck.han@gmail.com wrote:

Is there a way to do faceted searches using the Search API AND
maintain case. For example...

curl -X POST "http://localhost:9200/automobiles/automobile/_search?
pretty=true&q=make:B*" -d '{"size" : "0", "facets" : {"make" :
{ "terms" : {"field" : "make"} }}}'

...returns...

{
...
"facets" : {
"make" : {
...
"terms" : [ {
"term" : "bmw",
"count" : 1654
}, {
"term" : "buick",
"count" : 362
}, {
...
} ]
}
}

...but I want to retain the case ("BMW", "Buick").

Thanks in advance, Chuck

Thanks for the quick response, Ivan! Will look into how to do this
(don't tell me :-)), as I am an ES newbie...

On Feb 13, 9:33 am, Ivan Brusic i...@brusic.com wrote:

Hi Chuck,

When faceting on strings, they should either be not analyzed
(preferred) or tokenized with a KeywordTokenizer. What is happening in
your case is the terms are being indexed as lowercase by the default
analyzer.

--
Ivan

Just in case you did not find out how: you need to explicitly define the mapping for that field and set index to not_analyzed. The simplest way is to set the mapping in the create index API when you create the index.
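
As a rough sketch of what that could look like, the Python snippet below builds a create-index body with an explicit mapping and prints the matching curl command. It assumes the string mapping syntax of the Elasticsearch releases current at the time of this thread; the index and type names are taken from the URL in the question, and nothing here is sent to a server.

```python
import json

# Index and type names as they appear in the question's URL.
index_name = "automobiles"
type_name = "automobile"

# Explicit mapping: index "make" as a single, unmodified term.
create_index_body = {
    "mappings": {
        type_name: {
            "properties": {
                "make": {"type": "string", "index": "not_analyzed"}
            }
        }
    }
}

body = json.dumps(create_index_body)

# Print the command instead of running it, so the sketch needs no server.
print(f"curl -XPUT 'http://localhost:9200/{index_name}' -d '{body}'")
```

The mapping has to be in place before documents are indexed, so an existing index would need to be recreated and repopulated.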

On Monday, February 13, 2012 at 9:02 PM, csh wrote:

Thanks for the quick response, Ivan! Will look into how to do this
(don't tell me :-)), as I am an ES newbie...

I'm not quite getting the results I expect: I think I'm indexing the
way you suggested...

curl -XPUT "localhost:9200/cars?pretty=true" \
  -d '{"index" : {"analysis" : {"analyzer" : {"default" : {"type" : "keyword"}}}}}'

...because after populating ES, the following query gives me the
fully-retained fields:

curl -X POST "http://localhost:9200/cars/car/_search?pretty=true&q=make:*" \
  -d '{"size" : "0", "facets" : {"make" : { "terms" : {"field" : "make"} }}}'

I can even do a query now in which I ask for all "makes" that end in
"n"...

curl -X POST "http://localhost:9200/cars/car/_search?pretty=true&q=make:*n" \
  -d '{"size" : "0", "facets" : {"make" : { "terms" : {"field" : "make"} }}}'

...and I get the right result:

{
...
"facets" : {
"make" : {
"_type" : "terms",
"missing" : 0,
"total" : 3,
"other" : 0,
"terms" : [ {
"term" : "Aston Martin",
"count" : 2
}, {
"term" : "Nissan",
"count" : 1
} ]
}
}
}

However, if I ask for all "makes" that start with "a" or
"A" (q=make:A* or q=make:a*), I get no results (there should be
several--at least one as shown in the above example):

curl -X POST "http://localhost:9200/cars/car/_search?pretty=true&q=make:a*" \
  -d '{"size" : "0", "facets" : {"make" : { "terms" : {"field" : "make"} }}}'

Is that a bug, or is there something I'm missing?

thanks in advance, Chuck

Got it! Need to put the wildcard directive in explicitly:

curl -X POST "http://localhost:9200/cars/car/_search?pretty=true" \
  -d '{"size" : "0", "query": {"wildcard" : { "make" : "A*" }}, "facets" : {"make" : { "terms" : {"field" : "make"} }}}'

And, as expected, the wildcard is case-sensitive...
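
A small Python sketch of that case-sensitive behaviour, using `fnmatchcase` as a stand-in for wildcard matching against untouched terms. The term list is hypothetical. The last line illustrates one plausible explanation for the earlier empty result, which this thread does not confirm: if the query-string parser lowercases wildcard patterns before matching, then both `A*` and `a*` end up as `a*`, while `*n` is unaffected.

```python
from fnmatch import fnmatchcase

# Hypothetical terms as they would be stored in a keyword-analyzed field.
indexed_terms = ["Aston Martin", "Audi", "Nissan", "BMW"]


def wildcard(pattern, terms):
    """Case-sensitive wildcard match of a pattern against whole terms."""
    return [term for term in terms if fnmatchcase(term, pattern)]


print(wildcard("A*", indexed_terms))  # ['Aston Martin', 'Audi']
print(wildcard("a*", indexed_terms))  # []
print(wildcard("*n", indexed_terms))  # ['Aston Martin', 'Nissan']

# Assumption: a parser that lowercases the pattern turns "A*" into "a*".
print(wildcard("A*".lower(), indexed_terms))  # []
```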

thanks, Chuck

On Feb 14, 8:58 am, csh chuck....@gmail.com wrote:

I'm not quite getting the results I expect: I think I'm indexing the
way you suggested...

curl -XPUT localhost:9200/cars?pretty=true -d '{"index" :
{"analysis" : {"analyzer" : {"default" : {"type" : "keyword"}}}}}'

...because after populating ES, the following query gives me the fully-
retained fields:

curl -X POST "http://localhost:9200/cars/car/_search?
pretty=true&q=make:*" -d '{"size" : "0", "facets" : {"make" :
{ "terms" : {"field" : "make"} }}}'

I can even do a query now in which I ask for all "makes" that end in
"n"...

curl -X POST "http://localhost:9200/cars/car/_search?
pretty=true&q=make:*n" -d '{"size" : "0", "facets" : {"make" :
{ "terms" : {"field" : "make"} }}}'

...and I get the right result:

{
...
"facets" : {
"make" : {
"_type" : "terms",
"missing" : 0,
"total" : 3,
"other" : 0,
"terms" : [ {
"term" : "Aston Martin",
"count" : 2
}, {
"term" : "Nissan",
"count" : 1
} ]
}
}

}

However, if I ask for all "makes" that start with "a" or
"A" (q=make:A* or q=make:a*), I get no results (there should be
several--at least one as shown in the above example):

curl -X POST "http://localhost:9200/cars/car/_search?
pretty=true&q=make:a*" -d '{"size" : "0", "facets" : {"make" :
{ "terms" : {"field" : "make"} }}}'

Is that a bug, or is there something I'm missing?

thanks in advance, Chuck

Note that what you are doing is indexing all text fields with the keyword analyzer; I am not sure that's what you really want. Use it only on the fields you want to facet on, possibly with a multi-field mapping.
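
A minimal sketch of the multi-field idea, written in Python so that the request bodies can be checked without a server. It assumes the `multi_field` mapping type of the Elasticsearch releases from around the time of this thread; the sub-field name `untouched` is a hypothetical choice, and the `cars`/`car` names come from the commands earlier in the thread.

```python
import json

# "make" stays analyzed for searching; "make.untouched" keeps the original
# value as a single term for faceting. "untouched" is a hypothetical name.
make_mapping = {
    "make": {
        "type": "multi_field",
        "fields": {
            "make": {"type": "string", "index": "analyzed"},
            "untouched": {"type": "string", "index": "not_analyzed"},
        },
    }
}

create_index_body = {"mappings": {"car": {"properties": make_mapping}}}

# Facet on the untouched sub-field so terms keep their case.
facet_request = {
    "size": 0,
    "query": {"match_all": {}},
    "facets": {"make": {"terms": {"field": "make.untouched"}}},
}

print("curl -XPUT 'http://localhost:9200/cars' -d '%s'" % json.dumps(create_index_body))
print("curl -XPOST 'http://localhost:9200/cars/car/_search?pretty=true' -d '%s'" % json.dumps(facet_request))
```

With a mapping along these lines, the other text fields keep the default analyzer, and only the facet field is indexed as a single term.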

On Tuesday, February 14, 2012 at 10:34 PM, csh wrote:

Got it! Need to put the wildcard directive in explicitly:

curl -X POST "http://localhost:9200/cars/car/_search?pretty=true" -d
'{"size" : "0", "query": {"wildcard" : { "make" : "A*" }}, "facets" :
{"make" : { "terms" : {"field" : "make"} }}}'

And, as expected, the wildcard is case-sensitive...

thanks, Chuck
