How do I get whole values of a field, as a facet? (not individual terms!)


(ogregras) #1

I'm trying to get some facets for a search, using Java. But currently I'm
only able to get separated terms as a result, not the fields' full values!

Let's say I index a document as such (simplified code):

====================
// XContentBuilder doc

List<String> persons = new ArrayList<String>();
persons.add("John Doe");
persons.add("Jane Doe");

doc.field("person", persons);

====================

Now I do a search and I want Elastic to return facets containing:

  • value = "John Doe", count : 23
  • value = "Jane Doe", count : 46

... if 23 documents in the result set contain "John Doe" in the person field
and 46 contain "Jane Doe".

With a TermsFacetBuilder, it seems I'm only able to get whitespace-separated
terms.

For example, this:

=======================
TermsFacetBuilder facetBuilder =
    FacetBuilders.termsFacet("person").field("person").size(100).allTerms(false);

// SearchRequestBuilder searchRequestBuilder
searchRequestBuilder.addFacet(facetBuilder);

=======================

... would result in something like:

  • term = "John", count : 23
  • term = "Jane", count : 46
  • term = "Doe", count : 69

But what I want to show to the user as the result of their search is not
individual terms, but the full values of the person field: "John Doe" and
"Jane Doe", so they can filter their current search by clicking one of the
person names.

What should I use to achieve that? I guess it's something other than TermsFacetBuilder,
but I don't see what.

Thanks in advance for any help...


(David Pilato) #2

For that need, I modified the mapping to use another analyzer on person.
So person is not tokenized.

There is perhaps a better way to do that...

Hope this helps
David :wink:

On 17 August 2011 at 03:07, ogregras ogregras@gmail.com wrote:

I'm trying to get some facets for a search, using Java. But currently I'm only able to get separated terms as a result, not the fields' full values!


(ogregras) #3

I have to say I'm really surprised this is not built-in.

The documentation (http://www.elasticsearch.org/guide/reference/api/search/facets/)
even says:

*(This is because the primary purpose of facets is to enable faceted
browsing (http://en.wikipedia.org/wiki/Faceted_search), allowing the user to
refine her query based on the insight from the facet: restrict the search to
a specific category, price or date range, etc., most probably with a filter
(http://www.elasticsearch.org/guide/reference/api/search/filter.html) based on selected facet.)*

But to use "faceted browsing" as intended, most of the time you can't display
only individual terms!

Do you have any tips for writing a custom analyser? Could you share
parts of Java code that does this?

Is this the way to create custom analysers?
http://stackoverflow.com/questions/6275727/define-custom-elasticsearch-analyzer-using-java-api

Thanks a lot!


(Shay Banon) #4

The simplest way is to map the person field and set its analyzer to keyword.
You can use a multi field mapping so you can still search on the
tokenized field, but facet on the non-tokenized field.

It's not that it's not built in; it really depends on what you want to
do. Searching on a field requires tokenization, and faceting on it (usually)
does not.
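
To make the difference concrete, here is a minimal sketch in plain Java. It does not use Elasticsearch at all: the class name FacetCountSketch and its facet method are made up for illustration, and splitting on whitespace is only a rough stand-in for what a tokenizing analyzer does. It counts, per term, how many documents contain that term, once with tokenization and once with whole values kept as a single term (roughly what the keyword analyzer gives you).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class FacetCountSketch {

    /** Counts, for each term, how many documents contain it. */
    static Map<String, Integer> facet(List<List<String>> docs, boolean tokenize) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (List<String> values : docs) {
            // Collect the distinct terms this document contributes.
            Set<String> terms = new LinkedHashSet<String>();
            for (String value : values) {
                if (tokenize) {
                    // Rough stand-in for a tokenizing analyzer: split on whitespace.
                    terms.addAll(Arrays.asList(value.trim().split("\\s+")));
                } else {
                    // Rough stand-in for the keyword analyzer: the whole value is one term.
                    terms.add(value);
                }
            }
            for (String term : terms) {
                Integer old = counts.get(term);
                counts.put(term, old == null ? 1 : old + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = new ArrayList<List<String>>();
        docs.add(Arrays.asList("John Doe"));
        docs.add(Arrays.asList("Jane Doe"));
        docs.add(Arrays.asList("Jane Doe"));

        System.out.println(facet(docs, true));   // {Doe=3, Jane=2, John=1}
        System.out.println(facet(docs, false));  // {Jane Doe=2, John Doe=1}
    }
}
```

The tokenized run reproduces the "John" / "Jane" / "Doe" split from the original question; the non-tokenized run gives the whole-value counts that a facet on a keyword-analyzed field is expected to return.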

On Wed, Aug 17, 2011 at 2:06 PM, ogregras ogregras@gmail.com wrote:

I have to say I'm really surprised this is not built-in.


(ogregras) #5

I am new to Elastic and I have no idea how to do this (mapping a field and
setting the analyzer to keyword), but I'll search for examples.

At least now I know what I need to do, this is the first step! Thanks for
the help.


(David Pilato) #6

Welcome!

Here is some information:

http://www.elasticsearch.org/guide/reference/api/admin-indices-create-index.html

http://www.elasticsearch.org/guide/reference/api/admin-indices-put-mapping.html

http://www.elasticsearch.org/guide/reference/mapping/

http://www.elasticsearch.org/guide/reference/mapping/core-types.html

http://www.elasticsearch.org/guide/reference/index-modules/analysis/

I think creating the index like this will do the job (it defines a mapping for type type1 in index test, with field1 using the keyword analyzer).

curl -XPOST localhost:9200/test -d '{
  "mappings" : {
    "type1" : {
      "properties" : {
        "field1" : { "type" : "string", "analyzer" : "keyword" }
      }
    }
  }
}'

Hope this helps

David.

From: elasticsearch@googlegroups.com [mailto:elasticsearch@googlegroups.com] On behalf of ogregras
Sent: Wednesday, 17 August 2011 23:09
To: elasticsearch@googlegroups.com
Subject: Re: How do I get whole values of a field, as a facet? (not individual terms!)

I am new to Elastic and I have no idea how to do this (mapping a field and setting the analyzer to keyword), but I'll search for examples.



(ogregras) #7

Thanks a lot David! This will help to get me started!


(ogregras) #8

With your help, I've finally been able to add a "keyword analyzer" mapping
to some of my fields!

It looks something like this in Java, for one field:

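===========
// Map<String, Object> fieldsMap

fieldsMap.put("FIELD_NAME", new HashMap<String, Object>()
{{
    put("type", "multi_field");
    put("fields", new HashMap<String, Object>()
    {{
        // Default mapper: tokenized, used for searching
        put("FIELD_NAME", new HashMap<String, Object>()
        {{
            put("type", "string");
            put("index", "analyzed");
        }});
        // Extra mapper: whole value kept as a single term, used for faceting
        put("FACET_MAPPING", new HashMap<String, Object>()
        {{
            put("type", "string");
            put("analyzer", "keyword");
        }});
    }});
}});

===========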

I'm not sure it's 100% OK (suggestions are welcome!), but it seems to work:
I'm able to get the mappings for those fields after creating the index, and
I see that both mappings are created: the default one (which has the same name as
the field itself, "FIELD_NAME") and my new mapping for the facets,
"FACET_MAPPING".

But now I don't know how to build my query so that it uses the default
"FIELD_NAME" mapping for the search itself (I guess it will use this mapping
by default, so nothing is required here), but uses the new
mapping, "FACET_MAPPING", to return the facets!

I have something like:

===========
TermsFacetBuilder facetBuilder = FacetBuilders.termsFacet("The Section Name")
    .field("FIELD_NAME")
    .size(100)
    .allTerms(false);

// SearchRequestBuilder searchRequestBuilder;

searchRequestBuilder.addFacet(facetBuilder);

===========

I don't know where to specify that I want the facets to return the values
using the "FACET_MAPPING" mapping, not the default mapping that returns
individual terms!

Any tips?


(ogregras) #9

Nobody?

I'd just need a little hint on how to target a particular mapping in a
search (for the resulting facets).


(ogregras) #10

I finally found how to do it!

If it can help somebody:

You add the two mappings, as described previously, but you use *a unique
name for each field* in the facet mapping, not "FACET_MAPPING" everywhere as
I was doing. Let's say you use "[FIELD_NAME]_facet" for the name of the
facet mapping of a field named [FIELD_NAME].

Then, when searching, you use this new name to build the facet section of
your query. For example:

TermsFacetBuilder facetBuilder = FacetBuilders.termsFacet("One Section Name")
    .field("[FIELD_NAME]_facet").size(100).allTerms(false);

Your query will then use the right "mapping".


I thought that for the facets to use the correct mapping, you first had to
specify a field name and then specify a mapping name belonging to this
field. But in fact, it seems the mapping creates a new name for the field
and you use this new name in the facet section of your query, instead of the
original field name.

(Shay Banon) #11

Sorry for the late response. In the multi field case, referring to the
non-default field mapper is simple; in your case, it would have been
FIELD_NAME.FACET_MAPPING. It is explained on the multi field page:
http://www.elasticsearch.org/guide/reference/mapping/multi-field-type.html

So, you can facet on FIELD_NAME.FACET_MAPPING.
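
As a rough illustration of that naming rule, the sketch below builds the same kind of multi_field mapping as post #8 using plain Java maps and derives the dotted path to facet on. The class MultiFieldSketch, its two helpers, and the names "person" and "facet" are hypothetical and chosen only for this example; nothing here calls Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MultiFieldSketch {

    /** Builds a multi_field mapping: an analyzed default field plus a keyword-analyzed sub-field. */
    static Map<String, Object> multiField(String fieldName, String facetSubField) {
        Map<String, Object> analyzed = new LinkedHashMap<String, Object>();
        analyzed.put("type", "string");
        analyzed.put("index", "analyzed");

        Map<String, Object> facet = new LinkedHashMap<String, Object>();
        facet.put("type", "string");
        facet.put("analyzer", "keyword");

        Map<String, Object> fields = new LinkedHashMap<String, Object>();
        fields.put(fieldName, analyzed);   // default mapper, same name as the field
        fields.put(facetSubField, facet);  // non-default mapper, intended for faceting

        Map<String, Object> mapping = new LinkedHashMap<String, Object>();
        mapping.put("type", "multi_field");
        mapping.put("fields", fields);
        return mapping;
    }

    /** Full path of the non-default mapper: field name, a dot, then the sub-field name. */
    static String facetPath(String fieldName, String facetSubField) {
        return fieldName + "." + facetSubField;
    }

    public static void main(String[] args) {
        System.out.println(multiField("person", "facet"));
        System.out.println(facetPath("person", "facet")); // person.facet
    }
}
```

The string returned by facetPath is what would go into the .field(...) call of the TermsFacetBuilder shown earlier in the thread, while searches keep using the plain field name.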

On Wed, Aug 24, 2011 at 3:04 AM, ogregras ogregras@gmail.com wrote:

I finally found how to do it!


(ogregras) #12

Thanks kimchy!


(mohammad) #13

Hello everyone,
I am new to Elasticsearch and I am facing difficulties similar to those
mentioned above. I tried implementing some of the suggested solutions, but
to no avail.
I am posting part of my code and will be very grateful if somebody could help
me out. Thanks in advance.

The code is written in Java:
// I have the following in the mapping part
CreateIndexRequestBuilder builder = client.admin().indices().prepareCreate(index)
    .setSettings(ImmutableSettings.settingsBuilder().loadFromSource(configIndex));

builder.addMapping("StatTest", "{\n" +
    "    \"StatTest\" : {\n" +
    "        \"_all\" : {\n" +
    "            \"analyzer\":\"francais\"\n" +
    "        },\n" +
    "        \"properties\" : {\n" +
    "            \"idUser\" : {\"type\" : \"string\", \"analyzer\":\"francais\"},\n" +
    "            \"loginOfUser\" : {\"type\" : \"string\", \"analyzer\":\"francais\"},\n" +
    "            \"nameOfUser\" : {\"type\" : \"string\", \"analyzer\":\"francais\"}\n" +
    "        }\n" +
    "    }\n" +
    "}");

//the sample data stored are the following
{idUser: "0121", loginOfUser: "login0121", nameOfUser :"mona lisa"},
{idUser: "0122", loginOfUser: "login0122", nameOfUser :"James Dean"},

// I am trying to get facets based on the name of the user
// TermsFacetBuilder fb = FacetBuilders.termsFacet("idOfUser").field("loginOfUser");
TermsFacetBuilder fb = FacetBuilders.termsFacet("idOfUser").field("nameOfUser");
SearchRequestBuilder srb1 = client.prepareSearch().setIndices(index).addFacet(fb);
AndFilterBuilder myFilters = FilterBuilders.andFilter();
myFilters.add(FilterBuilders.termFilter("year", "2014"));
FilterBuilder fbBuilder = FilterBuilders.andFilter(myFilters);
FilteredQueryBuilder q = QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(), fbBuilder);
SearchResponse sr = srb1.setQuery(q).execute().actionGet();

TermsFacet f = (TermsFacet) sr.getFacets().facetsAsMap().get("idOfUser");
for (TermsFacet.Entry entry : f) {
    String type = entry.getTerm().toString();
    // System.out.println("....enter type : " + type);
    // System.out.println("....enter entry.getCount() : " + entry.getCount());
}

Problem faced: whenever I facet on loginOfUser, everything works well.
The variable type returns:
login0121
login0122

However, when I facet on nameOfUser, the following is
returned:
mona
lisa
James
Dean

I want to retrieve the user names as one token only.
Am I missing some code somewhere?
I will be very thankful if anyone can help me with this.
Thanks in advance.

On Wednesday, 24 August 2011 22:36:02 UTC+4, ogregras wrote:

Thanks kimchy!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a5cc7038-01d1-4a8e-ab7a-5f84d51e0296%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(jsbonline2006) #14

Hi,

You need to follow the steps given by Kimchy and David. To summarize,
here are the steps:

  1. You have to define your facet field as a multi_field, as follows:

"mappings": {
  "data": {
    "properties": {
      "name": {
        "type": "multi_field",
        "fields": {
          "name": {
            "type": "string",
            "index": "analyzed"
          },
          "untouched": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      },

Here my "name" field is a multi_field. I can use "name" for searching
and "name.untouched" for faceting.

I was facing the same issue as the one you describe in this thread, and
the mapping and usage above helped me resolve it.
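
For anyone who wants to check the request outside the Java client, here is a small sketch that prints a search body with a terms facet on "name.untouched". It is plain Java string building, not an Elasticsearch API: the class TermsFacetRequestSketch, the helper termsFacetBody and the facet name "names" are invented for this example, and the "facets" / "terms" layout follows the terms facet documentation of that era, so please verify it against the version you run.

```java
public class TermsFacetRequestSketch {

    /**
     * Builds a search body with a match_all query and one terms facet.
     * Assumes facetName and field contain no characters that need JSON escaping.
     */
    static String termsFacetBody(String facetName, String field, int size) {
        return "{\n"
            + "  \"query\" : { \"match_all\" : {} },\n"
            + "  \"facets\" : {\n"
            + "    \"" + facetName + "\" : {\n"
            + "      \"terms\" : { \"field\" : \"" + field + "\", \"size\" : " + size + " }\n"
            + "    }\n"
            + "  }\n"
            + "}";
    }

    public static void main(String[] args) {
        // Facet on the not_analyzed sub-field; searching would still target "name".
        System.out.println(termsFacetBody("names", "name.untouched", 100));
    }
}
```

The printed body could be sent to the index's _search endpoint with curl, in the same way as David's index-creation example above.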

Regards,
Jayesh Bhoyar
http://www.linkedin.com/in/jayeshbhoyar

