Terms Facet - Regex help with quantifiers


(Joe Wong) #1

I have a field called createdDate of type string, and it's not analyzed. A
sample entry for this field is "20120724120000".

I'd like to run a terms facet on this field after a certain query and get
facet counts. However, I want the facet to count based on only a portion of
the field. I am using the regex option to do this.

For example, I have a few ES documents that contain:
20120724120000
20120724120001
20120724120002
20120725120021
20120726120034

I'd like the facet to count 20120724 = 3, 20120725 = 1, 20120726 = 1, etc.

Here is my facet query:
curl -X POST "localhost:9200/stuff/_search?pretty=true" -d '{
  "query": {
    "text": {
      "message": {
        "query": "test query",
        "operator": "and"
      }
    }
  },
  "facets": {
    "createdDate": {
      "terms": {
        "field": "createdDate",
        "size": 2000,
        "regex": "[0-9]{8}"
      }
    }
  }
}'

The quantifier doesn't seem to be working for me; it returns 0 results. Is
this the correct syntax?
I'm using 0.19.6.

Thanks


(sujoysett) #2

Hi,

Try using scripts, as follows. They modify the field value before
faceting is applied.

curl -XGET 'http://localhost:9200/try_test/data/_search?pretty=true' -d '{
  "query": {
    "match_all": {}
  },
  "facets": {
    "my_facet": {
      "terms": {
        "script_field": "(_source.value+\"\").toLowerCase().substring(0,9)",
        "size": 10
      }
    }
  }
}'

This is copied from something I was recently working on, so you will need to
modify it a bit to fit your case (field names and requirements).

Sujoy,
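
To make the effect of the script approach concrete, here is a minimal Python sketch (not Elasticsearch code) of the grouping it is meant to produce: each value is cut down to its leading characters before the counting happens. The helper name day_prefix and its length parameter are made up for illustration. Note that substring(0,9) in the script above keeps 9 characters, so for a yyyyMMdd prefix the end index would presumably need to be 8.

```python
from collections import Counter

# Sample createdDate values taken from the original question.
created_dates = [
    "20120724120000",
    "20120724120001",
    "20120724120002",
    "20120725120021",
    "20120726120034",
]


def day_prefix(value, length=8):
    """Return the leading yyyyMMdd part of a timestamp string.

    Hypothetical helper: it mirrors what a substring(0, 8) script
    would do to each field value before the facet counts it.
    """
    return str(value)[:length]


# Count documents per truncated value, as a script-based terms facet would.
day_counts = Counter(day_prefix(v) for v in created_dates)

for day, count in sorted(day_counts.items()):
    print(day, count)
```

The point of the sketch is only that the values are transformed first and counted second, which is the behaviour the regex option does not appear to provide.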

On Tuesday, July 24, 2012 9:17:34 PM UTC+5:30, Joe Wong wrote:



(Joe Wong) #3

Got it working, thanks Sujoy.

Would you know why the quantifier didn't work? I'm just wondering if there
is something wrong with my syntax.

On Wednesday, July 25, 2012 11:21:27 AM UTC-4, Sujoy Sett wrote:



(sujoysett) #4

Using a regex pattern in the terms facet probably doesn't modify the terms;
it is only used to decide whether a particular term is included in the facet
calculation. I am not entirely sure about this, but that is how I interpret
the documentation: "The terms API allows to define regex expression that
will control which terms will be included in the faceted list."

Sorry, I am not an expert here, but since I faced the same problem as you,
I offered the solution I used as a suggestion.

Thanks,
Sujoy.
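
The filter interpretation above can be illustrated with a small Python sketch. It assumes the facet tests the pattern against the entire term, which the thread does not confirm but which would explain the 0 results; Python's re module stands in here for the Java regex engine that Elasticsearch uses, and the variable names are illustrative only.

```python
import re

# Sample createdDate terms taken from the original question.
terms = [
    "20120724120000",
    "20120724120001",
    "20120724120002",
    "20120725120021",
    "20120726120034",
]

# The pattern from the question: exactly eight digits.
eight_digits = re.compile(r"[0-9]{8}")

# Assumption: the regex must match the whole term. Every term has 14
# digits, so an 8-digit pattern includes none of them.
included_full = [t for t in terms if eight_digits.fullmatch(t)]

# A pattern that covers the whole term does select terms, but it only
# filters them: each included term keeps its full 14-character value.
july_24 = re.compile(r"20120724[0-9]{6}")
included_day = [t for t in terms if july_24.fullmatch(t)]

print(included_full)
print(included_day)
```

Under that assumption the quantifier syntax itself is fine; the pattern simply never matches a complete term, and even a matching pattern would leave the terms untruncated, which is why the script approach was needed for per-day counts.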

On Friday, July 27, 2012 2:30:16 AM UTC+5:30, Joe Wong wrote:


