"splitted" term in terms-stats facet

I'm indexing HTTP GET/POST requests from custom Apache access logs, shipped by Logstash with this filter:

filter {
  grok {
    type => "apache-access"
    pattern => "%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NUMBER:Msec:int}"
  }

  date {
    type => "apache-access"
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    locale => "en"
  }
}

The "fields.request" field contains long URLs, which are displayed correctly in a normal search such as:
curl -XGET http://localhost:9220/logstash-2013.06.21,logstash-2013.06.20/_search?pretty

I am trying to use a terms_stats facet to retrieve the slowest operations with this query:
curl -XGET http://localhost:9220/logstash-2013.06.21,logstash-2013.06.20/_search?pretty -d'
{
  "query" : {
    "match_all" : { }
  },
  "facets" : {
    "tag_price_stats" : {
      "terms_stats" : {
        "key_field" : "request",
        "value_field" : "Msec",
        "order" : "mean",
        "size" : 10
      }
    }
  }
}'

I receive an output like this:

{
"took" : 438,
"timed_out" : false,
"_shards" : {
"total" : 10,
"successful" : 10,
"failed" : 0
},
"hits" : {
"total" : 290356,
"max_score" : 1.0,
"hits" : [ {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "zoo2EM3oT-qaFtSCmFiKUg",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.161"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:18 +0200"],"ve
rb":["GET"],"request":["/STAR/Login_EX/Presentation/SelectJobMenu.html?OPR_ID=48
675&GRP_ID=26521&SID=7341941&TRM_NAME=TRM0735CA05"],"httpversion":["1.1"],"respo
nse":["200"],"bytes":["143012"],"Msec":[2342]},"@timestamp":"2013-06-20T22:46:18
.000Z","@source_host":"AMTEC-FD8B1486D","@source_path":"/c:/SHARED/LOG/PTTCWS01/
access_log.1371768364","@message":"10.208.192.161 - - [21/Jun/2013:00:46:18 +020
0] "GET /STAR/Login_EX/Presentation/SelectJobMenu.html?OPR_ID=48675&GRP_ID=2652
1&SID=7341941&TRM_NAME=TRM0735CA05 HTTP/1.1" 200 143012 2342\r","@type":"apache
-access"}
}, {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "RbauZ_ITQaelRAbh-90Dgw",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.161"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:14 +0200"],"ve
rb":["POST"],"request":["/STAR/PrintingEX/Business/TT_PrintRequest.asp"],"httpve
rsion":["1.1"],"response":["200"],"bytes":["74"],"Msec":[56777]},"@timestamp":"2
013-06-20T22:46:14.000Z","@source_host":"AMTEC-FD8B1486D","@source_path":"/c:/SH
ARED/LOG/PTTCWS01/access_log.1371768364","@message":"10.208.192.161 - - [21/Jun/
2013:00:46:14 +0200] "POST /STAR/PrintingEX/Business/TT_PrintRequest.asp HTTP/1
.1" 200 74 56777\r","@type":"apache-access"}
}, {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "mWp7rDD6RqmI6ctHVRmjAw",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.161"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:12 +0200"],"ve
rb":["POST"],"request":["/STAR/PrintingEX/Business/TT_PrintRequest.asp"],"httpve
rsion":["1.1"],"response":["200"],"bytes":["74"],"Msec":[67788]},"@timestamp":"2
013-06-20T22:46:12.000Z","@source_host":"AMTEC-FD8B1486D","@source_path":"/c:/SH
ARED/LOG/PTTCWS01/access_log.1371768364","@message":"10.208.192.161 - - [21/Jun/
2013:00:46:12 +0200] "POST /STAR/PrintingEX/Business/TT_PrintRequest.asp HTTP/1
.1" 200 74 67788\r","@type":"apache-access"}
}, {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "QYCUOrcfS0O4hsQ9nOxSBQ",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.161"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:16 +0200"],"ve
rb":["POST"],"request":["/STAR/PrintingEX/Business/TT_PrintRequest.asp"],"httpve
rsion":["1.1"],"response":["200"],"bytes":["74"],"Msec":[79842]},"@timestamp":"2
013-06-20T22:46:16.000Z","@source_host":"AMTEC-FD8B1486D","@source_path":"/c:/SH
ARED/LOG/PTTCWS01/access_log.1371768364","@message":"10.208.192.161 - - [21/Jun/
2013:00:46:16 +0200] "POST /STAR/PrintingEX/Business/TT_PrintRequest.asp HTTP/1
.1" 200 74 79842\r","@type":"apache-access"}
}, {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "iRMY9HdjSLC8Lv_g7V4PRQ",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.161"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:18 +0200"],"ve
rb":["GET"],"request":["/STAR/BlankLogin.htm"],"httpversion":["1.1"],"response":
["304"],"Msec":[187]},"@timestamp":"2013-06-20T22:46:18.000Z","@source_host":"AM
TEC-FD8B1486D","@source_path":"/c:/SHARED/LOG/PTTCWS01/access_log.1371768364","@
message":"10.208.192.161 - - [21/Jun/2013:00:46:18 +0200] "GET /STAR/BlankLogin
.htm HTTP/1.1" 304 - 187\r","@type":"apache-access"}
}, {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "rOCQFhNQQ7amyNT9lxkrQw",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.161"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:10 +0200"],"ve
rb":["POST"],"request":["/STAR/PrintingEX/Business/TT_PrintRequest.asp"],"httpve
rsion":["1.1"],"response":["200"],"bytes":["74"],"Msec":[57635]},"@timestamp":"2
013-06-20T22:46:10.000Z","@source_host":"AMTEC-FD8B1486D","@source_path":"/c:/SH
ARED/LOG/PTTCWS01/access_log.1371768364","@message":"10.208.192.161 - - [21/Jun/
2013:00:46:10 +0200] "POST /STAR/PrintingEX/Business/TT_PrintRequest.asp HTTP/1
.1" 200 74 57635\r","@type":"apache-access"}
}, {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "mQP7ySjgTMuWqjGDph-6Wg",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.161"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:19 +0200"],"ve
rb":["GET"],"request":["/STAR/Style/StileLoginElsag.css"],"httpversion":["1.1"],
"response":["200"],"bytes":["1092"],"Msec":[213]},"@timestamp":"2013-06-20T22:46
:19.000Z","@source_host":"AMTEC-FD8B1486D","@source_path":"/c:/SHARED/LOG/PTTCWS
01/access_log.1371768364","@message":"10.208.192.161 - - [21/Jun/2013:00:46:19 +
0200] "GET /STAR/Style/StileLoginElsag.css HTTP/1.1" 200 1092 213\r","@type":"
apache-access"}
}, {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "2HyhsF98QGGOckXtWyCvXg",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.161"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:19 +0200"],"ve
rb":["POST"],"request":["/STAR/PrintingEX/Business/TT_PrintRequest.asp"],"httpve
rsion":["1.1"],"response":["200"],"bytes":["74"],"Msec":[44177]},"@timestamp":"2
013-06-20T22:46:19.000Z","@source_host":"AMTEC-FD8B1486D","@source_path":"/c:/SH
ARED/LOG/PTTCWS01/access_log.1371768364","@message":"10.208.192.161 - - [21/Jun/
2013:00:46:19 +0200] "POST /STAR/PrintingEX/Business/TT_PrintRequest.asp HTTP/1
.1" 200 74 44177\r","@type":"apache-access"}
}, {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "xMcW9gasQ7-jQYgl__rSow",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.163"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:06 +0200"],"ve
rb":["GET"],"request":["/STAR/BlankLogin.htm"],"response":["200"],"bytes":["150"
],"Msec":[602]},"@timestamp":"2013-06-20T22:46:06.000Z","@source_host":"AMTEC-FD
8B1486D","@source_path":"/c:/SHARED/LOG/PTTCWS01/access_log.1371768364","@messag
e":"10.208.192.163 - - [21/Jun/2013:00:46:06 +0200] "GET /STAR/BlankLogin.htm"
200 150 602\r","@type":"apache-access"}
}, {
"_index" : "logstash-2013.06.20",
"_type" : "apache-access",
"_id" : "MGFPi5YqSheH5rzz6wImXQ",
"_score" : 1.0, "_source" : {"@source":"file://AMTEC-FD8B1486D/c:/SHARED/L
OG/PTTCWS01/access_log.1371768364","@tags":[],"@fields":{"clientip":["10.208.192
.161"],"ident":["-"],"auth":["-"],"timestamp":["21/Jun/2013:00:46:05 +0200"],"ve
rb":["POST"],"request":["/STAR/PrintingEX/Business/TT_PrintRequest.asp"],"httpve
rsion":["1.1"],"response":["200"],"bytes":["74"],"Msec":[206771]},"@timestamp":"
2013-06-20T22:46:05.000Z","@source_host":"AMTEC-FD8B1486D","@source_path":"/c:/S
HARED/LOG/PTTCWS01/access_log.1371768364","@message":"10.208.192.161 - - [21/Jun
/2013:00:46:05 +0200] "POST /STAR/PrintingEX/Business/TT_PrintRequest.asp HTTP/
1.1" 200 74 206771\r","@type":"apache-access"}
} ]
},
"facets" : {
"tag_price_stats" : {
"type" : "terms_stats",
"missing" : 0,
"terms" : [ {
"term" : "/STAR/Login_EX/Business/GeneraMenu.asp?CJOB_ID=55692&SC_LOCAL_OFFICE_ID=88232&CLIENT_IP=10.218.245.17&CLIENT_NAME=TTN2232C007&SID=7342229",
"count" : 1,
"total_count" : 1,
"min" : 2.79137897E8,
"max" : 2.79137897E8,
"total" : 2.79137897E8,
"mean" : 2.79137897E8
}, {
"term" : "EAL%3E%3C%2FNSEAL%3E%3CNCHEC%3E%3C%2FNCHEC%3E%3CCCODE%3E%3C%2FCCODE%3E%3CCHVAL%3E%3C%2FCHVAL%3E%3C%2FOLD_DD%3E%3CSERVIZI%3E%3C!%5BCDATA%5B%5D%5D%3E%3C%2FSERVIZI%3E%3C%2FMSG%3E&SC_LOCAL_OFFICE_ID=63571&CLIENT_IP=10.218.30.22&CLIENT_NAME=TTO1571C008&SID=734",
"count" : 1,
"total_count" : 1,
"min" : 2.51097533E8,
"max" : 2.51097533E8,
"total" : 2.51097533E8,
"mean" : 2.51097533E8
}, {
"term" : "4694",
"count" : 1,
"total_count" : 1,
"min" : 2.51097533E8,
"max" : 2.51097533E8,
"total" : 2.51097533E8,
"mean" : 2.51097533E8
}, {
"term" : "/STAR/ControlloDispaccioEX/Business/ScanMLP.asp?MLP_CODE=RR558764221MA&BAG_ID=214374148&BAG_TYPE=2&JOB_ID=2428341&CJOB_ID=57432&OPR_ID=96496&LOC_OFFICE_ID=63571&ACC_OFFICE_ID=63571&IN_BUNDLE=0&I_PTYPE=EE&PTYPE=&DIST_FLAG=0&D_PTYPE=&ACC_OFFICE_NAME=TORINO%",
"count" : 1,
"total_count" : 1,
"min" : 2.51097533E8,
"max" : 2.51097533E8,
"total" : 2.51097533E8,
"mean" : 2.51097533E8
}, {
"term" : "6",
"count" : 1,
"total_count" : 1,
"min" : 2.44085529E8,
"max" : 2.44085529E8,
"total" : 2.44085529E8,
"mean" : 2.44085529E8
}, {
"term" : "/STAR/ControlloDispaccioEX/Business/ScanMLP.asp?MLP_CODE=008549413014&BAG_ID=214504948&BAG_TYPE=2&JOB_ID=2428352&CJOB_ID=40252&OPR_ID=65309&LOC_OFFICE_ID=40479&ACC_OFFICE_ID=40479&IN_BUNDLE=0&I_PTYPE=&PTYPE=&DIST_FLAG=0&D_PTYPE=&ACC_OFFICE_NAME=NAPOLI%20C",
"count" : 1,
"total_count" : 1,
"min" : 2.44085529E8,
"max" : 2.44085529E8,
"total" : 2.44085529E8,
"mean" : 2.44085529E8
}, {
"term" : "%3E%3C%2FNSEAL%3E%3CNCHEC%3E%3C%2FNCHEC%3E%3CCCODE%3E%3C%2FCCODE%3E%3CCHVAL%3E%3C%2FCHVAL%3E%3C%2FOLD_DD%3E%3CSERVIZI%3E%3C!%5BCDATA%5B%5D%5D%3E%3C%2FSERVIZI%3E%3C%2FMSG%3E&SC_LOCAL_OFFICE_ID=40479&CLIENT_IP=10.220.32.16&CLIENT_NAME=TNA1479C102&SID=734267",
"count" : 1,
"total_count" : 1,
"min" : 2.44085529E8,
"max" : 2.44085529E8,
"total" : 2.44085529E8,
"mean" : 2.44085529E8
}, {
"term" : "INA%20CDM&OFFICE_TYPE=A&XML=%3CMSG%3E%3COLD_DD%3E%3CUACCN%3ELT%20LATINA%20CDM%3C%2FUACCN%3E%3CCODAR%3E%3C%2FCODAR%3E%3CDEST%3E%3C%2FDEST%3E%3CZIP%3E%3C%2FZIP%3E%3CADDR%3E%3C%2FADDR%3E%3CDADDR%3E%3C%2FDADDR%3E%3CIVAL%3E%3C%2FIVAL%3E%3CWBEFS%3E%3C%2FWBEFS%3",
"count" : 1,
"total_count" : 1,
"min" : 2.33880432E8,
"max" : 2.33880432E8,
"total" : 2.33880432E8,
"mean" : 2.33880432E8
}, {
"term" : "IENT_IP=10.51.137.35&CLIENT_NAME=TRM3200C022&SID=7342350",
"count" : 1,
"total_count" : 1,
"min" : 2.33880432E8,
"max" : 2.33880432E8,
"total" : 2.33880432E8,
"mean" : 2.33880432E8
}, {
"term" : "E%3CWAFTS%3E%3C%2FWAFTS%3E%3CNSEAL%3E%3C%2FNSEAL%3E%3CNCHEC%3E%3C%2FNCHEC%3E%3CCCODE%3E%3C%2FCCODE%3E%3CCHVAL%3E%3C%2FCHVAL%3E%3C%2FOLD_DD%3E%3CSERVIZI%3E%3C!%5BCDATA%5B%5D%5D%3E%3C%2FSERVIZI%3E%3C%2FMSG%3E&SPWDD=0&IPROD_PTYPE=&SC_LOCAL_OFFICE_ID=96200&CL",
"count" : 1,
"total_count" : 1,
"min" : 2.33880432E8,
"max" : 2.33880432E8,
"total" : 2.33880432E8,
"mean" : 2.33880432E8
} ]
}
}
}

Many terms have exactly the same numeric stats values. It seems that each long request was split into several terms (of about 256 characters each). What's wrong? Is this a bug?

Thanks in advance for any suggestion.

What is your mapping for the request field? Is it not_analyzed? The search
API will return the original document with the fields intact (if source is
enabled) and not the way a field is actually indexed.

Also, please consider using gist.github.com for long output.

--
Ivan

On Tue, Jul 16, 2013 at 1:21 AM, sanmar <marco.santonocito@selex-es.com> wrote:

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/splitted-term-in-terms-stats-facet-tp4038177.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Thanks for the reply and the suggestion, Ivan.
The "request" field is a plain string type (according to a _mapping request on the index).
Can you explain why the "not_analyzed" option is needed?
I'm new to ES and, after a look at the docs, I wonder whether this option is the right way to go.
I also want to perform full-text search on that field...

Greetings

Marco

Any analyzer will split a field into tokens depending on its settings. The
default analyzer is the Standard Analyzer. Facets work on the indexed
terms, not the original field, so if your field was tokenized into one or
more tokens, then each token is an entry in the facet. not_analyzed means
to not perform any analysis on the field. Analysis is all based on Lucene,
so if you want to learn more, you can also read Lucene/Solr documentation.
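
To make the point about facets and indexed terms concrete, here is a minimal Python sketch. It is not Elasticsearch code: terms_stats, crude_analyzer and not_analyzed are hypothetical helpers written only for this illustration, and the regex split is a crude stand-in for a real analyzer. The sample requests and Msec values are taken from the hits shown earlier in the thread.

```python
import re
from collections import defaultdict

def terms_stats(docs, tokenize):
    """Toy facet: group Msec by every term that tokenize() emits for 'request'."""
    buckets = defaultdict(list)
    for doc in docs:
        for term in tokenize(doc["request"]):
            buckets[term].append(doc["Msec"])
    return {t: {"count": len(v), "mean": sum(v) / len(v)} for t, v in buckets.items()}

def crude_analyzer(text):
    # Assumption: a rough stand-in for an analyzer, splitting on non-alphanumerics
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]

def not_analyzed(text):
    # The whole field value is kept as one single term
    return [text]

docs = [
    {"request": "/STAR/BlankLogin.htm", "Msec": 187},
    {"request": "/STAR/BlankLogin.htm", "Msec": 602},
    {"request": "/STAR/Style/StileLoginElsag.css", "Msec": 213},
]

analyzed = terms_stats(docs, crude_analyzer)
raw = terms_stats(docs, not_analyzed)

print(sorted(analyzed))  # one facet entry per token
print(sorted(raw))       # one facet entry per complete request
```

With the tokenizing function, every piece of a request becomes its own facet entry and inherits the Msec of the document it came from, which is why several entries can show identical stats. With the identity function there is one entry per complete request.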

http://www.elasticsearch.org/guide/reference/index-modules/analysis/
http://lucene.apache.org/core/4_3_1/core/org/apache/lucene/analysis/package-summary.html

You can use the Analyze API to see how your fields are being indexed:
http://www.elasticsearch.org/guide/reference/api/admin-indices-analyze/

If you also want to perform full text search on the same field, use
multi-fields:
http://www.elasticsearch.org/guide/reference/mapping/multi-field-type/
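
As a rough sketch of what such a mapping could look like, the Python snippet below only builds and prints the JSON bodies; it does not talk to a cluster. The sub-field name "raw", the nesting under "@fields" and the full path "@fields.request.raw" are assumptions made for illustration, so check them against your own _mapping output before using them.

```python
import json

def multi_field_mapping(field_name):
    """Analyzed main field for full-text search plus a not_analyzed sub-field for facets."""
    return {
        field_name: {
            "type": "multi_field",
            "fields": {
                # default sub-field: analyzed, usable for full-text search
                field_name: {"type": "string", "index": "analyzed"},
                # hypothetical sub-field name "raw": indexed as one single term
                "raw": {"type": "string", "index": "not_analyzed"},
            },
        }
    }

# Assumption: Logstash stores the parsed fields under "@fields" in type "apache-access"
mapping = {
    "apache-access": {
        "properties": {"@fields": {"properties": multi_field_mapping("request")}}
    }
}

# The facet from the original post, pointed at the untokenized sub-field
facet_query = {
    "query": {"match_all": {}},
    "facets": {
        "tag_price_stats": {
            "terms_stats": {
                "key_field": "@fields.request.raw",
                "value_field": "Msec",
                "order": "mean",
                "size": 10,
            }
        }
    },
}

print(json.dumps(mapping, indent=2))
print(json.dumps(facet_query, indent=2))
```

Full-text queries would keep using the analyzed field, while the facet reads the untokenized copy. Documents that are already indexed would need to be reindexed to pick up the new sub-field.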

Cheers,

Ivan

On Tue, Jul 16, 2013 at 11:08 PM, sanmar <marco.santonocito@selex-es.com> wrote:


Thanks again, Ivan. The analysis concept is clear now.
In my problem description I forgot to include my settings:

index:
  analysis:
    analyzer:
      default_index:
        type: custom
        tokenizer: whitespace
        filter:
      default_search:
        type: custom
        tokenizer: whitespace
        filter:

Why are long "request" terms (with no spaces inside) broken into 255-character pieces?
Does it depend on a maximum token length?
Could a token filter (for example http://www.elasticsearch.org/guide/reference/index-modules/analysis/length-tokenfilter/) solve my problem?

Marco

The max field length is much higher than 255 (default is 10k). I think the
issue is that the whitespace tokenizer uses a buffer limited to 255
characters. Consider using a keyword tokenizer or not analyzing the field
at all. Not analyzed means that the field is still being indexed, just not
tokenized.
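
A small Python simulation of that behaviour, for illustration only. The function whitespace_tokenize is a hypothetical model of the effect described above (it is not the Lucene implementation), and the sample URL is made up to be long enough.

```python
MAX_WORD_LEN = 255  # the buffer limit discussed above

def whitespace_tokenize(text, max_len=MAX_WORD_LEN):
    """Split on whitespace, then cut any word longer than max_len into chunks."""
    tokens = []
    for word in text.split():
        for start in range(0, len(word), max_len):
            tokens.append(word[start:start + max_len])
    return tokens

# Made-up request without any whitespace, well over 255 characters
long_request = "/STAR/ScanMLP.asp?XML=" + "%3C" * 200

tokens = whitespace_tokenize(long_request)
print(len(long_request), [len(t) for t in tokens])
```

Each chunk would be indexed as a separate term of the same document, so a terms_stats facet reports every chunk with that document's Msec value. That matches the facet output in the first message, where fragments of one URL share identical min/max/mean.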

--
Ivan

On Wed, Jul 17, 2013 at 11:32 PM, sanmar <marco.santonocito@selex-es.com> wrote:


Thanks a lot, Ivan. I chose to solve my problem by splitting the long request into two separate fields, using the URIPATH and URIPARAM grok patterns (https://github.com/logstash/logstash/blob/master/patterns/grok-patterns).
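
The idea of the split, sketched in Python rather than grok. This only illustrates the path/parameter separation; split_request is a hypothetical helper, and the exact text captured by the grok patterns may differ from what urlsplit returns. The sample request comes from the first hit in my original message.

```python
from urllib.parse import urlsplit, parse_qsl

def split_request(request):
    """Return (path, query string) for a request such as '/a/b.asp?x=1&y=2'."""
    parts = urlsplit(request)
    return parts.path, parts.query

request = ("/STAR/Login_EX/Presentation/SelectJobMenu.html"
           "?OPR_ID=48675&GRP_ID=26521&SID=7341941&TRM_NAME=TRM0735CA05")

path, params = split_request(request)
print(path)                    # short and stable: a good facet key
print(dict(parse_qsl(params)))  # the variable part, kept separately
```

The path is short and repeats across requests, so faceting on it groups all calls to the same page, while the long parameter string no longer ends up in the facet key.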

Afterwards I'll map both as multi-fields to allow maximum search flexibility. I hope you agree with my decision.
I think the whitespace tokenizer's buffer limit may be a bug.

Marco