Group by not working for values of more than one word (containing spaces)

Hi

I have been working on this for more than 3 days and have tried many forums, but I still cannot fix it.

I am also new to Elasticsearch; this is our first project using it.

Creating a new index and querying it both work fine. Now I want to get search results with a "group by", as in SQL.

Below are the scripts I am using now.

Step 1:
I have installed Elasticsearch 2.0 on Ubuntu 14.04. I am able to create a new index using the code below. I am using PHP to call ES.

$hosts = array('our ip address:9200');
$client = \Elasticsearch\ClientBuilder::create()->setHosts($hosts)->build();
$index = "IndexName";
$params['index'] = $index;
$params['type'] = 'xyz';
$params['body']["id"] = "1";
$params['body']["title"] = "PHP Developer";
$params['body']["location"] = "United States";
$client->index($params);

Once the above code runs, the index is created successfully.

Step 2:
I am able to look into the created index using the link below:

`http://our ip address:9200/IndexName/_search?q=PHP&pretty`

{
"took" : 30,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 0.8968174,
"hits" : [ {
"_index" : "IndexName",
"_type" : "xyz",
"_id" : "1545680",
"_score" : 0.8968174,
"_source":{"id":"1545680","title":"PHP Developer","location":"United States"}
}, {
"_index" : "IndexName",
"_type" : "xyz",
"_id" : "1539771",
"_score" : 0.753807,
"_source":{"id":"1539771","title":"PHP Team Lead","location":"United States"}
}, {
"_index" : "IndexName",
"_type" : "xyz",
"_id" : "1539772",
"_score" : 0.253807,
"_source":{"id":"1539772","title":"PHP Lead","location":"India"}
}
....

Step 3: Now I want to add a group by.
So I added the mapping below while creating the new index:
$params['body']['mappings']['xyz']['properties']['location'] = array('type' => 'string','fields' => array("untouched" => array('type' => 'string','index' => 'not_analyzed')));
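
For reference, a minimal sketch of how this mapping could be passed along when the index is created, assuming the official elasticsearch-php client from Step 1. The index name `jobs_example` is a hypothetical placeholder (Elasticsearch index names must be lowercase). The code only builds and prints the request, and the actual client call is left as a comment.

```php
<?php
// Sketch only: builds the request for creating an index with an explicit mapping.
// 'jobs_example' is a hypothetical index name used for illustration.
$params = array(
    'index' => 'jobs_example',
    'body'  => array(
        'mappings' => array(
            'xyz' => array(
                'properties' => array(
                    'location' => array(
                        'type'   => 'string',
                        // Multi-field: location.untouched keeps the whole value as one term.
                        'fields' => array(
                            'untouched' => array(
                                'type'  => 'string',
                                'index' => 'not_analyzed',
                            ),
                        ),
                    ),
                ),
            ),
        ),
    ),
);

// Print the JSON body that would be sent, to check it before calling the server.
echo json_encode($params['body'], JSON_PRETTY_PRINT), "\n";

// Assumption: with a client built as in Step 1, the index would be created with
// $client->indices()->create($params);
```

The mapping needs to be in place before the documents are indexed. Documents that were indexed earlier have no `location.untouched` terms until they are reindexed.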

Step 4: I use the code below for the group by

$params['index'] = "IndexName";
$params['type']  = 'xyz';
$params['body']['query']['query_string'] = array("query" => "PHP","default_operator" => "AND" ,"fields" => array("title"));
$params['body']["aggs"] =  array("location" => array("terms" => array("field" => "location.untouched")));
$response = $client->search($params); 

Step 5: This is the response I get

[aggregations] => Array
(
[location] => Array
(
[doc_count_error_upper_bound] => 0
[sum_other_doc_count] => 0
[buckets] => Array
(
)
)
)

Step 6: I removed .untouched from the aggs
$params['body']["aggs"] = array("location" => array("terms" => array("field" => "location")));

I got the response below:

[aggregations] => Array
(
[location] => Array
(
[doc_count_error_upper_bound] => 0
[sum_other_doc_count] => 0
[buckets] => Array
(
[0] => Array
(
[key] => united
[doc_count] => 2
)
[1] => Array
(
[key] => india
[doc_count] => 1
)
[2] => Array
(
[key] => states
[doc_count] => 2
)
)
)
)

From the Step 6 output, the group by is almost working, but the problem is that each word of the key is counted separately ("united" and "states" instead of "United States").

How can I fix this issue?

Is there any setting that needs to be changed?

Please help me fix this issue.

Thanks
Aravin

Hi,

Sorry, I'm not too familiar with the PHP syntax (I rather prefer some curly brackets ;-)), but I can see from your example that you already have location.untouched mapped as not_analyzed, which is the key thing here so that the location strings don't get broken up. Your last step shows that the terms aggregation is working in principle, so the mapping you added in Step 3 somehow isn't being used. I assume you are setting up a completely new index with that mapping and reindexing your test documents into it? And you are also running your aggregation queries against that new index with the updated mapping?
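
To illustrate why the analyzed field gives buckets like `united` and `states`, here is a small sketch in plain PHP with no Elasticsearch involved. The function names are made up for this illustration. The tokenizer is only a rough approximation of the default analyzer (lowercase, split on non-alphanumeric characters), not the real implementation.

```php
<?php
// Rough approximation of the default (standard) analyzer for simple ASCII text:
// lowercase, then split on anything that is not a letter or digit.
// This is NOT Elasticsearch's real tokenizer, only an illustration.
function approximateStandardTokens($text)
{
    return preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Count documents per term, similar to how a terms aggregation builds buckets.
function countBuckets(array $docs, $analyzed)
{
    $buckets = array();
    foreach ($docs as $location) {
        // A not_analyzed field is indexed as one single, unchanged term.
        $terms = $analyzed
            ? array_unique(approximateStandardTokens($location))
            : array($location);
        foreach ($terms as $term) {
            $buckets[$term] = isset($buckets[$term]) ? $buckets[$term] + 1 : 1;
        }
    }
    return $buckets;
}

// Locations of the three documents shown in Step 2.
$locations = array('United States', 'United States', 'India');

print_r(countBuckets($locations, true));   // analyzed field: united, states, india
print_r(countBuckets($locations, false));  // not_analyzed field: United States, India
```

The first call reproduces the counts from Step 6 (united 2, states 2, india 1). The second shows the grouping a not_analyzed field would give.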

Just to be sure, what's the output when you do curl -XGET 'http://localhost:9200/YOURINDEX/_mapping/YOURTYPE'?
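
The same check could be sketched in PHP. The helper `fieldIndexSetting` below is hypothetical and written only for this illustration. It reads a decoded mapping response and reports how a field is indexed. The sample JSON is a hand-written example of the shape expected if the Step 3 mapping was applied, not real server output.

```php
<?php
// Returns the "index" setting of a field (or sub-field) from a decoded
// GET /{index}/_mapping/{type} response, or null if it is not set.
// Hypothetical helper, written for this illustration only.
function fieldIndexSetting(array $mappingResponse, $index, $type, $field, $subField = null)
{
    if (!isset($mappingResponse[$index]['mappings'][$type]['properties'][$field])) {
        return null;
    }
    $def = $mappingResponse[$index]['mappings'][$type]['properties'][$field];
    if ($subField !== null) {
        if (!isset($def['fields'][$subField])) {
            return null;
        }
        $def = $def['fields'][$subField];
    }
    return isset($def['index']) ? $def['index'] : null;
}

// Hand-written sample, assuming the Step 3 mapping was applied to "yourindex".
$json = '{"yourindex":{"mappings":{"xyz":{"properties":{"location":'
      . '{"type":"string","fields":{"untouched":{"type":"string","index":"not_analyzed"}}}}}}}}';
$response = json_decode($json, true);

// Assumption: with the elasticsearch-php client, the real response would come from
// $client->indices()->getMapping(array('index' => 'yourindex', 'type' => 'xyz'));

var_dump(fieldIndexSetting($response, 'yourindex', 'xyz', 'location', 'untouched'));
var_dump(fieldIndexSetting($response, 'yourindex', 'xyz', 'title'));
```

If the sub-field is missing from the real response, the mapping was not applied to the index being queried.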

Hi Cbuescher

It seems the not_analyzed location field is working fine now, after I set the mapping in the create() function and reindexed.

Now I am facing a challenge in using a whitespace analyzer to make the "C++ developer" keyword searchable.

After I added the analyzer setting to the index, I no longer get results for searches that returned results before. My _mapping and _settings JSON are below.

My _mapping - http://MYIP:9200/joblisting_bootstrap_new1/_mapping

`{"joblisting_bootstrap_new1":{"mappings":{"job":{"properties":{"approve":{"type":"string"},"city":{"type":"string"},"community_id":{"type":"string"},"comp_confi":{"type":"string"},"contact_confi":{"type":"string"},"description":{"type":"string"},"desirableskills":{"type":"string"},"e_u_organization":{"type":"string"},"employerid":{"type":"string"},"id":{"type":"string"},"insertdate":{"type":"date","format":"strict_date_optional_time||epoch_millis"},"job_feed_company_name":{"type":"string"},"job_feed_original_email":{"type":"string"},"job_feed_redirect_to":{"type":"string"},"job_feed_source_name":{"type":"string"},"job_type":{"type":"string"},"jobrole":{"type":"string","analyzer":"whitespace_analyzer"},"livedate":{"type":"date","format":"strict_date_optional_time||epoch_millis"},"location":{"type":"string","index":"not_analyzed"},"mandatoryskills":{"type":"string"},"post_as":{"type":"string"},"tag_id":{"type":"string"}}}}}}`

My _settings - http://MYIP:9200/joblisting_bootstrap_new1/_settings

{"joblisting_bootstrap_new1":{"settings":{"index":{"creation_date":"1459420737004","analysis":{"analyzer":{"whitespace_analyzer":{"filter":["lowercase"],"type":"custom","tokenizer":"whitespace"}}},"number_of_shards":"5","number_of_replicas":"1","uuid":"_csrQotYSRiiA550zVVkqA","version":{"created":"2010099"}}}}}

Here are my input and output.
Input as an array in PHP:

    Array
    (
        [index] => joblisting_bootstrap_new1
        [type] => job
        [size] => 10
        [from] => 0
        [body] => Array
            (
                [query] => Array
                    (
                        [query_string] => Array
                            (
                                [query] => C++ Heveloper
                                [default_operator] => AND
                                [fields] => Array
                                    (
                                        [0] => jobrole
                                        [1] => location
                                    )
                            )
                    )
                [aggs] => Array
                    (
                        [location] => Array
                            (
                                [terms] => Array
                                    (
                                        [field] => location
                                    )
                            )
                    )
            )
    )

Output as below:

    Array
    (
        [took] => 2
        [timed_out] => 
        [_shards] => Array
            (
                [total] => 20
                [successful] => 20
                [failed] => 0
            )
        [hits] => Array
            (
                [total] => 0
                [max_score] => 
                [hits] => Array
                    (
                    )
            )
        [aggregations] => Array
            (
                [location] => Array
                    (
                        [doc_count_error_upper_bound] => 0
                        [sum_other_doc_count] => 0
                        [buckets] => Array
                            (
                            )
                    )
            )
    )

Please suggest where the issue is in my analyzer settings. If I reindex without the analyzer setting, the group by works fine and I get the results as expected.
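
As a way to reason about what the whitespace_analyzer from the settings above should produce, here is a small plain-PHP sketch. Both functions are approximations written for this illustration, not Elasticsearch code. The first mimics a whitespace tokenizer followed by a lowercase filter, and the second roughly mimics the default analyzer on plain ASCII text.

```php
<?php
// Approximation of the custom analyzer from the settings above:
// whitespace tokenizer followed by a lowercase filter.
function whitespaceLowercaseTokens($text)
{
    return preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
}

// Rough approximation of the default (standard) analyzer for plain ASCII text:
// lowercase, then split on anything that is not a letter or digit.
function approximateStandardTokens($text)
{
    return preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

print_r(whitespaceLowercaseTokens('C++ Developer'));  // keeps the "+" signs: c++, developer
print_r(approximateStandardTokens('C++ Developer'));  // drops the "+" signs: c, developer
```

The tokens Elasticsearch actually produces can be inspected with the `_analyze` API on the index. If the tokens stored for the indexed jobrole values differ from the tokens produced for the query text, the query cannot match.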

Thanks