Wrestling with analyzer

After some digging I found a way to show the indexed terms, and the
analyzer has performed the "ij" -> "y" mapping:

$ curl 'http://localhost:9200/geocoder/_search?pretty=true' -d '{
  "query" : {
    "match_all" : { }
  },
  "script_fields": {
    "terms" : {
      "script": "doc[field].values",
      "params": {
        "field": "_all"
      }
    }
  }
}'
{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "geocoder",
      "_type" : "address",
      "_id" : "1",
      "_score" : 1.0,
      "fields" : {
        "terms" : [ "2542", "573", "gn", "gravenhage", "s", "wantsnydersgaarde" ]
      }
    } ]
  }
}
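
For anyone repeating this check, here is a minimal Python sketch that compares the terms returned by the script field against the terms you expect the analyzer to have produced. The response is a trimmed copy of the output above; the helper names `indexed_terms` and `missing_terms` are made up for illustration and are not part of any Elasticsearch client.

```python
import json

# Trimmed copy of the script_fields response shown above.
response = json.loads("""
{
  "hits": {
    "total": 1,
    "hits": [ {
      "_id": "1",
      "fields": {
        "terms": ["2542", "573", "gn", "gravenhage", "s", "wantsnydersgaarde"]
      }
    } ]
  }
}
""")


def indexed_terms(resp):
    # Map each document id to the set of terms returned by the script field.
    return {hit["_id"]: set(hit["fields"]["terms"]) for hit in resp["hits"]["hits"]}


def missing_terms(resp, doc_id, expected):
    # Expected terms that are absent from the indexed terms of one document.
    return sorted(set(expected) - indexed_terms(resp)[doc_id])


print(missing_terms(response, "1", ["s", "gravenhage"]))  # []
print(missing_terms(response, "1", ["den", "haag"]))      # ['den', 'haag']
```

The synonym targets "s" and "gravenhage" are in the index, while the literal words "den" and "haag" are not, so a search only matches if the query side is rewritten by the synonym filter as well.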

Can anything else go wrong with the mapping?

Regards,
Joost

2013/8/5 Joost Cassee joost@cassee.net:

Hi Simon,

This is the mapping for the address type:

{
  "properties": {
    "street": { "type": "string" },
    "housenumber": { "type": "string" },
    "postal_code": { "type": "string", "analyzer": "postal_code" },
    "city": { "type": "string" },
    "point": { "type": "geo_point" }
  }
}

I expected the city field to be analyzed by the default analyzer,
which is the one I configured, right?
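
To make that expectation concrete, here is a small Python sketch of the fallback rule as I understand it: a string field with no "analyzer" entry in the mapping falls back to the index-level analyzer named "default". The function `analyzer_for` is hypothetical, and it deliberately ignores the separate index_analyzer / search_analyzer settings.

```python
# The address mapping from this message, as a Python dict.
mapping = {
    "properties": {
        "street": {"type": "string"},
        "housenumber": {"type": "string"},
        "postal_code": {"type": "string", "analyzer": "postal_code"},
        "city": {"type": "string"},
        "point": {"type": "geo_point"},
    }
}


def analyzer_for(mapping, field, default="default"):
    # Assumption: only string fields are analyzed, and a string field without
    # an explicit analyzer uses the index-level analyzer named "default".
    props = mapping["properties"][field]
    if props["type"] != "string":
        return None
    return props.get("analyzer", default)


print(analyzer_for(mapping, "city"))         # default
print(analyzer_for(mapping, "postal_code"))  # postal_code
print(analyzer_for(mapping, "point"))        # None
```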

Regards,
Joost

2013/8/2 simonw simon.willnauer@elasticsearch.com:

Can you send your mapping as well? Maybe you don't set the index / search
analyzer for the field "city" in the mapping. If you don't, the synonym
filter won't be used there.

simon

On Wednesday, July 31, 2013 4:19:04 PM UTC+2, Joost Cassee wrote:

Hi,

At Go About we are using elasticsearch (0.90.2) for geocoding.
Unfortunately, I am running into a problem I could use some help with.

I use the following analyzer configuration:

index:
  analysis:
    analyzer:
      default:
        alias: [goabout]
        type: custom
        tokenizer: standard
        filter: [lowercase, synonym, standard, asciifolding]
        char_filter: [char_mapper]
      postal_code:
        tokenizer: keyword
        filter: [lowercase]
    tokenizer:
      standard:
        stopwords:
    filter:
      synonym:
        type: synonym
        synonyms:
          - st => sint
          - den haag => s gravenhage
          - den bosch => s hertogenbosch
          - jp => jan pieterszoon
          - mh => maarten harpertszoon
    char_filter:
      char_mapper:
        type: mapping
        mappings:
          - ij => y
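
To illustrate what this chain is meant to do, here is a rough Python approximation of the "default" analyzer: char filter, then tokenizer, then lowercase, synonym and ASCII folding. It is only a sketch and not how Lucene implements these steps. The regex split stands in for the standard tokenizer, the NFKD trick only approximates asciifolding, the "standard" token filter is skipped, and all function names are invented for the example.

```python
import re
import unicodedata

# Synonym rules copied from the configuration above (left side => right side).
SYNONYMS = {
    ("st",): ["sint"],
    ("den", "haag"): ["s", "gravenhage"],
    ("den", "bosch"): ["s", "hertogenbosch"],
    ("jp",): ["jan", "pieterszoon"],
    ("mh",): ["maarten", "harpertszoon"],
}


def char_map(text):
    # char_mapper: the mapping char filter runs before the tokenizer.
    return text.replace("ij", "y")


def tokenize(text):
    # Crude stand-in for the standard tokenizer: split on non-word characters.
    return [t for t in re.split(r"[^\w]+", text) if t]


def apply_synonyms(tokens):
    out, i = [], 0
    while i < len(tokens):
        for n in (2, 1):  # try two-word rules before one-word rules
            key = tuple(tokens[i:i + n])
            if len(key) == n and key in SYNONYMS:
                out.extend(SYNONYMS[key])
                i += n
                break
        else:
            # No rule matched at this position: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


def ascii_fold(token):
    # Approximation of asciifolding: drop combining marks after NFKD.
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def analyze(text):
    tokens = [t.lower() for t in tokenize(char_map(text))]
    return [ascii_fold(t) for t in apply_synonyms(tokens)]


print(analyze("Den Haag"))            # ['s', 'gravenhage']
print(analyze("Wantsnijdersgaarde"))  # ['wantsnydersgaarde']
print(analyze("'s-Gravenhage"))       # ['s', 'gravenhage']
```

Note that the two-word rule can only fire when both words reach the synonym step together in one token stream.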

I then index the following document:

$ curl -XPUT http://localhost:9200/geocoder/address/1 -d "{\"city\": \"'s-Gravenhage\", \"point\": {\"lat\": 52.034608082483366, \"lon\": 4.266201580347966}, \"street\": \"Wantsnijdersgaarde\", \"postal_code\": \"2542 GN\", \"housenumber\": \"573\"}"

(We put a mapping first to make sure "point" is a geo_point, but this is
not relevant for this problem.)

The analyzer seems to work correctly:

$ curl -X GET "http://localhost:9200/geocoder/_analyze?pretty=true" -d "Den Haag"
{
  "tokens" : [ {
    "token" : "s",
    "start_offset" : 0,
    "end_offset" : 3,
    "type" : "SYNONYM",
    "position" : 1
  }, {
    "token" : "gravenhage",
    "start_offset" : 4,
    "end_offset" : 8,
    "type" : "SYNONYM",
    "position" : 2
  } ]
}

The analyzer seems to be used for both indexing and querying, as this query
(which uses "y" in place of "ij") finds the document:

$ curl -X GET "http://localhost:9200/geocoder/_search?q=Wantsnydersgaarde&analyzer=goabout&pretty=true"
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.095891505,
    "hits" : [ {
      "_index" : "geocoder",
      "_type" : "address",
      "_id" : "1",
      "_score" : 0.095891505, "_source" : {"city": "'s-Gravenhage", "point": {"lat": 52.034608082483366, "lon": 4.266201580347966}, "street": "Wantsnijdersgaarde", "postal_code": "2542 GN", "housenumber": "573"}
    } ]
  }
}

But this search query does not return results:

$ curl -X GET "http://localhost:9200/geocoder/_search?q=den+haag&analyzer=goabout&pretty=true"
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

Something is going on with the synonym filter. What can I do to further
debug my problem?
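
One possible explanation, not confirmed anywhere in this thread: as far as I know, the query string parser behind `q=` splits the query text on whitespace before handing each word to the analyzer, whereas the _analyze API passes the whole text through at once. If so, the two-word rule "den haag" would never see both words together at query time. The Python sketch below only illustrates that hypothesis with a simplified stand-in analyzer; the function names are invented for the example.

```python
# Only the rule relevant here, copied from the synonym filter configuration.
SYNONYMS = {("den", "haag"): ["s", "gravenhage"]}

# Indexed terms for document 1, as reported at the top of this thread.
INDEXED_TERMS = {"2542", "573", "gn", "gravenhage", "s", "wantsnydersgaarde"}


def analyze(text):
    # Simplified analyzer: lowercase, whitespace tokens, then synonym rules.
    tokens = text.lower().split()
    out, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in SYNONYMS:
            out.extend(SYNONYMS[pair])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def analyze_whole(query):
    # What the _analyze API does: the full text goes through the analyzer.
    return analyze(query)


def analyze_per_word(query):
    # Assumed query parser behaviour: split on whitespace first, then analyze
    # each word on its own, so a multi-word rule can never match.
    terms = []
    for word in query.split():
        terms.extend(analyze(word))
    return terms


whole = analyze_whole("den haag")
per_word = analyze_per_word("den haag")
print(whole, all(t in INDEXED_TERMS for t in whole))        # ['s', 'gravenhage'] True
print(per_word, any(t in INDEXED_TERMS for t in per_word))  # ['den', 'haag'] False
```

If this is the cause, a query type that hands the whole string to the analyzer (a `match` query, or a quoted phrase in the query string) might behave differently. That is untested here.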

Regards,
Joost

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/NHR4uRa0y8E/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
Joost Cassee
http://joost.cassee.net
