Mapping and analysis issue on a river

alexandre_klein · February 24, 2015, 1:47pm

Hello,
My name is Alex. I am working on a student project and, of course, I am
quite new to Elasticsearch.
I have an issue with mapping and analysis:

I am using a MongoDB river, which is working perfectly. Here is how I create
my index:

curl -s -XPUT "http://localhost:9200/twitter" -d '
{
  "twitter" : {
    "mappings" : {
      "id" : {
        "type" : "long",
        "store" : true
      },
      "text" : {
        "type" : "string",
        "store" : true,
        "index" : "analysed"
      },
      "created_at" : {
        "type" : "date",
        "store" : true
      },
      "location" : {
        "type" : "string",
        "store" : true
      },
      "geo" : {
        "type" : "geo_point",
        "store" : true
      }
    },
    "analysis":{
      "analyser":{
        "indexAnalyser" : {
          "type":"custom",
          "tokenizer":"standard",
          "filter":["lowercase", "whitespace"]
        },
        "searchAnalyzer":{
          "type":"custom",
          "tokenizer":"standard",
          "filter":[
            "standard",
            "lowercase",
            "whitespace"
          ]
        }
      }
    }
  }
}'

And my river:

curl -s -XPUT "http://localhost:9200/_river/mongodb/_meta" -d '
{
  "type": "mongodb",
  "mongodb": {
    "servers": [
      { "host": "My IP", "port": 27017 }
    ],
    "options": {},
    "db": "twitter",
    "collection": "tweets"
  },
  "index": {
    "name": "twitter"
  }
}'

The problem is that the tokenizer doesn't do its job.
When I check the index metadata, I see this:

{
  "state": "open",
  "settings": {
    "index": {
      "twitter": {
        "analysis": {
          "analyser": {
            "searchAnalyzer": {
              "type": "custom",
              "filter": [
                "standard",
                "lowercase",
                "whitespace"
              ],
              "tokenizer": "standard"
            },
            "indexAnalyser": {
              "type": "custom",
              "filter": [
                "lowercase",
                "whitespace"
              ],
              "tokenizer": "standard"
            }
          }
        },
        "mappings": {
          "id": {
            "type": "long",
            "store": "true"
          },
          "created_at": {
            "type": "date",
            "store": "true"
          },
          "location": {
            "store": "true",
            "type": "string"
          },
          "text": {
            "type": "string",
            "store": "true",
            "index": "analysed"
          },
          "geo": {
            "store": "true",
            "type": "geo_point"
          }
        }
      },
      "uuid": "8_6Pliw_TAW_x34cFhIUtg",
      "number_of_replicas": "1",
      "number_of_shards": "5",
      "version": {
        "created": "1020299"
      }
    }
  },
  "mappings": {
    "twitter": {
      "properties": {
        "id": {
          "type": "long"
        },
        "text": {
          "type": "string"
        },
        "geo": {
          "properties": {
            "lon": {
              "type": "double"
            },
            "lat": {
              "type": "double"
            }
          }
        },
        "location": {
          "type": "string"
        },
        "created_at": {
          "type": "string"
        },
        "country": {
          "type": "string"
        }
      }
    }
  },
  "aliases":
}

There are two mappings, and one of them is a child of settings.

Could you kindly explain what I did wrong?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f76fc651-ca65-4335-abf1-8754aa076c4f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

alexandre_klein · February 24, 2015, 1:55pm

And, in case you need it, here is the structure of a document in my MongoDB collection:

{
  "_id" : ObjectId("54eb3c35a710901a698b4567"),
  "country" : "FR",
  "created_at" : "Mon Feb 23 14:25:30 +0000 2015",
  "geo" : {
    "lat" : 49.119696,
    "lon" : 6.176355
  },
  "id" : -812216320,
  "location" : "Metz ",
  "text" : "Passer des vacances formidable avec des gens géniaux sans aucunes pression avec pour seul soucis s'éclater et se laisser vivre #BONHEUR"
}
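
A side note on the created_at value in this document: it is not in ISO 8601 form, so a date mapping without an explicit format is unlikely to parse it. The sketch below only illustrates the layout by parsing the sample value in Python. The names parse_created_at and to_iso8601 and the strptime pattern are assumptions made for this example; they are not part of the river or of Elasticsearch.

```python
from datetime import datetime, timezone

# Assumed strptime equivalent of the layout "Mon Feb 23 14:25:30 +0000 2015".
CREATED_AT_PATTERN = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value: str) -> datetime:
    """Parse a created_at string shaped like the sample document's value."""
    return datetime.strptime(value, CREATED_AT_PATTERN)


def to_iso8601(value: str) -> str:
    """Convert the created_at string to an ISO 8601 timestamp in UTC."""
    return parse_created_at(value).astimezone(timezone.utc).isoformat()
```

Converting the value before indexing, or declaring a matching format in the date mapping, are two possible ways to deal with this layout.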


dadoonet · February 24, 2015, 3:23pm

Hello Alexandre,

I think you should look at the create index documentation: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html#mappings

Several things are wrong:

"analyser" should be spelled "analyzer"
the position of the analyzers (they belong under "settings"), …
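
The metadata in the question hints at what happened to the original request: with no top-level "settings" or "mappings" key, the whole body appears to have been stored as index settings, which is why a second "mappings" shows up under settings.index.twitter. The sketch below mimics that flattening in Python purely as an illustration. flatten_settings is a hypothetical helper written for this thread, not an Elasticsearch API.

```python
def flatten_settings(body, prefix="index"):
    """Flatten a nested request body into dotted setting names."""
    flat = {}

    def walk(node, path):
        if isinstance(node, dict):
            for name, value in node.items():
                walk(value, f"{path}.{name}")
        elif isinstance(node, list):
            # List items become numbered keys: filter.0, filter.1, ...
            for position, value in enumerate(node):
                walk(value, f"{path}.{position}")
        else:
            flat[path] = node

    walk(body, prefix)
    return flat


# Trimmed copy of the request from the question (assumption: only two
# branches are kept to keep the example short).
original_body = {
    "twitter": {
        "mappings": {"geo": {"type": "geo_point"}},
        "analysis": {
            "analyser": {
                "indexAnalyser": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "whitespace"],
                }
            }
        },
    }
}

flat = flatten_settings(original_body)
```

Here flat contains keys such as index.twitter.mappings.geo.type, which look like settings rather than a mapping.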

Here is something that works:

DELETE twitter
PUT /twitter
{
  "settings": {
    "analysis": {
      "analyzer": {
        "indexAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase"
          ]
        },
        "searchAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "standard",
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "twitter": {
      "properties": {
        "id": {
          "type": "long",
          "store": true
        },
        "text": {
          "type": "string",
          "store": true,
          "index": "analyzed"
        },
        "created_at": {
          "type": "date",
          "store": true
        },
        "location": {
          "type": "string",
          "store": true
        },
        "geo": {
          "type": "geo_point",
          "store": true
        }
      }
    }
  }
}
GET twitter
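
The points above can be turned into a small pre-flight check. This is only a sketch in Python under stated assumptions: find_key and check_create_index_body are hypothetical helpers, and the rules cover just the mistakes discussed in this thread (top-level layout, the "analyser" spelling, the position of "analysis", and a missing "properties" level), not everything Elasticsearch validates.

```python
def find_key(node, key, path=""):
    """Yield the dotted path of every occurrence of key in nested dicts."""
    if isinstance(node, dict):
        for name, value in node.items():
            here = f"{path}.{name}" if path else name
            if name == key:
                yield here
            yield from find_key(value, key, here)


def check_create_index_body(body):
    """Return a list of layout problems found in a create-index body."""
    problems = []
    if "settings" not in body and "mappings" not in body:
        problems.append("no top-level 'settings' or 'mappings' key")
    for path in find_key(body, "analyser"):
        problems.append(f"'{path}' is misspelled, expected 'analyzer'")
    for path in find_key(body, "analysis"):
        if path not in ("settings.analysis", "settings.index.analysis"):
            problems.append(f"'{path}' should sit under 'settings'")
    for type_name, mapping in body.get("mappings", {}).items():
        if "properties" not in mapping:
            problems.append(f"mapping type '{type_name}' has no 'properties'")
    return problems


# Shape of the request from the question (trimmed, for illustration).
broken = {
    "twitter": {
        "mappings": {"id": {"type": "long"}},
        "analysis": {"analyser": {"indexAnalyser": {"type": "custom"}}},
    }
}

# Shape of the working request (trimmed, for illustration).
working = {
    "settings": {"analysis": {"analyzer": {"indexAnalyzer": {"type": "custom"}}}},
    "mappings": {"twitter": {"properties": {"id": {"type": "long"}}}},
}
```

Running check_create_index_body(broken) reports three problems, while the working layout returns an empty list.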

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs


alexandre_klein · February 24, 2015, 9:16pm

Thank you very much (merci beaucoup!).

I have one last question:

Everything works fine, except that my geo_point isn't populated in Elasticsearch.

Here are my queries:

curl -s -XPUT "http://localhost:9200/twitter" -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "indexAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase", "asciifolding"
          ]
        },
        "searchAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            [ "standard","lowercase", "asciifolding" ]
          ]
        }
      },
      "filter" : {
        "frenchLowerFilter" : {
          "type" : "lowercase",
          "language" : "french"
        },
        "asciifolding" : {
          "type" : "asciifolding"
        }
      }
    }
  },
  "mappings": {
    "twitterdb": {
      "properties": {
        "id": {
          "type": "long",
          "store": true
        },
        "text": {
          "type": "string",
          "store": true,
          "analyzer" : "indexAnalyzer",
          "filter" : ["frenchLowerFilter", "asciifolding"]
        },
        "created_at": {
          "type": "date",
          "format": "EE MMM d HH:mm:ss Z yyyy",
          "store": true
        },
        "location": {
          "type": "string",
          "store": true
        },
        "geo": {
          "type": "geo_point",
          "store": true
        }
      }
    }
  }
}'
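
For intuition about what indexAnalyzer in the request above is configured to do (whitespace tokenizer, then lowercase and asciifolding), here is a rough Python approximation. It is an assumption-laden sketch: analyze and ascii_fold are hypothetical names, and stripping combining marks after Unicode decomposition only approximates the asciifolding filter, which handles more characters than this.

```python
import unicodedata


def ascii_fold(token: str) -> str:
    """Approximate asciifolding by dropping combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text: str) -> list:
    """Split on whitespace, lowercase each token, then fold accents."""
    return [ascii_fold(token.lower()) for token in text.split()]


tokens = analyze("Passer des vacances avec des gens géniaux #BONHEUR")
```

With this approximation, "géniaux #BONHEUR" becomes the tokens geniaux and #bonheur; the hashtag keeps its "#" because the text is split on whitespace only.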

curl -s -XPUT "http://localhost:9200/_river/mongodb/_meta" -d '
{
  "type": "mongodb",
  "mongodb": {
    "servers": [
      { "host": "My IP", "port": 27017 }
    ],
    "options": {},
    "db": "twitterdb",
    "collection": "tweets"
  },
  "index": {
    "name": "twitter"
  }
}'

My index metadata:

{
  "state": "open",
  "settings": {
    "index": {
      "uuid": "IKZbCpeeRm6Ns6UTXl43vg",
      "analysis": {
        "analyzer": {
          "indexAnalyzer": {
            "type": "custom",
            "filter": [
              "lowercase",
              "asciifolding"
            ],
            "tokenizer": "whitespace"
          },
          "searchAnalyzer": {
            "type": "custom",
            "filter": [
              [
                "standard",
                "lowercase",
                "asciifolding"
              ]
            ],
            "tokenizer": "whitespace"
          }
        },
        "filter": {
          "asciifolding": {
            "type": "asciifolding"
          },
          "frenchLowerFilter": {
            "type": "lowercase",
            "language": "french"
          }
        }
      },
      "number_of_replicas": "1",
      "number_of_shards": "5",
      "version": {
        "created": "1020299"
      }
    }
  },
  "mappings": {
    "twitterdb": {
      "properties": {
        "id": {
          "store": true,
          "type": "long"
        },
        "text": {
          "store": true,
          "analyzer": "indexAnalyzer",
          "type": "string"
        },
        "geo": {
          "store": true,
          "type": "geo_point"
        },
        "location": {
          "store": true,
          "type": "string"
        },
        "created_at": {
          "store": true,
          "format": "EE MMM d HH:mm:ss Z yyyy",
          "type": "date"
        },
        "country": {
          "type": "string"
        }
      }
    }
  },
  "aliases":
}

And here is a sample of what is stored in my MongoDB collection:

{
  "_id" : ObjectId("54ece244a71090fc698b4567"),
  "id" : 1733746688,
  "text" : "#arte #bonheur au #travail. #Favi 2 règles: Et l'homme est bon! Organiser en "mini usine", la régulation se fait à 90% sur le terrain",
  "created_at" : "Tue Feb 24 20:37:28 +0000 2015",
  "location" : "Paris, France",
  "geo" : { "lat" : 48.856506, "lon" : 2.352133 },
  "country" : "FR"
}

But when I look in _plugin/head, there is no geo, no lat, and no lon.

I looked at the Elasticsearch documentation and my mapping looks good. I think
I am misunderstanding something about how to store an object like
"geo" : { "lat" : "", "lon" : "" } as a geo_point in the mapping, but I don't
know what.

What did I do wrong, please? (It will be my last question this month ;))


dadoonet · February 24, 2015, 9:40pm

Well, first you should use Marvel and Sense. I've seen some bugs in Head in the past when it comes to displaying content.
Run a single GET and check what you have.
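
A single GET can also be scripted. The following Python sketch assumes the node from this thread at http://localhost:9200 and the twitter index. fetch_sample and hits_missing_field are hypothetical helper names, and the only Elasticsearch call used is the _search endpoint with a size parameter.

```python
import json
from urllib.request import urlopen


def hits_missing_field(search_response: dict, field: str) -> list:
    """Return the _id of every hit whose _source lacks the given field."""
    hits = search_response.get("hits", {}).get("hits", [])
    return [hit.get("_id") for hit in hits if field not in hit.get("_source", {})]


def fetch_sample(base_url="http://localhost:9200", index="twitter", size=5):
    """Run a small search against the index and return the parsed JSON body."""
    with urlopen(f"{base_url}/{index}/_search?size={size}") as response:
        return json.load(response)


# Hand-written response used to show the check without a running node.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "a", "_source": {"geo": {"lat": 48.856506, "lon": 2.352133}}},
            {"_id": "b", "_source": {"text": "no coordinates here"}},
        ]
    }
}
missing = hits_missing_field(sample_response, "geo")
```

Against a live node, hits_missing_field(fetch_sample(), "geo") would list the sampled documents whose _source has no geo field; an empty list suggests the data is there and only the display in Head is off.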

David


alexandre_klein · February 24, 2015, 10:01pm

You were right.

It was working, but it was not shown in Head.

Thank you very much!

Alex

Le mardi 24 février 2015 22:41:00 UTC+1, David Pilato a écrit :

Well. First you should use Marvel and sense. I've seen some bugs in Head
in past when it comes to displaying content.
Run a single GET and check what you have.

David

Le 24 févr. 2015 à 22:16, alexand...@instantluxe.com <javascript:> a
écrit :

Thank you very much (Merci Beaucoup !).

I have a last question :

Everything works find, except my geo_point isn't populated in
elasticsearch :

Here are my queries :

curl -s -XPUT "http://localhost:9200/twitter" -d '
{
"settings": {
"analysis": {
"analyzer": {
"indexAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase", "asciifolding"
]
},
"searchAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
[ "standard","lowercase", "asciifolding" ]
]
}
},
"filter" : {
"frenchLowerFilter" : {
"type" : "lowercase",
"language" : "french"
},
"asciifolding" : {
"type" : "asciifolding"
}
}
}
},
"mappings": {
"twitterdb": {
"properties": {
"id": {
"type": "long",
"store": true
},
"text": {
"type": "string",
"store": true,
"analyzer" : "indexAnalyzer",
"filter" : ["frenchLowerFilter", "asciifolding"]
},
"created_at": {
"type": "date",
"format": "EE MMM d HH:mm:ss Z yyyy",
"store": true
},
"location": {
"type": "string",
"store": true
},
"geo": {
"type": "geo_point",
"store": true
}
}
}
}
}'

curl -s -XPUT "http://localhost:9200/_river/mongodb/_meta" -d '
{
"type": "mongodb",
"mongodb":
{"servers": [{
"host": "My IP", "port": 27017
}],
"options": {},
"db": "twitterdb",
"collection": "tweets"
},"index": {
"name": "twitter"
}
}'

My index Metadata :

{

"state": "open",

"settings": {

"index": {

"uuid": "IKZbCpeeRm6Ns6UTXl43vg",

"analysis": {

"analyzer": {

"indexAnalyzer": {

"type": "custom",

"filter": [

"lowercase",

"asciifolding"
],

"tokenizer": "whitespace"
},

"searchAnalyzer": {

"type": "custom",

"filter": [

[

"standard",

"lowercase",

"asciifolding"
]
],

"tokenizer": "whitespace"
}
},

"filter": {

"asciifolding": {

"type": "asciifolding"
},

"frenchLowerFilter": {

"type": "lowercase",

"language": "french"
}
}
},

"number_of_replicas": "1",

"number_of_shards": "5",

"version": {

"created": "1020299"
}
}
},

"mappings": {

"twitterdb": {

"properties": {

"id": {

"store": true,

"type": "long"
},

"text": {

"store": true,

"analyzer": "indexAnalyzer",

"type": "string"
},

"geo": {

"store": true,

"type": "geo_point"
},

"location": {

"store": true,

"type": "string"
},

"created_at": {

"store": true,

"format": "EE MMM d HH:mm:ss Z yyyy",

"type": "date"
},

"country": {

"type": "string"
}
}
}
},

"aliases":

}

And a sample of what is stored in my mongo :

{ "_id" : ObjectId("54ece244a71090fc698b4567"), "id" : 1733746688, "text"
: "#arte #bonheur au #travail. #Favi 2 règles: Et l'homme est bon!
Organiser en "mini usine", la régulation se fait à 90% sur le terrain",
"created_at" : "Tue Feb 24 20:37:28 +0000 2015", "location" : "Paris,
France", "geo" : { "lat" : 48.856506, "lon" : 2.352133 }, "country" : "FR"}

But when I look in _plugin/head, there is no geo, lat, or lon.

I looked at
Elasticsearch Platform — Find real-time answers at scale | Elastic
and it looks good. I think I am missing something about how to
store an object like "geo" : { "lat" : "", "lon" : "" } as a geo_point in
the mapping, but I don't know what.

What did I do wrong, please? (It will be my last question this month ;))
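One thing that can be ruled out on the client side is malformed source data. The sketch below is only an illustration and does not explain why head shows no geo: it parses the sample's created_at with a Python approximation of the mapping's "EE MMM d HH:mm:ss Z yyyy" pattern, and checks that geo has the { "lat", "lon" } object shape that a geo_point field accepts. The helper names (parse_created_at, is_geo_point_object) are hypothetical and not part of Elasticsearch or the river.

```python
from datetime import datetime

# Python approximation of the mapping's Joda-style pattern
# "EE MMM d HH:mm:ss Z yyyy"; assumes the default (English) C locale
# so that "Tue" and "Feb" are recognised.
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse a timestamp such as 'Tue Feb 24 20:37:28 +0000 2015'."""
    return datetime.strptime(value, TWITTER_DATE_FORMAT)


def is_geo_point_object(value):
    """True if value is a {"lat": ..., "lon": ...} dict with in-range numbers."""
    if not isinstance(value, dict) or set(value) != {"lat", "lon"}:
        return False
    lat, lon = value["lat"], value["lon"]
    for coordinate in (lat, lon):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(coordinate, bool) or not isinstance(coordinate, (int, float)):
            return False
    return -90 <= lat <= 90 and -180 <= lon <= 180


# Fields copied from the sample document in the post.
sample = {
    "created_at": "Tue Feb 24 20:37:28 +0000 2015",
    "geo": {"lat": 48.856506, "lon": 2.352133},
}

parsed = parse_created_at(sample["created_at"])
geo_ok = is_geo_point_object(sample["geo"])
print(parsed.isoformat(), geo_ok)
```

If both checks pass, as they should for this sample, the source document is well formed and the remaining suspects are the mapping and the river configuration.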

On Tuesday, February 24, 2015 at 16:23:45 UTC+1, David Pilato wrote:

Hello Alexandre

I think you should look at
Elasticsearch Platform — Find real-time answers at scale | Elastic

Several things are wrong:

analyser vs. analyzer (spelling)
the position of the analyzers, …

Here is something that works:

DELETE twitter
PUT /twitter
{
"settings": {
"analysis": {
"analyzer": {
"indexAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase"
]
},
"searchAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"standard",
"lowercase"
]
}
}
}
},
"mappings": {
"twitter": {
"properties": {
"id": {
"type": "long",
"store": true
},
"text": {
"type": "string",
"store": true,
"index": "analyzed"
},
"created_at": {
"type": "date",
"store": true
},
"location": {
"type": "string",
"store": true
},
"geo": {
"type": "geo_point",
"store": true
}
}
}
}
}
GET twitter
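
The two mistakes named above (the "analyser"/"analysed" spelling, and the analysis and mappings blocks sitting in the wrong place) can be caught before the request is sent. The following is a minimal sketch, assuming the request body is available as a Python dict; lint_index_body is a hypothetical helper, not an Elasticsearch API, and it only checks the layout used in the working example (analysis directly under "settings", and each mapping type holding "properties").

```python
def lint_index_body(body):
    """Return a list of problems found in an index-creation body (a dict)."""
    problems = []

    def walk(node, path):
        # Look for the British spellings anywhere in the body.
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "analyser":
                    problems.append("key 'analyser' at " + "/".join(path + [key]))
                walk(value, path + [key])
        elif isinstance(node, list):
            for item in node:
                walk(item, path)
        elif node == "analysed":
            problems.append("value 'analysed' at " + "/".join(path))

    walk(body, [])

    # The analysis block is expected under the top-level "settings" key.
    settings = body.get("settings")
    if not isinstance(settings, dict) or "analysis" not in settings:
        problems.append("no 'analysis' block directly under top-level 'settings'")

    # Mappings are expected as mappings -> <type name> -> properties.
    mappings = body.get("mappings")
    if not isinstance(mappings, dict) or not mappings:
        problems.append("no top-level 'mappings' block")
    else:
        for type_name, type_mapping in mappings.items():
            if not isinstance(type_mapping, dict) or "properties" not in type_mapping:
                problems.append("type '%s' has no 'properties'" % type_name)
    return problems


# Abridged version of the original request from the first post.
broken = {
    "twitter": {
        "mappings": {"text": {"type": "string", "store": True, "index": "analysed"}},
        "analysis": {"analyser": {"indexAnalyser": {"type": "custom", "tokenizer": "standard"}}},
    }
}

# Abridged version of the working request above.
fixed = {
    "settings": {"analysis": {"analyzer": {"indexAnalyzer": {
        "type": "custom", "tokenizer": "whitespace", "filter": ["lowercase"]}}}},
    "mappings": {"twitter": {"properties": {
        "text": {"type": "string", "store": True, "index": "analyzed"}}}},
}

for problem in lint_index_body(broken):
    print(problem)
```

Run against the abridged original request, the sketch reports both spellings and both misplaced blocks; the working request yields an empty list.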

--
David Pilato | Technical Advocate | Elasticsearch.com
http://Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr
https://twitter.com/elasticsearchfr | @scrutmydocs
https://twitter.com/scrutmydocs

On Feb 24, 2015, at 14:55, alexand...@instantluxe.com wrote:

And, if you need it, here is my MongoDB collection data structure:

{
"_id" : ObjectId("54eb3c35a710901a698b4567"),
"country" : "FR",
"created_at" : "Mon Feb 23 14:25:30 +0000 2015",
"geo" : {
"lat" : 49.119696,
"lon" : 6.176355
},
"id" : -812216320,
"location" : "Metz ",
"text" : "Passer des vacances formidable avec des gens géniaux sans
aucunes pression avec pour seul soucis s'éclater et se laisser vivre
#BONHEUR"
}


