Mapping and analysis issue on a river

Hello,
My name is Alex. I am working on a student project and, of course, I am
quite new to Elasticsearch.
I have an issue with mapping and analysis:

I am using a MongoDB river, which is working perfectly. Here is how I create
my index:

curl -s -XPUT "http://localhost:9200/twitter" -d '

{
"twitter" : {
"mappings" : {
"id" : {
"type" : "long",
"store" : true
},
"text" : {
"type" : "string",
"store" : true,
"index" : "analysed"
},
"created_at" : {
"type" : "date",
"store" : true
},
"location" : {
"type" : "string",
"store" : true
},
"geo" : {
"type" : "geo_point",
"store" : true
}
},
"analysis":{
"analyser":{
"indexAnalyser" : {
"type":"custom",
"tokenizer":"standard",
"filter":["lowercase", "whitespace"]
},
"searchAnalyzer":{
"type":"custom",
"tokenizer":"standard",
"filter":[
"standard",
"lowercase",
"whitespace"
]
}
}
}
}
}'

And my river:

curl -s -XPUT "http://localhost:9200/_river/mongodb/_meta" -d '

{
"type": "mongodb",
"mongodb":
{"servers": [{
"host": "My IP", "port": 27017
}],
"options": {},
"db": "twitter",
"collection": "tweets"
},"index": {
"name": "twitter"
}
}'

The problem is that the tokenizer does not do its job.
When I check the index metadata, I see this:

{
  "state": "open",
  "settings": {
    "index": {
      "twitter": {
        "analysis": {
          "analyser": {
            "searchAnalyzer": {
              "type": "custom",
              "filter": [
                "standard",
                "lowercase",
                "whitespace"
              ],
              "tokenizer": "standard"
            },
            "indexAnalyser": {
              "type": "custom",
              "filter": [
                "lowercase",
                "whitespace"
              ],
              "tokenizer": "standard"
            }
          }
        },
        "mappings": {
          "id": {
            "type": "long",
            "store": "true"
          },
          "created_at": {
            "type": "date",
            "store": "true"
          },
          "location": {
            "store": "true",
            "type": "string"
          },
          "text": {
            "type": "string",
            "store": "true",
            "index": "analysed"
          },
          "geo": {
            "store": "true",
            "type": "geo_point"
          }
        }
      },
      "uuid": "8_6Pliw_TAW_x34cFhIUtg",
      "number_of_replicas": "1",
      "number_of_shards": "5",
      "version": {
        "created": "1020299"
      }
    }
  },
  "mappings": {
    "twitter": {
      "properties": {
        "id": {
          "type": "long"
        },
        "text": {
          "type": "string"
        },
        "geo": {
          "properties": {
            "lon": {
              "type": "double"
            },
            "lat": {
              "type": "double"
            }
          }
        },
        "location": {
          "type": "string"
        },
        "created_at": {
          "type": "string"
        },
        "country": {
          "type": "string"
        }
      }
    }
  },
  "aliases":
}

There are two mappings, and one of them is a child of settings.

Could you please explain what I did wrong?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f76fc651-ca65-4335-abf1-8754aa076c4f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

And, in case you need it, here is the structure of the documents in my MongoDB collection:

{
"_id" : ObjectId("54eb3c35a710901a698b4567"),
"country" : "FR",
"created_at" : "Mon Feb 23 14:25:30 +0000 2015",
"geo" : {
"lat" : 49.119696,
"lon" : 6.176355
},
"id" : -812216320,
"location" : "Metz ",
"text" : "Passer des vacances formidable avec des gens géniaux sans aucunes pression avec pour seul soucis s'éclater et se laisser vivre #BONHEUR"
}

Hello Alexandre :)

I think you should look at the create index documentation: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html#mappings

Several things are wrong:

the key is spelled "analyser" instead of "analyzer"
the position of the analyzers (they belong under "settings" > "analysis"), …

Here is something that works:

DELETE twitter
PUT /twitter
{
"settings": {
"analysis": {
"analyzer": {
"indexAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase"
]
},
"searchAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"standard",
"lowercase"
]
}
}
}
},
"mappings": {
"twitter": {
"properties": {
"id": {
"type": "long",
"store": true
},
"text": {
"type": "string",
"store": true,
"index": "analyzed"
},
"created_at": {
"type": "date",
"store": true
},
"location": {
"type": "string",
"store": true
},
"geo": {
"type": "geo_point",
"store": true
}
}
}
}
}
GET twitter
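
The differences between the original request and the working one can be sketched as a small offline check. This is only an illustration in Python, not part of Elasticsearch: `find_keys`, `check_create_index_body`, `BROKEN` and `FIXED` are hypothetical names, the two bodies are abridged from this thread, and the rules cover only the mistakes discussed here (a field that happens to be named "mappings" or "analysis" would be flagged too).

```python
def find_keys(node, wanted, path=""):
    """Yield the dotted path of every dict key equal to `wanted`."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if key == wanted:
                yield here
            yield from find_keys(value, wanted, here)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_keys(item, wanted, f"{path}[{i}]")


def check_create_index_body(body):
    """Return a list of problems found in a create-index request body."""
    problems = []
    # Misspelled key: Elasticsearch expects "analyzer".
    for path in find_keys(body, "analyser"):
        problems.append(f"{path}: spelled 'analyser', expected 'analyzer'")
    # "mappings" must be a top-level key of the request body.
    for path in find_keys(body, "mappings"):
        if path != "mappings":
            problems.append(f"{path}: 'mappings' should be a top-level key")
    # "analysis" must sit under "settings".
    for path in find_keys(body, "analysis"):
        if path not in ("settings.analysis", "settings.index.analysis"):
            problems.append(f"{path}: 'analysis' should sit under 'settings'")
    # Each type under "mappings" lists its fields under "properties".
    for type_name, type_mapping in body.get("mappings", {}).items():
        if "properties" not in type_mapping:
            problems.append(f"mappings.{type_name}: missing 'properties'")
    return problems


# Abridged version of the body from the original question.
BROKEN = {
    "twitter": {
        "mappings": {"id": {"type": "long", "store": True}},
        "analysis": {"analyser": {"indexAnalyser": {"type": "custom"}}},
    }
}

# Abridged version of the working body from the reply.
FIXED = {
    "settings": {
        "analysis": {
            "analyzer": {
                "indexAnalyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "twitter": {"properties": {"geo": {"type": "geo_point", "store": True}}}
    },
}

for problem in check_create_index_body(BROKEN):
    print(problem)
```

Running the check on `FIXED` returns an empty list, while `BROKEN` reports the misspelled key and the two misplaced blocks.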

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs

Thank you very much :) (Merci beaucoup!)

I have one last question:

Everything works fine, except that my geo_point isn't populated in Elasticsearch.

Here are my requests:

curl -s -XPUT "http://localhost:9200/twitter" -d '

{
"settings": {
"analysis": {
"analyzer": {
"indexAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase", "asciifolding"
]
},
"searchAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
[ "standard","lowercase", "asciifolding" ]
]
}
},
"filter" : {
"frenchLowerFilter" : {
"type" : "lowercase",
"language" : "french"
},
"asciifolding" : {
"type" : "asciifolding"
}
}
}
},
"mappings": {
"twitterdb": {
"properties": {
"id": {
"type": "long",
"store": true
},
"text": {
"type": "string",
"store": true,
"analyzer" : "indexAnalyzer",
"filter" : ["frenchLowerFilter", "asciifolding"]
},
"created_at": {
"type": "date",
"format": "EE MMM d HH:mm:ss Z yyyy",
"store": true
},
"location": {
"type": "string",
"store": true
},
"geo": {
"type": "geo_point",
"store": true
}
}
}
}
}'

curl -s -XPUT "http://localhost:9200/_river/mongodb/_meta" -d '
{
"type": "mongodb",
"mongodb":
{"servers": [{
"host": "My IP", "port": 27017
}],
"options": {},
"db": "twitterdb",
"collection": "tweets"
},"index": {
"name": "twitter"
}
}'

My index metadata:

{
  "state": "open",
  "settings": {
    "index": {
      "uuid": "IKZbCpeeRm6Ns6UTXl43vg",
      "analysis": {
        "analyzer": {
          "indexAnalyzer": {
            "type": "custom",
            "filter": [
              "lowercase",
              "asciifolding"
            ],
            "tokenizer": "whitespace"
          },
          "searchAnalyzer": {
            "type": "custom",
            "filter": [
              [
                "standard",
                "lowercase",
                "asciifolding"
              ]
            ],
            "tokenizer": "whitespace"
          }
        },
        "filter": {
          "asciifolding": {
            "type": "asciifolding"
          },
          "frenchLowerFilter": {
            "type": "lowercase",
            "language": "french"
          }
        }
      },
      "number_of_replicas": "1",
      "number_of_shards": "5",
      "version": {
        "created": "1020299"
      }
    }
  },
  "mappings": {
    "twitterdb": {
      "properties": {
        "id": {
          "store": true,
          "type": "long"
        },
        "text": {
          "store": true,
          "analyzer": "indexAnalyzer",
          "type": "string"
        },
        "geo": {
          "store": true,
          "type": "geo_point"
        },
        "location": {
          "store": true,
          "type": "string"
        },
        "created_at": {
          "store": true,
          "format": "EE MMM d HH:mm:ss Z yyyy",
          "type": "date"
        },
        "country": {
          "type": "string"
        }
      }
    }
  },
  "aliases":
}

And a sample of what is stored in my MongoDB:

{
"_id" : ObjectId("54ece244a71090fc698b4567"),
"id" : 1733746688,
"text" : "#arte #bonheur au #travail. #Favi 2 règles: Et l'homme est bon! Organiser en "mini usine", la régulation se fait à 90% sur le terrain",
"created_at" : "Tue Feb 24 20:37:28 +0000 2015",
"location" : "Paris, France",
"geo" : { "lat" : 48.856506, "lon" : 2.352133 },
"country" : "FR"
}
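
As a side check, the created_at strings in these samples can be tested offline against a pattern before relying on the date format in the mapping. The sketch below is only an approximation: it uses Python's strptime directives as an assumed analogue of the "EE MMM d HH:mm:ss Z yyyy" pattern, not the date parser Elasticsearch itself uses, and `PY_PATTERN` and `parse_created_at` are hypothetical names. It also assumes the default (English) locale for day and month names.

```python
from datetime import datetime, timezone

# Assumption: a strptime analogue of the mapping's "EE MMM d HH:mm:ss Z yyyy".
PY_PATTERN = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse a Twitter-style timestamp into a timezone-aware datetime."""
    return datetime.strptime(value, PY_PATTERN)


parsed = parse_created_at("Tue Feb 24 20:37:28 +0000 2015")
print(parsed.isoformat())  # 2015-02-24T20:37:28+00:00
```

If a value fails to parse here, strptime raises ValueError, which is a hint that the stored string does not follow the expected layout.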

But when I look at _plugin/head, there is no geo, lat, or lon.

I looked at the Elasticsearch documentation and my mapping looks correct. I
think I am misunderstanding something about how to store an object like
"geo" : { "lat" : "", "lon" : "" } as a geo_point in the mapping, but I
don't know what.

What did I do wrong, please? (It will be my last question this month ;))

Well, first, you should use Marvel and Sense. I have seen some bugs in Head in the past when it comes to displaying content.
Run a single GET and check what you have.
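
For readers without Sense, the same check can be sketched in Python with only the standard library. This is an illustration under assumptions: the node is reachable at http://localhost:9200 and the index is named twitter, as in the requests earlier in this thread, and `fetch_first_hit` and `geo_of` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def fetch_first_hit(base_url="http://localhost:9200", index="twitter"):
    """Return the first hit of a search on the index, or None if it is empty."""
    with urlopen(f"{base_url}/{index}/_search?size=1") as response:
        result = json.load(response)
    hits = result["hits"]["hits"]
    return hits[0] if hits else None


def geo_of(hit):
    """Return (lat, lon) from a hit's _source, or None if no geo object is there."""
    geo = hit.get("_source", {}).get("geo")
    if isinstance(geo, dict) and "lat" in geo and "lon" in geo:
        return geo["lat"], geo["lon"]
    return None
```

Calling `geo_of(fetch_first_hit())` against a running node shows whether the geo object reached the index, independently of what the Head plugin displays.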

David

You were right.

It was working, but Head was not showing it.

Thank you very much!

Alex

On Tuesday, 24 February 2015 at 22:41:00 UTC+1, David Pilato wrote:

Well, first you should use Marvel and Sense. I have seen some bugs in Head
in the past when it comes to displaying content.
Run a single GET and check what you have.

David
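
The single GET suggested above can also be scripted. The following is a minimal Python sketch, not part of the original reply: it assumes the Elasticsearch 1.x response shape for GET /twitter/_mapping, and the helper names fetch_mapping and field_type are hypothetical. The index name twitter and the type name twitterdb are taken from this thread.

```python
import json
from urllib.request import urlopen


def fetch_mapping(base_url, index):
    """GET <base_url>/<index>/_mapping and return the parsed JSON body."""
    with urlopen("{}/{}/_mapping".format(base_url, index)) as response:
        return json.loads(response.read().decode("utf-8"))


def field_type(mapping_response, index, doc_type, field):
    """Return the mapped type of a top-level field.

    Returns None when the field is missing or when it is a plain object
    (for example a dynamically mapped "geo" with lat/lon sub-fields).
    """
    properties = (
        mapping_response.get(index, {})
        .get("mappings", {})
        .get(doc_type, {})
        .get("properties", {})
    )
    return properties.get(field, {}).get("type")


# Assumed response shape (Elasticsearch 1.x), trimmed to the "geo" field.
sample = {
    "twitter": {
        "mappings": {
            "twitterdb": {
                "properties": {"geo": {"type": "geo_point", "store": True}}
            }
        }
    }
}
print(field_type(sample, "twitter", "twitterdb", "geo"))
```

Against a live node, one would pass the result of fetch_mapping("http://localhost:9200", "twitter") instead of the hand-written sample.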

On 24 Feb 2015 at 22:16, alexand...@instantluxe.com wrote:

Thank you very much :) (Merci beaucoup!)

I have one last question.

Everything works fine, except that my geo_point isn't populated in
Elasticsearch.

Here are my requests:

curl -s -XPUT "http://localhost:9200/twitter" -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "indexAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase", "asciifolding"
          ]
        },
        "searchAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            [ "standard", "lowercase", "asciifolding" ]
          ]
        }
      },
      "filter": {
        "frenchLowerFilter": {
          "type": "lowercase",
          "language": "french"
        },
        "asciifolding": {
          "type": "asciifolding"
        }
      }
    }
  },
  "mappings": {
    "twitterdb": {
      "properties": {
        "id": {
          "type": "long",
          "store": true
        },
        "text": {
          "type": "string",
          "store": true,
          "analyzer": "indexAnalyzer",
          "filter": ["frenchLowerFilter", "asciifolding"]
        },
        "created_at": {
          "type": "date",
          "format": "EE MMM d HH:mm:ss Z yyyy",
          "store": true
        },
        "location": {
          "type": "string",
          "store": true
        },
        "geo": {
          "type": "geo_point",
          "store": true
        }
      }
    }
  }
}'

curl -s -XPUT "http://localhost:9200/_river/mongodb/_meta" -d '
{
  "type": "mongodb",
  "mongodb": {
    "servers": [
      { "host": "My IP", "port": 27017 }
    ],
    "options": {},
    "db": "twitterdb",
    "collection": "tweets"
  },
  "index": {
    "name": "twitter"
  }
}'

My index metadata:

{
  "state": "open",
  "settings": {
    "index": {
      "uuid": "IKZbCpeeRm6Ns6UTXl43vg",
      "analysis": {
        "analyzer": {
          "indexAnalyzer": {
            "type": "custom",
            "filter": [
              "lowercase",
              "asciifolding"
            ],
            "tokenizer": "whitespace"
          },
          "searchAnalyzer": {
            "type": "custom",
            "filter": [
              [
                "standard",
                "lowercase",
                "asciifolding"
              ]
            ],
            "tokenizer": "whitespace"
          }
        },
        "filter": {
          "asciifolding": {
            "type": "asciifolding"
          },
          "frenchLowerFilter": {
            "type": "lowercase",
            "language": "french"
          }
        }
      },
      "number_of_replicas": "1",
      "number_of_shards": "5",
      "version": {
        "created": "1020299"
      }
    }
  },
  "mappings": {
    "twitterdb": {
      "properties": {
        "id": {
          "store": true,
          "type": "long"
        },
        "text": {
          "store": true,
          "analyzer": "indexAnalyzer",
          "type": "string"
        },
        "geo": {
          "store": true,
          "type": "geo_point"
        },
        "location": {
          "store": true,
          "type": "string"
        },
        "created_at": {
          "store": true,
          "format": "EE MMM d HH:mm:ss Z yyyy",
          "type": "date"
        },
        "country": {
          "type": "string"
        }
      }
    }
  },
  "aliases":
}

And a sample of what is stored in my MongoDB collection:

{ "_id" : ObjectId("54ece244a71090fc698b4567"), "id" : 1733746688, "text"
: "#arte #bonheur au #travail. #Favi 2 règles: Et l'homme est bon!
Organiser en "mini usine", la régulation se fait à 90% sur le terrain",
"created_at" : "Tue Feb 24 20:37:28 +0000 2015", "location" : "Paris,
France", "geo" : { "lat" : 48.856506, "lon" : 2.352133 }, "country" : "FR"}
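
As a quick sanity check on the date side, the created_at value in the sample can be parsed outside Elasticsearch. This is a sketch only: it uses a rough Python strptime counterpart of the Joda pattern "EE MMM d HH:mm:ss Z yyyy" from the mapping (the strptime string and the helper name parse_created_at are assumptions for illustration), and it relies on the default C locale for the English day and month names.

```python
from datetime import datetime

# Rough strptime counterpart of the Joda pattern "EE MMM d HH:mm:ss Z yyyy".
# Illustration only: Elasticsearch itself applies the Joda pattern.
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse a Twitter-style timestamp into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_FORMAT)


# Value copied from the sample document in this thread.
parsed = parse_created_at("Tue Feb 24 20:37:28 +0000 2015")
print(parsed.isoformat())
```

If the string parses here, a mismatch between the data and the mapping's date format is less likely to be the cause of a missing field.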

But when I look at _plugin/head, there is no geo, no lat, and no lon.

I looked at
Elasticsearch Platform — Find real-time answers at scale | Elastic
and it looks good... I think I am missing something about how to
store an object like "geo" : { "lat" : "", "lon" : "" } as a geo_point in
the mapping, but I don't know what.

What did I do wrong, please? (This will be my last question this month ;))
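
For reference, an object with numeric lat and lon keys is one of the input forms a geo_point field accepts, so the sample document above has a suitable shape. The sketch below checks only that shape; the helper name is_lat_lon_object is hypothetical, and the other accepted forms (string, array, geohash) are deliberately not covered.

```python
def is_lat_lon_object(value):
    """Return True if value looks like {"lat": <number>, "lon": <number>}
    with coordinates inside the valid latitude/longitude ranges."""

    def is_number(x):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(x, (int, float)) and not isinstance(x, bool)

    if not isinstance(value, dict):
        return False
    lat, lon = value.get("lat"), value.get("lon")
    if not (is_number(lat) and is_number(lon)):
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


# "geo" value copied from the sample tweet in this thread.
tweet_geo = {"lat": 48.856506, "lon": 2.352133}
print(is_lat_lon_object(tweet_geo))
```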

On Tuesday, 24 February 2015 at 16:23:45 UTC+1, David Pilato wrote:

Hello Alexandre :)

I think you should look at
Elasticsearch Platform — Find real-time answers at scale | Elastic

Several things are wrong:

analyser instead of analyzer
position of the analyzers, …

Here is something that works:

DELETE twitter
PUT /twitter
{
  "settings": {
    "analysis": {
      "analyzer": {
        "indexAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase"
          ]
        },
        "searchAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "standard",
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "twitter": {
      "properties": {
        "id": {
          "type": "long",
          "store": true
        },
        "text": {
          "type": "string",
          "store": true,
          "index": "analyzed"
        },
        "created_at": {
          "type": "date",
          "store": true
        },
        "location": {
          "type": "string",
          "store": true
        },
        "geo": {
          "type": "geo_point",
          "store": true
        }
      }
    }
  }
}
GET twitter
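
The two problems listed above (spelling and placement) can be caught before sending the request. The following Python sketch is an illustration, not something from the original reply: the function check_index_body and the set of allowed top-level keys are assumptions (settings, mappings, aliases and warmers, as understood for Elasticsearch 1.x), and the two bodies are trimmed versions of the requests in this thread.

```python
# Assumed set of top-level keys for a 1.x create-index body.
ALLOWED_TOP_LEVEL = {"settings", "mappings", "aliases", "warmers"}


def check_index_body(body):
    """Return a list of human-readable problems found in a create-index body."""
    problems = []
    for key in body:
        if key not in ALLOWED_TOP_LEVEL:
            problems.append("unexpected top-level key: " + key)
    _walk(body, "", problems)
    return problems


def _walk(node, path, problems):
    """Recursively look for misspellings and nested filter lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = path + "/" + key
            if key == "analyser":
                problems.append("'analyser' should be 'analyzer' at " + here)
            if value == "analysed":
                problems.append("'analysed' should be 'analyzed' at " + here)
            if key == "filter" and isinstance(value, list):
                if any(isinstance(item, list) for item in value):
                    problems.append("nested list in filter at " + here)
            _walk(value, here, problems)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            _walk(item, "{}[{}]".format(path, i), problems)


# Trimmed version of the original request from this thread.
original = {
    "twitter": {
        "mappings": {"text": {"type": "string", "index": "analysed"}},
        "analysis": {
            "analyser": {
                "indexAnalyser": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "whitespace"],
                }
            }
        },
    }
}

# Trimmed version of the working request above.
fixed = {
    "settings": {
        "analysis": {
            "analyzer": {
                "indexAnalyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "twitter": {
            "properties": {"text": {"type": "string", "index": "analyzed"}}
        }
    },
}

for problem in check_index_body(original):
    print(problem)
print(check_index_body(fixed))
```

The checker is intentionally narrow: it flags only the mistakes seen in this thread and does not validate analyzer or filter names.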

--
David Pilato | Technical Advocate | Elasticsearch.com
http://Elasticsearch.com

@dadoonet https://twitter.com/dadoonet | @elasticsearchfr
https://twitter.com/elasticsearchfr | @scrutmydocs
https://twitter.com/scrutmydocs

On 24 Feb 2015 at 14:55, alexand...@instantluxe.com wrote:

And, in case you need it, here is the structure of the documents in my MongoDB collection:

{
"_id" : ObjectId("54eb3c35a710901a698b4567"),
"country" : "FR",
"created_at" : "Mon Feb 23 14:25:30 +0000 2015",
"geo" : {
"lat" : 49.119696,
"lon" : 6.176355
},
"id" : -812216320,
"location" : "Metz ",
"text" : "Passer des vacances formidable avec des gens géniaux sans
aucunes pression avec pour seul soucis s'éclater et se laisser vivre
#BONHEUR"
}

On Tuesday, 24 February 2015 at 14:48:00 UTC+1, alexand...@instantluxe.com
wrote:

...
