I haven't escaped my hash as Radu has done in his example. The data,
however, seems to have indexed successfully even without escaping.
Here are three examples:
curl -XPUT http://localhost:9200/twitter/user/1 -d '{ "features" : "name bob age thirty"}'
curl -XPUT http://localhost:9200/twitter/user/2 -d '{ "features" : {"name" : "bob", "age" : "thirty"}}'
curl -XPUT http://localhost:9200/twitter/user/3 -d '{ "features" : "{"name" : "bob", "age" : "thirty"}}'
Here's what I get when I run a full search:
curl -XGET localhost:9200/insta/_search
"hits": [
    {
        "_id": "1",
        "_index": "insta",
        "_score": 1.0,
        "_source": {
            "features": "name bob age thirty"
        },
        "_type": "user"
    },
    {
        "_id": "2",
        "_index": "insta",
        "_score": 1.0,
        "_source": {
            "features": {
                "age": "thirty",
                "name": "bob"
            }
        },
        "_type": "user"
    },
    {
        "_id": "3",
        "_index": "insta",
        "_score": 1.0,
        "_source": {
            "features": "{"name" : "bob", "age" : "thirty"}"
        },
        "_type": "user"
    }
]
And here's what I get with the text search:
"hits" : [ {
    "_index" : "insta",
    "_type" : "user",
    "_id" : "1",
    "_score" : 0.15342641, "_source" : { "features" : "name bob age thirty"}
}, {
    "_index" : "insta",
    "_type" : "user",
    "_id" : "3",
    "_score" : 0.15342641, "_source" : { "features": "{"name" : "bob", "age" : "thirty"}" }
} ]
I didn't think there would be a difference because the analyze API returns
the same results for both of the examples below:
curl -XGET "localhost:9200/twitter/_analyze?tokenizer=standard&pretty=true" -d '{"name" : "bob", "age" : "thirty"}'
curl -XGET "localhost:9200/twitter/_analyze?tokenizer=standard&pretty=true" -d '{"name" : "bob", "age" : "thirty"}}'
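As a rough illustration of why the two inputs come out the same, here is a minimal Python sketch. It is only an approximation, assuming the tokenizer keeps runs of letters and digits and drops punctuation; it is not the actual standard tokenizer, and the helper name rough_tokens is made up for this example.

```python
import re

def rough_tokens(text):
    # Approximation only: keep runs of ASCII letters and digits and
    # discard braces, quotes, colons and commas.
    return re.findall(r"[A-Za-z0-9]+", text)

balanced = '{"name" : "bob", "age" : "thirty"}'
extra_brace = '{"name" : "bob", "age" : "thirty"}}'

# The trailing "}" is punctuation, so it produces no extra token.
print(rough_tokens(balanced))
print(rough_tokens(extra_brace))
```

Under that assumption the extra closing brace changes nothing, which matches what the analyze API reported for the two commands.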
I'm curious to know how id 2 is stored internally. Also, I'll be sure to
escape the hash as a string in the future (thanks for that, all), but I'll need
to reindex a lot of data. Is there any other way to run text searches on
data indexed in the manner shown for id 2?
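For reference, escaping the hash could look roughly like the sketch below. It assumes the request body is built with Python's json module rather than typed by hand; the variable names are only illustrative. The hash is serialized once so that "features" holds a plain string, and serializing the whole document then backslash-escapes the inner quotes.

```python
import json

features = {"name": "bob", "age": "thirty"}

# First pass: turn the hash into a JSON string, so the "features" field
# carries a string value instead of a nested object.
features_as_string = json.dumps(features)

# Second pass: serialize the whole document. The quotes inside the
# string value are backslash-escaped by this step.
body = json.dumps({"features": features_as_string})

print(body)
```

The printed body could then be passed to curl with -d in place of the hand-written payload for id 3.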
Thanks!
On Tuesday, October 23, 2012 5:20:20 PM UTC+8, simonw wrote:
On Tuesday, October 23, 2012 10:32:09 AM UTC+2, govind201 wrote:
My index has a field called "features" with mapping type "string" as
follows:
{
"twitter": {
"user": {
"properties": {
"features": {
"type": "string"
}
}
}
}
}
Here are three examples of potential values that may be stored in
"features":
Example 1: name bob age thirty
Example 2: ["name", "bob", "age", "thirty"]
Example 3: {"name" : "bob", "age" : "thirty"}
Hey, are you escaping the hash correctly? It should be a string value,
right?
simon
The query
curl -XGET http://localhost:9200/twitter/user/_search -d '{ "query" : { "text" : { "features" : "age" } } }'
returns examples 1 and 2, but not example 3.
The "features" field may contain any number of keys (such as age, name,
fieldX, fieldYZ …) and I want to maintain a strict mapping, hence my choice
of the "string" type mapping. I do, however, want to carry out text
searches on the individual words, i.e. return example 3 if the query is any
of "name", "bob", "age" or "thirty". Is there any query that can return the
results I seek?
FYI, running the analyze API on example 3:
curl -XGET "localhost:9200/twitter/_analyze?tokenizer=standard&pretty=true" -d '{"name" : "bob", "age" : "thirty"}'
seems to tokenize the string as expected into "thirty", "age", "bob" and
"name".
Thanks!