Searching a field that stores a URL doesn't return any results?

Hi there,

I'm storing documents in which one field contains a string
(sometimes a word, sometimes a URL).

However, when I search on that field with either query_string
or text, I get no results.

Here is an example of my query:
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "log.@fields.data",
            "query": "http://website.com/article/id/1234"
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 50,
  "sort": [],
  "facets": {}
}

When I use a wildcard, it matches a query for '1234' but not one for
'/article/id/1234'. Is there something I'm missing about slashes?

Vincent

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Are you using the default mapping?
If so, try this:

curl -XPUT localhost:9200/mytest
curl -XGET 'localhost:9200/mytest/_analyze?pretty' -d 'http://website.com/article/id/1234'

Here is the output:
{
  "tokens" : [ {
    "token" : "http",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "",
    "position" : 1
  }, {
    "token" : "website.com",
    "start_offset" : 7,
    "end_offset" : 18,
    "type" : "",
    "position" : 2
  }, {
    "token" : "article",
    "start_offset" : 19,
    "end_offset" : 26,
    "type" : "",
    "position" : 3
  }, {
    "token" : "id",
    "start_offset" : 27,
    "end_offset" : 29,
    "type" : "",
    "position" : 4
  }, {
    "token" : "1234",
    "start_offset" : 30,
    "end_offset" : 34,
    "type" : "",
    "position" : 5
  } ]
}

You can see how the value of your field is broken into tokens.

You should either use a different analyzer for this field (keyword) or not analyze the field at all.
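
A minimal sketch of the "don't analyze" option, assuming the index name mytest from the commands above, a type named log, and the @fields.data path from the query in the question. It uses the string / not_analyzed mapping syntax of the Elasticsearch releases of that period (later major versions replaced it with the keyword field type). The file names and the ES_URL variable are placeholders, and nothing is sent unless ES_URL is set.

```shell
# Assumed names: index "mytest", type "log", field "@fields.data".
# Mapping that indexes the field as one unanalyzed term.
cat > mapping.json <<'EOF'
{
  "mappings": {
    "log": {
      "properties": {
        "@fields": {
          "properties": {
            "data": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}
EOF

# A term query is not analyzed either, so the URL is compared as a whole.
cat > query.json <<'EOF'
{
  "query": {
    "term": { "@fields.data": "http://website.com/article/id/1234" }
  }
}
EOF

# Only contact a cluster when ES_URL is set, e.g. ES_URL=localhost:9200
if [ -n "${ES_URL:-}" ]; then
  curl -XPUT "$ES_URL/mytest" -d @mapping.json
  curl -XGET "$ES_URL/mytest/_search?pretty" -d @query.json
fi
```

The mapping has to be in place before documents are indexed. A field that was already indexed with the default analyzer keeps its tokens until the data is reindexed.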

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

On 13 March 2013 at 09:42, Vincent vin.de.vos@gmail.com wrote:

[...]


Hi David,

Thank you for the information, this explains a lot!

I'm using Logstash to annotate fields from my log files. I haven't yet
found a way to alter the default Logstash mapping, but perhaps I can find
an option to exclude certain fields from analysis so that the data is
stored as is.
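
One possible way to do this, sketched under assumptions: an Elasticsearch index template that applies a mapping to every newly created index whose name matches logstash-* (the default Logstash index naming). The template name logstash_data, the file name, and the ES_URL variable are made up for illustration, and the field path @fields.data is taken from the query in the original question.

```shell
# Assumed names: template "logstash_data", field "@fields.data".
# "_default_" applies the mapping to every type in matching indices.
cat > logstash_template.json <<'EOF'
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "@fields": {
          "properties": {
            "data": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}
EOF

# Only contact a cluster when ES_URL is set, e.g. ES_URL=localhost:9200
if [ -n "${ES_URL:-}" ]; then
  curl -XPUT "$ES_URL/_template/logstash_data" -d @logstash_template.json
fi
```

Templates only affect indices created after the template is stored, so existing daily indices keep their current mapping.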

Vincent

On Wednesday, 13 March 2013 at 09:59:03 UTC+1, David Pilato wrote:

[...]
