What's the right indexing for the url/id field, assuming I want to
query by URL? Should I add an analyzer to these fields?
I can't find a way to search for website data by its URL. What am
I doing wrong? I also tried encoding the URL, but it doesn't help:
curl -XGET http://localhost:9200/article-dev/_search -d '{
"query" : { "term" : { "_id": "http://thenextweb.com/media/2012/01/02/uk-music-download-sales-grew-by-26-6-in-2011-but-the-industrys-still-in-decline/" } } }'
It all depends on what type of queries you want to execute. If the
URL is simply a key, where you do not care to search inside the value,
then the URL should not be analyzed. Analyzing URLs is difficult since
they are not words. If you wanted to search inside the URLs, there are
many different routes to take, all of which require decomposing the
data on the client side into smaller tokens.
Strings are analyzed by default in Elasticsearch, so your
:id field will be analyzed when indexed. A term query is not
analyzed, so your search will not work. Your :url should work, however.
Does searching against that field work? If not, gist an example
document.
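To illustrate why an unanalyzed term query misses an analyzed field, here is a minimal Python sketch. It does not talk to Elasticsearch at all: `rough_analyze` is a hypothetical, very crude stand-in for a default analyzer (lowercase, split on anything that is not a letter or digit), and the real analyzer's tokens will differ. The point is only that the full URL is no longer present as a single term once the field has been analyzed.

```python
import re


def rough_analyze(text):
    """Hypothetical, crude stand-in for a default analyzer:
    lowercase the text and split on anything that is not a-z or 0-9."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def term_matches(indexed_terms, query_term):
    """A term query looks up the query string verbatim, with no analysis."""
    return query_term in indexed_terms


url = ("http://thenextweb.com/media/2012/01/02/"
       "uk-music-download-sales-grew-by-26-6-in-2011-"
       "but-the-industrys-still-in-decline/")

# Roughly what an analyzed field holds: many small tokens.
analyzed_terms = set(rough_analyze(url))
# What a not-analyzed field holds: the whole value as one term.
not_analyzed_terms = {url}

print(term_matches(analyzed_terms, url))      # False
print(term_matches(not_analyzed_terms, url))  # True
```

The first lookup fails because the index only contains fragments of the URL, while the second succeeds because the stored term is the URL itself.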
I don't need to tokenize the URL; I just need to search for it as
one single string.
Searching against this field (:url) doesn't work:
{"_index":"aunticles-dev-thenextweb","_type":"article",
"_id":"http://thenextweb.com/apple/2012/02/23/forgotten-apple-founder-takes-to-facebook-to-explain-his-decision-to-quit-after-12-days/",
"_score":1.0,
"_source" : {"id":"http://thenextweb.com/apple/2012/02/23/forgotten-apple-founder-takes-to-facebook-to-explain-his-decision-to-quit-after-12-days/",
"title":"Forgotten Apple founder takes to Facebook to explain his decision to quit after 12 days",
"summary":"Yesterday, third Apple founder Ron Wayne published an essay on Facebook about his decision to leave Apple Computer after only 12 days.",
"image":"http://cdn.thenextweb.com/wp-content/blogs.dir/1/files/2012/02/Photoxpress_23083180-300x250.jpg",
"categories":"","published":"2012-03-14 13:24:43","updated":null,"type":"article","site":null}}
You don't need to explicitly store each field. By default, the whole
_source JSON document you indexed is stored, so you end up storing things
twice (once as each individual field, and once as the whole doc).
If the url field is not analyzed, you can look it up with a term query,
using the same URL it was indexed with.
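As a rough sketch of that advice, the Python below only builds and prints the two JSON bodies involved; it does not send anything to a server. It assumes the type is named `article` (as in the example document) and that the URL lives in a field called `url`; both helper functions are hypothetical names. The `"index": "not_analyzed"` setting on a `string` field is the mapping syntax of Elasticsearch versions from this era; releases from 5.0 onward use the `keyword` type instead.

```python
import json


def url_mapping(type_name="article", field="url"):
    """Mapping body that keeps the field as one unanalyzed term
    (pre-5.0 syntax; type and field names are assumptions)."""
    return {
        type_name: {
            "properties": {
                field: {"type": "string", "index": "not_analyzed"}
            }
        }
    }


def url_term_query(url, field="url"):
    """Search body for an exact lookup of the URL with a term query."""
    return {"query": {"term": {field: url}}}


example_url = ("http://thenextweb.com/apple/2012/02/23/"
               "forgotten-apple-founder-takes-to-facebook-to-explain-"
               "his-decision-to-quit-after-12-days/")

# Body for a put-mapping request on the index, sent before indexing documents.
print(json.dumps(url_mapping(), indent=2))
# Body for a _search request, like the -d argument of the curl command.
print(json.dumps(url_term_query(example_url)))
```

An existing analyzed field generally cannot be switched in place, so the documents would need to be reindexed after the mapping is changed.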