I'm relatively new to search and definitely new to elasticsearch. Hoping
someone can take a moment to give me a 10,000 ft. overview of ES. The
elasticsearch.org guides seem to jump right into details, forgoing basic
concepts.
For example, I'd like to play around with elasticsearch and try indexing
some PDFs, but it's not clear, to me, how (via REST interface) to create an
index, index some docs into it, goof with search terms & results, and then
destroy that entire index, leaving no trace of my fiddling.
Even less clear is how the REST URL is structured:
I take it the first segment of the path ('twitter') is the index name -
defined on the fly? And the 2nd ('user') is some sub-classification. But
how, where, is this defined/documented? etc.
Say I index some documents under the 'twitter' index, how then do I go
about removing one of them? Or, the whole 'twitter' index and all of its
contents without affecting any of the other indexes?
How do I see a list of indexes so I know what can be searched? I guess
my background wants to treat this like a database where one can easily add
& drop databases & tables. What's the analogy in ES's world?
I take it the first segment of the path ('twitter') is the index name -
defined on the fly? And the 2nd ('user') is some sub-classification. But
how, where, is this defined/documented? etc.
The second segment, "user", is the type. An index can have
several types. In this case, the index "twitter" is created and a
document of type "user" with id "kimchy" is placed into the index. Most
of the time, you will create indices beforehand and then
insert documents in a separate step.
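A rough sketch of that two-step flow, assuming a node listening on localhost:9200 and the type-based URLs used in this thread. The helper names create_index and index_doc are made up for illustration, and newer Elasticsearch releases have dropped types, so the paths may differ there:

```shell
# Assumption: a local node on the default HTTP port.
ES_URL="${ES_URL:-http://localhost:9200}"

# Step 1: create an empty index with default settings.
create_index() {
  curl -s -XPUT "$ES_URL/$1"
}

# Step 2: index a JSON document at /{index}/{type}/{id}.
# The Content-Type header is required by newer releases.
index_doc() {
  curl -s -XPUT "$ES_URL/$1/$2/$3" -H 'Content-Type: application/json' -d "$4"
}

# Usage (needs a running node):
#   create_index twitter
#   index_doc twitter user kimchy '{ "name" : "Shay Banon" }'
```

Creating the index first is mainly useful when you want non-default settings or mappings in place before the first document arrives.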
Say I index some documents under the 'twitter' index, how then do I go about
removing one of them?
Use the Delete by Query API.
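A minimal sketch of a delete by query, assuming the DELETE .../_query endpoint that Elasticsearch offered around the time of this thread (later releases replaced it with POST /{index}/_delete_by_query, so check the docs for your version). The helper name delete_by_query and the sample query are illustrative only:

```shell
# Assumption: a local node on the default HTTP port.
ES_URL="${ES_URL:-http://localhost:9200}"

# Delete every document of one type that matches a query-string query.
# $1 = index, $2 = type, $3 = query string (URL-encode anything fancy).
delete_by_query() {
  curl -s -XDELETE "$ES_URL/$1/$2/_query?q=$3"
}

# Usage (needs a running node):
#   delete_by_query twitter user 'name:shay'
```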
Or, the whole 'twitter' index and all of its contents without affecting any of the other indexes?
Use the Delete Index API.
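Deleting a single index comes down to one DELETE on the index name, which leaves the other indexes alone. A small sketch, assuming a node on localhost:9200; the delete_index helper is hypothetical, and the empty-name guard is there because older releases treated a DELETE on the root URL as "delete all indices":

```shell
# Assumption: a local node on the default HTTP port.
ES_URL="${ES_URL:-http://localhost:9200}"

# Delete one index and everything in it; other indexes are untouched.
delete_index() {
  if [ -z "$1" ]; then
    # Never send DELETE to the bare root URL by accident.
    echo "refusing to delete: no index name given" >&2
    return 1
  fi
  curl -s -XDELETE "$ES_URL/$1"
}

# Usage (needs a running node):
#   delete_index twitter
```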
How to see a list of indexes so you know what can be searched? I guess my
background wants to treat this like a database where one can easily add &
drop databases & tables. What's the analogy in ES's world?
That would require looking into the cluster state:
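For example, something along these lines should dump the cluster state, where the index names appear under the metadata section. This assumes a node on localhost:9200; the cluster_state helper name is made up, and later releases also offer a more compact _cat/indices listing:

```shell
# Assumption: a local node on the default HTTP port.
ES_URL="${ES_URL:-http://localhost:9200}"

# Fetch the cluster state as pretty-printed JSON.
# Existing indexes are listed under "metadata" -> "indices".
cluster_state() {
  curl -s -XGET "$ES_URL/_cluster/state?pretty=true"
}

# Usage (needs a running node):
#   cluster_state
```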
I would add to Ivan's answer that if you know the document ID, you can
delete it with a simple DELETE request on that document's URL.
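A sketch of that DELETE, reusing the twitter/user/kimchy document from this thread and assuming a node on localhost:9200 (the delete_doc helper name is illustrative):

```shell
# Assumption: a local node on the default HTTP port.
ES_URL="${ES_URL:-http://localhost:9200}"

# Delete a single document addressed as /{index}/{type}/{id}.
delete_doc() {
  curl -s -XDELETE "$ES_URL/$1/$2/$3"
}

# Usage (needs a running node):
#   delete_doc twitter user kimchy
```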
BTW, ES is a zero-configuration product when you start out. That's the
reason why ES can accept something like this as your first curl call:
$ curl -XPUT http://localhost:9200/twitter/user/kimchy -d '{ "name" : "Shay Banon" }'
ES knows that it has to create a new index called twitter with default
settings, and that there is a new document type in this index called user.
Beyond that, ES works out the mapping for your user type by itself (the
name field as a string).
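To see what ES inferred, you can ask for the mapping of the index. A small sketch, assuming a node on localhost:9200 and that the document above has already been indexed; get_mapping is a hypothetical helper name:

```shell
# Assumption: a local node on the default HTTP port.
ES_URL="${ES_URL:-http://localhost:9200}"

# Fetch the mappings ES has stored (or inferred) for an index.
get_mapping() {
  curl -s -XGET "$ES_URL/$1/_mapping?pretty=true"
}

# Usage (needs a running node):
#   get_mapping twitter
```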
For PDF indexing, take a look at the attachment plugin.
I found this page helpful; I'm doing the same thing you are with PDF files.
The link at the bottom of the page has the full script you can use to get
going and experiment quickly.
Changing the number of shards and replicas and using the elasticsearch-head
plugin was also very helpful.
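If you want to try different shard and replica counts, a sketch like the following may help. It assumes a node on localhost:9200; the helper names and the "pdfs" index name in the usage lines are made up for illustration. Note that the shard count is fixed when the index is created, while the replica count can be changed afterwards:

```shell
# Assumption: a local node on the default HTTP port.
ES_URL="${ES_URL:-http://localhost:9200}"

# Create an index with explicit settings.
# $1 = index, $2 = number of shards, $3 = number of replicas
create_index_with_settings() {
  curl -s -XPUT "$ES_URL/$1" -H 'Content-Type: application/json' \
    -d "{ \"settings\" : { \"index\" : { \"number_of_shards\" : $2, \"number_of_replicas\" : $3 } } }"
}

# Change the replica count of an existing index.
# $1 = index, $2 = new number of replicas
set_replicas() {
  curl -s -XPUT "$ES_URL/$1/_settings" -H 'Content-Type: application/json' \
    -d "{ \"index\" : { \"number_of_replicas\" : $2 } }"
}

# Usage (needs a running node):
#   create_index_with_settings pdfs 1 0
#   set_replicas pdfs 1
```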
HTH.
cheers
shaun
On Thursday, June 7, 2012 6:01:30 AM UTC+9:30, Meltemi wrote:
Any help, or links, appreciated.