Looking for my "ah ha!" moment


(Meltemi) #1

I'm relatively new to search and definitely new to elasticsearch. Hoping
someone can take a moment to give me a 10,000 ft. overview of ES. The
elasticsearch.org guides seem to jump right into details, forgoing basic
concepts.

For example, I'd like to play around with elasticsearch and try indexing
some PDFs, but it's not clear, to me, how (via REST interface) to create an
index, index some docs into it, goof with search terms & results, and then
destroy that entire index, leaving no trace of my fiddling.

Even less clear is how the REST URL is structured:

$ curl -XPUT http://localhost:9200/twitter/user/kimchy -d '{ "name" : "Shay Banon" }'

I take it the first segment of the path ('twitter') is the index name -
defined on the fly? And the 2nd ('user') is some sub-classification. But
how, where, is this defined/documented? etc.

Say I index some documents under the 'twitter' index, how then do I go
about removing one of them? Or, the whole 'twitter' index and all of its
contents without affecting any of the other indexes?

How to see a list of indexes so you know what can be searched? I guess
my background wants to treat this like a database where one can easily add
& drop databases & tables. What's the analogy in ES's world?

Any help, or links, appreciated.


(Ivan Brusic) #2

Answers inline.

On Wed, Jun 6, 2012 at 1:31 PM, Meltemi mdemetrios@gmail.com wrote:

$ curl -XPUT http://localhost:9200/twitter/user/kimchy -d '{ "name" : "Shay Banon" }'

I take it the first segment of the path ('twitter') is the index name -
defined on the fly? And the 2nd ('user') is some sub-classification. But
how, where, is this defined/documented? etc.

The second segment, "user", is the type. An index can have
several types. In this case, the index "twitter" is created and a
document of type "user" with id "kimchy" is placed into the index.
(Most of the time, you will create indices beforehand and then
insert documents in a separate step.)
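
A minimal sketch of that two-step flow, assuming a single node on localhost:9200 as in the example above. The variable names are only illustrative, and the script just prints the curl commands so they can be read before being pasted into a terminal. It reflects the URL layout discussed in this thread; later Elasticsearch releases dropped the type segment.

```shell
# Assumption: a single local node, as in the example above.
ES="http://localhost:9200"
INDEX="twitter"; TYPE="user"; ID="kimchy"

# The path is /{index}/{type}/{id}.
INDEX_URL="$ES/$INDEX/"
DOC_URL="$ES/$INDEX/$TYPE/$ID"

# Step 1: create the index up front (default settings).
echo "curl -XPUT '$INDEX_URL'"

# Step 2: put a document of type "user" with id "kimchy" into it.
echo "curl -XPUT '$DOC_URL' -d '{ \"name\" : \"Shay Banon\" }'"
```

Skipping step 1 also works, since the index is then created on the fly by the first document.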

Say I index some documents under the 'twitter' index, how then do I go about
removing one of them?

Delete by Query
http://www.elasticsearch.org/guide/reference/api/delete-by-query.html
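
As a rough sketch of what that page describes for the release current at the time of this thread: a DELETE against the _query endpoint with a query string. The node address, field name and search term below are assumptions for illustration, and the script only prints the command. Newer releases expose delete-by-query through a different endpoint, so check the docs for your version.

```shell
# Assumption: a single local node.
ES="http://localhost:9200"

# Delete every "user" document in "twitter" whose name field matches "shay".
# q= takes a Lucene query string; the field and term are illustrative.
DELETE_BY_QUERY="curl -XDELETE '$ES/twitter/user/_query?q=name:shay'"
echo "$DELETE_BY_QUERY"
```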

Or, the whole 'twitter' index and all of its contents without affecting any of the other indexes?

Delete Index
http://www.elasticsearch.org/guide/reference/api/admin-indices-delete-index.html
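
In short, that is a DELETE on the index name itself. A small sketch, assuming a local node; it only prints the command.

```shell
# Assumption: a single local node.
ES="http://localhost:9200"
INDEX="twitter"

# Removes the "twitter" index and everything in it.
# Only the named index is addressed; other indices are left alone.
DELETE_INDEX="curl -XDELETE '$ES/$INDEX/'"
echo "$DELETE_INDEX"
```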

How to see a list of indexes so you know what can be searched? I guess my
background wants to treat this like a database where one can easily add &
drop databases & tables. What's the analogy in ES's world?

That would require looking into the cluster state:

http://www.elasticsearch.org/guide/reference/api/admin-cluster-state.html
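
A small sketch of that call, assuming a local node; it only prints the command. In the JSON that comes back, the index names should appear as keys under metadata.indices.

```shell
# Assumption: a single local node.
ES="http://localhost:9200"

# Fetch the cluster state; pretty=true just makes the JSON readable.
CLUSTER_STATE="curl -XGET '$ES/_cluster/state?pretty=true'"
echo "$CLUSTER_STATE"
```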

Hope this helps,

Ivan


(David Pilato) #3

I would add to Ivan's answer that if you know the document ID, you can
delete the document with a simple DELETE:
http://www.elasticsearch.org/guide/reference/api/delete.html
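
For the kimchy document from the original example, that would look roughly like the sketch below (local node assumed; the script only prints the command).

```shell
# Assumption: a single local node.
ES="http://localhost:9200"

# Same /{index}/{type}/{id} path used to index the document, with DELETE instead of PUT.
DELETE_DOC="curl -XDELETE '$ES/twitter/user/kimchy'"
echo "$DELETE_DOC"
```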

BTW, ES is a zero-configuration product (when you start). That is why ES
can accept something like this as your very first curl call:
$ curl -XPUT http://localhost:9200/twitter/user/kimchy -d '{ "name" : "Shay Banon" }'

ES knows that it has to create a new index called twitter with default
settings, and that this index contains a new document type called user.
Beyond that, ES works out the mapping for your user type by itself (the
field name as a String).
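
To see what ES worked out, you can ask for the mapping of the type. A small sketch, assuming a local node and the index and type from the example; it only prints the command.

```shell
# Assumption: a single local node.
ES="http://localhost:9200"

# Show the mapping ES generated for the "user" type in "twitter".
GET_MAPPING="curl -XGET '$ES/twitter/user/_mapping?pretty=true'"
echo "$GET_MAPPING"
```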

For PDF indexing, you should take a look at the attachment plugin.

For basic information about ES, you can also have a look here:
http://www.slideshare.net/dadoonet/elasticsearch-devoxx-france-2012-english-version

HTH
David.



(Shaun Etherton) #4

Hi

I found this page helpful; I'm doing the same thing you are with PDF files.

http://www.elasticsearch.org/tutorials/2011/07/18/attachment-type-in-action.html

The link at the bottom of that page has the full script, which you can use to
get going and experiment quickly.

Changing the number of shards and replicas and using the elasticsearch-head
plugin was also very helpful.

HTH.
cheers
- shaun


