Moving from Solr to ES


(Vinicius Carvalho) #1

Hi there! I'm testing Solr and ES as the new search backend for our
company. I really like ES's flexibility, but I must confess its docs are
kind of confusing.

I'm going through the put mapping API (I want more control over the
types and analyzers of my index) and I'm getting a bit confused.

For instance, in Solr we have a schema where we can define types for our
documents. I like that approach, because I can define a search_analyzer,
an index_analyzer and filters per type. How does this work in ES?

I'm not sure how to do this in ES. For instance, the mapping API has a
_type property that can be set to indexed no/yes, but a mapping also has a
type that can be string/number/date and so on. This is somewhat confusing,
IMHO.

So what would be the best practice for setting field-level properties like
indexing, storing, analyzer and so on?

Could someone provide some examples? Just warming up here, I'm sure I'll
get up to speed soon.

Regards


(Shairon Toledo) #2

Hi,

First of all, Solr has no index name (although you can get something
similar with cores). In ES you define your fields and their types using the
mapping API or in config/elasticsearch.yml.

Let me try to create a mapping similar to a Solr schema.xml. In this example
I create the index "shop1" and, at the same time, the mapping for the type
"products" (it will appear as _type in responses). Compared with Solr,
"products" is the document itself, and you can have many other types.
Here is the call:

curl -X PUT http://localhost:9200/shop1 -d '
{
  "settings" : {
    "index" : {
      "analysis" : {
        "analyzer" : {
          "my_analyzer" : {
            "tokenizer" : "standard",
            "filter" : ["standard", "asciifolding", "stop", "lowercase"]
          }
        }
      }
    }
  },
  "mappings" : {
    "products" : {
      "properties" : {
        "name" : {"type" : "string", "store" : "yes", "index" : "analyzed", "analyzer" : "my_analyzer"},
        "description" : {"type" : "string", "store" : "yes"},
        "sku" : {"type" : "string", "store" : "yes"}
      }
    }
  }
}
'
As you may notice, there is a custom analyzer named "my_analyzer". It is
similar to xpath:/schema/types/fieldType/analyzer in Solr's schema.xml. The
difference in ES is that the analyzer is defined globally for the index
instead of being tied to a document type, so you can use it in other
document types too. After defining the analyzer, you can use it in any field
through "analyzer": "my_analyzer", "search_analyzer" or "index_analyzer",
and also at query time, for example:

curl -XGET --silent "http://$SERVER:9200/shop1/products/_search?pretty=true" -d '
{
  "query" : {
    "query_string" : {
      "fields" : ["name", "description"],
      "query" : "laptop",
      "analyzer" : "my_analyzer"
    }
  }
}
'
Or pass it in the URL as the parameter "analyzer=my_analyzer".
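
For the field-level part of the question, here is a minimal sketch in Python,
assuming the same mapping syntax as the create-index call earlier in this
reply. It builds a body where "name" uses one analyzer at index time and
another at search time, and "sku" is indexed without analysis. The names
"my_search_analyzer", build_index_body and undefined_analyzers are
hypothetical and only for illustration; the parameter names follow the
syntax used in this thread and may differ in other Elasticsearch versions.

```python
import json

# Analyzers that ship with Elasticsearch and need no definition in settings.
BUILTIN_ANALYZERS = {"standard", "simple", "whitespace", "stop", "keyword"}

# Field-level keys that reference an analyzer by name.
ANALYZER_KEYS = ("analyzer", "index_analyzer", "search_analyzer")


def build_index_body():
    """Return a create-index body with "settings" and "mappings" as siblings."""
    analyzers = {
        # Same analyzer as in the create-index call of this thread.
        "my_analyzer": {
            "tokenizer": "standard",
            "filter": ["standard", "asciifolding", "stop", "lowercase"],
        },
        # Hypothetical second analyzer, referenced only at search time.
        "my_search_analyzer": {
            "tokenizer": "standard",
            "filter": ["asciifolding", "lowercase"],
        },
    }
    properties = {
        # Different analyzers for indexing and for searching.
        "name": {
            "type": "string", "store": "yes", "index": "analyzed",
            "index_analyzer": "my_analyzer",
            "search_analyzer": "my_search_analyzer",
        },
        # One analyzer for both indexing and searching.
        "description": {"type": "string", "store": "yes",
                        "analyzer": "my_analyzer"},
        # Indexed as a single exact term, with no analysis (an assumption
        # about how SKUs should be matched).
        "sku": {"type": "string", "store": "yes", "index": "not_analyzed"},
    }
    return {
        "settings": {"index": {"analysis": {"analyzer": analyzers}}},
        "mappings": {"products": {"properties": properties}},
    }


def undefined_analyzers(body):
    """Return (type, field, analyzer) triples whose analyzer is not defined."""
    defined = set(body["settings"]["index"]["analysis"]["analyzer"])
    defined |= BUILTIN_ANALYZERS
    missing = set()
    for type_name, type_mapping in body["mappings"].items():
        for field, props in type_mapping["properties"].items():
            for key in ANALYZER_KEYS:
                name = props.get(key)
                if name is not None and name not in defined:
                    missing.add((type_name, field, name))
    return missing


if __name__ == "__main__":
    body = build_index_body()
    # Catch typos in analyzer names before sending the body to the server.
    assert not undefined_analyzers(body)
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same curl -X PUT call used
to create "shop1".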

On Wed, Jul 11, 2012 at 12:36 PM, Vinicius Carvalho <
viniciusccarvalho@gmail.com> wrote:

--

Shairon Toledo
http://hashcode.me


(Vinicius Carvalho) #3

Thanks :)

On Wednesday, July 11, 2012 2:11:07 PM UTC-4, Shairon Toledo wrote:

