Indexes in database


(tullio0106) #1

I'm investigating Elasticsearch and I have a question about
indexes.
I don't understand where indexes are stored, or whether it's possible to
store them in a SQL database (like Oracle or MySQL).
Thanks,
Tullio


(David Pilato) #2

Indexes are stored on disk.
Why do you think you need to store them in a database?

David :wink:
Twitter : @dadoonet / @elasticsearchfr


(tullio0106) #3

A database is safer than a filesystem, and it's also easier to back up.
A crash cannot corrupt the data in a DB, but it can corrupt data on a filesystem.
Thanks,
Tullio


(David Pilato) #4

I disagree. A database is powered by software that uses the filesystem to
store its data.
So you should trust your filesystem. :wink: But this is not the right place
to have that debate!

BTW, I don't think you should worry about backups.
Just increase the number of replicas in your cluster, increase the number of
servers, and spread them all over the world!
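
For illustration, here is a minimal sketch of how the replica count of an existing index could be raised through the index settings endpoint. The host `http://localhost:9200`, the index name `myindex`, and the helper name `build_replica_update` are assumptions made up for this example; the request is only built, not sent, and the exact headers a given Elasticsearch version expects may differ.

```python
import json
from urllib.request import Request


def build_replica_update(base_url: str, index: str, replicas: int) -> Request:
    """Build (but do not send) a PUT request that sets an index's replica count.

    Hypothetical helper: it targets the `/{index}/_settings` endpoint with the
    `number_of_replicas` index setting.
    """
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed host and index name, for illustration only.
req = build_replica_update("http://localhost:9200", "myindex", 2)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually send it to a running node; it is left out here so the sketch runs without a cluster.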

I mean that, IMHO, it is better to keep your source documents
and reindex everything after a really major crash than to build
something complex.

But if you really want to do it, you could perhaps write your own gateway, like
the Hadoop one:
http://www.elasticsearch.org/guide/reference/modules/gateway/hadoop.html

Not sure I answered what you are looking for :wink:
David.

--
David Pilato
http://dev.david.pilato.fr/
Twitter : @dadoonet


(Andrew[.:at:.]DataFeedFile.com) #5

You should think of ES as almost 100% index.
One of the main differences between a relational SQL DB and ES
is that ES was built to search, filter, and facet. For ES
to be very fast, everything is indexed.

To answer your question about where indices are stored:
ES provides several gateways (the gateway is where index storage resides).
The choices include memory, local (filesystem), Hadoop, S3, etc.

--Andrew


(tullio0106) #6

I understand, but a DB has transaction management, which guarantees data
integrity: if the power goes down, data in the DB is safe, while data on the
filesystem is not.
Bye
Tullio


(David Pilato) #7

ES has something similar built in, so you don't really have to think about
it.
Just start ES and relax.

If you trust Oracle when it tells you that your data is stored, trust
ES when it answers that your data is indexed.

My 2 cents
David.


(Andrew[.:at:.]DataFeedFile.com) #8

OK, I see you would like to compare this against transactional queries.

Elasticsearch has no transactions.

However, there is a write consistency option, which lets you trade a little
delay for a safer commit.

Valid write consistency values are one, quorum, and all.

You can read more about it here:
http://www.elasticsearch.org/guide/reference/api/index_.html
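
As a rough sketch of what the three values mean, the snippet below computes how many shard copies would have to be available before a write is accepted, and builds an index URL carrying the setting. It assumes a plain majority rule for `quorum` and the `consistency` query parameter described in the index API page linked above; Elasticsearch's actual rule may differ in edge cases (for example with a single replica), and later versions may use a different parameter, so treat the numbers as illustrative. The host, index, type, and helper names are made up for this example.

```python
from urllib.parse import urlencode

VALID_CONSISTENCY = ("one", "quorum", "all")


def required_copies(consistency: str, number_of_replicas: int) -> int:
    """Return how many shard copies must be active under a plain majority rule.

    Assumption: one primary plus `number_of_replicas` replicas per shard.
    """
    if consistency not in VALID_CONSISTENCY:
        raise ValueError(f"consistency must be one of {VALID_CONSISTENCY}")
    copies = 1 + number_of_replicas  # the primary plus its replicas
    if consistency == "one":
        return 1
    if consistency == "all":
        return copies
    return copies // 2 + 1  # simple majority


def index_url(base_url: str, index: str, doc_type: str, doc_id: str,
              consistency: str) -> str:
    """Build the URL of an index request with a per-operation consistency level."""
    if consistency not in VALID_CONSISTENCY:
        raise ValueError(f"consistency must be one of {VALID_CONSISTENCY}")
    query = urlencode({"consistency": consistency})
    return f"{base_url.rstrip('/')}/{index}/{doc_type}/{doc_id}?{query}"


# Assumed cluster layout: 2 replicas per shard, so 3 copies in total.
for level in VALID_CONSISTENCY:
    print(level, required_copies(level, number_of_replicas=2))
print(index_url("http://localhost:9200", "twitter", "tweet", "1", "quorum"))
```

The stricter the level, the more copies must be reachable before the write is accepted, which is the delay-versus-safety trade-off mentioned above.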

--Andrew


(Lukáš Vlček) #9

Tullio,

Can you elaborate on what exactly you are afraid of?

When you send an index request (which is what you do when indexing new
documents), then as soon as you get a successful response, ES guarantees that the
document is persisted to disk and replicated as well. Of course, there
is also an option not to wait for the replication, but that is not the default
option.

Regards,
Lukáš


(Andy Wick) #10

Backups and replication are not the same thing. Many web pages
discuss the difference, but basically replication will also replicate
bad or corrupted data written by the calling application, while backups
let you recover by going back in time. This is a bigger issue if you
are using ES as a DB and not just for search.
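
A toy model makes the difference concrete. The `ToyStore` class below is a hypothetical in-memory stand-in, not Elasticsearch code: every write is copied to the replica immediately, while a backup is a point-in-time copy taken explicitly.

```python
import copy


class ToyStore:
    """Hypothetical in-memory stand-in for a primary with one replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, key, value):
        # Replication copies every write, whether the data is good or bad.
        self.primary[key] = value
        self.replica[key] = value

    def snapshot(self):
        # A backup is a point-in-time copy, unaffected by later writes.
        return copy.deepcopy(self.primary)


store = ToyStore()
store.write("doc1", "correct content")
backup = store.snapshot()

# A buggy application overwrites the document with bad data.
store.write("doc1", "garbage from a buggy application")

print("replica:", store.replica["doc1"])
print("backup: ", backup["doc1"])
```

The replica faithfully holds the bad data, while the earlier backup still holds the good version.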

Databases can become corrupted no matter how much you pay for them. Even with
UPSes and battery-backed RAID cards, corruption still happens.

Some other NoSQL DBs are adding the option of delayed op logs, so you can
actually have a replica that is 24 hours (or more) behind. Of course, you
could do this at the application level if you really needed it, but it
is kind of a slick feature, although I'm not sure whether it is actually
useful. :slight_smile:

Thanks,
Andy

