I am undecided whether to use the Java client or the REST API in my case
and would greatly appreciate your opinions.
We have a fairly small number of records: fewer than 1M (currently 0.3M).
They are around 2K in size and have over 100 fields (both the size and the
number of fields will grow).
It will be a read-heavy application with maybe 100-200 docs updated
every minute (batch updates will be posted from our main application every
minute, or at some other predefined interval).
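To make the batching idea concrete, here is a minimal sketch of how the main application might collect the IDs of changed records and hand them off at a fixed interval. The class name `PendingUpdates`, its methods, and the `sink` callback are all made up for illustration; nothing here is an ES API, and the real code would load and index the documents inside the callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper: remembers which record IDs changed since the last flush.
class PendingUpdates {
    // A concurrent set, so repeated updates to the same record collapse into one entry.
    private final Set<String> dirtyIds = ConcurrentHashMap.newKeySet();

    // Called by the application whenever a record is created or modified.
    void markDirty(String id) {
        dirtyIds.add(id);
    }

    // Removes and returns the IDs seen so far; IDs marked during the drain
    // may land in this batch or the next one.
    List<String> drain() {
        List<String> batch = new ArrayList<>();
        for (String id : dirtyIds) {
            if (dirtyIds.remove(id)) {
                batch.add(id);
            }
        }
        return batch;
    }

    // Every intervalSeconds, passes the non-empty batch to the sink
    // (for example, code that builds and posts a bulk request).
    ScheduledExecutorService startFlushing(long intervalSeconds, Consumer<List<String>> sink) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                List<String> batch = drain();
                if (!batch.isEmpty()) {
                    sink.accept(batch);
                }
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so swallow and report it.
                System.err.println("Flush failed: " + e);
            }
        }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Something like `pending.startFlushing(60, ids -> indexBatch(ids))` would give the one-minute interval mentioned above, where `indexBatch` is again a hypothetical method. Note that in this sketch a batch whose flush fails is simply dropped; real code would probably want to re-mark those IDs as dirty.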
The main ExtJS rich application currently connects directly from the user's
browser to ES via REST (data is not channeled via our app server) and does
lots of faceting and heavy searches. In the future, due to security
requirements, we may need to enforce access control on groups of documents,
so we may need to use RestChannel or some other layer to do so. Data would
then be channeled via the main app server, or the whole ES node would run
embedded, with authentication and ACL performed by the web container.
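As a rough illustration of what such a layer on the app server could do, the sketch below wraps whatever query the browser sends in a filter restricting results to the groups the user may see. Everything here is an assumption: the class `GroupAcl`, the field name `group_id`, and the use of a `bool` query with a `filter` clause, which only works on ES versions whose query DSL supports it (older releases express the same idea with a `filtered` query). Facets would need the same restriction applied, which this sketch does not cover.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for an app-server layer sitting between the browser and ES.
final class GroupAcl {

    // Wraps the caller's query clause (already valid JSON) so that only documents
    // whose hypothetical "group_id" field matches an allowed group are returned.
    static String restrictToGroups(String queryJson, List<String> allowedGroups) {
        String groups = allowedGroups.stream()
                .map(g -> "\"" + escape(g) + "\"")
                .collect(Collectors.joining(","));
        return "{\"bool\":{\"must\":" + queryJson
                + ",\"filter\":{\"terms\":{\"group_id\":[" + groups + "]}}}}";
    }

    // Escapes backslashes and double quotes so a group name cannot break out of its JSON string.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

The allowed groups would come from the authenticated session on the server side, never from the browser request itself.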
Data can be reindexed fairly easily, so loss of the index is an operational
concern (it affects users) but not a data-loss concern.
First of all, we are planning to use a single node with one shard, mostly
because with a small number of documents and precision requirements for facet
counts and searches sharding would be detrimental, and partially because the
server will be supported by "generic" Unix support staff and I cringe
thinking about all the potential clustering issues after lurking on this group
for a while.
Secondly, at least for now, I am planning to host the code which pushes updates
from the main system to ES in the main application. The chief reason is that I
am most concerned with the effect of indexing updated records on
our Oracle database. The domain objects to be indexed involve dozens of tables,
producing heavy load on Oracle, and doing it in the main app server will let
me use in-process caches (JDO secondary cache) and significantly reduce
load on the database.
Now the question:
- Use the REST bulk API for bulk updates. This will let me avoid ES
  dependencies and upgrade the ES server without having to release the
  application. The same goes for the search REST API, although currently our
  ExtJS JavaScript application mostly consumes ES data directly from the
  browser. Independent deployments are very attractive. I do not want ES
  upgrades to force application releases with full regression and acceptance
  testing and client involvement.
- Use the native Java client, tightly coupling our app and ES. If I use it, I
  would rather move the batch update from the app to either our ES data node
  (we are pretty low volume, as I described above) or a separate indexer node.
  The downside is a much heavier impact on the Oracle database, due to having
  to read domain objects from the database rather than from app server caches.
So while I would love to use the native client, my current thinking is to use
the REST API via Apache HttpClient to do bulk indexing and whatever searches we
need until we have a clear need for the native client. By that time ES
may have a lightweight native client that is less sensitive to ES server
version changes.
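For reference, this is roughly what I have in mind for the REST bulk path. It is only a sketch: to keep it dependency-free it uses the JDK's own `java.net.http.HttpClient` (Java 11+) rather than Apache HttpClient, and the class `BulkIndexer`, its method names, and the URL in the usage note are made up. The bulk body is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;

// Hypothetical helper: builds and posts a bulk "index" request over plain HTTP.
final class BulkIndexer {

    // Builds the newline-delimited bulk body. Each value must be a JSON document
    // serialized on a single line, because newlines separate bulk entries.
    static String buildBulkBody(Map<String, String> jsonById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : jsonById.entrySet()) {
            // Action line: index (create or replace) the document with this ID.
            body.append("{\"index\":{\"_id\":\"").append(escape(e.getKey())).append("\"}}\n");
            // Source line: the document itself.
            body.append(e.getValue()).append('\n');
        }
        return body.toString();
    }

    // Posts the body to a bulk endpoint and returns the HTTP status code.
    // A 200 only means the request was processed; per-item results are in the response body.
    static int post(String bulkUrl, String body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(bulkUrl))
                .header("Content-Type", "application/x-ndjson")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    // Escapes backslashes and double quotes in a document ID.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

Usage would be along the lines of `BulkIndexer.post("http://localhost:9200/records/_bulk", BulkIndexer.buildBulkBody(docs))`, where the host and the index name `records` are placeholders and the exact URL path depends on the ES version. Since the index is named in the URL, the action lines only need to carry the document ID, which keeps the application code free of any ES library dependency.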
Your thoughts will be greatly appreciated.
Alex