Quick question (as I have not seen any opinions on this topic): would using
part of ES as a public application's API be a good idea?
Our app relies heavily on search (livesearch, in general), so cutting out
every intermediate layer (a Ruby app, in this case) would be a huge
performance boost, and since ES speaks JSON, an obvious thought came to
mind: use it directly as the backend API underlying the livesearch. Bad
idea, good idea?
Hi,
I would rather not do it. A couple of reasons why I think this is not a good idea:

- The API can change. Although the ES API has been very stable for the most
  part, release 1.0 is still not here.
- Depending on which part of the API you want to expose, you should be
  careful. Even if you expose only the search-related API, it would allow
  detailed inspection of your index structure, and then anybody could
  craft queries that put unnecessary load on your servers.
- If you want to log your users' activity, you have to do it before
  the request hits the first ES node.

I would personally recommend the opposite: write your own proxy that
exposes only the minimal function set.
Regards,
Lukas
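A minimal Ruby sketch of that proxy idea: the core is an allow-list deciding which requests may be forwarded to ES at all. The index name "products" and the endpoint list here are assumptions for illustration, not anything from the thread.

```ruby
# Allow-list for a search-only proxy in front of Elasticsearch.
# Only an exact (method, path) pair on the list is ever forwarded;
# index and cluster administration endpoints can never be reached.
ALLOWED_REQUESTS = [
  ["GET",  "/products/_search"],   # hypothetical index name
  ["POST", "/products/_search"],
]

# Returns true only for requests the proxy is willing to pass through.
def forwardable?(method, path)
  ALLOWED_REQUESTS.include?([method.upcase, path])
end
```

Anything not on the list (DELETE /products, GET /_cluster/state, and so on) simply never reaches an ES node; the proxy itself, say a small Sinatra or Rack app, answers 404 for those.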
On Sunday, 26 June 2011, Bartosz Pietrzak pietrzak@bartosz.me wrote:
I've been thinking about this too - is there any way within ES to limit
public access to the read-only endpoints, or is that something that has to
be configured within the hosting web server (not sure how the web server
world works in Javaland)?
Guys, this is seriously not a good idea. ES is designed as a systems component: it is meant to operate inside a closed network as part of a back-end system, and it does not have the security and intrusion hardening needed to operate as a directly connected internet service.
As stated before, put a service between ES and the web to buffer, rate limit, log, and allow blocking of dangerous options.
Also remember that even with a read-only query system, if you allow arbitrary queries from hosts that had little or no involvement in the design and construction of the data sets provided, then you WILL get major performance issues; again, an intervening service would limit queries to the operations your datasets can safely provide.
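A rough Ruby sketch of the rate-limiting part of such an in-between service; the one-second window and the request limit are made-up numbers for illustration:

```ruby
# Per-client fixed-window rate limiter for a search proxy:
# each client IP gets at most MAX_REQUESTS per WINDOW seconds.
class RateLimiter
  WINDOW = 1          # seconds (illustrative)
  MAX_REQUESTS = 10   # per window per client (illustrative)

  def initialize
    # ip => [window_start_time, request_count_in_window]
    @counts = Hash.new { |h, k| h[k] = [0.0, 0] }
  end

  # Returns true if the request may proceed, false if over the limit.
  def allow?(ip, now = Time.now.to_f)
    start, count = @counts[ip]
    if now - start >= WINDOW
      @counts[ip] = [now, 1]    # new window starts with this request
      true
    elsif count < MAX_REQUESTS
      @counts[ip] = [start, count + 1]
      true
    else
      false                     # over the limit; reject, optionally log
    end
  end
end
```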
On Monday, June 27, 2011 at 9:33 AM, Eric Mill wrote:
I've been thinking about this too - is there any way within ES to limit public access to the read-only endpoints, or is that something that has to be configured within the hosting web server (not sure how the web server world works in Javaland)?
Elasticsearch also supports JSONP and powers the search on elasticsearch.org. I
see your points, and even agree that it is likely a bad idea, but you didn't
answer my question. Is there any way within ES to limit destructive
endpoints to credentialed users, or should this be done at the web server
level?
b) It is not convenient to directly expose the ES API to your
users. I don't believe a thin Ruby wrapper around ES adds any
significant overhead, and you'll need things like authentication,
throttling, and storing analytics anyway...
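A sketch of what that thin wrapper buys you, in Ruby; the key store and the log structure here are invented for illustration:

```ruby
# Thin-wrapper duties that raw ES cannot do for you: check a
# credential and record the query for analytics before anything
# is forwarded. API_KEYS and QUERY_LOG are hypothetical stand-ins
# for a real credential store and analytics sink.
API_KEYS = { "k-123" => "demo-user" }
QUERY_LOG = []

# Returns the user name for a valid key (after logging the query),
# or nil for an unknown key, in which case nothing is forwarded.
def authenticate_and_log(key, query)
  user = API_KEYS[key]
  return nil unless user
  QUERY_LOG << { user: user, query: query, at: Time.now.utc }
  user
end
```

The point is that both checks happen before the request ever reaches an ES node, so analytics and auth never depend on ES itself.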
In my case, I started off by exposing CouchDB APIs to external clients, but
over time I have exposed Elasticsearch APIs to our data.
My setup is something like this:

- nginx (security, https, limiting to GET/POST, etc.). This also does
  load balancing to the layers below.
- A Django (spawning + eventlet) server handling auth and ACLs.
- Elasticsearch beneath Django.

The plan is to slowly remove the Django layer in between.
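The nginx layer in a stack like that might look roughly like this; the upstream addresses, server name, and paths are invented, and this is a sketch rather than the poster's actual configuration:

```nginx
# Hypothetical front layer: HTTPS termination, GET/POST only,
# load balancing across the auth/ACL layer below.
upstream search_backend {
    server 10.0.0.1:8000;   # django auth/acl instance
    server 10.0.0.2:8000;
}

server {
    listen 443 ssl;
    server_name search.example.com;

    location /search {
        # Reject anything that is not a read-style request.
        limit_except GET POST {
            deny all;
        }
        proxy_pass http://search_backend;
    }
}
```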