River plugin: detect updates in the input data

Hi,
Is there any way to be notified that a river index has been modified?
I have tried the RSS River plugin, and it seems that once the RSS
input has been defined, the plugin starts with that data set and any
attempt to add to or modify the input data has no effect: the new RSS
sources are stored, but the river is not notified that the input data
has changed, so the new feeds are not processed.
Is there any way, other than restarting the plugin or deleting and
recreating the index, to process the updated items?

Thanks in advance,
Marc

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Could you open an issue on the RSS river project?
I think I can add support for updates. That said, you can remove the _river entry itself and recreate it.
That does not mean you have to remove the index that contains your data.
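A minimal sketch of that remove-and-recreate step, in Python with only the standard library. The `localhost:9200` address and the `_river/<name>/_meta` path follow the curl commands used elsewhere in this thread; the function names are hypothetical. Some Elasticsearch versions document deleting the whole `_river/<name>/` type instead, so the exact DELETE path is an assumption worth checking against your version.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, as in this thread


def recreate_river_requests(name, definition, es_url=ES_URL):
    """Build the DELETE and PUT requests that re-register one river.

    Only the _river index is addressed; the index that holds the
    fetched articles is never referenced, so its data is left alone.
    """
    meta_url = f"{es_url}/_river/{name}/_meta"
    body = json.dumps(definition).encode("utf-8")
    delete = urllib.request.Request(meta_url, method="DELETE")
    put = urllib.request.Request(
        meta_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return [delete, put]


def send_all(reqs):
    """Send the requests in order (needs a running node)."""
    for req in reqs:
        with urllib.request.urlopen(req) as resp:
            print(req.get_method(), req.full_url, resp.status)
```

Building the requests separately from sending them makes it possible to inspect what would be sent before touching a live node; `send_all(recreate_river_requests("lemonde", definition))` would then perform the two calls.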

Does it help?

David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

Thanks for the quick reply; I have added the issue to the RSS project.
BTW, do you have any suggestion on how to be notified that the index
containing the list of feeds has been updated?
Deleting and recreating the _river makes the plugin kill and restart all
of its threads; it would be nice to add or remove only the threads whose
data has changed.

Not sure I fully understood your question. Let me answer like this.

You can create as many RSS river instances as you want.
For example:

$ curl -XPUT 'localhost:9200/_river/lemonde/_meta' -d '{
  "type": "rss",
  "rss": {
    "feeds": [
      {
        "name": "lemonde",
        "url": "http://www.lemonde.fr/rss/une.xml",
        "update_rate": 900000,
        "ignore_ttl": true
      }
    ]
  }
}'

And:

$ curl -XPUT 'localhost:9200/_river/lefigaro/_meta' -d '{
  "type": "rss",
  "rss": {
    "feeds": [
      {
        "name": "lefigaro",
        "url": "http://rss.lefigaro.fr/lefigaro/laune",
        "update_rate": 900000,
        "ignore_ttl": true
      }
    ]
  }
}'

Then, if you need to stop one specific river, you can do it like this:

$ curl -XDELETE 'localhost:9200/_river/lefigaro/_meta'

That should remove only the RSS river instance for that last URL.

Is that what you are looking for?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

That would create separate indexes for the articles (i.e. lefigaro and
lemonde), which makes searching across the results more complex.
What I'd like is several RSS feed sources whose articles are collected
in the same index. This could be done by putting several elements in the
"feeds" array of the curl request below, but if I update that list and
send the request again with different elements, the request is ignored.

$ curl -XPUT 'localhost:9200/_river/lemonde/_meta' -d '{
  "type": "rss",
  "rss": {
    "feeds": [
      {
        "name": "lemonde",
        "url": "http://www.lemonde.fr/rss/une.xml",
        "update_rate": 900000,
        "ignore_ttl": true
      }
    ]
  }
}'

$ curl -XPUT 'localhost:9200/_river/articles/_meta' -d '{
  "type": "rss",
  "rss": {
    "feeds": [
      {
        "name": "lemonde",
        "url": "http://www.lemonde.fr/rss/une.xml",
        "update_rate": 900000,
        "ignore_ttl": true
      },
      {
        "name": "lefigaro",
        "url": "http://rss.lefigaro.fr/lefigaro/laune"
      }
    ]
  }
}'

The second request is ignored; or rather, a new version of the index
("articles" in the example above) is created, but the river does not
receive the update and keeps working as before until Elasticsearch and
the plugin are restarted or the _river index is deleted and re-created.
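The two requests above differ only in their "feeds" array. As a rough client-side sketch (not something the plugin does, according to this thread), the difference between an old and a new feed list could be worked out as below, so that only the affected feeds need to be re-registered. The helper name `diff_feeds` is hypothetical, and feeds are assumed to be identified by their "name" field.

```python
def diff_feeds(old_feeds, new_feeds):
    """Compare two "feeds" arrays, keyed by feed name.

    Returns (added, removed, changed) as sorted lists of feed names.
    A feed counts as changed when its name is in both lists but any
    other setting (url, update_rate, ...) differs.
    """
    old = {feed["name"]: feed for feed in old_feeds}
    new = {feed["name"]: feed for feed in new_feeds}
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(
        name for name in old.keys() & new.keys() if old[name] != new[name]
    )
    return added, removed, changed


if __name__ == "__main__":
    before = [
        {"name": "lemonde", "url": "http://www.lemonde.fr/rss/une.xml",
         "update_rate": 900000, "ignore_ttl": True},
    ]
    after = before + [
        {"name": "lefigaro", "url": "http://rss.lefigaro.fr/lefigaro/laune"},
    ]
    print(diff_feeds(before, after))
```

Keying on the feed name keeps the comparison independent of the order in which feeds appear in the array.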

Yeah, unless you set the index and type for that river. Sounds like I forgot to document it! :(

It's something like:

{
  "type": "rss",
  "rss": {
    "feeds": [
      {
        "name": "lemonde",
        "url": "http://www.lemonde.fr/rss/une.xml",
        "update_rate": 900000,
        "ignore_ttl": true
      }
    ]
  },
  "index": {
    "index": "myindex",
    "type": "mytype"
  }
}

So basically, you should be able to send every feed to the same index / type.
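Combining this with the one-river-per-feed approach from earlier in the thread, a sketch of generating the river definitions could look like the following. The "index" block mirrors the "something like" example above, so its exact shape is an assumption; the function names and the default index and type values are hypothetical.

```python
import json


def river_definition(feed, index, doc_type):
    """Return a _meta body for a river that polls a single feed."""
    return {
        "type": "rss",
        "rss": {"feeds": [feed]},
        # Assumed shape, taken from the example in this thread.
        "index": {"index": index, "type": doc_type},
    }


def rivers_for(feeds, index="articles", doc_type="article"):
    """Map each feed name to its own river definition.

    Every definition points at the same index and type, so the
    articles of all feeds end up together.
    """
    return {
        feed["name"]: river_definition(feed, index, doc_type)
        for feed in feeds
    }


if __name__ == "__main__":
    feeds = [
        {"name": "lemonde", "url": "http://www.lemonde.fr/rss/une.xml"},
        {"name": "lefigaro", "url": "http://rss.lefigaro.fr/lefigaro/laune"},
    ]
    for name, body in rivers_for(feeds).items():
        # Each body would be PUT to /_river/<name>/_meta.
        print(f"/_river/{name}/_meta", json.dumps(body))
```

With one river per feed, a single feed can be stopped or re-registered without restarting the others, while searches still run against one index.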

Does it help?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs
