CustomQueryParser and customSimilarity to integrate payload


(Aurélien-3) #1

Hi,

As part of ongoing work to integrate payloads into my Elasticsearch
application, I have successfully developed custom analyzers, a custom
similarity and a custom query parser. BTW, if anyone is interested in a code
example for a custom query parser, let me know and I'll take some time to put
it on GitHub.

Now my issue is how to use my custom similarity in the scoring process of the
custom query parser. Can someone give me some ideas on how and where this is
done in the code?

Thank you all.
Aurelien

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Luca Cavanna) #2

The scoring part is in the similarity. Have a look here for an example of a
custom similarity:
https://github.com/tlrx/elasticsearch-custom-similarity-provider/
If you want to score based on payloads, the interesting method to override
in your own similarity is scorePayload. The Lucene javadoc should help you
understand what you need to do:
http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html#scorePayload(int,+int,+int,+org.apache.lucene.util.BytesRef)
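
To make the payload part concrete, here is a minimal stand-alone sketch of
what a scorePayload override typically has to do with a float payload. It
uses only the JDK, not Lucene: the class and method names are hypothetical,
and the 4-byte big-endian layout is an assumption (it is, to my
understanding, what PayloadHelper.encodeFloat produces). The point it
illustrates is that the payload may sit at an offset inside a larger shared
buffer, so decoding has to start at that offset.

```java
import java.nio.ByteBuffer;

/** Hypothetical, JDK-only sketch: how a float travels through a 4-byte payload. */
final class FloatPayloadSketch {

    /** Encodes a float as 4 big-endian bytes (assumed payload layout). */
    static byte[] encode(float value) {
        // ByteBuffer is big-endian by default.
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    /** Decodes a float that starts at the given offset of a possibly shared buffer. */
    static float decode(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 4).getFloat();
    }

    /** Mirrors the usual null check: no payload means a neutral factor of 1. */
    static float scoreOrNeutral(byte[] bytes, int offset) {
        return bytes == null ? 1.0f : decode(bytes, offset);
    }

    public static void main(String[] args) {
        // Simulate a shared buffer in which the payload starts at offset 3.
        byte[] shared = new byte[16];
        System.arraycopy(encode(2.5f), 0, shared, 3, 4);
        System.out.println(scoreOrNeutral(shared, 3)); // decoded payload factor
        System.out.println(scoreOrNeutral(null, 0));   // neutral factor
    }
}
```

In a real similarity the same idea applies to the BytesRef argument, which
carries bytes, offset and length together.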

Cheers
Luca



(Aurélien-3) #3

Hi, thanks for the hint. Actually, I've already done all this. Let me give
some more details:

I've already written my custom similarity:

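    // CustomSimilarityProvider.java
    import org.apache.lucene.search.similarities.Similarity;
    import org.elasticsearch.common.inject.Inject;
    import org.elasticsearch.common.inject.assistedinject.Assisted;
    import org.elasticsearch.index.similarity.AbstractSimilarityProvider;

    public class CustomSimilarityProvider extends AbstractSimilarityProvider {

        private final CustomSimilarity similarity;

        @Inject
        protected CustomSimilarityProvider(@Assisted String name) {
            super(name);
            this.similarity = new CustomSimilarity();
        }

        @Override
        public Similarity get() {
            return similarity;
        }
    }

    // CustomSimilarity.java
    import org.apache.lucene.analysis.payloads.PayloadHelper;
    import org.apache.lucene.search.similarities.DefaultSimilarity;
    import org.apache.lucene.util.BytesRef;

    public class CustomSimilarity extends DefaultSimilarity {

        @Override
        public float scorePayload(int doc, int start, int end, BytesRef payload) {
            if (payload != null) {
                // Decode from payload.offset: the BytesRef may point into a shared buffer.
                return PayloadHelper.decodeFloat(payload.bytes, payload.offset);
            } else {
                // No payload at this position: return a neutral factor.
                return 1.0F;
            }
        }
    }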

I've created my own query parser:

    public class PayloadQueryParser implements QueryParser {
        <...>

        public Query parse(QueryParseContext parseContext)
                throws IOException, QueryParsingException {
            XContentParser parser = parseContext.parser();
            <...>
            return query; // A PayloadTermQuery
        }
    }

I've registered all of these through a plugin, using the dedicated module:

    public void processModule(Module module) {
        <...>
        if (module instanceof IndexQueryParserModule) {
            ((IndexQueryParserModule) module).addQueryParser("payload_query",
                    PayloadQueryParser.class);
        }
    }

But it seems I'm missing something for the similarity to actually be used
for scoring.

I've noticed that QueryParseContext refers to the defaultSimilarity, among
other things. Could it be linked to the configuration of the mapping, or is
there a trick to register and make use of the similarity?

I've been looking around in the code, but I haven't found a hint about it
yet. Luca, maybe you know which classes use the similarity?

In plain Lucene, I would inject the similarity as follows:

    this.isearcher = new IndexSearcher(reader);
    isearcher.setSimilarity(new PayloadSimilarity());

before executing a PayloadTermQuery.
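
For readers wondering what the similarity contributes once it is wired in:
as far as I understand Lucene 4.x, a PayloadTermQuery calls scorePayload
once per matching position and hands the values to a PayloadFunction (for
example AveragePayloadFunction). The sketch below mimics that with plain
JDK code, assuming an averaging function and that the regular term score is
included. The class and method names are hypothetical, not Lucene's.

```java
/** Hypothetical, JDK-only sketch of averaging payload factors into a document score. */
final class PayloadAverageSketch {

    /**
     * @param termScore      the ordinary term score for the document
     * @param payloadFactors one value per matching position, as scorePayload would return
     * @return the term score multiplied by the average payload factor
     */
    static float score(float termScore, float[] payloadFactors) {
        if (payloadFactors.length == 0) {
            // No payloads seen: behave like a neutral factor of 1.
            return termScore;
        }
        float sum = 0f;
        for (float factor : payloadFactors) {
            sum += factor;
        }
        return termScore * (sum / payloadFactors.length);
    }

    public static void main(String[] args) {
        // Two matching positions with payload factors 1.0 and 3.0.
        System.out.println(score(2f, new float[] {1f, 3f}));
    }
}
```

If the custom similarity is not the one in use, the factors come from the
default scorePayload instead, so the payloads appear to have no effect.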

BR,
Aurelien



(Luca Cavanna) #4

Hi,
I think the only missing part is the one that registers your custom
similarity provider:

    curl -XPOST 'http://host:port/tweeter/' -d '
    {
      "settings": {
        "similarity": {
          "index": {
            "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
          },
          "search": {
            "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
          }
        }
      }
    }'

The similarity is used at index time to compute norms, which take index-time
boosting into account, and at search time to score the documents. That's why
it is specified twice in the example above. The example sets the similarity
for a whole index, but as of 0.90 you can also set it per field. Have a
look at the reference:
http://www.elasticsearch.org/guide/reference/index-modules/similarity/

Cheers
Luca



(Aurélien-3) #5

Yes, thanks a lot, this is actually what was missing. I hadn't noticed
this settings registration step.

Great! That's awesome, man.



(Aurélien-3) #6

Hi Luca,

Well, I still have a question related to this. In debug mode, everything you
described works great. Once the plugin is compiled and installed through the
plugin interface, I get strange behaviour that I guess is related to some
injection mechanism.

On index creation with the settings, it returns the following:

{"error":"RemoteTransportException[[serv03][inet[/x.x.x.x:9300]][indices/create]];
nested: IndexCreationException[[test] failed to create index]; nested:
NoClassSettingsException[Failed to load class setting [type] with value
[com.myproject.elasticsearch.similarity.CustomSimilarityProvider]]; nested:
ClassNotFoundException[com.myproject.elasticsearch.similarity.customsimilarityprovider.CustomSimilarityProviderSimilarityProvider];
","status":500}

It seems very odd to see a nested path! Any idea what constraints I have to
respect for it to work?

In Eclipse, though, running in debug works great.



(Luca Cavanna) #7

Hi,
don't worry about the nested path: if the class is not found in the
classpath, we try to look it up by adding a suffix. Potentially, you could
just register the similarity provider under the name custom, and the related
class CustomSimilarityProvider would be found anyway.

The issue here is that the class is not found in the classpath. Did you add
the jar files to the lib of your plugins? I suspect something might be
going wrong there.
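
The fallback lookup can be sketched roughly as follows. This is a JDK-only
illustration inferred from the explanation above and from the class name in
the error message, not Elasticsearch's actual code: the class name, the
method names, the default package and the exact order of attempts are all
assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of a "type" setting being resolved to a class name. */
final class ClassLookupSketch {

    /** Lists the class names that would be tried, in order, for a setting value. */
    static List<String> candidates(String value, String defaultPackage, String suffix) {
        List<String> names = new ArrayList<>();
        names.add(value); // 1. the value exactly as configured

        String prefix = defaultPackage; // must end with a dot
        String simple = value;
        int dot = value.lastIndexOf('.');
        if (dot > 0) {
            prefix = value.substring(0, dot + 1);
            simple = value.substring(dot + 1);
        }
        if (simple.isEmpty()) {
            return names; // nothing sensible to derive from a trailing dot
        }
        String capitalized = Character.toUpperCase(simple.charAt(0)) + simple.substring(1);
        // 2. capitalized short name plus suffix, e.g. custom -> CustomSimilarityProvider
        names.add(prefix + capitalized + suffix);
        // 3. the same class inside a lower-cased sub-package (the "nested path")
        names.add(prefix + simple.toLowerCase(Locale.ROOT) + "." + capitalized + suffix);
        return names;
    }

    /** Tries each candidate in turn and rethrows the last failure if none loads. */
    static Class<?> resolve(String value, String defaultPackage, String suffix)
            throws ClassNotFoundException {
        ClassNotFoundException last = null;
        for (String name : candidates(value, defaultPackage, suffix)) {
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException e) {
                last = e; // remember the failure and try the next candidate
            }
        }
        throw last;
    }
}
```

Under these assumptions the exception reports the last name that was tried,
which would explain why the message shows the odd-looking nested path even
though the real problem is that the first name could not be loaded.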

Cheers
Luca



(Aurélien-3) #8

Yeah, thanks. Actually, it was my mistake: I simply needed to restart the
cluster for the plugin to be taken into account.

It seems similarities are only loaded at start time and not at plugin
installation time. Anyway, it works great.

Le lundi 16 septembre 2013 11:59:02 UTC+3, Luca Cavanna a écrit :

Hi,
don't worry about the nested path, if the class is not found in the
classpath we try to look it up adding a suffix. Potentially, you could just
register the simlarity provider calling it custom and the related class
CustomSimilarityProvider would found anyway.

The issue here is that the class is not found in the classpath. Did you
add the jar files to the lib of your plugins? I suspect something might be
going wrong there.

Cheers
Luca

On Fri, Sep 13, 2013 at 4:57 PM, Aurélien <golden...@gmail.com<javascript:>

wrote:

Hi Luca,

Well I still have a bit of question related to this. In debug all what
you told works great. Once compiled and install through the plugin
interface, I have a strange behaviour related I guess to some injection
mechanism.

It returns me the following (on index creation with settings)

{"error":"RemoteTransportException[[serv03][inet[/x.x.x.x:9300]][indices/create]];
nested: IndexCreationException[[test] failed to create index]; nested:
NoClassSettingsException[Failed to load class setting [type] with value
[com.myproject.elasticsearch.similarity.CustomSimilarityProvider]]; nested:
ClassNotFoundException[com.myproject.elasticsearch.similarity.customsimilarityprovider.CustomSimilarityProviderSimilarityProvider];
","status":500}%

Seems very awkward to see a nested path !! any idea or constraints I may
have to respect for it to work ?

Though in my eclipse while running in debug works great.

Le jeudi 12 septembre 2013 17:09:29 UTC+3, Aurélien a écrit :

Yes, thanks a lot, this is what was missing actually. I haven't noticed
this setting registration step.

Great !! That's awesome man.

Le jeudi 12 septembre 2013 14:39:43 UTC+3, Luca Cavanna a écrit :

Hi,
I think the only missing part is the one that registers your custom
similarity provider:

curl -XPOST 'http://host:port/tweeter/' -d '
{
"settings": {
"similarity": {
"index": {
"type": "org.elasticsearch.index.**similarity.**CustomSimilarityProvider"
},
"search": {
"type": "org.elasticsearch.index.**similarity.**CustomSimilarityProvider"
}
}
}
}'

It is used at index time to compute norms, which take into account
index time boosting, and at search time to score the documents. That's why
it is specified twice in the example above. The example above sets the
similarity for a whole index, but as of 0.90 you can also set it per field.
Have a look at the reference: http://www.elasticsearch.org/**
guide/reference/index-modules/**similarity/http://www.elasticsearch.org/guide/reference/index-modules/similarity/
.

Cheers
Luca

On Wednesday, September 11, 2013 5:49:53 PM UTC+2, Aurélien wrote:

Hi, thanks for the hint. Actually I've done all this. Let me put some
more details :

I've made already my custom similarity :

public class CustomSimilarityProvider extends
AbstractSimilarityProvider {

private final CustomSimilarity similarity;

@Inject
protected CustomSimilarityProvider(@**Assisted String name) {
super(name);
this.similarity = new CustomSimilarity();
}

@Override
public Similarity get() {
return similarity;
}

}

public class CustomSimilarity extends DefaultSimilarity {

@Override
public float scorePayload(int doc, int start, int end, BytesRef
payload) {
if (payload != null) {
return PayloadHelper.decodeFloat(**payload.bytes);
} else {
return 1.0F;
}
}
}

I've created my own queryParser :

public class PayloadQueryParser implements QueryParser {
<...>

public Query parse(QueryParseContext parseContext) throws
IOException, QueryParsingException {
XContentParser parser = parseContext.parser();
<...>
return query; // A PayloadTermQuery
}
}

I've registered all these using a Plugin registering with the
dedicated module :

public void processModule(Module module) {
<...>
if(module instanceof IndexQueryParserModule){
((IndexQueryParserModule) module).addQueryParser("**payload_query",
PayloadQueryParser.class);
}
}

But seems I'm missing something so that the Similarity is actually
used for scoring.

I've noticed in the QueryParserContext that it makes reference to the
defaultSimilarity, but not only. May it be linked with the configuration of
the mapping, or there is a trick to register and make use of the
similarity?

I've been lurking around in the code. I haven't found a hint about it
yet. Luca maybe you know the classes that use the similarity ?

Actually in Lucene, I would inject the similarity as follow:
this.isearcher = new IndexSearcher(reader);
isearcher.setSimilarity(new PayloadSimilarity());

before executing a PayloadTermQuery.

BR,
Aurelien



(Luca Cavanna) #9

Yep, you have to restart the node for the classes to be available in the
classpath :wink:

In the end, the plugin script only copies files, nothing more.

On Mon, Sep 16, 2013 at 5:54 PM, Aurélien goldenlink82@gmail.com wrote:

Yeah, thanks. Actually, it was my mistake: I simply needed to restart the
cluster for the plugin to be taken into account.

It seems similarities are only loaded at start time, not at plugin
installation time. Anyway, it works great.

On Monday, September 16, 2013 11:59:02 AM UTC+3, Luca Cavanna wrote:

Hi,
don't worry about the nested path: if the class is not found in the
classpath, we try to look it up by adding a suffix. Potentially, you could just
register the similarity provider under the name custom, and the related class
CustomSimilarityProvider would be found anyway.
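
To make the fallback lookup concrete, here is a minimal sketch of the order in which candidate class names might be tried. It is inferred only from the error message quoted further down and from the description above, not copied from the Elasticsearch source. The class ClassNameCandidates, its candidates method, and the default package "org.elasticsearch.index.similarity." are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: lists the class names a suffix-based lookup might try, in order. */
public class ClassNameCandidates {

    public static List<String> candidates(String value, String defaultPackage, String suffix) {
        List<String> names = new ArrayList<>();
        // 1. The configured value, taken as a fully qualified class name.
        names.add(value);

        // Split "a.b.Name" into the package prefix "a.b." and the simple name "Name".
        // A bare name such as "custom" falls back to the (assumed) default package.
        String prefix = defaultPackage;
        String simple = value;
        int dot = value.lastIndexOf('.');
        if (dot > 0) {
            prefix = value.substring(0, dot + 1);
            simple = value.substring(dot + 1);
        }
        if (simple.isEmpty()) {
            return names; // nothing left to derive further candidates from
        }
        String capitalized = Character.toUpperCase(simple.charAt(0)) + simple.substring(1);

        // 2. Same package, capitalized name plus suffix.
        names.add(prefix + capitalized + suffix);
        // 3. Lowercased name as a sub-package, then capitalized name plus suffix.
        names.add(prefix + simple.toLowerCase(Locale.ROOT) + "." + capitalized + suffix);
        return names;
    }

    public static void main(String[] args) {
        for (String name : candidates(
                "com.myproject.elasticsearch.similarity.CustomSimilarityProvider",
                "org.elasticsearch.index.similarity.",
                "SimilarityProvider")) {
            System.out.println(name);
        }
    }
}
```

Under these assumptions, the last candidate is the odd-looking nested name from the exception: it is only the final fallback that gets reported once every attempt has failed, so the real problem is that the first candidate could not be loaded.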

The issue here is that the class is not found in the classpath. Did you
add the jar files to the lib of your plugins? I suspect something might be
going wrong there.

Cheers
Luca

On Fri, Sep 13, 2013 at 4:57 PM, Aurélien golden...@gmail.com wrote:

Hi Luca,

Well, I still have a question related to this. In debug, everything you
described works great. Once compiled and installed through the plugin
interface, I get a strange behaviour, related I guess to some injection
mechanism.

It returns the following on index creation with settings:

{"error":"RemoteTransportException[[serv03][inet[/x.x.x.x:9300]][indices/create]];
nested: IndexCreationException[[test] failed to create index]; nested:
NoClassSettingsException[Failed to load class setting [type] with
value [com.myproject.elasticsearch.similarity.CustomSimilarityProvider]];
nested: ClassNotFoundException[com.myproject.elasticsearch.similarity.customsimilarityprovider.CustomSimilarityProviderSimilarityProvider];
","status":500}%

It seems very odd to see a nested path! Any idea, or any constraints I have
to respect for it to work?

When running in debug from Eclipse, though, it works great.

On Thursday, September 12, 2013 5:09:29 PM UTC+3, Aurélien wrote:

Yes, thanks a lot, this is actually what was missing. I hadn't noticed
this settings registration step.

Great! That's awesome, man.

On Thursday, September 12, 2013 2:39:43 PM UTC+3, Luca Cavanna wrote:

Hi,
I think the only missing part is the one that registers your custom
similarity provider:

curl -XPOST 'http://host:port/tweeter/' -d '
{
  "settings": {
    "similarity": {
      "index": {
        "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
      },
      "search": {
        "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
      }
    }
  }
}'

It is used at index time to compute norms, which take into account
index time boosting, and at search time to score the documents. That's why
it is specified twice in the example above. The example above sets the
similarity for a whole index, but as of 0.90 you can also set it per field.
Have a look at the reference:
http://www.elasticsearch.org/guide/reference/index-modules/similarity/.

Cheers
Luca

On Wednesday, September 11, 2013 5:49:53 PM UTC+2, Aurélien wrote:

Hi, thanks for the hint. Actually, I've done all this already. Let me give
some more details.

I've already written my custom similarity:

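import org.apache.lucene.search.similarities.Similarity;
import org.elasticsearch.common.inject.Inject;
import org.elasticsearch.common.inject.assistedinject.Assisted;
import org.elasticsearch.index.similarity.AbstractSimilarityProvider;

public class CustomSimilarityProvider extends AbstractSimilarityProvider {

    private final CustomSimilarity similarity;

    @Inject
    protected CustomSimilarityProvider(@Assisted String name) {
        super(name);
        this.similarity = new CustomSimilarity();
    }

    // Returns the single shared similarity instance used for scoring.
    @Override
    public Similarity get() {
        return similarity;
    }

}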

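import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.search.similarities.DefaultSimilarity;
import org.apache.lucene.util.BytesRef;

public class CustomSimilarity extends DefaultSimilarity {

    @Override
    public float scorePayload(int doc, int start, int end, BytesRef payload) {
        if (payload != null) {
            // Decode from payload.offset: the BytesRef may be a slice of a larger array.
            return PayloadHelper.decodeFloat(payload.bytes, payload.offset);
        } else {
            // No payload at this position: use a neutral factor.
            return 1.0F;
        }
    }
}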
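
The payload handed to scorePayload is a BytesRef, that is, a slice (bytes, offset, length) of a possibly larger array, so decoding should start at the slice's offset rather than at index 0. Below is a minimal, Lucene-free sketch of that idea using plain arrays. The class PayloadFloatSketch and its methods are hypothetical, and the four-byte big-endian layout is an assumption about how the float was encoded at index time.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for payload decoding, using plain arrays instead of Lucene's BytesRef. */
public class PayloadFloatSketch {

    /** Writes a float as four big-endian bytes starting at the given offset. */
    public static void encodeFloat(float value, byte[] dest, int offset) {
        // ByteBuffer is big-endian by default.
        ByteBuffer.wrap(dest, offset, 4).putFloat(value);
    }

    /** Reads the float stored in the slice, or the neutral 1.0f when it is absent or too short. */
    public static float scorePayload(byte[] bytes, int offset, int length) {
        if (bytes == null || length < 4) {
            return 1.0f;
        }
        return ByteBuffer.wrap(bytes, offset, 4).getFloat();
    }

    public static void main(String[] args) {
        byte[] shared = new byte[16];   // stands in for a buffer shared by several payloads
        encodeFloat(2.5f, shared, 3);   // this payload starts at offset 3, not 0

        System.out.println(scorePayload(shared, 3, 4)); // decoded at the right offset
        System.out.println(scorePayload(shared, 0, 4)); // decoded at index 0: a different value
        System.out.println(scorePayload(null, 0, 0));   // no payload: neutral factor
    }
}
```

Returning 1.0f for a missing payload keeps the factor neutral, so terms without a payload are scored as if no payload boost were applied.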

I've created my own query parser:

public class PayloadQueryParser implements QueryParser {
    <...>

    public Query parse(QueryParseContext parseContext)
            throws IOException, QueryParsingException {
        XContentParser parser = parseContext.parser();
        <...>
        return query; // A PayloadTermQuery
    }
}

I've registered all of these in a plugin, through the dedicated module:

public void processModule(Module module) {
    <...>
    if (module instanceof IndexQueryParserModule) {
        ((IndexQueryParserModule) module).addQueryParser("payload_query",
                PayloadQueryParser.class);
    }
}

But it seems I'm missing something for the similarity to actually be
used for scoring.

I've noticed that QueryParseContext refers to the defaultSimilarity, among
others. Could this be linked to the mapping configuration, or is there a
trick to register and use the similarity?

I've been digging around in the code but haven't found a hint yet.
Luca, maybe you know which classes use the similarity?

In plain Lucene, I would inject the similarity as follows:

this.isearcher = new IndexSearcher(reader);
isearcher.setSimilarity(new PayloadSimilarity());

before executing a PayloadTermQuery.

BR,
Aurelien



(system) #10