Mapping attachment seems to fail


(tullio0106) #1

I created a new index with the following code:

    private static void createIndex(Client xpClient)
            throws JsonGenerationException, JsonMappingException, IOException {
        CreateIndexRequest cri = new CreateIndexRequest("documentale");
        HashMap<String,Object> indice = new HashMap<String,Object>();
        HashMap<String,Object> documento = new HashMap<String,Object>();
        HashMap<String,Object> mappings = new HashMap<String,Object>();
        HashMap<String,Object> properties = new HashMap<String,Object>();
        HashMap<String,Object> contenuto = new HashMap<String,Object>();
        contenuto.put("type","attachment");
        contenuto.put("_content_type", "application/pdf");
        properties.put("contenuto", contenuto);
        documento.put("properties", properties);
        mappings.put("documento",documento);
        indice.put("mappings", mappings);
        StringWriter sw = new StringWriter();
        mapper.writeValue(sw, indice);
        String json = sw.getBuffer().toString();
        System.out.println(json);
        cri.source(json);
        xpClient.admin().indices().create(cri).actionGet();
    }

Then I inserted a document using:

    HashMap<String,Object> mappa = new HashMap<String,Object>();
    mappa.put("user", "Tullio");
    mappa.put("data", new Date());
    mappa.put("message", "La vispa teresina avea tra");
    long inizio = new Date().getTime();
    FileInputStream fis = new FileInputStream("C:\\Tmp\\mac_A17882.pdf");
    byte[] contiene = IOUtils.toByteArray(fis);
    String contenuto = new String(JsonUtils.encode(contiene));
    mappa.put("contenuto", contenuto);
    StringWriter sw = new StringWriter();
    mapper.writeValue(sw, mappa);
    String json = sw.getBuffer().toString();
    System.out.println(json);
    IndexResponse irb = c.prepareIndex("documentale","documento","2").setSource(json).execute().actionGet();
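
For reference, the attachment field expects its value to be the Base64-encoded bytes of the file. `JsonUtils.encode` is not shown in this post, so the following is only a sketch of an equivalent encoding step using the JDK's `java.util.Base64` (Java 8 and later, so newer than the setup in this thread). The class name `Base64Sketch` and the sample bytes are made up for illustration. It checks that the encoded string is non-empty and decodes back to the original bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class Base64Sketch {

    // Encode raw file bytes as a Base64 string (hypothetical stand-in for JsonUtils.encode).
    static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Decode a Base64 string back to the raw bytes.
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // Made-up sample bytes; in the post these would come from the PDF file.
        byte[] sample = "%PDF-1.4 sample bytes".getBytes(StandardCharsets.US_ASCII);
        String encoded = encode(sample);
        System.out.println("encoded is empty: " + encoded.isEmpty());
        System.out.println("round trip ok: " + Arrays.equals(sample, decode(encoded)));
    }
}
```

Printing the length of the encoded string before indexing is a cheap way to rule out an empty `contenuto` field.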

and I tried to find it using:

    mappa = new HashMap<String,Object>();
    HashMap<String,Object> query = new HashMap<String,Object>();
    HashMap<String,Object> term = new HashMap<String,Object>();
    term.put("contenuto", "BARALDI");
    query.put("term", term);
    mappa.put("query", query);
    sw = new StringWriter();
    mapper.writeValue(sw, mappa);
    json = sw.getBuffer().toString();
    System.out.println(json);
    SearchResponse response = c.prepareSearch("documentale").setTypes("documento").setSource(json).execute().actionGet();
    SearchHits sh = response.getHits();

But I got no results.
I'm sure the PDF document contains the string "BARALDI".

What did I miss?
What's wrong?
Thanks
Tullio


(David Pilato) #2

Anything in the logs?

What do you get when you run:

    curl http://localhost:9200/documentale/documento/_mapping

--
David Pilato
http://dev.david.pilato.fr/
Twitter : @dadoonet


(tullio0106) #3

Nothing relevant.
This is the log:

2012-06-05 17:43:14,286 INFO [org.elasticsearch.node] - [Gertrude Yorkes] {0.19.4}[4044]: initializing ...
2012-06-05 17:43:14,355 INFO [org.elasticsearch.plugins] - [Gertrude Yorkes] loaded [mapper-attachments], sites []
2012-06-05 17:43:26,635 INFO [org.elasticsearch.node] - [Gertrude Yorkes] {0.19.4}[4044]: initialized
2012-06-05 17:43:26,635 INFO [org.elasticsearch.node] - [Gertrude Yorkes] {0.19.4}[4044]: starting ...
2012-06-05 17:43:27,257 INFO [org.elasticsearch.transport] - [Gertrude Yorkes] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/1.13.0.17:9300]}
2012-06-05 17:43:30,496 INFO [org.elasticsearch.cluster.service] - [Gertrude Yorkes] new_master [Gertrude Yorkes][opY-KHL1Sl28hsUlYf_RaA][inet[/1.13.0.17:9300]], reason: zen-disco-join (elected_as_master)
2012-06-05 17:43:30,701 INFO [org.elasticsearch.discovery] - [Gertrude Yorkes] elasticsearch/opY-KHL1Sl28hsUlYf_RaA
2012-06-05 17:43:30,862 INFO [org.elasticsearch.http] - [Gertrude Yorkes] bound_address {inet[/0.0.0.0:9200]}, publish_address {inet[/1.13.0.17:9200]}
2012-06-05 17:43:30,863 INFO [org.elasticsearch.node] - [Gertrude Yorkes] {0.19.4}[4044]: started
{"message":"La vispa teresina avea tra","contenuto":"SKIPPED CONTENT","data":1338911010865,"user":"Tullio"}
2012-06-05 17:43:35,300 INFO [org.elasticsearch.gateway] - [Gertrude Yorkes] recovered [1] indices into cluster_state
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/I:/Maven/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/I:/Maven/repository/org/apache/tika/tika-app/1.1/tika-app-1.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
millis 9858
{"query":{"term":{"contenuto":"BARALDI"}}}
Trovati 0


(David Pilato) #4

Oops, sorry. I didn't notice that we had left the mailing list.

Here is the last post in this thread, in case others would like to see the answers or contribute to Tullio's question.

David.

Even if you are running embedded, your webapp (or whatever it is) also starts an ES node when it starts, doesn't it?
So you can probably run a curl against it.

I thought you had already done it, since it was the first suggestion in my first answer.

BTW, in Java, you can do something like:

    ClusterState cs = client.admin().cluster().prepareState()
            .setFilterIndices("documentale").execute().actionGet().getState();
    IndexMetaData imd = cs.getMetaData().index("documentale");
    MappingMetaData mdd = imd.mapping("documento");

And see what you can do with mdd...
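
One thing to look for in `mdd` is whether the `contenuto` field really has type `attachment` rather than `string`. Below is a minimal sketch of that check. It assumes the mapping has already been obtained as a nested `Map` (for example via `mdd.sourceAsMap()`, which is an assumption about the API of this version, not something confirmed in this thread) and that the map is keyed either by the type name or directly by `properties`. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class MappingCheckSketch {

    // Returns the declared type of a field, or null if the field is not in the mapping.
    @SuppressWarnings("unchecked")
    static String fieldType(Map<String, Object> mapping, String typeName, String field) {
        // Accept both {"documento": {"properties": ...}} and {"properties": ...}.
        Object root = mapping.containsKey(typeName) ? mapping.get(typeName) : mapping;
        if (!(root instanceof Map)) return null;
        Object props = ((Map<String, Object>) root).get("properties");
        if (!(props instanceof Map)) return null;
        Object def = ((Map<String, Object>) props).get(field);
        if (!(def instanceof Map)) return null;
        Object type = ((Map<String, Object>) def).get("type");
        return type == null ? null : type.toString();
    }

    public static void main(String[] args) {
        // Hand-built map with the same shape as the mapping created in post #1.
        Map<String, Object> contenuto = new HashMap<String, Object>();
        contenuto.put("type", "attachment");
        Map<String, Object> properties = new HashMap<String, Object>();
        properties.put("contenuto", contenuto);
        Map<String, Object> documento = new HashMap<String, Object>();
        documento.put("properties", properties);
        Map<String, Object> mapping = new HashMap<String, Object>();
        mapping.put("documento", documento);

        System.out.println(fieldType(mapping, "documento", "contenuto")); // attachment
    }
}
```

If the result is `string` or `null`, the mapping was probably not applied as intended when the index was created.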

On 6 June 2012 at 09:03, Bettinazzi Tullio tbettinazzi@axioma.it wrote:

I'm working embedded.
How can I test it using the Java API?
Thanks

Tullio Bettinazzi
Head of R&D
Axioma S.p.a. - Tel. +3902618061 - Cell. +39335 104 8626
www.axioma.it


From: "David Pilato" david@pilato.fr
To: "Bettinazzi Tullio" tbettinazzi@axioma.it
Sent: Wednesday, 6 June 2012 9:01:02
Subject: Re: Mapping attachement seems to fail.

http://localhost:9200/documentale/documento/_mapping

David ;)
Twitter : @dadoonet / @elasticsearchfr

On 6 June 2012 at 08:51, Bettinazzi Tullio tbettinazzi@axioma.it wrote:

ElasticSearch 0.19.4, plugin 1.4.0.
How can I verify that the mapping is correctly applied?
Thanks
Tullio


From: "David Pilato" david@pilato.fr
To: "Bettinazzi Tullio" tbettinazzi@axioma.it
Sent: Wednesday, 6 June 2012 8:29:42
Subject: Re: Mapping attachement seems to fail.

Are you sure that your mapping has been correctly applied?
I have already seen some cases where the mapping for the field was a String.

Which version of the attachment plugin do you use, and with which version of ES?

David


On 6 June 2012 at 08:25, Bettinazzi Tullio tbettinazzi@axioma.it wrote:

Done, with no result.
But in the document it is in uppercase.
Thanks
Tullio


From: "David Pilato" david@pilato.fr
To: "tullio0106" tbettinazzi@axioma.it
Sent: Tuesday, 5 June 2012 21:14:45
Subject: Re: Mapping attachement seems to fail.

Did you try searching for "baraldi" (lowercase)?

David
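
The lowercase suggestion follows from how a `term` query works: it is not analyzed, so the query string is compared as-is against the indexed tokens, while the default analyzer lowercases tokens at index time. The following is a minimal, self-contained sketch of that mismatch in plain Java. It does not use the Elasticsearch API; the names `TermLookupSketch`, `analyze` and `termMatches`, the sample text, and the simplified split-and-lowercase analysis are all made up for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class TermLookupSketch {

    // Simplified stand-in for index-time analysis: split on non-alphanumerics, then lowercase.
    static Set<String> analyze(String text) {
        Set<String> tokens = new HashSet<String>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Term-style lookup: the query string is NOT analyzed, only compared exactly.
    static boolean termMatches(Set<String> indexedTokens, String term) {
        return indexedTokens.contains(term);
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for the extracted PDF content.
        Set<String> indexed = analyze("Sig. BARALDI, Via Roma 1");
        System.out.println(termMatches(indexed, "BARALDI")); // false
        System.out.println(termMatches(indexed, "baraldi")); // true
    }
}
```

An analyzed query type would lowercase the query string as well, which is why the casing only matters for `term`.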


(tullio0106) #5

I tried with some other PDF documents, and with many of them it worked.
Does indexing depend on the document content?
What could be wrong?
Thanks
Tullio

