Finding the distinct values of a field

Hello,
I store documents like the one below in an index.
A search on one field returns the following:

curl -s -XGET 'localhost:9200/loadresult/result/_search?pretty' -H 'Content-Type: application/json'  -d'
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "term" : {
                    "externalId" : "88"
                }
            }
        }
    }
,"sort":  { "timestamp": { "order": "desc" }}
}'

Response:
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "loadresult",
        "_type" : "result",
        "_id" : "76929376-bbbb-44ba-b6bf-9f1ddc8445d2",
        "_score" : null,
        "_source" : {
          "serverInfo" : {
            "jettyVersion" : "9.4.8.v20171121",
            "availableProcessors" : 32,
            "totalMemory" : 1098907648,
            "gitHash" : "82b8fb23f757335bb3329d540ce37a2a2615f0a8",
            "javaVersion" : "1.8.0_162"
          },
          "collectorInformations" : {
            "totalCount" : 112407,
            "minValue" : 153600,
            "maxValue" : 27901951,
            "value50" : 407551,
            "value90" : 467455,
            "mean" : 422808.0,
            "stdDeviation" : 258593.0,
            "startTimeStamp" : 1523187253297,
            "endTimeStamp" : 1523187503420
          },
          "loadConfigs" : [
            {
              "threads" : 8,
              "warmupIterationsPerThread" : 0,
              "iterationsPerThread" : 1,
              "runFor" : 240,
              "usersPerThread" : 1,
              "channelsPerUser" : 6,
              "resourceRate" : 500,
              "scheme" : "http",
              "host" : "10.0.0.20",
              "port" : 8080,
              "maxRequestsQueued" : 1024,
              "type" : "PROBE",
              "resourceNumber" : 0,
              "instanceNumber" : 0
            },
            {
              "threads" : 8,
              "warmupIterationsPerThread" : 0,
              "iterationsPerThread" : 1,
              "runFor" : 240,
              "usersPerThread" : 4,
              "channelsPerUser" : 10,
              "resourceRate" : 700,
              "scheme" : "http",
              "host" : "10.0.0.20",
              "port" : 8080,
              "maxRequestsQueued" : 50000,
              "type" : "LOADER",
              "resourceNumber" : 30,
              "instanceNumber" : 2
            }
          ],
          "uuid" : "76929376-bbbb-44ba-b6bf-9f1ddc8445d2",
          "externalId" : "88",
          "comment" : null,
          "uuidPrefix" : null,
          "timestamp" : "2018-04-08T06:38.23-0500"
        },
        "sort" : [
          1523187493800
        ]
      }
    ]
  }
}

I now want to find all the possible values of the jettyVersion field.
I thought I could use:

curl -s -XGET 'localhost:9200/loadresult/result/_search?pretty' -H 'Content-Type: application/json'  -d'
{
    "aggs" : {
        "jettyVersion" : {
            "terms" : { "field" : "serverInfo.jettyVersion" }
        }
    }
}'

The response is:

 {
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [serverInfo.jettyVersion] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "loadresult",
        "node" : "cDMu8hfQQF2L-WpMbCAqVg",
        "reason" : {
          "type" : "illegal_argument_exception",
          "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [serverInfo.jettyVersion] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
        }
      }
    ]
  },
  "status" : 400
}

So I thought I could use:

curl -s -XPUT 'localhost:9200/loadresult/_mapping/_doc?pretty' -H 'Content-Type: application/json'  -d'
{
  "properties": {
    "serverInfo.jettyVersion": {
      "type":     "text",
      "fielddata": true
    }
  }
}
'
response:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Mapper for [serverInfo.jettyVersion] conflicts with existing mapping in other types:\n[mapper [serverInfo.jettyVersion] is used by multiple types. Set update_all_types to true to update [fielddata] across all types.]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Mapper for [serverInfo.jettyVersion] conflicts with existing mapping in other types:\n[mapper [serverInfo.jettyVersion] is used by multiple types. Set update_all_types to true to update [fielddata] across all types.]"
  },
  "status" : 400
}

curl -s -XPUT 'localhost:9200/loadresult/_mapping/_doc?pretty&update_all_types=true' -H 'Content-Type: application/json'  -d'
{
  "properties": {
    "serverInfo.jettyVersion": {
      "type":     "text",
      "fielddata": true
    }
  }
}
'

response

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Rejecting mapping update to [loadresult] as the final mapping would have more than 1 type: [result, _doc]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Rejecting mapping update to [loadresult] as the final mapping would have more than 1 type: [result, _doc]"
  },
  "status" : 400
}

If anyone has an idea, I'd be glad to hear it :)

Have you enabled fielddata in the index mapping?

I did try, with the two mapping updates shown in my first post (with and without update_all_types=true). Both were rejected with the same errors as quoted there.

Is it the wrong request?

Hi Olivier,

You can't change the mapping of a field that has already been indexed.

In this case, rather than enabling fielddata, I would change the type of the serverInfo.jettyVersion field to keyword.

Unless you also want to run full-text search on it, in which case I would map it as text with a sub-field of type keyword:

DELETE test
PUT test
{
  "mappings": {
    "doc": {
      "properties": {
        "serverInfo.jettyVersion": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
PUT test/doc/1
{
  "serverInfo.jettyVersion": "1.2.3"
}
GET test/_search
{
  "size": 0,
  "aggs": {
    "version": {
      "terms": {
        "field": "serverInfo.jettyVersion.keyword"
      }
    }
  }
}
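
For readers who script this instead of using the Kibana console, the mapping and the aggregation above can be built as plain dictionaries. This is only a minimal sketch using the Python standard library: the helper names multi_field_mapping and terms_agg_body are made up for illustration, and nothing is sent to a cluster.

```python
import json


def multi_field_mapping(type_name, field):
    """Index body mapping `field` as text plus a `keyword` sub-field.

    Hypothetical helper: it only mirrors the JSON shown in the post above.
    """
    return {
        "mappings": {
            type_name: {
                "properties": {
                    field: {
                        "type": "text",
                        "fields": {"keyword": {"type": "keyword"}},
                    }
                }
            }
        }
    }


def terms_agg_body(field, agg_name="version"):
    """Search body that returns no hits, only terms buckets on `<field>.keyword`."""
    return {
        "size": 0,
        "aggs": {agg_name: {"terms": {"field": field + ".keyword"}}},
    }


if __name__ == "__main__":
    # Print the two request bodies so they can be pasted into curl or the console.
    print(json.dumps(multi_field_mapping("doc", "serverInfo.jettyVersion"), indent=2))
    print(json.dumps(terms_agg_body("serverInfo.jettyVersion"), indent=2))
```

Keeping the ".keyword" suffix inside one helper makes it harder to aggregate on the analyzed text field by accident.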

Hi David,
I deleted the index and recreated it this way, but now when I try to post values I get back:

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Rejecting mapping update to [loadresult] as the final mapping would have more than 1 type: [result, doc]"}],"type":"illegal_argument_exception","reason":"Rejecting mapping update to [loadresult] as the final mapping would have more than 1 type: [result, doc]"},"status":400}

I simply post a document like this:

        ContentResponse contentResponse = httpClient.newRequest( host, port ).scheme( scheme ) //
            .path( "/loadresult/result/" + loadResult.getUuid() ) //
            .content( new StringContentProvider( stringWriter.toString() ) ) //
            .method( HttpMethod.PUT ) //
            .header( "Content-Type", "application/json" ) //
            .send();

I'll keep looking :)

OK, the secret was in the typo :)
curl -s -XPUT 'localhost:9200/loadresult/_mapping/doc?pretty'
instead of
curl -s -XPUT 'localhost:9200/loadresult/_mapping/_doc?pretty'

I'm not sure I really understand _doc vs doc :)
Thanks!

In fact, this doc or _doc parameter is the name of the type.

In older versions we supported several types per index.
That is being phased out.

In v6 you can only have a single type. So you can index with:

PUT index/type/1
{ ... }

But if you then run:

PUT index/type_nouveau/1
{ ... }

You will get a rejection, because that would amount to having 2 types in the same index.

By convention, we now use _doc as the type name. But I have the bad habit of using doc (which was the previous convention).

The goal is to eventually migrate to:

PUT index/1
{ ... }
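
The one-type-per-index rule described above can be pictured with a toy model. This is not Elasticsearch code, only a sketch of the behaviour: the class SingleTypeIndex is hypothetical, and the error text simply imitates the message quoted earlier in the thread.

```python
class SingleTypeIndex:
    """Toy model of a v6 index: the first type used becomes the only one allowed."""

    def __init__(self, name):
        self.name = name
        self.type_name = None  # set by the first mapping or document
        self.docs = {}

    def put(self, type_name, doc_id, source):
        if self.type_name is None:
            # First write decides the single type of the index.
            self.type_name = type_name
        elif type_name != self.type_name:
            # A second type name is rejected, as in the errors quoted in the thread.
            raise ValueError(
                "Rejecting mapping update to [%s] as the final mapping would "
                "have more than 1 type: [%s, %s]"
                % (self.name, self.type_name, type_name)
            )
        self.docs[doc_id] = source


if __name__ == "__main__":
    index = SingleTypeIndex("loadresult")
    index.put("result", "1", {"externalId": "88"})
    try:
        index.put("doc", "2", {"externalId": "89"})
    except ValueError as error:
        print(error)
```

In this model, as in the thread, the fix is to use the same type name in the mapping request and in the indexing request.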

Anyway, everything works great!!
Thanks for the help!

Actually, one small problem :)
The field whose distinct values I'm looking for can contain:
"jettyVersion" : "9.4.8.v20171121",

Running:

curl -s -XGET 'localhost:9200/loadresult/_search?pretty' -H 'Content-Type: application/json'  -d'
{
  "size": 0,
  "aggs": {
    "version": {
      "terms": {
        "field": "serverInfo.jettyVersion"
      }
    }
  }
}'

For 9.4.8.v20171121 I get one entry 9.4.8 and another v20171121:

"buckets" : [
  {
    "key" : "9.4.8",
    "doc_count" : 3
  },
  {
    "key" : "v20171121",
    "doc_count" : 3
  },
  {
    "key" : "9.1",
    "doc_count" : 1
  },
  {
    "key" : "9.2",
    "doc_count" : 1
  }
]
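
A likely reading of these buckets is that the aggregation ran on an analyzed text field, so each bucket is a token rather than a whole value. The sketch below is a rough approximation, not Elasticsearch's actual analyzer: it assumes that a dot is kept only when it sits between two digits, and the function name toy_tokenize is invented for illustration.

```python
def toy_tokenize(value):
    """Split `value` on dots, except dots that sit between two digits.

    Assumption: this crude rule stands in for word-boundary tokenization;
    the real analyzer follows more elaborate rules.
    """
    tokens = []
    current = ""
    for i, ch in enumerate(value):
        if ch == ".":
            prev_is_digit = i > 0 and value[i - 1].isdigit()
            next_is_digit = i + 1 < len(value) and value[i + 1].isdigit()
            if prev_is_digit and next_is_digit:
                current += ch  # "9.4" stays in one piece
                continue
            if current:
                tokens.append(current)  # "8.v" breaks here
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return [token.lower() for token in tokens]


if __name__ == "__main__":
    print(toy_tokenize("9.4.8.v20171121"))  # ['9.4.8', 'v20171121']
    print(toy_tokenize("9.1"))              # ['9.1']
```

Under that assumption the two tokens match the two buckets above, while a keyword field would keep 9.4.8.v20171121 as a single term.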

What is the mapping?

GET yourindexname/_mapping

For this field:

      "jettyVersion" : {
        "type" : "text",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        },
        "fielddata" : true
      }

Elasticsearch version 6.2.3

Run the aggregation on the keyword sub-field, serverInfo.jettyVersion.keyword.
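
Once the aggregation targets the keyword sub-field, the distinct values are simply the bucket keys of the response. Here is a small sketch for pulling them out in Python. The helper distinct_values is hypothetical, and the sample response is illustrative only: it is shaped like the buckets quoted earlier in the thread and is not actual cluster output.

```python
def distinct_values(response, agg_name):
    """Return the bucket keys of a terms aggregation, in response order."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [bucket["key"] for bucket in buckets]


# Illustrative response only (not real output): what the buckets might look
# like once each version string is kept as a single keyword term.
SAMPLE_RESPONSE = {
    "aggregations": {
        "version": {
            "buckets": [
                {"key": "9.4.8.v20171121", "doc_count": 3},
                {"key": "9.1", "doc_count": 1},
                {"key": "9.2", "doc_count": 1},
            ]
        }
    }
}

if __name__ == "__main__":
    print(distinct_values(SAMPLE_RESPONSE, "version"))
```

Note that a terms aggregation returns only the top 10 buckets by default, so a field with more distinct values needs a larger "size" inside the "terms" block.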

Great, it works!!
Thanks!!
