I've upgraded my Elasticsearch cluster from 1.1 to 1.2, and I now get errors when indexing a somewhat large string:

{
  "error": "IllegalArgumentException[Document contains at least one immense term in field="response_body" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[7b 22 58 48 49 5f 48 6f 74 65 6c 41 76 61 69 6c 52 53 22 3a 7b 22 6d 73 67 56 65 72 73 69]...']",
  "status": 500
}

The mapping of the index:

{
  "template": "partner_requests-*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "request": {
      "properties": {
        "asn_id": { "index": "not_analyzed", "type": "string" },
        "search_id": { "index": "not_analyzed", "type": "string" },
        "partner": { "index": "not_analyzed", "type": "string" },
        "start": { "type": "date" },
        "duration": { "type": "float" },
        "request_method": { "index": "not_analyzed", "type": "string" },
        "request_url": { "index": "not_analyzed", "type": "string" },
        "request_body": { "index": "not_analyzed", "type": "string" },
        "response_status": { "type": "integer" },
        "response_body": { "index": "not_analyzed", "type": "string" }
      }
    }
  }
}

I've searched the documentation and didn't find anything about a maximum field size. According to the core types section, I don't understand why I should "correct the analyzer" for a not_analyzed field.


1 Answer

You are running into the maximum size for a single term. When you set a field to not_analyzed, Elasticsearch indexes the entire value as one single term. The maximum size for a single term in the underlying Lucene index is 32766 bytes, which is hard-coded in Lucene.

Your two primary options are to change the field's type to binary, or to keep it as a string but set its index type to "no", so the value is stored in _source without being indexed.
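As a minimal sketch of the second option, reusing the template from the question (only the response_body field is shown changed; the other properties stay as they are):

{
  "template": "partner_requests-*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "request": {
      "properties": {
        "response_body": { "type": "string", "index": "no" }
      }
    }
  }
}

With "index": "no" the field can no longer be searched or aggregated on, but it is still returned as part of the _source of each hit. If you go with the binary type instead, keep in mind that binary fields expect base64-encoded values.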

