Three Ways to Set Default Values in Elasticsearch

程序猿DD 2021-11-13

The following article is from 铭毅天下Elasticsearch. Author: 铭毅天下

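1. Practical questions

When working with Elasticsearch, habits carried over from relational databases such as MySQL inevitably show up. The following two questions both came up in real development work:

When adding a new field in Elasticsearch, can a default value be set in the Mapping?
Is there a good way in Elasticsearch to maintain a document's create_time (creation time) and update_time (update time)?

This article starts from the options for implementing default values in Elasticsearch.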

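2. Default values at the Elasticsearch Mapping level

Premise: strictly speaking, Elasticsearch does not support setting a default value for a field when its data type is declared in the Mapping.

Some will ask: does the null_value setting count? It does not.

Take a look: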

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": {
        "type":       "keyword",
        "null_value": "NULL" 
      }
    }
  }
}

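In essence, null_value replaces an explicit null with "NULL" at index time, so that the null can be indexed and searched. It is a stand-in for null in the index, not a default value: the stored _source still holds null.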
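
To make the distinction concrete, here is a small sketch modelled on the null_value example in the Elasticsearch reference. It writes one document with an explicit null and one with an empty array into the index above, then searches for the replacement value. The document IDs 1 and 2 are arbitrary choices for illustration.

```console
# Explicit null: indexed as the term "NULL"
PUT my-index-000001/_doc/1
{
  "status_code": null
}

# Empty array: contains no explicit null, so null_value is not applied
PUT my-index-000001/_doc/2
{
  "status_code": []
}

# Matches document 1 only
GET my-index-000001/_search
{
  "query": {
    "term": {
      "status_code": "NULL"
    }
  }
}
```

A document that omits status_code altogether is not matched either, because null_value only replaces explicit nulls. That is why it cannot stand in for a default value.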

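What we want is to attach an arbitrary default value to a field of any data type when defining the Mapping. Elasticsearch does not support this at the Mapping level, so what can be done?

The only option is to look for other approaches.

3. Workarounds for setting default values in Elasticsearch

Straight to the answer: there are three ways to set default values.

3.1 Approach 1: set default values with a pipeline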

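# Create the ingest pipeline (it uses the set processor).
# The set processor overwrites existing values by default, so "override": false
# is needed for sale_count to behave as a default rather than a forced value.
PUT _ingest/pipeline/add_default_pipeline
{
  "processors": [
    {
      "set": {
        "field": "sale_count",
        "value": 1,
        "override": false
      }
    }
  ]
}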

# Create the index and attach the pipeline as its default pipeline
PUT customer
{
  "mappings":{
    "properties":{
      "sale_count":{
        "type":"integer"
      },
      "major":{
        "type":"keyword",
         "null_value": "NULL" 
      }
    }
  },
  "settings": {
    "index":{
      "default_pipeline":"add_default_pipeline"
    }
  }
}

Insert a document to verify:

POST customer/_doc/1
{
  "major":null
}

A search on the index returns (excerpt):

    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "customer",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "major" : null,
          "sale_count" : 1
        }
      }
    ]
  }

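The approach above gives sale_count the value 1 when a document is written without it.

It works by attaching a default_pipeline in the index settings when the index is designed.

The implementation is fairly simple. Once the index is set up, users only need to care about writing data, and the ingest pipeline takes care of the remaining details in the background.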
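
Before relying on the pipeline, it can be dry-run with the simulate API. The sketch below feeds it two made-up documents, one without sale_count and one that already carries a value (the field values are illustrative only), so the effect on each can be compared in the response.

```console
# Dry run: nothing is written to any index
POST _ingest/pipeline/add_default_pipeline/_simulate
{
  "docs": [
    { "_source": { "major": "math" } },
    { "_source": { "major": "math", "sale_count": 5 } }
  ]
}
```

The second document is the interesting one: by default the set processor overwrites a field that already has a value, whereas with "override": false on the processor the existing 5 is left alone.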

As an extension, regarding the second question raised at the start:

create_time can be implemented with the set processor of an ingest pipeline.

PUT _ingest/pipeline/create_time_pipeline
{
  "description": "Adds create_time timestamp to documents",
  "processors": [
    {
      "set": {
        "field": "_source.create_time",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}

DELETE my_index_0003
PUT my_index_0003
{
 "settings": {
  "index.default_pipeline": "create_time_pipeline"
 }
}

POST my_index_0003/_doc/1
{}

GET my_index_0003/_search

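update_time has to be maintained by the application: whenever the business logic updates a document, add the timestamp from code or from a script.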
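
One possible shape for such an update is a scripted partial update in which the caller passes the current time as a parameter. This is only a sketch: the parameter name now and the timestamp value are assumptions for illustration, and real code would send its own clock value along with the business fields it changes.

```console
# Partial update that also stamps update_time.
# params.now is supplied by the caller; the value below is a placeholder.
POST my_index_0003/_update/1
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.update_time = params.now",
    "params": {
      "now": "2021-11-13T08:00:00Z"
    }
  }
}
```

Passing the time as a parameter keeps the script source constant across calls, so Elasticsearch can reuse the compiled script.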

3.2 Approach 2: add default values afterwards with update_by_query

POST customer/_doc/2
{
  "major":null
}

# Bulk update script
POST customer/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.major == null) {ctx._source.major = 'student'}"
  }
}


POST customer/_search

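Result:

Every document whose major was null has been updated to "student".

This approach writes the data first and then updates it at the data level, so calling it a default value is a bit of a stretch.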
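
Without a query, update_by_query visits every document in the index. A narrower variant is sketched below. It assumes the customer mapping from section 3.1, where an explicit null in major is indexed as the term "NULL", and uses that term to select only the affected documents.

```console
# Select only documents whose major was an explicit null
# (relies on "null_value": "NULL" in the customer mapping).
POST customer/_update_by_query
{
  "query": {
    "term": {
      "major": "NULL"
    }
  },
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.major == null) { ctx._source.major = 'student' } else { ctx.op = 'noop' }"
  }
}
```

The else branch covers a document whose major is the literal string "NULL": it matches the query but is skipped instead of being rewritten. Documents that omit major entirely are not selected by this query, since null_value only applies to explicit nulls.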

3.3 Approach 3: update with a pipeline script


PUT _ingest/pipeline/update_pipeline
{
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": """
              if (ctx['major'] == null) {ctx['major'] = 'student'}
          """
      }
    }
  ]
}
POST customer/_doc/4
{
  "major":null
}


POST customer/_update_by_query?pipeline=update_pipeline
{
  "query": {
    "match_all": {}
  }
}

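Result: as with approach 2, the documents are updated.

This approach is a more elaborate variant of approach 2, and the underlying implementation is essentially the same.

One detail worth stressing is that the syntax for reading values from ctx differs:

In an update script (as used by update_by_query), the field is accessed as ctx._source.major.
In an ingest pipeline script, the field is accessed as ctx['major'].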
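
The pipeline-side syntax can be tried without touching any index by simulating an inline pipeline. The sketch below reuses the script from this section. The two test documents are made up for illustration.

```console
# Dry run of the script processor against two sample documents
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "lang": "painless",
          "source": "if (ctx['major'] == null) { ctx['major'] = 'student' }"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "major": null } },
    { "_source": { "major": "teacher" } }
  ]
}
```

In the response the first document comes back with major set to "student" and the second keeps "teacher". In Painless, ctx.major is the same map access as ctx['major'], so either spelling works inside a pipeline script.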

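4. Summary

This article covered three ways to get relational-database-style default values in Elasticsearch. Only the first sets the default before the document is written.

The other two are script-based updates that set the default after the data has been written. For real-world use, approach 1 is recommended.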
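
If the conditional logic of approach 3 is wanted at write time, one option (not covered above) is to register its pipeline as an index's default pipeline, in the same way as approach 1. A minimal sketch, assuming update_pipeline from section 3.3 already exists; customer_v2 is a hypothetical index name used only here:

```console
# customer_v2 is a hypothetical index for this sketch
PUT customer_v2
{
  "settings": {
    "index.default_pipeline": "update_pipeline"
  }
}

POST customer_v2/_doc/1
{
  "major": null
}

# The stored document should now carry the value set by the script
GET customer_v2/_doc/1
```

An index has a single default_pipeline, so on an index that already uses add_default_pipeline the two sets of processors would need to be combined into one pipeline.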
