Elasticsearch Document Update API: Details, Internals, and Examples

Original article by 丁威 (Ding Wei), 中间件兴趣圈, 2022-11-10

Part of the collection #ElasticSearch实战 (30 articles)

This article covers the single-document update API in detail. The Java client exposes it through the following methods:

public final UpdateResponse update(UpdateRequest updateRequest, RequestOptions options) throws IOException
public final void updateAsync(UpdateRequest updateRequest, RequestOptions options, ActionListener<UpdateResponse> listener)

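The class to focus on is UpdateRequest.

1. UpdateRequest in Detail

The core class diagram of UpdateRequest is shown in the figure:

Let us first look at the core attributes of UpdateRequest: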

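protected ShardId shardId: the shard on which the request is executed.
protected String index: the index, comparable to a database in a relational database.
private String type: the type name, comparable to a table in a relational database.
private String id: the document ID, comparable to a primary key in a relational database.
private String routing: the routing value, which defaults to the document id. Elasticsearch selects the shard with hash(routing) % number_of_primary_shards.
private String parent:
Script script: updates the document through a script.
private String[] fields: the fields to return; none are returned by default. Deprecated and replaced by fetchSourceContext.
private FetchSourceContext fetchSourceContext: controls how _source is returned after the update when the document is hit. Unlike fields, fetchSourceContext supports wildcard expressions for matching field names. It is covered in detail in "Elasticsearch Document Get API: Details, Internals, and Examples".
private long version = Versions.MATCH_ANY: the version number.
private VersionType versionType = VersionType.INTERNAL: the version type, either internal or external; the default is internal.
private int retryOnConflict = 0: the number of retries when the update hits a version conflict.
private RefreshPolicy refreshPolicy = RefreshPolicy.NONE: the refresh policy. NONE means the request does not trigger a refresh.
private ActiveShardCount waitForActiveShards = ActiveShardCount.DEFAULT: the number of shard copies that must be active before the operation is executed. It is covered in detail in "Elasticsearch Document Get API: Details, Internals, and Examples".
private IndexRequest upsertRequest: used for upserts. If the original document does not exist, the content of this request is inserted as a new document, similar to a saveOrUpdate operation. It is used together with a script; details follow in a later section.
private boolean scriptedUpsert = false: whether the script is also used to perform the upsert.
private boolean docAsUpsert = false: whether to use saveOrUpdate mode with doc, that is, whether the content of doc is inserted as a new document when the original does not exist (docAsUpsert=true combined with doc gives saveOrUpdate behavior).
private boolean detectNoop = true: whether to detect no-op updates; explained in detail below.
private IndexRequest doc: the request used for regular (partial) updates by default.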
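
The routing formula listed for the routing attribute can be illustrated with a small sketch. This is not Elasticsearch code: the class name RoutingSketch and the method shardFor are hypothetical, and String.hashCode() stands in for the hash function Elasticsearch actually uses (Murmur3). It only shows that the routing value falls back to the id and that equal routing values always land on the same shard.

```java
// Illustrative sketch only; names are hypothetical, not part of the Elasticsearch API.
final class RoutingSketch {

    // Returns the primary shard number for a document.
    // The routing value falls back to the document id when none is given.
    static int shardFor(String id, String routing, int primaryShardCount) {
        String effectiveRouting = (routing != null) ? routing : id;
        // Assumption: String.hashCode() replaces the real Murmur3 hash.
        // floorMod keeps the result in [0, primaryShardCount) even for negative hashes.
        return Math.floorMod(effectiveRouting.hashCode(), primaryShardCount);
    }
}
```

Because the shard depends on the number of primary shards, the same routing value must be supplied on update as was used when the document was indexed.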

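From the above, there are essentially three ways to update a document: script, upsert, and doc (regular update).

2. The Elasticsearch Update API in Depth

2.1 Script Updates

Elasticsearch can update documents through scripts (Painless). This section does not go into the script syntax; a dedicated chapter will cover it later.

2.2 Partial Field Updates (Regular Update)

The update API accepts a partial document (containing only some of the fields of _source), which is merged into the existing document. The merge is a simple recursive one: inner objects are merged, while core key/value pairs and arrays are replaced. To replace the existing document completely, use the Index API instead. The following partial update adds a new field to an existing document (a Java API call is given later in this article):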

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    }
}

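If both doc and script are specified, script takes precedence. A good practice with the update API is to update through scripts (Painless), which a later chapter will cover in detail.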
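
The recursive merge described in this section can be sketched in plain Java. This is a simplified illustration under stated assumptions, not the Elasticsearch implementation: the class PartialMergeSketch and its merge method are hypothetical names, and documents are modelled as nested Map<String, Object> values.

```java
import java.util.Map;
import java.util.Objects;

// Illustrative sketch only; names are hypothetical, not part of the Elasticsearch API.
final class PartialMergeSketch {

    // Merges 'changes' into 'source' in place.
    // Returns true if at least one value in 'source' was added or changed.
    @SuppressWarnings("unchecked")
    static boolean merge(Map<String, Object> source, Map<String, Object> changes) {
        boolean modified = false;
        for (Map.Entry<String, Object> entry : changes.entrySet()) {
            String key = entry.getKey();
            Object oldValue = source.get(key);
            Object newValue = entry.getValue();
            if (oldValue instanceof Map && newValue instanceof Map) {
                // Inner objects are merged key by key.
                modified |= merge((Map<String, Object>) oldValue, (Map<String, Object>) newValue);
            } else if (!source.containsKey(key) || !Objects.equals(oldValue, newValue)) {
                // Core key/value pairs and arrays (lists) are replaced as a whole.
                source.put(key, newValue);
                modified = true;
            }
        }
        return modified;
    }
}
```

Merging {"name": "new_name"} into a document that has no name field adds the field and leaves every other field untouched, which is the behavior of the REST example above.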

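2.3 Detecting Noop Updates (Is the Request Worth Executing?)

This feature controls whether an update is executed when the submitted request does not change the data of the existing document. Detection is enabled by default. With detectNoop=true, if no change is detected the result is noop (no operation). With detectNoop=false, every request is executed and the version number is incremented each time.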
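
The effect of detectNoop can be sketched as follows. This is a minimal model, not Elasticsearch code: NoopSketch, Stored, and update are hypothetical names, and the partial document is assumed to be flat (no nested objects) to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; names are hypothetical, not part of the Elasticsearch API.
final class NoopSketch {

    // A stored document: its source plus a version counter.
    static final class Stored {
        final Map<String, Object> source = new HashMap<>();
        long version = 1;
    }

    // Applies a flat partial document and returns "noop" or "updated".
    static String update(Stored stored, Map<String, Object> doc, boolean detectNoop) {
        Map<String, Object> merged = new HashMap<>(stored.source);
        merged.putAll(doc);
        if (detectNoop && merged.equals(stored.source)) {
            // Nothing would change: skip the write, keep the version.
            return "noop";
        }
        stored.source.clear();
        stored.source.putAll(merged);
        stored.version++; // every executed write bumps the version
        return "updated";
    }
}
```

Submitting the same doc twice with detectNoop=true leaves the version unchanged on the second call, while detectNoop=false increments it even though the data is identical.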

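2.4 Upserts (Save or Update)

If the document does not yet exist, the content of the upsert element is inserted as a new document. Elasticsearch supports two modes, scripted_upsert and doc_as_upsert, with scripted_upsert taking precedence. They are controlled through UpdateRequest#scriptedUpsert and UpdateRequest#docAsUpsert.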
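
The decision an upsert has to make can be sketched like this. It is a simplified model under stated assumptions, not the Elasticsearch implementation: UpsertSketch and update are hypothetical names, the index is modelled as an in-memory map, scripted_upsert is left out, and the choice to prefer doc when docAsUpsert is set is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; names are hypothetical, not part of the Elasticsearch API.
final class UpsertSketch {

    // Returns "updated" if the document exists, "created" if it was inserted,
    // and throws if the document is missing and no upsert content applies.
    static String update(Map<String, Map<String, Object>> index, String id,
                         Map<String, Object> doc, Map<String, Object> upsert,
                         boolean docAsUpsert) {
        Map<String, Object> existing = index.get(id);
        if (existing != null) {
            existing.putAll(doc); // regular partial update
            return "updated";
        }
        // Document is missing: decide what, if anything, to insert.
        Map<String, Object> toInsert = docAsUpsert ? doc : upsert;
        if (toInsert == null) {
            throw new IllegalStateException("document missing: " + id);
        }
        index.put(id, new HashMap<>(toInsert));
        return "created";
    }
}
```

The first call for a new id reports created and later calls for the same id report updated, which is what a saveOrUpdate operation is expected to do.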

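2.5 Core Parameters

The main parameters of the update API:

Parameter	Description
retry_on_conflict	Elasticsearch uses version-based optimistic locking. This is the number of retries allowed after a version conflict; once retry_on_conflict retries are exhausted, an exception is thrown.
routing	The routing value used to select the shard.
timeout	How long to wait for the shard to become available.
wait_for_active_shards	The number of shard copies that must be active before the operation is executed.
refresh	The refresh policy.
_source	Controls whether and how the updated source is returned in the response. By default the updated source is not returned. For source field filtering, see "Elasticsearch Document Get API: Details, Internals, and Examples".
version	The version field, used for optimistic locking.

Note: the update API supports internal versioning only. External versioning (version types external and external_gte) and forced versioning (version type force) are not supported by the update API, because they would leave Elasticsearch version numbers out of sync with the external system.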
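
The retry_on_conflict behavior from the table can be sketched with a version counter. This is a minimal simulation, not Elasticsearch code: RetryOnConflictSketch and updateWithRetry are hypothetical names, an AtomicLong stands in for the document version, and the Runnable parameter exists only to let a test simulate a concurrent writer.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only; names are hypothetical, not part of the Elasticsearch API.
final class RetryOnConflictSketch {

    // Tries to bump the version, retrying up to retryOnConflict times on conflict.
    // Returns true if the write was accepted, false if all attempts conflicted.
    static boolean updateWithRetry(AtomicLong version, int retryOnConflict,
                                   Runnable betweenReadAndWrite) {
        for (int attempt = 0; attempt <= retryOnConflict; attempt++) {
            long seen = version.get();   // read the current version
            betweenReadAndWrite.run();   // a concurrent writer may interfere here
            // The write succeeds only if nobody changed the version in between.
            if (version.compareAndSet(seen, seen + 1)) {
                return true;
            }
        }
        // In Elasticsearch this case surfaces as a version conflict exception.
        return false;
    }
}
```

With retryOnConflict=0 a single interfering write makes the update fail, whereas retryOnConflict=1 lets the second attempt re-read the version and succeed.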

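3. Update API Usage Examples

This section does not include a script-based update demo; a separate chapter on ElasticSearch Painless scripts will cover that in a later article.

3.1 Regular Update (Partial Fields)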

public static void testUpdate_partial() {
    // EsClient is the author's helper for creating and closing the client (not shown here).
    RestHighLevelClient client = EsClient.getClient();
    try {
        UpdateRequest request = new UpdateRequest("twitter", "_doc", "10");
        // The partial document carries only the fields to change.
        IndexRequest indexRequest = new IndexRequest("twitter", "_doc", "10");
        Map<String, String> source = new HashMap<>();
        source.put("user", "dingw2");
        indexRequest.source(source);
        request.doc(indexRequest);
        UpdateResponse result = client.update(request, RequestOptions.DEFAULT);
        System.out.println(result);
        // Read the document back to verify the update (testGet is not shown here).
        testGet();
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}

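Result: calling the get API shows that the user field has been updated to dingw2, that is, the update succeeded.

3.2 Example with detectNoop Enabled (Original Data Unchanged)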

public static void testUpdate_noop() {
    RestHighLevelClient client = EsClient.getClient();
    try {
        UpdateRequest request = new UpdateRequest("twitter", "_doc", "10");
        // Skip the write when the doc does not change the stored data.
        request.detectNoop(true);
        // buildIndexRequest() is not shown here; it supplies a doc equal to the stored data.
        request.doc(buildIndexRequest());

        UpdateResponse result = client.update(request, RequestOptions.DEFAULT);
        System.out.println(result);
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}

Response:

{
   "_shards": {
        "total": 0,
        "successful": 0,
        "failed": 0
   },
   "_index": "twitter",
   "_type": "_doc",
   "_id": "10",
   "_version": 6,
   "result": "noop"
}

3.3 Example with detectNoop Disabled (Original Data Unchanged)

public static void testUpdate_no_noop() {
    RestHighLevelClient client = EsClient.getClient();
    try {
        UpdateRequest request = new UpdateRequest("twitter", "_doc", "10");
        // Always execute the write, even when the data is unchanged.
        request.detectNoop(false);
        request.doc(buildIndexRequest());
        UpdateResponse result = client.update(request, RequestOptions.DEFAULT);
        System.out.println(result);
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}

Response:

{
   "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
   },
   "_index": "twitter",
   "_type": "_doc",
   "_id": "10",
   "_version": 7,
   "result": "updated"
}

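The key signs are result=updated, which means an update was executed, and a version number incremented by 1. _shards reports how the operation went on each shard copy.

3.4 saveOrUpdate Mode (Upsert)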

/**
 * Update when the original record does not exist, using saveOrUpdate mode.
 */
public static void testUpdate_upsert() {
    RestHighLevelClient client = EsClient.getClient();
    try {
        UpdateRequest request = new UpdateRequest("twitter", "_doc", "11");
        IndexRequest indexRequest = new IndexRequest("twitter", "_doc", "11");
        Map<String, String> source = new HashMap<>();
        source.put("user", "dingw");
        source.put("post_date", "2009-11-17T14:12:12");
        source.put("message", "hello,update upsert。");

        indexRequest.source(source);
        request.doc(indexRequest);
        // Insert the doc as a new document if the id does not exist yet.
        request.docAsUpsert(true);
        UpdateResponse result = client.update(request, RequestOptions.DEFAULT);
        System.out.println(result);
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}

Response:

{
   "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
   },
   "_index": "twitter",
   "_type": "_doc",
   "_id": "11",
   "_version": 1,
   "result": "created"
}

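The key sign in the response is result: created, which means a new document was inserted.

That concludes the Document Update API. This article covered its core concepts and implementation points, and closed with demos showing how to use the Update API from Java.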
