查看原文
其他

死锁案例三

有赞coder 2019-06-04

文 | 杨一 on 运维

转 | 来源:公众号yangyidba


一、前言

死锁其实是一个很有意思也很有挑战的技术问题,大概每个 DBA 和部分开发朋友都会在工作过程中遇见。关于死锁我会持续写一个系列的案例分析,希望能够对想了解死锁的朋友有所帮助。

二、背景知识

2.1 insert 锁机制

在分析死锁案例之前,我们先学习一下背景知识 insert 语句的加锁策略。我们先来看看官方定义:

An insert intention lock is a type of gap lock set by INSERT operations prior to row insertion. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap. Suppose that there are index records with values of 4 and 7. Separate transactions that attempt to insert values of 5 and 6, respectively, each lock the gap between 4 and 7 with insert intention locks prior to obtaining the exclusive lock on the inserted row, but do not block each other because the rows are nonconflicting.

相信大部分的 DBA 同行都知道在事务执行 insert 的时候会申请一把插入意向锁 Insert Intention Lock。在多事务并发写入不同数据记录至同一索引间隙的时候,并不需要等待其他事务完成,不会发生锁等待。假设有一个索引记录包含键值 4 和 7,不同的事务分别插入 5 和 6,每个事务都会产生一个加在 4-7 之间的插入意向锁,获取在插入行上的排它锁,但是不会被互相锁住,因为数据行并不冲突。

但是如果遇到唯一键呢?

"If a duplicate-key error occurs, a shared lock on the duplicate index record is set."

对于 insert 操作来说,若发生唯一约束冲突,则需要对冲突的唯一索引加上 S Next-key Lock。从这里会发现,即使是 RC 事务隔离级别,也同样会存在 Next-Key Lock 锁,从而阻塞并发。然而,文档没有说明的是,对于检测到冲突的唯一索引,等待线程在获得S Lock之后,还需要对下一个记录进行加锁,在源码中由函数row_ins_scan_sec_index_for_duplicate 进行判断。via (MySQL REPLACE 死锁问题深入剖析 )。我们可以通过如下例子进行验证。

2.2 验证

准备环境默认事务隔离级别为 RC 模式。

  1. CREATE TABLE t8 (

  2. a int AUTO_INCREMENT PRIMARY KEY,

  3. b int,

  4. c int,

  5. unique key ub(b)

  6. ) engine=InnoDB;

  7. insert into t8 values (NULL,1,2)


2.3 过程分析

在每次执行一条语句之后都执行 show innodb engine status 查看事务的状态, 执行完 delete 语句,事务相关日志显示如下:

  1. ---TRANSACTION 462308671, ACTIVE 6 sec

  2. 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1

  3. MySQL thread id 3796960, OS thread handle 0x7f78eaabe700, query id 781051370 localhost root init

  4. show engine innodb status

  5. TABLE LOCK table `test`.`t8` trx id 462308671 lock mode IX

  6. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308671 lock_mode X locks rec but not gap

  7. RECORD LOCKS space id 232 page no 3 n bits 72 index `PRIMARY` of table `test`.`t8` trx id 462308671 lock_mode X locks rec but not gap

从日志中我们可以看到 delete 语句获取了唯一索引 ub 和主键两个行级锁(lock_mode X locks rec but not gap) 。

执行完 insert 之后 再查看 innodb engine status,事务相关日志显示如下:

  1. LIST OF TRANSACTIONS FOR EACH SESSION:

  2. ---TRANSACTION 462308676, ACTIVE 4 sec inserting

  3. mysql tables in use 1, locked 1

  4. LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s), undo log entries 1

  5. MySQL thread id 3796966, OS thread handle 0x7f78ea5c4700, query id 781051460 localhost root update

  6. insert into t8 values (NULL,1,2)

  7. ------- TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED:

  8. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308676 lock mode S waiting

  9. ------------------

  10. TABLE LOCK table `test`.`t8` trx id 462308676 lock mode IX

  11. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308676 lock mode S waiting

  12. ---TRANSACTION 462308671, ACTIVE 70 sec

  13. 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1

  14. MySQL thread id 3796960, OS thread handle 0x7f78eaabe700, query id 781051465 localhost root init

  15. show engine innodb status

  16. TABLE LOCK table `test`.`t8` trx id 462308671 lock mode IX

  17. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308671 lock_mode X locks rec but not gap

  18. RECORD LOCKS space id 232 page no 3 n bits 72 index `PRIMARY` of table `test`.`t8` trx id 462308671 lock_mode X locks rec but not gap

根据官方的介绍,并结合日志,我们可以看到 insert into t8 values (NULL,1,2)在申请一把 S Next-key-Lock , 显示 lock mode S waiting。这里想给大家说明的是在 Innodb 日志中如果提示 lock mode S/lock mode X ,其实都是 gap 锁,如果是行记录锁会提示 but not gap,请读者朋友们在自己分析死锁日志的时候注意。

sess1 delete 语句提交之后,sess2 的 insert 不要提交,不要提交,不要提交。再次查看 innodb engine status,事务相关日志显示如下:

  1. -----------------

  2. TRANSACTIONS

  3. ------------

  4. Trx id counter 462308678

  5. Purge done for trxs n:o < 462308678 undo n:o < 0 state: running but idle

  6. History list length 1845

  7. LIST OF TRANSACTIONS FOR EACH SESSION:

  8. ---TRANSACTION 462308671, not started

  9. MySQL thread id 3796960, OS thread handle 0x7f78eaabe700, query id 781051526 localhost root init

  10. show engine innodb status

  11. ---TRANSACTION 462308676, ACTIVE 41 sec

  12. 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1

  13. MySQL thread id 3796966, OS thread handle 0x7f78ea5c4700, query id 781051460 localhost root cleaning up

  14. TABLE LOCK table `test`.`t8` trx id 462308676 lock mode IX

  15. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308676 lock mode S

  16. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308676 lock mode S locks gap before rec

sess1 中的事务因为提交已经结束。Innodb 中的事务列表中只剩下 sess2 中的 insert 的事务了。从获取锁的状态上看 insert 获取一把 S Next-key Lock 锁和插入行之前的 S GAP 锁。看到这里大家是否有疑惑,官方文档说:

INSERT sets an exclusive lock on the inserted row. This lock is an index-record lock, not a next-key lock (that is, there is no gap lock) and does not prevent other sessions from inserting into the gap before the inserted row.

会对 insert 成功的记录加上一把X行锁,为什么看不见呢?我们再在 sess1 中执行*update t8 set c=13 where b=1; *并查看事务日志

  1. ------------

  2. TRANSACTIONS

  3. ------------

  4. Trx id counter 462308679

  5. Purge done for trxs n:o < 462308678 undo n:o < 0 state: running but idle

  6. History list length 1845

  7. LIST OF TRANSACTIONS FOR EACH SESSION:

  8. ---TRANSACTION 462308678, ACTIVE 12 sec starting index read

  9. mysql tables in use 1, locked 1

  10. LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s)

  11. MySQL thread id 3796960, OS thread handle 0x7f78eaabe700, query id 781059217 localhost root updating

  12. update c set c=13 where b=1

  13. ------- TRX HAS BEEN WAITING 12 SEC FOR THIS LOCK TO BE GRANTED:

  14. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308678 lock_mode X locks rec but not gap waiting

  15. ------------------

  16. TABLE LOCK table `test`.`t8` trx id 462308678 lock mode IX

  17. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308678 lock_mode X locks rec but not gap waiting

  18. ---TRANSACTION 462308676, ACTIVE 5113 sec

  19. 4 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 1

  20. MySQL thread id 3796966, OS thread handle 0x7f78ea5c4700, query id 781059230 localhost root init

  21. show engine innodb status

  22. TABLE LOCK table `test`.`t8` trx id 462308676 lock mode IX

  23. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308676 lock mode S


  24. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308676 lock mode S locks gap before rec


  25. RECORD LOCKS space id 232 page no 4 n bits 72 index `ub` of table `test`.`t8` trx id 462308676 lock_mode X locks rec but not gap

从日志中可以看到 sess2 的事务持有的锁多了一把 lock_mode X locks rec but not gap,也即是 sess2 对 insert 成功的记录加上的 X 行锁。 

分析至此,对于并发 insert 造成唯一键冲突的时候 insert 的加锁策略是:

第一阶段唯一性约束检查,先申请 LOCKINSERTINTENTION

第二接入获取阶段一的锁并且 insert 成功之后 插入的位置有 Gap 锁:LOCKS + LOCKORDINARY,为了防止其他 insert 唯一键冲突。

插入成功的记录:LOCK_X + LOCK_REC_NOT_GAP

三、案例分析

本案例是两个事务并发 insert 唯一键冲突和 gap 锁一起导致的死锁案例。

3.1 环境

  1. create table t7(

  2.  id int not null primary key auto_increment,

  3.  a int not null ,

  4.  unique key ua(a)

  5. ) engine=innodb;

  6. insert into t7(id,a) values(1,1),(5,4),(20,20),(25,12);

3.2 测试项目

3.3 死锁日志

  1. ------------------------

  2. LATEST DETECTED DEADLOCK

  3. ------------------------

  4. 2017-09-17 15:15:03 7f78eac15700

  5. *** (1) TRANSACTION:

  6. TRANSACTION 462308661, ACTIVE 6 sec inserting

  7. mysql tables in use 1, locked 1

  8. LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s), undo log entries 1

  9. MySQL thread id 3796966, OS thread handle 0x7f78ead9d700, query id 781045166 localhost root update

  10. insert into t7(id,a) values(30,10)

  11. *** (1) WAITING FOR THIS LOCK TO BE GRANTED:

  12. RECORD LOCKS space id 231 page no 4 n bits 72 index `ua` of table `test`.`t7` trx id 462308661 lock mode S waiting

  13. *** (2) TRANSACTION:

  14. TRANSACTION 462308660, ACTIVE 43 sec inserting, thread declared inside InnoDB 5000

  15. mysql tables in use 1, locked 1

  16. 4 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 2

  17. MySQL thread id 3796960, OS thread handle 0x7f78eac15700, query id 781045192 localhost root update

  18. insert into t7(id,a) values(40,9)

  19. *** (2) HOLDS THE LOCK(S):

  20. RECORD LOCKS space id 231 page no 4 n bits 72 index `ua` of table `test`.`t7` trx id 462308660 lock_mode X locks rec but not gap

  21. *** (2) WAITING FOR THIS LOCK TO BE GRANTED:

  22. RECORD LOCKS space id 231 page no 4 n bits 72 index `ua` of table `test`.`t7` trx id 462308660 lock_mode X locks gap before rec insert intention waiting

  23. *** WE ROLL BACK TRANSACTION (1)

日志分析

 我们从时间线维度分析:

  1. 事务 T2 insert into t7(id,a) values(26,10)语句 insert 成功,持有 a=10 的 X 行锁(X locks rec but not gap)

  2. 事务 T1 insert into t7(id,a) values(30,10),因为 T2 的第一条 insert 已经插入 a=10 的记录,事务 T1 的 insert a=10 则发生唯一约束冲突,需要申请对冲突的唯一索引加上 S Next-key Lock (也即是 lock mode S waiting ) 这是一个间隙锁会申请锁住(,10],(10,20]之间的 gap 区域。从这里会发现,即使是 RC 事务隔离级别,也同样会存在 Next-Key Lock 锁,从而阻塞并发。

  3. 事务T2 insert into t7(id,a) values(40,9) 该语句插入的 a=9 的值在事务 T1 申请的 gap 锁[4,10]之间,故需事务 T2 的第二条 insert 语句要等待事务 T1 的 S-Next-key Lock 锁释放,在日志中显示 lock_mode X locks gap before rec insert intention waiting。

四、总结

本文案例和知识点一方面从官方文档获取,另一方面是根据何登成和姜承尧两位 MySQL 技术大牛的技术分享整理,算是站在巨人的肩膀上的学习总结。在研究分析死锁案例的过程中,insert 的意向锁和 gap 锁这种类型的锁是比较难分析的,相信通过上面的分析总结大家能够学习到 insert 的锁机制,如何加锁,如何进行 insert 方面死锁分析。

扩展阅读

1. 漫谈死锁

2. 如何阅读死锁日志

3. 死锁案例一

4. 死锁案例二

-The End-

Vol.153










有赞技术团队

为 300 万商家,150 个行业,200 亿电商交易额

提供技术支持


微商城|零售|美业


微信公众号:有赞coder    微博:@有赞技术

技术博客:tech.youzan.com




The bigger the dream, 

the more important the team.

    您可能也对以下帖子感兴趣

    文章有问题?点此查看未经处理的缓存