insert 语句加锁机制
一 前言
之前的文章里面总结了很多死锁案例,其实里面有几篇文章对于insert加锁流程表述的不准确,而且微信公众号又无法修改,所以通过本文重新梳理insert加锁流程,最后加上一个死锁案例解析。
有个勤奋好学的同事特地找我咨询insert并发导致死锁的问题,我不方便说他的名字,就叫他鲁震宇吧,本文也是为这位开发小哥哥写的。要是还看不懂,可以再来咨询我。T_T
二 基础知识
在分析死锁案例之前,我们先学习一下背景知识 insert 语句的加锁策略,来看看官方定义:
"INSERT sets an exclusive lock on the inserted row. This lock is an index-record lock, not a next-key lock (that is, there is no gap lock) and does not prevent other sessions from inserting into the gap before the inserted row.
Prior to inserting the row, a type of gap lock called an insert intention gap lock is set. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap. "
相信大部分的DBA同行都知道在事务执行insert的时候会申请一把插入意向锁(Insert Intention Lock)。在多个会话并发写入不同数据记录至同一索引间隙的时候,并不需要等待其他事务完成,不会发生锁等待。
假设有一个索引记录包含键值4和7,不同的会话分别插入5和6,每个事务都会产生一个加在4-7之间的插入意向锁,获取在插入行上的排它锁,但是不会被互相锁住,因为数据行并不冲突。 但是如果遇到唯一键呢?
"If a duplicate-key error occurs, a shared lock on the duplicate index record is set."
对于insert操作来说,若发生唯一约束冲突,则需要对冲突的唯一索引加上S Next-key Lock。从这里会发现,即使是RC事务隔离级别,也同样会存在Next-Key Lock锁,从而阻塞并发。然而,文档没有说明的是,对于检测到冲突的唯一索引,等待线程在获得S Lock之后,还需要对下一个记录进行加锁,在源码中由函数row_ins_scan_sec_index_for_duplicate进行判断。via (MySQL REPLACE死锁问题深入剖析 )。 下面我们通过几个例子进行验证insert插入流程。业务上insert场景包含唯一索引和非唯一索引,本文也从这两个角度着手。
默认基于RR事务隔离级别进行测试,RC模式相对更简单,后面会增加一个基于RC隔离级别的死锁案例分析。
三 非唯一键场景
环境准备
CREATE TABLE `tx` ( id int(11) NOT NULL AUTO_INCREMENT, c1 int(11) NOT NULL DEFAULT 0, c2 int(11) NOT NULL DEFAULT 0, PRIMARY KEY (id), KEY idx_c1 (c1) ) ENGINE=InnoDB AUTO_INCREMENT=32 DEFAULT CHARSET=utf8mb4; test 09:25:46 >select * from tx; +----+----+----+ | id | c1 | c2 | +----+----+----+ | 27 | 3 | 4 | | 30 | 5 | 8 | +----+----+----+
无锁阻塞
单纯的insert,无其他事务对相关记录加锁,直接成功。
test 10:30:01 >insert into tx(c1,c2) select 8,15; Query OK, 1 row affected (0.00 sec) Records: 1 Duplicates: 0 Warnings: 0
insert记录的下一条记录加Next-key lock
sess1: test 09:25:52 >begin; Query OK, 0 rows affected (0.00 sec) test 09:26:02 >select * from tx where c1=5 for update; +----+----+----+ | id | c1 | c2 | +----+----+----+ | 30 | 5 | 8 | +----+----+----+ 1 row in set (0.00 sec)
加锁分析此时sess1 对二级索引c1加Next-key lock (3,5],(5,+∞]
对主键id 加record lock [30]
sess2:
test 09:27:36 >insert into tx(c1,c2) values(4,10);
出现等待,查看show engine innodb status
insert into tx(c1,c2) values(4,10) ------- TRX HAS BEEN WAITING 9 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 27 page no 5 n bits 72 index idx_c1 of table `test`.`tx` trx id 2391 lock_mode X locks gap before rec insert intention waiting
分析: sess1 持有的Next-key lock (3,5] 阻塞sess2 insert intention lock,sess2 等待。
insert记录的下一条记录加Gap lock
sess1
test 10:32:45 >select * from tx where c1> 5 for update; +----+----+----+ | id | c1 | c2 | +----+----+----+ | 32 | 8 | 15 | | 33 | 13 | 18 | | 34 | 19 | 25 | +----+----+----+ 3 rows in set (0.00 sec)
二级索引持有锁(5, 8] (8,13],(19,+∞] 主键持有行锁 [32],[33],[34]
sess2
test 10:33:27 >insert into tx(c1,c2) select 4,9; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
事务日志显示
insert into tx(c1,c2) select 4,9 ------- TRX HAS BEEN WAITING 13 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 27 page no 3 n bits 80 index PRIMARY of table `test`.`tx` trx id 2416 lock_mode X insert intention waiting
分析:被插入的记录下一条记录有 Gap lock 时,插入出现等待。
小结
非唯一键的情况下,insert N 会申请 LOCK_INSERT_INTENTION lock,找到大于N的第一条记录M,如果M上没有Gap lock,Next-key lock,则插入成功。否则出现等待M上的锁释放。比如插入4,MySQL查看到4 的下一个记录5上有Gap lock,Next-key lock, 阻塞insert intention,因此出现等待。
问题:RR模式下
sess1 :select * from tx where c1<3 for update;
那么
sess2 insert into tx(c1,c2) select 4,9; 是否成功?
四 唯一键场景
环境准备
CREATE TABLE ty ( id int(11) NOT NULL AUTO_INCREMENT, c2 int(11) NOT NULL DEFAULT 0, c3 int(11) NOT NULL DEFAULT 0, PRIMARY KEY (id), UNIQUE KEY uniq_c2 (c2) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4; insert into ty(c2,c3) values(3,3),(8,8), (13,13),(19,19),(28,28);
insert之前,其他事务且对其下一条记录加了next-key lock
sess1
test 11:28:32 >begin; Query OK, 0 rows affected (0.00 sec) test 11:28:45 >select * from ty where c2 >=13 for update; +----+----+----+ | id | c2 | c3 | +----+----+----+ | 3 | 13 | 13 | | 4 | 19 | 19 | | 5 | 28 | 28 | +----+----+----+ 3 rows in set (0.00 sec)
分析:对记录c2为13的行加上next-key lock:(8,13](13,+∞)
sess2
test 11:28:49 >insert into ty(c2,c3) values(10,10); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
查看事务日志
---TRANSACTION 2448, ACTIVE 11 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s) MySQL thread id 1967928, OS thread handle 140143801399040, query id 8311237 127.0.0.1 root update insert into ty(c2,c3) values(10,10) ---TRX HAS BEEN WAITING 11 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 54 page no 3 n bits 80 index PRIMARY of table `test`.`ty` trx id 2448 lock_mode X insert intention waiting
分析: 从事务日志lock_mode X insert intention waiting可以看出Next-key lock 阻塞插入。
insert之前,其他事务且对其下一条记录加了gap lock
sess1
test 11:33:55 >begin; Query OK, 0 rows affected (0.00 sec) test [RW][TEST:qa_mytest:3316] 11:35:06 >select * from ty where c2 =11 for update; Empty set (0.00 sec)
sess2
test 11:35:02 >insert into ty(c2,c3) values(10,10); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
insert into ty(c2,c3) values(10,10) ------- TRX HAS BEEN WAITING 8 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 54 page no 4 n bits 80 index uniq_c2 of table `test`.`ty` trx id 2463 lock_mode X locks gap before rec insert intention waiting
分析: 因为表ty没有c2=11的记录,sess1 会持有(8,13)的gap lock ,从事务日志 lock_mode X locks gap before rec insert intention waiting可以看出sess1 持有的gap lock 阻塞sess2 的插入。
上面两个唯一键的案例本质上都是构造gap lock 阻塞 insert intention lock,我们再来看一个
insert值其下一条记录加了record lock
sess1
test 11:53:04 >select * from ty where c2=13 for update; +----+----+----+ | id | c2 | c3 | +----+----+----+ | 3 | 13 | 13 | +----+----+----+ 1 row in set (0.00 sec)
sess2
test 11:52:55 >insert into ty(c2,c3) values(10,10); Query OK, 1 row affected (0.00 sec)
sess2 insert 10 可以直接插入。说明record lock 和 insert intention lock 不冲突。
插入重复的记录
如果直接测试插入相同的记录,
test 12:36:05 >insert into ty(c2,c3) values(19,19);
ERROR 1062 (23000): Duplicate entry '19' for key 'uniq_c2'
则会立即报错唯一键冲突,通过show engine innodb status 看不到任何锁相关的信息。
我们构造一个已经存在的记录c2=19 ,然后删除记录c2=19,在其他会话插入相同的记录。
通过这样的逻辑来测试insert 语句遇到唯一键的时候的加锁流程。(其实也可以通过看源码,获取上面的信息,只是好的人没有深入查看源码)
sess1
test 11:58:22 >begin; Query OK, 0 rows affected (0.00 sec) test [RW][TEST:qa_mytest:3316] 11:58:29 >delete from ty where c2=19; Query OK, 1 row affected (0.00 sec)
delete语句事务不提交,sess1 持有唯一索引uniq_c2和主键两个行级锁(lock_mode X locks rec but not gap)
sess2
test 11:58:16 >begin; Query OK, 0 rows affected (0.00 sec) test 11:59:41 >insert into ty(c2,c3) values(19,19);
查看事务日志 相关会话持有锁的情况
insert into ty(c2,c3) values(19,19) TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 54 page no 4 n bits 80 index uniq_c2 of table `test`.`ty` trx id 2471 lock mode S waiting
我们看到insert (19,19) 在申请一把S Next-key Lock, 显示lock mode S waiting。这里想给大家说明的是在innodb日志中如果提示 lock mode S /lock mode X,其实是Next-key lock/Gap lock,如果是行记录锁的话,日志会提示but not gap,请读者朋友们在自己分析死锁日志的时候注意。
sess1 会话提交之后 执行针对c2=19的更新操作,去更新sess2 会话新插入的insert。
test 12:44:54 >update ty set c3=100 where c2 =19;
会话出现等待,我们查看事务日志:
update ty set c3=100 where c2 =19 ------- TRX HAS BEEN WAITING 7 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 54 page no 4 n bits 80 index uniq_c2 of table `test`.`ty` trx id 2473 lock_mode X locks rec but not gap waiting
发现update语句在等待行锁 lock_mode X locks rec but not gap waiting ,说明insert 插入完成之后 持有一把c2=19 的行锁
五 总结insert流程
insert N的流程(非唯一索引的情况):
找到大于N的第一条记录M
如果M上面没有gap/next-key lock的话,则可以插入,否则等待(对其next-rec加insert intension lock,由于有gap锁,所以等待)
insert N的流程(有唯一索引的情况):
找到大于N的第一条记录M,以及前一条记录P。
如果M上面没有gap/next-key lock的话,进入第三步骤 否则等待(对其next-rec加insert intension lock,由于有gap锁,所以等待)
检查P: 判断p是否等于n: 如果不等: 则完成插入(结束) 如果相等: 再判断P 是否有锁, 如果没有锁: 报1062错误(duplicate key) --说明该记录已经存在,报重复值错误 加S-lock --说明该记录被标记为删除, 事务已经提交,还没来得及purge 如果有锁: 则加S-lock --说明该记录被标记为删除,事务还未提交.
RC模式下的insert 相对更简单,大家可以实践出真知。
六 死锁案例
环境准备
事务隔离级RC
create table t7( id int not null primary key auto_increment, a int not null , unique key ua(a) ) engine=innodb; insert into t7(id,a) values(1,1),(5,4),(20,20),(25,12);
测试用例
死锁日志
2018-05-26 14:17:57 0x7f75ca3ce700
*** (1) TRANSACTION:
TRANSACTION 2507, ACTIVE 19 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 1
MySQL thread id 1971152, OS thread handle 140143695337216, query id 8326068 127.0.0.1 root update
insert into t7(id,a) values(30,10)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 56 page no 4 n bits 72 index ua of table `test`.`t7` trx id 2507 lock mode S waiting
*** (2) TRANSACTION:
TRANSACTION 2506, ACTIVE 26 sec inserting, thread declared inside InnoDB 5000
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 2
MySQL thread id 1971147, OS thread handle 140143880890112, query id 8326089 127.0.0.1 root update
insert into t7(id,a) values(40,9)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 56 page no 4 n bits 72 index ua of table `test`.`t7` trx id 2506 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 56 page no 4 n bits 72 index ua of table `test`.`t7` trx id 2506 lock_mode X locks gap before rec insert intention waiting
*** WE ROLL BACK TRANSACTION (1)
我们从时间线维度分析:
T1: insert into t7(id,a) values(26,10) 语句insert 成功,持有a=10 的X 行锁(X locks rec but not gap)
T2: insert into t7(id,a) values(30,10),因为T1 的第一条insert已经插入a=10的记录,T2的 insert a=10 则发生唯一约束冲突,需要申请对冲突的唯一索引加上S Next-key Lock (也即是 lock mode S waiting ) 这是一个间隙锁会申请锁住(4,10],(10,20]之间的gap区域。从这里会发现,即使是RC事务隔离级别,也同样会存在Next-Key Lock锁,从而阻塞并发。
T3: insert into t7(id,a) values(40,9)该语句插入的a=9的值在T1申请的gap锁(4,10]之间,Gap lock 阻塞insert intention lock,故要等待sess1的S-Next-key Lock锁释放,在日志中显示lock_mode X locks gap before rec insert intention waiting.
推荐阅读
关于InnoDB 各种锁类型介绍请参考 内核月报
http://mysql.taobao.org/monthly/2016/01/01/
http://mysql.taobao.org/monthly/2018/05/04/
http://keithlan.github.io/2017/06/21/innodb_locks_algorithms/