其他
每天5分钟PG聊通透第7期,为什么有的SQL杀不掉?
文中参考文档点击阅读原文打开, 同时推荐2个学习环境:
1、懒人Docker镜像, 已打包200+插件:《最好的PostgreSQL学习镜像》
2、有web浏览器就能用的云起实验室: 《免费体验PolarDB开源数据库》
3、PolarDB开源数据库内核、最佳实践等学习图谱: https://www.aliyun.com/database/openpolardb/activity
每天5分钟PG聊通透第7期,为什么有的SQL使用`pg_cancel_backend, pg_terminate_backend`都杀不掉?
背景
问题说明(现象、环境) 分析原因 结论和解决办法
链接、驱动、SQL
7、为什么有的SQL使用pg_cancel_backend, pg_terminate_backend都杀不掉? (不处理中断信号阶段(HOLD..RESUME中间), src/include/miscadmin.h)
https://www.bilibili.com/video/BV1EQ4y1Y7kY/
src/include/miscadmin.h
/*****************************************************************************
* System interrupt and critical section handling
*
* There are two types of interrupts that a running backend needs to accept
* without messing up its state: QueryCancel (SIGINT) and ProcDie (SIGTERM).
* In both cases, we need to be able to clean up the current transaction
* gracefully, so we can't respond to the interrupt instantaneously ---
* there's no guarantee that internal data structures would be self-consistent
* if the code is interrupted at an arbitrary instant. Instead, the signal
* handlers set flags that are checked periodically during execution.
*
* The CHECK_FOR_INTERRUPTS() macro is called at strategically located spots
* where it is normally safe to accept a cancel or die interrupt. In some
* cases, we invoke CHECK_FOR_INTERRUPTS() inside low-level subroutines that
* might sometimes be called in contexts that do *not* want to allow a cancel
* or die interrupt. The HOLD_INTERRUPTS() and RESUME_INTERRUPTS() macros
* allow code to ensure that no cancel or die interrupt will be accepted,
* even if CHECK_FOR_INTERRUPTS() gets called in a subroutine. The interrupt
* will be held off until CHECK_FOR_INTERRUPTS() is done outside any
* HOLD_INTERRUPTS() ... RESUME_INTERRUPTS() section.
*
* There is also a mechanism to prevent query cancel interrupts, while still
* allowing die interrupts: HOLD_CANCEL_INTERRUPTS() and
* RESUME_CANCEL_INTERRUPTS().
*
* Note that ProcessInterrupts() has also acquired a number of tasks that
* do not necessarily cause a query-cancel-or-die response. Hence, it's
* possible that it will just clear InterruptPending and return.
*
* INTERRUPTS_PENDING_CONDITION() can be checked to see whether an
* interrupt needs to be serviced, without trying to do so immediately.
* Some callers are also interested in INTERRUPTS_CAN_BE_PROCESSED(),
* which tells whether ProcessInterrupts is sure to clear the interrupt.
*
* Special mechanisms are used to let an interrupt be accepted when we are
* waiting for a lock or when we are waiting for command input (but, of
* course, only if the interrupt holdoff counter is zero). See the
* related code for details.
*
* A lost connection is handled similarly, although the loss of connection
* does not raise a signal, but is detected when we fail to write to the
* socket. If there was a signal for a broken connection, we could make use of
* it by setting ClientConnectionLost in the signal handler.
*
* A related, but conceptually distinct, mechanism is the "critical section"
* mechanism. A critical section not only holds off cancel/die interrupts,
* but causes any ereport(ERROR) or ereport(FATAL) to become ereport(PANIC)
* --- that is, a system-wide reset is forced. Needless to say, only really
* *critical* code should be marked as a critical section! Currently, this
* mechanism is only used for XLOG-related code.
*
*****************************************************************************/
src/backend/tcop/postgres.c
/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
* If an interrupt condition is pending, and it's safe to service it,
* then clear the flag and accept the interrupt. Called only when
* InterruptPending is true.
*
* Note: if INTERRUPTS_CAN_BE_PROCESSED() is true, then ProcessInterrupts
* is guaranteed to clear the InterruptPending flag before returning.
* (This is not the same as guaranteeing that it's still clear when we
* return; another interrupt could have arrived. But we promise that
* any pre-existing one will have been serviced.)
*/
void
ProcessInterrupts(void)
{
....
下次遇到杀不掉的可以用pstack看看杀不掉的进程在干什么, 找找对应的代码是否能找到是不是调用了hold中断.
本期彩蛋 - 开源Clup: PostgreSQL&PolarDB高可用与日常管理软件
clup由《PostgreSQL从小工到专家》作者唐成乘数科技出品, 包含开源版本和企业版本, 是非常成熟的PostgreSQL&PolarDB集群管理软件.
官网: https://www.csudata.com/clup
开源项目地址: https://gitee.com/csudata
使用CLup可以轻松管理几十套至上百套PostgreSQL、PolarDB高可用的数据库集群,发生故障自动切换,不影响生产系统的运行。故障切换后有详细的故障日志,方便定位故障原因,还可以手工一键切换。CLup还提供了数据库的一些基本监控和TOP SQL的监控,CLup后续版本还会增加更多的功能。
管理基于PostgreSQL流复制的集群
管理基于共享存储的PolarDB集群
产品优势
最后推荐2本大佬的新书
文章中的参考文档请点击阅读原文获得.
欢迎关注我的github (https://github.com/digoal/blog) , 学习数据库不迷路.
近期正在写公开课材料, 未来将通过视频号推出, 欢迎关注视频号: