Introduction to RAC Database Background Processes
Author | JiekeXu
Source | JiekeXu DBA之路 (ID: JiekeXu_IT)
Hello everyone, I'm JiekeXu, and it's good to see you again. Today we'll look at the background processes of a RAC database.
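A RAC database runs more background processes than a single-instance database. These extra processes are specific to RAC and exist to implement the clustered-database functionality.
RAC-specific processes in 10g: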
$ ps -ef|grep ora_
oracle 4721 1 0 Feb26 ? 00:00:00 ora_diag_ONEPIECE1
oracle 4725 1 0 Feb26 ? 00:02:26 ora_lmon_ONEPIECE1
oracle 4727 1 0 Feb26 ? 00:00:02 ora_lmd0_ONEPIECE1
oracle 4729 1 0 Feb26 ? 00:00:01 ora_lms0_ONEPIECE1
oracle 4733 1 0 Feb26 ? 00:00:01 ora_lms1_ONEPIECE1
oracle 4761 1 0 Feb26 ? 00:00:07 ora_lck0_ONEPIECE1
oracle 4772 1 0 Feb26 ? 00:00:00 ora_asmb_ONEPIECE1
oracle 4776 1 0 Feb26 ? 00:00:00 ora_rbal_ONEPIECE1
oracle 4840 1 0 Feb26 ? 00:00:00 ora_o001_ONEPIECE1
RAC-specific processes in 11g:
$ ps -ef|grep ora_
oracle 426 1 0 Feb27 ? 00:00:08 ora_o000_RAC11G21
oracle 9082 1 0 Feb25 ? 00:01:09 ora_diag_RAC11G21
oracle 9086 1 0 Feb25 ? 00:00:27 ora_ping_RAC11G21
oracle 9088 1 0 Feb25 ? 00:00:06 ora_acms_RAC11G21
oracle 9092 1 0 Feb25 ? 00:05:27 ora_lmon_RAC11G21
oracle 9094 1 0 Feb25 ? 00:01:32 ora_lmd0_RAC11G21
oracle 9096 1 0 Feb25 ? 00:02:07 ora_lms0_RAC11G21
oracle 9100 1 0 Feb25 ? 00:00:06 ora_rms0_RAC11G21
oracle 9102 1 0 Feb25 ? 00:00:14 ora_lmhb_RAC11G21
oracle 9116 1 0 Feb25 ? 00:00:09 ora_rbal_RAC11G21
oracle 9118 1 0 Feb25 ? 00:00:05 ora_asmb_RAC11G21
oracle 9136 1 0 Feb25 ? 00:04:25 ora_lck0_RAC11G21
oracle 9138 1 0 Feb25 ? 00:00:14 ora_rsmn_RAC11G21
oracle 9295 1 0 Feb25 ? 00:00:07 ora_gtx0_RAC11G21
oracle 9297 1 0 Feb25 ? 00:00:07 ora_rcbg_RAC11G21
This article introduces these RAC-specific processes.
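To see at a glance which process families the 11g listing adds over the 10g one, the two listings can be compared programmatically. The following is a minimal Python sketch and not part of the original article. The helper name `process_families` and the rule of stripping the trailing digits from a process name (so that `lms0` and `lms1` count as one family) are assumptions made for illustration. Only the command column of each listing is reproduced.

```python
import re

def process_families(ps_output):
    """Return the set of background process families found in ps output.

    A family is the process name without its numeric suffix,
    e.g. 'lms' for ora_lms0_ONEPIECE1 (an illustrative convention).
    """
    families = set()
    for line in ps_output.splitlines():
        match = re.search(r"ora_([a-z0-9]+)_\S+\s*$", line)
        if match:
            families.add(re.sub(r"\d+$", "", match.group(1)))
    return families

# Command column copied from the 10g and 11g listings above.
PS_10G = """
ora_diag_ONEPIECE1
ora_lmon_ONEPIECE1
ora_lmd0_ONEPIECE1
ora_lms0_ONEPIECE1
ora_lms1_ONEPIECE1
ora_lck0_ONEPIECE1
ora_asmb_ONEPIECE1
ora_rbal_ONEPIECE1
ora_o001_ONEPIECE1
"""

PS_11G = """
ora_o000_RAC11G21
ora_diag_RAC11G21
ora_ping_RAC11G21
ora_acms_RAC11G21
ora_lmon_RAC11G21
ora_lmd0_RAC11G21
ora_lms0_RAC11G21
ora_rms0_RAC11G21
ora_lmhb_RAC11G21
ora_rbal_RAC11G21
ora_asmb_RAC11G21
ora_lck0_RAC11G21
ora_rsmn_RAC11G21
ora_gtx0_RAC11G21
ora_rcbg_RAC11G21
"""

new_in_11g = sorted(process_families(PS_11G) - process_families(PS_10G))
print(new_in_11g)
```

For these two listings the difference is the seven families acms, gtx, lmhb, ping, rcbg, rms, and rsmn.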
LMD: Global Enqueue Service Daemon. The LMD process mainly handles resource requests coming from remote nodes. The flow is roughly as follows:
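+ A session issues a global enqueue request
+ The request is passed to the LMD0 process on the local node
+ The foreground process goes into a wait state
+ LMD0 determines which node is the master of the resource
+ LMD0 sends the request to the master node
+ If necessary, the master node creates a new master resource
+ At this point the master node knows who the owner and the waiters are
+ Once the resource is granted to the requestor, LMD0 on the master node notifies LMD0 on the requestor node
+ LMD0 on the requestor node then notifies the foreground process that requested the resource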
In other words, LMD mainly handles global enqueue requests, while LCK0 mainly handles locks local to the instance.
In addition, global deadlocks on RAC are also detected by LMD.
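The request flow listed for LMD can be pictured with a toy model. This is a minimal Python sketch under stated assumptions, not Oracle's implementation: the class names `ToyCluster` and `MasterResource` are hypothetical, and choosing the master node with a CRC32 hash is purely an illustrative stand-in, since the article does not say how the master is determined.

```python
import zlib
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MasterResource:
    """State kept on the master node: one owner and a queue of waiters."""
    owner: Optional[str] = None
    waiters: deque = field(default_factory=deque)

class ToyCluster:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        # One resource directory per node; each node masters some resources.
        self.directory = {node: {} for node in self.nodes}

    def master_of(self, resource):
        # Assumption: a hash picks the master node (illustration only).
        return self.nodes[zlib.crc32(resource.encode()) % len(self.nodes)]

    def request(self, node, session, resource):
        """A session on `node` asks for a global enqueue; True means granted."""
        requestor = f"{session}@{node}"
        master = self.master_of(resource)  # local LMD0 finds the master node
        # The master node creates the master resource if it does not exist yet.
        entry = self.directory[master].setdefault(resource, MasterResource())
        if entry.owner is None:
            entry.owner = requestor        # grant, then notify the requestor
            return True
        entry.waiters.append(requestor)    # the foreground process keeps waiting
        return False

    def release(self, resource):
        """Owner releases the resource; the first waiter, if any, becomes owner."""
        entry = self.directory[self.master_of(resource)][resource]
        entry.owner = entry.waiters.popleft() if entry.waiters else None
        return entry.owner

cluster = ToyCluster(["node1", "node2"])
print(cluster.request("node1", "s1", "TX-1"))  # first request is granted
print(cluster.request("node2", "s2", "TX-1"))  # second request has to wait
print(cluster.release("TX-1"))                 # the waiter becomes the owner
```

The point of the model is that the owner and waiter lists live only on the master node, which is why every request has to travel there through the LMD0 processes.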
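LCK0: Instance Enqueue Process. The LCK0 process mainly handles resource requests that are not related to Cache Fusion, such as library cache and row cache requests.
LCK0 handles instance-level locks: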
Row cache entries
Library cache entries
Result cache entries
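The owner and waiter of these instance-level locks is the LCK0 process.
As long as LCK0 owns the lock on an instance, any session on that instance can use the cached metadata.
If the local instance does not hold the lock, it has to request it, and the foreground process waits on the DFS lock handle event.
In addition, when the shared pool comes under pressure and memory must be freed for new cursors, the LCK process releases some dictionary cache memory.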
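The idea that one instance-level lock held by LCK0 serves every session on the instance can be shown in a few lines. This is a minimal Python sketch, assuming a hypothetical `ToyInstance` class; it models only the "wait once, then share" behavior described above and nothing about how Oracle actually acquires the lock.

```python
class ToyInstance:
    def __init__(self):
        self.lck0_locks = set()  # instance locks currently owned by LCK0
        self.waits = []          # (session, wait event) pairs recorded so far

    def use_cached_metadata(self, session, object_name):
        if object_name not in self.lck0_locks:
            # The instance does not hold the lock yet: the foreground
            # session waits while the lock is requested on its behalf.
            self.waits.append((session, "DFS lock handle"))
            self.lck0_locks.add(object_name)
        # From here on, any session on this instance can use the cache.
        return f"{session} uses cached metadata for {object_name}"

inst = ToyInstance()
inst.use_cached_metadata("s1", "SCOTT.EMP")
inst.use_cached_metadata("s2", "SCOTT.EMP")
print(inst.waits)
```

Only the first session records a wait; the second one finds the lock already owned by LCK0.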
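LMON: Global Enqueue Service Monitor. LMON monitors global enqueues and resources across the whole cluster and performs global enqueue recovery. When an instance terminates abnormally, LMON handles the cleanup of GCS memory. When an instance joins or leaves the cluster, LMON reconfigures locks and resources. LMON also checks communication between instances; if a peer times out, it initiates a node eviction. This is why, after an eviction (ORA-481, ORA-29740, and so on), we often need to read the LMON trace to find the cause.
In addition, in DRM (Dynamic Resource Management), LMD monitors the queue of resources that need remastering and passes the work to LMON, which carries out the remastering.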
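The communication check that leads to an eviction boils down to a timeout comparison. Below is a minimal Python sketch of that idea; the function name `instances_to_evict`, the instance names, and the 60-second timeout are all hypothetical values chosen for illustration, not Oracle defaults.

```python
def instances_to_evict(last_heard, now, timeout_seconds):
    """Return the instances whose last message is older than the timeout.

    last_heard maps an instance name to the time (in seconds) at which
    a message from it was last received.
    """
    return sorted(
        name
        for name, timestamp in last_heard.items()
        if now - timestamp > timeout_seconds
    )

# Hypothetical timestamps: inst2 has been silent for 70 seconds.
last_heard = {"inst1": 100.0, "inst2": 40.0}
print(instances_to_evict(last_heard, now=110.0, timeout_seconds=60.0))
```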
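LMS: Global Cache Service Process. The LMS processes maintain the status of data files and of each cached block in the Global Resource Directory (GRD). LMS carries messages and data blocks between RAC instances and is a key part of Cache Fusion. LMS is arguably the most active background process on RAC and can consume a significant amount of CPU. An instance usually runs several LMS processes. The default number differs between Oracle versions; for most versions it is MIN(CPU_COUNT/2, 2).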
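As a worked example of the default quoted above, the formula can be transcribed literally. This is a minimal Python sketch with two assumptions that the article does not state: the division is integer division, and the result never drops below one process. Actual defaults vary by version, so treat the numbers as an illustration of the formula only.

```python
def default_lms_count(cpu_count):
    """Literal transcription of MIN(CPU_COUNT/2, 2).

    Assumptions: integer division, and a floor of one LMS process.
    """
    return max(1, min(cpu_count // 2, 2))

for cpus in (1, 2, 4, 16):
    print(cpus, default_lms_count(cpus))
```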
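DIAG: Diagnostic Capture Process. Writes diagnostic information. The diag process responds to dump requests issued by other processes and writes the relevant diagnostics to the diag trace file. On RAC, when a global oradebug request is issued, the diag process of each instance writes the diagnostics to its diag trace file.
For example, the commands below use "-g", so the resulting dumps are written to the diag trace file of each instance: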
SQL> oradebug -g all hanganalyze 3
SQL> oradebug -g all dump systemstate 266
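ASMB: ASM Background Process. Communicates with the ASM instance to manage storage and provide statistics. The ASMB process on the ASM instance is needed when you use the ASMCMD cp command. ASMB is also used when the database instance's spfile is stored in ASM, and when the OCR is stored in ASM.
RBAL: ASM Rebalance Master Process. Acts as the coordinator when an ASM disk group is rebalanced. In a database instance, it manages the ASM disk groups.
Onnn: ASM Connection Pool Process. A pool of connections from the database instance to the ASM instance, through which the database sends messages to the ASM instance. For example, requests to open a file are sent to the ASM instance this way. The pool handles only short requests, not long-running ones such as creating a file.
PZ: PQ slaves. PZnn processes (starting at 99) are used to query GV$ views; such queries have to run in parallel on every instance. If more PZ processes are needed, PZ98, PZ97, ... are spawned automatically (in descending order).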
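The descending naming scheme for PZ processes is easy to reproduce. This is a minimal Python sketch; the function name `pz_process_names` is hypothetical, and capping the count at 100 (PZ99 down to PZ00) is an assumption that follows only from the two-digit suffix, not from anything stated in the article.

```python
def pz_process_names(count):
    """Return the names of the first `count` PZ slaves, starting at PZ99."""
    if not 0 <= count <= 100:
        raise ValueError("count must be between 0 and 100")
    return [f"PZ{number:02d}" for number in range(99, 99 - count, -1)]

print(pz_process_names(3))
```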
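11g-specific processes:
PING: Interconnect Latency Measurement Process. Checks the state of private-network communication between the instances in the cluster. Every few seconds each instance sends messages to the other instances, where they are received by the PING process. The time taken to send and receive the messages is recorded and checked against what is normal.
LMHB: Global Cache/Enqueue Service Heartbeat Monitor. Monitors whether the local LMON, LMD, LCK0, RMS0, and LMSn processes are running normally, or whether they are blocked or hung.
RMSn: Oracle RAC Management Process. Performs RAC management tasks, such as creating the related resources when a new instance joins the cluster.
RSMN: Remote Slave Monitor Process. Manages the creation of background slave processes and acts as a coordinator for remote instances to carry out certain tasks.
GTXn: Global Transaction Process. Provides transparent support for XA transactions in a RAC environment, maintains the global information about XA transactions in RAC, and completes two-phase commit for global transactions.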
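Two-phase commit, which GTXn completes for global transactions, can be summarized in a short model. This is a minimal Python sketch of the generic protocol, not of Oracle's XA implementation; the `Branch` class and `two_phase_commit` function are hypothetical names introduced for illustration.

```python
class Branch:
    """One branch of a global transaction on a single instance (toy model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1 vote: True means this branch promises it can commit.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

def two_phase_commit(branches):
    # Phase 1: collect a vote from every branch.
    votes = [branch.prepare() for branch in branches]
    decision = "committed" if all(votes) else "rolled back"
    # Phase 2: apply the same outcome to every branch.
    for branch in branches:
        branch.state = decision
    return decision

print(two_phase_commit([Branch("inst1"), Branch("inst2")]))
print(two_phase_commit([Branch("inst1"), Branch("inst2", can_commit=False)]))
```

A single negative vote in the first phase is enough to roll back every branch, which is what keeps the global transaction atomic across instances.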
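RCBG: Result Cache Background Process. Handles Result Cache messages on RAC.
ACMS: Atomic Control File to Memory Service Process. Acts as an agent on each instance to ensure that SGA updates are synchronized across all RAC instances: either committed globally on success, or rolled back globally if a problem occurs.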
* ACMS (atomic controlfile to memory service) per-instance process is an agent that contributes to ensuring a distributed SGA memory update is either globally committed on success or globally aborted in the event of a failure in an Oracle RAC environment.
* DBRM (database resource manager) process is responsible for setting resource plans and other resource manager related tasks.
* DIA0 (diagnosability process 0) (only 0 is currently being used) is responsible for hang detection and deadlock resolution.
* DIAG (diagnosability) process performs diagnostic dumps and executes global oradebug commands.
* EMNC (event monitor coordinator) is the background server process used for database event management and notifications.
* FBDA (flashback data archiver process) archives the historical rows of tracked tables into flashback data archives. Tracked tables are tables which are enabled for flashback archive. When a transaction containing DML on a tracked table commits, this process stores the pre-image of the rows into the flashback archive. It also keeps metadata on the current rows. FBDA is also responsible for automatically managing the flashback data archive for space, organization, and retention and keeps track of how far the archiving of tracked transactions has occurred.
* GTX0-j (global transaction) processes provide transparent support for XA global transactions in an Oracle RAC environment. The database autotunes the number of these processes based on the workload of XA global transactions. Global transaction processes are only seen in an Oracle RAC environment.
* KATE performs proxy I/O to an ASM metafile when a disk goes offline.
* MARK marks ASM allocation units as stale following a missed write to an offline disk.
* SMCO (space management coordinator) process coordinates the execution of various space management related tasks, such as proactive space allocation and space reclamation. It dynamically spawns slave processes (Wnnn) to implement the task.
* VKTM (virtual keeper of time) is responsible for providing a wall-clock time (updated every second) and reference-time counter (updated every 20 ms and available only when running at elevated priority).
Some additional processes not documented in 10g:
* PZ (PQ slaves used for global Views) are RAC Parallel Server Slave processes, but they are not normal parallel slave processes, PZnn processes (starting at 99) are used to query GV$ views which is done using Parallel Execution on all instances, if more than one PZ process is needed, then PZ98, PZ97,... (in that order) are created automatically.
* O00 (ASM slave processes) A group of slave processes establish connections to the ASM instance. Through this connection pool database processes can send messages to the ASM instance. For example opening a file sends the open request to the ASM instance via a slave. However slaves are not used for long running operations such as creating a file. Using slave (pool) connections eliminates the overhead of logging into the ASM instance for short requests.
* x000 - Slave used to expel disks after diskgroup reconfiguration
Original link: https://blogs.oracle.com/database4cn/rac-v2
New Background Processes In 11g (Doc ID 444149.1)
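That's all on RAC processes for now. These are brief notes to come back to from time to time. If you found this article helpful, likes and shares are welcome; writing takes effort, and a small gesture is the best support for the author.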
————————————————————————————
WeChat official account: JiekeXu DBA之路
墨天轮 (modb.pro): https://www.modb.pro/u/4347
CSDN: https://blog.csdn.net/JiekeXu
Tencent Cloud: https://cloud.tencent.com/developer/user/5645107
————————————————————————————