ORA-00349|激活 ADG 备库时遇到的问题
作者 | JiekeXu
来源 | JiekeXu DBA之路(ID: JiekeXu_IT)
大家好,我是 JiekeXu,很高兴又和大家见面了,今天和大家聊聊 激活 ADG 备库时遇到的问题,欢迎点击上方蓝字关注我,标星或置顶,更多干货第一时间到达!
近日有一套实时同步的 ASM 管理的单机 11204 ADG 备库,由于业务需要,想要脱离主库的约束,想激活拉成读写库直接升级成 ASM 管理的 19C,闪回快照模式无法满足要求,只能 ALTER DATABASE ACTIVATE STANDBY DATABASE 强制切成可读写的主库。说干就干,先将其切成主库,升级过程等下次在一起讨论。
--主库
--主库设置为 defer, 取消备库日志应用,关库启动到 mount 状态进行。
show parameter log_archive_dest_state_2
alter system set log_archive_dest_state_2=defer scope=both sid='*';
--备库
alter database recover managed standby database cancel;
shu immediate
startup mount
--强制拉成主库,很遗憾报错 ORA-00349
SQL> ALTER DATABASE ACTIVATE STANDBY DATABASE;
ALTER DATABASE ACTIVATE STANDBY DATABASE
*
ERROR at line 1:
ORA-00349: failure obtaining block size for '
+JIEKE_DATA
'
ORA-15001: diskgroup "
JIEKE_DATA
" does not exist or is not mounted
ORA-15001: diskgroup "
JIEKE_DATA
" does not exist or is not mounted
使用 ACTIVATE 命令想强制拉成主库,很遗憾如下图报错 ORA-00349。alert 日志中发现有很多清理 redo log 的报错,“ORA-00313: open failed…”无法打开日志组 5、6、23,于是查看日志组成员确实发现 redolog 创建的有问题,member 成员显示的为不存在的磁盘组 “+JIEKE_DATA” 而不是具体路径,真是存在的磁盘组“+JIEKER_DATA”。这就是问题所在,redolog 创建错误,切成主库时 redolog 又是必须的,故报错了,那么现在就是将这个错误的 redolog 重建,问题就会得到解决。但实际上不是这样的,折腾了好久也没解决,继续往下看。
GROUP# Member
---------- ---------------------------------------------------------------------------------------------------
5 +JIEKE_DATA
5 +JIEKE_DATA
6 +JIEKE_DATA
6 +JIEKE_DATA
23 +JIEKE_DATA
23 +JIEKE_DATA
11 +JIEKER_DATA/jiekexu/onlinelog/group_11.1621.1065127343
11 +JIEKER_ARCH/jiekexu/onlinelog/group_11.389.1065127355
12 +JIEKER_DATA/jiekexu/onlinelog/group_12.1620.1065127363
12 +JIEKER_ARCH/jiekexu/onlinelog/group_12.395.1065127371
13 +JIEKER_DATA/jiekexu/onlinelog/group_13.1619.1065127381
SELECT to_char(f.group#) groupno, to_char(l.thread#) thread, f.member member, f.type redo_file_type, l.status group_status, f.status member_status, l.bytes/1024/1024 bytes, l.archived archived
FROM v$logfile f, v$log l WHERE f.group# = l.grSQL> oup# ORDER BY f.group#, f.member;
SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> 2
Group Thread Member Redo Type Group Status Member Status Size(M) Archived
------ ------ -------------------- ---------- ------------ --------------- -------- ----------
5 1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
6 2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
23 1 +JIEKE_DATA ONLINE CLEARING_CUR INVALID 4,096 YES
RENT
1 +JIEKE_DATA ONLINE CLEARING_CUR INVALID 4,096 YES
RENT
27 2 +JIEKER_DATA/jiekexu/ ONLINE UNUSED 4,096 YES
onlinelog/group_27.1
741.1065129955
28 2 +JIEKER_DATA/jiekexu/ ONLINE UNUSED 4,096 YES
onlinelog/group_28.1
742.1065129973
alert 日志如下:
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 5 of thread 1
Clearing online redo logfile 5 +JIEKE_DATA
Clearing online log 5 of thread 1 sequence number 4751
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 5 of thread 1
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 5 of thread 1
Clearing online redo logfile 5 complete
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 6 of thread 2
Clearing online redo logfile 6 +JIEKE_DATA
Clearing online log 6 of thread 2 sequence number 2592
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 6 of thread 2
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 6 of thread 2
Clearing online redo logfile 6 complete
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 23 of thread 1
Clearing online redo logfile 23 +JIEKE_DATA
Clearing online log 23 of thread 1 sequence number 4752
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 23 of thread 1
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 23 of thread 1
Clearing online redo logfile 23 complete
Resetting resetlogs activation ID 2008461997 (0x77b6b2ad)
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 5 of thread 1
Errors in file /app/oracle/diag/rdbms/jiekexu/jiekexu/trace/jiekexu_ora_30872.trc:
ORA-00313: open failed for members of log group 5 of thread 1
ORA-349 signalled during: ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE...
Tue Mar 02 22:42:30 2021
alter database drop logfile group 23
ORA-1623 signalled during: alter database drop logfile group 23...
Tue Mar 02 22:45:07 2021
RFS[7]: Assigned to RFS process 10180
RFS[7]: Opened log for thread 2 sequence 2592 dbid 1797812601 branch 1063804222
Archived Log entry 1100 added for thread 2 sequence 2592 rlc 1063804222 ID 0x77b6b2ad dest 2:
Tue Mar 02 22:45:08 2021
RFS[8]: Assigned to RFS process 10277
RFS[8]: Selected log 17 for thread 2 sequence 2593 dbid 1797812601 branch 1063804222
Tue Mar 02 22:45:08 2021
Primary database is in MAXIMUM PERFORMANCE mode
重建备库 redolog
那么,这里将上演的是重建备库 redolog 的相关操作步骤。
SQL> alter database drop logfile group 27;
alter database drop logfile group 27
*
ERROR at line 1:
ORA-01156: recovery or flashback in progress may need access to files
--由于开启了日志应用进程,直接删除会报错,故需要停止日志应用,修改参数 standby_file_management 为手动。
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> alter system set standby_file_management='manual' scope=both sid='*';
Database altered.
SQL> show parameter standby_file_management
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
standby_file_management string manual
SQL> alter database drop logfile group 27;
Database altered.
--由于日志组不能少于 2 个,故 日志组 28 不不能够删除。
SQL> alter database drop logfile group 28;
alter database drop logfile group 28
*
ERROR at line 1:
ORA-01567: dropping log 28 would leave less than 2 log files for instance JIEKEXU2 (thread 2)
ORA-00312: online log 28 thread 2: '+JIEKER_DATA/jiekexu/onlinelog/group_28.1742.1065129973'
下面则通过继续应用日志、重启、主库切日志、重命名等各种手段继续尝试删除这三个有问题的日志组。
--重启备库删除日志组 6、23,由于是当前日志组无法删除,庆幸日志组 5 成功删除了。
SQL> alter database drop logfile group 6;
alter database drop logfile group 6
*
ERROR at line 1:
ORA-01623: log 6 is current log for instance JIEKEXU2 (thread 2) - cannot drop
ORA-00312: online log 6 thread 2: '+JIEKE_DATA'
ORA-00312: online log 6 thread 2: '+JIEKE_DATA'
SQL> alter database drop logfile group 5;
Database altered.
SQL> alter database drop logfile group 23;
alter database drop logfile group 23
*
ERROR at line 1:
ORA-01623: log 23 is current log for instance jiekexu (thread 1) - cannot drop
ORA-00312: online log 23 thread 1: '+JIEKE_DATA'
ORA-00312: online log 23 thread 1: '+JIEKE_DATA'
--只剩两组日志组也是当前日志组,则当前日志组无法删除,尝试进行 rename 操作,但也是无效或者缺失命令。
SQL> alter database rename '+JIEKE_DATA' to '+JIEKER_DATA';
alter database rename '+JIEKE_DATA' to '+JIEKER_DATA'
*
ERROR at line 1:
ORA-02231: missing or invalid option to ALTER DATABASE
SQL> alter database rename '+JIEKE_DATA' to '+JIEKER_DATA/jiekexu/onlinelog/group_6.dbf';
alter database rename '+JIEKE_DATA' to '+JIEKER_DATA/jiekexu/onlinelog/group_6.dbf'
*
ERROR at line 1:
ORA-02231: missing or invalid option to ALTER DATABASE
--当然继续激活为主库肯定也是报错。那就继续开启日志同步模式,先保持备库同步吧。
SQL> ALTER DATABASE ACTIVATE STANDBY DATABASE;
ALTER DATABASE ACTIVATE STANDBY DATABASE
*
ERROR at line 1:
ORA-00349: failure obtaining block size for '
+JIEKE_DATA
'
ORA-15001: diskgroup "
JIEKE_DATA
" does not exist or is not mounted
ORA-15001: diskgroup "
JIEKE_DATA
" does not exist or is not mounted
SQL> alter database recover managed standby database using current logfile disconnect from session;
Database altered.
重建备库控制文件解决
第二日早晨,睡醒之后头脑清晰想到备库既然无法删除,那主库肯定是可以删除的。通过主库删除日志组 6、23 之后,再重建一个备库的控制文件就可以解决。这样也很简单,主库删除备库有问题的两组日志组后使用 rman 备份一个备库的 控制文件,然后 scp 到备库,备库重启到 nomount 恢复控制文件,启动到 mount 就好了。
--主库:
SQL> alter database drop logfile group 6;
Database altered.
SQL> alter database drop logfile group 23;
Database altered.
rman target /
RMAN> backup current controlfile for standby format '/home/oracle/backup20210303%d_%I_%s_%p.ctl';
scp /home/oracle/backup20210303%d_%I_%s_%p.ctl jiekeadg:/home/oracle/
--备库
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
SQL> shu immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup nomount
ORACLE instance started.
Total System Global Area 1.0689E+11 bytes
Fixed Size 2265864 bytes
Variable Size 4.2144E+10 bytes
Database Buffers 6.4425E+10 bytes
Redo Buffers 323678208 bytes
SQL> exit
jiekeadg:/home/oracle(jiekexu)>rman target /
Recovery Manager: Release 11.2.0.4.0 - Production on Wed Mar 3 10:07:34 2021
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: JIEKEXU (not mounted)
RMAN> restore standby controlfile from '/home/oracle/backup20210303JIEKEXU_1797812601_106_1.ctl';
Starting restore at 2021-03-03 10:08:03
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=3009 device type=DISK
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
output file name=+JIEKER_DATA/jiekexu/controlfile/current.1739.1065125909
output file name=+JIEKER_ARCH/jiekexu/controlfile/current.323.1065125911
Finished restore at 2021-03-03 10:08:04
RMAN> sql'alter database mount';
sql statement: alter database mount
released channel: ORA_DISK_1
RMAN> exit
-- SQLPLUS 里应用 MRP0 进程同步数据。
jiekeadg:/home/oracle(jiekexu)>sqlplus / as sysdba
SQL> alter database recover managed standby database using current logfile disconnect from session;
Database altered.
SQL> /
NAME VALUE UNIT TIME_COMPUTED
------------- -------------------- ------------------------------ ------------------------------
transport lag +00 00:00:00 day(2) to second(0) interval 03/03/2021 10:19:08
apply lag +00 00:00:00 day(2) to second(0) interval 03/03/2021 10:19:08
重建控制文件后恢复的备库中就没有了错误的磁盘组,但这样时间长一些则备库日志应用会延迟,因为 standby_log 日志状态全部为 UNASSIGNED 的。需要重建备库 standby redolog 日志组后它的状态才会变成 ACTIVE。
取消日志应用,修改参数为 manual 删除原有日志组,重建日志组改回参数应用日志同步进程即可。
SELECT to_char(f.group#) groupno, to_char(l.thread#) thread, f.member member, f.type redo_file_type, l.status group_status, f.status member_status, l.bytes/1024/1024 bytes, l.archived archived
FROM v$logfile f, v$log l WHERE f.group# = l.group# ORDER BY f.group#, f.member;SQL>
SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> 2
Group Thread Member Redo Type Group Status Member Status Size(M) Archived
------ ------ -------------------- ---------- ------------ --------------- -------- ----------
1 1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
3 1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
4 1 +JIEKE_DATA ONLINE CURRENT INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CURRENT INVALID 4,096 YES
5 1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
7 2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
8 2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
9 2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
10 2 +JIEKE_DATA ONLINE CURRENT INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CURRENT INVALID 4,096 YES
24 2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
SQL> show parameter standby_file_management
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------standby_file_management string manual
SQL>
SQL> select inst_id,GROUP#,THREAD#,SEQUENCE#,USED,ARCHIVED,STATUS from gv$standby_log;
INST_ID GROUP# THREAD# SEQUENCE# USED Archived STATUS
---------- ---------- ---------- ---------- ---------- ---------- --------------------
1 11 1 0 0 YES UNASSIGNED
1 12 1 0 0 YES UNASSIGNED
1 13 1 0 0 YES UNASSIGNED
1 14 1 0 0 YES UNASSIGNED
1 15 1 0 0 YES UNASSIGNED
alter database drop logfile group 11;
alter database drop logfile group 12;
alter database drop logfile group 13;
alter database drop logfile group 14;
alter database drop logfile group 15;
ALTER DATABASE ADD standby LOGFILE THREAD 1 ('+JIEKER_DATA','+JIEKER_ARCH') SIZE 4G;
ALTER DATABASE ADD standby LOGFILE THREAD 1 ('+JIEKER_DATA','+JIEKER_ARCH') SIZE 4G;
ALTER DATABASE ADD standby LOGFILE THREAD 1 ('+JIEKER_DATA','+JIEKER_ARCH') SIZE 4G;
ALTER DATABASE ADD standby LOGFILE THREAD 1 ('+JIEKER_DATA','+JIEKER_ARCH') SIZE 4G;
ALTER DATABASE ADD standby LOGFILE THREAD 1 ('+JIEKER_DATA','+JIEKER_ARCH') SIZE 4G;
SQL> select inst_id,GROUP#,THREAD#,SEQUENCE#,USED,ARCHIVED,STATUS from gv$standby_log;
INST_ID GROUP# THREAD# SEQUENCE# USED Archived STATUS
---------- ---------- ---------- ---------- ---------- ---------- --------------------
1 6 1 4797 3026358272 YES ACTIVE
1 11 1 0 0 YES UNASSIGNED
1 12 1 0 0 YES UNASSIGNED
1 13 1 0 0 YES UNASSIGNED
1 14 1 0 0 YES UNASSIGNED
SQL> alter system set standby_file_management='AUTO' scope=both sid='*';
SQL> alter database open;
alter database recover managed standby database using current logfile disconnect from session;
Database altered.
找到最终问题所在
完成重建后,本以为已经万事大吉了,但一开库应用日志却发现后台日志中所有 redolog 日志组都被清理了。这才意识到问题的根源所在,立马查看了路径相关的参数发现 db_create_online_log_dest 参数设置错误,导致创建出了错误的磁盘组。
jiekeadg:/home/oracle(jiekexu)>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Wed Mar 3 10:52:04 2021
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
SQL> show parameter db_create_online_log_dest_
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_create_online_log_dest_1 string
+JIEKE_DATA
db_create_online_log_dest_2 string
+JIEKE_DATA
db_create_online_log_dest_3 string
db_create_online_log_dest_4 string
db_create_online_log_dest_5 string
SQL> alter system set db_create_online_log_dest_1='
+JIEKER_DATA
' scope=spfile;
System altered.
SQL> alter system set db_create_online_log_dest_2='
+JIEKER_DATA
' scope=spfile;
System altered.
但是现在修改完参数所有的日志组成员也都出现在错误的磁盘组了,因为有了当前日志组 4、10 占用了,重建日志组也行不通,故只能再次重建备库控制文件了。以上主库备份控制文件传到备库恢复的过程再来一遍即可。
SELECT to_char(f.group#) groupno, to_char(l.thread#) thread, f.member member, f.type redo_file_type, l.status group_status, f.status member_status, l.bytes/1024/1024 bytes, l.archived archived
FROM v$logfile f, v$log l WHERE f.group# = l.group# ORDER BY f.group#, f.member;SQL>
SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> 2
Group Thread Member Redo Type Group Status Member Status Size(M) Archived
------ ------ -------------------- ---------- ------------ --------------- -------- ----------
1 1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
3 1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
4 1 +JIEKE_DATA ONLINE CURRENT INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CURRENT INVALID 4,096 YES
5 1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
1 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
7 2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
8 2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
9 2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
10 2 +JIEKE_DATA ONLINE CURRENT INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CURRENT INVALID 4,096 YES
24 2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
2 +JIEKE_DATA ONLINE CLEARING INVALID 4,096 YES
20 rows selected.
最终激活备库
再一次重建后正常恢复同步,然后关闭实例启动到 mount 状态,激活 ADG 备库,重启验证即可。
SQL> shu immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
SQL> startup mount
ORACLE instance started.
Total System Global Area 1.0689E+11 bytes
Fixed Size 2265864 bytes
Variable Size 4.2144E+10 bytes
Database Buffers 6.4425E+10 bytes
Redo Buffers 323678208 bytes
Database mounted.
SQL>
SQL> ALTER DATABASE ACTIVATE STANDBY DATABASE;
Database altered.
SQL> alter database open;
Database altered.
SQL> select DATABASE_ROLE from v$database;
DATABASE_ROLE
--------------------------------
PRIMARY
SQL> shu immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 1.0689E+11 bytes
Fixed Size 2265864 bytes
Variable Size 4.2144E+10 bytes
Database Buffers 6.4425E+10 bytes
Redo Buffers 323678208 bytes
Database mounted.
Database opened.
SQL> exit
那么小伙伴们今天就到这里了,以下地址均可找到我,感兴趣的可添加我个人微信【JiekeXu_DBA】一起交流学习。
————————————————————————————
公众号:JiekeXu DBA之路
墨天轮:https://www.modb.pro/u/4347
CSDN :https://blog.csdn.net/JiekeXu
腾讯云:https://cloud.tencent.com/developer/user/5645107
————————————————————————————