Oracle 19c RAC 遇到的几个问题
作者 | JiekeXu
来源 |公众号 JiekeXu DBA之路(ID: JiekeXu_IT)
大家好,我是 JiekeXu,很高兴又和大家见面了,今天和大家一起来看看 Oracle 19c RAC 遇到的几个问题,欢迎点击上方蓝字关注我,标星或置顶,更多干货第一时间到达!
Oracle19c 作为长期支持的大版本,是很多公司和个人选择的主流数据库版本,很多公司新上线的系统也都是以 19C 为主,也有很多企业渐渐地迁移数据库到19C,11204 版本已经逐渐退出了舞台。下面对 Oracle 19C版本遇到的几个最主要的问题做个简单介绍。
1、Fock() 炸弹
19 RAC 使用 srvctl 无法正常启动,reboot 重启也不会随着 OS 启动而启动数据库实例。
现象如下图:
Alert 日志报错
对应 trace
猜测可能是资源限制问题,但查了很多都是正常。
ipcs –ma 查看 Oracle 共享内存段
ipcs -q
对应时间段的 OS message
Grid 用户资源限制情况,基本上都是无限制。
Oracle 用户资源限制情况,基本上都是无限制
使用 strace 跟踪启动过程
strace–o /tmp/strace2.log srvctl start database -d teststb –i teststb2
strace–o /tmp/strace1.log srvctl start database -d teststb -i teststb1
其他资源限制查看
cat /etc/security/limits.conf
cat /etc/systemd/system.conf
将此参数打开修改为 infinity 然后重启主机则恢复正常。
cat /etc/systemd/system.conf|grepDefaultTasksMax
#DefaultTasksMax=512
DefaultTasksMax=infinity
Database And ASM InstanceOra-27300 OS System Dependent Operation Fork Failed With Status 11 (Doc ID2331884.1)
systemd limited maximumnumber of tasks that may be created on the node.This setting will alsoaffect maxpid value on the OS.
主要是MAX_PID
修改DefaultTasksMax
从SLES 12 SP2开始引入了PID cgroup controller,限制fork()的并发数,避免fork()炸弹。由配置参数DefaultTasksMax控制,缺省值为512,该值比较小,不适用于数据库运行的环境,需要修改为大于等于65536,在此修改为infinity(无限制)。
如果在DBCA期间抛出如下图所示错误,则可能是由于DefaultTasksMax值过小导致的。为了避免DBCA出现错误,提前修改该参数值。
解决办法:
1)、查看当前值:
test1:~ #systemctl show --property DefaultTasksMax
DefaultTasksMax=512
2)、编辑/etc/systemd/system.conf,取消DefaultTasksMax该行注释,并修改值为"infinity".
# vi /etc/systemd/system.conf
DefaultTasksMax=infinity
3)、reboot OS
4)、查看修改后的值
oracle@test1:~>systemctl show --property DefaultTasksMax
DefaultTasksMax=18446744073709551615
2、客户端问题 ORA-28040
前置条件:已安装Oracle11g客户端,配置好环境变量,用PL/SQL Developer登录数据库
解决方案:
在oracle 19C服务器端oracle用户下:
cd $ORACLE_HOME/network/admin目录下 新建文件 sqlnet.ora
vi sqlnet.ora
SQLNET.ALLOWED_LOGON_VERSION_SERVER=10;
SQLNET.ALLOWED_LOGON_VERSION_CLIENT=10;
②在服务器端,管理员身份登录,重新修改密码。
sqlplus / as sysdba
alter session set container=服务名; --如果使用 pdb 需要切换到 pdb 修改。
alter user 用户名 identified by 密码;
注意:配置完服务器端的sqlnet.ora文件后,务必要重新修改密码,否则仍登录失败
3、打补丁报错
打补丁报错 Prerequisite check "CheckActiveFilesAndExecutables" failed。
testrac1:~# /app/app/19.0.0.0/grid/OPatch/opatchauto apply /app/soft/32226239
OPatchautosession is initiated at Thu Mar 415:27:50 2022
Systeminitialization log file is /app/app/19.0.0.0/grid/cfgtoollogs/opatchautodb/systemconfig2022-03-04_03-27-53PM.log.
Sessionlog file is /app/app/19.0.0.0/grid/cfgtoollogs/opatchauto/opatchauto2022-03-04_03-28-19PM.log
Theid for this session is U4JF
ExecutingOPatch prereq operations to verify patch applicability on home/app/app/oracle/product/19.0.0.0/dbhome_1
ExecutingOPatch prereq operations to verify patch applicability on home/app/app/19.0.0.0/grid
Patchapplicability verified successfully on home /app/app/19.0.0.0/grid
Patchapplicability verified successfully on home/app/app/oracle/product/19.0.0.0/dbhome_1
Executingpatch validation checks on home /app/app/19.0.0.0/grid
Patchvalidation checks successfully completed on home /app/app/19.0.0.0/grid
Executingpatch validation checks on home /app/app/oracle/product/19.0.0.0/dbhome_1
Patchvalidation checks successfully completed on home/app/app/oracle/product/19.0.0.0/dbhome_1
VerifyingSQL patch applicability on home /app/app/oracle/product/19.0.0.0/dbhome_1
"/bin/sh-c 'cd /app/app/oracle/product/19.0.0.0/dbhome_1;ORACLE_HOME=/app/app/oracle/product/19.0.0.0/dbhome_1 ORACLE_SID=piccyx1/app/app/oracle/product/19.0.0.0/dbhome_1/OPatch/datapatch -prereq-verbose'" command failed with errors. Please refer to logs for more details.SQL changes, if any, can be analyzed by manually retrying the same command.
SQLpatch applicability verified successfully on home/app/app/oracle/product/19.0.0.0/dbhome_1
Preparingto bring down database service on home /app/app/oracle/product/19.0.0.0/dbhome_1
Successfullyprepared home /app/app/oracle/product/19.0.0.0/dbhome_1 to bring down databaseservice
Bringingdown CRS service on home /app/app/19.0.0.0/grid
Prepatchoperation log file location: /app/app/grid/crsdata/testrac1/crsconfig/crs_prepatch_apply_inplace_testrac1_2022-03-04_03-29-01PM.log
CRSservice brought down successfully on home /app/app/19.0.0.0/grid
Performingprepatch operation on home /app/app/oracle/product/19.0.0.0/dbhome_1
Perpatchoperation completed successfully on home/app/app/oracle/product/19.0.0.0/dbhome_1
Startapplying binary patch on home /app/app/oracle/product/19.0.0.0/dbhome_1
Failedwhile applying binary patches on home /app/app/oracle/product/19.0.0.0/dbhome_1
Executionof [OPatchAutoBinaryAction] patch action failed, check log for more details.Failures:
PatchTarget : testrac1->/app/app/oracle/product/19.0.0.0/dbhome_1 Type[rac]
Details:[
---------------------------PatchingFailed---------------------------------
Commandexecution failed during patching in home:/app/app/oracle/product/19.0.0.0/dbhome_1, host: testrac1.
Commandfailed: /app/app/oracle/product/19.0.0.0/dbhome_1/OPatch/opatchauto apply /app/soft/32226239 -oh/app/app/oracle/product/19.0.0.0/dbhome_1 -target_type rac_database -binary-invPtrLoc /app/app/19.0.0.0/grid/oraInst.loc -jre/app/app/19.0.0.0/grid/OPatch/jre -persistresult/app/app/oracle/product/19.0.0.0/dbhome_1/opatchautocfg/db/sessioninfo/sessionresult_testrac1_rac_2.ser-analyzedresult /app/app/oracle/product/19.0.0.0/dbhome_1/opatchautocfg/db/sessioninfo/sessionresult_analyze_testrac1_rac_2.ser
Commandfailure output:
==Followingpatches FAILED in apply:
Patch:/app/soft/32226239/32218454
Log:/app/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatchauto/core/opatch/opatch2022-03-04_15-30-20PM_1.log
Reason: Failed during Patching:oracle.opatch.opatchsdk.OPatchException: Prerequisite check"CheckActiveFilesAndExecutables" failed.
Afterfixing the cause of failure Run opatchauto resume]
OPATCHAUTO-68061:The orchestration engine failed.
OPATCHAUTO-68061:The orchestration engine failed with return code 1
OPATCHAUTO-68061:Check the log for more details.
OPatchAutofailed.
OPatchautosession completed at Thu Mar 4 15:31:152022
Timetaken to complete the session 3 minutes, 25 seconds
opatchauto failed with error code 42
查看日志
tail -100f/app/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatchauto/core/opatch/opatch2022-03-04_15-30-20PM_1.log
[Mar4, 2022 3:31:14 PM] [INFO] Finishfuser command /bin/fuser /app/app/oracle/product/19.0.0.0/dbhome_1/bin/expdp atThu Mar 04 15:31:14 CST 2021
[Mar4, 2022 3:31:14 PM] [INFO] Followingactive executables are not used by opatch process : /app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1 /app/app/oracle/product/19.0.0.0/dbhome_1/lib/libsqlplus.so
Followingactive executables are used by opatch process :
[Mar4, 2022 3:31:14 PM] [INFO] Prerequisite check "CheckActiveFilesAndExecutables" failed.
The detailsare:
Following active executables are not used byopatch process :
/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1 /app/app/oracle/product/19.0.0.0/dbhome_1/lib/libsqlplus.so
Following active executables are used byopatch process :
[Mar4, 2022 3:31:15 PM] [SEVERE] OUI-67073:UtilSessionfailed: Prerequisite check "CheckActiveFilesAndExecutables" failed.
[Mar4, 2022 3:31:15 PM] [INFO] FinishingUtilSession at Thu Mar 04 15:31:15 CST 2022
[Mar4, 2021 3:31:15 PM] [INFO] Log filelocation: /app/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatchauto/core/opatch/opatch2022-03-04_15-30-20PM_1.log
查看进程占用情况
fuser/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1:89500m 109054m
kill 掉占用的进程
kill-9 89500
kill-9 109054
验证没有进程占用
fuser/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libsqlplus.so
fuser /app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
继续打补丁
testrac1:~# /app/app/19.0.0.0/grid/OPatch/opatchauto resume
4、CRS-6706 patchlevel 不一致
19.3.RAC 的 2 节点打补丁失败,导致2节点集群无法启动报错:
[root@testrac2 soft]#/u01/app/19.0.0.0/grid/bin/crsctl start crs
CRS-6706: OracleClusterware Release patch level ('4203896349') does not match Software patchlevel ('724960844'). Oracle Clusterware cannot be started.
CRS-4000: Command Startfailed, or completed with errors.
从报错补丁 patch level 不一致导致。然后查看 mos
1. Run the following command as the root user tocomplete the patching set up behind the scenes:
#GI_HOME/bin:> ./clscfg -localpatch
2. Run the following command as the root user tolock the GI home:
#GI_HOME/crs/install:> ./rootcrs.sh -lock
3. Run the following command as the root user tostart the GI:
#GI_HOME/bin:> ./crsctl start crs
执行:
[root@testrac2 bin]#./clscfg -localpatch
[root@testrac2install]# ./rootcrs.sh -lock
[root@testrac2 bin]#./crsctl start crs
然后集群可以启动。
参考文档:
CRS-6706: OracleClusterware Release patch level ('nnn') does not match Software patch level('mmm') (文档 ID 1639285.1)
Patching 12.2.0.1 GridInfrastructure gives error CRS-6706: Oracle Clusterware Release Patch Level('748994161') Does Not Match Software Patch Level (文档 ID 2348013.1)
5、内存大页配置过低 ORA-27106
由于内存大页配置小于 SGA 导致数据库无法启动。可调大HugePages内存大页或者改小 SGA,对于HugePages内存大页设置问题后面单独说明。
推荐阅读 Mos 文档:
最常见的5个导致 RAC 实例崩溃的问题 (文档 ID1549191.1)
诊断 GridInfrastructure 启动问题 (文档 ID1623340.1)
RAC 环境中最常见的 5 个数据库和/或实例性能问题 (文档 ID 1602076.1)
如何诊断 11.2 集群节点驱逐问题 (文档 ID 1674872.1)
Grid Infrastructure 启动的五大问题 (文档 ID 1526147.1)
Crs Components Are NotStarting After Server Reboot (文档 ID1152653.1)
Troubleshooting 10g and 11.1Clusterware Reboots (文档 ID265769.1)
Troubleshooting 11.2 or 12.1Grid Infrastructure root.sh Issues (文档 ID 1053970.1)
11gR2 Clusterware 和 Grid Home – 你需要知道的事 (文档 ID 2225748.1)
VIP, SCAN VIP/Listener FailsOver and Listener Stops After Short Public Network Hiccup (文档 ID 1333165.1)
How to change Hostname / IPfor a Grid Infrastructure Oracle Restart Standalone Configuration (SIHA) (文档 ID 1552810.1)
How to disable Automatic VIPfailback (文档 ID1280218.1)
导致 Scan VIP和 Scan Listener(监听程序)出现故障的最常见的5 个问题 (文档 ID 1602038.1)
How To Configure Server SideTransparent Application Failover (文档 ID 460982.1)
How To Configure Server SideTransparent Application Failover (文档 ID 460982.1)
Rebalance of one diskgroupdismounted another diskgroup as disks were offlined (文档 ID 1525330.1)最常见的5个导致 RAC 实例崩溃的问题 (文档 ID 1549191.1)
诊断 GridInfrastructure 启动问题 (文档 ID1623340.1)
RAC 环境中最常见的 5 个数据库和/或实例性能问题 (文档 ID 1602076.1)
如何诊断 11.2 集群节点驱逐问题 (文档 ID 1674872.1)
Grid Infrastructure 启动的五大问题 (文档 ID 1526147.1)
Crs Components Are NotStarting After Server Reboot (文档 ID1152653.1)
Troubleshooting 10g and 11.1Clusterware Reboots (文档 ID265769.1)
Troubleshooting 11.2 or 12.1Grid Infrastructure root.sh Issues (文档 ID 1053970.1)
11gR2 Clusterware 和 Grid Home – 你需要知道的事 (文档 ID 2225748.1)
VIP, SCAN VIP/Listener FailsOver and Listener Stops After Short Public Network Hiccup (文档 ID 1333165.1)
How to change Hostname / IPfor a Grid Infrastructure Oracle Restart Standalone Configuration (SIHA) (文档 ID 1552810.1)
How to disable Automatic VIPfailback (文档 ID1280218.1)
导致 Scan VIP和 Scan Listener(监听程序)出现故障的最常见的5 个问题 (文档 ID 1602038.1)
How To Configure Server SideTransparent Application Failover (文档 ID 460982.1)
How To Configure Server SideTransparent Application Failover (文档 ID 460982.1)
Rebalance of one diskgroupdismounted another diskgroup as disks were offlined (文档 ID 1525330.1)
全文完,希望可以帮到正在阅读的你,如果觉得有帮助,可以分享给你身边的朋友,你关心谁就分享给谁,一起学习共同进步~~~
❤️ 欢迎关注我的公众号,来一起玩耍吧!!!
————————————————————————————
公众号:JiekeXu DBA之路
墨天轮:https://www.modb.pro/u/4347
CSDN :https://blog.csdn.net/JiekeXu
腾讯云:https://cloud.tencent.com/developer/user/5645107
————————————————————————————