Pre 11.2 Database Issues in 11gR2 Grid Infrastructure Environment (Doc ID 948456.1)
Purpose

This note lists the most common known issues in an 11gR2 Grid Infrastructure (in short, GI) + pre-11gR2 database environment. Even though a workaround is available in some cases, it is recommended to apply patches whenever possible. Refer to note 1064804.1 for instructions on patching in a mixed environment. For CRS PSU/bundle patch information, refer to note 405820.1 for 10.2 and note 810663.1 for 11.1.

Pre 11.2 Database Issues in 11gR2 Grid Infrastructure Environment

1. Error creating or starting a pre-11.2 database

If it happens while creating a database, DBCA fails with ORA-29702 and the traces show:

  ORA-01501: CREATE DATABASE failed
  ORA-00200: control file could not be created
  ORA-00202: control file: '+DG_DATA/racdb/control01.ctl'
  ORA-17502: ksfdcre:4 Failed to create file +DG_DATA/racdb/control01.ctl
  ORA-15001: diskgroup "DG_DATA" does not exist or is not mounted
  ORA-15077: could not locate ASM instance serving a required diskgroup

If it happens while starting an existing database, sqlplus startup fails:

  ORA-01078: failure in processing system parameters
  ORA-01565: error in identifying file '+DATA/prod/spfileprod.ora'
  ORA-17503: ksfdopn:2 Failed to open file +DATA/prod/spfileprod.ora
  ORA-15077: could not locate ASM instance serving a required diskgroup
  ORA-29701: unable to connect to Cluster Manager

The database alert.log shows the following messages:

  ORA-27504: IPC error creating OSD context
  ORA-27300: OS system dependent operation:skgxnqtsz failed with status: 0
  ORA-27301: OS failure message: Error 0
  ORA-27302: failure occurred at: SKGXN not av
  clsssinit ret = 21
  interconnect information is not available from OCR
  WARNING: No cluster interconnect has been specified. Depending on
  the communication driver configured Oracle cluster traffic
  may be directed to the public interface of this machine.
  Oracle recommends that RAC clustered databases be configured with
  a private interconnect for enhanced security and performance.

Solution:

To start a pre-11gR2 database in an 11gR2 Grid Infrastructure environment, the node(s) must be pinned. To pin node(s), execute as root:

  $GRID_HOME/bin/crsctl pin css -n <racnode1> <racnode2> <racnode3>
  $GRID_HOME/bin/olsnodes -t -n

2. If datafiles are located in ASM, DBCA fails to create a database with the error:

  "DBCA could not startup the ASM instance configured on this node.
  To proceed with database creation using ASM you need the ASM
  instance to be up and running. Do you want to recreate the ASM
  instance on this node?"

The DBCA trace (for 10g in $RDBMS_HOME/cfgtoollogs/dbca and for 11g in $ORACLE_BASE/cfgtoollogs/dbca) shows the following exception:

  oracle.sysman.assistants.util.CommonUtils.getListenerProperties(CommonUtils.java:421)
  oracle.sysman.assistants.util.asm.ASMAttributes.getConnection(ASMAttributes.java:150)
  oracle.sysman.assistants.util.asm.ASMInstanceRAC.validateLocalASMConnection(ASMInstanceRAC.java:811)
  oracle.sysman.assistants.util.asm.ASMInstanceRAC.validateASM(ASMInstanceRAC.java:595)
  oracle.sysman.assistants.util.asm.ASMInstanceRAC.validateASM(ASMInstanceRAC.java:522)
  oracle.sysman.assistants.util.asm.ASMInstanceRAC.validateASM(ASMInstanceRAC.java:515)
  oracle.sysman.assistants.dbca.ui.StorageOptionsPage.validate(StorageOptionsPage.java:496)
  oracle.sysman.assistants.util.wizard.WizardPageExt.wizardValidatePage(WizardPageExt.java:206)
  ....
  java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:151)
  java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:145)
  java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:137)
  java.awt.EventDispatchThread.run(EventDispatchThread.java:100)
  [AWT-EventQueue-0] [10:4:5:781] [StorageOptionsPage.validate:611] ASM present but not startable, querying user..
Solution:

Due to unpublished bug 8288940 (fixed in 10.2.0.5), DBCA will fail if database files are located in ASM. Patch 8288940 is platform independent, is available for 10.2.0.3, 10.2.0.4, 11.1.0.6 and 11.1.0.7 as a .jar file, and needs to be applied to the database home. For other RDBMS versions where there is no patch, please refer to the workaround section of bug 8520511 in the Database Readme <Oracle Database Readme 11g Release 2 (11.2) -> Bug 8520511>.

3. SRVCTL fails to start an instance if OCR is located in an ASM diskgroup or has different permission/ownership

The racgimon log (located in $RDBMS_HOME/log/$HOST/racg/imon_${DBNAME}.log) shows the following messages:

  2009-10-17 11:20:22.093: [ OCROSD][7866809875]utopen:6':failed in stat OCR file/disk +DATA, errno=2, os err string=No such file or directory
  2009-10-17 11:20:22.093: [ OCROSD][7866809875]utopen:7:failed to open OCR file/disk +DATA, errno=2, os err string=No such file or directory
  2009-10-17 11:20:22.093: [ OCRRAW][7866809875]proprinit: Could not open raw device
  2009-10-17 11:20:22.093: [ default][7866809875]a_init:7!: Backend init unsuccessful : [26]
  ..
  2009-10-17 11:20:22.094: [ CSSCLNT][7866809875]clsssinit: Unable to access OCR device in OCR init. PROC-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]
  2009-10-17 11:20:22.094: [ RACG][7866809875] [23974][7866809875][ora.default]: racgimon exiting clsz init failed

Solution:

Due to unpublished bug 8262786, if OCR is located in ASM or has different permission/ownership, srvctl will fail to start an earlier database version. The fix for unpublished bug 8262786 is included in 10.2.0.4 CRS PSU4, 10.2.0.5, 11.1.0.7 CRS PSU4, and Windows 10.2.0.4 Patch 36, and needs to be applied to the database home. The workaround is to use sqlplus to start the database instead of srvctl. (Bug 8312004 was closed as a duplicate.)

4. Database fails to start after a restart of GI
$GRID_HOME/log/$HOST/agent/crsd/application_<dbuser>/application_<dbuser>.log shows:

  2009-11-05 14:31:19.922: [ AGFW][1342593344] Agent received the message: RESOURCE_START[ora.db10.db102.inst 1 1] ID 4098:632
  ..
  2009-11-05 14:31:19.924: [ AGFW][1275476288] Executing command: start for resource: ora.db10.db102.inst 1 1
  2009-11-05 14:31:19.924: [ora.db10.db102.inst][1275476288] [start] START action called.
  2009-11-05 14:31:19.924: [ora.db10.db102.inst][1275476288] [start] Executing action script: /home/app/oracle/product/10.2/db/bin/racgwrap[start]
  ..
  2009-11-05 14:31:22.781: [ora.db10.db102.inst][1275476288] [start] Enter user-name: Connected to an idle instance.
  2009-11-05 14:31:22.781: [ora.db10.db102.inst][1275476288] [start]
  2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start] SQL> ORA-01565: error in identifying file '+DATA/db10/spfiledb10.ora'
  2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start] ORA-17503: ksfdopn:2 Failed to open file +DATA/db10/spfiledb10.ora
  2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start] ORA-15077: could not locate ASM instance serving a required diskgroup
  2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start]
  2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start] ORA-01078: failure in processing system parameters

After the GI restart, the status of the diskgroup:

  $GRID_HOME/bin/crsctl stat res -t
  ..
  ora.DATA.dg
        OFFLINE OFFLINE racnode1
        OFFLINE OFFLINE racnode2
  ..

Solution:

Due to unpublished bug 8448079, while stopping GI the ASM init.ora parameter asm_diskgroups is nullified, and some diskgroups remain OFFLINE after a restart of GI, which causes pre-11.2 databases to fail to start. The fix for unpublished bug 8448079 is included in 11.2.0.2; patch 8448079 exists for certain platforms and needs to be applied to the GI home.
The workaround is to add a dependency on the diskgroup to each instance:

  $GRID_HOME/bin/crsctl modify res ora.db10.db102.inst -attr "REQUIRED_RESOURCES='ora.racnode2.ASM2.asm,ora.DATA.dg'"

5. SRVCTL fails to start a service

For example:

  $RDBMS_HOME/bin/srvctl start service -d b1 -s sb1
  $GRID_HOME/bin/crsctl stat res
  ..
  NAME=ora.b1.sb1.cs
  TYPE=application
  TARGET=ONLINE
  STATE=ONLINE on eyrac1f

  NAME=ora.b1.sb1.b11.srv
  TYPE=application
  TARGET=OFFLINE
  STATE=OFFLINE

  NAME=ora.b1.sb1.b12.srv
  TYPE=application
  TARGET=OFFLINE
  STATE=OFFLINE

  NAME=ora.eons
  TYPE=ora.eons.type
  TARGET=ONLINE , ONLINE
  STATE=ONLINE on eyrac1f, ONLINE on eyrac2f

Solution:

Due to unpublished bug 8373758, pre-11.2 srvctl will fail to start a service if the service is followed by an 11gR2 new resource in "crsctl stat res" output. In the above example, ora.eons is a new resource in 11.2 and pre-11.2 srvctl cannot parse its status properly. The fix for unpublished bug 8373758 is included in 10.2.0.4 CRS PSU4, 10.2.0.5 and 11.1.0.7 CRS PSU2, and needs to be applied to the database home. The workaround is to:

A. Create a dummy pre-11.2 resource entry that sorts alphabetically after the service entry. In the example above, creating a dummy resource ora.b2.db should work around the problem:

  $RDBMS_HOME/bin/srvctl add database -d b2 -o $RDBMS_HOME

B. Try to start all services for the database:

  $RDBMS_HOME/bin/srvctl start service -d b1

6. Pre-11.2 database fails to access ASM disks, and setasmgidwrap fails on the pre-11.2 oracle binary

The database reports:

  ORA-15025: could not open disk '/dev/rdsk/disk1'
  ORA-27041: unable to open file
  SVR4 Error: 13: Permission denied

And execution of setasmgidwrap fails with:

  $GRID_HOME/bin/setasmgidwrap o=/home/oracle/10.2/bin/oracle
  KFSG-00312: not an Oracle binary: '/home/oracle/10.2/bin/oracle'

Solution:

Due to bug 9575578, setasmgidwrap fails with a pre-11.2 oracle binary. The fix for bug 9575578 is included in 11.2.0.2; patch 9575578 exists for certain platforms and needs to be applied to the GI home.
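In role-separated installs, setasmgidwrap normally leaves the database binary setgid to the ASM admin group, which is what grants it access to the ASM disks. After patching and rerunning setasmgidwrap, one might confirm the bit is present; a minimal sketch (the helper name and the example path are illustrative assumptions, not commands from this note):

```shell
# Sketch: check whether an oracle binary carries the setgid bit that
# setasmgidwrap sets in role-separated installs. has_asm_gid and the
# example path below are assumptions, not part of this note.
has_asm_gid() {
  if [ -g "$1" ]; then
    echo "setgid present: $1"
  else
    echo "setgid missing: $1 (rerun setasmgidwrap after patching)"
  fi
}

# Example (hypothetical path):
# has_asm_gid /home/oracle/10.2/bin/oracle
```

This only checks the setgid bit; the binary's group ownership still needs to match the ASM admin group for disk access to work.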
7. After removal of the pre-11.2 CRS home, the following errors are reported while trying to start or stop a database, or stop the cluster:

  CRS-5809: Failed to execute 'ACTION_SCRIPT' value of '/ocw/crs10/bin/racgwrap' for 'ora.db10.db'. Error information 'cmd /ocw/crs10/bin/racgwrap not found'
  CRS-2680: Clean of 'ora.db10.db' on 'node1' failed

If GI is being stopped, the following is reported:

  CRS-2794: Shutdown of Cluster Ready Services-managed resources on 'node1' has failed
  CRS-2675: Stop of 'ora.crsd' on 'node1' failed
  CRS-4000: Command Stop failed, or completed with errors.

Solution:

Due to bug 9257105, even when the upgrade finishes successfully, the OCR configuration for a pre-11.2 database still points to the pre-11.2 CRS home. The fix for bug 9257105 is included in 11.2.0.1.2 and 11.2.0.2; unfortunately the fix itself has a regression, which is being worked on in unpublished bug 9678856. The workaround is to:

A. As the pre-11.2 database owner, execute the following command for each pre-11.2 database:

  crsctl modify res ora.<dbname>.db -attr "ACTION_SCRIPT=$GRID_HOME/bin/racgwrap"

Or

B. As the pre-11.2 database owner, recreate the database resource in OCR per note 1069369.1.
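Workaround A above has to be repeated for every pre-11.2 database registered in OCR. A small helper can generate the commands for review before they are run as the database owner; the function name and the sample database names below are assumptions, not part of this note:

```shell
# Sketch: emit the workaround-A command for each pre-11.2 database.
# gen_action_script_fix and the sample names are illustrative
# assumptions; review the output before executing it.
gen_action_script_fix() {
  db="$1"
  grid_home="$2"
  echo "crsctl modify res ora.${db}.db -attr \"ACTION_SCRIPT=${grid_home}/bin/racgwrap\""
}

for db in db10 db11; do      # replace with your database names
  gen_action_script_fix "$db" /u01/app/11.2.0/grid
done
```

Generating the commands first (rather than piping straight into crsctl) makes it easy to check the resource names against "crsctl stat res" output before modifying anything.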
8. A singleton service does not fail over, or a uniform service does not stop, after the local node VIP resource fails or is stopped:

  $DBHOME/bin/srvctl config service -d racstr
  rac_u PREF: racstr1 racstr2 AVAIL:
  rac_s PREF: racstr1 AVAIL: racstr2

  $DBHOME/bin/srvctl status service -d racstr
  Service rac_u is running on instance(s) racstr1, racstr2
  Service rac_s is running on instance(s) racstr1

  $GRID_HOME/bin/crsctl status res ora.strdt01.vip
  NAME=ora.strdt01.vip
  TYPE=ora.cluster_vip_net1.type
  TARGET=ONLINE
  STATE=ONLINE on strdt01

Disable the public network on the node where instance racstr1 is running; the VIP fails over to another node:

  $GRID_HOME/bin/crsctl status res ora.strdt01.vip
  NAME=ora.racha602.vip
  TYPE=ora.cluster_vip_net1.type
  TARGET=ONLINE
  STATE=INTERMEDIATE on racha603                             <== VIP failed over to the other node

  $DBHOME/bin/srvctl status service -d racstr
  Service rac_u is running on instance(s) racstr1, racstr2   <== Service still running on racstr1
  Service rac_s is running on instance(s) racstr1            <== Service did not fail over to racstr2

Solution:

Due to unpublished bug 9039498, a pre-11.2 database service will not fail over or stop if the local public network is down. The fix for bug 9039498 is included in 11.2.0.2 and 12.1, and applies to the GI home.

9. DBCA/srvctl fails to add an instance/database with the following error:

  PRKO-2010 : Error in adding instance to node: node1
  PRKR-1008 : adding of instance dba21 on node node1 to cluster database dba2 failed.
  CRS-2518: Invalid directory path '/home/oracle/product/11.1/db/bin/racgwrap'
  CRS-0241: Invalid directory path

Solution:

Due to bug 9767810, if the pre-11.2 database is not installed on all nodes of the cluster, srvctl fails to add the instance/database to OCR. Bug 9767810 is fixed in 11.2.0.1.2 and applies to the GI home. The workaround is to copy the pre-11.2 db/bin/racgwrap to all nodes in the cluster, and make sure it is accessible by the pre-11.2 database owner.

10. Pre-11.2.0.2 database fails if any private network fails
This happens because a pre-11.2.0.2 instance is not aware of the Redundant Interconnect feature (multiple active cluster_interconnect interfaces in "oifcfg getif"); see note 1210883.1 for more about HAIP.

Solution:

As the HAIP feature cannot be disabled, for an environment with GI 11.2.0.2 and a pre-11.2.0.2 database/ASM, it is recommended to use an OS-level bonding solution for the private network, as in earlier clusterware versions.

11. If OCR is located on ASM, a pre-10.2.0.5 database cannot get the cluster interconnect even though it is configured in OCR:

  $ oifcfg getif
  eth3 120.0.0.0 global public
  eth1 10.1.0.0 global cluster_interconnect

The instance alert.log shows:

  ORA-27504: IPC error creating OSD context
  ORA-27300: OS system dependent operation:skgxnqtsz failed with status: 0
  ORA-27301: OS failure message: Error 0
  ORA-27302: failure occurred at: SKGXN not av
  clsssinit ret = 21
  interconnect information is not available from OCR
  WARNING: No cluster interconnect has been specified. Depending on
  the communication driver configured Oracle cluster traffic
  may be directed to the public interface of this machine.
  Oracle recommends that RAC clustered databases be configured
  with a private interconnect for enhanced security and performance.

Solution:

For an 11gR2 GI + pre-10.2.0.5 database environment, it is not recommended to place OCR on ASM. The other workaround is to set the init.ora parameter cluster_interconnects and ignore the warning. (Bug 9865139 was closed as a duplicate of bug 5389506, which is only fixed in 10.2.0.5 and above.)

12. After a clusterware upgrade, ocrdump shows that the key "SYSTEM.ORA_CRS_HOME" is not updated to the new clusterware home. If the previous clusterware home is renamed or removed, pre-11gR2 SRVM clients (srvctl, DBCA, DBUA etc.) fail:

  $ srvctl <command> database -d racdb
  PRKA-2019 : Error executing command "/ocw/b201/bin/crs_stat". File is missing.

  $ ocrdump -stdout -keyname SYSTEM.ORA_CRS_HOME
  [SYSTEM.ORA_CRS_HOME]
  ORATEXT : /ocw/b201

In this example, /ocw/b201 is the pre-upgrade clusterware home, which was not updated.
Solution:

Due to bug 10231584, the OCR key SYSTEM.ORA_CRS_HOME is not updated during the upgrade. The workaround is to execute the following as root on any node:

  # ${11.2.0.2GI_HOME}/bin/clscfg -upgrade -lastnode -g <asmadmin>

Note: <asmadmin> is typically the oinstall group, but it should be asmadmin in a job role separation environment.

13. A service does not stop/fail over after stopping the corresponding instance

$DB_HOME/log/$HOST/racg/imon_<DB_NAME>.log shows:

  2011-03-14 23:22:52.228: [ RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: CLSR-0521: Event ora.ha.racdb.racdb1.inst.down is rejected by EVM daemon
  2011-03-14 23:22:52.228: [ RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: clsrcepevm: clsrcepostevt status = 17
  2011-03-14 23:22:52.228: [ RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: clsrcep:evm post return 1
  2011-03-14 23:22:54.458: [ RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: CLSR-0521: Event sys.ora.clu.crs.app.trigger is rejected by EVM daemon
  2011-03-14 23:23:06.495: [ RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: clsrcexecut: env _USR_ORA_PFILE=/ocw/grid/racg/tmp/ora.racdb.racdb1.inst.ora
  2011-03-14 23:23:06.495: [ RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: clsrcexecut: cmd = /database/db205/bin/racgeut -e _USR_ORA_DEBUG=0 -e ORACLE_SID=racdb1 540 /database/db205/bin/racgimon stop ora.racdb.racdb1.inst

$GRID_HOME/log/$HOST/evmd/evmd.log shows:

  2011-02-16 06:13:55.965: [ EVMAPP][4163668704] EVMD Started
  ..
  2011-02-16 06:13:55.980: [ EVMD][4163668704] Could not open /ocw/grid/evm/admin/conf/evmdaemon.conf
  Reconfiguration aborted.

Solution:

Due to unpublished bug 12340700, the following files have the wrong permissions:

  ls -l $GRID_HOME/evm/admin/conf

Unpublished bug 12340700 is fixed in 11.2.0.3 and applies to the GI home. The workaround is to fix the permissions manually with the "chmod" command and restart GI.
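The manual chmod workaround might be scripted as below, resetting the three EVM configuration files to the -rw------- mode this note lists as expected. The helper name and the CONF_DIR handling are assumptions on my part; run it as root against the real $GRID_HOME/evm/admin/conf, then restart GI:

```shell
# Sketch: reset the EVM config files to the mode this note lists as
# expected (-rw-------). fix_evm_perms is an illustrative helper,
# not a command from this note; run as root, then restart GI.
fix_evm_perms() {
  conf_dir="$1"
  for f in evm.auth evmdaemon.conf evmlogger.conf; do
    chmod 600 "$conf_dir/$f"
  done
}

# Example:
# fix_evm_perms "$GRID_HOME/evm/admin/conf"
```

Note that chmod alone does not correct ownership; the files are expected to remain owned by root, as shown in the listing below.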
The expected ownership and permissions are:

  -rw------- 1 root root 3032 Feb 19 14:42 evm.auth
  -rw------- 1 root root 2318 Feb 19 14:42 evmdaemon.conf
  -rw------- 1 root root 4871 Feb 19 14:42 evmlogger.conf

While "ls -l $GRID_HOME/evm/admin/conf" shows:

  -rw-r--r-- 1 root root 3032 Feb 19 14:42 evm.auth
  -rw-r--r-- 1 root root 2318 Feb 19 14:42 evmdaemon.conf
  -rw-r--r-- 1 root root 4871 Feb 19 14:42 evmlogger.conf

References

BUG:9257105 - CRSCTL STOP CRS REPORTS CRS-5809
BUG:9575578 - KFSG-00312: NOT AN ORACLE BINARY, USING SETASMGIDWRAP
NOTE:1064804.1 - Apply Grid Infrastructure/CRS Patch in Mixed environment
NOTE:1069369.1 - How to Delete or Add Resource to OCR
NOTE:1210883.1 - 11gR2 Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip
NOTE:405820.1 - 10.2.0.X CRS Bundle Patch Information
NOTE:810663.1 - 11.1.0.X CRS Bundle Patch Information