CLUSTER DE-CONFIGURING AND RE-CONFIGURING
It's Easy with AJEET
In the section below I have provided the complete steps to de-configure and re-configure the cluster on a 3-node RAC.
This was done after the hostname and IP address of all 3 nodes had been changed.
When the hostname and IP address change, the existing cluster stops working, so it is very important to bring the cluster back up so that your database runs as well as it did before.
This can be done in two ways:
1. Delete and add node (which I am going to post in another section)
2. De-configuring and Re-configuring cluster.
I prefer doing it the second way. You may ask why?
The reason is simple: think of a RAC environment with more nodes (maybe 5 to 10),
and how much time it would take to delete each node and add it back again.
Grid Infrastructure Cluster - Entire Cluster
Deconfiguring and reconfiguring the entire cluster rebuilds the OCR and voting disks; user resources (database, instance, service, listener, etc.) will need to be added back to the cluster manually after the reconfiguration finishes.
Why is deconfigure needed?
Deconfigure is needed when:
- OCR is corrupted without any good backup
- Or the GI stack will not come up on any node due to missing Oracle Clusterware related files in /etc or /var/opt/oracle (e.g. init.ohasd missing). If GI is able to come up on at least one node, refer to the section "B. Grid Infrastructure Cluster - One or Partial Nodes" instead.
- $GRID_HOME should be intact in either case, as deconfigure will NOT fix $GRID_HOME corruption.
In the case below, the IP addresses and hostnames of all the nodes have been changed.
PRECHECK BEFORE PERFORMING DE-CONFIGURE OF CLUSTER
1. Before de-configuring a node, ensure it is not pinned:
$GI_HOME/bin/olsnodes -s -t
node1 Active Unpinned
node2 Active Unpinned
node3 Active Unpinned
2. If a node is pinned, unpin it first, i.e. as root user:
/oracle/grid/product/11.2.0/grid/bin/crsctl unpin css -n <node_name>
3. Before de-configuring, collect the following as the grid user, if possible, to generate a list of the user resources that will need to be added back to the cluster after the reconfiguration finishes (a small capture script is sketched after this list):
$GRID_HOME/bin/crsctl stat res -t
$GRID_HOME/bin/crsctl stat res -p
$GRID_HOME/bin/crsctl query css votedisk
$GRID_HOME/bin/ocrcheck
$GRID_HOME/bin/oifcfg getif
$GRID_HOME/bin/srvctl config nodeapps -a
$GRID_HOME/bin/srvctl config scan
$GRID_HOME/bin/srvctl config asm -a
$GRID_HOME/bin/srvctl config listener -l <listener-name> -a
$DB_HOME/bin/srvctl config database -d <dbname> -a
$DB_HOME/bin/srvctl config service -d <dbname> -s <service-name> -v
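To avoid losing any of this output, the whole list can be captured into a backup directory in one pass. This is only a sketch: $GRID_HOME, $DB_HOME and the database name (accup11, as used later in this article) are assumed to be set for your environment, and the listener/service commands should be added with your own names. Run it as the grid/oracle user:
# capture the current cluster configuration before the deconfigure
BKP=/tmp/cluster_precheck_$(date +%Y%m%d)
mkdir -p $BKP
$GRID_HOME/bin/crsctl stat res -t         > $BKP/crs_res_t.txt
$GRID_HOME/bin/crsctl stat res -p         > $BKP/crs_res_p.txt
$GRID_HOME/bin/crsctl query css votedisk  > $BKP/votedisk.txt
$GRID_HOME/bin/ocrcheck                   > $BKP/ocrcheck.txt
$GRID_HOME/bin/oifcfg getif               > $BKP/oifcfg.txt
$GRID_HOME/bin/srvctl config nodeapps -a  > $BKP/nodeapps.txt
$GRID_HOME/bin/srvctl config scan         > $BKP/scan.txt
$GRID_HOME/bin/srvctl config asm -a       > $BKP/asm.txt
$DB_HOME/bin/srvctl config database -d accup11 -a > $BKP/db_accup11.txt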
TO DECONFIG THE CLUSTERWARE
- If OCR and Voting Disks are NOT on ASM, or if OCR and Voting Disks are on ASM but there is NO user data in the OCR/Voting Disk ASM diskgroup:
On all remote nodes, as root execute:
# /oracle/grid/product/11.2.0/grid/crs/install/rootcrs.pl -deconfig -force -verbose
Once the above command finishes on all remote nodes, on local node, as root execute:
# /oracle/grid/product/11.2.0/grid/crs/install/rootcrs.pl -deconfig -force -verbose -lastnode
- If there is user data in the OCR/Voting Disk ASM diskgroup:
# $GRID_HOME/crs/install/rootcrs.pl -deconfig -force -verbose -keepdg -lastnode
We did not have any user data in the OCR/voting disk diskgroup, so we used:
# $GRID_HOME/crs/install/rootcrs.pl -deconfig -force -verbose -lastnode
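Once rootcrs.pl -deconfig has finished on every node, it is worth confirming that the stack is really down before moving on. A minimal check, assuming the grid home path used in this article; both commands are expected to show that nothing is running:
/oracle/grid/product/11.2.0/grid/bin/crsctl check crs
ps -ef | egrep 'ohasd|crsd|ocssd' | grep -v grep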
Once the de-configuration has completed, follow the steps below before re-configuring the cluster.
Clean up the profile.xml files
The profile.xml files contain the old IP addresses, so for the new IP addresses to be written into profile.xml we have to clean out the old files, located under:
$GRID_HOME/gpnp/node1/profiles/peer
If we do not clean up the profile.xml files we may hit the issue below while executing the root.sh script.
ERROR:
CRS-2676: Start of 'ora.cssd' on 'node1' succeeded
Start of resource "ora.cluster_interconnect.haip" failed
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'node1'
CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
Start action for HAIP aborted. For details refer to "(:CLSN00107:)" in "/oracle/grid/product/11.2.0/grid/log/node1/agent/ohasd/orarootagent_root/orarootagent_root.log".
CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'node1' failed
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'node1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'node1' succeeded
CRS-4000: Command Start failed, or completed with errors.
HAIP startup failure considered fatal, terminating at /oracle/grid/product/11.2.0/grid/crs/install/crsconfig_lib.pm line 1330.
/oracle/grid/product/11.2.0/grid/perl/bin/perl -I/oracle/grid/product/11.2.0/grid/perl/lib -I/oracle/grid/product/11.2.0/grid/crs/install /oracle/grid/product/11.2.0/grid/crs/install/rootcrs.pl execution failed
****************************************************************************
TO CLEAN PROFILE.XML AND THE CHECKPOINT FILE, USE THE FOLLOWING COMMANDS
a. Steps 1 and 2 can be skipped on node(s) where root.sh has not been executed this time.
b. Steps 1 and 2 should remove the checkpoint file. To verify:
$ ls -l /oracle/grid/oracle_base/Clusterware/ckptGridHA_.xml
If it is still there, remove it manually with the rm command on all nodes.
c. If the GPnP profile differs between nodes/setups, clean it up on all nodes as the grid user (a verification sketch follows this list):
$ find /oracle/grid/product/11.2.0/grid/gpnp/* -type f -exec rm -rf {} \;
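A quick way to confirm that both the checkpoint file and the GPnP profiles are really gone on each node; a sketch only, the oracle_base path is the same placeholder used above and should be replaced with your own. Both commands should return nothing once the cleanup is complete:
ls -l /oracle/grid/oracle_base/Clusterware/ckptGridHA_*.xml 2>/dev/null
find /oracle/grid/product/11.2.0/grid/gpnp -type f 2>/dev/null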
Clean up the OCR_VOTE disks (the disks which contain the OCR and voting files).
If we do not clean the OCR_VOTE disks we may hit the error below at the time of executing root.sh.
ERROR:
bash: /root/.bashrc: Permission denied
Disk Group OCR_VOTE mounted successfully.
Existing OCR configuration found, aborting the configuration. Rerun configuration setup after deinstall at /oracle/grid/product/11.2.0/grid/crs/install/crsconfig_lib.pm line 10302.
/oracle/grid/product/11.2.0/grid/perl/bin/perl -I/oracle/grid/product/11.2.0/grid/perl/lib -I/oracle/grid/product/11.2.0/grid/crs/install /oracle/grid/product/11.2.0/grid/crs/install/rootcrs.pl execution failed
To clear the ASM OCR_VOTE disks, follow the steps below.
NOTE:
Before performing the steps below, make sure you are using the correct disk names.
Clear the disk headers so ASM no longer sees the old diskgroup:
dd if=/dev/zero of=/dev/xvdd1 bs=1048576 count=10
dd if=/dev/zero of=/dev/xvde1 bs=1048576 count=10
dd if=/dev/zero of=/dev/xvdf1 bs=1048576 count=10
Delete the OCR_VOTE DISKS
oracleasm deletedisk OCR_VOTE3
oracleasm deletedisk OCR_VOTE2
oracleasm deletedisk OCR_VOTE1
Create the OCR_VOTE DISKS
/etc/init.d/oracleasm createdisk OCR_VOTE1 /dev/xvdd1
/etc/init.d/oracleasm createdisk OCR_VOTE2 /dev/xvde1
/etc/init.d/oracleasm createdisk OCR_VOTE3 /dev/xvdf1
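After recreating the disks on the first node, it helps to confirm that the new labels are visible on every node. A short sketch using the standard oracleasm commands; run the scandisks step on the remaining nodes so they pick up the freshly created labels:
/etc/init.d/oracleasm listdisks          (on node1: the three OCR_VOTE labels should be listed)
/etc/init.d/oracleasm scandisks          (on node2 and node3: rescan for the new labels)
/etc/init.d/oracleasm listdisks          (on node2 and node3: confirm the labels are visible)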
TO CONFIGURE CLUSTER
export DISPLAY=HOSTNAME:0.0
$GRID_HOME/crs/config/config.sh
Follow the instructions in the GUI.
Run root.sh first on NEW_NODE1; after it completes successfully, execute it on NEW_NODE2 and then on NEW_NODE3.
ISSUES FACED AT THE GUI
*****************
If you face errors at the GUI as below,
just stop at that step, execute the below command, and click on RETRY:
$GRID_HOME/oui/bin/runInstaller -nowait -noconsole -waitforcompletion -ignoreSysPrereqs -updateNodeList -silent CRS=true "CLUSTER_NODES={node1,node2,node3}" ORACLE_HOME=GRID_HOME_path
Even after executing the above command, if the error still exists, click on Ignore and then click on Next.
Once the configuration has completed, execute the commands below to update the oraInventory.
1. Remove the old, incorrect CRS home entry from inventory.xml:
$GRID_HOME/oui/bin/runInstaller -detachHome -local ORACLE_HOME=GRID_HOME_path
2. Rerun the failed attachHome command (note that LOCAL_NODE changes per node):
$GRID_HOME/oui/bin/runInstaller -attachHome -noClusterEnabled ORACLE_HOME=GRID_HOME_path ORACLE_HOME_NAME=Ora11g_gridinfrahome1 "CLUSTER_NODES={node1,node2,node3}" "INVENTORY_LOCATION=/oracle/app/oraInventory" LOCAL_NODE=node1
$GRID_HOME/oui/bin/runInstaller -attachHome -noClusterEnabled ORACLE_HOME=GRID_HOME_path ORACLE_HOME_NAME=Ora11g_gridinfrahome1 "CLUSTER_NODES={node1,node2,node3}" "INVENTORY_LOCATION=/oracle/app/oraInventory" LOCAL_NODE=node2
$GRID_HOME/oui/bin/runInstaller -attachHome -noClusterEnabled ORACLE_HOME=GRID_HOME_path ORACLE_HOME_NAME=Ora11g_gridinfrahome1 "CLUSTER_NODES={node1,node2,node3}" "INVENTORY_LOCATION=/oracle/app/oraInventory" LOCAL_NODE=node3
3. Mark the new home as the CRS home (CRS=true):
$GRID_HOME/oui/bin/runInstaller -local -updateNodeList ORACLE_HOME=GRID_HOME_path "CLUSTER_NODES={node1,node2,node3}" CRS="true"
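After these three steps, the inventory can be checked directly. A minimal sketch, assuming the inventory location used above (/oracle/app/oraInventory); the grid home entry should now carry CRS="true" and list all three nodes:
# the grid home entry should show CRS="true" and all three cluster nodes
grep -A4 'Ora11g_gridinfrahome1' /oracle/app/oraInventory/ContentsXML/inventory.xml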
If you face an issue at the GUI as below,
the cluvfy failure can be ignored, as it is only a verification of the state of your cluster.
Mount the diskgroups
. oraenv
+ASM1
sqlplus '/ as sysasm'
SQL> desc v$asm_diskgroup
Name Null? Type
----------------------------------------- -------- ----------------------------
GROUP_NUMBER NUMBER
NAME VARCHAR2(30)
SECTOR_SIZE NUMBER
BLOCK_SIZE NUMBER
ALLOCATION_UNIT_SIZE NUMBER
STATE VARCHAR2(11)
TYPE VARCHAR2(6)
TOTAL_MB NUMBER
FREE_MB NUMBER
HOT_USED_MB NUMBER
COLD_USED_MB NUMBER
REQUIRED_MIRROR_FREE_MB NUMBER
USABLE_FILE_MB NUMBER
OFFLINE_DISKS NUMBER
COMPATIBILITY VARCHAR2(60)
DATABASE_COMPATIBILITY VARCHAR2(60)
VOTING_FILES VARCHAR2(1)
SQL> select NAME,STATE from v$asm_diskgroup;
NAME STATE
------------------------------ -----------
OCR_VOTE MOUNTED
DG_ARCHIVE1 DISMOUNTED
DG_DATA1 DISMOUNTED
DG_REDO1 DISMOUNTED
SQL> alter diskgroup DG_ARCHIVE1 mount;
Diskgroup altered.
SQL> select NAME,STATE from v$asm_diskgroup;
NAME STATE
------------------------------ -----------
OCR_VOTE MOUNTED
DG_ARCHIVE1 MOUNTED
DG_DATA1 DISMOUNTED
DG_REDO1 DISMOUNTED
SQL> alter diskgroup DG_DATA1 mount;
Diskgroup altered.
SQL> alter diskgroup DG_REDO1 mount;
Diskgroup altered.
SQL> select NAME,STATE from v$asm_diskgroup;
NAME STATE
------------------------------ -----------
OCR_VOTE MOUNTED
DG_ARCHIVE1 MOUNTED
DG_DATA1 MOUNTED
DG_REDO1 MOUNTED
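As an alternative to mounting each diskgroup one by one, ASM also accepts a single statement that mounts every diskgroup it can see; already-mounted groups may be reported with an error that can be ignored:
SQL> alter diskgroup all mount;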
Try to start up the database:
root@NEW_node1:$ORACLE_HOME/bin# ./srvctl start database -d accup11
PRCD-1120 : The resource for database accup11 could not be found.
PRCR-1001 : Resource ora.accup11.db does not exist
If you face an issue like the one above,
check whether the database resources have been added to the cluster:
crsctl stat res -t
If no database resources are registered, try to start the database by connecting through SQL*Plus.
Start up the instance on each node: set the SID with oraenv, then start the instance from SQL*Plus.
. oraenv
NEW_NODE1  ACCUP111
NEW_NODE2  ACCUP112
NEW_NODE3  ACCUP113
sqlplus '/ as sysdba'
SQL> startup
If it’s get startup without any issue.
Then just add the data base resource to cluster by following command.
srvctl add database -d accup11 -o $ORACLE_HOME +DG_DATA1/accup11/spfileaccup11.ora -a DG_ARCHIVE1,DG_DATA1,DG_REDO1
($ORACLE_HOME is the path of your oracle home)
crsctl stat res -t
ora.dbname.db
1 OFFLINE OFFLINE
1 ONLINE ONLINE node1
If the instances are not added to the cluster, add them using the commands below.
1. srvctl add instance -d dbname -i instance1 -n node1
2. srvctl add instance -d dbname -i instance2 -n node2
3. srvctl add instance -d dbname -i instance3 -n node3
Verify the database configuration:
srvctl config database -d dbname
Check again:
crsctl stat res -t
ora.dbname.db
1 OFFLINE OFFLINE
2 OFFLINE OFFLINE
3 OFFLINE OFFLINE
Now shut down the manually started instances on all the nodes, then start the database with the following command:
./srvctl start database -d dbname
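Before checking the OS processes, the database state can also be confirmed from srvctl itself:
$ORACLE_HOME/bin/srvctl status database -d dbname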
oracle@new_node1:/oracle/grid/product/11.2.0/grid/bin$ ps -aef |grep smon
root 18023 1 2 04:31 ? 00:01:34 /oracle/grid/product/11.2.0/grid/bin/osysmond.bin
grid 18162 1 0 04:31 ? 00:00:00 asm_smon_+ASM1
oracle 26572 1 0 05:29 ? 00:00:00 ora_smon_dbname
oracle 26835 25608 0 05:29 pts/0 00:00:00 grep smon
Check again
oracle@new_node1:$GRID_HOME/bin$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DG_ARCHIVE1.dg
ONLINE ONLINE new_node1
ONLINE ONLINE new_node2
ONLINE ONLINE new_node3
ora.DG_DATA1.dg
ONLINE ONLINE new_node1
ONLINE ONLINE new_node2
ONLINE ONLINE new_node3
ora.DG_REDO1.dg
ONLINE ONLINE new_node1
ONLINE ONLINE new_node2
ONLINE ONLINE new_node3
ora.LISTENER.lsnr
ONLINE ONLINE new_node1
ONLINE ONLINE new_node2
ONLINE ONLINE new_node3
ora.OCR_VOTE.dg
ONLINE ONLINE new_node1
ONLINE ONLINE new_node2
ONLINE ONLINE new_node3
ora.asm
ONLINE ONLINE new_node1 Started
ONLINE ONLINE new_node2 Started
ONLINE ONLINE new_node3 Started
ora.gsd
OFFLINE OFFLINE new_node1
OFFLINE OFFLINE new_node2
OFFLINE OFFLINE new_node3
ora.net1.network
ONLINE ONLINE new_node1
ONLINE ONLINE new_node2
ONLINE ONLINE new_node3
ora.ons
ONLINE ONLINE new_node1
ONLINE ONLINE new_node2
ONLINE ONLINE new_node3
ora.registry.acfs
ONLINE ONLINE new_node1
ONLINE ONLINE new_node2
ONLINE ONLINE new_node3
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE new_node2
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE new_node3
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE new_node1
ora.accup11.db
1 ONLINE ONLINE new_node1 Open
2 ONLINE ONLINE new_node2 Open
3 ONLINE ONLINE new_node3 Open
ora.cvu
1 ONLINE ONLINE new_node1
ora.new_node1.vip
1 ONLINE ONLINE new_node1
ora.new_node2.vip
1 ONLINE ONLINE new_node2
ora.new_node3.vip
1 ONLINE ONLINE new_node3
ora.oc4j
1 ONLINE ONLINE new_node1
ora.scan1.vip
1 ONLINE ONLINE new_node2
ora.scan2.vip
1 ONLINE ONLINE new_node3
ora.scan3.vip
1 ONLINE ONLINE new_node1
oracle@new_node1:$GRID_HOME/bin$
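As a final check, the rebuilt OCR, voting disks and SCAN configuration can be compared with the output captured before the deconfigure:
$GRID_HOME/bin/ocrcheck
$GRID_HOME/bin/crsctl query css votedisk
$GRID_HOME/bin/srvctl config scan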
To stop the database and the cluster resources on RAC:
$ORACLE_HOME/bin/srvctl stop database -d DBNAME
$GRID_HOME/bin/crsctl stop cluster -all
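Note that crsctl stop cluster -all stops the clusterware stack on all nodes but leaves OHASD running; if the full Grid Infrastructure stack needs to come down on a node (for example before OS maintenance), run the following as root on that node:
# $GRID_HOME/bin/crsctl stop crs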
That's it, you are good to go.
Please subscribe for the latest updates.