Last night my 2-node RAC server went down for OS patching and was rebooted, but after the reboot the CRS resources did not come up on either node.
Connect as the root user and check all the resources:
[root@oradev11 bin]# ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE
ora.crf
      1        ONLINE  OFFLINE
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  OFFLINE
ora.cssdmonitor
      1        ONLINE  ONLINE       oradev11
ora.ctssd
      1        ONLINE  OFFLINE
ora.diskmon
      1        ONLINE  OFFLINE
ora.drivers.acfs
      1        ONLINE  ONLINE       oradev11
ora.evmd
      1        ONLINE  OFFLINE
ora.gipcd
      1        ONLINE  OFFLINE
ora.gpnpd
      1        ONLINE  OFFLINE
ora.mdnsd
      1        ONLINE  OFFLINE                               STARTING
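Everything except ora.cssdmonitor and ora.drivers.acfs is offline, and ora.mdnsd is stuck in STARTING, i.e. the lower (OHASD-managed) stack never came up. As a quick sanity check at this point, the stack layers can also be probed as below (commands shown for reference only, output omitted):
[root@oradev11 bin]# ./crsctl check has   # is OHASD itself up?
[root@oradev11 bin]# ./crsctl check crs   # the full stack: CRS/CSS/EVM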
The CRS alert log says:
[root@oradev11 ~]# cd $GRID_HOME/log/oradev11
[root@oradev11 oradev11]# tail -50f alertoradev11.log
Output trimmed…
2015-04-22 20:06:01.173:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(24990)]CRS-5818:Aborted
2015-04-22 20:06:05.177:
[ohasd(12696)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.mdnsd'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0.4/grid/log/oradev11/ohasd/ohasd.log.
2015-04-22 20:06:05.658:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25614)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_grid/oraagent_grid.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.
2015-04-22 20:06:05.659:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25614)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_grid/oraagent_grid.log"
2015-04-22 20:06:06.176:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25631)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_grid/oraagent_grid.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.
2015-04-22 20:06:06.176:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25631)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_grid/oraagent_grid.log"
2015-04-22 20:06:06.272:
[gpnpd(25644)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/gpnpd/gpnpd.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.
2015-04-22 20:06:06.272:
[gpnpd(25644)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/gpnpd/gpnpd.log"
2015-04-22 20:06:09.314:
[gpnpd(25644)]CRS-2329:GPNPD on node oradev11 shutdown.
2015-04-22 20:08:06.226:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25631)]CRS-5818:Aborted command 'start' for resource 'ora.gpnpd'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_grid/oraagent_grid.log.
2015-04-22 20:08:10.229:
[ohasd(12696)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.gpnpd'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0.4/grid/log/oradev11/ohasd/ohasd.log.
2015-04-22 20:08:10.710:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(26582)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_grid/oraagent_grid.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.
2015-04-22 20:08:10.710:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(26582)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_grid/oraagent_grid.log"
2015-04-22 20:08:11.280:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(26604)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_grid/oraagent_grid.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.
2015-04-22 20:08:11.280:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(26604)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_grid/oraagent_grid.log"
2015-04-22 20:08:11.347:
[mdnsd(26617)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/mdnsd/mdnsd.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.
2015-04-22 20:08:11.347:
[mdnsd(26617)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/mdnsd/mdnsd.log"
2015-04-22 20:08:11.351:
[mdnsd(26617)]CRS-5602:mDNS service stopping by request.
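The recurring clue is "write() failed(OSD return value = 28)". On Linux, OS error 28 is ENOSPC, "No space left on device", which can be confirmed with a one-liner (a generic check, not part of the original log):
[root@oradev11 ~]# perl -e '$! = 28; print "$!\n"'
No space left on device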
After spending a lot of time on troubleshooting, I checked the space on the server and realized the cause was a space issue on the mount point holding my GRID home, which is why the CRS resources were not coming up:
[root@oradev11 bin]# df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-rootlv       5.8G  1.8G  3.8G  33% /
tmpfs                           3.0G     0  3.0G   0% /dev/shm
/dev/sda1                       190M   86M   95M  48% /boot
/dev/mapper/rootvg-homelv       2.0G  9.2M  1.8G   1% /home
/dev/mapper/rootvg-optlv        9.8G  2.0G  7.3G  22% /opt
/dev/mapper/rootvg-securlv      1.5G  211M  1.2G  16% /opt/security
/dev/mapper/rootvg-tmplv        2.0G  375M  1.5G  21% /tmp
/dev/mapper/rootvg-varlv        9.8G  1.1G  8.2G  12% /var
/dev/mapper/datavg-gridbaselv    50G   49G     0 100% /u01/app
/dev/mapper/datavg-rdbmsbaselv   50G  4.8G   42G  11% /u01/app/oracle
/dev/mapper/datavg-adrrepolv     50G  2.6G   45G   6% /oratrace
/dev/mapper/datavg-oemagentlv    20G  651M   18G   4% /u01/app/emagent
/dev/mapper/datavg-gglv          50G   52M   47G   1% /gg
/dev/mapper/datavg-dbawslv       99G   16G   79G  17% /oraworkspace
/dev/mapper/datavg-auditfslv     50G  230M   47G   1% /oradbaudit
/dev/mapper/datavg-dbtoolslv    9.8G   86M  9.2G   1% /oratools
Checking whether I could delete anything on the /u01/app mount point, I saw that "crfclust.bdb" was consuming far more space than anything else (a quicker way to find it is sketched after the listing below):
[root@oradev11 bin]# cd ../crf/db
[root@oradev11 db]# ls -lrht
total 4.0K
drwxr-x--- 2 root oinstall 4.0K Apr 22 20:45 oradev11
[root@oradev11 db]# cd oradev11
[root@oradev11 oradev11]# ls -lrth
total 38G
-rw-r--r-- 1 root root 1.1M Sep  8  2014 08-SEP-2014-09:24:06.txt
-rw-r--r-- 1 root root 1.9M Sep  8  2014 08-SEP-2014-10:07:28.txt
-rw-r--r-- 1 root root 1.2M Sep  8  2014 08-SEP-2014-10:20:00.txt
-rw-r----- 1 root root 8.0K Nov 20 09:44 repdhosts.bdb
-rw-r--r-- 1 root root  74K Mar  9 10:53 09-MAR-2015-10:53:37.txt
-rw-r--r-- 1 root root 856K Mar  9 10:56 09-MAR-2015-10:56:42.txt
-rw-r--r-- 1 root root  77K Mar 13 19:21 13-MAR-2015-19:21:26.txt
-rw-r--r-- 1 root root 218K Mar 13 19:21 13-MAR-2015-19:21:44.txt
-rw-r----- 1 root root  16M Apr 22 12:19 log.0000007983
-rw-r----- 1 root root  24K Apr 22 20:42 __db.001
-rw-r--r-- 1 root root 115M Apr 22 20:42 oradev11.ldb
-rw-r----- 1 root root 8.0K Apr 22 20:43 crfconn.bdb
-rw-r--r-- 1 root root 777K Apr 22 20:45 22-APR-2015-20:45:53.txt
-rw-r----- 1 root root  56K Apr 22 20:56 __db.006
-rw-r----- 1 root root 392K Apr 22 20:56 __db.002
-rw-r----- 1 root root 812M Apr 22 20:56 crfloclts.bdb
-rw-r----- 1 root root 668M Apr 22 20:56 crfcpu.bdb
-rw-r----- 1 root root 743M Apr 22 20:56 crfalert.bdb
-rw-r----- 1 root root 526M Apr 22 20:56 crfts.bdb
-rw-r----- 1 root root 607M Apr 22 20:56 crfhosts.bdb
-rw-r----- 1 root root  34G Apr 22 20:56 crfclust.bdb
-rw-r----- 1 root root  16M Apr 22 20:56 log.0000007984
-rw-r----- 1 root root 1.2M Apr 22 20:56 __db.005
-rw-r----- 1 root root 2.1M Apr 22 20:56 __db.004
-rw-r----- 1 root root 2.6M Apr 22 20:56 __db.003
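As an aside, rather than eyeballing directory listings, a find across the full mount point would have surfaced the culprit immediately (a sketch only; adjust the size threshold as needed):
[root@oradev11 ~]# find /u01/app -xdev -type f -size +1G -exec ls -lh {} \;
# -xdev keeps the search on this one filesystem; expect crfclust.bdb to dominate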
From the above output, only "crfclust.bdb" is consuming a large amount of space (34G), so I followed the steps given in the Oracle support note (see the reference at the end) to free up the space on the server.
Stop ora.crf:
[root@oradev11 bin]# ./crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'oradev11'
CRS-2677: Stop of 'ora.crf' on 'oradev11' succeeded
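Before removing anything it is worth double-checking that the resource really is offline (a cautionary aside, not from the original post; the MOS note referenced at the end also covers deleting the CHM .bdb files while ora.crf is down):
[root@oradev11 bin]# ./crsctl stat res ora.crf -init | grep STATE
STATE=OFFLINE   # expected after the stop above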
[root@oradev11 oradev11]# rm crfclust.bdb
[root@oradev11 oradev11]# df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-rootlv       5.8G  1.8G  3.8G  33% /
tmpfs                           3.0G  854M  2.2G  28% /dev/shm
/dev/sda1                       190M   86M   95M  48% /boot
/dev/mapper/rootvg-homelv       2.0G  9.2M  1.8G   1% /home
/dev/mapper/rootvg-optlv        9.8G  2.0G  7.3G  22% /opt
/dev/mapper/rootvg-securlv      1.5G  211M  1.2G  16% /opt/security
/dev/mapper/rootvg-tmplv        2.0G  376M  1.5G  21% /tmp
/dev/mapper/rootvg-varlv        9.8G  1.1G  8.2G  12% /var
/dev/mapper/datavg-gridbaselv    50G   13G   34G  28% /u01/app
/dev/mapper/datavg-rdbmsbaselv   50G  4.8G   42G  11% /u01/app/oracle
/dev/mapper/datavg-adrrepolv     50G  2.6G   45G   6% /oratrace
/dev/mapper/datavg-oemagentlv    20G  651M   18G   4% /u01/app/emagent
/dev/mapper/datavg-gglv          50G   52M   47G   1% /gg
/dev/mapper/datavg-dbawslv       99G   16G   79G  17% /oraworkspace
/dev/mapper/datavg-auditfslv     50G  231M   47G   1% /oradbaudit
/dev/mapper/datavg-dbtoolslv    9.8G   86M  9.2G   1% /oratools
/dev/asm/ggatevol-387            20G  562M   20G   3% /gg/GG11
Start ora.crf again:
[root@oradev11 bin]# ./crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'oradev11'
CRS-2676: Start of 'ora.crf' on 'oradev11' succeeded
[root@oradev11 bin]# ./crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       oradev11                 Started
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       oradev11
ora.crf
      1        ONLINE  ONLINE       oradev11
ora.crsd
      1        ONLINE  ONLINE       oradev11
ora.cssd
      1        ONLINE  ONLINE       oradev11
ora.cssdmonitor
      1        ONLINE  ONLINE       oradev11
ora.ctssd
      1        ONLINE  ONLINE       oradev11                 OBSERVER
ora.diskmon
      1        OFFLINE OFFLINE
ora.drivers.acfs
      1        ONLINE  ONLINE       oradev11
ora.evmd
      1        ONLINE  ONLINE       oradev11
ora.gipcd
      1        ONLINE  ONLINE       oradev11
ora.gpnpd
      1        ONLINE  ONLINE       oradev11
ora.mdnsd
      1        ONLINE  ONLINE       oradev11
Now all the resources are up and running on this node; since both nodes were affected, the same cleanup applies on the second node as well.
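To keep crfclust.bdb from growing this large again, the Cluster Health Monitor repository retention can be inspected and resized with oclumon. The exact flags differ between Grid Infrastructure versions, so treat the following 11.2-style commands as an illustration and verify them against the MOS note below:
[root@oradev11 bin]# ./oclumon manage -get repsize
# reports the current CHM repository retention
[root@oradev11 bin]# ./oclumon manage -repos resize 86400
# illustrative value only: shrink retention to 86400 seconds (24 hours)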
Refer: Oracle Cluster Health Monitor (CHM) using large amount of space (more than default) (Doc ID 1343105.1)