Today after server patching (reboot of servers) my standby recovery stopped
working and LAG keeps increasing.
Primary: prod on RHEL 6 11.2.0.3
Standby: on RHEL 6 11.2.0.3
On STANDBY:
SQL> select status from
v$instance;
STATUS
------------
MOUNTED
SQL> SELECT PROCESS, STATUS
FROM V$MANAGED_STANDBY;
PROCESS STATUS
--------- ------------
ARCH CONNECTED
ARCH CONNECTED
ARCH CONNECTED
ARCH CONNECTED
MR(fg) WAIT_FOR_LOG (Don’t see RFS running)
Don’t see any archive GAPS and I see recovery is Running fine and
applied all the logs whatever have on standby, NO archive log GAPS
Tried to switch logs on primary and they are not shipping, so first
need to find out why logs not shipping to standby
Problem: Archive logs NOT shipping to standby side
Things checked:
·
Tried stopping and restarting of recovery process
(NO Luck)
·
Bounced standby database (NO Luck)
·
Checked archive log free space
·
Parameters looks good nothing changed
·
Log dest on primary ENABLED
·
Can able to ping server
·
Listener up and running
Primary
alert log :
Error 1034 received logging on to the standby
Fri Sep 23 13:21:11 2016
Error 1034 received logging on to the standby
Fri Sep 23 13:22:14 2016
Error 1034 received logging on to the standby
Fri Sep 23 13:23:16 2016
Check error message
SQL> SELECT gvi.thread#, timestamp, message FROM gv$dataguard_status
gvds, gv$instance gvi WHERE gvds.inst_id = gvi.inst_id AND severity in
('Error','Fatal') and rownum <11 ORDER BY timestamp, thread#;
THREAD# TIMESTAMP MESSAGE
---------- ---------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 23-SEP-16 Error 1034
received logging on to the standby
1 23-SEP-16 PING[ARC3]:
Heartbeat failed to connect to standby 'primdb_sb1'. Error is 1034.
1 23-SEP-16 Error 1034
received logging on to the standby
1 23-SEP-16 PING[ARC3]:
Heartbeat failed to connect to standby 'primdb_sb1'. Error is 1034.
1 23-SEP-16 Error 1034
received logging on to the standby
1 23-SEP-16 PING[ARC3]:
Heartbeat failed to connect to standby 'primdb_sb1'. Error is 1034.
1 23-SEP-16 Error 1034
received logging on to the standby
1 23-SEP-16 PING[ARC3]:
Heartbeat failed to connect to standby 'primdb_sb1'. Error is 1034.
1 23-SEP-16 Error 1034
received logging on to the standby
1 23-SEP-16 PING[ARC3]: Heartbeat failed to
connect to standby 'primdb_sb1'. Error is 1034.
10 rows selected.
Solution:
After reviewing error messages from above output I again checked the
listener on standby
oradev13:primdb:/opt/oracle/product/11.2.0.3.64/network/admin $ ps -e
f| grep tns| grep -i prim
110542 ? Ssl 0:37
/opt/oracle/product/12.1.0.2.64/bin/tnslsnr LSNR_DBPRIMDBSB1 -inherit
oradev13:primdb:/opt/oracle/product/11.2.0.3.64/network/admin $ lsnrctl
status LSNR_DBPRIMDBSB1
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 23-SEP-2016
10:30:37
Copyright (c) 1991, 2011, Oracle.
All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dbprimdbsb1)(PORT=1781)))
STATUS of the LISTENER
------------------------
Alias LSNR_DBPRIMDBSB1
Version TNSLSNR
for Linux: Version 12.1.0.2.0 - Production
Start Date
22-SEP-2016 01:41:35
Uptime 1 days
8 hr. 49 min. 2 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter
File /opt/oracle/product/12.1.0.2.64/network/admin/listener.ora
Listener Log File
/opt/oracle/diag/tnslsnr/oradev13/lsnr_dbprimdbsb1/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.20.27.68)(PORT=1781)))
Services Summary...
Service "primdb" has 1 instance(s).
Instance "primdb",
status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
Notice my listener is running
from 12c home but my primary and standby databases are on 11g version
oradev13:primdb:/opt/oracle/product/11.2.0.3.64/network/admin $ echo
$ORACLE_HOME
/opt/oracle/product/11.2.0.3.64
oradev13:primdb:/opt/oracle/product/11.2.0.3.64/network/admin $ lsnrctl
stop LSNR_DBPRIMDBSB1
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 23-SEP-2016
10:32:44
Copyright (c) 1991, 2011, Oracle.
All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dbprimdbsb1)(PORT=1781)))
The command completed successfully
oradev13:primdb:/opt/oracle/product/11.2.0.3.64/network/admin $ lsnrctl start LSNR_DBPRIMDBSB1
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 23-SEP-2016
10:33:20
Copyright (c) 1991, 2011, Oracle.
All rights reserved.
Starting /opt/oracle/product/11.2.0.3.64/bin/tnslsnr: please wait...
TNSLSNR for Linux: Version 11.2.0.3.0 - Production
System parameter file is
/opt/oracle/product/11.2.0.3.64/network/admin/listener.ora
Log messages written to /opt/oracle/diag/tnslsnr/oradev13/lsnr_dbprimdbsb1/alert/log.xml
Listening on:
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.20.27.68)(PORT=1781)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dbprimdbsb1)(PORT=1781)))
STATUS of the LISTENER
------------------------
Alias LSNR_DBPRIMDBSB1
Version TNSLSNR
for Linux: Version 11.2.0.3.0 - Production
Start Date
23-SEP-2016 10:33:20
Uptime 0 days
0 hr. 0 min. 0 sec
Trace Level off
Security ON:
Local OS Authentication
SNMP OFF
Listener Parameter File /opt/oracle/product/11.2.0.3.64/network/admin/listener.ora
Listener Log File
/opt/oracle/diag/tnslsnr/oradev13/lsnr_dbprimdbsb1/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.20.27.68)(PORT=1781)))
Services Summary...
Service "primdb" has 1 instance(s).
Instance "primdb",
status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
oradev13:primdb:/opt/oracle/product/11.2.0.3.64/network/admin $ sqlplus
/ as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Fri Sep 23 10:33:39 2016
Copyright (c) 1982, 2011, Oracle.
All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit
Production
With the Partitioning, OLAP, Data Mining and Real Application Testing
options
SQL> SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY;
PROCESS STATUS
--------- ------------
ARCH CONNECTED
ARCH CONNECTED
ARCH CONNECTED
ARCH CONNECTED
MR(fg) WAIT_FOR_GAP
RFS RECEIVING
RFS RECEIVING
RFS RECEIVING
8 rows selected.
SQL> SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE#
"Last Sequence Received", APPL.SEQUENCE# "Last Sequence
Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"
FROM (SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE
(THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG
GROUP BY THREAD#)) ARCH,
(SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME
) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL
WHERE ARCH.THREAD# = APPL.THREAD# ORDER BY 1;
2 3
4
Thread Last Sequence Received
Last Sequence Applied Difference
---------- ---------------------- --------------------- ----------
1 63578 63576 2
SQL> /
Thread Last Sequence Received
Last Sequence Applied Difference
---------- ---------------------- --------------------- ----------
1 63580 63578 2
SQL> /
Thread Last Sequence Received
Last Sequence Applied Difference
---------- ---------------------- --------------------- ----------
1 63621 63621 0
That means from the above I can say that after my server patching is
done someone had started this database from the wrong oracle home (THIS SERVER
HOSTS 11G and 12c databases) this was not failed because in my 12 home I have 11g listener also
which is not supposed to be.
Now I see my database is in Sync and running fine. Hope this helps, Thanks
J
0 comments:
Post a Comment