Oracle DataGuard Study之--DataGuard FailOver案例 Oracle DG(Dataguard)是目前比較常見的數據庫HA配置策略。通過實現Physical Standby和Logical Standby,可以實現數據冗余容錯機制。防止在主庫出現嚴重故障,不能支持服務的時候,沒有快速的后備支持環
Oracle DataGuard Study之--DataGuard FailOver案例
Oracle DG(Dataguard)是目前比較常見的數據庫HA配置策略。通過實現Physical Standby和Logical Standby,可以實現數據冗余容錯機制。防止在主庫出現嚴重故障,不能支持服務的時候,沒有快速的后備支持環境。
在DG中,switchover和failover是兩個重要的概念,也是DG實現的核心。兩者共同點都是Primary和Standby角色切換,差異在于Planned和UnPlanned之分。Switchover關鍵點在于Planned,這個切換動作是在運維機構規劃范圍內的動作。比如,進行定期系統軟硬件升級、設備維修等動作。而Failover是真正出現嚴重系統故障,如數據庫宕機、軟硬件故障導致的Primary不能支持服務,從而進行的切換動作。
根據不同的DG配置,switchover和failover也是有差異的。理論上,Switchover是不會造成數據丟失的,Primary在切換之后也是在DG配置環境中,作為Standby存在的。但是Failover則不同,除了運行在最大保護(Maximum Protection)模式下,Primary突發的故障可能引起一部分Redo Log不能及時的傳遞到Standby端,切換之后很可能有數據損失的情況。更重要的是,Primary端在發生Failover之后,是不能夠直接加入回DG配置的!也就是說,Failover之后,Primary實際上就是被“拋出”了DG環境。
那么,有什么方法實現Primary回到原有的環境呢?這個問題的困難在于保持Primary和Standby一致。在正常情況下,Primary和Standby之間是關聯同步的,即使發生了Switchover,也在可控情況下。Failover過程中有數據的缺失,還有Primary修復問題。在目前流行版本(11g)中,有三個方法:
ü 環境重建:一種最簡單的方法就是直接刪除原來的Primary庫,引用DG重建方法,重新搭建Standby端;
ü RMAN備份恢復:如果Primary端保留過一份Failover之前的備份,則可以強制原來的Primary端恢復到進行Failover的時間點,之后作為Standby接收當前Primary的redo log傳遞,應用后可以跟上進度;
ü Flashback Database恢復:Flashback技術是作為傳統備份還原技術的補充,提供了更加便捷的恢復策略。使用flashback,可以將數據庫恢復到failover之前的時間點。之后的過程和RMAN備份恢復策略相同;
案例分析:
一、在主庫端模擬數據庫意外宕機
scott@bjdb>conn /as sysdba Connected. sys@bjdb>alter system switch logfile; System altered. sys@bjdb>shutdown abort ORACLE instance shut down.
二、在備庫端
1、查看切換信息
sys@shdb>select name,database_role,switchover_status from v$database; NAME DATABASE_ROLE SWITCHOVER_STATUS --------- ---------------- -------------------- TESTDB12 PHYSICAL STANDBY NOT ALLOWED 可以看到此時備庫處于無法切換狀態
2、直接切換
sys@shdb>alter database commit to switchover to primary; alert_log:(告警日志) Fatal NI connect error 12514, connecting to: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=shsrv)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=shdb)(CID=(PROGRAM=oracle)(HOST=bjsrv)(USER=oracle)))) VERSION INFORMATION: TNS for Linux: Version 11.2.0.3.0 - Production TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production Time: 04-MAR-2015 21:25:13 Tracing not turned on. Tns error struct: ns main err code: 12564 TNS-12564: TNS:connection refused ns secondary err code: 0 nt main err code: 0 nt secondary err code: 0 nt OS err code: 0 Error 12514 received logging on to the standby FAL[client, MRP0]: Error 12514 connecting to shdb for fetching gap sequence Wed Mar 04 21:26:00 2015 alter database commit to switchover to primary ALTER DATABASE SWITCHOVER TO PRIMARY (TestDB12) Maximum wait for role transition is 15 minutes. Switchover: Media recovery is still active Database not available for switchover End-Of-REDO archived log file has not been recovered Database not available for switchover End-Of-REDO archived log file has not been recovered Database not available for switchover
3、關閉standby MPR進程
sys@shdb>ALTER DATABASE RECOVER managed standby database finish; ALTER DATABASE RECOVER managed standby database finish Terminal Recovery: request posted (TestDB12) Wed Mar 04 21:34:34 2015 Begin: Standby Redo Logfile archival End: Standby Redo Logfile archival Terminal Recovery timestamp is '03/04/2015 21:34:34' Terminal Recovery: applying standby redo logs. Terminal Recovery: thread 1 seq# 34 redo required Media Recovery Waiting for thread 1 sequence 34 Terminal Recovery: End-Of-Redo log allocation Terminal Recovery: standby redo logfile 4 created '/dsk4/arch_bj/arch_1_0_820054583.log' This standby redo logfile is being created as part of the failover operation. This standby redo logfile should be deleted after the switchover to primary operation completes. Media Recovery Log /dsk4/arch_bj/arch_1_0_820054583.log Terminal Recovery: log 4 reserved for thread 1 sequence 34 Recovery of Online Redo Log: Thread 1 Group 4 Seq 34 Reading mem 0 Mem# 0: /dsk4/arch_bj/arch_1_0_820054583.log Identified End-Of-Redo (failover) for thread 1 sequence 34 at SCN 0xffff.ffffffff Incomplete Recovery applied until change 1234252 time 03/04/2015 21:23:43 MRP0: Media Recovery Complete (TestDB12) Terminal Recovery: successful completion Wed Mar 04 21:34:35 2015 ARCH: Archival stopped, error occurred. Will continue retrying ORACLE Instance TestDB12 - Archival Error ORA-16014: log 4 sequence# 34 not archived, no available destinations ORA-00312: online log 4 thread 1: '/dsk4/arch_bj/arch_1_0_820054583.log' Forcing ARSCN to IRSCN for TR 0:1234252 Attempt to set limbo arscn 0:1234252 irscn 0:1234252 Resetting standby activation ID 2865247982 (0xaac836ee) MRP0: Background Media Recovery process shutdown (TestDB12) Terminal Recovery: completion detected (TestDB12) Completed: ALTER DATABASE RECOVER managed standby database finish
4、切換數據庫到Primary
sys@shdb>select status from v$instance; STATUS ------------ OPEN sys@shdb>select name,database_role,switchover_status from v$database; NAME DATABASE_ROLE SWITCHOVER_STATUS --------- ---------------- -------------------- TESTDB12 PHYSICAL STANDBY TO PRIMARY sys@shdb>alter database commit to switchover to primary; Database altered. sys@shdb>alter database open; Database altered. 告警日志: alter database commit to switchover to primary ALTER DATABASE SWITCHOVER TO PRIMARY (TestDB12) Maximum wait for role transition is 15 minutes. All dispatchers and shared servers shutdown CLOSE: killing server sessions. CLOSE: all sessions shutdown successfully. Wed Mar 04 21:35:47 2015 SMON: disabling cache recovery Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/bjdb/TestDB12/trace/TestDB12_ora_3146.trc Standby terminal recovery start SCN: 1234251 RESETLOGS after incomplete recovery UNTIL CHANGE 1234252 Online log /dsk2/oradata/bjdb/redo01b.log: Thread 1 Group 1 was previously cleared Online log /dsk1/oradata/bjdb/redo01a.log: Thread 1 Group 1 was previously cleared Online log /dsk2/oradata/bjdb/redo02b.log: Thread 1 Group 2 was previously cleared Online log /dsk1/oradata/bjdb/redo02a.log: Thread 1 Group 2 was previously cleared Online log /dsk2/oradata/bjdb/redo03b.log: Thread 1 Group 3 was previously cleared Online log /dsk1/oradata/bjdb/redo03a.log: Thread 1 Group 3 was previously cleared Standby became primary SCN: 1234250 Wed Mar 04 21:35:47 2015 Setting recovery target incarnation to 3 AUDIT_TRAIL initialization parameter is changed back to its original value as specified in the parameter file. Switchover: Complete - Database mounted as primary Completed: alter database commit to switchover to primary
三、原主庫修復后,開機
sys@bjdb>startup ORACLE instance started. Total System Global Area 442601472 bytes Fixed Size 2229184 bytes Variable Size 281021504 bytes Database Buffers 155189248 bytes Redo Buffers 4161536 bytes Database mounted. Database opened. sys@bjdb>select name,database_role,switchover_status from v$database; NAME DATABASE_ROLE SWITCHOVER_STATUS --------- ---------------- -------------------- TESTDB12 PRIMARY FAILED DESTINATION
現在原來的主庫被修復后,整個DataGuara架構已經被破壞了,所以必須把原來的主庫構建成新的備庫,重新恢復DataGuard的環境。
四、重新構建DataGuard
sys@bjdb>select name,database_role from v$database;
NAME DATABASE_ROLE
-------------------------------------------------- ----------------
TESTDB12 PHYSICAL STANDBY
聲明:本網頁內容旨在傳播知識,若有侵權等問題請及時與本網聯系,我們將在第一時間刪除處理。TEL:177 7030 7066 E-MAIL:11247931@qq.com