f 三月 2020 ~ 迪貝之家

Microsoft SQL Server Log Shipping

SQL 7就有的技術,簡單的說就是資料庫複本資料同步的技術,而這份複本可以開放讀取的功能以分散營業資料庫的負載,也可純粹當成備援資料庫,不管是在本地還是異地。怎麼達成? 透過SQL Agent自動化排程進行備份、傳檔及復原等作業。

Oracle GolGateden

機房搬遷跨Site能即時同步資料庫利器

Nagios 資料庫維運自動化

一開始設計這架構時,就排除使用remote agent的想法因為在專業分工的組織下,要求安裝新軟體於既有系統是一件不太可行的方案,既然身為DBA就只能把資料庫instance當作是一個最大的agent 想辦法在資料庫內做到我想做的事情

This is default featured post 4 title

Go to Blogger edit html and find these sentences.Now replace these sentences with your own descriptions.

This is default featured post 5 title

Go to Blogger edit html and find these sentences.Now replace these sentences with your own descriptions.

RAC 12.2 on HPUX 打上 2020 Jan GI Patch

與Linux的差異
比對文章Oracle RAC 12.2 on Linux 2020 第一季的Patch
就知道了
For Grid Infrastructure Home, as home user:
$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /oracle/patch/30501932/30593149
$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /oracle/patch/30501932/30585969
$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /oracle/patch/30501932/30591794
For Database home, as home user:
$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /oracle/patch/30501932/30593149
$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /oracle/patch/30501932/30585969
Execute the following on each node of the cluster in non-shared CRS and DB home environment to apply the patch.
1. Stop the CRS managed resources running from DB homes.
If this is a GI Home environment, as the database home owner execute:
$ <ORACLE_HOME>/bin/srvctl stop home -o <ORACLE_HOME> -s <status file location> -n <node name>
srvctl stop home -o $ORACLE_HOME -s /home/oracle/patch.txt -n vppracn1
srvctl stop home -o $ORACLE_HOME -s /home/oracle/patch.txt -n vppracn2
Note:
You need to make sure that the Oracle ACFS file systems are unmounted (see Section 2.8) and all other Oracle processes are shutdown before you proceed.
2. Run the pre root script.
this is a GI Home, as the root user execute:
# <GI_HOME>/crs/install/rootcrs.sh -prepatch
$ORACLE_HOME/crs/install/rootcrs.sh -prepatch
3. Patch GI home.
As the GI home owner execute:
$ <GI_HOME>/OPatch/opatch apply -oh <GI_HOME> -local <UNZIPPED_PATCH_LOCATION>/%BUGNO%/%OCW TRACKING BUG%
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30501932/30593149
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30501932/30585969
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30501932/30591794
4. Patch DB home.
注意事項
因oracle的Inventory與grid的Inventory住在一起,所以下面路徑檔案在打第四步Patch DB home的時候記得要先改為oracle:oinstall
打完DB的之後記得要改回來grid:oinstall,改完後才能在繼續往下
/u01/app/oraInventory/ContentsXML]$ ll
total 6
-rw-rw---- 1 grid oinstall 329 Jun 15 13:41 comps.xml
-rw-rw---- 1 grid oinstall 670 Jan 15 10:55 inventory.xml
-rw-rw---- 1 grid oinstall 292 Jun 15 13:41 libs.xml
As the database home owner execute:
$ <UNZIPPED_PATCH_LOCATION>/%BUGNO%/%OCW TRACKING BUG%/custom/scripts/prepatch.sh -dbhome <ORACLE_HOME>
/oracle/patch/30501932/30585969/custom/scripts/prepatch.sh -dbhome $ORACLE_HOME
$ <ORACLE_HOME>/OPatch/opatch apply -oh <ORACLE_HOME> -local <UNZIPPED_PATCH_LOCATION>/%BUGNO%/%OCW TRACKING BUG%
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30501932/30593149
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30501932/30585969
$ <UNZIPPED_PATCH_LOCATION>/%BUGNO%/%OCW TRACKING BUG%/custom/scripts/postpatch.sh -dbhome <ORACLE_HOME>
/oracle/patch/30501932/30585969/custom/scripts/postpatch.sh -dbhome $ORACLE_HOME
5. Run the post script.
As the root user execute:
# <GI_HOME>/rdbms/install/rootadd_rdbms.sh
$ORACLE_HOME/rdbms/install/rootadd_rdbms.sh
If this is a GI Home, as the root user execute:
# <GI_HOME>/crs/install/rootcrs.pl -postpatch
$ORACLE_HOME/crs/install/rootcrs.sh -postpatch
If this is an Oracle Restart Home, as the root user execute:
6. If the message, "A system reboot is recommended before using ACFS is shown, then a reboot must be issued before continuing. Failure to do so will result in running with an unpatched ACFS\ADVM\OKS driver.
7. Start the CRS managed resources that were earlier running from DB homes.
If this is a GI Home environment, as the database home owner execute:
$ <ORACLE_HOME>/bin/srvctl start home -o <ORACLE_HOME> -s <status file location> -n <node name>
srvctl start home -o $ORACLE_HOME -s /home/oracle/patch.txt -n vppracn1
srvctl start home -o $ORACLE_HOME -s /home/oracle/patch.txt -n vppracn2
8. For each database instance running on the Oracle home being patched, run the datapatch utility as described in next table.
sqlplus / as sysdba
startup
quit
cd $ORACLE_HOME/OPatch
./datapatch -verbose
如果datapatch是空的,not apply
則需要手動處理 (請參考DOC 1955058.1)

RAC 12.2 在HPUX 上的安裝

其實跟在RedHat安裝是一樣的
都是作業系統的準備工作比較重
前一個在Linux上的專案
我是直接把RedHat的官方文件給出去
但是呢.....
負責建置作業系統的人通常都不看
這一次在HPUX上的專案
我就只好看了Oracle的官方文件
一步一步整理給作業系統對口負責人
省得又出一堆查堡
果然如我所願,只花一天就把它給安裝了起來
我這邊就只提我碰到的兩個狀況
1.user ssh equivalence
因為我們的環境不允許在HPUX上安裝bash
我也不知道為什麼?真是莫名其妙!!
就因為HP Support講了有security issue
MD...真是找藉口無上綱!!
袂生牽拖厝邊
Grid 及 RAC 的安裝程式都有幫你設定的功能
可是就因為沒bash,所以得自己來啦~~
ssh equivalence
上網找了一下hpux相關的文件
Linux 還是好用,一個ssh-copy-id就解決了
HP-UX Secure Shell - Setting up SSH to Use Public Key Authentication
2.就是直接用fixup 解決/dev/async 檔案minor number的問題

Restore 有設定replication的sql server 資料庫

如果你的資料庫設定了複寫
restore的時候,應該會出現下列的錯誤
日期,來源,嚴重性,訊息
02/25/2020 10:24:43,spid55,未知,The database 'urdb' is marked OFFLINE and is in a state that does not allow recovery to be run.
02/25/2020 10:24:37,spid55,未知,Starting up database 'urdb'.
02/25/2020 10:24:34,備份,未知,Database 'urdb' was restored<c/> however an error was encountered while replication was being restored/removed. The database has been left offline. See the topic MSSQL_ENG003165 in SQL Server Books Online.
02/25/2020 10:48:48,spid20s,未知,Recovery completed for database urdb (database ID 5) in 30 second(s) (analysis 8853 ms<c/> redo 20646 ms<c/> undo 634 ms.) This is an informational message only. No user action is required.
02/25/2020 10:48:48,spid20s,未知,Recovery is writing a checkpoint in database 'urdb' (5). This is an informational message only. No user action is required.
02/25/2020 10:48:48,spid20s,未知,0 transactions rolled back in database 'urdb' (5). This is an informational message only. No user action is required.
02/25/2020 10:48:47,spid60,未知,CHECKDB for database 'urdb' finished without errors on 2017-01-29 01:14:16.367 (local time). This is an informational message only; no user action is required.
02/25/2020 10:48:47,spid60,未知,36 transactions rolled forward in database 'urdb' (5). This is an informational message only. No user action is required.
02/25/2020 10:48:45,spid60,未知,Recovery of database 'urdb' (5) is 95% complete (approximately 1 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:44,spid60,未知,Recovery of database 'urdb' (5) is 88% complete (approximately 3 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:42,spid60,未知,Recovery of database 'urdb' (5) is 82% complete (approximately 5 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:41,spid60,未知,Recovery of database 'urdb' (5) is 75% complete (approximately 7 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:39,spid60,未知,Recovery of database 'urdb' (5) is 68% complete (approximately 9 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:37,spid60,未知,Recovery of database 'urdb' (5) is 62% complete (approximately 11 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:36,spid60,未知,Recovery of database 'urdb' (5) is 55% complete (approximately 14 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:34,spid60,未知,Recovery of database 'urdb' (5) is 49% complete (approximately 16 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:32,spid60,未知,Recovery of database 'urdb' (5) is 43% complete (approximately 18 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:28,spid60,未知,Recovery of database 'urdb' (5) is 40% complete (approximately 15 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:27,spid60,未知,Recovery of database 'urdb' (5) is 27% complete (approximately 25 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:27,spid60,未知,Recovery of database 'urdb' (5) is 10% complete (approximately 75 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:27,spid60,未知,Recovery of database 'urdb' (5) is 2% complete (approximately 428 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:27,spid60,未知,Recovery of database 'urdb' (5) is 0% complete (approximately 954 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:27,spid60,未知,Recovery of database 'urdb' (5) is 0% complete (approximately 2266 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:18,spid60,未知,Recovery of database 'urdb' (5) is 0% complete (approximately 2286 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
02/25/2020 10:48:10,spid60,未知,Starting up database 'urdb'.
02/25/2020 10:48:10,spid60,未知,Setting database option ONLINE to ON for database urdb.
02/25/2020 10:47:04,spid55,未知,Setting database option MULTI_USER to ON for database urdb.
一定要移除掉replication 的設定
才能把資料庫再帶起來
use master
alter database urdb SET multi_user
b. 移除掉所有的replication 設定 :
exec sp_removedbreplication 'urdb'
移除掉設定後,你建置replication的時候,還是會出錯
從我幾次建置測試環境的過程中
就是要下這三個指令後, 複寫的建置才不會出錯 :
use urdb
exec sp_repldone @xactid = NULL, @xact_seqno = NULL, @numtrans = 0, @time = 0, @reset = 1
exec sp_replflush
exec sp_changedbowner sa

RMAN Duplicate from active database 建置 RAC 12.2 Active DataGuard

參考資料來自於Oracle的技術文件 1617946.1
Creating a Physical Standby using RMAN Duplicate (RAC or Non-RAC)
不過你要依照文件,沒有進行任何修改
能把ADG建置出來的話~~
哪你就真是神人了
照著文件做..我是做不出來啦!!
依照我之前建置single instance DR 的經驗
再加上這次的測試
反正ADG的建置就是分為兩個部分
1. Clone 資料庫
2. DG同步的設定
對...就是這麼簡單
建議還是對照原文件 1617946.1
1) Create a service to run RMAN on allocated instances in the cluster:
su - oracle
srvctl add service -db chicago -service srv_rman -preferred chicago1,chicago2
srvctl start service -db chicago -service srv_rman
[oracle@exa503 ~]$ lsnrctl status
LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 18-FEB-2020 13:20:27
Copyright (c) 1991, 2016, Oracle. All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date 15-FEB-2020 17:47:49
Uptime 2 days 19 hr. 32 min. 37 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/12.2.0/grid/network/admin/listener.ora
Listener Log File /u01/app/grid/diag/tnslsnr/exa503/listener/alert/log.x ml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.22.6.63)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.22.6.67)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_DATA" has 1 instance(s).
Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "9e97918792b863ace053400616acd3bf" has 1 instance(s).
Instance "chicago1", status READY, has 1 handler(s) for this service...
Service "chicago" has 1 instance(s).
Instance "chicago1", status READY, has 1 handler(s) for this service...
Service "chicagoXDB" has 1 instance(s).
Instance "chicago1", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
Instance "chicago1", status READY, has 1 handler(s) for this service...
Service "srv_rman" has 1 instance(s).
Instance "chicago1", status READY, has 1 handler(s) for this service...
2) When running RMAN, use the service name in the connect string for the “target” parameter.
rman target sys/xxxx@prim-scan.dbaid.tw/srv_rman
上述步驟建出服務,是為了能讓你用下列的步驟來進行rman clone source 資料庫
1. Create standby redo logs on the primary database that are the same size of the online redo logs.
Steps to Duplicate the Primary Database
SQL> alter database add standby logfile thread 1 group 7 size 200M, group 8 size 200M, group 9 size 200M;
Database altered.
SQL> alter database add standby logfile thread 2 group 10 size 200M, group 11 size 200M, group 12 size 200M;
Database altered.
SQL> alter system set log_archive_dest_1='location=+DATA' scope=spfile;
System altered.
SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
懂Oracle的應該不用我說了吧~~
這是要啟動archive log mode
[oracle@exa503 ~]$ srvctl stop database -d chicago
[oracle@exa503 ~]$ srvctl start instance -d chicago -i chicago1 -o mount
[oracle@exa503 ~]$ sqlplus "/ as sysdba"
SQL*Plus: Release 12.2.0.1.0 Production on Wed Feb 19 16:21:48 2020
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> alter database archivelog ;
Database altered.
[oracle@exa503 ~]$ srvctl stop instance -d chicago -i chicago1
[oracle@exa503 ~]$ srvctl start database -d chicago
[oracle@exa503 ~]$ srvctl status database -d chicago
Instance chicago1 is running on node exa503
Instance chicago2 is running on node exa504
[oracle@exa503 ~]$ sqlplus "/ as sysdba"
SQL*Plus: Release 12.2.0.1.0 Production on Wed Feb 19 16:24:47 2020
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination +DATA
Oldest online log sequence 2
Next log sequence to archive 3
Current log sequence 3
2.接下來是要把source 資料庫password檔從asm拷貝出來
有metalink文件說在rman duplicate的過程中
就會主動幫你把它一併clone到target資料庫的asm
我測試的時候,是看不到這個動作
所以我先手動處理掉
[oracle@exa503 ~]$ asmcmd -p
ASMCMD [+] > cd data
ASMCMD [+data] > ls
ASM/
_mgmtdb/
chicago/
orapwasm
orapwasm_backup
prim/
ASMCMD [+data] > cd chicago
ASMCMD [+data/chicago] > ls
4700A987085B3DFAE05387E5E50A8C7B/
9EEA10F359D16AFAE0533F0616AC5334/
9EEA34AF67A97F5FE0533F0616ACB77C/
ARCHIVELOG/
CONTROLFILE/
DATAFILE/
ONLINELOG/
PARAMETERFILE/
PASSWORD/
TEMPFILE/
ASMCMD [+data/chicago] > cd PASSWORD
ASMCMD [+data/chicago/PASSWORD] > LS
ASMCMD [+data/chicago/PASSWORD] > ls
pwdchicago.378.1032796145
ASMCMD [+data/chicago/PASSWORD] > pwcopy '+data/chicago/PASSWORD/pwdchicago.378.1032796145' '/tmp/pwdboston'
copying +data/chicago/PASSWORD/pwdchicago.378.1032796145 -> /tmp/pwdboston
把pwdboston傳檔到你的target 主機
3.For the RMAN duplication process it is necessary to setup a temporary listener which will later be removed.
LISTENER_duplicate = (DESCRIPTION_LIST = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP) (HOST = exa505) (PORT = 1525)(IP = FIRST))))
SID_LIST_LISTENER_duplicate = (SID_LIST = (SID_DESC = (SID_NAME = boston1) (ORACLE_HOME = /u01/app/oracle/product/12.2.0.1/dbhome_1)))
4. In the database home on the primary node, create an Oracle Net alias to connect to the listener created in the above step. 

dup = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP) (HOST = exa505) (PORT = 1525)) (CONNECT_DATA = (SERVER = DEDICATED) (SID = boston1)))
5.建出這個密碼檔案是為了rman duplicate時,能夠連進target 主機上的nomount db instance
orapwd file=orapwboston1 password=test
6. target 主機dbinstance 的parameter file
我會把archive log destination參數先加進去
因為duplicate進行完後,要open時,會發生system tablespace修復的不夠
文件只有前3行
db_name=chicago
db_unique_name=boston
sga_target=5g
LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)'
LOG_ARCHIVE_DEST_1='LOCATION=+DATA VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=boston'
LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=chicago'
LOG_ARCHIVE_DEST_STATE_1=ENABLE'
FAL_SERVER=chicago
FAL_CLIENT=boston
7. On all standby hosts create the audit directory for the boston database:
mkdir -p /u01/app/oracle/admin/boston/adump
9. On the standby host set the SID to the standby SID (boston1) and startup nomount the standby/auxiliary instance.
$ export ORACLE_SID=boston1
SQL> startup nomount pfile='initboston1.ora'

11. On the primary host run an RMAN script that duplicates the primary database using the “duplicate target database for
standby from active database” command
文件的rman script 有底下這一行
parameter_value_convert 'chicago','boston'
拿掉,不然會出現0113的錯誤
這是我改到最後,我可以使用的RMAN DUP SCRIPT
rman <<EOF
connect target sys/xxxx@prim-scan.dbaid.tw/srv_rman;
connect auxiliary sys/xxxx@dup;
run {
duplicate target database for standby from active database
spfile
set db_unique_name='boston'
SET CLUSTER_DATABASE='FALSE'
set control_files='+DATA/boston/standby.ctl'
set local_listener='boston_local_listener'
set remote_listener='stby-scan:1521'
set audit_file_dest='/u01/app/oracle/admin/boston/adump'
set LOG_ARCHIVE_DEST_1='LOCATION=+DATA VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=boston'
set LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=chicago'
set LOG_ARCHIVE_DEST_STATE_1='ENABLE'
set FAL_SERVER='chicago'
set FAL_CLIENT='boston'
section size=200M; }
EOF
man <<EOF
connect target sys/xxxx@prim-scan.dbaid.tw/srv_rman;
connect auxiliary sys/xxxx@dup;
run {
duplicate target database for standby from active database
spfile
set db_unique_name='boston'
SET CLUSTER_DATABASE='FALSE'
set control_files='+DATA/boston/standby.ctl'
set local_listener='boston_local_listener'
set remote_listener='stby-scan:1521'
set audit_file_dest='/u01/app/oracle/admin/boston/adump'
set LOG_ARCHIVE_DEST_1='LOCATION=+DATA'
section size=200M; }
EOF
12.再來就是要把已經clone好的資料庫的參數檔轉出文字檔進行修改
因為原本的參數是從source拷貝過來的
要把source instance name chicago改成boston
這也許就是為什麼之前的rman script設了相應的置換
boston1.__data_transfer_cache_size=0
boston1.__db_cache_size=4060086272
boston1.__inmemory_ext_roarea=0
boston1.__inmemory_ext_rwarea=0
boston1.__java_pool_size=16777216
boston1.__large_pool_size=33554432
boston1.__oracle_base='/u01/app/oracle'#ORACLE_BASE set from environment
boston1.__pga_aggregate_target=1677721600
boston1.__sga_target=4999610368
boston1.__shared_io_pool_size=0
boston1.__shared_pool_size=872415232
boston1.__streams_pool_size=0
*.audit_file_dest='/u01/app/oracle/admin/boston/adump'
*.audit_trail='db'
*.cluster_database=FALSE
*.compatible='12.2.0'
*.control_files='+DATA/boston/standby.ctl'#Restore Controlfile
*.db_block_size=8192
*.db_create_file_dest='+DATA'
*.db_name='chicago'
*.db_unique_name='boston'
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=chicagoXDB)'
*.enable_pluggable_database=true
family:dw_helper.instance_mode='read-only'
chicago1.instance_number=1
chicago2.instance_number=2
*.local_listener='boston_local_listener'
*.log_archive_dest_1='LOCATION=+DATA'
*.nls_language='AMERICAN'
*.nls_territory='AMERICA'
*.open_cursors=300
*.pga_aggregate_target=1589m
*.processes=300
*.remote_listener='stby-scan:1521'
*.remote_login_passwordfile='exclusive'
*.sga_target=4764m
chicago2.thread=2
chicago1.thread=1
chicago2.undo_tablespace='UNDOTBS2'
chicago1.undo_tablespace='UNDOTBS1'
+data/boston/parameterfile/spfile.329.1032799981
13.再來就是一樣用asm的指令pwcopy,把password file拷貝進ASM
+DATA/BOSTON/PASSWORD/pwboston
14.把ORACLE_HOME及password以及spfile的位置寫進OCR裡頭
srvctl add database -db boston -oraclehome /u01/app/oracle/product/12.2.0/dbhome_1
srvctl add instance -db boston -instance boston1 -node exa505
srvctl add instance -db boston -instance boston2 -node exa506
srvctl modify database -db boston -role physical_standby -spfile '+DATA/boston/parameterfile/spfile.329.1032799981' -pwfile '+DATA/BOSTON/PASSWORD/pwdboston'
15.啟動TARGET RAC 為mount
srvctl start database -d boston -o mount
16.在SOURCE RAC 端設上這些參數
alter system set LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)' scope=both;
alter system set LOG_ARCHIVE_DEST_1='LOCATION=+DATA VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=chicago' scope=both;
alter system set LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=boston' scope=both;
alter system set LOG_ARCHIVE_DEST_STATE_1=ENABLE scope=both;
alter system set FAL_SERVER=boston scope=both;
alter system set FAL_CLIENT=chicago scope=both;
17.在TARGET RAC 補上參數
alter system set LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)' scope=spfile;
alter system set LOG_ARCHIVE_DEST_1='LOCATION=+DATA VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=boston' scope=spfile;
alter system set LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=chicago' scope=spfile;
alter system set LOG_ARCHIVE_DEST_STATE_1=ENABLE scope=spfile;
alter system set FAL_SERVER='chicago' scope=spfile;
alter system set FAL_CLIENT='boston' scope=spfile;
18. 啟動同步
alter database recover managed standby database disconnect from session;

設置broker的步驟
完全照著文件1617946.1做就可以了
我想做的是是透過broker的命令傳達一個非常重要的概念
dg broker
不管你的DR RAC有幾個NODE
APPY DATA永遠只有一個

Nagios 監控Posgres 的案例

Postgres 沒開啟一些輔助的extension的狀況下
其實很難做管理
除非透過OS層的指令的output
最常用的應該就是pgbadger
因此我透過社群給的check_by_ssh plugin 程式
登入target主機,執行pgbager掃5分鐘內的系統記錄
因為要生效修改的參數archive_mode = on
由HP外包從cluster manager重開db instance所抓到的alert

SQL 2012 RTM安裝於WINDOWS 2012 所碰到的問題

上個禮拜安裝碰到這個問題
原本以為是權限問題
local admin及目錄權限沒給cluster帳號
調整了之後
移除軟體後,重新安裝,還是沒解決
就只好去看了一下sql server的記錄
結果.......
2020-03-09 10:34:39.90 spid13s Error: 26011, Severity: 16, State: 1.
2020-03-09 10:34:39.90 spid13s The server was unable to initialize encryption because of a problem with a security library. The security library may be missing. Verify that security.dll exists on the system.
2020-03-09 10:34:39.90 spid13s Error: 17182, Severity: 16, State: 1.
2020-03-09 10:34:39.90 spid13s TDSSNIClient initialization failed with error 0x139f, status code 0x80. Reason: Unable to initialize SSL support. The group or resource is not in the correct state to perform the requested operation.
2020-03-09 10:34:39.90 spid13s Error: 17182, Severity: 16, State: 1.
2020-03-09 10:34:39.90 spid13s TDSSNIClient initialization failed with error 0x139f, status code 0x1. Reason: Initialization failed with an infrastructure error. Check for previous errors. The group or resource is not in the correct state to perform the requested operation.
2020-03-09 10:34:39.90 spid13s Error: 17826, Severity: 18, State: 3.
2020-03-09 10:34:39.90 spid13s Could not start the network library because of an internal error in the network library. To determine the cause, review the errors immediately preceding this one in the error log.
2020-03-09 10:34:39.90 spid13s Error: 17120, Severity: 16, State: 1.
2020-03-09 10:34:39.90 spid13s SQL Server could not spawn FRunCommunicationsManager thread. Check the SQL Server error log and the Windows event logs for information about possible related problems.
跟HP的外包溝通後
他說可能是TLS 1.0沒啟動的問題
這個時候, 其實我已經把問題丟到微軟去了
微軟的回覆:
TLS 1.0
接到這個回覆的時候
HP的外包已經調整完正在重開OS
調完當然就解了
微軟建議閱讀的有關TLS KM

opatch 補釘使得GRID_HOME及ORACLE_HOME 使用空間越來越大

月底有一套RAC要上線了
但ORACLE_HOME或者GRID_HOME是越長越大
70GB的專屬空間,總共用掉了86%
其實我心裡有數
所以就開了case給Global Support
.patch_storage list
上述就是support的分析
資料當然是我送上去的
我怎會不知道有這個狀況
但我想知道的是
到底能不能清掉它釋放空間
我想support應該沒有實務上管理RAC的經驗
我是從他建議閱讀的文件中
再去找到550522.1
才總算有個可行的解法
可用的描述如下
Additionally, further cleanup of the $ORACLE_HOME/.patch_storage is possible if there are directories from patches applied to previous versions. This can be done manually as follows:
1. run command:
$ opatch lsinventory
2. Remove all the sub-directories from $ORACLE_HOME/.patch_storage that are not present in the list of installed patches. Directory names would be prefaced with the patchid for example:
13343438_<timestamp>
意思就是要你比對.patch_storage的patch id目錄
如果它有列在opatch lsinventory裡頭
就把它留在$ORACLE_HOME/.patch_storage裡
其它的目錄我的做法是搬到另外的filesystem
搬完之後, 使用率剩50%
其實Oracle開發團隊這樣設計
是為了當你補釘失敗時
可以從備出來的library回復(rollback)
也有可能你上了補釘後
產生能不良的系統影響
一時間解不了
當然也只能rollback走回原狀
只是我們打補釘的用意
是為了增加系統穩定性
結果反而造成了維護上的困擾
不過...至少現在我找到了解法

升級TFA 至 19.3.2

怎麼感覺安裝好像又變了
雖然只需要下一個指令
[root@dbha1 ~]# ./ahf_setup
AHF Installer for Platform Linux Architecture x86_64
AHF Installation Log : /tmp/ahf_install_23936_2020_03_06-11_34_10.log
Starting Autonomous Health Framework (AHF) Installation
AHF Version: 193200 Build Date: 201911251327
TFA is already installed at : /u01/app/12.2.0/grid/tfa/dbha1/tfa_home
Installed TFA Version : 192100 Build ID : 20190425110550
Default AHF Location : /opt/oracle.ahf
Do you want to change AHF Location (/opt/oracle.ahf) ? Y|[N] : N
AHF Location : /opt/oracle.ahf
AHF Data Directory stores diagnostic collections and metadata.
AHF Data Directory requires at least 5GB (Recommended 10GB) of free space.
Choose Data Directory from below options :
1. /opt/oracle.ahf [Free Space : 66678 MB]
2. /u01/app/grid [Free Space : 10172 MB]
3. Enter a different Location
Choose Option [1 - 3] : 1
AHF Data Directory : /opt/oracle.ahf/data
orachk scheduler is already running at : /u01/app/12.2.0/grid/tfa/dbha1/tfa_home/ext/orachk
Installed orachk version : ORACHK VERSION: 19.2.0_20190403
Stopping orachk scheduler
Copying orachk configuration from /u01/app/12.2.0/grid/tfa/dbha1/tfa_home/ext/orachk
Shutting down TFA : /u01/app/12.2.0/grid/tfa/dbha1/tfa_home
Copying TFA Data Files from /u01/app/12.2.0/grid/tfa/dbha1/tfa_home
Uninstalling TFA : /u01/app/12.2.0/grid/tfa/dbha1/tfa_home
Do you want to add AHF Notification Email IDs ? [Y]|N : N
AHF will also be installed/upgraded on these Cluster Nodes :
1. dbha2
The AHF Location and AHF Data Directory must exist on the above nodes
AHF Location : /opt/oracle.ahf
AHF Data Directory : /opt/oracle.ahf/data
Do you want to install/upgrade AHF on Cluster Nodes ? [Y]|N : Y
Extracting AHF to /opt/oracle.ahf
Configuring TFA Services
Copying TFA Data Files to AHF
Discovering Nodes and Oracle Resources
Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Oracle TFA Status
Running TFA Inventory...
Adding default users to TFA Access list...
 Summary of AHF Configuration
Retrieving legacy orachk wallet details ...
Storing orachk wallet details into AHF config/wallet ...
Starting orachk daemon from AHF ...
AHF install completed on dbha1
Installing AHF on Remote Nodes :
AHF will be installed on dbha2, Please wait.
Installing AHF on dbha2 :
[dbha2] Copying AHF Installer
[dbha2] Running AHF Installer
AHF binaries are available in /opt/oracle.ahf/bin
AHF is successfully installed
Moving /tmp/ahf_install_23936_2020_03_06-11_34_10.log to /opt/oracle.ahf/data/dbha1/diag/ahf/

用docker 建置Nagios Core 4 在raspbian 10 上的測試環境

前年底我買了一張Raspberry Pi ZERO的開發板
及相關附件(keyboard, sdcard,線材等)來測試
不過實在是太不方便了,就開發的角度來看
但也是去年真的太忙了,我根本就沒時間去摸它
下載raspbin 9後,燒進SD Card之後
開機把畫面接到電視稍微測試一下之後
就被我晾在家裡櫃子的一角
直到最近我想在pi上頭開發Nagios 4 Plugin
我才在想有沒有更簡便的方式
找了Internet,有人用QEMU 來模擬ARM
樹莓官方居然有一專篇討論
HOWTO: Virtual Raspbian on Qemu in Ubuntu or Debian 2017 Ver
但實在是太舊了, QEMU目前的版本跑不起來
而且還得下載人家分享出來的在pi上跑的就kernel
我是依照文件的版本,自己compile了2.x的QEMU
是跑起來,但還得去搞定網路介接的問題
喔....實在真的是...天啊!!是在折磨誰啊!!
又在pi kernel 網站上看到說有pi docker可用
我實在是懶得再試QEMU,就去看了該篇連結

在Ubuntu 16.04安裝了docker package之後
照著指令執行.....
docker run -it lukechilds/dockerpi
ㄝ.....真的可以開一個pi 作業系統
哈....哈.....可是不會用阿~~
我又沒接觸過docker............
恩....好吧~~就再上網找書吧~~
嗯....看起來有本不錯的書
而且是2019年出版的
Docker in Action, 2nd Edition

看了2~3天吧~~
就用下面的步驟建出我要的環境
你要是一直用docker hub上的image
當然每次叫起container
資料會被清空
我看過書上寫docker創始人的歷史
它應該是套用github的觀念

基本上我的做法就是

1.把docker image拉回來
docker pull lukechilds/dockerpi
2.把這個image存成我的
docker save -o raspberrypi.tar lukechilds/dockerpi
3.砍掉下載回來的image
docker rmi lukechilds/dockerpi
4.把之前的tar檔放進我的repository
docker load -i raspberrypi.tar
在這之後,要另起一個host主機的ssh session
5.我們看一下image的現狀
docker images
docker images
6. 啟動一個可以互動的container
因為至少要能開通ssh
而且要指定能由host主機進到container的22 port
docker run -it -p 5022:22 7c3e811cebaa
7. 看container的狀態
docker ps
docker ps
8.我們來看一下它可以連通的IP吧
docker inspect 5041b122b41b
我只擷取我們要看的重點
docker inspect
9.再來就是登入步驟4所開啟來的互動tty
預設pi/raspberrypi
然後sudo su - 切換成root
用來啟動ssh daemon
我習慣直接下指令
systemctl enable ssh
systemctl start ssh
當然raspbian有它的管理介面可以啟動
raspi-config
raspi-config
raspi-config
raspi-config
10.我們來從另一個host session進行登入測試吧~~
ssh -p 5022 pi@172.17.0.2
pi login
11. 再來就是上傳我之前反應給github nerd的問題
為什麼要從source compile
因為nerd在deb安裝似乎沒啟用
你有興趣的話
可以看Nagios 4 query handler : nerd這個case
case裡頭給了一版的branch
這一版configure --enable_nerd是不會出錯的
不然你就得拿以前舊版的來使用
當然你也可以不要用nerd
但我個人認為,它是個很重要的功能
scp -P 5022 nagioscore-4.4.5_nerd_fix.zip root@172.17.0.2:~
12. 最後當然就是安裝
groupadd nagios
groupadd nagioscmd
useradd -g nagios -G nagioscmd -d /home/nagios nagios
mkdir -p /home/nagios/etc
mkdir -p /home/nagios/var
mkdir -p /home/nagios/plugins
./configure --prefix=/home/nagios \
--sysconfdir=/home/nagios/etc \
--localstatedir=/home/nagios/var \
--libexecdir=/home/nagios/plugins \
--with-command-group=nagioscmd \
--enable-nerd \
--with-httpd-conf=/etc/apache2/sites-enabled
make && make install
安裝步驟你當然可以完全按照官方的文件
我原本想說,docker 在開發上這麼方便
container 或者image很容易就portable
結果昨晚我嘗試把container export
或者commit之後, save 為tar 檔
想把這些archive 檔匯入暈倒平台
結果window 的docker engine
需要打開Hyper-V
看起來是跟我筆電上的VirtualBox衝到
因為Hyper-V一起動,我的VM就叫不起來了
只好把Oracle VirtualBox移除掉
讓docker desktop for windows能夠順利安裝
但就是不能用,docker engine其實重起了一遍
網路上其實早就有人反應了這個問題
Docker does not start on Windows 10
另外就是要把它匯入家裡筆電的ubuntu docker
container import比較麻煩
因為得去查看source 執行甚麼指令
我看了一下,像是entrypoint.sh等
因此啟動匯入的container image需要帶入該指令
才能開得起來,但是...問題來了
所有變更都不見了
在source端commit container
再把image匯出搬到另一台主機
也一樣.......全部的變更都消失了
好吧....至少有個開發環境了
docker的問題,待研究囉~~


安裝freetds,參考"怎麼用Perl DBI"
安裝Sybase_DBD之前
需要先安裝DBI
得用apt來安裝
apt-get install libdbi-perl
想要借用別人的測試環境來測連通
怎麼測就是有問題
為了確認是不是freetds本身的問題
我就安裝了linux sql server
及起了一個sql server 的container
測完之後........應該是routing問題
host主機上的sql server 及
sql server container都可以連通
但我還是用tcpdump去看了封包的流向
哈....
然後再從其中一台測試機去tracert IP
因為是另一個機房,應該有172 網段
所以封包繞到最後就不見蛋了
perl 的local setting 也是有點問題
export LANGUAGE="en_GB.UTF-8"
export LANG=en_GB.UTF-8
export LC_ALL=en_GB.UTF-8
locale-gen en_GB.UTF-8
dpkg-reconfigure locales -> 預設選 en_GB.UTF-8 它
就解決了
參考How to fix Perl warning setting locale failed on Raspbian
Perl 出現 Setting locale failed 的解決辦法