f 四月 2020 ~ 迪貝之家

Microsoft SQL Server Log Shipping

SQL 7就有的技術,簡單的說就是資料庫複本資料同步的技術,而這份複本可以開放讀取的功能以分散營業資料庫的負載,也可純粹當成備援資料庫,不管是在本地還是異地。怎麼達成? 透過SQL Agent自動化排程進行備份、傳檔及復原等作業。

Oracle GolGateden

機房搬遷跨Site能即時同步資料庫利器

Nagios 資料庫維運自動化

一開始設計這架構時,就排除使用remote agent的想法因為在專業分工的組織下,要求安裝新軟體於既有系統是一件不太可行的方案,既然身為DBA就只能把資料庫instance當作是一個最大的agent 想辦法在資料庫內做到我想做的事情

This is default featured post 4 title

Go to Blogger edit html and find these sentences.Now replace these sentences with your own descriptions.

This is default featured post 5 title

Go to Blogger edit html and find these sentences.Now replace these sentences with your own descriptions.

Tibero with AS

啟動AS, 就是類比Oracle ASM
TAS
啟動tibero db, 就是類比Oracle DB
Tibero
我們來看TAS instance的參數檔
tip
我們來看db instance的參數檔
tip
這是在TAS 建立DISK SPACE ds0的語法
CREATE DISKSPACE ds0 NORMAL REDUNDANCY
FAILGROUP fg1 DISK
'/dev/disk1' NAME disk1
FAILGROUP fg2 DISK
'/dev/disk2' NAME disk2
ATTRIBUTE 'AU_SIZE'='4M';
來我們來看類似asmcmd的介面
tbascmd
我們再來看資料庫內的datafile
tibero datafile
哈哈......
原來我建資料庫的時候
如果沒指定放在Disk Space的
tibero會自動幫你放在$TB_HOME/database
所以要測TAC就得小心了
因為是在筆電上測試
記憶體參數都調很小
所以就一直出現shared memory allocate的問題
out of memory
我就不斷地調整VM的memory
調到6G了...怎麼會問題還是依舊
最後在Internet上找到了他們的error guide
看了敘述才發覺...原來如此

Patching Step 30920127 - GI Apr 2020 Release Update 12.2.0.1.200414 on HPUX

For Grid Infrastructure Home, as home user:

$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir  /oracle/patch/30920127/30882603
$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir  /oracle/patch/30920127/30886680
$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir  /oracle/patch/30920127/30888810
For Database home, as home user:

$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir  /oracle/patch/30920127/30886680
$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir  /oracle/patch/30920127/30882603
Execute the following on each node of the cluster in non-shared CRS and DB home environment to apply the patch.

1.    Stop the CRS managed resources running from DB homes.

If this is a GI Home environment, as the database home owner execute:
$ /bin/srvctl stop home -o -s -n
srvctl stop home -o $ORACLE_HOME -s /home/oracle/patch.txt -n vppracn1

srvctl stop home -o $ORACLE_HOME -s /home/oracle/patch.txt -n vppracn2

Note:

You need to make sure that the Oracle ACFS file systems are unmounted (see Section 2.8) and all other Oracle processes are shutdown before you proceed.

2.    Run the pre root script.
this is a GI Home, as the root user execute:
# /crs/install/rootcrs.sh -prepatch
$ORACLE_HOME/crs/install/rootcrs.sh -prepatch


3.    Patch GI home.

As the GI home owner execute:
$ /OPatch/opatch apply -oh -local /%BUGNO%/%OCW TRACKING BUG%
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30920127/30882603
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30920127/30886680
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30920127/30888810


4.    Patch DB home.
注意事項
因oracle的Inventory與grid的Inventory住在一起,所以下面路徑檔案在打第四步Patch DB home的時候記得要先改為oracle:oinstall
打完DB的之後記得要改回來grid:oinstall,改完後才能在繼續往下
/u01/app/oraInventory/ContentsXML]$ ll
total 6
-rw-rw----   1 grid       oinstall       329 Jun 15 13:41 comps.xml
-rw-rw----   1 grid       oinstall       670 Jan 15 10:55 inventory.xml
-rw-rw----   1 grid       oinstall       292 Jun 15 13:41 libs.xml

As the database home owner execute:
$ /%BUGNO%/%OCW TRACKING BUG%/custom/scripts/prepatch.sh -dbhome
/oracle/patch/30920127/30882603/custom/scripts/prepatch.sh -dbhome $ORACLE_HOME

$ /OPatch/opatch apply -oh -local /%BUGNO%/%OCW TRACKING BUG%
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30920127/30886680
$ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /oracle/patch/30920127/30882603

$ /%BUGNO%/%OCW TRACKING BUG%/custom/scripts/postpatch.sh -dbhome   
/oracle/patch/30920127/30882603/custom/scripts/postpatch.sh -dbhome $ORACLE_HOME

5.    Run the post script.

As the root user execute:
# /rdbms/install/rootadd_rdbms.sh
$ORACLE_HOME/rdbms/install/rootadd_rdbms.sh

If this is a GI Home, as the root user execute:
# /crs/install/rootcrs.pl -postpatch
$ORACLE_HOME/crs/install/rootcrs.sh -postpatch

If this is an Oracle Restart Home, as the root user execute:

6.    If the message, "A system reboot is recommended before using ACFS is shown, then a reboot must be issued before continuing. Failure to do so will result in running with an unpatched ACFS\ADVM\OKS driver.

7.    Start the CRS managed resources that were earlier running from DB homes.

If this is a GI Home environment, as the database home owner execute:
$ /bin/srvctl start home -o -s -n
srvctl start home -o $ORACLE_HOME -s /home/oracle/patch.txt -n vppracn1

srvctl start home -o $ORACLE_HOME -s /home/oracle/patch.txt -n vppracn2


8. For each database instance running on the Oracle home being patched, run the datapatch utility as described in next table.


sqlplus / as sysdba
startup
quit
cd $ORACLE_HOME/OPatch
./datapatch -verbose

如果datapatch是空的,not apply
則需要手動處理 (請參考DOC 1955058.1)

Table with securefile lob column needs a regular reorg

終於有機會驗證 secure lob file經過重整後
可以釋放出多少空間了
你想了解事情始末
可以閱讀底下這篇文章
Delete table with lob securefile causes fragmentation
處理指令如下:
ALTER TABLE MESSAGEBACKUPSTORE_BCK MOVE LOB(MESSAGEBODY) STORE AS (TABLESPACE BLOBBCK) parallel 8;
BLOBBCK 表格空間只放了lob 物件
因此比對處理前後的使用空間就可以知道釋放出多少空間了
處理前 209483 MB
處理後 92758 MB

Nagios hostgroup vs servicegroup

雖然兩種group 的設定在UI上都有分群的效果
但是兩者的功能不一樣
如果在多台主機上跑相同的服務
除了主機的IP不同外
沒有絲毫的差異
你可以透過hostgroup的我工作上的ˊ設置
一次性套用該服務的監控到hostgroup的所有member主機
舉我工作上的實務案例來說好了
寫這篇文章的這個時間點來看
我有16個MSSQL 的server instance要管理
把我手上所有的sql server 主機都指定歸屬於STEVEN-MSSSQL的group
設定error log 監控service時
把hostgroup directive 指定為STEVEN-MSSQL
重開Nagios後
就會把"log監控"套用到這16個instance
不需要一台一台定義"log 監控 service"
其實日常檢核絕大多數都是每套sql server通用的routine
假如你有5個routine有做
一台一台設定,等於是要寫80次的service 定義
煩不煩阿~~
我們來看實際的hostgroup 設定
Nagios hostgroup directive
再來看"log 監控" service 的設定
再來看WEB UI上的呈現
Nagios hostgroup


hostgroup是方便我們進行一次性大量部署service監控之用
就像我取得名稱一樣,它是站在系統管理者的面向的設定功能
STEVEN-MSSQL,就是我STEVEN手上管理的所有SQL SERVER INSTANCE
你如果要從業務或者組織角度
甚或功能及關連性的面向來看的話
那就是透過servicegroup
這個功能最適合給客戶看
我們先來看組織面的畫面
Nagios servicegroup
再來看業務角度的畫面
Nagios servicegroup
最後我們來看設定
Nagios servicegroup directive
members 的設定是
主機,服務列,主機,服務列.... 等配對
我這邊的服務列設定為*
就表示該host的所有監控都是
你可以去想一下
如果一個業務包含了web*3及DB * 2
那你就可以去想像,在設定上該如何去列舉

QEMU Simulator Raspberry Pi

之前寫了一篇使用docker去模擬pi的文章
但測試上會有一些問題
因為我的測試環境是在Virtualbox
docker裏頭的Pi環境又是另一層虛擬化
所以我要從Pi作業系統連通到sql server
一直出錯,連不上去
想用docker是因為懶得為了這個去研究QEMU的網路
但看起來是沒辦法
我也測過將docker的網路啟動為host
還是連不上主機外的sql server
因為QEMU的預設網路是不讓外面的封包進去的
除非你用brige
所以莫法度囉,只好再去研究linux brige
也是搞了好幾天
最後才找到真的有用的一篇文章
Bridging of QEMU emulated Raspberry pi
總之,按照前述的Po文,我編了兩個檔
setup.sh : 用來在virtualbox上的guest設定brige及tap介面
boot.sh : 用來啟動pi
先執行setup.sh
再執行boot.sh
setup.sh 的內容如下 :
brctl addbr br0
ip addr flush dev enp0s3
brctl addif br0 enp0s3
tunctl -u $(whoami)
brctl addif br0 tap0
ip link set dev br0 up
ip link set dev tap0 up
dhclient br0
boot.sh的內容如下 :
qemu-system-arm -nographic -kernel ./kernel-qemu-4.4.34-jessie -cpu arm1176 -m 256 -M versatilepb -no-reboot -append "root=/dev/sda2 panic=1 rootfstype=ext4 rw console=ttyAMA0" -hda 2020-02-13-raspbian-buster-lite.img -net nic,macaddr=00:16:3e:00:00:01 -net tap,id=mynet0,ifname=tap0,script=no,downscript=no
pi開起來之後去編譯feetds
這是透過tsql 的連線測試
freetds
通了...真的通了
但.....碰到空間問題
root filesystem怎麼會只有1.3G
玩個屁阿
用raspi-config去extend filesystem
它回覆說,這不是SD卡......
回想之前用的docker pi沒這個問題啊!!
想說...既然docker pi的creator可以解決
那表示Internet應該是找得到資料的
果不其然,有囉...確實有這篇文章
QEmu and raspbian, how to increase image size
只是要非常的有耐心
因為這篇Po文有好幾個回覆
解法在其中一個回覆裡
這是我的作法
1. 關掉pi,回到virtualbox guest主機裡
下指令resize raspbian lite image
qemu-img resize 2020-02-13-raspbian-buster-lite.img +20G
再執行boot.sh啟動pi
2.登入pi 砍掉partition2 後,再重建出來
root@raspberrypi:/usr/local/freetds/bin# fdisk /dev/sda
Welcome to fdisk (util-linux 2.33.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): d
Partition number (1,2, default 2): 2
Partition 2 has been deleted.
Command (m for help): p
Disk /dev/sda: 21.7 GiB, 23324524544 bytes, 45555712 sectors
Disk model: QEMU HARDDISK
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x738a4d67
Device Boot Start End Sectors Size Id Type
/dev/sda1 8192 532479 524288 256M c W95 FAT32 (LBA)
Command (m for help): n
Partition type
p primary (1 primary, 0 extended, 3 free)
e extended (container for logical partitions)
Select (default p): p
Partition number (2-4, default 2):
First sector (2048-45555711, default 2048): 524288
Sector 524288 is already allocated.
First sector (532480-45555711, default 532480):
Last sector, +/-sectors or +/-size{K,M,G,T,P} (532480-45555711, default 45555711):
Created a new partition 2 of type 'Linux' and of size 21.5 GiB.
Partition #2 contains a ext4 signature.
Do you want to remove the signature? [Y]es/[N]o: N
Command (m for help): p
Disk /dev/sda: 21.7 GiB, 23324524544 bytes, 45555712 sectors
Disk model: QEMU HARDDISK
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x738a4d67
Device Boot Start End Sectors Size Id Type
/dev/sda1 8192 532479 524288 256M c W95 FAT32 (LBA)
/dev/sda2 532480 45555711 45023232 21.5G 83 Linux
Command (m for help): w
The partition table has been altered.
Failed to remove partition 2 from system: Device or resource busy
Failed to add partition 2 to system: Device or resource busy
The kernel still uses the old partitions. The new table will be used at the next reboot.
Syncing disks.
3.關掉pi重開再登入,下指令resize root file system

Nagios Core 3.5 升級至 4.4.5

compile 完source 之後
用nagios 4的binary 跑3.5的設定檔
看有沒有錯誤
有錯誤的話,想辦法修正掉就好了
warning可以不用理它
像我的環境檢核指令如下
/home/nagios/bin/nagios -v /etc/nagios3/nagios.cfg
上述指令的解釋
Nagios Configuration Check

錯誤的example如下:
Error: Empty value for contact_groups (config file '/etc/nagios3/objects/templates/She-generic-service.cfg', starting on line 2)
Error: Could not add object property in file '/etc/nagios3/objects/templates/She-generic-service.cfg' on line 6.
Error: Invalid max_check_attempts value for host '3GBT'
Error: Could not register host (config file '/etc/nagios3/objects/services/MSSQL/steven/PG_hosts.cfg', starting on line 27)
ERROR修正完之後
把nagios.cfg 及apache password 檔拷貝到你指定的位置後
把nagios 開起來就可以了
systemctl start nagios
web介面的存取改為
http://yourhost/nagios
其他還是一樣沿用原有nagios3的設定檔的位置
systemctl status nagios