My webservers have been quite stable, probably 3-4 months now without a reboot. I have 2 copies of a disk on the main server, but since I will be going to China soon, I thought I would create a third copy and put it in a third server as a further backup. If both servers go down, the third one can be rebooted with the correct IP quickly.
I use gmirror to duplicate the master disk as a backup, but somehow if I stop gmirror before rebooting the machine, the slave is not bootable. So I always reboot, take one drive out and swap another one in. The risk here is that if I take the wrong drive out (the master), the new drive might start gmirror automatically (since it was not stopped the previous round) and destroy the good copy in a few seconds. gmirror copies bit by bit, so overwriting only a small part of the drive is enough to render it unbootable.
This is what happened today. The main server has 8 SATA ports, and one drive (which I thought was the slave) was on sata6. But Unix says it is ad0, not ad6! The machine booted from this one instead of the drive on sata4. After a while I realized the machine had booted from the wrong drive and rebooted again (I should have issued “gmirror forget gm0” to disable gmirror). I made sure the master was on sata4 and the slave on sata6; this was still before I realized this motherboard was confused about SATA numbers. I rebooted with the other good copy, and it was destroyed in seconds… Now I had lost two main drives! I should have booted with one single drive (the good one) first… or tried a different machine.
Luckily most data (WordPress posts, both HTML files and MySQL data) is backed up daily and transferred automatically (through rsync) to the backup server. I had to use a backup server disk on the main server. Unluckily, I had not backed up some files, e.g. httpd.conf! I had to create a new one… restore the web files… try to remember what else was missing… This took most of the day today.
Setting up rsync again took me hours… it simply refused to work! Finally I created a new user and it worked in a few minutes… something is wrong with my regular user name.
What a horrible mistake! Actually, I made the same mistake twice!
Trying to get a third copy, I ended up destroying two good ones!
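The main and secondary servers had both been stable all along, so I could have just skipped the backup. But yesterday I made a third copy of the secondary one; it went smoothly and was done in 5 minutes.
Worried that something might go wrong while I am back in China, I figured I would back up the main server too. Right now there are only 2 copies, both in the same machine.
I saw a piece of tape on one HD and assumed it was the slave, so I swapped another HD in. The machine booted from the swapped-in drive and destroyed the good backup (gmirror started automatically and copied bit by bit onto the good disk; it only takes a few seconds and the good disk is gone, no longer bootable).
I looked over the machine, then put the other good drive in on SATA port 4 and the drive to be backed up on port 6, but it got overwritten again! Only at the end did I find out: on most computers the drive on the lowest-numbered SATA port boots and becomes the master, but this one is odd! What is clearly SATA6 is recognized by Unix as ad0, not ad6, and that led to the big mistake! I had no other backup, only the web pages plus the MySQL database that are backed up automatically to the secondary machine every night. Why did I never notice this before? And why, after the first drive was destroyed, did I have to try again? Switching to a different machine would have avoided the whole thing.
But there was no recent copy of httpd.conf! It took a few hours to rewrite. Fortunately the web pages all came back. But in the end I found that the photos uploaded in 2011 are gone (I had only uploaded those from one conference, so it is not too bad). The HTTP logs are all gone too (so there are no access stats for this year). What else is missing? I have not found out yet.
From now on a lot of config files need to be backed up regularly, such as /etc/rc.conf, httpd.conf, the named DB files, etc. Otherwise this kind of thing is a killer.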
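
For the config files mentioned above, something like the following might do as a starting point. It is only a rough sketch, not what I actually run: the function name `backup_configs`, the `CONFIG_FILES` list and the `/backup/configs` destination are made up for illustration, and the httpd.conf and named paths are guesses that differ from one install to another. It copies each file that exists into a dated snapshot directory and reports the ones it could not find.

```python
import shutil
import time
from pathlib import Path

# Hypothetical list based on the files named in the post. The httpd.conf
# and named paths are assumptions; adjust them to the actual install.
CONFIG_FILES = [
    "/etc/rc.conf",
    "/usr/local/etc/apache22/httpd.conf",
    "/etc/namedb/named.conf",
]


def backup_configs(files, dest_root, stamp=None):
    """Copy each existing file into dest_root/<stamp>/, keeping its full path.

    Returns (copied, missing): two lists of source paths as strings.
    """
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = Path(dest_root) / stamp
    copied, missing = [], []
    for name in files:
        src = Path(name)
        if not src.is_file():
            # Report instead of failing, so one wrong path does not
            # stop the rest of the backup.
            missing.append(str(src))
            continue
        # Re-root the absolute path under the snapshot directory,
        # e.g. /etc/rc.conf -> <target>/etc/rc.conf
        dst = target / src.relative_to(src.anchor)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also keeps the modification time
        copied.append(str(src))
    return copied, missing
```

Calling `backup_configs(CONFIG_FILES, "/backup/configs")` from a nightly cron job would leave one dated directory per run, and that directory could then go to the backup server with the same rsync job that already carries the web files and MySQL dumps. The list of missing files is worth printing, since a wrong path is exactly the kind of gap that only shows up after a disk is lost.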