A Guide for Archiving Web Pages
Server maintenance and archive mirroring
Servers are both hardware and software. The term encompasses the computers on which data files are stored, the operating systems of those computers, and the software for interacting with the data files. Server maintenance is mostly a matter of common sense: both the hardware and the software need regular upkeep, and a maintenance policy should schedule regular checks to ensure that everything is working properly. Because data files and the magnetic media that house them are inherently unstable, web archives should be mirrored to at least one other site. Mirroring software can automate the replication of web archives.
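Because a mirror only protects an archive if its copies actually match, it can help to check the mirror against the primary from time to time. The following is a minimal sketch in Python, assuming both copies are reachable as local directories; the function names (`sha256_of`, `manifest`, `compare_mirror`) are illustrative and do not come from any particular mirroring tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_mirror(primary: Path, mirror: Path) -> dict[str, list[str]]:
    """Report files missing from the mirror, extra in it, or differing."""
    a, b = manifest(primary), manifest(mirror)
    return {
        "missing": sorted(set(a) - set(b)),
        "extra": sorted(set(b) - set(a)),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

A call such as `compare_mirror(Path("/srv/archive"), Path("/mnt/mirror/archive"))` (both paths hypothetical) returns three lists; a healthy mirror would leave all three empty.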
Sample Server Preventative Maintenance Schedule Table
Below are some common server maintenance tasks, grouped by frequency. This list will be updated periodically based on observations and client requirements.
| Frequency | Task |
| --- | --- |
| Daily | Data backup log review; telephone support; system availability monitoring |
| Weekly | Check server error logs; check the storage device statistics; review file server cache statistics; check server disks and volumes; update security patches at the OS and application level; review antivirus definition files |
| Monthly | Review all users and objects on the network to make sure that there are no intruders, obsolete accounts, or unauthorized accounts; test the backup device; operating system updates; disk volume maintenance; review permissions and ACLs |
| Every 3–6 months | Check the Uninterruptible Power Supply (UPS) to make sure it is running properly; apply major application software updates, if available |
| Yearly | Asset review and system audit |
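A schedule like the one above can also be tracked in a small script that reports which groups of tasks are due. The sketch below is one possible approach, not a prescribed tool: the interval lengths are assumptions (a month is taken as 30 days and the 3–6 month row as 90 days), and `frequencies_due` is a hypothetical helper name.

```python
from datetime import date, timedelta

# Assumed interval, in days, for each row of the schedule.
INTERVAL_DAYS = {
    "Daily": 1,
    "Weekly": 7,
    "Monthly": 30,            # assumption: a month is treated as 30 days
    "Every 3–6 months": 90,   # assumption: use the short end of the range
    "Yearly": 365,
}


def frequencies_due(last_done: dict[str, date], today: date) -> list[str]:
    """Return the schedule rows whose interval has elapsed.

    A row with no recorded completion date is always reported as due.
    """
    due = []
    for label, days in INTERVAL_DAYS.items():
        previous = last_done.get(label)
        if previous is None or today - previous >= timedelta(days=days):
            due.append(label)
    return due
```

For example, if every row was last completed on 1 January 2024, then on 9 January 2024 the function reports only the daily and weekly rows as due.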
Some additional resources on server maintenance and mirroring
- Server Preventative Maintenance Schedule, a sample from a network engineering site.
- Server (computing), an article from Wikipedia.
- Mirror (computing) from Wikipedia.
- rsync from Wikipedia: "rsync is a software application for Unix systems which synchronizes files and directories from one location to another while minimizing data transfer using delta encoding when appropriate."
- CVSup from Wikipedia: "CVSup is a computer program that synchronizes files and directories from one location to another while minimizing data transfer using file-type specific delta encoding when appropriate. CVSup was designed for keeping source code repositories - such as CVS - synchronized, but has been extended to support synchronizing any type of file."
- Wget from Wikipedia: "GNU Wget is a simple computer program that retrieves content from web servers, and is part of the GNU Project. Its name is derived from World Wide Web and get, connotative of its primary function. It currently supports downloading via HTTP, HTTPS, and FTP protocols, the most popular TCP/IP-based protocols used for web browsing. Its features include recursive download, conversion of links for offline viewing of local HTML, support for proxies, and much more."
- Unison (file synchronizer). "Unison is a file-synchronization tool for Unix and Windows. It allows two replicas of a collection of files and directories to be stored on different hosts (or different disks on the same host), modified separately, and then brought up to date by propagating the changes in each replica to the other. Unison shares a number of features with tools such as configuration management packages (CVS, PRCS, Subversion, BitKeeper, etc.), distributed filesystems (Coda, etc.), uni-directional mirroring utilities (rsync, etc.), and other synchronizers (Intellisync, Reconcile, etc). However, there are several points where it differs."
- PowerFolder from Wikipedia: "PowerFolder is a program that synchronizes files and folders over the internet or a LAN. For this program to work, it must be installed on all computers that will share the files."
- Jigdo from Wikipedia: "Jigdo is a download utility that downloads files from several mirrors in order to build an optical disk image."
- Grsync from Wikipedia: a graphical user interface (GUI) for rsync.
- cwRsync from Wikipedia: "cwRsync is a packaging of Rsync and Cygwin. You can use cwRsync for fast remote file backup and synchronization."
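As an illustration of how one of the tools above might be scripted, the following Python sketch builds and runs an rsync command that mirrors an archive directory to a second location. It assumes rsync is installed on the machine; the host and paths in the usage note are hypothetical, and the helper names are illustrative only. The flags used (`--archive`, `--compress`, `--delete`, `--dry-run`) are standard rsync options.

```python
import subprocess


def build_rsync_command(source: str, destination: str, dry_run: bool = True) -> list[str]:
    """Build an rsync command that makes destination match source.

    --archive   preserve permissions, timestamps, and directory structure
    --compress  compress data in transit
    --delete    remove files from the mirror that no longer exist at the source
    --dry-run   only report what would change (enabled by default for safety)
    """
    command = ["rsync", "--archive", "--compress", "--delete"]
    if dry_run:
        command.append("--dry-run")
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    command += [source.rstrip("/") + "/", destination]
    return command


def run_mirror(source: str, destination: str, dry_run: bool = True) -> None:
    """Run the mirror; raises CalledProcessError if rsync reports a failure."""
    subprocess.run(build_rsync_command(source, destination, dry_run), check=True)
```

A call such as `run_mirror("/srv/archive", "mirror.example.org:/srv/archive/")` would perform a dry run against a hypothetical remote host; passing `dry_run=False` performs the actual transfer. Because `--delete` removes files from the mirror, reviewing the dry-run result first is a sensible precaution.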