These are some examples of routine maintenance you can do on a server.
Before maintenance begins, stop all cron daemons and monitoring services on the affected servers. Remember to restart them when you are finished.
Often, when there is an error or output from a cron job, the local mail service will send it to root@localhost.
Most of these messages are minor complaints from the system, but some point to cruft that needs to be cleaned up. For example, logrotate may complain that it can no longer find the logs of a service that was once installed.
Install alpine and set up the mail server to deliver mail locally for its own hostnames.
This is a good place to start to check for anomalies.
Verify that outgoing email works with the local sendmail instance. Verify that outgoing email through SendGrid is working.
Make sure they are all running and producing the output you want.
When programs are uninstalled, they sometimes leave behind scripts.
Find out how many PHP errors there are, what ones have been recent, and check for Warning and Fatal errors.
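A minimal sketch of how the PHP error counts could be gathered. It assumes the log uses PHP's default error_log line prefix ("[timestamp] PHP Level: message") and that the five levels listed in the pattern are the ones of interest. The function name is hypothetical.

```python
import re
from collections import Counter

# Assumed line prefix, e.g.
# "[12-Jan-2024 10:15:02 UTC] PHP Warning:  Undefined variable ..."
PHP_LINE = re.compile(
    r"^\[(?P<stamp>[^\]]+)\] PHP "
    r"(?P<level>Fatal error|Parse error|Warning|Notice|Deprecated):\s"
)


def summarize_php_errors(lines):
    """Return (count per level, timestamp of the last line seen per level)."""
    counts = Counter()
    latest = {}
    for line in lines:
        match = PHP_LINE.match(line)
        if not match:
            continue  # stack-trace continuation or unrelated line
        level = match.group("level")
        counts[level] += 1
        latest[level] = match.group("stamp")  # logs are appended in time order
    return counts, latest
```

Passing an open file object works as well as a list of lines, so a large log does not have to be read into memory at once.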
Check whether the files are growing large in size and need to be rotated.
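One possible way to spot files that have grown large is sketched here. The size limit is an assumption to tune per server, and the function name is hypothetical.

```python
import tempfile
from pathlib import Path


def oversized_files(directory, limit_bytes):
    """Return (path, size) pairs for files larger than limit_bytes, largest first."""
    found = []
    for path in Path(directory).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                found.append((path, size))
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Self-contained demo in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "small.log").write_bytes(b"x" * 10)
        (Path(tmp) / "big.log").write_bytes(b"x" * 5000)
        for path, size in oversized_files(tmp, 1000):
            print(f"{size:>10}  {path.name}")
```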
Move files to the locations required by current standards. PHP error logs are one example.
Make sure that all the proper backups are executing. Download and unpack a GPG tarball to verify it's correctly created.
Create md5sum hashes for backed up files that are stored in alternate places.
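A short sketch of how such hashes could be produced with Python's standard hashlib. The helper names are hypothetical. The output line imitates the "hash, two spaces, filename" layout that md5sum prints for files read in text mode.

```python
import hashlib


def md5_of_chunks(chunks):
    """Hash an iterable of byte strings without joining them in memory."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large backups fit in memory."""
    with open(path, "rb") as handle:
        return md5_of_chunks(iter(lambda: handle.read(chunk_size), b""))


def md5sum_line(path):
    """Format one manifest line in the style of md5sum output."""
    return f"{md5_of_file(path)}  {path}"
```

MD5 is adequate for detecting accidental corruption of a copied backup. It is not a defense against deliberate tampering.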
Make sure S3 keys work properly. Migrate backups to year-month folders as well.
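The year-month layout can be sketched as a small naming helper. The "YYYY-MM" folder format, the "backups" prefix, and the function name are all assumptions for illustration. The actual upload or move is left to whatever tool manages the storage.

```python
import posixpath
from datetime import date


def year_month_key(filename, backup_date, prefix="backups"):
    """Build a destination key such as 'backups/2024-03/db.sql.gz'."""
    folder = f"{backup_date.year:04d}-{backup_date.month:02d}"
    # Keep only the file name so the old directory layout is not carried over.
    return posixpath.join(prefix, folder, posixpath.basename(filename))
```

For example, a file named db.sql.gz dated 9 March 2024 maps to backups/2024-03/db.sql.gz.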
Check crontab entries to make sure they are commented and are clearly described. Verify cron jobs are executing.
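The comment check can be roughly automated. This sketch assumes the convention that every job line is immediately preceded by a comment line describing it. The function name is hypothetical, and the input would be the text printed by crontab -l.

```python
import re

# Lines such as MAILTO=root or PATH=/usr/bin set the environment, not a job.
ENV_ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*\s*=")


def undocumented_jobs(crontab_text):
    """Return job lines that are not directly preceded by a comment line."""
    flagged = []
    previous_was_comment = False
    for raw in crontab_text.splitlines():
        line = raw.strip()
        if not line:
            previous_was_comment = False  # a blank line breaks the pairing
            continue
        if line.startswith("#"):
            previous_was_comment = True
            continue
        if ENV_ASSIGNMENT.match(line):
            continue  # environment settings need no description
        if not previous_was_comment:
            flagged.append(line)
        previous_was_comment = False
    return flagged
```

This only checks that a comment exists. Whether the description is clear still needs a human reader.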
Install crag.
Make sure all services are being monitored and properly reported.
Upgrade to the latest version if necessary. Refresh modules to make sure everything is available in the menu. Set up IP restrictions. Set up the hostname configuration so it goes to the right URL. Set up SSL using local certificates if available.
Run NTP client to sync local time.
Make sure all the system and application logs are both being created and properly rotated. Make sure that programs are logging to the latest log file, and not an old one.
Remove old logs that are not needed anymore.
Make sure that MySQL is doing proper backups. Do test imports on an alternate system.
Make sure that they are being properly created, and are still accessible.
Check for old entries and remove them if necessary.
Make sure that everything is up-to-date with the latest security releases.
Look for any logins, cron jobs, or personal files left on servers by former employees. Remove any old SSH authorized key entries on accounts that we have access to.
Look through known SSH hosts, and document any that we no longer have access to, or should not have access to.
Scan our public servers for issues with networking or security.
Verify that the firewall is configured properly, and is allowing access from all the right IP addresses.
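A minimal sketch of the address check, using Python's standard ipaddress module. It assumes the firewall's allow rules have already been exported as a list of addresses and CIDR networks. The function names and the example addresses (taken from the documentation ranges) are hypothetical.

```python
import ipaddress


def is_allowed(address, allowed_networks):
    """True if address falls inside any allowed network or single host."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(net) for net in allowed_networks)


def missing_access(expected_addresses, allowed_networks):
    """Return the expected addresses that no allow rule covers."""
    return [
        address
        for address in expected_addresses
        if not is_allowed(address, allowed_networks)
    ]


# Example allow list: one network and one single host.
ALLOWED = ["192.0.2.0/24", "198.51.100.7"]
```

This only checks the allow list against the addresses that should have access. It does not confirm that the firewall actually enforces those rules, which still needs a connection test from each address.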
Upgrade to latest version. Disable Heartbeat extension and old ciphers.
Check for mis-configured SSH services.
Create README and MOTD files for servers that have known issues or pending changes, so that users are aware of them when they SSH into a remote box.