Seamless backups with S3Sync & Amazon S3 Storage

Dec 16



At Sneek we take backups very seriously, and so should you! Anyone who has a Mac and knows how perfect Time Machine is knows backups shouldn’t be painful.

Amazon provides an extremely cheap, simple storage solution (S3), and today I am going to take you through setting up an automated backup solution, which should only take an hour or so.

Firstly the requirements:

  • SSH access to your server
  • Ruby installed on the server in question
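Once you are logged in over SSH, a quick check like the one below can confirm that Ruby is available before you go any further. This is only a sketch; the helper name have_cmd is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd ruby; then
    echo "ruby found: $(command -v ruby)"
else
    echo "ruby not found - install it before continuing"
fi
```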

Install & Configure S3Sync

S3Sync is an open source set of Ruby tools that interface with the Amazon S3 service. To begin with we need to install S3Sync. We chose to install s3sync in our /home/ directory, however you can install the scripts anywhere you feel comfortable.

cd /home
wget http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
tar -xvf s3sync.tar.gz
rm -f s3sync.tar.gz
cd s3sync

So far we have grabbed the library, unpacked it, removed the archive and moved into the folder. Time to configure s3sync. Copy the example config file to create a new one where our S3 API keys will be held.

cp s3config.yml.example s3config.yml

If you haven’t used the vi editor before, I suggest reading up on it. Begin editing the config file:

vi s3config.yml

You will need to place your S3 API keys (taken from your account area) into the editor like so:

AWS_ACCESS_KEY_ID: 555555555555555
AWS_SECRET_ACCESS_KEY: 55555555555555555555555
AWS_CALLING_FORMAT: SUBDOMAIN

Note: If you are using the EU servers for storage like we are, set the AWS_CALLING_FORMAT parameter to SUBDOMAIN for it to work correctly. Full details about the bucket URIs can be found in the s3sync readme.

Save and exit (press Escape, then type :x and press Enter).
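Because this file now holds your secret key, it is worth restricting who can read it. The sketch below does that and also checks that all three parameters are present. The helper name check_s3config is hypothetical, and it only looks for the key names, it does not validate the values.

```shell
#!/bin/sh
# Hypothetical helper: checks that a config file defines all three keys
# used in this post. It does not validate the key values themselves.
check_s3config() {
    conf=$1
    for key in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_CALLING_FORMAT; do
        if ! grep -q "^${key}:" "$conf"; then
            echo "missing ${key} in ${conf}" >&2
            return 1
        fi
    done
}

# Make the file readable by its owner only, then check its contents.
if [ -f s3config.yml ]; then
    chmod 600 s3config.yml
    check_s3config s3config.yml && echo "s3config.yml looks complete"
fi
```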

Additional Config Step

To make the script work correctly, we need to modify the s3config.rb file so that s3sync knows where it should look for the other config file (s3config.yml). We do this by adding another entry to the confpath array.

Edit the file s3config.rb (we will use the VI editor again).

# Change line 15 from
confpath = ["#{ENV['S3CONF']}", "#{ENV['HOME']}/.s3conf", "/etc/s3conf"]
# to
confpath = ["./","#{ENV['S3CONF']}", "#{ENV['HOME']}/.s3conf", "/etc/s3conf"]

This way when we write our shell script it will look in the current folder for the config files.
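If you are ever unsure which config file will be picked up, the search order can be approximated in shell. This is a rough sketch of the modified confpath, not s3sync's own code, and the function name find_s3config is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper approximating the modified confpath: prints the
# first directory in the search order that contains s3config.yml.
find_s3config() {
    for dir in "." "${S3CONF:-}" "${HOME}/.s3conf" "/etc/s3conf"; do
        # Skip the S3CONF entry when that variable is unset or empty.
        [ -n "$dir" ] || continue
        if [ -f "$dir/s3config.yml" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

find_s3config || echo "no s3config.yml found in the search path"
```

Run it from /home/s3sync and it should print a single dot, meaning the current folder wins.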

Time for some shell scripting

You can download the final script, however it’s always best to know how it was built up. Let’s begin by creating a cron jobs folder ready for our backup script template.

mkdir cronjobs
cd cronjobs
touch template.backup.sh

Open the file using your favorite editor and add the following:

#!/bin/sh
NOW=$(date +"%Y.%m.%d")
# Start Config
BACKUP_DOMAIN=example.com
BUCKET=example_bucket:$BACKUP_DOMAIN
BACKUP_FILE_NAME=$NOW.httpdocs.tar.gz
BACKUP_DIR=/var/www/vhosts/$BACKUP_DOMAIN
SCRIPTS_DIR=/home/s3sync
# End Config

The script starts out by defining the current date, as we are using this script for a daily backup. We then define a set of variables used throughout the script. They are all quite self-explanatory, however they may need a little customising based on your setup.

We have a ‘backups’ bucket, which we split into the various domains. Don’t worry about creating the folders in the bucket as s3sync takes care of it all for you.
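To see what the config variables expand to, you can run them on their own. The sketch below reuses the post's example values and assumes the domain is referenced as $BACKUP_DOMAIN inside the other variables.

```shell
#!/bin/sh
# Same date format as the backup script; values are the post's examples.
NOW=$(date +"%Y.%m.%d")
BACKUP_DOMAIN=example.com
BUCKET=example_bucket:$BACKUP_DOMAIN
BACKUP_FILE_NAME=$NOW.httpdocs.tar.gz

echo "archive name: $BACKUP_FILE_NAME"
echo "destination:  $BUCKET"
```

The archive name starts with the year, month and day separated by dots, so each day's backup gets its own file under the domain's prefix in the bucket.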

Next we will define and create the log file and tmp folders:

RUBY="$(which ruby)"
LOG_DIR="$SCRIPTS_DIR/logs"
TMP_DIR="$SCRIPTS_DIR/tmp"
BACKUP_LOG="$LOG_DIR/$BACKUP_DOMAIN.backup.log"
# Go to the correct folder
cd "$SCRIPTS_DIR"
# Make dirs if not there
mkdir -p "$LOG_DIR"
mkdir -p "$TMP_DIR"
touch "$BACKUP_LOG"

Now we will start to fill the log file with some simple information, then compress the relevant directory / virtual host and place the archive in the tmp directory.

echo "====== $NOW - Backup Started ======" >> $BACKUP_LOG
echo "---- Compression Started ----" >> $BACKUP_LOG
tar czPf $TMP_DIR/$BACKUP_FILE_NAME $BACKUP_DIR
echo "---- Compression Finished ----" >> $BACKUP_LOG

Now for the s3 magic

# Upload time
echo "---- Upload Started ----" >> $BACKUP_LOG
$RUBY s3sync.rb -r $TMP_DIR/ $BUCKET
echo "---- Upload Finished ----" >> $BACKUP_LOG
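As with compression, the log line is written even if the upload fails. A small wrapper can record the exit status instead. The sketch below assumes s3sync.rb exits with a non-zero status on failure, which we have not verified, and run_logged is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: run a command, then append a success or failure
# line (with the exit status) to the given log file.
run_logged() {
    log=$1
    label=$2
    shift 2
    if "$@"; then
        echo "---- $label Finished ----" >> "$log"
    else
        status=$?
        echo "---- $label FAILED (exit $status) ----" >> "$log"
        return "$status"
    fi
}

# Possible use inside the backup script (not run here):
#   run_logged "$BACKUP_LOG" "Upload" "$RUBY" s3sync.rb -r "$TMP_DIR/" "$BUCKET"
```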

Thank you s3sync!

Lastly we will clean up our tmp directory so our server doesn’t get full of backups.

# Remove tmp files
echo "---- Removing Tmp Files Started ----" >> $BACKUP_LOG
rm -rf $TMP_DIR/*
echo "---- Removing Tmp Files Finished ----" >> $BACKUP_LOG
echo "====== $NOW - Backup Finished ======" >> $BACKUP_LOG
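A word of caution on the rm -rf line: if TMP_DIR were ever empty, it would expand to /* and that is not a mistake you want to make as root. A guarded version might look like the sketch below, where clean_tmp is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: empty a temp directory, refusing to run when the
# path is empty or is the filesystem root.
clean_tmp() {
    dir=$1
    case "$dir" in
        ""|"/")
            echo "refusing to clean '$dir'" >&2
            return 1
            ;;
    esac
    # Nothing to do if the directory does not exist.
    [ -d "$dir" ] || return 0
    # Like the original line, this leaves hidden (dot) files in place.
    rm -rf "$dir"/*
}

# Possible use inside the backup script (not run here):
#   clean_tmp "$TMP_DIR"
```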

Save and close the editor.

Testing the Script

Like any new code, it requires testing (and debugging!). It will most likely fail to begin with because the file is not yet executable. So:

chmod +x template.backup.sh

Testing time! (Provided you have filled out the script with the domain you want to back up and all of the other locations are correct)

sudo ./template.backup.sh
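After a run, the log is the first place to look. The sketch below checks whether the last line of the log is the "Backup Finished" marker. Bear in mind the script writes that marker unconditionally, so this only tells you the script reached the end, not that every step succeeded. The helper name and the log path (built from the post's example values) are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the last line of the log is a
# "Backup Finished" marker.
backup_finished() {
    tail -n 1 "$1" | grep -q "Backup Finished ======"
}

# Path assumes SCRIPTS_DIR=/home/s3sync and BACKUP_DOMAIN=example.com.
LOG=/home/s3sync/logs/example.com.backup.log
if [ -f "$LOG" ]; then
    if backup_finished "$LOG"; then
        echo "last run reached the end of the script"
    else
        echo "last run did not finish - check $LOG"
    fi
fi
```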

Did you win? We hope you did! If not leave a comment and we’ll do our best to sort it out.


Winning

If you love winning like Charlie Sheen we can only suggest you copy the script (once customised and tested) into your system's daily cron folder. Want code?

cp ./template.backup.sh /etc/cron.daily

Don’t forget you will need to chmod that as well!
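One thing to watch for: on Debian-based systems the scripts in /etc/cron.daily are started by run-parts, which by default skips any file whose name contains characters other than letters, digits, underscores and hyphens. A name like template.backup.sh would therefore be ignored there. Other distributions have different rules, so treat the check below as a sketch for the Debian case only; runparts_safe_name is a made-up helper.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a file name contains only letters,
# digits, underscores and hyphens (Debian run-parts' default rule).
runparts_safe_name() {
    case "$1" in
        ""|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

for name in template.backup.sh example-com-backup; do
    if runparts_safe_name "$name"; then
        echo "$name: ok"
    else
        echo "$name: would be skipped by Debian's run-parts"
    fi
done
```

If your name fails the check, copying the script under a dot-free name such as example-com-backup avoids the problem.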

Database Backup – Part 2

Phew! That’s enough for one day right? We will be putting together a post about backing up your databases in a similar way. Stay tuned!
