SuperGeekery: A blog probably of interest only to nerds by John F Morton.

A blog prob­a­bly of inter­est only to nerds by John F Mor­ton.

Automate daily database backups with crontab and SFTP and be the master your domain.

Daily_Backup_Cron_Job_Opt

I recent­ly worked on a site which saved data in it’s local data­base which need­ed to be col­lect­ed and sent each day to a remote serv­er via SFTP for analy­sis. Need­less to say, I didn’t want to man­u­al­ly do this process every day, but I had nev­er ful­ly explored cron jobs, so I had to teach myself about cron jobs while I fig­ured out the prob­lem at hand. Here’s how I end­ed up automat­ing the job.

Your friend, the com­mand line #

I built the Ubun­tu serv­er myself and I had full access to serv­er via the com­mand line. The root” user was locked down though for safety’s sake. I logged in under a 2nd user I cre­at­ed. To make it as clear as pos­si­ble, I’ll call that 2nd user myuser­name” in this post. myuser­name” can use sudo when need­ed to exe­cute root lev­el com­mands.

When I log into the serv­er as myuser­name”, I land in the direc­to­ry “/​home/​myusername”. I didn’t have a bin” direc­to­ry yet, so I made one to store the scripts I would write that I would even­tu­al­ly run in my cron job.

After mak­ing the bin” direc­to­ry, I went into it and made a new file called dai­ly­back­up” by using my default text edi­tor.

cd /bin
sudo nano dailybackup

I had to use sudo’ to make this file because of the per­mis­sions on the bin’ fold­er were for root’ instead of myuser­name’. (This was because of an Ubun­tu require­ment that the base direc­to­ry be root’ I don’t ful­ly under­stand. All I know is that if I made the home direc­to­ry belong to myuser­name’, I couldn’t log in.)

I changed own­er­ship of the dai­ly­back­up’ file using chown to allow myuser­name’ full own­er­ship of that file, both the own­er and group.

chown myusername:myusername dailybackup

Then I need­ed to make the dai­ly­back­up file exe­cutable.

sudo chmod 755 dailybackup

The file was now an exe­cutable. It was emp­ty though.

I could run this script as I worked on it by typ­ing the fol­low­ing into my com­mand line:

/home/myusername/bin/dailybackup

I did this fre­quent­ly along the way to test as I went along.

Basic script starter needs #

The first thing in the file defined which shell to use to inter­rupt the script. bash’ is the shell I am most famil­iar with so I went with that.

#!/bin/bash

Next I need­ed to define the PATH for the com­mands in my script. If I didn’t do this, I’d get error telling me that what seem to be the most basic pieces of func­tion­al­i­ty were miss­ing. Defin­ing the PATH vari­able solves this prob­lem.

export PATH=/bin:/usr/bin

Rotate back­ups #

Next I need­ed to start get­ting the dai­ly file export from my data­base, a MySQL data­base in my case. There are many resources, like this where I picked dif­fer­ent pieces of this up.

# modify the following to suit your environment
export DB_BACKUP="/home/myusername/backup"
export DB_USER="dbusername"
export DB_PASSWD="dbpassword"
export THEDATE='date +"%A, %B %-d, %Y"'

# title and version
echo ""
echo "Daily Tabbed Report: " $THEDATE
echo "----------------------"
echo "* Rotating backups…"
rm -rf $DB_BACKUP/04
mv $DB_BACKUP/03 $DB_BACKUP/04
mv $DB_BACKUP/02 $DB_BACKUP/03
mv $DB_BACKUP/01 $DB_BACKUP/02
mkdir $DB_BACKUP/01

The script starts by defin­ing some vari­ables. The first is the loca­tion where I want­ed to keep the back­up files, DB_BACKUP. The next 2 vari­ables are the user­name and pass­word for the MySQL data­base.

The export’ part of these com­mands should make the vari­able avail­able to sub-shells, from what I under­stand. I think this means if I sep­a­rat­ed out the var­i­ous parts of this script into their own files, these export­ed vari­ables would be able to be used in those scripts.

The vari­able called THE­DATE” holds the cur­rent date. The echo com­mands you see will print out help­ful infor­ma­tion into your com­mand line if you run it from there or in an email if you have your cron job to noti­fy you by email when it’s run.

The rest of the com­mands above rotate the back­up files from the pre­vi­ous days. (The first time you run this, there aren’t any data files to rotate, but I made place­hold­er direc­to­ries in my back­up fold­er to get things going.)

There are 4 dai­ly back ups at any one time. To rotate the back­up file, the 04’ direc­to­ry is first erased, and the remain­ing 3 direc­to­ries are renamed (or, actu­al­ly moved”) down the line. Final­ly, a new 01’ direc­to­ry is cre­at­ed to hold the new data export that we cre­ate in the next step.

Export the dai­ly data #

This sec­tion starts by echo­ing out what I’m try­ing to accom­plish in each step. This helps out in debug­ging. Believe me, I didn’t do this cor­rect­ly on the first attempt.

echo "* Creating new backup..."
mysql --password=$DB_PASSWD -h 'localhost' -e "SELECT * FROM databasenaem.tablename WHERE datecreated >= DATE_SUB(NOW(), INTERVAL 1 DAY)" > $DB_BACKUP/01/exporteddata-`date +%Y-%m-%d`.txt
echo "* File: " $DB_BACKUP/01/exporteddata-`date +%Y-%m-%d`.txt
echo "----------------------"

Since I run this script every 24 hours, I want each export to include the pre­vi­ous 24 hours data. In the data­base, I have a field in the table called date­cre­at­ed” which I pop­u­late with the date auto­mat­i­cal­ly the moment it’s cre­at­ed. In the mysql’ line above, I query for data where the date­cre­at­ed” val­ue is greater than or equal to now” minus 1 day. The “>” in the mysql’ line exports what­ev­er was found in a tab delim­i­tat­ed file name export­ed data” with a date attached to it. This tab-delim­i­­tat­ed file is read­able by Excel and Num­bers.

Trans­fer the new­ly export­ed file via SFTP #

With the script so far, I was mak­ing the file that I want­ed, but it was only on my serv­er in the “/​home/​myusername/​backup/​01/​exporteddata – 2013 – 10-26.txt” file. I still need­ed to get the data off the serv­er and to the big data ana­lyt­ics” com­pa­ny.

Get­ting the data off my serv­er onto the serv­er I’ll call big​dat​a​com​pa​ny​.com was a lit­tle tricky. If I were able to get my SSH pub­lic key set up on their serv­er would have made this much eas­i­er. Due to cir­cum­stances beyond my con­trol, that wasn’t hap­pen­ing.

Doing SFTP via the com­mand line isn’t that dif­fi­cult, but you need to enter your pass­word each time. You can’t sim­ply add in a pass­word para­me­ter to the sftp’ com­mand unfor­tu­nate­ly. It doesn’t work that way.

I found 2 options: ssh­pass” and expect”. I end­ed up using ssh­pass”, but I think using expect” would have also worked. If you want to check out the link that looked most promis­ing for how to use the expect approach, check it out here. Look for the 3rd answer on that page.

Below is the script I end­ed up with using ssh­pass”. I’ll describe it after the code block.

echo "* Logging into external big data SFTP server"

FILE=$DB_BACKUP/01/exporteddata-`date +%Y-%m-%d`.txt

sshpass -p 'mysftppassword' sftp -oBatchMode=no -b - -oStrictHostKeyChecking=no externalusername@ftp.bigdatacompany.com <<_EOF_
cd /incoming
put $FILE
bye
_EOF_

echo "* Logging off external bigdatacompany server"
echo "----------------------"
exit 0

Like before, the echo com­mands let me know what I was try­ing to exe­cute in the script.

I made a vari­able defin­ing the name of the file that was cre­at­ed in the pre­vi­ous sec­tion of code in a vari­able called FILE”. You can see where I use it lat­er in the script where you see $FILE.

In my exam­ple code, I’m show­ing that the big data com­pa­ny gave me an SFTP user­name of exter­naluser­name’ and a pass­word of mys­ftp­pass­word’. The server’s address is ftp​.big​dat​a​com​pa​ny​.com’. I used these val­ues in the ssh­pass’ com­mand.

ssh­pass’ accepts the ‑p para­me­ter with your pass­word as a string fol­lowed by the actu­al sftp’ com­mand you would use. I have the “-oStrictHostKeyChecking=no” in there because I thought it would help me avoid error that warns against an unknown host. (It end­ed up work­ing for me when I ran the script direct­ly from the com­mand line, but it failed when I had it run in the cron job. I’ll show you how I dealt with that lat­er in this post.)

Once my script logged onto the exter­nal serv­er, I had to exe­cute the com­mands that would have done if I were doing it man­u­al­ly. First I changed into the direc­to­ry with the cd’ com­mand. Then I put’ the file there. Last­ly, I say bye’ to leave sftp.

I wrapped all of those com­mands in a here script” sec­tion. What’s here script? It a way of includ­ing a bunch of com­mands, a stream of them, in your script. In my exam­ple, the EOF marks the begin­ning and end of the here script. Here’s an excel­lent tuto­r­i­al that I learned about here scripts from myself.

The final exit 0” com­mand is there to tell your cron job that you com­plet­ed the script. It’s like say­ing The End”.

Know your crontab #

The script may be over, but we’re not done yet because we haven’t talked about cron jobs yet.

All was going well with the script. I could run the com­mand from my com­mand line and it made the data export and copied it to the exter­nal serv­er. As men­tioned ear­li­er, I’d run the com­mand like this:

/home/myusername/bin/dailybackup

Now I need­ed to add that com­mand to my cron job list. You do this by using crontab ‑e’ com­mand to edit the cur­rent crontab using the default edi­tor which is nano in my case.

crontab -e

This showed my cur­rent crontab file for myuser­nam’.

I edit­ed the default crontab file to include the fol­low­ing lines:

SHELL=/bin/bash
PATH=/sbin:/bin/usr/bin
HOME=/
MAILTO='notifyme@domain.com'

These lines set the shell, path, and home for the crontab jobs. The MAIL­TO” line let me over­ride the email address I had set up as the admin user email address on my serv­er.

I want­ed my dai­ly­back­up’ job to run every day at 12:01am so I added the fol­low­ing com­mand to the end of my crontab file.

1 0 * * * /home/myusername/bin/dailybackup

The eas­i­est way is to fig­ure out the right syn­tax for your crontab com­mand is to use the web-based Corntab util­i­ty. (Yes, it’s Corntab” not Crontab” because corn is fun­nier than cron.)

Of course I test­ed this ini­tial­ly by not hav­ing it run only at 12:01am every day. I test­ed it by hav­ing it run every 5 min­utes or so until I got the whole thing debugged prop­er­ly.

Failed host error #

I had an error that would appear only when I exe­cut­ed my dai­ly­back­up’ script via crontab. I could see the error in the email reports that the cron job was send­ing to the email address I’d spec­i­fied. The error was:

Failed to add the host to the list of known hosts (/home/myusername/.ssh/known_hosts). I already had an .ssh fold­er but the known_​hosts’ file wasn’t there and my cron job didn’t seem to be able to cre­ate it. I made the file man­u­al­ly instead and made sure myuser­name’ was the own­er of the file.

/home/myusername/.ssh/known_hosts
chown myusername:myusername /home/myusername/.ssh/known_hosts

I then logged into the exter­nal ftp serv­er man­u­al­ly again from with­in my Ubun­tu serv­er. I was asked if I want­ed to add this host to my known hosts file. I said yes’ and that solved the prob­lem.

Now do this, com­put­er! #

After this dai­ly back­up was debugged and worked, I felt like I had a new pow­er that I hadn’t had before. I added oth­er jobs, like doing a SQL back­up of my data­base reg­u­lar­ly as well. I guess this is what Super­man feels like.