{"id":2960,"date":"2018-09-20T18:18:48","date_gmt":"2018-09-20T18:18:48","guid":{"rendered":"https:\/\/blog.ssdnodes.com\/blog\/?p=2727"},"modified":"2025-05-15T15:46:25","modified_gmt":"2025-05-15T15:46:25","slug":"vps-backups-simple-overthinking","status":"publish","type":"post","link":"https:\/\/www.ssdnodes.com\/blog\/vps-backups-simple-overthinking\/","title":{"rendered":"VPS backups are simple\u2014you\u2019re just overthinking it"},"content":{"rendered":"<div id=\"preview1\" class=\"g-b g-b--t1of2 split split-preview\">\n<div id=\"preview\" class=\"preview-html\">\n<p>People keep thinking that manual VPS backups are some impossible task. They demand GUIs and automated tools. They spend hours and hours trying to use obscure terminal-based backup tools.<\/p>\n<p>Instead, let me be the first one to say that VPS backups are actually incredibly simple. Here\u2019s the only command you need to know:<\/p>\n<pre><code>$ rsync USER@IP_ADDRESS:\/ -aAXvh \\\n--exclude={\"\/dev\/*\",\"\/proc\/*\",\"\/sys\/*\",\"\/tmp\/*\",\"\/run\/*\",\"\/mnt\/*\",\"\/media\/*\",\"\/lost+found\"} \\\n\/home\/USER\/backups\/\n<\/code><\/pre>\n<p>Done. Can we stop worrying about VPS backups now?<\/p>\n<p>\u2026<\/p>\n<p>Okay, that looks like a lot, but I promise it isn\u2019t. We\u2019ll come back to what that command does in a moment\u2014first, let\u2019s talk about why you should be backing up your VPS in the first place.<\/p>\n<div class=\"cta-inline\"><\/div>\n<h2><a id=\"Why_do_VPS_backups_matter_18\"><\/a>Why do VPS backups matter?<\/h2>\n<p>Whether you own a small business, host a simple personal website, or work at a top-notch organization, your data is important. But because it\u2019s online, it remains vulnerable to hackers, ransomware, and even accidental deletion.<\/p>\n<p>Having a proper backup and recovery plan is essential to protect yourself against these unexpected events. 
If you keep duplicate copies of your important files and store them in a separate and safe location, you can recover them if you have issues with the integrity of your data\u2014no matter the source.<\/p>\n<p>Instead of just running that command without thought, you should spend a few minutes creating a proper, standardized backup policy to minimize risk and make your life easier.<\/p>\n<p>The frequency and timing of your backups depend on how often the data changes, how long a backup takes, how much data you need to duplicate, and when visitors or users use your service the most. Since the backup process can use lots of system resources, you should schedule your backups for low-usage times of the day. For a personal VPS, you should be doing weekly or even daily backups.<\/p>\n<p><strong>Remember<\/strong>: Data backup and data recovery are two different things. <em>Backup<\/em> is the act of duplicating files, whereas <em>recovery<\/em> is the process of restoring data from your backup. Restoration isn\u2019t as easy as backup, but you can\u2019t even try restoration without a proper backup!<\/p>\n<h2><a id=\"Lets_get_into_what_rsync_is_all_about_31\"><\/a>Let\u2019s get into what rsync is all about<\/h2>\n<p><code>rsync<\/code> stands for <em>remote synchronization<\/em>, and is a utility that efficiently synchronizes files and directories from one host to another. <code>rsync<\/code> replicates the entire data set between the source and destination when it runs for the first time. After that first run, <code>rsync<\/code> only transfers data that has changed. That set of changes is called a <code>delta<\/code>.<\/p>\n<p>When the other end is a remote host, <code>rsync<\/code> sends data over an encrypted SSH connection by default, and it can compress data in transit if you add the <code>-z<\/code> flag.<\/p>\n<p>The most basic use of <code>rsync<\/code> is to replicate a folder on the same host. 
The following example will sync all files and folders from <code>source_folder<\/code> to <code>destination_folder<\/code>.<\/p>\n<pre><code>$ rsync -av ~\/source_folder\/ ~\/destination_folder\/\n<\/code><\/pre>\n<p>The <code>-a<\/code> option turns on archive mode, which is an alias for a set of other flags (<code>-rltpgoD<\/code>), and the <code>-v<\/code> option turns on verbose mode for details about the transfer.<\/p>\n<p>Now that <code>rsync<\/code> has copied <code>source_folder<\/code> once, I can add or edit a few files and rerun the same command. This time, <code>rsync<\/code> will not copy the entire folder to the destination again. It will only transfer files that have been modified or added since the last run.<\/p>\n<pre><code>$ cd ~\/source_folder\n$ touch pattern.txt\n$ vi IP.txt\n$ rsync -av ~\/source_folder\/ ~\/destination_folder\/\nsending incremental file list\n.\/\nIP.txt\npattern.txt\n\nsent 353 bytes  received 61 bytes  828.00 bytes\/sec\ntotal size is 13  speedup is 0.03\n<\/code><\/pre>\n<p>This is an example of an incremental backup.<\/p>\n<p>To synchronize files and folders between your local system and a remote VPS, you need SSH credentials for the VPS, and <code>rsync<\/code> must be installed on both machines. The following example will sync a folder from your local system to the remote VPS.<\/p>\n<pre><code>$ rsync -av ~\/source_folder USER@IP_ADDRESS:\/home\/USER\/backup\/\n<\/code><\/pre>\n<p>The above <code>rsync<\/code> command will sync <code>source_folder<\/code> from your local system to the folder <code>\/home\/USER\/backup\/<\/code> on the remote VPS. Because the source path has no trailing slash, <code>rsync<\/code> copies the folder itself, so the files end up in <code>\/home\/USER\/backup\/source_folder\/<\/code>. With a trailing slash, only the folder\u2019s contents are copied.<\/p>\n<p>If SSH listens on a non-standard port on your remote VPS, you need to specify that port using the <code>-e<\/code> flag.<\/p>\n<pre><code>$ rsync -avP source_folder\/ -e 'ssh -p 2222' USER@IP_ADDRESS:\/home\/USER\/backup\/\n<\/code><\/pre>\n<p>The <code>-P<\/code> flag combines the flags <code>--progress<\/code> and <code>--partial<\/code>. 
The former shows transfer progress in the terminal, and the latter tells <code>rsync<\/code> to keep any partially transferred files if the transfer is interrupted.<\/p>\n<h2><a id=\"Turning_the_tables_backing_up_your_VPS_to_your_local_machine_80\"><\/a>Turning the tables: backing up your VPS to your local machine<\/h2>\n<p>Now that I\u2019ve shown you how to <em>push<\/em> data from your local machine to your VPS, it\u2019s time to <em>pull<\/em> data from your VPS back to your local machine.<\/p>\n<pre><code>$ rsync -avP USER@IP_ADDRESS:\/var\/www\/html \/home\/dd\/backups\/\nreceiving incremental file list\nhtml\/\n     612 100%  597.66kB\/s    0:00:00 (xfr#1, to-chk=2\/4)\nhtml\/drupal\/\nhtml\/wordpress\/\n\nsent 59 bytes  received 820 bytes  195.33 bytes\/sec\ntotal size is 612  speedup is 0.70\n<\/code><\/pre>\n<p>The only difference between this command and the <em>push<\/em> from earlier is that we\u2019ve swapped the source and destination folders.<\/p>\n<p>You can still use the <code>-e<\/code> flag if you need to change the SSH port.<\/p>\n<pre><code>$ rsync -avP -e 'ssh -p 2222' USER@IP_ADDRESS:\/var\/www\/html \/home\/dd\/backups\/\n<\/code><\/pre>\n<p>This works great for a single folder, but what if you want to back up the <em>entire<\/em> VPS? We can do that, too, but we\u2019ll want to exclude a few folders. The <code>--exclude<\/code> flag does exactly this by excluding files based on a pattern. <code>rsync<\/code> doesn\u2019t support regex, so only shell-style wildcard patterns such as <code>*<\/code> will work.<\/p>\n<pre><code>$ rsync --dry-run USER@IP_ADDRESS:\/ -aAXvh --exclude={\"\/dev\/*\",\"\/proc\/*\",\"\/sys\/*\",\"\/tmp\/*\",\"\/run\/*\",\"\/mnt\/*\",\"\/media\/*\",\"\/lost+found\"} \/home\/dd\/backups\/\n<\/code><\/pre>\n<p>The <code>--dry-run<\/code> flag in the above example will not transfer any files but will show you what the command would do. 
After checking the output carefully, you can then omit the <code>--dry-run<\/code> option to pull files from the remote VPS. The <code>\/<\/code> path right after <code>IP_ADDRESS:<\/code> instructs <code>rsync<\/code> to sync the entire file system, excluding the folders specified in the <code>--exclude={ }<\/code> flag.<\/p>\n<h2><a id=\"Lets_turn_this_magic_into_a_script_112\"><\/a>Let\u2019s turn this magic into a script<\/h2>\n<p>Remember when I said a good backup strategy makes your life easier? Enter scripting.<\/p>\n<p>Now we\u2019re back to thinking too hard.<\/p>\n<p>The script doesn\u2019t delete old snapshots. Instead, it points a symbolic link named <code>latest<\/code> at the most recent one. To customize the script for your environment, change the values of <code>source_dir<\/code>, <code>destination_dir<\/code>, <code>ssh_user<\/code>, <code>ip_address<\/code>, <code>SSHPort<\/code> and <code>symbolic_name_recent_backup<\/code> in the following script.<\/p>\n<pre><code>#!\/bin\/bash\n\n#Create a timestamp\ndate=$(date \"+%Y-%m-%dT%H_%M_%S\")\n\n#Source location, you can change '\/' to something like \/var\/www\/html\nsource_dir=\"\/\"\n\n#Backup location on your local system\ndestination_dir=\"\/home\/dd\/Documents\/\"\n\n#Name of Backup folder\nbackup_folder_name=backup-$date\n\n#Full path of backup; concatenation of above two paths\nfinal_destination_dir=$destination_dir$backup_folder_name\n\n#Create backup directory\nmkdir -p \"$final_destination_dir\"\n\n#rsync options\nrsync_option=\"-aAXvhP\"\n\n#SSH username\nssh_user=\"peter\"\n\n#SSH Port\nSSHPort=2222\n\n#IP address of remote host\nip_address=\"123.45.67.89\"\n\n#Symbolic name of latest backup\nsymbolic_name_recent_backup=\"latest\"\n\n#Exclude folders that you don't want to back up\n\nexclude_folders=(\n  \"\/dev\"\n  \"\/usr\"\n  \"\/var\"\n  \"\/sbin\"\n  \"\/home\"\n  \"\/etc\"\n  \"\/proc\"\n  \"\/sys\"\n  \"\/tmp\"\n  \"\/run\"\n  \"\/mnt\"\n  \"\/media\"\n)\n\n#Change to the 
destination directory where rsync will pull data from remote VPS\n\ncd \"$destination_dir\" || exit 1\n\n#Get the most recent snapshot folder name that will be symbolically linked to the latest folder.\n\nlatest_backup_dir=$(ls -td -- backup* | head -n 1 | cut -d'\/' -f1)\n\n#Place all the exclude folders in a single variable\n\nfor item in \"${exclude_folders[@]}\"\ndo\n  exclude_flags=\"${exclude_flags} --exclude ${item}\"\ndone\n\n#Remove the previous symbolic link to the snapshots folder, if there is one\n\nif [ -L \"$symbolic_name_recent_backup\" ];\nthen\n     echo \"Removing previous symbolic link to the snapshots\"\n     rm -f \"$symbolic_name_recent_backup\"\nfi\n\n#Create a new symbolic link to the latest snapshots\n\necho \"Creating new symbolic link to the latest snapshots\"\nln -s \"$latest_backup_dir\" \"$symbolic_name_recent_backup\"\n\n#Run rsync (exclude_flags is left unquoted on purpose so it splits into separate arguments)\n\nrsync $rsync_option ${exclude_flags} -e \"ssh -p $SSHPort\" \"$ssh_user@$ip_address:$source_dir\" \"$final_destination_dir\" || echo \"rsync died with error code $?\" &gt;&gt; \/var\/log\/backup.log\n\n<\/code><\/pre>\n<h3><a id=\"Automate_the_script_to_run_once_in_a_week_208\"><\/a>Automate the script to run once a week<\/h3>\n<p>Once you have tested the above script in your environment, automate it to run at least once a week using a cron job. You can choose the interval according to your requirements. Make sure you can authenticate to the remote VPS with an SSH key that has no passphrase, otherwise the cron job won\u2019t work.<\/p>\n<p>Just run <code>crontab -u USER -e<\/code> in the terminal, choose an editor, and add a line. Specify the time interval you\u2019d like, along with the path to where you saved the above script.<\/p>\n<p>My backup strategy is to run the backup script at 9 AM every Monday, hence the <code>0 9 * * Mon<\/code> time interval specification. 
If something goes wrong, you can check the log file <code>\/var\/log\/backup.log<\/code> for more information.<\/p>\n<pre><code>$ crontab -u USER -e\n...\n...\n0 9 * * Mon \/path\/to\/your\/backup\/rsync_backup.sh\n<\/code><\/pre>\n<p>The script is a simple one, and maybe isn\u2019t a comprehensive solution for your needs, but it makes backing up a VPS incredibly easy. You can back up a VPS to your local machine, or even one VPS to another.<\/p>\n<p>See what happens when you stop thinking too hard and just use the fantastic tools that your VPS already has?<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>People keep thinking that manual VPS backups are some impossible task. Let&#8217;s see what happens when we stop overthinking and just use rsync smarter.<\/p>\n","protected":false},"author":20,"featured_media":2981,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[18],"tags":[],"class_list":["post-2960","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-devops"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/posts\/2960","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/users\/20"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/comments?post=2960"}],"version-history":[{"count":2,"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/posts\/2960\/revisions"}],"predecessor-version":[{"id":12939,"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/posts\/2960\/revisions\/12939"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/media\/2981"}],"w
p:attachment":[{"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/media?parent=2960"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/categories?post=2960"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ssdnodes.com\/wp-json\/wp\/v2\/tags?post=2960"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}