Incremental Backups with NAnt


Update 8/4/2005:

I have received various comments
concerning the original text of this post, mostly people asking me for
the full source. It has been a while since I played with this
project, and I have long since moved on. However, since I’m getting
lots of lovely feedback about it, I felt I should update this post.

Like the numpty I am, I somehow managed to
misplace the original source files, but I did find the
compiled assemblies. So the new code you see below has been
disassembled and recompiled to check for errors. I have not had the
chance to test this code in NAnt, but I am confident that it’ll
work. If not, let me know and I will invest more time into it.

Many of us are not fortunate enough to own large backup devices, such as
tape drives. DVD burners are becoming more common in machines, but a
single-layer disc can still only hold about 4.7GB of data. I had recently been reading about
dedicated backup servers. PC hardware is relatively cheap nowadays and
hard disk space is plentiful, so the consensus is in favor of
inexpensive dedicated PCs to back up your data.

I have gigabytes of data, most of it home movies (taken of my son from
birth to 13 months) and a ton of music files. I also host my own source
repository of code that I write in my spare time. All of this data,
plus other bits and bobs, is spread across two servers and one laptop.

I got to thinking. Rather than purchasing another machine dedicated
to backups, I could use each of the two servers to back up
the data of the other, thus producing a redundant data backup scheme.
Of course, in the event that my network becomes infected with a virus
or the house burns down I’ll have no off-site backup that I can rely
on. I have anti-virus and adware protection on my network, and if the
house burns down I have more to worry about than the loss of my music
and video archive. What I was short of was a backup in case of hardware
failure.

So, this afternoon I got to work on copying data from one machine to
the other and vice versa. I decided that a manual approach would be
painful over time. With the regularity of picture and movie uploads
from my digital camera it wouldn’t take long before my backups would
become out of date, and I didn’t much feel like keeping track of two
directory locations for all the data I upload. I started to use
NTBackup to schedule backup tasks to run nightly, but this became
cumbersome to maintain with the various directories and exclusions on
both servers. It was then that I remembered NAnt.

NAnt is primarily for building and deploying development projects,
but is sophisticated enough to handle advanced tasks required in a
modern day project deployment. NAnt is simple to use and install, and
all the configuration lives in one script file (in XML). With NAnt, I
could write one script that contains copy operations for both servers
and host a copy of the same script on both machines with a scheduled
task to run NAnt.exe.

My plan worked well. I created a simple script that would use the
“copy” task of NAnt to copy across files. The “fileset” type works
great at recursing directories, so my script did a lot of copying for
little coding effort. There was, however, one drawback. Each time the
script ran it would spend hours copying gigabytes of data. Most of my
data remains static on a day to day basis, with the exception
of a few file additions or minor changes. I needed to get NAnt to copy
a file only if it wasn’t at the destination or was different. No such
task exists in NAnt to perform an incremental copy, so I embarked on
writing one.
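
A first script along those lines might look like the sketch below. The project name, target name and both paths are hypothetical placeholders, not the ones I actually used; the “**/*” pattern is what makes the fileset recurse through every subdirectory.

```xml
<?xml version="1.0"?>
<project name="ServerBackup" default="backup">
  <target name="backup">
    <!-- Hypothetical source and destination paths, for illustration only. -->
    <copy todir="\\server2\backup\movies">
      <fileset basedir="D:\movies">
        <!-- Match every file in every subdirectory of basedir. -->
        <include name="**/*" />
      </fileset>
    </copy>
  </target>
</project>
```

A scheduled task on each machine would then only need to run something like NAnt.exe -buildfile:backup.build, where backup.build is an assumed file name.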

I wrote a very simple extension class for NAnt that reuses a
“fileset” to obtain a list of files. For each file, it
checks whether the file exists at the destination. If it does not, the file is copied.
If it does, the task runs a modified date check and, when the dates differ,
a CRC32 comparison to see whether the source file has changed from the destination copy.
Below is the task code. I omitted the CRC32 code for brevity, but if you would like it please drop me a line in the comments section of this blog.

Update 8/4/2005: Both the task and the CRC32 code are below.

CopyCompare.cs

using System;
using System.IO;
using NAnt.Core;
using NAnt.Core.Types;
using NAnt.Core.Attributes;

namespace NantBackup
{
/// <summary>
/// NAnt task – compares the CRC of files before copying.
/// </summary>
[TaskName("copyCompare")]
public class CopyCompare : Task
{
#region Fields

private FileSet _filesToCheck = null;
private string _destination = String.Empty;

#endregion Fields

#region Construction

public CopyCompare() {}

#endregion Construction

#region Properties

[BuildElement("fileList", Required=true)]
public FileSet FilesToCheck
{
get { return _filesToCheck; }
set { _filesToCheck = value; }
}

[TaskAttribute("destDir", Required=true)]
public string DestinationDir
{
get { return _destination; }
set { _destination = value; }
}

#endregion Properties

#region Methods

protected override void ExecuteTask()
{
// Iterate the filelist.
foreach (string filePath in _filesToCheck.FileNames)
{
// Strip off the leading base directory name (including the trailing separator).
// BaseDirectory is a DirectoryInfo, so take its FullName.
string baseDir = _filesToCheck.BaseDirectory.FullName;
string separator = Path.DirectorySeparatorChar.ToString();
if (!baseDir.EndsWith(separator))
baseDir += separator;
string subPath = filePath.Substring(baseDir.Length);

// Destination name.
string destPath = Path.Combine(_destination, subPath);
// Copy file.
CopyFile(filePath, destPath);
}
}

private void CopyFile(string srcPath, string destPath)
{
if (FileChanged(srcPath, destPath))
{
// Check if destination directory exists.
string destDir = Path.GetDirectoryName(destPath);
if (!Directory.Exists(destDir))
Directory.CreateDirectory(destDir);
// Copy the file.
File.Copy(srcPath, destPath, true);
}
}

private bool FileChanged(string srcPath, string destPath)
{
FileStream src = null;
FileStream dest = null;
try
{
// If destination exists then see if files differ.
if (File.Exists(destPath))
{
// Has the date changed?
if (FileDateChanged(srcPath, destPath))
{
// File dates differ, make sure the file has changed!
src = new FileStream(srcPath, FileMode.Open, FileAccess.Read);
dest = new FileStream(destPath, FileMode.Open, FileAccess.Read);
string srcHash = ComputeHash(src);
string destHash = ComputeHash(dest);
return (0 != String.Compare(srcHash, destHash, true));
}
else
// Dates the same, so assume they’re the same.
return false;
}
else
// If no destination, make sure we have the source.
return File.Exists(srcPath);
}
catch (Exception)
{
// Rethrow as-is; "throw ex" would reset the stack trace.
throw;
}
finally
{
if (null != src)
src.Close();
if (null != dest)
dest.Close();
}
}

private bool FileDateChanged(string srcPath, string destPath)
{
FileInfo fiSrc = new FileInfo(srcPath);
FileInfo fiDest = new FileInfo(destPath);
return (fiSrc.LastWriteTime != fiDest.LastWriteTime);
}

private string ComputeHash(Stream stream)
{
CRC32 hasher = new CRC32();
string result = BitConverter.ToString(hasher.ComputeHash(stream));
hasher.Clear();
return result;
}

#endregion Methods
}
}

CRC32.cs

using System;
using System.IO;
using System.Collections;
using System.Text;
using System.Security.Cryptography;

namespace NantBackup
{
public class CRC32 : HashAlgorithm
{
#region Fields

protected static uint AllOnes;
protected static bool autoCache;
protected static Hashtable cachedCRC32Tables;
protected uint[] crc32Table;
private uint m_crc;

#endregion Fields

#region Properties

public static bool AutoCache
{
get
{
return CRC32.autoCache;
}
set
{
CRC32.autoCache = value;
}
}

public static uint DefaultPolynomial
{
get
{
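// Non-reflected form of the standard CRC-32 polynomial. The table below is built
// with right shifts, so results differ from standard CRC-32 values (which would
// need 0xEDB88320), but they are consistent, which is all a file comparison needs.
return 0x4c11db7;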
}
}

#endregion Properties

#region Construction

static CRC32()
{
CRC32.AllOnes = uint.MaxValue;
CRC32.cachedCRC32Tables = Hashtable.Synchronized(new Hashtable());
CRC32.autoCache = true;
}

public CRC32()
: this(CRC32.DefaultPolynomial)
{
}

public CRC32(uint aPolynomial)
: this(aPolynomial, CRC32.AutoCache)
{
}

public CRC32(uint aPolynomial, bool cacheTable)
{
this.HashSizeValue = 0x20;
this.crc32Table = (uint[])CRC32.cachedCRC32Tables[aPolynomial];
if (this.crc32Table == null)
{
this.crc32Table = CRC32.BuildCRC32Table(aPolynomial);
if (cacheTable)
{
CRC32.cachedCRC32Tables.Add(aPolynomial, this.crc32Table);
}
}
this.Initialize();
}

#endregion Construction

#region Methods

protected static uint[] BuildCRC32Table(uint ulPolynomial)
{
uint[] numArray1 = new uint[0x100];
for (int num2 = 0; num2 < 0x100; num2++)
{
uint num1 = (uint)num2;
for (int num3 = 8; num3 > 0; num3--)
{
if ((num1 & 1) == 1)
{
num1 = (num1 >> 1) ^ ulPolynomial;
}
else
{
num1 = num1 >> 1;
}
}
numArray1[num2] = num1;
}
return numArray1;
}

public static void ClearCache()
{
CRC32.cachedCRC32Tables.Clear();
}

public new byte[] ComputeHash(byte[] buffer)
{
return this.ComputeHash(buffer, 0, buffer.Length);
}

public new byte[] ComputeHash(Stream inputStream)
{
int num1;
byte[] buffer1 = new byte[0x1000];
while ((num1 = inputStream.Read(buffer1, 0, 0x1000)) > 0)
{
this.HashCore(buffer1, 0, num1);
}
return this.HashFinal();
}

public new byte[] ComputeHash(byte[] buffer, int offset, int count)
{
this.HashCore(buffer, offset, count);
return this.HashFinal();
}

protected override void HashCore(byte[] buffer, int offset, int count)
{
for (int num1 = offset; num1 < offset + count; num1++)
{
uint num2 = (this.m_crc & 0xff) ^ buffer[num1];
this.m_crc = this.m_crc >> 8;
this.m_crc ^= this.crc32Table[num2];
}
}

protected override byte[] HashFinal()
{
byte[] buffer1 = new byte[4];
ulong num1 = this.m_crc ^ CRC32.AllOnes;
buffer1[0] = (byte)((num1 >> 0x18) & 0xff);
buffer1[1] = (byte)((num1 >> 0x10) & 0xff);
buffer1[2] = (byte)((num1 >> 8) & 0xff);
buffer1[3] = (byte)(num1 & 0xff);
return buffer1;
}

public override void Initialize()
{
this.m_crc = CRC32.AllOnes;
}

#endregion Methods
}
}

The NAnt script that calls the above task is as follows:


Test build script to test Copy-Compare Custom Task.
Loading Custom Task Script…

Done
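
Only the description and echo text of that script is shown above. A minimal sketch of a build file that would drive the task, using the “copyCompare”, “fileList” and “destDir” names from the task code, might look like this. The project and target names, the assembly file name and both paths are assumptions for illustration.

```xml
<?xml version="1.0"?>
<project name="CopyCompareTest" default="backup">
  <description>Test build script to test Copy-Compare Custom Task.</description>

  <target name="backup">
    <echo message="Loading Custom Task Script…" />
    <!-- Hypothetical name and location of the compiled task assembly. -->
    <loadtasks assembly="NantBackup.dll" />
    <echo message="Done" />

    <!-- Hypothetical paths; destDir ends with a separator because the task
         appends each relative file path to it. -->
    <copyCompare destDir="\\server2\backup\movies\">
      <fileList basedir="D:\movies">
        <include name="**/*" />
      </fileList>
    </copyCompare>
  </target>
</project>
```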

With the example above I was able to create a complete build script that could
be deployed to both servers and run an incremental redundancy backup.

In addition to
the setup I describe above, I have one other server off-site in another
state. This server uses NTBackup to create backup files on an FTP
share. My next trick will be to use NAnt to FTP into the remote server,
pull the NTBackup files and store them on either or both of the servers
hosted in my basement at home. This will require writing another NAnt
custom task (unless one already exists to perform FTP). When complete I’ll have
one script that maintains a backup for all three of my servers.

Also part of my plan is to write a script for the single laptop (mentioned way, way
back at the top of this blog entry). Unlike the servers, the laptop is
not on all the time, so backups are limited to when the laptop is in
use. I plan to use the same incremental backup method each time I log
on. File changes are usually minimal on this machine so a backup should
be fairly quick to run and not inhibit any work.

NAnt can be downloaded at
http://nant.sourceforge.net

5 thoughts on “Incremental Backups with NAnt”

  1. Hello.

    This is a great article with very useful code.

    Can you please include the CRC32 code?

    Thank you.

  2. There also seems to be a compile error on this line:

    string baseDir = _filesToCheck.BaseDirectory;

    CopyCompareTask.cs(59): Cannot implicitly convert type ‘System.IO.DirectoryInfo’ to ‘string’

  3. It’s me again.

    Just some things that might help someone else looking at this blog.

    1. I used the CRC class from the following article and applied the code changes posted by other readers to get this to compile:

    http://www.codeproject.com/csharp/crc32_dotnet.asp

    2. I also had to modify the following line:
    string baseDir = _filesToCheck.BaseDirectory;
    to:
    string baseDir = _filesToCheck.BaseDirectory.FullName;

    to get it to compile.

    Furthermore, there already is an implementation of a NAnt <ftp> task here:
    http://www.spinthemoose.com/~ftptask/

    But I’m not sure if it does incremental uploads/downloads.

  4. This is a wonderful class, I like it!!
    I would like to see the full class definition. I am working on BizTalk deployment with NAnt, so it will be very helpful for me for the rest of the project.

  5. Simply put: Excellent!

    However, just curious if you have put any thought into detecting file deletions/renames with this task?
