Tag Archives: Code

More on Null and Empty Strings

Following on from Jeff Atwood’s post
about comparing a string to empty string (“”), I found out that C# and
VB.net do not behave the same when comparing a string that is null.

In the following C# snippet, the code will display The String is NOT Null:

string s = null;
if (s == "")
  Console.WriteLine("The String IS Null");
else
  Console.WriteLine("The String is NOT Null");

The following VB.net code will display The String IS Null:

Dim s As String = Nothing
If s = "" Then
  Console.WriteLine("The String IS Null")
Else
  Console.WriteLine("The String is NOT Null")
End If

The IL for the C# code is:

.method private hidebysig static void
            Main(string[] args) cil managed
    {
      .entrypoint
      .custom instance void [mscorlib]System.STAThreadAttribute::.ctor()
        = ( 01 00 00 00 )
      // Code size       38 (0x26)
      .maxstack  2
      .locals init (string V_0)
      IL_0000:  ldnull
      IL_0001:  stloc.0
      IL_0002:  ldloc.0
      IL_0003:  ldstr      ""
      IL_0008:  call       bool [mscorlib]System.String::op_Equality(string,string)
      IL_000d:  brfalse.s  IL_001b

      IL_000f:  ldstr      "The String IS Null"
      IL_0014:  call       void [mscorlib]System.Console::WriteLine(string)
      IL_0019:  br.s       IL_0025

      IL_001b:  ldstr      "The String is NOT Null"
      IL_0020:  call       void [mscorlib]System.Console::WriteLine(string)
      IL_0025:  ret
    } // end of method Class1::Main

The IL for the VB.net code is:

.method public static void  Main() cil managed
    {
      .entrypoint
      .custom instance void [mscorlib]System.STAThreadAttribute::.ctor() =
        ( 01 00 00 00 )
      // Code size       46 (0x2e)
      .maxstack  3
      .locals init ([0] string s)
      IL_0000:  nop
      IL_0001:  ldnull
      IL_0002:  stloc.0
      IL_0003:  ldloc.0
      IL_0004:  ldstr      ""
      IL_0009:  ldc.i4.0
      IL_000a:  call       int32 [Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.StringType::StrCmp(string,string,bool)
      IL_000f:  ldc.i4.0
      IL_0010:  bne.un.s   IL_001f

      IL_0012:  ldstr      "The String IS Null"
      IL_0017:  call       void [mscorlib]System.Console::WriteLine(string)
      IL_001c:  nop
      IL_001d:  br.s       IL_002b

      IL_001f:  nop
      IL_0020:  ldstr      "The String is NOT Null"
      IL_0025:  call       void [mscorlib]System.Console::WriteLine(string)
      IL_002a:  nop
      IL_002b:  nop
      IL_002c:  nop
      IL_002d:  ret
    } // end of method Module1::Main

As we can see from the VB.net IL, the string comparison is replaced
with a call to
Microsoft.VisualBasic.CompilerServices.StringType::StrCmp. From further
reading I understand that Microsoft did this to maintain compatibility
with VB6. The C# compiler on the other hand uses op_Equality to check
the equality of empty string and null. Since null and empty string are
not equal C# returns false from the condition.
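
To make the difference concrete, here is a minimal C# sketch that puts the plain == comparison next to a null-tolerant one that behaves roughly like the VB.net check. The class and method names are made up for illustration, and the VB-style version only approximates what StrCmp does for this particular case (Nothing treated as an empty string).

```csharp
using System;

// Hypothetical helper for illustration; not taken from either compiler's output.
public static class NullVersusEmpty
{
    // What C# does for s == "": String.op_Equality, where null is not "".
    public static bool CSharpStyle(string s)
    {
        return s == "";
    }

    // Roughly what the VB.net comparison does: Nothing is treated as "".
    public static bool VbStyle(string s)
    {
        return (s ?? "") == "";
    }
}
```

Calling both methods with null gives false for CSharpStyle and true for VbStyle, which is the mismatch the two snippets above demonstrate.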

So the debate about whether to use S == "" is not just about
performance, but also about expected operational behavior. Developers
who switch between C# and VB.net could find some unexpected results in
their software if they are not aware of this idiosyncrasy.

Going back to performance… Chris Taylor has some nice graphs on
his blog, which show the significant differences in time between S1 == S2, S1.op_Equality(S1), S1.Equals(S1) and String.Equals(S1, S2). Performance varied by test, but S1.op_Equality(S1) seemed to be a good overall performer, which is real handy because it’s what the C# compiler emits for S1 == S2.

As much as I hate to see code that doesn’t read as one would expect, VB.net programmers might want to consider using String.Equals(S1, S2) or S1.Length > 0 when doing lots of string comparisons (bearing in mind that S1.Length throws if S1 is Nothing).

In both C# and VB.net worlds, it’s good development practice to check
your strings against null before performing equality operations. Never
assume the compiler is always going to do the work for you – oh it
brings me back to my C++ days when the compiler did crap for you.

Null, Empty Strings and Performance Programmers

Jeff Atwood makes a complaint about “performance programmers” breaking his code.

In his example, Jeff shows the following snippet of code:

If Value <> "" Then
  If nvc.Item(name) = "" Then
    nvc.Add(name, Value)
  End If
End If

… which the performance programmer then changed to:

If Value <> String.Empty Then
  If nvc.Item(name).Equals(String.Empty) Then
    nvc.Add(name, Value)
  End If
End If

The new code now breaks: if the NameValueCollection (nvc) has no item
with that name, it returns null/Nothing, and the call to Equals then
fails with a null reference exception. Jeff’s code works because a null
reference can safely be compared with an empty string in both C# and VB.NET.
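
For what it’s worth, a null-safe version of the same check is easy to write in C#. The sketch below is only an illustration, not code from Jeff’s post: AddIfMissing is a hypothetical name, and it leans on the static String.IsNullOrEmpty (available from .NET 2.0), which never dereferences the value it is given.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper for illustration only.
public static class NvcHelper
{
    // Adds the value only when it is non-empty and the key has no value yet.
    // nvc[name] returns null for a missing key, so an instance call such as
    // nvc[name].Equals(...) would throw; the static check does not.
    public static void AddIfMissing(NameValueCollection nvc, string name, string value)
    {
        if (!String.IsNullOrEmpty(value) && String.IsNullOrEmpty(nvc[name]))
        {
            nvc.Add(name, value);
        }
    }
}
```

The static String.Equals(a, b) would be equally safe against a null on either side, although it would treat null and empty as different values.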

Comparing null references to empty string is not good programming
practice in general (in C++ such a comparison would cause an
exception). Jeff wrote his code knowing that this comparison was safe
because he knew the fundamentals of the language in which he was
developing; the performance programmer did not. This is a good example
of the typical traps that most performance crack teams fall into when
tuning an existing application. Developing software is an art form:
writing good maintainable code that works and performs well sometimes
requires the developer to use non-typical syntax, which can throw
developers unfamiliar with the code for a loop.

I-Filters

Today I carved out a chunk of the day to work on I-Filters. I-Filters are COM
dynamic link libraries that convert known file types to text under
Windows XP/2K/2K3. The OS’s indexing service uses I-Filters to convert
PDF and Office file types to text so the indexer can tokenize the words
contained in files.

I wrote a test application that calls an I-Filter
library given a file name and converts it to text. The correct filter is determined by
examining the file extension and querying the registry (I-Filters are
registered with associated file extensions). My code works great with
Office documents but barfs when using Adobe’s 6.0 I-Filter.
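
The registry lookup can be sketched roughly as follows. This is not the code from the test application: FilterLookup and its delegate are hypothetical names, and the two-step key layout shown (extension, then persistent handler, then the entry under the IFilter interface ID) is the commonly documented arrangement; some file types register their handler elsewhere, so treat it as an assumption. The registry read is passed in as a delegate so the path logic can be exercised without touching a real registry.

```csharp
using System;

// Hypothetical sketch of the extension-to-filter lookup.
public static class FilterLookup
{
    // Interface ID of IFilter.
    public const string IFilterIid = "{89BCB740-6119-101A-BCB7-00DD010655AF}";

    // Reads the default value of a key under HKEY_CLASSES_ROOT, or null if missing.
    public delegate string ReadDefaultValue(string subKey);

    // Returns the filter CLSID for an extension such as ".pdf", or null if none is found.
    public static string FindFilterClsid(string extension, ReadDefaultValue read)
    {
        // Step 1: the extension key names a persistent handler.
        string handler = read(extension + @"\PersistentHandler");
        if (String.IsNullOrEmpty(handler))
            return null;

        // Step 2: the handler lists the filter under the IFilter interface ID.
        return read(@"CLSID\" + handler + @"\PersistentAddinsRegistered\" + IFilterIid);
    }
}
```

In a real application the delegate would open the sub key with Microsoft.Win32.Registry.ClassesRoot.OpenSubKey and return the key’s default value, or null when the key does not exist.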

Below is a synopsis of the method that does the work of invoking the
filter (leave a comment if you want the rest of the code). The CLSID is
the class ID of the filter, read from the registry.

(Apologies for no syntax highlighting)

private static string ExecuteFilter(string clsID, string sourceFile)
{
  string result = String.Empty;

  // Some filters are not reentrant, such as Adobe PDF filter.
  lock(_lock)
  {
    object itfc = null;
    try
    {
      // Get the filter type from CLSID.
      Type t = Type.GetTypeFromCLSID(new Guid(clsID));
      if (null != t)
      {
        // Get filter instance.
        itfc = Activator.CreateInstance(t);

        // Cast to IFilter, then to IPersistFile.
        IFilter ifilt = (IFilter)(itfc);
        System.Runtime.InteropServices.UCOMIPersistFile ipf =
           (System.Runtime.InteropServices.UCOMIPersistFile)(ifilt);

        // Load source.
        ipf.Load(sourceFile, 0);

        // Initialize.
        uint i = 0;
        int hr = 0;
        STAT_CHUNK chunk = new STAT_CHUNK();
        ifilt.Init(IFILTER_INIT.NONE, 0, null, ref i);

        // Read the text in chunks.
        StringBuilder masterBuffer = new StringBuilder();
        while (0 == hr)
        {
          // Read next chunk structure.
          try
          {
            hr = ifilt.GetChunk(out chunk);
          }
          catch (COMException ex)
          {
            // GetChunk will throw an exception
            // when no more chunks to read - tsk.
            if (FILTER_E_END_OF_CHUNKS == ex.ErrorCode)
              hr = ex.ErrorCode;
            else
              throw; // rethrow without resetting the stack trace
          }

          // if chunk is text..
          if (0 == hr && CHUNKSTATE.CHUNK_TEXT == chunk.flags)
          {
            // Read text to buffer.
            uint bufferSize = CHUNK_SIZE;
            int hr2 = 0;

            // Keep reading while GetText succeeds. The earlier condition
            // (FILTER_S_LAST_TEXT != hr2 || 0 == hr2) never ended on an error code.
            while (0 == hr2)
            {
              bufferSize = CHUNK_SIZE;
              StringBuilder buffer = new StringBuilder((int)bufferSize);
              hr2 = ifilt.GetText(ref bufferSize, buffer);

              // Append only when the call actually returned text.
              if (0 == hr2 || FILTER_S_LAST_TEXT == hr2)
                masterBuffer.Append(buffer.ToString(0, (int)bufferSize));
            }

            // Did we get an error?
            if (FILTER_E_NO_MORE_TEXT != hr2 && FILTER_S_LAST_TEXT != hr2)
              throw new Exception("Failed reading data from chunk!");
          }
        }

        // Assign result.
        result = masterBuffer.ToString();
      }
    }
    catch (Exception ex)
    {
      throw new FileLoadException("Failed to read data from filter!", ex);
    }
    finally
    {
      if (null != itfc)
        Marshal.ReleaseComObject(itfc);
    }
  }

  return result;
}

Album Art and ID3_V2

I have been messing with my music collection today (in my lunch hour of
course). I spent time making sure that each MP3 and WMA file in my
collection contains the correct song title, artist name, and album
title (where possible) in its ID3v2 tags. I also went through the
painstaking effort of adding rankings to each song, so I could play all
my favorite 5-star files from the auto playlist in Windows Media Player.

I have a complete copy of my music collection on my home network, as
well as at the office, and I have rigged up my Media Center to play the
lot. It works really well: MCE picks up all my playlists and enables me
to thumb through my collection by album, artist or title. What’s
missing is the album art. When at the office I’m typically listening to
music while writing software, and the only picture I’m looking at is
Visual Studio.
However, when at home, I’m pumping the sounds through my stereo TV (as
well as my subwoofer) and the MCE screen displays the default logo for
the album info of each track played. Of course, I can always turn on
visualizations if I want a pretty picture, but I’d much rather see the
album cover of the song I’m listening to.

I checked the Internet for plug-ins for Windows Media Player and MCE
2005 to download and display album art, but I didn’t find exactly what
I’m looking for. I’m looking for some software that will use at least
the artist name and title of a song to retrieve missing information from
Freedb.org. Once the album name has been determined, the software should
contact an on-line store, such as Amazon.com, and download the album art
before adding it to the file’s ID3v2 tag.

I think I’ll be searching for a long while before an application or
plug-in becomes available to perform what I describe above. So, I
figure I can write an application myself. It shouldn’t be too difficult
to pull out ID3v2 tag information from MP3 and WMA files, and Freedb.org provides examples to search and download song information. Amazon.com
provides a web service to download album cover art (they do for books,
so I’m hoping they do for CDs). I’ll need to embed some logic in my app
for tracks in my collection that are missing album titles and belong to
multiple albums. The rest should be fairly simple to implement.
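
As a taste of the tag-reading side, here is a minimal C# sketch that reads just the 10-byte ID3v2 header at the start of an MP3 file and decodes the tag size. Id3v2Header and ReadTagSize are hypothetical names, it covers MP3 only (WMA files keep their metadata in a different container), and it stops short of parsing the individual frames where the title, artist and picture actually live.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class Id3v2Header
{
    // Returns the size in bytes of the ID3v2 tag body (excluding the 10-byte
    // header), or -1 when the stream does not start with a valid ID3v2 header.
    public static int ReadTagSize(Stream stream)
    {
        byte[] header = new byte[10];
        int read = 0;
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0)
                return -1; // stream is shorter than a header
            read += n;
        }

        // The tag starts with the ASCII marker "ID3".
        if (header[0] != 'I' || header[1] != 'D' || header[2] != '3')
            return -1;

        // Bytes 6..9 hold a "synchsafe" integer: 7 useful bits per byte,
        // with the top bit of every byte always clear.
        for (int i = 6; i < 10; i++)
        {
            if ((header[i] & 0x80) != 0)
                return -1;
        }
        return (header[6] << 21) | (header[7] << 14) | (header[8] << 7) | header[9];
    }
}
```

Knowing the tag size tells an application how many bytes to read before the audio data starts, which is the region any frame parser would then walk through.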

Watch this blog to see if I succeed in my plan……

Burn CDs from .NET

The XP Burn Component allows your .NET applications to burn files to CD-R/RW discs on a Windows XP or Windows Server 2003 system. This component does not work on systems that have a different OS installed, though it will detect that case and give a reasonable error message. This component talks directly to the system’s IMAPI interfaces and doesn’t use the Windows XP CD burning wizard, so it’s possible to create your own snazzy UI for burning CDs.

Though the component is a UserControl, I wouldn’t recommend that you put it in the toolbox. Instead, simply reference it and use it like you would use any other framework type (the constructor can potentially throw exceptions, so for robust handling you should wrap it in a try…catch). The documentation and source for the component are included in the download.

http://msdn.microsoft.com/vcsharp/team/code/xpburn/default.aspx

Incremental Backups with NAnt


Update 8/4/2005:

I have received various comments
concerning the original text of this post, mostly people asking me for
the full source. It has been a while since I played with this
project, and I have long since moved on. However, since I’m getting
lots of lovely feedback about it, I felt I should update this post.

Like the numpty I am, I somehow managed to misplace the original source
files, but did manage to find the compiled assemblies. So the new code
you see below has been disassembled and recompiled to check for errors.
I have not had the chance to test this code in NAnt, but I am confident
that it’ll work. If not, let me know and I will invest more time into it.

Many of us are not fortunate enough to own large backup devices, such as
tape drives. DVD burners are becoming more common in machines but can
still only handle around 4.5GB of data. I had recently been reading about
dedicated backup servers. PC hardware is relatively cheap nowadays and
hard disk drives plentiful in space, so the consensus is in favor of
inexpensive dedicated PCs to back up your data.

I have GBs of data, most of it home movies (taken of my son from
birth to 13 months) and a ton of music files. I also host my own source
repository of code that I write in my spare time. All of this data,
plus other bits and bobs, is spread across two servers and one laptop.

I got to thinking. Rather than purchasing another machine dedicated
to the purpose of backup, I could use each of the two servers to back up
the data of the other, thus producing a redundant data backup scheme.
Of course, in the event that my network becomes infected with a virus
or the house burns down, I’ll have no off-site backup that I can rely
on. I have anti-virus and ad-ware protection on my network, and if the
house burns down I have more to worry about than the loss of my music
and video archive. What I was short of was a backup in case of hardware
failure.

So, this afternoon I got to work on copying data from one machine to
the other and vice versa. I decided that a manual approach would be
painful over time. With the regularity of picture and movie uploads
from my digital camera it wouldn’t take long before my backups would
become out of date, and I didn’t much feel like keeping track of two
directory locations for all the data I upload. I started to use
NTBackup to schedule backup tasks to run nightly, but this became
cumbersome to maintain with the various directories and exclusions on
both servers. It was then that I remembered NAnt.

NAnt is primarily for building and deploying development projects,
but is sophisticated enough to handle advanced tasks required in a
modern day project deployment. NAnt is simple to use and install, and
all the configuration lives in one script file (in XML). With NAnt, I
could write one script that contains copy operations for both servers
and host a copy of the same script on both machines with a scheduled
task to run NAnt.exe.

My plan worked well. I created a simple script that would use the
“copy” task of NAnt to copy across files. The “fileset” task works
great at recursing directories, so my script did a lot of copying for
little coding effort. There was, however, one drawback: each time the
script ran it would spend hours copying GBs of data. Most of my
data remains static on a day-to-day basis, with the exception
of a few file additions or minor changes, so I needed to get NAnt to copy
a file only if it wasn’t at the destination or was different. No such
task exists in NAnt to perform an incremental copy, so I embarked on
writing one.

I wrote a very simple extension class for NAnt that reuses a
“fileset” task to obtain a list of files. For each file, it checks
whether the file exists at the destination and, if so, runs a
modified date check and, when the dates differ, a CRC32 comparison to see
if the source file has changed from the destination copy.
Below is the task code. I omitted the CRC32 code for brevity, but if you would like it please drop me a line in the comments section of this blog.

Update 8/4/2005: Both the task and CRC32 code are below.

CopyCompare.cs

using System;
using System.IO;
using NAnt.Core;
using NAnt.Core.Types;
using NAnt.Core.Attributes;

namespace NantBackup
{
  /// <summary>
  /// NANT Task - compare CRC of files before copy.
  /// </summary>
  [TaskName("copyCompare")]
  public class CopyCompare : Task
  {
    #region Fields

    private FileSet _filesToCheck = null;
    private string _destination = String.Empty;

    #endregion Fields

    #region Construction

    public CopyCompare() {}

    #endregion Construction

    #region Properties

    [BuildElement("fileList", Required=true)]
    public FileSet FilesToCheck
    {
      get { return _filesToCheck; }
      set { _filesToCheck = value; }
    }

    [TaskAttribute("destDir", Required=true)]
    public string DestinationDir
    {
      get { return _destination; }
      set { _destination = value; }
    }

    #endregion Properties

    #region Methods

    protected override void ExecuteTask()
    {
      // Iterate the filelist.
      foreach (string filePath in _filesToCheck.FileNames)
      {
        // Strip off the leading base directory name (including end slash).
        string baseDir = _filesToCheck.BaseDirectory;
        if (!baseDir.EndsWith("\\"))
          baseDir += "\\";
        string subPath = filePath.Substring(baseDir.Length);

        // Destination name. Path.Combine copes with destDir given
        // with or without a trailing slash.
        string destPath = Path.Combine(_destination, subPath);
        // Copy file.
        CopyFile(filePath, destPath);
      }
    }

    private void CopyFile(string srcPath, string destPath)
    {
      if (FileChanged(srcPath, destPath))
      {
        // Check if destination directory exists.
        string destDir = Path.GetDirectoryName(destPath);
        if (!Directory.Exists(destDir))
          Directory.CreateDirectory(destDir);
        // Copy the file.
        File.Copy(srcPath, destPath, true);
      }
    }

    private bool FileChanged(string srcPath, string destPath)
    {
      FileStream src = null;
      FileStream dest = null;
      try
      {
        // If destination exists then see if files differ.
        if (File.Exists(destPath))
        {
          // Has the date changed?
          if (FileDateChanged(srcPath, destPath))
          {
            // File dates differ, make sure the file has changed!
            src = new FileStream(srcPath, FileMode.Open, FileAccess.Read);
            dest = new FileStream(destPath, FileMode.Open, FileAccess.Read);
            string srcHash = ComputeHash(src);
            string destHash = ComputeHash(dest);
            return (0 != String.Compare(srcHash, destHash, true));
          }
          else
            // Dates the same, so assume they're the same.
            return false;
        }
        else
          // If no destination, make sure we have the source.
          return File.Exists(srcPath);
      }
      finally
      {
        // Always release the file handles, even if hashing fails.
        if (null != src)
          src.Close();
        if (null != dest)
          dest.Close();
      }
    }

    private bool FileDateChanged(string srcPath, string destPath)
    {
      FileInfo fiSrc = new FileInfo(srcPath);
      FileInfo fiDest = new FileInfo(destPath);
      return (fiSrc.LastWriteTime != fiDest.LastWriteTime);
    }

    private string ComputeHash(Stream stream)
    {
      CRC32 hasher = new CRC32();
      string result = BitConverter.ToString(hasher.ComputeHash(stream));
      hasher.Clear();
      return result;
    }

    #endregion Methods
  }
}

CRC32.cs

using System;
using System.IO;
using System.Collections;
using System.Text;
using System.Security.Cryptography;

namespace NantBackup
{
  public class CRC32 : HashAlgorithm
  {
    #region Fields

    protected static uint AllOnes;
    protected static bool autoCache;
    protected static Hashtable cachedCRC32Tables;
    protected uint[] crc32Table;
    private uint m_crc;

    #endregion Fields

    #region Properties

    public static bool AutoCache
    {
      get { return CRC32.autoCache; }
      set { CRC32.autoCache = value; }
    }

    public static uint DefaultPolynomial
    {
      get { return 0x4c11db7; }
    }

    #endregion Properties

    #region Construction

    static CRC32()
    {
      CRC32.AllOnes = uint.MaxValue;
      CRC32.cachedCRC32Tables = Hashtable.Synchronized(new Hashtable());
      CRC32.autoCache = true;
    }

    public CRC32()
      : this(CRC32.DefaultPolynomial)
    {
    }

    public CRC32(uint aPolynomial)
      : this(aPolynomial, CRC32.AutoCache)
    {
    }

    public CRC32(uint aPolynomial, bool cacheTable)
    {
      this.HashSizeValue = 0x20;
      this.crc32Table = (uint[])CRC32.cachedCRC32Tables[aPolynomial];
      if (this.crc32Table == null)
      {
        this.crc32Table = CRC32.BuildCRC32Table(aPolynomial);
        if (cacheTable)
        {
          CRC32.cachedCRC32Tables.Add(aPolynomial, this.crc32Table);
        }
      }
      this.Initialize();
    }

    #endregion Construction

    #region Methods

    // Builds the 256-entry lookup table, one entry per possible byte value.
    protected static uint[] BuildCRC32Table(uint ulPolynomial)
    {
      uint[] numArray1 = new uint[0x100];
      for (int num2 = 0; num2 < 0x100; num2++)
      {
        uint num1 = (uint)num2;
        for (int num3 = 8; num3 > 0; num3--)
        {
          if ((num1 & 1) == 1)
          {
            num1 = (num1 >> 1) ^ ulPolynomial;
          }
          else
          {
            num1 = num1 >> 1;
          }
        }
        numArray1[num2] = num1;
      }
      return numArray1;
    }

    public static void ClearCache()
    {
      CRC32.cachedCRC32Tables.Clear();
    }

    public new byte[] ComputeHash(byte[] buffer)
    {
      return this.ComputeHash(buffer, 0, buffer.Length);
    }

    public new byte[] ComputeHash(Stream inputStream)
    {
      int num1;
      byte[] buffer1 = new byte[0x1000];
      while ((num1 = inputStream.Read(buffer1, 0, 0x1000)) > 0)
      {
        this.HashCore(buffer1, 0, num1);
      }
      return this.HashFinal();
    }

    public new byte[] ComputeHash(byte[] buffer, int offset, int count)
    {
      this.HashCore(buffer, offset, count);
      return this.HashFinal();
    }

    // Folds each byte of the range [offset, offset + count) into the running CRC.
    protected override void HashCore(byte[] buffer, int offset, int count)
    {
      for (int num1 = offset; num1 < offset + count; num1++)
      {
        ulong num2 = (this.m_crc & 0xff) ^ buffer[num1];
        this.m_crc = this.m_crc >> 8;
        this.m_crc ^= this.crc32Table[(int)num2];
      }
    }

    protected override byte[] HashFinal()
    {
      byte[] buffer1 = new byte[4];
      ulong num1 = this.m_crc ^ CRC32.AllOnes;
      buffer1[0] = (byte)((num1 >> 0x18) & 0xff);
      buffer1[1] = (byte)((num1 >> 0x10) & 0xff);
      buffer1[2] = (byte)((num1 >> 8) & 0xff);
      buffer1[3] = (byte)(num1 & 0xff);
      return buffer1;
    }

    public override void Initialize()
    {
      this.m_crc = CRC32.AllOnes;
    }

    #endregion Methods
  }
}

The NAnt script that calls the above task is as follows…

[Script markup lost; surviving text: “Test build script to test Copy-Compare Custom Task.”, “Loading Custom Task Script…”, “Done”.]
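
A build file along these lines would fit the task code above. It is a sketch rather than the original script: the project and target names, the assembly file name and both paths are assumptions, and only the copyCompare, destDir and fileList names come from the task’s attributes. Depending on the NAnt version, the include element inside a fileset may need to be spelled includes.

```xml
<?xml version="1.0"?>
<project name="backup" default="backup">
  <description>Test build script to test Copy-Compare Custom Task.</description>

  <target name="backup">
    <echo message="Loading Custom Task Script…" />
    <!-- Assembly name is an assumption; point it at the compiled task. -->
    <loadtasks assembly="NantBackup.dll" />
    <echo message="Done" />

    <!-- Both paths are placeholders. -->
    <copyCompare destDir="\\server2\backup\music">
      <fileList basedir="D:\music">
        <include name="**/*" />
      </fileList>
    </copyCompare>
  </target>
</project>
```

One copyCompare element per directory tree keeps the script short, and the same file can be scheduled on both servers with the source and destination paths swapped.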

With the example above I was able to create a complete build script that could
be deployed to both servers and run an incremental redundancy backup.

In addition to
the setup I describe above, I have one other server off-site in another
state. This server uses NTBackup to create backup files on an FTP
share. My next trick will be to use NAnt to FTP into the remote server,
pull the NTBackup files and store them on either or both of the servers
hosted in my basement at home. This will require writing another NAnt
custom task (unless one exists to perform FTP). When complete, I’ll have
one script that maintains a backup for all three of my servers.

Also part of my plan is to write a script for the single laptop (mentioned way, way
back at the top of this blog entry). Unlike the servers, the laptop is
not on all the time, so backups are limited to when the laptop is in
use. I plan to use the same incremental backup method each time I log
on. File changes are usually minimal on this machine so a backup should
be fairly quick to run and not inhibit any work.

NAnt can be downloaded at
http://nant.sourceforge.net

Free POP3 Client for C#

Announcing FreePOP3 ….

http://www.gotdotnet.com/Workspaces/Workspace.aspx?id=2331c59d-e726-4197-b28f-ba17845153d4

I’ve not had the chance to try this out yet but I plan to. Hopefully it has the feature of leaving messages on the server. Ideally I’d like an IMAP client, but POP3 will work for what I need for now. Anyone know of a free IMAP client for .NET?