Daily Archives: July 5, 2005

Checking if a Directory Exists

I recently had to write a utility that moved hundreds of thousands of files to a new location under a different directory organization. As part of it, I checked whether the destination directory already existed and, if not, created it. At one point I wondered if it would be faster to just try to create it and, if that fails, assume that it already exists (remember, I’m dealing with hundreds of thousands of files here, so anything that speeds it up is very welcome).

Determining if a directory exists isn’t entirely straightforward. If you use .NET, you can use Directory.Exists(), but that function must use the Win32 API at some point, and the core Win32 file API has no function dedicated to determining whether a directory exists. So what is it doing?

Ah, but there is an API to get the attributes of a given filename.

[code lang="cpp"]
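#include <windows.h>

BOOL DirectoryExists(const char* dirName)
{
    DWORD attribs = ::GetFileAttributesA(dirName);
    if (attribs == INVALID_FILE_ATTRIBUTES) {
        // The call failed: the path may be missing, or merely inaccessible.
        return FALSE;
    }
    // The directory bit is set only when the path names a directory, not a file.
    return (attribs & FILE_ATTRIBUTE_DIRECTORY) != 0;
}
[/code]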

Note that if the function call fails, it doesn’t necessarily mean that the directory doesn’t exist: the device could be inaccessible, you might not have security permissions, or any number of other things could have gone wrong. To know for sure what’s going on, you would need to call GetLastError() and inspect the error code (for example, ERROR_FILE_NOT_FOUND or ERROR_PATH_NOT_FOUND when the path really is missing).
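To make the distinction between “missing” and “could not tell” concrete, here is a minimal sketch. It is an assumption-laden illustration, not the Win32 code from this post: it uses the C++17 std::filesystem library as a portable stand-in for GetFileAttributesA() plus GetLastError(), and the names DirStatus and ClassifyDirectory are hypothetical.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical result type for this sketch; not part of the original post.
enum class DirStatus { Exists, Missing, NotADirectory, Unknown };

// Classifies a path without throwing. Uses std::filesystem (C++17) as a
// portable stand-in for GetFileAttributesA() plus GetLastError().
DirStatus ClassifyDirectory(const fs::path& dirName)
{
    std::error_code ec;
    const fs::file_status st = fs::status(dirName, ec);

    switch (st.type()) {
    case fs::file_type::not_found:
        return DirStatus::Missing;        // some element of the path does not exist
    case fs::file_type::directory:
        return DirStatus::Exists;
    case fs::file_type::none:             // status() failed; details are in ec
    case fs::file_type::unknown:          // path resolved, attributes unreadable
        return DirStatus::Unknown;
    default:
        return DirStatus::NotADirectory;  // regular file, device, and so on
    }
}
```

A caller that gets DirStatus::Unknown should probably report the problem rather than assume the directory is absent, which is the same caution that applies to a failed GetFileAttributesA() call.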

So what if you’re creating directories? Why not try to create them no matter what? That’s fine, but is that faster than checking to see if it exists first? Let’s test it.

[code lang="cpp"]
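// Note: <windows.h> defines CreateDirectory as a macro, so this wrapper's
// real name becomes CreateDirectoryA or CreateDirectoryW.
BOOL CreateDirectory(const char* dirName)
{
    return ::CreateDirectoryA(dirName, NULL);
}

// Try to create the directory on every iteration, whether or not it exists.
// The count matches the 10,000,000 iterations reported in the results.
for (int i = 0; i < 10000000; i++) {
    CreateDirectory(dirName);
}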
[/code]

Results (10,000,000 iterations):
265.227 second(s) total
2.65227e-005 second(s) average iteration
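The post doesn’t show how the loop was timed. One way to get a total and a per-iteration average like the figures above is sketched here with std::chrono; the TimingResult struct and the TimeIterations helper are hypothetical names, and this is not necessarily the timer that produced these numbers.

```cpp
#include <chrono>

// Hypothetical holder for the two figures reported in the results.
struct TimingResult {
    double totalSeconds;
    double averageSeconds;
};

// Runs body() the requested number of times and measures wall-clock time
// with a monotonic clock. Assumes iterations > 0.
template <typename Fn>
TimingResult TimeIterations(int iterations, Fn&& body)
{
    using Clock = std::chrono::steady_clock;

    const auto start = Clock::now();
    for (int i = 0; i < iterations; i++) {
        body();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;

    return { elapsed.count(), elapsed.count() / iterations };
}
```

Either loop body from this post could be passed in as a lambda, for example `TimeIterations(10000000, [&] { CreateDirectory(dirName); })`.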

Now let’s try checking first:

[code lang="cpp"]
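// Same 10,000,000 iterations, but only attempt creation when the directory is missing.
for (int i = 0; i < 10000000; i++) {
    BOOL bExists = DirectoryExists(dirName);
    if (!bExists) {
        CreateDirectory(dirName);
    }
}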
[/code]

Results (10 million iterations):
103.24 second(s) total
1.0324e-005 second(s) average iteration

Over 2.5 times faster!

Now, my simple test retries a single folder over and over, so it never actually creates anything. In the utility I mentioned above, I’m creating far fewer directories than the number of files I’m moving into them (though still in the thousands). In that case, it’s definitely worth my time to check whether the folder exists before trying to create it.

To me, it appears that unless the number of folders you’re creating is of the same magnitude as the number of files, it definitely makes sense to check first.
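As a rough illustration of the check-first pattern in a file-moving loop, here is a minimal sketch of a helper that could be called once per file. It again uses C++17 std::filesystem rather than the Win32 calls measured above, so its timings may differ from the results in this post, and EnsureDirectory is a hypothetical name.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Returns true if dirName is a directory by the time the call returns.
// Checks first, and only attempts creation when the check says it is missing.
bool EnsureDirectory(const fs::path& dirName)
{
    std::error_code ec;

    // Common case when many files share a destination: it is already there.
    if (fs::is_directory(dirName, ec)) {
        return true;
    }

    // Rare case: create one directory level, then confirm the result so that
    // a plain file sitting at this path is not mistaken for success.
    fs::create_directory(dirName, ec);
    return fs::is_directory(dirName, ec);
}
```

Note that create_directory() makes only the last path component, so this sketch assumes the parent already exists; std::filesystem::create_directories() would be the call to reach for when nested levels may be missing.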

This goes to show that you can’t believe anything related to performance until you measure it for your application.