Tag Archives: Code

ArgumentNullException and ArgumentException

Does it strike anyone else as ironic that ArgumentException and ArgumentNullException have mismatched argument ordering? The parameter name is first for the null version, but second for the other one. Uggh… this makes it awkward to remember if you use both. ArgumentOutOfRangeException follows the example of ArgumentNullException.

I can see no obvious reason for the discrepancy. In all three cases, the constructors are exactly the same. In fact, MSDN explicitly says that the behavior is the same.

SM_SECURE

In the system metrics that you can retrieve with GetSystemMetrics, one of the available options is SM_SECURE.

The only explanation provided is “Nonzero if security is present; otherwise, 0.” What does that mean? On my computer, the value is 0. There are a few different ways security can be “present” in my mind–what specifically are they talking about here?

Google, MSN, MSDN yield no useful information.

2-D Arrays versus Structs

I had a situation the other day where I needed an array of two values and the thought occurred to me: which is better? A 2-D array or a 1-D array of structs. I decided to come up with a quick test to see.

Here are my results:

2-D Arrays : 55.9376 cycles/row
Structs : 31.9937 cycles/row

That’s the number of cycles, on average, to add up the numbers in a row. Cycles makes more sense to me than microseconds when we’re talking about this level of code.

It turns out that the struct version is about 74% faster, which confused me slightly. Isn’t the memory layout for both options the same and should thus have the same assembly code?

It turns out…no. Before proceeding, download the test code here. It’s not pretty, but it works. Some of the mess is designed to keep the compiler from optimizing out my test cases. I tested with Visual Studio 2005 beta 2 with default release mode compiler options.

So lets look at two things I discovered:

Here are the addresses of the first two rows (of two elements each) of the two array styles:

sum1: 0x00354860 0x00354868 0x00354878 0x00354880
sum2: 0x026F0020 0x026F0028 0x026F0030 0x026F0038

Take a look at sum2 (the array of structs); each element starts exactly at 8-byte intervals–what we would expect because memory is often aligned on 8-byte boundaries on x86.

But look at sum1; the elements in the first row are 8 bytes apart, but the next row starts 16 bytes later! That means 8 additional bytes are wasted between each row. This means that fewer rows will fit in the CPU’s L1 cache, causing more cache misses, slowing down the program. (Why did this layout occur? I don’t know the answer to that yet)

Second of all, let’s look at the assembly code. This is just for adding the two elements of a row together.

;array addition
mov eax, DWORD PTR _arr$[esp+4] ; 1st operand addr
mov eax, DWORD PTR [eax+ebp*4] ; 2nd operand addr
fld QWORD PTR [eax+8] ; 1st operand
fadd QWORD PTR [eax] ;add 2nd operand
faddp ST(1), ST(0) ;add result to answer

;struct addition
fld QWORD PTR [eax+8];put first operand in register
fadd QWORD PTR [eax-16] ; add other operand
faddp ST(1), ST(0) ;add result to answer

So…the array version has to do two mov’s to fixup the correct addresses.

It seems now that the struct version is slightly more efficient here–both in space and in speed. However, this is really a microcosm of a problem. This kind of thing probably wouldn’t matter unless you were doing a lot of it. Personally, I’d go with the struct more often than not because it would be much easier to update the code to use a 3rd field, for example.

List<> vs. ArrayList

I read Rico Mariani’s latest quiz, and decided to check out the results for myself in BRayTracer.

I already have some simple performance benchmark tests in my NUnit tests, so I ran some before and after. I changed only the ArrayList’s used in the scene object to hold shapes, materials, and lights.

Before:

25.197 opaque spheres/sec (time: 3.969 )

After:

31.841 opaque spheres/sec (time: 3.141 )

Pretty impressive savings with minimal work! Almost a full second! In an enormous scene, that will add up to a LOT of time saved. I still have to change the implementation for polygons and polygonal meshes–this will greatly speed up those operations, which are currently fairly slow. In fact, the way I have polygonal meshes written at the moment, they’re nearly impossible to render in a decent amount of time.

More exciting things coming to BRayTracer soon…

Curvy: Part 1

I’ve been wanting to learn COM lately, and not knowing where to start I began in some of the MFC books I have. Admittedly, MFC hides quite a bit of COM complexity, but I figured it would be a good place to get my feet wet without getting scared.

My first test program is called Curvy (Actually, Curvy2 since I restarted it with the doc/view architecture). It’s a curve drawing program. You can use the left mouse button to create points on a curve, the right mouse button closes the curve, and you can change the control points and move the curves.

The fun part was implementing copy & paste. It uses the OLE clipboard and puts both its native format and a bitmap on the clipboard.

The other fun part was implementing OLE Drag & Drop. This took me a few times to get quite right, but the results are spectacular. You can drag shapes within a window as well as to other windows. Holding down the ctrl key will copy the shape instead.

The project is in VS .Net 2005 format, but the code will compile under previous versions of VS.

Click here to download.

Next stop for this? Implement a full OLE server so other programs can host Curvy components inside them!

Checking if a Directory Exists

I recently had to write a utility that moved hundreds of thousands of files to a new location under a different directory organization. As part of it, I checked to see if the destination directory already existed and if not, created it. At one point I wondered if it would just be faster to try and create it, and if it fails, assume that it already exists (remember, I’m dealing with hundreds of thousands of files here–anything to speed it up is very welcome).

Determining if a directory exists isn’t entirely straightforward. If you use .Net, you can use Directory.Exists(), but that function must use the Win32 API at some point and there is no Win32 API that determines the existence of a directory, so what is it doing?

Ah, but there is an API to get the attributes of a given filename.

[code lang=”cpp”]
BOOL DirectoryExists(const char* dirName)
{
DWORD attribs = ::GetFileAttributesA(dirName);
if (attribs == INVALID_FILE_ATTRIBUTES) {
return false;
}
return (attribs & FILE_ATTRIBUTE_DIRECTORY);
}[/code]

Note that if the function call fails it doesn’t necessarily mean that the directory doesn’t exist–it could be that the device is inaccessible, you don’t have security permissions, or any number of other things. To know for sure what’s going on, you would need to call GetLastError().

So what if you’re creating directories? Why not try to create them no matter what? That’s fine, but is that faster than checking to see if it exists first? Let’s test it.

[code lang=”cpp”]
BOOL CreateDirectory(const char* dirName)
{
return ::CreateDirectoryA(dirName,NULL);
}

for (int i=0;i

CreateDirectory(dirName);

}
[/code]

Results (10,000,000 iterations):
265.227 second(s) total
2.65227e-005 second(s) average iteration

Now let’s try checking first:

[code lang=”cpp”]
for (int i=0;i BOOL bExists = DirectoryExists(dirName);
if (!bExists) {
CreateDirectory(dirName);
}
}
[/code]

Results (10 million iterations):
103.24 second(s) total
1.0324e-005 second(s) average iteration

Over 2 .5 times faster!

Now, my simple test is retrying a single folder over and over, and it never actually creates anything. In my case for the utility I mentioned above, I’m creating far fewer directories than the number of files I’m moving to them (though still in the thousands). In that case, it’s definitely worth my time to check to see if the folder exists before trying to create it.

To me, it appears that unless the number of folders you’re creating is of the same magnitude as the number of files, it definitely makes sense to check first.

This goes to show that you can’t believe anything related to performance until you measure it for your application.

Code Timing functions

When it comes to timing your code, there are a lot of options, and depending on what you need to do, not all of them make sense. There are timing functions which utilize callbacks and other ways to periodically call a designated function at a regular interval–I’m not going to discuss those types of timers.

Instead, I’ll focus on the kinds of timing you can do to assess code performance.

To illustrate the differences between these methods, let’s have a simple C++ program:

[code lang=”cpp”]
#include
#include

using namespace std;

int main(int argc, char** argv)
{
//begin timing

//test code
const int nLimit = 10000000;
double sum = 0;

for (int i=0;i {
double rt = sqrt(static_cast(i));
sum+=rt;
}

//end timing

cout << “sum: “<

cout << elapsed << ” second(s) total”< cout << avg << ” second(s) average iteration”<

return 0;
}[/code]

I deliberately chose something simple and fast. Here’s why: if you’re timing something that always takes an hour–a second doesn’t matter, and the first option will work perfectly fine. On the other hand, if you need to time something small that will potentially be repeated dozens, hundreds, thousands, or millions of times, every microsecond can matter.

time_t and CTime

time() is the good old C standby. It’s simple, it’s easy, it’s historic, and it’ll break after 19:14:07, January 18, 2038, UTC.

To remedy this, Microsoft has the _time64() function, which uses the type __time64_t. It improves the lifespan, but not the precision.

Our code is now:
[code lang=”cpp”]
#include
#include
#include

using namespace std;

int main(int argc, char** argv)
{
//begin timing
time_t start = time(NULL);

//test code
const int nLimit = 10000000;
double sum = 0;

for (int i=0;i {
double rt = sqrt(static_cast(i));
sum+=rt;
}

//end timing
time_t end = time(NULL);
cout << “sum: “< long elapsed = static_cast(end – start);
double avg = (double)elapsed / nLimit;
cout << elapsed << ” second(s) total”< cout << avg << ” second(s) average iteration”<

return 0;
}[/code]

Output:
sum: 2.10819e+013
16 second(s) total
0 second(s) average iteration

MFC’s CTime is just a wrapper for time_t (VC++6.0) or __time64_t (VC++7), so nothing new here.

time_t is good when you don’t need very precise measurements, when you’re just interested in the date and/or time, as in wall-clock time, or when it’s your only option.

For timing code, this is not a great option.

GetTickCount

Windows contains a function called GetTickCount which returns the number of milliseconds since the machine was turned on. This gives us 1,000 times better accuracy then time().

[code lang=”cpp”]
#include
#include
#include

using namespace std;

int main(int argc, char** argv)
{
//begin timing
DWORD start = GetTickCount();

//test code
const int nLimit = 10000000;
double sum = 0;

for (int i=0;i {
double rt = sqrt(static_cast(i));
sum+=rt;
}

//end timing
DWORD end = GetTickCount();
cout << “sum: “< double elapsed = (end – start) / 1000.0;//ms –> s
double avg = ((double)elapsed)/ nLimit;
cout << elapsed << ” second(s) total”< cout << avg << ” second(s) average iteration”<

return 0;
}[/code]

Output:
sum: 2.10818e+010
0.172 second(s) total
1.72e-008 second(s) average iteration

As seen above, this value is returned in a DWORD, which is 32-bits wide. 232 milliseconds comes out to about 49 days, at which time the counter rolls over to 0. If you’re timing very short intervals, this probably won’t be a problem. The MSDN documentation has a code example to detect timer wrap-around.

GetTickCount provides much more precision than time_t, but with processor speeds at multiple gigahertz, a lot can happen in a millisecond. Enter…

QueryPerformanceCounter and QueryPerformanceFrequency

This set of functions is also available in the Win32 API and they are much more accurate than time_t and GetTickCount.

Their usage is often paired. QueryPerformanceCounter returns the current count on the timer, and QueryPerformanceFrequency returns how many counts per second there are. The documentation makes clear that timer frequency is not the same as processor clock frequency. This timer is hardware-based, however, and if your computer doesn’t have a timing device, then calling QueryPerformanceCounter returns the same thing as GetTickCount above, and QueryPerformanceFrequency returns 1000 (1000 ms = 1 s).

The code reads:

[code lang=”cpp”]
#include
#include
#include

using namespace std;

int main(int argc, char** argv)
{
//begin timing
LARGE_INTEGER start;
::QueryPerformanceCounter(&start);

//test code
const int nLimit = 10000000;
double sum = 0;

for (int i=0;i {
double rt = sqrt(static_cast(i));
sum+=rt;
}

//end timing
LARGE_INTEGER end;
LARGE_INTEGER countsPerSecond;
::QueryPerformanceCounter(&end);
::QueryPerformanceFrequency(&countsPerSecond);

double elapsed = (double)(end.QuadPart – start.QuadPart) / countsPerSecond.QuadPart;
double avg = elapsed / nLimit;
cout << “sum: “< cout << elapsed << ” second(s) total”< cout << avg << ” second(s) average iteration”<

return 0;
}[/code]

Output:
sum: 2.10818e+010
0.165298 second(s) total
1.65298e-008 second(s) average iteration

You can easily see the much improved precision.

OK, well these are all well and good if you have Windows, but what if you don’t have Windows, or the time isn’t as important as the number of cycles required to compute something?

Here’s…

GetMachineCycleCount

…a function to read the current CPU clock cycle. Note that there aren’t any API or standard library functions to do this so we need to use assembly. Note also that this wasn’t possible until the Pentium came along, so don’t run this code on a pre-Pentium computer. 🙂

[code lang=”cpp”]
#include
#include

using namespace std;

inline __int64 GetMachineCycleCount()
{
__int64 cycles; //64-bit int

_asm rdtsc; // won’t work on 486 or below – only pentium or above
_asm lea ebx,cycles; //ebx = &cycles
_asm mov [ebx],eax; //get low-order dword
_asm mov [ebx+4],edx; //get hi-order dword

return cycles;
}

int main(int argc, char** argv)
{
//begin timing

__int64 start = GetMachineCycleCount();

//test code
const int nLimit = 10000000;
double sum = 0;

for (int i=0;i {
double rt = sqrt(static_cast(i));
sum+=rt;
}

//end timing
__int64 end = GetMachineCycleCount();

cout << “sum: “< __int64 elapsed = end – start;
double avg = (double)elapsed / nLimit;
cout << elapsed << ” cycles total”< cout << avg << ” cycles average iteration”<

return 0;
}[/code]

Output:
sum: 2.10818e+010
263631085 cycles total
26.3631 cycles average iteration

What are some situations where this is useful? If you want to make a piece of code absolutely the fastest on any piece of hardware, regardless of clock-time, this can give you a truer reflection of the effectiveness of your algorithm. If you don’t have a high-performance timer API (and you don’t want to write one), then this is a quick-and-dirty, but effective, solution.

So what about discovering the speed of your processor? Well, there’s no machine instruction to do this. Basically, you have to combine the cycle count with the performance timers and run through a lot of loops to calculate an average.

I hope this has helped some people wade through the various options of timing performance-critical code.

Code Security and Typed Assembly Language

Over the summer I’m taking a class called Programming Languages and Security. This is the first time I’ve delved into security at this level before. It’s a seminar style, which means lots of paper reading, and I am going to give two presentations.

My first presentation was this past Thursday. I spoke about typed assembly language and security automata. It was absolutely fascinating, ignoring the formality of proofs, and all the mathematical notations.

The two papers I discussed were:

The TALx86 begins by describing many shortcomings of the Java Virtual Machine Language (bytecode), including such things as:

  • Semantic errors in the bytecode that could have been discovered if a formal model had been used in its design.
  • Difficulty in compiling languages other than Java into bytecode. For example, it’s literally impossible to correctly compile Scheme into bytecode. OK, Scheme is a pretty esoteric language, but…
  • Difficulty even in extending the Java language because of the bytecode limitations
  • Interpretation is slow, and even though JIT is often used these days, that’s not built-in to the VM

My immediate thought on reading this was, “Hey! .Net addresses each and every single one of these points!”

  • The CLR defines a minimal subset of functionality that must be supported by every .Net language–allowing over 40 languages to be compiled to MSIL
  • As a bonus, MSIL is typed (as is Java bytecode)
  • Just-In-Time compilation was designed in from the beginning and generally has superior performance to Java (in my experience)

It also seems that many of the experimental features present in such early research, such as TALx86, has ended up in .Net and compilers these days. Type safety is being pushed lower and lower. Security policies are being implemented into frameworks, operating systems and compilers, and there are other tools that analyze your code for adherence to security best practices.

On platforms such as .Net, type safety is more important because you can have modules written in VB.Net interacting with objects written in C++ or Python, for example. Those languages don’t know about each other’s types, but at the MSIL level you can ensure safety.

If you’d like, a copy of the presentation is available.

Modifying the System Menu in MFC

Introduction

The system menu is a standard feature of every Windows application. It is managed by Windows so normally we don’t have to worry about it at all. However sometimes it is nice to be able to modify that menu according to our own program with things that Windows can’t automatically do for us.

As my main example I will be using a tool I’ve created called BabelOn. This is a C++/MFC program that accesses a web-service to translate text into foreign languages. It uses a toolbox window, so we don’t see the system menu icon in the upper-left corner, however the menu can still be accessed with Ctrl-Space or by right-clicking on the title bar. The menu has been modified to contain two extra commands: About BabelOn, and Exit. Exit is needed because the default action for Close (Alt-F4) has been overridden in the program to hide the window and allow tray access.

Adding Commands

First we need to define a unique variable to represent each menu item. This can be done in the Resource.h file, or in any standard header file. If the commands already exist as part of another standard or context menu, then this step can be skipped because the definitions already exist. However, it’s important to note that even if we use the pre-existing definitions, the message handlers for the commands on the regular menu will NOT be automatically called. System commands are routed differently. So, in the end, it doesn’t matter if we use the existing or define our own.

For our example, we’ll define two:

[code]
#define IDM_ABOUT 16
#define IDM_EXIT 17
[/code]

The IDM just means this is a menu-item ID.

We add these commands in our window’s initializing function ( OnInitDialog(), OnCreate() ). My example is in a dialog class, so this is what the function looks like:

[code lang=”cpp”]
BOOL CBabelOnDlg::OnInitDialog()
{
CDialog::OnInitDialog();
// Add “About…” and “Exit” menu items to system menu.
// Command IDs must be in system range
// the ANDing is because of a bug in Windows 95
ASSERT((IDM_ABOUTBOX & 0xFFF0) == IDM_ABOUTBOX);
ASSERT(IDM_ABOUTBOX < 0xF000);
ASSERT((IDM_EXIT & 0xFFF0) == IDM_EXIT);
ASSERT(IDM_EXIT < 0xF000);
CMenu* pSysMenu = GetSystemMenu(FALSE);
if (pSysMenu != NULL) {
pSysMenu->AppendMenu(MF_STRING,IDM_EXIT,”E&xit Program”);
pSysMenu->AppendMenu(MF_SEPARATOR);
pSysMenu->AppendMenu(MF_STRING, IDM_ABOUTBOX, “A&bout BabelOn”);

}
. . .
//other initialization
[/code]

The first thing you should notice is a couple of ASSERT statements for each command. The first one deals with a bug present in Windows 95 and the second ensures the custom command is below the range used by pre-defined system commands. From the MFC documentation: “All predefined Control-menu items have ID numbers greater than 0xF000. If an application adds items to the Control menu, it should use ID numbers less than F000.”

Next we get a pointer to the system menu with the GetSystemMenu. We call it with an argument of FALSE to get the pointer. If we call it with TRUE, then it will reset the menu to its default state.

If the pointer is valid, we call some commands to add to the bottom of the menu, passing the IDs and the string we want to show up when the menu is viewed.

Simple, isn’t it?

Processing Custom Commands

In order to have those commands do anything, we can’t rely on the normal message-handling mechanism, even if we have handlers for the same items in other menus. We have to handle the WM_SYSCOMMAND message in our dialog/window class.

[code lang=”cpp”]
void CBabelOnDlg::OnSysCommand(UINT nID, LPARAM lParam)
{
//trap our own system menu messages
if ((nID & 0xFFF0) == IDM_ABOUTBOX)
{
CAboutDlg dlgAbout; dlgAbout.DoModal();
} else if ((nID & 0xFFF0)==SC_CLOSE)
{
OnClose();
} else if ((nID & 0xFFF0)==IDM_EXIT)
{
::PostQuitMessage(0);
} else {
CDialog::OnSysCommand(nID, lParam);
}
}
[/code]

This is the height of simplicity herself. We compare the code that was passed in with the system to our commands, again ANDing them to account for the Windows 95 bug. Then call whatever code we want to handle it.

If we selected Exit, then we post a message to quit the application.

Take a look at the second test: SC_CLOSE is a predefined menu constant. It’s one normally handled by Windows, but we can still add custom processing to it if we want. In this application, I didn’t want it to exit, but to merely hide the application (with an icon in the tray). So I just call a handler that I wrote that does that. If the ID doesn’t equal anything we want to custom-process, we just pass it on up the hierarchy for default processing.

The IDs for the most common system commands are:

SC_CLOSE – Close the CWnd object.
SC_MAXIMIZE (or SC_ZOOM) – Maximize the CWnd object.
SC_MINIMIZE (or SC_ICON) – Minimize the CWnd object.
SC_MOVE – Move the CWnd object.
SC_RESTORE – Restore window to normal position and size.
SC_SIZE – Size the CWnd object.

There are others in special situations that you can learn about in the documentation for WM_SYSCOMMAND.

Modifying Existing Commands

We just saw that it’s possible to change the default handling of built-in system commands, but it’s also possible to modify existing menu items by changing their text, or to entirely remove them.

To modify the text of a command, use the ModifyMenu() function on pSysMenu in the above example. For example, to change “Close” to “Hide”, I could do this:
[code lang=”cpp”]
pSysMenu->ModifyMenu(SC_CLOSE, MF_BYCOMMAND,IDM_HIDE, “&Hide”);
[/code]

The MF_BYCOMMAND argument tells the function to interpret SC_CLOSE as a command ID. IDM_HIDE is a new command ID, and then comes the text we want to show.

Alternatively, we can call ModifyMenu() on the ith menu item:

[code lang=”cpp”]
pSysMenu->ModifyMenu(0,MF_BYPOSITION,IDM_HIDE,”&Hide”);
[/code]

This changes the first menu item to Hide.

Removing commands

Don’t want a close command at all on the window? Not sure if this is a good idea or not, but you can do it.

Use this method:

[code lang=”cpp”]
pSysMenu->RemoveMenu(SC_CLOSE,MF_BYCOMMMAND); pSysMenu->RemoveMenu(0,MF_BYPOSITION);
[/code]

The first one removes the command associated with SC_CLOSE, while the second removes the first item on the menu.

Conclusion

Modifying the system menu in this way has limited application but it is sometimes useful for small applications and utilities that have a limited menu structure otherwise. It can be useful for commands you want to be easily accessible from the task bar–when a program is is in the background or minimized, right-clicking on it’s icon on the task bar will bring up the system menu.

Something should be said, however, about the wisdom of certain modifications. Removing or modifying functionality that user expects is usually a bad idea. It makes your program less usable and friendly compared to other applications. UI guidelines and standard look and feel exist for a good reason!

Enjoy playing with this!

©2004 Ben Watson

Threads in MFC Part III: Exceptions, Suspense, Murder, and Safety

Exceptions

In the previous tutorial, I described the various synchronization objects you can use to control access to shared objects. In most cases, these will work fine, but consider the following situation:

[code lang=”cpp”]
UINT ThreadFunc(LPVOID lParam)
{
::criticalSection.Lock();
::globalData.DoSomething();
SomeFunctionThatThrowsException();
::criticalSection.Unlock();
return 0;
}[/code]

What’s going to happen when that exception gets thrown? The critical section will never be unlocked. If you start the thread again, it will again try to lock it, and finding it already locked, it will sit there forever waiting. Of course, a mutex will unlock when the thread exits, but a critical section won’t. So MFC has a couple of wrapper classes that can incorporate any of the other basic synchronization classes. These are called CSingleLock and CMultiLock.

Here is how they are used:
[code lang=”cpp”]
UINT ThreadFunc(LPVOID lParam)
{
CSingleLock lock(&(::criticalSection));
lock.Lock();
::globalData.DoSomething();
FunctionThatThrowsException();
lock.Unlock();
return 0;
}
[/code]

You merely pass the address of the “real” synchronization object. CSingleLock lock is created on ThreadFunc’s stack, so when an exception is thrown, and that function exits prematurely without a chance to nicely clean up, CSingleLock’s destructor is called, which unlocks the data. This would not happen to criticalSection because, being a global variable, it will not go out of scope and be destroyed when ThreadFunc exits.

CMultiLock

This class allows you to block, or wait, on up to 64 synchornization objects at once. You create an array of references to the objects, and pass this to the constructor. In addition, you can specify whether you want it to unblock when one object unlocks, or when all of them do.

[code lang=”cpp”]
//let’s pretend these are all global objects, or defined other than in the local function
\tCCriticalSection cs;
CMutex mu;
CEvent ev;
CSemaphore sem[3];

CSynObject* objects[6]={&cs, &mu, &ev,
&sem[0], &sem[1], &sem[2]};
CMultiLock mlock(objects,6);
int result=mlock.Lock(INFINITE, FALSE);

[/code]

Notice you can mix synchronization object types. The two parameters I specified (both optional) specify the time-out period and whether to wait for all the objects to unlock before continuing. I saved the return value of Lock() because that is the index into the array of objects of the one that unblocked, in case I want to do special processing.

Killing a Thread

Generally, murder is very messy. You have blood and guts everywhere that certainly don’t clean up after themselves. But sometimes, sadly, it is necessary (no one call the cops–my metaphor is about to end).

If you start a child thread, and for some reason it is just not exiting when you need it to, and you’ve fixed your code, double-checked all your signaling mechanisms, and then and only then you want to kill it, here’s how. When you create the thread, you need to get its handle and save it for later use in your
class.

[code lang=”cpp”]
HANDLE hThread;//handle to thread
[/code]

A handle is only valid while the thread is running. What if we create a thread, start it off running, and it exits immediately for some reason? Back in our main thread, even if the very next statement after creating the thread is to grab its handle, it could very possibly be too late.

So we create a thread suspended! We just don’t even let it get to first base before we allow ourselves to get to the handle. This is a piece of cake, simply change the last parameter we’ve been giving fxBeginThread() from 0 toCREATE_SUSPENDED:

[code lang=”cpp”]
CWinThread* pThread=AfxBeginThread(ThreadFunc,NULL, THREAD_PRIORITY_NORMAL, 0, CREATE_SUSPENDED);
::DuplicateHandle(GetCurrentProcess(), pThread->m_hThread, GetCurrentProcess(), &hThread, 0, FALSE, DUPLICATE_SAME_ACCESS);
pThread->;
ResumeThread();
[/code]

We start the thread suspended, use an API call to duplicate the thread’s handle, saving it to our class variable, and then resuming the thread.

Then, if we want to commit this heinous crime:

[code lang=”cpp”]::TerminateThread(hThread,0);
[/code]

Don’t say I didn’t warn you.

Thread-Safe Classes

Thread-safety refers to the possibility of calling member functions across thread-boundaries. Their are two types of safety: Class-level and Object-level. Class level means that I can create two CStringT objects called a and b, and access each of them in separate threads, but I cannot safely access just a in
two threads. Object safety means that it’s perfectly ok to access a in two or more threads simultaneously. Thread-safety at the object level generally means using synchronization objects to control access to all internal datamembers. So why not make all classes thread-safe at the object level? Because that would just about kill your performance. You can lock objects yourself outside of the actual object (as shown in Part II) to make it safe.

This is not to say that your program will always crash if you try to access a single object from two threads, but it most likely will. Also, you should not generally lock access to MFC member functions or public variables–you don’t know when the MFC framework is going to need access to them. There really isn’t need to lock on a CWnd* object anyway.

Etc

There are many, many details I have neglected to cover in these three tutorials. You can look in the SDK or .NET documentation for more information on such things as pausing/resuming, scheduling, masks in CMultiLock(), or any of the other member functions of the thread classes. If you want to learn about the internal details of Windows, threads and fibers, (plus a lot of other important subjects) check out Programming Applications for Microsoft Windows by Jeffrey Richter.

I have yet to cover so-called user-interface threads (internally, there is no difference–all threads are created equal). Perhaps in a future tutorial…

Threads are a very powerful tool, but they can quickly increase the complexity of your application by an order of magnitude. Use wisely. As always, it takes some experimentation to get the hang of how to go about it. So have fun!

©2004 Ben Watson