"A complete guide to avoid memory leaks and unwanted caching in Office"
You may wonder why anyone should worry about memory management in C# (or any .NET application); the mighty Garbage Collector (GC) built into the Common Language Runtime (CLR) should spare us from messing around with memory management till the end of time (this is one of the reasons why we chose .NET and why we make fun of C/C++ developers, right?). Well, the Garbage Collector may serve us well as long as we stay inside our "managed world", but as soon as your code makes calls to COM (which is exactly what you have to do if you want to work with Outlook or any other Office application) the Garbage Collector won't always save you. I'm not saying the GC will not do its job (it eventually will), but the problem is that you cannot tell when it will happen, i.e. when the GC will free unused objects and their COM references. In this article I'd like to illustrate, using an example of Outlook automation, that this may sometimes be a problem.
Let's start with a quick look at how things are done when you call a COM component from your .NET code. The Component Object Model (COM) architecture was created in the good old days when oil cost way less than $100 and automatic memory management was just a privilege of a small group of weirdos (which eventually grew into a bigger group called Java programmers). Real C/C++ programmers ruled the world, which means that COM was built in the same manner: each programmer who works with COM must manage memory himself.
The idea behind it is simple. The IUnknown interface implemented by each COM object contains three methods: QueryInterface, AddRef and Release. Let's leave the QueryInterface method aside and focus on AddRef and Release. Each time a client takes a reference to a COM object, it calls the AddRef method, which increases an internal counter in the COM object. And each time a client stops using the COM object, it calls Release, which (surprise, surprise) decreases the internal counter. When the internal counter reaches 0, the COM object knows it is not used anymore and it can free itself from memory. The pseudo-code would look like this:
public class MyCOMObject : IUnknown
{
    int _counter = 0;

    public int AddRef()
    {
        return ++_counter; // return the new reference count
    }

    public int Release()
    {
        _counter--;
        if (_counter == 0)
            Free(); // last reference is gone, the object frees itself
        return _counter;
    }

    public object QueryInterface() { .... }
}
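To make the counting rules concrete, here is a small self-contained simulation in plain C#. No real COM is involved: FakeComObject and its IsFreed flag are names I made up for illustration, and a real COM object would implement the same logic in native code.

```csharp
using System;

// Hypothetical stand-in for a COM object; no real COM is involved.
public sealed class FakeComObject
{
    private int _counter;
    private bool _isFreed;

    public bool IsFreed
    {
        get { return _isFreed; }
    }

    public int AddRef()
    {
        return ++_counter;
    }

    public int Release()
    {
        _counter--;
        if (_counter == 0)
            _isFreed = true; // a native object would free its memory here
        return _counter;
    }
}

public static class RefCountDemo
{
    public static void Main()
    {
        FakeComObject obj = new FakeComObject();
        obj.AddRef();                   // first client takes a reference
        obj.AddRef();                   // second client takes a reference
        obj.Release();                  // first client is done, object stays alive
        Console.WriteLine(obj.IsFreed); // False
        obj.Release();                  // last client is done
        Console.WriteLine(obj.IsFreed); // True
    }
}
```

The object is only freed by the Release call that brings the counter back to zero, which is why a single forgotten Release keeps it alive.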
In .NET we (usually) don't need to worry about this process. When you work with a COM object in your .NET code you are actually working with a proxy class called a Runtime Callable Wrapper (RCW).
The RCW has two main responsibilities. Firstly, it marshals data between managed and unmanaged code (for example it converts System.String to BSTR and vice versa when needed), and secondly, it takes care of COM memory management by calling the Release method once the proxy itself is collected by the Garbage Collector.
So far it looks like we can always rely on the Garbage Collector to do the dirty job. Not really. The problem with the Garbage Collector is that its behaviour is non-deterministic (like most Microsoft products - just joking), i.e. you cannot tell when it will dispose of unused objects; sometimes you may even have to wait until the application exits. And there are situations when you need a COM object released at a specific point in your code, which is something the Garbage Collector alone cannot do for you.
Let me give you an example from the Microsoft Office world (but this situation may easily happen with any COM-enabled application built in C/C++): Microsoft Office Outlook lets you access its data through a COM interface called the Outlook Object Model. Using this interface you can perform many tasks with Outlook; one of the things you can do is create a new appointment in your Outlook calendar:
Application outlookApplication = ...; //Outlook Application instance
AppointmentItem appointment = (AppointmentItem)outlookApplication.CreateItem(OlItemType.olAppointmentItem);
appointment.Subject = "Meeting with my lawyer";
appointment.Save();
The code above creates a new appointment in memory and saves it to the calendar. At least that is what it should do. But if you run the code you may sometimes notice that the appointment is displayed after some delay, or is not displayed at all unless you close Outlook and open it again. The problem is that Outlook will not display or update the item in the user interface as long as it is referenced (i.e. until the IUnknown.Release method is called), because from Outlook's perspective you are still working with the item. The Release method is called automatically by the RCW on its disposal, but this disposal can happen at any time (the RCW does not implement the IDisposable interface, so you cannot dispose it explicitly). Not even the code below will force the RCW to dispose itself:
Application outlookApplication = ...;
AppointmentItem appointment = (AppointmentItem)outlookApplication.CreateItem(OlItemType.olAppointmentItem);
appointment.Subject = "Meeting with my lawyer";
appointment.Save();
appointment = null; //Clearing out the reference will not force GC to dispose the RCW
Garbage collection will usually occur within a very short time, but the runtime decides for itself when to collect; for example, if the PC is busy the Garbage Collector may delay the collection (so as not to consume PC resources), which means the Release method will not be called immediately and the Outlook calendar will not update. So generally speaking, one cannot make any assumptions about when an RCW object will be disposed.
It becomes apparent that we need a way to manually invoke disposal of the COM objects which are used in .NET. The solution to this problem lies in the Marshal class from the System.Runtime.InteropServices namespace. This class provides many methods which are useful when dealing with unmanaged code from the .NET environment.
For our purpose we need the FinalReleaseComObject method (what an original name...). This method manually calls the Release method of the underlying COM object so that the host application (Microsoft Outlook in our case) can respond accordingly and free the memory:
Application outlookApplication = ...;
AppointmentItem appointment = (AppointmentItem)outlookApplication.CreateItem(OlItemType.olAppointmentItem);
appointment.Subject = "Meeting with my lawyer";
appointment.Save();
Marshal.FinalReleaseComObject(appointment); //The RCW is released
Once you call the FinalReleaseComObject method, all references are freed (the Release method is called until it returns zero) and the RCW proxy class is disconnected from the underlying COM object. Even though the proxy is still a valid reference (it is not null), any further operation on it will lead to an InvalidComObjectException:
...
appointment.Save();
Marshal.FinalReleaseComObject(appointment); //The RCW is released
Debug.WriteLine(appointment.Subject); //InvalidComObjectException is raised
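One way to avoid touching a disconnected proxy by accident is to release the RCW and clear the variable in a single step. The helper below is only a sketch: ComHelper and ReleaseAndNull are hypothetical names of mine, not part of .NET or the Office interop assemblies. It uses Marshal.IsComObject to skip ordinary managed objects, so the demo runs without Outlook.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComHelper
{
    // Hypothetical helper: releases the RCW (if the object is one) and clears
    // the caller's variable so later code cannot use a disconnected proxy.
    public static void ReleaseAndNull<T>(ref T comObject) where T : class
    {
        if (comObject == null)
            return;
        if (Marshal.IsComObject(comObject))
            Marshal.FinalReleaseComObject(comObject);
        comObject = null;
    }
}

public static class ReleaseDemo
{
    public static void Main()
    {
        // A plain managed object stands in for the appointment here;
        // it is not an RCW, so only the "set to null" part applies.
        object notCom = "plain managed object";
        ComHelper.ReleaseAndNull(ref notCom);
        Console.WriteLine(notCom == null); // True
    }
}
```

With a real appointment you would typically call ComHelper.ReleaseAndNull(ref appointment) in a finally block, so the item is released even if Save throws. A later mistake then shows up as a NullReferenceException, or can be avoided with a simple null check.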
Note: If you are using .NET 1.1, the FinalReleaseComObject method is not present (it was added in .NET 2.0). Instead you need to call the ReleaseComObject method (which removes just a single reference) in a loop:
while (Marshal.ReleaseComObject(appointment) > 0) ;
As you can see, the Garbage Collector is not very reliable when dealing with COM and unmanaged code in general. You may choose to ignore this fact, but I can assure you that you will eventually get very strange errors related to constantly rising memory consumption or things not updating properly. These errors are very hard to debug because they are very time and platform dependent, and may not even occur in your development or testing environment. So I recommend making sure you free your COM objects as soon as you stop using them.
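If you want "free it as soon as you stop using it" to be hard to forget, one option is a small wrapper that works with the using statement. This is a sketch under my own assumptions: ComScope is a hypothetical class, not something shipped with .NET or Office.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper: ties the lifetime of an RCW to a using block.
public sealed class ComScope<T> : IDisposable where T : class
{
    private T _value;

    public ComScope(T value)
    {
        _value = value;
    }

    public T Value
    {
        get
        {
            if (_value == null)
                throw new ObjectDisposedException("ComScope");
            return _value;
        }
    }

    public void Dispose()
    {
        if (_value == null)
            return;
        // Only real RCWs are passed to FinalReleaseComObject.
        if (Marshal.IsComObject(_value))
            Marshal.FinalReleaseComObject(_value);
        _value = null;
    }
}

// Usage with Outlook (not compiled here, it needs the Outlook interop assembly):
//
// using (ComScope<AppointmentItem> item = new ComScope<AppointmentItem>(
//     (AppointmentItem)outlookApplication.CreateItem(OlItemType.olAppointmentItem)))
// {
//     item.Value.Subject = "Meeting with my lawyer";
//     item.Value.Save();
// } // Dispose releases the RCW here, even if Save throws
```

The trade-off is the extra .Value on every access; in exchange the release happens at a point you can see in the code.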
Here are some interesting articles for further reading:
cbrumme's Weblog: ReleaseComObject - http://blogs.msdn.com/cbrumme/archive/2003/04/16/51355.aspx
Understanding classic COM interoperability with .NET applications - http://www.codeproject.com/KB/COM/cominterop.aspx
LukeN 080909