Microsoft Office (Excel) Interop Crash Course

There comes one day in your life that you will be requested to execute some Excel logic on the server side, so that the user (for instance) doesn’t have to manually refresh some pivot tables. Unfortunately, Excel Interop (EI) is a living hell full of gotchas and bugs, thus let me give you a few hints.

Use only if you have to

When dealing with Excel documents (server side) you may have two group of tasks: data based and logic based. Data modifications are simple changes to the contents of the sheets. Whereas logic transformations require faking a functionality of Excel, like entering 1 + 1 should give us a final value 2 when entered into the cell.

Modifications of data are usually handled easily by free tools (such as EPPlus). Unfortunately, faking Excel logic is not that simple, thus EI becomes you a very powerful tool. It allows you to basically perform the same steps that the user could do manually (in an Excel window).

Accept the fact that it might not work

Microsoft has stated that they do not support interop services, so if it doesn’t work for you then … you have a problem. If it doesn’t for you (and you are sure that you made everything you could to make it work) then you can always resort to VBA macros. While this solution is often frowned upon, it’s much more reliable and hundred times easier to implement.

ALWAYS properly clean up your objects

You will be dealing with COM objects which is a framework that I am not very found of, alas it’s even worse when called from C#. If you do not properly clean all the COM objects you referenced, you will end up with a zombie EXCEL.exe process left even after your process has been finished. There might be even other bad things that happened, so it even more serious.

Repeat after me:

  1. I shall not keep multiple references to the same COM object as after cleanup all of them will be invalidated.
  2. I shall Close(), Quit() or use any other cleanup method that an object provides.
  3. I shall always use FinalReleaseComObject() on every COM object I’ve stopped using.
  4. I shall null my reference.
  5. I shall call Garbage Collector multiple times due to “funny” behavior of COM objects:
    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();
    
  6. I am aware of exceptions and thus the cleanup will always be done in finally block or by using & IDisposable wrapper class.

This rule is very tedious to follow, and most of the guidelines that I’ll present stem from this one.

Let’s make Excel headless

As I explained using EI is like sending messages to Excel process. It won’t spawn a window, but you will be attacked by a horde of pop ups asking you whether “You really wish to overwrite file foo.xlsx>” etc. To deal with them, I set the following properties for my Application instance.

var app = new Interop.Excel.Application();
app.Interactive = false;
app.DisplayAlerts = false;

There might be some other flags you might have to clear/set, thus you have to experiment yourself.

Double dot = your biggest enemy

Take a look at the following code:

var refToC = refToA.b.c;
// Use refToC
// Close, release, null and gc refToA and refToC

Have we properly cleaned up our objects? No, reference to B is dangling! WTF, you ask? Unfortunately, the refToB (which is not explicitly defined) is created, and we haven’t released it. Because of that lovely behavior every “going deeper” via a reference requires a separate object.

var a = comObj.a;
var b = a.b;
var c = b.c;
c.Foo(); c.Bar();
Release(c);
Release(b);
Release(a);

Forget about Enumerator and foreach

foreach is just a wrapper over Enumerator thus I will just deal with the latter. So what is the problem? Internally, the Enumerator object holds a private reference to a COM object. Sadly, the Enumerator doesn’t provide any cleanup mechanisms (Dispose() and similar), thus you can never be sure when the reference to the COM will be dropped.

Thus, prepare to manually iterate every container using Count and for.

Indices are 1-based

As the last guideline an easy one: every time you call get_Item(), Item[i] and Item(i) (and i is an integer) remember that you index from 1, not 0. If you forget about it, a nice exception will be raised.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: