Working with Excel Cells using VBA and Python

Working with Excel is all about manipulating the data and relationships between cells, and there are lots of ways to do such a thing. Here’s a tour of a few of those variations.

Assuming you’re running VBA from Excel, then it already knows of the workbooks you’re working on. For python, you’d have to set up that access, and some convenience variables.

import win32com.client
xl = win32com.client.Dispatch("excel.application")

Cells

The “Cells” call is the simplest way of identifying a cell.
In VBA:

ActiveSheet.Cells(r,c)

in Python:

xl.ActiveSheet.Cells(r,c)

where r and c are integers >= 1, r for the row, c for the column. (1,1) would be the same as “A1″ on the sheet. It will of course throw an exception if you were to go out of bounds on the sheet in any direction, so (-1,0), or even (0,0) won’t work.

Microsoft recommends using the Cells method instead of things such as the Range method, since you can iterate over r and c trivially, while that’s not true for Address values.

It should also be noted that in the object model, what Cell() returns is “Range Object”

Range
Another way to refer to a specific cell in the workbook is by it’s address, such as A1, or FH3552. In a sense these are more natural to a typical Excel user, since they’re more human readable (what letter is column 89?). Within Excel, this address format is known as the “A1″ notation, and includes all the address things you’d use in your formulas, such as “A:A” or “1:1″ or “A1:b2″, or chains/combinations thereof like “a:a, b2:g5″.

In VBA:

Range("A1:B2")

and Python:

xl.Range("A1:B2")

(For both instances, we can omit the ActiveWorkBook.ActiveSheet etc. and it will assume you’re working on the currently active ones. This is usually convenient to use, but make sure to put in the proper references if you’re working with multiple sheets/workbooks at the same time.)

Because VBA is more tuned to the specifics of Excel, it has some extra syntactic sugar. There is a shortcut notation for VBA where you can use brackets to denote ranges. This of course isn’t available in Python since brackets have their own meanings already.

ActiveSheet.[A1:B2].value

What’s nice about Range is that it can work with multiple blocks of cells at the same time, you can. Cells doesn’t allow for this, as it can only refer to a single cell, so you’d have to iterate.

In general, remember that Excel is really slow when iterating over a number of cells one by one. It is much faster to read/modify an entire block.

One final warning about Ranges. In VBA, assigning a range to a variable isn’t like assigning a simple integer to a variable.

Dim myrange as Range
Set myrange = ActiveSheet.Range("A1")

This is the legal syntax. That “Set” keyword is important, it will error on you otherwise at runtime if you simply have myrange = Activesheet.Range(“A1″) in there. This one tends to catch you every time you put VBA down for a few months and come back to it.

Offset
Offset is a great tool, it references a cell, relative to another cell. The equivalent of saying “3 over and 2 down from B2″.

Officially, offset is called using Offset(rowoffset, coloffset), where row/col offsets are integers. They can be positive or negative, or 0.

In VBA: Range(“b2″).Offset(0,0) refers to “B2″ still, while Range(“B2″).Offset(-1,-1) will refer to cell “A1″.

In Python: xl.Range(“b2″).Offset(0,0) refers to cell “A1″! while it is xl.Range(“b2″).Offset(1,1) that refers to “b2″.

I’m not sure why this mismatch occurs. It seems like an issue that the function call makes some kind of assumption that everything “counts from 1″ so it automatically has one taken away… or something. However, this discrepancy is there, and it will cause headaches if you’re not aware of it.

What’s the big use of Offset? In my work, I often have to walk around data tables. However, the tables don’t start at A1. To use Cells, I’d have to write a function that converts some arbitrary address into a row and column number, then add it to all my iteration values. Not too difficult, but I’m just writing a 20 line loop in a throwaway function, this is more work before I can get down to business.

In comes using an anchor cell and offset.

An example in Python:

anchor = xl.Range("a1")
for r in xrange(20): #iterate from 0 to 20
  anchor.offset(r+1,1).value = r #r+1 to account for the quirk mentioned above

Would iterate me down 20 rows from my anchor row. Admittedly, calling “Offset” is probably a hair slower than accessing Cells, and it’s much slower than grabbing the whole range with something like xlDown. But the ease in thinking in terms of an address and walking makes it great for rapid development.

This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>