R1C1, the unused Excel cell reference system

Everyone knows about the “A1 style” notation for referring to cells in Excel. It’s what the labels on the edges of the worksheet are after all, and it’s the most natural thing to work with if you’re using the standard interface.

However, there’s a second style of notation, the R1C1 style. It’s not commonly known because it’s not even turned on normally, and it’s much more difficult to use intuitively.To turn it on, you’d have to go to Tools -  Options -  General, and then check R1C1 reference style.

When you activate it, the thing to notice is that the top row of the spreadsheet, the row that normally has the A, B, C labels for columns, has changed to numbers. What R1C1 involves is using numbers to denote rows and columns, instead of a combination of letters and numbers.

Normally, you wouldn’t bother thinking about R1C1. In fact, it’s not recommended to use it in your worksheets since you’re going to confuse the heck out of whoever you send the file to. Don’t do it, it’s not a good idea.

Absolute references

Absolute references are the equivalent of =\$A\$1, meaning no matter where your cell is, how you copy it around or move it, it always points to that cell.

In R1C1 format, it’s =R1C1 hence the name. The row number follows the R, the column number follows the C, simple!

Relative references

Relative references are when you’re referring to a cell some distance away from the cell you’re working on, 3 up and two left, and so on, just like in =A1 As you copy the formula around, the references shift along with the formula.

In R1C1, you need to use square brackets [] to denote relative-ness. =R[2]C[-5] would mean “2 cells down, 5 cells to the left”.

Why bother?

The area that R1C1 shines is in code. You can insert formulas that use R1C1 notation without switching modes in code, so you can use whatever address system happens to be convenient.

VBA has pretty clunky string functions, so you don’t want to be calling a cell’s .Address method, (without the extra options) and get “\$C\$2″ back as a string, and try to extract the C out because you want the row for a formula. Stop right there!

Instead you can just use R1C1 formulas, pull out the row number of the cell you’re interested in, and put =R[rownum] into the Range.FormulaR1C1 property. Row/column numbers iterate nicely over the integers. Letters don’t, it’s as simple as that.

Why bother with relative references for formulas when you’re writing code to begin with? Can’t we just generate a specific formula with absolute addresses for each cell since we have to do it once anyway?

That’s certainly true, and it’s certainly an option. However, one thing to remember with automating Excel is that interacting with individual cells, and iterating over them, is always a very slow operation. Excel likes working with large blocks at a time, it’s more or less optimized for that. Reading 200 cells in a row one at a time is much slower than yanking the 200 cells in a single range and reading the values in a single go.

So if we want to keep our code from crawling, what we’d like to do is use as many built in functions as possible, this includes copy/paste. For example in Python…

1
2
3
4
5
6
7
#as always, xl is a pywin32 com extension object that's connected to Excel
xl.Range("A1").FormulaR1C1 = "=c[1]+c[2]"
xl.Range("A1").Copy
xl.Range("A2:A9000").Paste
#is many times faster than
for cell in xl.Range("A2:A9000"):
  cell.Formula = "=%s + %s" % (cell.Offset(1,2).address, cell.Offset(1,3).address)
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>