Expressions / Cell References#
In Excel, cell references are just written records of which objects are being used by a formula. You don’t care that a reference is labeled “A3” - that location can change at any time. What you’re really doing is referencing an object that currently lives at A3, and you should be able to cut/paste that object to any new location in the workbook while maintaining that reference.
Excelbird brings this behavior to Python.
[2]:
import pandas as pd
from excelbird import *
import excelbird as xb
PATH = "test.xlsx"
A Cell in excelbird represents a real object in the workbook, and therefore it can only exist once.
Let’s try to place the same object twice.
[ ]:
my_cell = Cell('hello world')
Book(
Sheet(
my_cell,
my_cell,
),
).write(PATH)

This was actually invalid code. Excelbird just corrected it for you. It’s impossible to have the same thing be in two places at one time.
What you meant to say is…
[ ]:
my_cell = Cell('hello world')
Book(
Sheet(
my_cell,
my_cell.ref(), # <---
),
).write(PATH)
We can reference anything from anywhere
[ ]:
my_cell = Cell('hello world')
Book(
Sheet(
my_cell.ref(),
),
Sheet(
my_cell,
),
).write(PATH)


Note
.ref() is only used when you need to make an exact copy of something by reference. It is not needed when writing expressions or formula functions
Python Expressions#
Please take a minute and observe these first two examples closely. They demonstrate one of the most important concepts in excelbird.
[ ]:
ten = Cell(10)
Book(
Sheet(
ten,
ten + 5,
),
).write(PATH)

[ ]:
ten = Cell(10)
hundred_and_ten = 10 + (ten * 10)
Book(
Sheet(
hundred_and_ten,
ten,
),
).write(PATH)

Excelbird references are, in a sense, ‘smarter’ than those in Excel.
If you “delete” the source data for a formula (i.e. the cells it’s referencing) by excluding them from the layout, your formula will track down the original value and use it instead.
For instance, let’s try the same code as above, but exclude Cell ten from the Book
[ ]:
ten = Cell(10)
hundred_and_ten = 10 + (ten * 10)
Book(
Sheet(
hundred_and_ten,
),
).write(PATH)

Excelbird’s expression resolver noticed a reference to ten was invalid because ten wasn’t placed in the Book, so it used its value instead.
Let’s try a trickier one
[ ]:
one = Cell(1)
two = Cell(2)
ten = Cell(10)
twenty = Cell(20)
three = one + two
thirty = ten + twenty
thirty_three = three + thirty
Book(
Sheet(
one,
thirty_three,
),
).write(PATH)

Expressions with Vectors#
All of the previous examples will work with series and frames as well.
Col + Col
[ ]:
a = Col(1, 2, 3)
b = Col(10, 20, 30)
Frame(
a,
b,
a + b,
),

Frame + Frame
[ ]:
theme = xb.colors.theme
a_b = Frame([1,2,3], [4,5,6], fill_color=theme.green1)
c_d = Frame([10,20,30], [40,50,60], fill_color=theme.red1)
Stack(
a_b,
c_d,
a_b + c_d
)

Mixed Lengths#
Intuitively, the returned vector will only be as long as the minimum of the two inputs
[ ]:
shorter = Col(1, 2)
longer = Col(1, 2, 3)
Frame(
shorter,
longer,
shorter + longer,
)

Mixed Types#
Any combination of Cell, Col/Row, Frame/VFrame, or any of those paired with an int, float, list, tuple, pandas Series/DataFrame can be used in an expression together.
But how do I know what will be returned?
It will take practice, but there are rules.
We can describe a returned element by 2 characteristics:
Dimensionality: Frame is 2-dimensional, Col is 1-dimensional, and Cell is 0-dimensional
Flow Direction: Col & Frame are column-based, and Row & VFrame are row-based
The return type of a mixed-type expression is determined by the following rules:
Dimensionality: The maximum dimensionality between a and b. (1d + 2d -> 2d, and 2d + 0d -> 2d)
Flow Direction: Favor the first dimensional element in the expression. If not enough information is available (i.e. Cell + list), default to Col or Frame.
Let’s break these down one by one
Dimensionality#
This is the easy one. Just pick the bigger of the two.
2D + 1D -> 2D
0D + 2D -> 2D
1D + 0D -> 1D
Cell (0D) + Frame (2D) -> Frame
[ ]:
frame = Frame([1,2,3], [4,5,6], fill_color=theme.purple1)
cell = Cell(1000, fill_color=theme.orange1)
Stack(
frame,
cell,
cell + frame,
fill_color=theme.green1,
)

Flow Direction#
Give priority to the first dimensional element (not a Cell) in the expression if possible. Otherwise default to Col and Frame, if a vector must be returned but none of the inputs are directional (i.e. Cell + list)
Col + Row -> Col
[ ]:
column = Col(1, 2, 3)
row = Row(10, 20, 30)
column + row

Row + Col -> Row
[ ]:
row + column

Col + VFrame -> Frame
[ ]:
column = Col(10, 20, 30, 40)
vframe = VFrame([1,2,3,4], [5,6,7,8])
column + vframe

VFrame + Col -> VFrame
[ ]:
vframe + column

list + Cell -> Col (default, since neither Cell nor list have direction)
[ ]:
[1,2,3,4] + Cell(100)

A more practical example#
Most people will use pandas DataFrames as input data at some point, and want to access elements of a Frame/VFrame by header. Here’s a quick example to show what that looks like.
Note
For more detail on subscripting/slicing elements from a Frame/VFrame, please refer to the “Your First Book” lesson.
[2]:
df = pd.DataFrame(
zip([1,2,3,4], ['one', 'two', 'three', 'four']),
columns=['Number', 'Word']
)
df
[2]:
| Number | Word | |
|---|---|---|
| 0 | 1 | one |
| 1 | 2 | two |
| 2 | 3 | three |
| 3 | 4 | four |
[3]:
fr = Frame(df)
fr
[3]:
| Number | Word |
|---|---|
| Cell(1) | Cell(one) |
| Cell(2) | Cell(two) |
| Cell(3) | Cell(three) |
| Cell(4) | Cell(four) |
Add a new column that joins strings from two columns
[5]:
fr['Big Number'] = fr['Number'] & " thousand"
fr
[5]:
| Number | Word | Big Number |
|---|---|---|
| Cell(1) | Cell(one) | Cell({...}) |
| Cell(2) | Cell(two) | Cell({...}) |
| Cell(3) | Cell(three) | Cell({...}) |
| Cell(4) | Cell(four) | Cell({...}) |
[ ]:
Book(fr).write(PATH)

Operators#
Up to this point, you’ve seen the +, *, and & in use, but every other operator you need should be available.
Note
Please read the Python Operators page for a full list of available operators
We’re going to highlight the handful of operators and builtin functions that aren’t as obvious or intuitive as the others.
Python’s Built-In sum()#
Excelbird objects can handle the Python builtin sum() function, with one important catch: You must place arguments inside a list.
So, instead of sum(x) you must write sum([x]), even if x is an iterable element, like Col or Frame. If summing multiple things, proceed normally: sum([a, b, c])
[ ]:
a = Col(1,2,3)
b = Col(4,5,6)
Sheet(
Frame(a, b),
Row(
sum([a]),
sum([b]),
),
)

OR(), and NOT()#
In excelbird:
a | bin Pyhon returnsOR(a, b)in Excel~ bin Python returnsNOT(a, b)in Excel
[ ]:
a = Cell(True)
b = Cell(False)
Row(
a,
b,
a | b,
~ a,
)

MOD() and %#
In excelbird:
a % bin Python returnsMOD(a, b)in Excel.
There’s also a hidden feature. In regular Excel, you can convert a cell to a percent by placing the % operator after a cell reference, like =A4%. In excelbird, this is achieved by using the same object on both sides of the operator: a % a. Not two equal things, but the same thing.
[ ]:
ten = Cell(10)
eight = Cell(8)
Row(
ten,
eight,
ten % eight,
ten % ten,
)

Range ‘:’#
In Excel, a range is selected using syntax like A3:A10.
Unfortunately, my_rng = a:b by itself is not possible in Python.
Instead, we’ve chosen the >> operator for the job.
[ ]:
a = Cell(1)
b = Cell(2)
Row(
a,
b,
sum([a >> b]),
)

Other ways to reference a range#
The >> operator is only needed to specify exactly what a range should be. When working with Frames and series, there are easier ways to reference ranges.
.range()#
Every excelbird vector has a .range() method, and they all do the same thing: return a Cell that references the range from the first Cell to the last Cell in the vector.
The main purpose of this on the user’s end is to override the default granularity for an argument in a formula.
Here’s an ordinary Col function, where MAX(...) (when inside Frame, and given Cols as arguments) returns a vector of MAX() functions that each reference individual Cells
[ ]:
from excelbird.fn import MAX
a = Col(1,20,300)
b = Col(600,50,4)
Frame(
a,
b,
MAX(a, b),
)

But if we replace one with one.range() in the MAX function…
[ ]:
...
MAX(one.range(), two)

.loc#
All excelbird vectors have a .loc property, which offers an alternative to the default subscripting behavior: When subscripted, it returns a range from the elements returned by the slice, instead of the elements themselves.
my_frame.loc[2:5] is equivalent to my_frame[2] >> my_frame[4]. As a reminder, both of these options will return a single Col, where each cell holds a cross range
[16]:
fr = Frame(pd.DataFrame(
zip([1,2,3], [4,5,6], [7,8,9], [10,11,12], [13,14,15]),
columns=['One', 'Two', 'Three', 'Four', 'Five']
))
fr
[16]:
| One | Two | Three | Four | Five |
|---|---|---|---|---|
| Cell(1) | Cell(4) | Cell(7) | Cell(10) | Cell(13) |
| Cell(2) | Cell(5) | Cell(8) | Cell(11) | Cell(14) |
| Cell(3) | Cell(6) | Cell(9) | Cell(12) | Cell(15) |
Like normal, we can refer to elements by header instead of index
[ ]:
fr['Two to Four'] = sum([ fr.loc['Two':'Four'] ])
Book(fr).write(PATH)
