Polars noob here. Given an m x n Polars dataframe df and a 1 x n Polars dataframe of scalars, I want to divide each column in df by the corresponding scalar in the other frame.
import numpy as np
import polars as pl
cols = list('abc')
df = pl.DataFrame(np.linspace(1, 9, 9).reshape(3, 3),
                  schema=cols)
scalars = pl.DataFrame(np.linspace(1, 3, 3)[:, None],
                       schema=cols)
In [13]: df
Out[13]:
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ f64 ┆ f64 ┆ f64 │
╞═════╪═════╪═════╡
│ 1.0 ┆ 2.0 ┆ 3.0 │
│ 4.0 ┆ 5.0 ┆ 6.0 │
│ 7.0 ┆ 8.0 ┆ 9.0 │
└─────┴─────┴─────┘
In [14]: scalars
Out[14]:
shape: (1, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ f64 ┆ f64 ┆ f64 │
╞═════╪═════╪═════╡
│ 1.0 ┆ 2.0 ┆ 3.0 │
└─────┴─────┴─────┘
I can do this easily in Pandas, as shown below, by delegating to NumPy broadcasting, but I was wondering what the best way is to do it without converting back and forth between Polars and Pandas representations.
In [16]: df.to_pandas() / scalars.to_numpy()
Out[16]:
     a    b    c
0  1.0  1.0  1.0
1  4.0  2.5  2.0
2  7.0  4.0  3.0
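To make the broadcasting explicit, here is a minimal NumPy-only sketch of what that division relies on. The names `values`, `divisors`, and `quotient` are illustrative and not part of the session above; the arrays simply mirror the contents of df and scalars.

```python
import numpy as np

# Same numbers as df (3 x 3) and scalars (1 x 3), as plain arrays.
values = np.linspace(1, 9, 9).reshape(3, 3)   # shape (3, 3)
divisors = np.linspace(1, 3, 3)[None, :]      # shape (1, 3)

# The length-1 leading axis of `divisors` is stretched across all rows,
# so each column of `values` is divided by its own scalar.
quotient = values / divisors
print(quotient)
```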
I found this similar question, where the scalar constant is already a row in the original frame, but I don't see how to leverage a row from another frame.
The best I can come up with so far is combining the frames and doing some... nasty-looking things :D
In [31]: (pl.concat([df, scalars])
    ...:  .with_columns(pl.all() / pl.all().tail(1))
    ...:  .head(-1))
Out[31]:
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ f64 ┆ f64 ┆ f64 │
╞═════╪═════╪═════╡
│ 1.0 ┆ 1.0 ┆ 1.0 │
│ 4.0 ┆ 2.5 ┆ 2.0 │
│ 7.0 ┆ 4.0 ┆ 3.0 │
└─────┴─────┴─────┘