Python - Replacing a value in a row of a column with a function of a list formed by corresponding elements of the dataframe?

2024-03-13 21:30:05
I have a user-defined function f that takes a list. Say it returns the sum of the list's elements, but it could be any other function.

Then I have a dataframe with two columns: 'pred', which contains a list of numbers, and 'value', which contains a single number. The value -1 is a placeholder that needs to be updated.

import pandas as pd

def f(my_list):
    return sum(my_list)

data = {'pred':[[],[1],[1],[1],[2],[2,3],[2,4],[3],[3,4],[4],[6,7,9,10]]}

df = pd.DataFrame(data)
df.index = df.index + 1

df.loc[5,'value'] = 1
df.loc[8,'value'] = 0
df.loc[10,'value'] = 2
df.loc[11,'value'] = 100
df.value = df.value.fillna(-1).astype(int) #placeholder, the values cannot be negative

print(df)
             pred  value
1              []     -1
2             [1]     -1
3             [1]     -1
4             [1]     -1
5             [2]      1
6          [2, 3]     -1
7          [2, 4]     -1
8             [3]      0
9          [3, 4]     -1
10            [4]      2
11  [6, 7, 9, 10]    100

Now I have to go through the rows of df in reverse order and, for each row i whose value is -1, replace it with f applied to the list of values of the rows that have i in their pred list. It is guaranteed that i does not appear in the pred lists of rows 1 to i, so every value that row i depends on is already final by the time row i is reached.

In this example we should have:

value in row 9: f([100]) = 100;
value in row 7: f([100]) = 100;
value in row 6: f([100]) = 100;
value in row 4: f([2, 100, 100]) = 202;
value in row 3: f([100, 0, 100]) = 200;
value in row 2: f([1, 100, 100]) = 201;
value in row 1: f([201, 200, 202]) = 603.

So I need help writing a loop that does this.

for i in range(len(df),0,-1):
  if df.loc[i,'value'] == -1:
    df.loc[i,'value'] = ???

Any advice is appreciated.

Solution:

IIUC, you can create a dictionary that maps each index to the indices of the rows that reference it in their pred list. Then loop over the -1 indices in reverse and select the relevant rows to pass to f:

s = df['pred'].explode()
dic = s.index.groupby(s)
# {1: [2, 3, 4], 2: [5, 6, 7], 3: [6, 8, 9], ...}

for i in df.index[df['value'].eq(-1)][::-1]:
    df.loc[i, 'value'] = f(df.loc[dic.get(i, []), 'value'])

Updated DataFrame:

             pred  value
1              []    603
2             [1]    201
3             [1]    200
4             [1]    202
5             [2]      1
6          [2, 3]    100
7          [2, 4]    100
8             [3]      0
9          [3, 4]    100
10            [4]      2
11  [6, 7, 9, 10]    100
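
For comparison, the same reverse lookup can be sketched without pandas, which may make the logic easier to follow. This is only an illustration, assuming the data is held in plain dicts keyed by row label; the names preds, values, build_dependents and fill_placeholders are made up for this sketch and are not part of the answer above.

```python
from collections import defaultdict

def f(my_list):
    return sum(my_list)

def build_dependents(preds):
    """Map each row label to the labels of the rows that list it in 'pred'."""
    dependents = defaultdict(list)
    for row, pred_list in preds.items():
        for p in pred_list:
            dependents[p].append(row)
    return dict(dependents)

def fill_placeholders(preds, values, func, placeholder=-1):
    """Walk the rows in reverse and replace placeholders with func(dependent values)."""
    dependents = build_dependents(preds)
    result = dict(values)  # work on a copy, leave the input untouched
    for row in sorted(result, reverse=True):
        if result[row] == placeholder:
            # Dependents have larger labels, so their values are already final.
            result[row] = func([result[d] for d in dependents.get(row, [])])
    return result

# Same data as in the question (hypothetical dict layout).
preds = {1: [], 2: [1], 3: [1], 4: [1], 5: [2], 6: [2, 3], 7: [2, 4],
         8: [3], 9: [3, 4], 10: [4], 11: [6, 7, 9, 10]}
values = {1: -1, 2: -1, 3: -1, 4: -1, 5: 1, 6: -1, 7: -1,
          8: 0, 9: -1, 10: 2, 11: 100}

filled = fill_placeholders(preds, values, f)
print(filled)
```

Here f receives a real Python list, which matches the wording of the question.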

Step by step, the loop does the following:

df.loc[9, 'value'] = f(df.loc[[11], 'value'])       # f([100])
df.loc[7, 'value'] = f(df.loc[[11], 'value'])       # f([100])
df.loc[6, 'value'] = f(df.loc[[11], 'value'])       # f([100])
df.loc[4, 'value'] = f(df.loc[[7, 9, 10], 'value']) # f([100, 100, 2])
df.loc[3, 'value'] = f(df.loc[[6, 8, 9], 'value'])  # f([100, 0, 100])
df.loc[2, 'value'] = f(df.loc[[5, 6, 7], 'value'])  # f([1, 100, 100])
df.loc[1, 'value'] = f(df.loc[[2, 3, 4], 'value'])  # f([201, 200, 202])
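
One detail worth noting: in the loop above, f receives a pandas Series rather than a plain list. sum works on either, but a user-defined function that relies on list behaviour might not. Below is a hedged, self-contained sketch that converts the selection with .tolist() before calling f. The wrapper name fill_from_dependents and its placeholder parameter are assumptions made for illustration.

```python
import pandas as pd

def f(my_list):
    return sum(my_list)

def fill_from_dependents(df, func, placeholder=-1):
    """Return a copy of df with placeholder values replaced, processing rows in reverse."""
    out = df.copy()
    # One row per (index, pred element); rows with an empty pred list become NaN and are dropped.
    s = out['pred'].explode().dropna()
    # Maps each referenced index to the indices of the rows that reference it.
    dependents = s.index.groupby(s)
    for i in out.index[out['value'].eq(placeholder)][::-1]:
        rows = dependents.get(i, [])
        # Pass a plain list to func, as the question describes.
        out.loc[i, 'value'] = func(out.loc[rows, 'value'].tolist())
    return out

df = pd.DataFrame(
    {
        'pred': [[], [1], [1], [1], [2], [2, 3], [2, 4], [3], [3, 4], [4], [6, 7, 9, 10]],
        'value': [-1, -1, -1, -1, 1, -1, -1, 0, -1, 2, 100],
    },
    index=range(1, 12),
)

result = fill_from_dependents(df, f)
print(result)
```

A row that nothing references gets func([]), which is 0 for sum; whether that is the desired fallback depends on the function being used.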