In reality I have a dataset with 5281 rows and about 40 columns. From this I need to select the rows that have duplicate values in one column, but only where those duplicates carry a certain set of values in another column.
To simplify, I broke it down to a df with 2 columns, A and B.
import pandas as pd
d = {'A': [2, 1, 2, 2, 1, 1, 3, 1], 'B': ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd']}
df = pd.DataFrame(d)
In the image you see the df, and I marked what I want: the rows where A takes the values of the set (1, 2) and which share the same value in B. In this example those are the two rows with B = 'a'.
A little bit of context: I need to drop rows which have duplicates in one column (here col B), but only if the duplicates have a certain set of values in another column (here the set 1, 2 in A). I would like to do all of this directly on the df.
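To make the goal concrete, here is a minimal sketch of one possible way to express it. It assumes "has the set (1, 2)" means the A values within a B group are exactly {1, 2}, not merely a superset of it. The names `target`, `values_per_group`, `matching_keys`, `mask`, `selected` and `remaining` are only illustrative.

```python
import pandas as pd

d = {'A': [2, 1, 2, 2, 1, 1, 3, 1], 'B': ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd']}
df = pd.DataFrame(d)

# Assumed target: the set of A values a group of B duplicates must have.
target = {1, 2}

# Collect the distinct A values found for each value of B.
values_per_group = {key: set(group) for key, group in df.groupby('B')['A']}

# Keep the B values whose A values are exactly the target set (assumption: exact match).
matching_keys = [key for key, values in values_per_group.items() if values == target]

# Boolean mask over the original rows.
mask = df['B'].isin(matching_keys)

selected = df[mask]    # the rows I marked
remaining = df[~mask]  # the df with those rows dropped
```

If a superset should also count (for example a group with A values 1, 2 and 3), the comparison `values == target` could be swapped for `target <= values`.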