CONTEXT
I have a DataFrame I'm trying to filter using multiple different conditions / "rulesets". There are 3 different rules with a set of conditions as follows:
Rule_DF Variable Operator Value
Rule1 Gender == M
Rule1 DOB > 01/01/2000
Rule1 Height > 72 <-- "inches"
Rule2 Gender == F
Rule2 DOB >= 06/10/1998
Rule2 Height < 70
etc...
EXAMPLE DATA
Name Gender DOB Height(in)
Bob M 01/01/2001 74
Jane F 01/01/1999 64
etc...
EXPECTED RESULT
rule1_df should contain a list of all the names of people that fit rule 1's conditions. rule2_df should contain a list of all the names of people that fit rule 2's conditions.
CURRENT CODE
I am at the point where the rules are all grouped by the rule name and the query is written as such:
for item in grouped_rule_df:
query = ' & '.join(f'{i} {j} {repr(k)}' for i, j, k in zip(column, equal, condition))
query = "Gender == 'M' & Height > 72 & DOB > '01/01/2000'"
test = df.query(query)
PROBLEM
The query still returns a list of all the heights. If I run the query with just the height condition, it returns the correct list of people. Can you not combine data types in pandas .query? Or is PandaSQL/another package a better solution here?