Python - OR in Regexp: finding an element in a list?

2024-03-16 16:30:04
I have strings like 'I got 11 bananas yesterday'. I need to extract the number of fruits together with the fruit name. For example, '11 bananas' from the string above.

My code is:

import re

s1 = 'I got 11 oranges yesterday'
s2 = 'I got 11 bananas yesterday'
# Raw string so that \d reaches the regex engine unchanged
regex = re.compile(r'\d+ oranges|bananas|pencils')
results1 = regex.findall(s1)  # returns ['11 oranges'], it's ok
results2 = regex.findall(s2)  # returns ['bananas'], it's wrong

It finds what I need if the fruit's name is first in the list of alternatives, but fails for any fruit further down the list.

Where's the mistake? My brain is broken


The issue is the precedence of the | operator. Alternation binds more loosely than anything else in the pattern, so what you wrote means

(\d+ oranges) OR (bananas) OR (pencils)

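You need parentheses around the fruits so that \d+ applies to every alternative. A non-capturing group (?:...) does this while still letting re.findall return the whole match: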

\d+ (?:oranges|bananas|pencils)
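
To illustrate the fix, here is a minimal sketch that builds the grouped pattern from a list of names. The names FRUITS, find_items and find_pairs are made up for this example and are not from the question; re.escape is used on the assumption that a name might contain regex metacharacters.

```python
import re

# Hypothetical list of item names, taken from the question's pattern
FRUITS = ["oranges", "bananas", "pencils"]

# Join the escaped names with | and wrap them in a group,
# so that \d+ applies to every alternative
ALTERNATIVES = "|".join(re.escape(name) for name in FRUITS)

# Non-capturing group: findall returns the whole match as a string
WHOLE = re.compile(r"\d+ (?:" + ALTERNATIVES + r")")

# Two capturing groups: findall returns (number, name) tuples
PAIRS = re.compile(r"(\d+) (" + ALTERNATIVES + r")")


def find_items(text):
    """Return matches such as '11 bananas'."""
    return WHOLE.findall(text)


def find_pairs(text):
    """Return matches such as ('11', 'bananas')."""
    return PAIRS.findall(text)


print(find_items("I got 11 oranges yesterday"))  # ['11 oranges']
print(find_items("I got 11 bananas yesterday"))  # ['11 bananas']
print(find_pairs("I got 11 bananas yesterday"))  # [('11', 'bananas')]
```

If you want the count and the name separately, the capturing version saves a later split; otherwise the non-capturing one is the direct fix for the original pattern.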


