Pandas Filter by Multiple Conditions

Filtering rows is one of the most common DataFrame operations in Pandas. In this post, we’ll look at how to filter a Pandas DataFrame using several conditions at once. Pandas offers more than one way to do this, and the following examples demonstrate three of them: eval(), loc[], and query().

Method 1: Using eval()

DataFrame.eval() evaluates an expression given as a string. When the expression is a condition, eval() returns a True/False value for each row, and placing that result inside [] keeps only the rows that match the condition.

Syntax

DataFrame_object[DataFrame_object.eval("Conditions")]
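As a minimal sketch of why this syntax filters rows: for a comparison expression, eval() returns a boolean Series with one value per row, and indexing the DataFrame with that Series keeps the rows marked True. The small DataFrame `df` below is made up for illustration and is not part of the original examples.

```python
import pandas

# Hypothetical data, used only to show the mask that eval() returns
df = pandas.DataFrame({'id': [23, 21, 20], 'points1': [34, 32, 78]})

# A condition passed to eval() yields one True/False value per row
mask = df.eval("id > 20 and points1 < 35")
print(mask.tolist())   # [True, True, False]

# Indexing with the mask keeps only the rows marked True
filtered = df[mask]
print(filtered)
```

Storing the mask in a variable first can make longer filters easier to inspect before applying them.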

Example 1

Let’s create a DataFrame with 6 columns and 4 rows and return the rows where the id is greater than 20 and the name ends with ‘n’.

import pandas

remarks = pandas.DataFrame([[23,'sravan','pass',1000,34,56],
[21,'siva','fail',400,32,45],
[20,'sahaja','pass',100,78,90],
[22,'suryam','fail',450,76,56]
],columns=['id','name','status','fee','points1','points2'])

# Display the DataFrame - remarks
print(remarks)

print()

# Return the rows where the id is greater than 20 and the name ends with 'n'.
print(remarks[remarks.eval("id > 20 & name.str.endswith('n').values")])

Output

   id    name status   fee  points1  points2
0  23  sravan   pass  1000       34       56
1  21    siva   fail   400       32       45
2  20  sahaja   pass   100       78       90
3  22  suryam   fail   450       76       56

   id    name status   fee  points1  points2
0  23  sravan   pass  1000       34       56

Only one row has a name ending with ‘n’ and an id greater than 20. Here, we combined two conditions with the ‘&’ operator.
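The eval() expression above joins its conditions with &. As a rough sketch, assuming the default expression parser, & and the keyword and can be used interchangeably inside the string, and | corresponds to or. The DataFrame `df` below is a made-up stand-in that holds only two numeric columns of ‘remarks’.

```python
import pandas

# Hypothetical stand-in for the 'remarks' DataFrame (two numeric columns only)
df = pandas.DataFrame({'id': [23, 21, 20, 22], 'points1': [34, 32, 78, 76]})

# Inside an eval() string, & and 'and' produce the same row mask
mask_symbol = df.eval("id > 20 & points1 < 35")
mask_word = df.eval("id > 20 and points1 < 35")
print(mask_symbol.equals(mask_word))

# 'or' (or |) keeps rows that satisfy at least one of the conditions
mask_either = df.eval("id > 22 or points1 > 77")
print(df[mask_either])
```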

Example 2

Return the rows where the id is greater than 20, ‘points1’ is less than 35, and the name starts with ‘s’.

import pandas

remarks = pandas.DataFrame([[23,'sravan','pass',1000,34,56],
[21,'siva','fail',400,32,45],
[20,'sahaja','pass',100,78,90],
[22,'suryam','fail',450,76,56]
],columns=['id','name','status','fee','points1','points2'])

# Return the rows where the id is greater than 20, the name starts with 's', and points1 is less than 35.
print(remarks[remarks.eval("id > 20 & name.str.startswith('s').values & points1 < 35")])

Output

   id    name status   fee  points1  points2
0  23  sravan   pass  1000       34       56
1  21    siva   fail   400       32       45

Two rows match all three conditions.

Method 2: Using loc[]

Syntax

DataFrame_object.loc[]

Parameter

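Index label: A single row label or a list of row labels. loc[] also accepts a boolean array or Series with one True/False value per row, which is how the examples below filter by conditions.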
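To illustrate the parameter described above, here is a small sketch in which loc[] receives a boolean Series as the row selector and, optionally, a list of column names as a second argument. The variable names `mask` and `subset` are chosen here for illustration only.

```python
import pandas

remarks = pandas.DataFrame([[23, 'sravan', 'pass', 1000, 34, 56],
                            [21, 'siva', 'fail', 400, 32, 45],
                            [20, 'sahaja', 'pass', 100, 78, 90],
                            [22, 'suryam', 'fail', 450, 76, 56]],
                           columns=['id', 'name', 'status', 'fee', 'points1', 'points2'])

# A comparison on a column produces a boolean Series, one value per row
mask = remarks['fee'] > 300

# loc[] keeps the rows where mask is True; the second argument picks columns
subset = remarks.loc[mask, ['name', 'fee']]
print(subset)
```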

Example 1

Create a DataFrame named ‘remarks’ with 6 columns. Let’s return the rows where fee is greater than 300 and points2 is less than 76.

import pandas

remarks = pandas.DataFrame([[23,'sravan','pass',1000,34,56],
[21,'siva','fail',400,32,45],
[20,'sahaja','pass',100,78,90],
[22,'suryam','fail',450,76,56]

],columns=['id','name','status','fee','points1','points2'])

# Display the DataFrame - remarks

print(remarks)

print()

# Return the rows based on the fee column where fee is greater than 300 and points2 less than 76

print(remarks.loc[(remarks['fee'] > 300) & (remarks['points2'] < 76)])

Output

   id    name status   fee  points1  points2
0  23  sravan   pass  1000       34       56
1  21    siva   fail   400       32       45
2  20  sahaja   pass   100       78       90
3  22  suryam   fail   450       76       56

   id    name status   fee  points1  points2
0  23  sravan   pass  1000       34       56
1  21    siva   fail   400       32       45
3  22  suryam   fail   450       76       56

There are 3 rows where the fee is greater than 300 and points2 less than 76. Here, we specified two conditions with the ‘&’ operator.
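Each condition in the loc[] example is wrapped in parentheses. The sketch below, using a made-up two-column DataFrame `df`, shows what is likely to go wrong without them, and how | (or) and ~ (not) follow the same rule.

```python
import pandas

# Hypothetical data mirroring the fee and points2 columns of 'remarks'
df = pandas.DataFrame({'fee': [1000, 400, 100, 450], 'points2': [56, 45, 90, 56]})

# With parentheses, each comparison is evaluated first and then combined
both = df.loc[(df['fee'] > 300) & (df['points2'] < 76)]

# Without parentheses, & binds tighter than > and <, so Python evaluates
# 300 & df['points2'] first and the resulting chained comparison fails
try:
    df.loc[df['fee'] > 300 & df['points2'] < 76]
    error_raised = False
except ValueError:
    error_raised = True
print(error_raised)

# | keeps rows matching at least one condition; ~ negates a condition
either = df.loc[(df['fee'] > 900) | (df['points2'] > 80)]
neither = df.loc[~((df['fee'] > 900) | (df['points2'] > 80))]
```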

Example 2

Create a DataFrame named ‘remarks’ with 6 columns. Let’s return the rows where fee is greater than 300, points2 is less than 76, and the status is ‘fail’.

import pandas

remarks = pandas.DataFrame([[23,'sravan','pass',1000,34,56],
[21,'siva','fail',400,32,45],
[20,'sahaja','pass',100,78,90],
[22,'suryam','fail',450,76,56]

],columns=['id','name','status','fee','points1','points2'])

# Return the rows based on the fee column where fee is greater than 300 and points2 less than 76, and the status is 'fail'.

print(remarks.loc[(remarks['fee'] > 300) & (remarks['points2'] < 76) & (remarks['status'] == 'fail')])

Output

   id    name status  fee  points1  points2
1  21    siva   fail  400       32       45
3  22  suryam   fail  450       76       56

There are 2 rows where the fee is greater than 300, points2 is less than 76, and the status is ‘fail’. Here, we specified three conditions with the ‘&’ operator.
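When the number of conditions grows, chaining them by hand gets long. One possible approach, sketched here with Python's standard functools.reduce and operator.and_ (a choice made for this illustration, not something the examples above use), is to keep the conditions in a list and combine them in one step.

```python
import operator
from functools import reduce

import pandas

remarks = pandas.DataFrame([[23, 'sravan', 'pass', 1000, 34, 56],
                            [21, 'siva', 'fail', 400, 32, 45],
                            [20, 'sahaja', 'pass', 100, 78, 90],
                            [22, 'suryam', 'fail', 450, 76, 56]],
                           columns=['id', 'name', 'status', 'fee', 'points1', 'points2'])

# Each entry is a boolean Series; the list can be built or extended at runtime
conditions = [
    remarks['fee'] > 300,
    remarks['points2'] < 76,
    remarks['status'] == 'fail',
]

# reduce applies & pairwise: (c0 & c1) & c2
combined = reduce(operator.and_, conditions)
result = remarks.loc[combined]
print(result)
```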

Method 3: Using query()

query() takes a condition as a string expression and returns the rows of the DataFrame for which the expression is true. The expression must be written inside quotes.

Syntax

DataFrame_object.query("Expression")
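The expression does not have to contain hard-coded values. As a brief sketch, a Python variable can be referenced inside the query() string by prefixing its name with @. The variables `min_fee` and `max_points` and the DataFrame `df` are hypothetical names introduced for this illustration.

```python
import pandas

# Hypothetical data mirroring the fee and points2 columns of 'remarks'
df = pandas.DataFrame({'fee': [1000, 400, 100, 450], 'points2': [56, 45, 90, 56]})

# Thresholds kept in ordinary Python variables
min_fee = 300
max_points = 76

# @name looks up the Python variable instead of a column called 'name'
result = df.query("fee > @min_fee and points2 < @max_points")
print(result)
```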

Example

Let’s return the rows where fee is greater than 300 and points2 is less than 76.

import pandas

remarks = pandas.DataFrame([[23,'sravan','pass',1000,34,56],
[21,'siva','fail',400,32,45],
[20,'sahaja','pass',100,78,90],
[22,'suryam','fail',450,76,56]

],columns=['id','name','status','fee','points1','points2'])

# Return the rows based on the fee column where fee is greater than 300 and points2 less than 76

print(remarks.query("fee>300 and points2 < 76"))

Output

   id    name status   fee  points1  points2
0  23  sravan   pass  1000       34       56
1  21    siva   fail   400       32       45
3  22  suryam   fail   450       76       56

There are 3 rows where the fee is greater than 300 and points2 less than 76. Here, we specified two conditions using the ‘and’ operator.
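The query() result above matches the earlier loc[] output for the same two conditions. As a quick check, and only as a sketch, the three methods can be run side by side on the same DataFrame and compared with DataFrame.equals().

```python
import pandas

remarks = pandas.DataFrame([[23, 'sravan', 'pass', 1000, 34, 56],
                            [21, 'siva', 'fail', 400, 32, 45],
                            [20, 'sahaja', 'pass', 100, 78, 90],
                            [22, 'suryam', 'fail', 450, 76, 56]],
                           columns=['id', 'name', 'status', 'fee', 'points1', 'points2'])

# The same two conditions expressed with each of the three methods
by_eval = remarks[remarks.eval("fee > 300 and points2 < 76")]
by_loc = remarks.loc[(remarks['fee'] > 300) & (remarks['points2'] < 76)]
by_query = remarks.query("fee > 300 and points2 < 76")

# equals() compares values, dtypes, and index labels
same_rows = by_eval.equals(by_loc) and by_loc.equals(by_query)
print(same_rows)
```

Since the results agree, choosing between the methods is mostly a matter of readability: string expressions tend to be shorter, while loc[] with explicit masks keeps everything in plain Python.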

Conclusion

Filtering is one of the most frequently used DataFrame operations in Pandas. In this guide, we discussed how to filter a DataFrame using multiple conditions. We worked through several examples that extract rows from a DataFrame with eval(), loc[], and query(), so you should now be able to filter your own data by multiple conditions.

from https://ift.tt/FRTAPgz
