Pandas: Drop Columns by Label and Index

Using loc indexer to drop columns by label

The loc indexer selects rows and columns by their labels. It has no drop operation of its own: to remove a column by label, you either call the DataFrame's drop method with the column name(s), or use .loc[] with a boolean mask that selects every column except the ones you want to remove. Let’s look at both with an example.

Suppose we have the following DataFrame:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [5, 6, 7, 8],
                   'C': [9, 10, 11, 12]})

To drop column ‘A’, we can use the following code:

df.drop(columns=['A'], inplace=True)

Because of inplace=True, this modifies df itself and removes the ‘A’ column; no new DataFrame is created, and the call returns None.
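As a minimal sketch of how the inplace flag changes what drop gives back, the snippet below runs both variants side by side. The names without_a, df_copy, and result are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [5, 6, 7, 8],
                   'C': [9, 10, 11, 12]})

# Without inplace=True, drop() returns a new DataFrame and leaves df untouched.
without_a = df.drop(columns=['A'])

# With inplace=True, drop() modifies the object it is called on and returns None.
df_copy = df.copy()  # work on a copy so df stays available for comparison
result = df_copy.drop(columns=['A'], inplace=True)

print(list(without_a.columns))  # ['B', 'C']
print(list(df.columns))         # ['A', 'B', 'C']
print(result)                   # None
print(list(df_copy.columns))    # ['B', 'C']
```

Assigning the return value of drop to a new name is usually the easier form to reason about, since the original DataFrame remains available afterwards.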

If we instead want a new DataFrame and want to leave the existing one unchanged, we can use loc to select every column except ‘A’ (shown here against the original df, before the in-place drop):

new_df = df.loc[:, ~df.columns.isin(['A'])]

In the above code, the isin() method returns a boolean array with one entry per column: True where the column label appears in the list provided and False otherwise. The ~ operator is an element-wise logical NOT, so it inverts that mask, and loc then selects only the columns that are not in the list.
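To make the mask visible, here is a small sketch that prints the intermediate values before they are passed to loc. The variable names mask and keep are assumptions added for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [5, 6, 7, 8],
                   'C': [9, 10, 11, 12]})

mask = df.columns.isin(['A'])  # boolean array: True where the label is in the list
keep = ~mask                   # element-wise NOT: True for columns to keep

print(mask.tolist())  # [True, False, False]
print(keep.tolist())  # [False, True, True]

# loc uses the boolean array to pick columns; df itself is not modified.
new_df = df.loc[:, keep]
print(list(new_df.columns))  # ['B', 'C']
```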

With drop and a negated loc mask, you can remove unwanted columns by label, either in place or as a new DataFrame.

Using iloc indexer to drop columns by index

The iloc indexer selects rows and columns by their integer positions. Like loc, it has no drop operation of its own: to remove a column by position, you either look up its label with df.columns[i] and pass that to drop, or use .iloc[] to select only the positions you want to keep. Here’s an example.

Suppose we have the following DataFrame:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [5, 6, 7, 8],
                   'C': [9, 10, 11, 12]})

To drop column 0, which is the ‘A’ column, we can use the following code:

df.drop(columns=df.columns[0], inplace=True)

Here df.columns[0] looks up the label at position 0 (‘A’), and drop with inplace=True removes that column from df itself; no new DataFrame is created.

To get a new DataFrame instead, we can pass .iloc[] a list of the integer positions we want to keep. Starting from the original three-column df (before the in-place drop above), this leaves out column 0:

new_df = df.iloc[:, [1, 2]]

In the above code, the leading colon selects all rows. The list [1, 2] selects the columns at integer positions 1 and 2, so the column at position 0 is left out of new_df.
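Since iloc selects what to keep rather than what to drop, it can help to start from the positions to remove. The sketch below shows two ways to do that; positions_to_drop is a hypothetical choice, and the other names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [5, 6, 7, 8],
                   'C': [9, 10, 11, 12]})

positions_to_drop = [0, 2]  # hypothetical: remove the first and third columns

# Option 1: translate positions into labels, then let drop() remove them.
dropped = df.drop(columns=df.columns[positions_to_drop])

# Option 2: compute the complementary positions and select them with iloc.
positions_to_keep = [i for i in range(df.shape[1]) if i not in positions_to_drop]
kept = df.iloc[:, positions_to_keep]

print(list(dropped.columns))  # ['B']
print(list(kept.columns))     # ['B']
```

The two options agree here because every column label is unique. With duplicate labels, option 1 would remove every column sharing a dropped label, so option 2 is the safer choice in that case.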

With these examples, you can remove unwanted columns based on their integer position, either by passing df.columns[i] to drop or by selecting the remaining positions with iloc.

Dropping multiple columns at once

Sometimes we need to drop multiple columns from a DataFrame at once. This can be done by label with drop, or by position with iloc.

Suppose we have the following DataFrame:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [5, 6, 7, 8],
                   'C': [9, 10, 11, 12],
                   'D': [13, 14, 15, 16]})

To drop columns ‘A’ and ‘B’ by label, we can pass both names to drop:

df.drop(columns=['A', 'B'], inplace=True)

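Because of inplace=True, this modifies df itself and removes columns ‘A’ and ‘B’; no new DataFrame is created.

Similarly, to drop columns ‘A’ and ‘B’ by position and get a new DataFrame, we can slice with iloc. Starting from the original four-column df (before the in-place drop above):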

new_df = df.iloc[:, 2:]

In the above code, the leading colon selects all rows, while 2: selects every column from position 2 onward, which leaves out the first two columns.

Keep in mind that drop and loc work with column labels, while iloc works with the integer positions of the columns.

With these approaches, you can drop several columns from a Pandas DataFrame in a single statement.
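One practical consequence of the label versus position distinction shows up when slicing: a loc slice includes its end label, while an iloc slice excludes its end position. The following sketch illustrates this on the four-column DataFrame; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [5, 6, 7, 8],
                   'C': [9, 10, 11, 12],
                   'D': [13, 14, 15, 16]})

# Dropping the first two columns: the same result by position and by label.
by_position = df.iloc[:, 2:]   # positions 2 and 3
by_label = df.loc[:, 'C':]     # labels 'C' through the last column

# Keeping the first two columns: note the different end points.
first_two_by_position = df.iloc[:, 0:2]   # end position 2 is excluded
first_two_by_label = df.loc[:, 'A':'B']   # end label 'B' is included

print(list(by_position.columns))            # ['C', 'D']
print(list(by_label.columns))               # ['C', 'D']
print(list(first_two_by_position.columns))  # ['A', 'B']
print(list(first_two_by_label.columns))     # ['A', 'B']
```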

Summary

In this article, we looked at how to drop columns by label and by integer position in Pandas. The drop method removes columns by label, either in place or as a new DataFrame, while loc with a negated isin() mask and iloc with a list or slice of positions build a new DataFrame containing only the columns to keep. We also looked at how to drop multiple columns at once. Knowing which of these approaches modifies the existing DataFrame and which returns a new one makes everyday data cleaning in Pandas more predictable.