
Getting Pandas Column Names as a List

The Pandas DataFrame Object

A Pandas DataFrame is a two-dimensional, table-like data structure with labeled rows and columns, on which you can manipulate data and run computations. It is the central data structure of the Pandas library and is used widely in data manipulation tasks.

Each column in a DataFrame has a name that identifies it. For example, you might name the columns after the features in a dataset, such as “age”, “sex”, and “income”. A DataFrame can be created from a Python dictionary in which the keys are the column names and the values are the column data.

import pandas as pd
my_dict = {'Name': ['John', 'Micheal', 'Sara', 'Adam'],
           'Age': [24, 35, 19, 45],
           'Country': ['USA', 'Canada', 'UK', 'Australia']}
df = pd.DataFrame(my_dict)

In the above example, we created a DataFrame from a dictionary, giving three columns and four rows. The keys (Name, Age, Country) become the column names, and the lists become the column values.

Once you’ve created your DataFrame, there are various ways to manipulate the data, such as filtering, sorting, grouping, and pivoting. Many of these start from the column names, so it is useful to know how to extract them as a list. The shortest way is to pass the DataFrame to Python's built-in list():

col_list = list(df)
print(col_list)

This will output a list of column names:

['Name', 'Age', 'Country']

The column names are now in an ordinary Python list, so you can loop over them, filter them, or pass them to other functions.
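The list(df) call works because a DataFrame iterates over its column labels rather than its rows. The following is a minimal sketch of that behavior, using a smaller DataFrame made up for illustration:

```python
import pandas as pd

# Small illustrative DataFrame (not the one from the example above)
df = pd.DataFrame({'Name': ['John', 'Sara'],
                   'Age': [24, 19],
                   'Country': ['USA', 'UK']})

# Iterating over a DataFrame yields column labels, not rows
labels = [col for col in df]

# Membership tests on a DataFrame also check column labels
has_age = 'Age' in df
has_john = 'John' in df  # 'John' is a cell value, not a column label

print(labels)
print(has_age, has_john)
```

Since iteration and the in operator both look at column labels, list(df) is a compact but slightly implicit way to get the names.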

Getting Column Names as a List with Pandas

In data analysis and manipulation tasks, you often need the list of column names in order to select specific columns or perform operations on them. In Pandas, the .columns attribute of a DataFrame holds all of its column names as an Index object. Often, though, you need a plain Python list rather than an Index.
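To make the distinction concrete, here is a short sketch that inspects the object stored in .columns. The DataFrame is a made-up example:

```python
import pandas as pd

# Small illustrative DataFrame
df = pd.DataFrame({'Name': ['John', 'Sara'], 'Age': [24, 19]})

cols = df.columns

# The columns attribute is a pandas Index, not a Python list
is_index = isinstance(cols, pd.Index)
is_list = isinstance(cols, list)
print(is_index, is_list)

# An Index still supports positional access and len()
first = cols[0]
count = len(cols)
print(first, count)
```

An Index behaves like a sequence in many ways, but code that specifically expects a list will need the conversion described next.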

To get the column names as a list, we can use the .tolist() method of the Pandas index object. Here’s an example:

import pandas as pd

# Creating a Dataframe with column names
data = {'Name': ['John', 'Micheal', 'Sara', 'Adam'],
        'Age': [24, 35, 19, 45],
        'Country': ['USA', 'Canada', 'UK', 'Australia']}
df = pd.DataFrame(data)

# Getting column names
col_names = df.columns.tolist()
print("Column names:", col_names)

In the above example, we used the df.columns.tolist() method to return the names of the columns as a Python list. The output should look like this:

Column names: ['Name', 'Age', 'Country']

This method ensures that the column names are now in a simple Python list, which can be used as input to Python functions, transformed or manipulated further.
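As one example of further manipulation, the list can be filtered with a list comprehension and then used to select columns. This is only a sketch; the choice to drop the Age column is arbitrary:

```python
import pandas as pd

data = {'Name': ['John', 'Micheal', 'Sara', 'Adam'],
        'Age': [24, 35, 19, 45],
        'Country': ['USA', 'Canada', 'UK', 'Australia']}
df = pd.DataFrame(data)

col_names = df.columns.tolist()

# Keep every column except 'Age'
wanted = [name for name in col_names if name != 'Age']

# Indexing a DataFrame with a list of names selects those columns
subset = df[wanted]
print(subset.columns.tolist())
```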

Getting column names as a list is a simple yet common task that data scientists and engineers encounter while working with the Pandas library.

Alternative Method for Retrieving Column Names

While df.columns.tolist() is the commonly known way to retrieve a list of column names in Pandas, there is also an alternative. The columns attribute of the DataFrame returns a Pandas Index object that stores the column labels. The .values attribute of that Index exposes the labels as a NumPy array, which has its own .tolist() method.

Here’s how to use this alternative method:

import pandas as pd

# Creating a Dataframe with column names
data = {'Name': ['John', 'Micheal', 'Sara', 'Adam'],
        'Age': [24, 35, 19, 45],
        'Country': ['USA', 'Canada', 'UK', 'Australia']}
df = pd.DataFrame(data)

# Getting column names
col_names = df.columns.values.tolist()
print("Column names:", col_names)

In this example, we accessed the columns attribute of the DataFrame to retrieve the column labels as a Pandas Index object, then used .values to get the underlying NumPy array. We then called the array's .tolist() method to convert it to a Python list of column names.

The output should be the same as the previous method:

Column names: ['Name', 'Age', 'Country']

This alternative gives the same result as df.columns.tolist(); it simply goes through the underlying NumPy array first. It’s useful to know both, since either may appear in existing code that accesses and manipulates DataFrame columns.
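As a quick check, the sketch below runs the three approaches from this article side by side on the same DataFrame and confirms that they agree. The variable names are chosen for illustration only:

```python
import numpy as np
import pandas as pd

data = {'Name': ['John', 'Micheal', 'Sara', 'Adam'],
        'Age': [24, 35, 19, 45],
        'Country': ['USA', 'Canada', 'UK', 'Australia']}
df = pd.DataFrame(data)

via_list = list(df)                       # iterate over the DataFrame
via_tolist = df.columns.tolist()          # Index -> list
via_values = df.columns.values.tolist()   # Index -> NumPy array -> list

# The intermediate object in the third approach is a NumPy array
is_ndarray = isinstance(df.columns.values, np.ndarray)

all_equal = via_list == via_tolist == via_values
print(all_equal, is_ndarray)
```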