Selecting Columns with Python Pandas
Intro
Perhaps the shortest tutorial we will do with Python Pandas! Selecting columns is really simple.
Let's look at how you can select columns in Python Pandas.
Let’s take a look at our current Dataframe. I have reset and kept only the code we need, so we are in the same place.
Current Dataframe
date | estKey | capacity | occupancy | roomsSold | avgRate | salesValue | revPAR |
---|---|---|---|---|---|---|---|
2022-12-27 | 0 | 289 | 0.75 | 217 | 35.97 | 7805.49 | 27.008616 |
2022-12-27 | 1 | 203 | 0.35 | 71 | 82.31 | 5844.01 | 28.788227 |
2022-12-27 | 2 | 207 | 0.51 | 106 | 227.83 | 24149.98 | 116.666570 |
2022-12-27 | 3 | 27 | 0.37 | 10 | 126.46 | 1264.60 | 46.837037 |
2022-12-27 | 4 | 20 | 0.87 | 17 | 191.57 | 3256.69 | 162.834500 |
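If you are joining here without the code from the earlier tutorials, the five rows above can be rebuilt by hand. This is only a stand-in sketch: the values are copied from the table, and storing date as a datetime column is an assumption, since the original setup code is not shown here.

```python
import pandas as pd

# Stand-in for the df built in earlier tutorials; values copied from the table above.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2022-12-27"] * 5),  # assumption: stored as datetime
        "estKey": [0, 1, 2, 3, 4],
        "capacity": [289, 203, 207, 27, 20],
        "occupancy": [0.75, 0.35, 0.51, 0.37, 0.87],
        "roomsSold": [217, 71, 106, 10, 17],
        "avgRate": [35.97, 82.31, 227.83, 126.46, 191.57],
        "salesValue": [7805.49, 5844.01, 24149.98, 1264.60, 3256.69],
        "revPAR": [27.008616, 28.788227, 116.666570, 46.837037, 162.834500],
    }
)

print(df.head())
```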
Selecting columns in Python pandas
OK, so we are doing some analysis on occupancy over time, and we will need the date, capacity, occupancy, and roomsSold columns. Let’s look at how we would select those.
occ_df = df[["date", "capacity", "occupancy", "roomsSold"]]
occ_df.head()
Output
date | capacity | occupancy | roomsSold |
---|---|---|---|
2022-12-27 | 289 | 0.75 | 217 |
2022-12-27 | 203 | 0.35 | 71 |
2022-12-27 | 207 | 0.51 | 106 |
2022-12-27 | 27 | 0.37 | 10 |
2022-12-27 | 20 | 0.87 | 17 |
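The double square brackets matter here: the inner pair is a plain Python list of column names. As a rough sketch on a small stand-in frame (not the full df from this series), here are a couple of equivalent ways to make the same selection, plus the difference between selecting one column with single and double brackets.

```python
import pandas as pd

# Small stand-in frame using a few of the columns from the table above.
df = pd.DataFrame(
    {
        "estKey": [0, 1, 2],
        "capacity": [289, 203, 207],
        "occupancy": [0.75, 0.35, 0.51],
        "roomsSold": [217, 71, 106],
    }
)

cols = ["capacity", "occupancy", "roomsSold"]

by_brackets = df[cols]             # list of names -> DataFrame
by_loc = df.loc[:, cols]           # all rows, named columns
by_filter = df.filter(items=cols)  # keeps only the listed columns

single_series = df["occupancy"]    # one name, single brackets -> Series
single_frame = df[["occupancy"]]   # one name inside a list -> one-column DataFrame

print(type(single_series).__name__, type(single_frame).__name__)
```

One difference worth knowing: df[cols] and df.loc[:, cols] raise a KeyError if a name is missing, while filter(items=...) quietly skips names that are not in the frame.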
It’s really that simple. Note that we have created a new variable, occ_df, which is short for “occupancy dataframe”.
A good convention to follow is to append _df to dataframe variable names so that it’s clear what they are.
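If you plan to add or change columns on the new dataframe later, one common pattern is to take an explicit copy at selection time, so the new variable is clearly independent of the original. The sketch below uses a tiny stand-in frame, and the roomsFree column is a made-up example, not part of this series.

```python
import pandas as pd

# Tiny stand-in frame; the real df comes from the earlier tutorials.
df = pd.DataFrame(
    {
        "capacity": [289, 203],
        "roomsSold": [217, 71],
        "avgRate": [35.97, 82.31],
    }
)

# .copy() makes occ_df its own object, so later edits never touch df.
occ_df = df[["capacity", "roomsSold"]].copy()

# Hypothetical extra column, only to show that df is left unchanged.
occ_df["roomsFree"] = occ_df["capacity"] - occ_df["roomsSold"]

print(occ_df)
print(df.columns.tolist())
```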
In the next tutorial, we’re going to add the relevant date columns to support our analysis.
Let's look at how we can add date columns. We will add Day of Week, Month, Week Number, and Month Number, along with unique identifiers for week and month.