Adding Date Columns to a Python Pandas Dataframe
Intro
To enable analysis over time, by month, by week, or by day of week, we need to have that data in our data set. Let’s make this happen.
Learn how to select specific columns with Python Pandas.
Adding Day of Week Column to Python Pandas dataframe
The first column we’re going to add is a day of week column. Because our date
column is already of type datetime, we can do this really easily. dt
is a standard python library.
df['DayOfWeek'] = df['date'].dt.day_name()
df.head()
strftime()
strftime is a method of the datetime class in the Python standard library that allows you to convert a date object to a string representation. It takes a format string as an argument, which specifies the desired layout of the resulting string.
Here is a list of the most common ones
strftime format specifiers
Format Specifier | Description | Example |
---|---|---|
%Y | 4-digit year | 2022 |
%m | 2-digit month | 01 for January |
%d | 2-digit day | 01 |
%H | Hour (24-hour) | 00, 01, …, 23 |
%M | Minute | 00, 01, …, 59 |
%S | Second | 00, 01, …, 59 |
%B | Full month name | January |
%b | Abbreviated month name | Jan |
%A | Full day name | Saturday |
%a | Abbreviated day name | Sat |
%U | Week number (Sunday as first day of week) | 01, 02, …, 53 |
%W | Week number (Monday as first day of week) | 01, 02, …, 53 |
%w | Weekday as a decimal number, where 0 is Sunday and 6 is Saturday | 0, 1, …, 6 |
Using strftime
I pretty much always use strftime to create my date columns. Here is how I can create dayOfWeek
, monthNumber
, month
and weekNumber
columns.
df['dayOfWeek'] = df['date'].dt.strftime('%A')
df['monthNumber'] = df['date'].dt.strftime('%m')
df['month'] = df['date'].dt.strftime('%B')
df['weekNumber'] = df['date'].dt.strftime('%W')
df.head()
Output
date | estKey | capacity | occupancy | roomsSold | avgRate | salesValue | revPAR | dayOfWeek | monthNumber | month | weekNumber |
---|---|---|---|---|---|---|---|---|---|---|---|
2022-12-27 | 0 | 289 | 0.75 | 217 | 35.97 | 7805.49 | 27.008616 | Tuesday | 12 | December | 52 |
2022-12-27 | 1 | 203 | 0.35 | 71 | 82.31 | 5844.01 | 28.788227 | Tuesday | 12 | December | 52 |
2022-12-27 | 2 | 207 | 0.51 | 106 | 227.83 | 24149.98 | 116.666570 | Tuesday | 12 | December | 52 |
2022-12-27 | 3 | 27 | 0.37 | 10 | 126.46 | 1264.60 | 46.837037 | Tuesday | 12 | December | 52 |
2022-12-27 | 4 | 20 | 0.87 | 17 | 191.57 | 3256.69 | 162.834500 | Tuesday | 12 | December | 52 |
Week & Month codes
Let’s create a unique reference as an integer for the weeks and months. This will be used for ordering our analysis by financial week/month across many years. You will see how this works soon.
We’re going to use python f
strings to concatenate year and month/week, then convert it to an integer. We’ve done this before, let’s take a look again.
df['monthCode'] = df.apply(lambda x: f"{x['year']}{x['monthNumber']}", axis=1).astype(int)
df['weekCode'] = df.apply(lambda x: f"{x['year']}{x['weekNumber']}", axis=1).astype(int)
Here is the final output
date | estKey | capacity | occupancy | roomsSold | avgRate | salesValue | revPAR | dayOfWeek | monthNumber | month | weekNumber | year | monthCode | weekCode |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2022-12-27 | 0 | 289 | 0.75 | 217 | 35.97 | 7805.49 | 27.008616 | Tuesday | 12 | December | 52 | 2022 | 202212 | 202252 |
2022-12-27 | 1 | 203 | 0.35 | 71 | 82.31 | 5844.01 | 28.788227 | Tuesday | 12 | December | 52 | 2022 | 202212 | 202252 |
2022-12-27 | 2 | 207 | 0.51 | 106 | 227.83 | 24149.98 | 116.666570 | Tuesday | 12 | December | 52 | 2022 | 202212 | 202252 |
2022-12-27 | 3 | 27 | 0.37 | 10 | 126.46 | 1264.60 | 46.837037 | Tuesday | 12 | December | 52 | 2022 | 202212 | 202252 |
Note: We’re adding these date columns here, to teach you how to do this. A best practice would be to create a separate table or csv file which contains all the date columns you will need, and you can join this to any data set which has a date column. Pretty much every analysis you will need these, so this is a good practice.
Now that we have the data in the correct format, and with the columns we need, let’s start doing some fun stuff. Next up we’re going to write the current file to a csv.
Learn how to write your Pandas dataframe to a csv file.
Related Posts
-
The Ultimate Python Pandas Guide
By: Adam RichardsonIn this ultimate guide, you will learn how to use Pandas to perform various data manipulation tasks, such as cleaning, filtering, sorting and aggregating data.
-
A Step-by-Step Guide to Joining Pandas DataFrames
By: Adam RichardsonLearn how to join pandas DataFrames efficiently with this step-by-step guide. Improve your data analysis skills and optimize your workflow today!
-
Appending DataFrames in Pandas: A Tutorial
By: Adam RichardsonLearn how to combine two DataFrames in Pandas using the Append function. This tutorial will guide you on how to join multiple DataFrames with code examples.
-
Calculating Mean Value Using mean() Function in Pandas
By: Adam RichardsonLearn how to use the mean() function in pandas to calculate the mean value of a dataset in Python. Improve your data analysis skills with this tutorial.