Writing to a CSV file with Python Pandas
Intro
We’re going to learn how to write our Pandas dataframe to a CSV file.
By this point the dataframe already has its date columns: day of week, month, week number and month number, along with unique identifiers for week and month.
Writing the data
We have completed our data cleaning for this tutorial and have the columns that we need. Rather than rerunning this Pandas code every time, we are going to store the result as a CSV file and work from that file going forward.
Writing the data to a CSV file allows you to save a copy of the cleaned data for future use, which can be useful if you need to revisit the data at a later date or if you want to share the data with others.
We only need one line of code to do this:
df.to_csv('clean.csv', index=False, header=True)
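To make the two arguments concrete, here is a minimal sketch using a small made-up dataframe (the rows and the temporary output folder are assumptions for illustration, not the tutorial's real data). With index=False the row index is left out of the file, and with header=True the column names are written as the first line.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample rows standing in for the cleaned dataframe.
df = pd.DataFrame(
    {
        "date": ["2023-01-01", "2023-01-02"],
        "roomsSold": [80, 95],
        "capacity": [100, 100],
    }
)

# Write to a temporary folder so the example does not overwrite a real clean.csv.
out_path = Path(tempfile.mkdtemp()) / "clean.csv"
df.to_csv(out_path, index=False, header=True)

# Read the raw text back to see exactly what was written.
csv_text = out_path.read_text()
print(csv_text)
```

The first line of the printed text is the header row, and there is no leading index column in front of the date values.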
You can see that we now have the CSV file saved. I’ve renamed my previous notebook to DataPrep, and we will create a new notebook that imports the CSV file we have just created, ready to start our analysis and visualisations.
Sense check the CSV file
It’s always a good idea to scan through your work to ensure everything looks as expected. In my case, I had mistakenly removed the rounding and type assignment from the revPAR column, which meant its values were no longer rounded to two decimal places.
I’ve amended this line of code to restore the rounding and type assignment, and I will rerun the full script so that the CSV file is overwritten with the corrected values.
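A check like this can also be done in code rather than by eye. The sketch below is one possible approach, not the method used in the original notebook: it reads some made-up CSV text (in practice you would pass the path 'clean.csv' to read_csv) and flags any rows whose revPAR value changes when rounded to two decimal places.

```python
import io

import pandas as pd

# Hypothetical CSV contents; in practice, pass 'clean.csv' to pd.read_csv instead.
sample_csv = "date,revPAR\n2023-01-01,96.4\n2023-01-02,117.2832\n"
check = pd.read_csv(io.StringIO(sample_csv))

# Confirm the column types are what we expect.
print(check.dtypes)

# Rows where rounding to two decimals changes the value were not rounded on write.
difference = (check["revPAR"] - check["revPAR"].round(2)).abs()
not_rounded = check[difference > 1e-9]
print(not_rounded)
```

An empty not_rounded result would suggest the column was rounded as intended; any rows listed are the ones to investigate.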
df['revPAR'] = (df['roomsSold'] * df['avgRate'] / df['capacity']).round(2).astype(float)
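As a quick worked example of that line, the sketch below applies it to two made-up rows (the numbers are assumptions chosen for illustration only).

```python
import pandas as pd

# Hypothetical values; the column names match those used in the line above.
df = pd.DataFrame(
    {
        "roomsSold": [80, 95],
        "avgRate": [120.5, 123.456],
        "capacity": [100, 100],
    }
)

# Revenue per available room: rooms sold times average rate, divided by capacity,
# then rounded to two decimal places and stored as a float.
df["revPAR"] = (df["roomsSold"] * df["avgRate"] / df["capacity"]).round(2).astype(float)

print(df)
print(df["revPAR"].dtype)
```

For the second row, 95 * 123.456 / 100 is 117.2832, which is stored as 117.28 after rounding. Dividing numeric columns already produces a float result in Pandas, so the astype(float) call mainly serves to make the intended type explicit.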
Aggregating data refers to the process of summarizing data by grouping it and applying statistical functions to the groups.