Time Series Processing with Pandas

From NovaOrdis Knowledge Base

Revision as of 19:05, 8 October 2023

Internal

Overview

This article provides hints on how time series can be processed with Pandas.

Load a Time Series

Assume the data comes from a CSV file whose first column, labeled "date", contains timestamp-formatted strings, and whose second column contains the values corresponding to those timestamps. This section describes how such data is loaded and turned into a Pandas Series.

The content of the CSV file should be similar to:

date, value
2023-10-01, 133
2023-10-02, 135
2023-10-03, 139
2023-10-04, 123
2023-10-05, 122
2023-10-06, 119
2023-10-07, 117
2023-10-08, 130
2023-10-09, 132

Create a DataFrame by reading the CSV file with the read_csv() function. To have the "date" column handled as a datetime type, pass its name to the parse_dates parameter, which tells read_csv() to parse that column as dates while loading.

import pandas as pd
df = pd.read_csv("./timeseries.csv", parse_dates=["date"])
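
The loading step can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not necessarily the approach the article intends: io.StringIO stands in for the ./timeseries.csv file so the example runs on its own, skipinitialspace=True is an addition to the original call (it drops the space that follows each comma in the sample, so the second column is named "value" rather than " value"), and set_index("date")["value"] is one possible way to obtain the Series mentioned earlier.

```python
import io

import pandas as pd

# Sample content copied from the CSV listing above. io.StringIO stands in
# for the ./timeseries.csv file so the sketch does not depend on the disk.
CSV_TEXT = """date, value
2023-10-01, 133
2023-10-02, 135
2023-10-03, 139
2023-10-04, 123
2023-10-05, 122
2023-10-06, 119
2023-10-07, 117
2023-10-08, 130
2023-10-09, 132
"""

df = pd.read_csv(
    io.StringIO(CSV_TEXT),
    parse_dates=["date"],   # parse the "date" column as datetime
    skipinitialspace=True,  # assumption: skip the space after each comma
)

# Use the parsed dates as the index and select the value column, which
# yields a Series indexed by a DatetimeIndex.
series = df.set_index("date")["value"]

print(df.dtypes)
print(series.head())
```

Indexing the Series by date is a design choice made here for illustration: it keeps each value attached to its timestamp, which is the usual starting point for time-based selection and resampling.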