Time Series Processing with Pandas

From NovaOrdis Knowledge Base

This article provides hints on how time series can be processed with Pandas.

Load a Time Series

This section shows how to load a CSV file into Pandas and turn it into a Series. The file is assumed to have two columns: the first, labeled "date", contains timestamp-formatted strings, and the second contains the value recorded at each timestamp.

The content of the CSV file should be similar to:

date, value
2023-10-01, 133
2023-10-02, 135
2023-10-03, 139
2023-10-04, 123
2023-10-05, 122
2023-10-06, 119
2023-10-07, 117
2023-10-08, 130
2023-10-09, 132

Create a DataFrame by reading the CSV file with the read_csv() function. Passing the column name to the parse_dates parameter makes Pandas parse the "date" column as a datetime type instead of leaving it as plain strings.

import pandas as pd
df = pd.read_csv("./timeseries.csv", parse_dates=["date"])
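
The following is a minimal sketch of one way to go from the loaded DataFrame to a Series indexed by date. It makes a few assumptions that are not part of the original example: the first three rows of the sample are inlined through io.StringIO so the snippet runs without a file on disk, skipinitialspace=True is passed because the sample has a space after each comma (without it the second column would be named " value", with a leading space), and the names csv_text and series are purely illustrative.

```python
import io

import pandas as pd

# Inline copy of the first sample rows, standing in for ./timeseries.csv.
csv_text = """date, value
2023-10-01, 133
2023-10-02, 135
2023-10-03, 139
"""

# skipinitialspace=True drops the blank after each comma, so the second
# column is named "value" rather than " value".
df = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["date"],
    skipinitialspace=True,
)

# Use the parsed dates as the index, then select the value column as a Series.
series = df.set_index("date")["value"]

print(df.dtypes)     # dtype of each column after parsing
print(series.index)  # the index built from the "date" column
```

Indexing the Series by date is a design choice rather than a requirement. It is convenient for time series work because date-based selection and resampling operate on the index.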