Pandas Concepts
External
Internal
Overview
pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. pandas is built on top of NumPy.
Axis
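Both DataFrames and Series use the concept of axis. An axis is a direction in the two-dimensional data matrix along which an operation, such as computing a mean, is applied. In a DataFrame, axis 0 runs down the rows (the index) and axis 1 runs across the columns; a Series has only axis 0.
To do: define formally. By default, the labels along an axis are monotonically increasing integers with step 1, from 0 to length - 1.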
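A minimal sketch of how the axis argument changes an aggregation, and of the default integer labels. The frame and its values are made up for illustration.

```python
import pandas as pd

# A small 2x3 frame; the values are invented for this example.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# axis=0 runs down the rows: one result per column.
col_sums = df.sum(axis=0)

# axis=1 runs across the columns: one result per row.
row_sums = df.sum(axis=1)

# With no explicit index, the row labels default to 0 .. length - 1.
default_labels = list(df.index)

print(col_sums.tolist())   # [3, 7, 11]
print(row_sums.tolist())   # [9, 12]
print(default_labels)      # [0, 1]
```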
DataFrame
Series
Time Series Processing with Pandas
Index
An index can be thought of as the labels of the individual elements of a Series, or as the row labels of a DataFrame.
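A short sketch of label-based access through the index, for both a Series and a DataFrame. The labels and values are hypothetical.

```python
import pandas as pd

# Series: the index labels the individual elements.
s = pd.Series([10, 20, 30], index=["x", "y", "z"])
value = s["y"]

# DataFrame: the index labels the rows.
df = pd.DataFrame({"price": [1.5, 2.5]}, index=["apple", "pear"])
row = df.loc["pear"]  # the row comes back as a Series keyed by column name

print(value)         # 20
print(row["price"])  # 2.5
```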
RangeIndex
RangeIndex(start=0, stop=3, step=1)
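The representation above is what pandas shows for a three-element object created without an explicit index. A minimal sketch, with made-up values:

```python
import pandas as pd

# No index is passed, so pandas assigns the default RangeIndex.
s = pd.Series(["a", "b", "c"])
idx = s.index

print(idx)        # RangeIndex(start=0, stop=3, step=1)
print(list(idx))  # [0, 1, 2]
```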
DatetimeIndex
A DatetimeIndex is what makes a Series a time series. To do: merge with Time_Series_Processing_with_Pandas.
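A minimal sketch of a Series indexed by dates, assuming daily data starting on an arbitrary date. The values are invented.

```python
import pandas as pd

# Three consecutive days; date_range returns a DatetimeIndex.
dates = pd.date_range("2023-10-01", periods=3, freq="D")
ts = pd.Series([1.0, 2.0, 3.0], index=dates)

is_time_series = isinstance(ts.index, pd.DatetimeIndex)

# Rows can be looked up by a date string.
second = ts.loc["2023-10-02"]

# A partial string selects every entry in that month.
october = ts.loc["2023-10"]
```

Selecting by partial date string is one of the conveniences that a DatetimeIndex adds over a plain integer index.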
DataFrame Index
Series Index
Data Types
String
Datetime
TO PROCESS: https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.html#pandas.Timestamp
Reported as datetime64[ns]: NumPy's 64-bit datetime type, with values stored at nanosecond resolution.
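A small sketch of where this dtype shows up, assuming the dates start out as strings. The reported unit is commonly ns, but it may differ depending on the pandas version, so the sketch only prints it.

```python
import pandas as pd

# Parse date strings into a datetime-typed Series.
s = pd.to_datetime(pd.Series(["2023-10-01", "2023-10-02"]))

# The dtype is reported as datetime64[<unit>], commonly datetime64[ns].
dtype_name = str(s.dtype)
print(dtype_name)

# Individual elements come back as pandas.Timestamp objects.
first = s.iloc[0]
print(type(first).__name__)  # Timestamp
```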
Also see
Date format when represented as string: YYYY-MM-DD, for example "2023-10-01".
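A short sketch of the round trip between this string format and a Timestamp, using the example date from the text:

```python
import pandas as pd

# Parse a YYYY-MM-DD string into a Timestamp.
ts = pd.Timestamp("2023-10-01")

# Format it back into the same YYYY-MM-DD form.
as_text = ts.strftime("%Y-%m-%d")

print(ts.year, ts.month, ts.day)  # 2023 10 1
print(as_text)                    # 2023-10-01
```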
Visualization
Both the DataFrame and Series have a plot() method, which delegates to matplotlib.
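A minimal sketch of calling plot() on a Series, assuming matplotlib is installed. The Agg backend is selected here only so the example runs without a display; the data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened

import pandas as pd

s = pd.Series([1, 3, 2], name="value")

# With the default backend, plot() returns a matplotlib Axes object.
ax = s.plot()
ax.set_title("Series.plot() sketch")

print(type(ax).__module__)
```

Because the return value is a regular matplotlib Axes, further customization uses the usual matplotlib calls.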
Datareader
pip3 install pandas_datareader