Pandas Concepts: Difference between revisions

From NovaOrdis Knowledge Base
==Time Series Processing with Pandas==
{{Internal|Time Series Processing with Pandas#Overview|Time Series Processing with Pandas}}
=Axis=
Both [[Pandas_DataFrame|DataFrames]] and [[Pandas_Series|Series]] use the concept of an axis. <font color=darkkhaki>Formally define. By default, an axis consists of monotonically increasing integers with step 1, from 0 to length - 1.</font>
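
The default axis labels can be illustrated with a minimal sketch. The frame contents and the names <code>df</code> and <code>s</code> are made up for illustration; only standard pandas calls are used.

```python
import pandas as pd

# Hypothetical data, created without an explicit index.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# With no explicit index, the row axis is labeled 0 .. length - 1.
row_labels = list(df.index)
column_labels = list(df.columns)

# axis=0 aggregates down the rows (one result per column);
# axis=1 aggregates across the columns (one result per row).
column_sums = df.sum(axis=0)
row_sums = df.sum(axis=1)

# A Series has a single axis, labeled the same way by default.
s = pd.Series([5, 6, 7])
series_labels = list(s.index)
```

A DataFrame has two axes (rows and columns), while a Series has only the one.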


=Index=

Revision as of 01:49, 15 October 2023

Internal

Overview

pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. pandas is built on top of NumPy.

DataFrame

DataFrame

Series

Series

Time Series Processing with Pandas

Time Series Processing with Pandas

Index

https://pandas.pydata.org/docs/reference/indexing.html

An index can be thought of as the labels of the individual elements of a Series, or as the row labels of a DataFrame.
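
As a rough sketch of what "labels" means here, the example below builds a Series and a DataFrame with explicit index labels and selects by label with loc. The data and label values are hypothetical.

```python
import pandas as pd

# Series: the index labels the individual elements.
s = pd.Series([1.5, 2.5, 3.5], index=["x", "y", "z"])
value_y = s.loc["y"]

# DataFrame: the index labels the rows.
df = pd.DataFrame({"price": [10, 20]}, index=["apple", "pear"])
pear_price = df.loc["pear", "price"]
row_labels = list(df.index)
```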

RangeIndex

RangeIndex(start=0, stop=3, step=1)
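
A RangeIndex like the one above is what pandas assigns when no index is given. A minimal sketch, assuming a three-element Series with made-up values:

```python
import pandas as pd

# No index argument: pandas falls back to a RangeIndex.
s = pd.Series(["a", "b", "c"])
idx = s.index

print(idx)  # RangeIndex(start=0, stop=3, step=1)
```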

DatetimeIndex

A DatetimeIndex is what makes a Series a time series. TODO: merge with Time_Series_Processing_with_Pandas.
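
A minimal sketch of a Series backed by a DatetimeIndex, assuming three made-up daily observations:

```python
import pandas as pd

# Three consecutive days starting at a hypothetical date.
dates = pd.date_range("2023-10-01", periods=3, freq="D")
ts = pd.Series([1.0, 2.0, 3.0], index=dates)

# The index type is what marks this Series as a time series.
is_time_series = isinstance(ts.index, pd.DatetimeIndex)

# Elements are selected by timestamp label.
value = ts.loc[pd.Timestamp("2023-10-02")]
```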

DataFrame Index

DataFrame Index

Series Index

Series Index

Data Types

String

Datetime

TO PROCESS: https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.html#pandas.Timestamp

The dtype is reported as datetime64[ns]: NumPy's 64-bit datetime type with nanosecond resolution.

Also see

Create a Time Series from CSV

Date format when represented as a string: YYYY-MM-DD, for example "2023-10-01".
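
Strings in this format can be parsed with pd.to_datetime. The sketch below uses made-up dates; the exact dtype unit reported for the parsed Series (for example datetime64[ns]) may depend on the pandas version, so treat it as an assumption.

```python
import pandas as pd

# A single string parses to a pandas Timestamp.
ts = pd.to_datetime("2023-10-01", format="%Y-%m-%d")

# A Series of strings parses to a datetime64 Series.
s = pd.to_datetime(pd.Series(["2023-10-01", "2023-10-02"]), format="%Y-%m-%d")
dtype_name = str(s.dtype)  # a datetime64[...] dtype; unit may vary by version
```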

Visualization

Both the DataFrame and Series have a plot() method, which delegates to matplotlib.
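
A minimal sketch of plot() on a Series, assuming matplotlib is installed. The data, the title and the choice of the non-interactive "Agg" backend are illustrative assumptions, not part of the original notes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import pandas as pd

# Hypothetical daily values.
s = pd.Series(
    [1, 3, 2],
    index=pd.date_range("2023-10-01", periods=3, freq="D"),
)

# plot() returns the matplotlib Axes it drew on.
ax = s.plot(title="Example series")
fig = ax.get_figure()
```

Because the return value is a regular matplotlib Axes, further customization goes through the usual matplotlib API.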

Datareader

pip3 install pandas_datareader