Pandas Concepts
Revision as of 01:40, 16 October 2023
External
Internal
Overview
pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Pandas is built on top of NumPy. Pandas provides two main data structures, DataFrame and Series, which share concepts like axis and index.
Axis
Both DataFrames and Series use the concept of axis. An axis is the direction in a two-dimensional data matrix or a vector along which an operation, such as computing the mean, is applied. A DataFrame has a row axis and a column axis, while a Series has only one axis. The term comes from NumPy, whose ndarray is used to implement the pandas Series. The indexes corresponding to the axes of a DataFrame or a Series are returned by the axes property.
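A minimal sketch of the axes property and of an operation applied along each axis. The column names and values are made up for illustration.

```python
import pandas as pd

# Illustrative data; the column names and values are assumptions.
df = pd.DataFrame({"distance": [1.0, 2.5, 4.0], "strength": [10, 20, 30]})
s = pd.Series([1.0, 2.5, 4.0])

# axes returns a list of indexes: [row index, column index] for a DataFrame,
# and a single-element list [index] for a Series.
df_axes = df.axes
s_axes = s.axes

# axis=0 runs down the rows, producing one value per column.
col_means = df.mean(axis=0)
# axis=1 runs across the columns, producing one value per row.
row_means = df.mean(axis=1)

print(df_axes)
print(s_axes)
print(col_means)
print(row_means)
```

Note that axis=0 yields one result per column and axis=1 one result per row, which is easy to read the wrong way around.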
Index
An index is an immutable sequence used to address data stored in a DataFrame or a Series. By default, it consists of 0-based monotonically increasing integers (RangeIndex), but it can also consist of string labels or, in the case of time series, of datetime instances (DatetimeIndex). Other indexes: CategoricalIndex, MultiIndex, IntervalIndex, TimedeltaIndex, PeriodIndex. For details related to DataFrame and Series indexes, see:
DataFrame
Series
Time Series Processing with Pandas
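The three common index kinds mentioned above can be sketched as follows. The values and labels are illustrative assumptions; the last part shows that an index rejects in-place modification.

```python
import pandas as pd

# Default index: 0-based integers, represented as a RangeIndex.
default_s = pd.Series([10, 20, 30])

# Explicit string labels.
labeled_s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Datetime labels, as used for time series.
dated_s = pd.Series(
    [10, 20, 30],
    index=pd.date_range("2023-10-01", periods=3, freq="D"),
)

print(type(default_s.index).__name__)
print(type(labeled_s.index).__name__)
print(type(dated_s.index).__name__)

# Elements are addressed through the index labels.
by_label = labeled_s["b"]
by_date = dated_s.loc[pd.Timestamp("2023-10-02")]

# An index is immutable: assigning to one of its elements raises TypeError.
try:
    labeled_s.index[0] = "z"
    mutated = True
except TypeError:
    mutated = False
```

Replacing the whole index (for example by assigning a new list to labeled_s.index) is still allowed; only in-place element assignment is rejected.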
Index
An index can be thought of as the labels of the individual elements for a Series, or as the row labels for a DataFrame.
Generic Index
Index(['distance', 'strength'], dtype='object')
This kind of index is associated with the column axis of a DataFrame.
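A small sketch showing where such an index comes from, assuming a DataFrame with the two columns named in the output above (the values are made up). The dtype shown in the printed representation may differ between pandas versions.

```python
import pandas as pd

# Column names taken from the output above; the values are illustrative.
df = pd.DataFrame({"distance": [1.0, 2.5], "strength": [10, 20]})

# The column axis is labeled by a generic Index built from the column names.
columns = df.columns
# The row axis gets the default RangeIndex when no labels are given.
rows = df.index

print(columns)
print(rows)
```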
DataFrame Index
Series Index
Data Types
String
Datetime
TO PROCESS: https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.html#pandas.Timestamp
Reported as datetime64[ns]. What is this?
Also see
Date format when represented as string: YYYY-MM-DD, for example "2023-10-01".
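A hedged sketch of how a date string in this format becomes a pandas datetime. datetime64 is the NumPy-backed datetime dtype, and the bracketed suffix is its time resolution ([ns] means nanoseconds). The exact unit reported can depend on the pandas version, so the example only checks the dtype family.

```python
import pandas as pd

# Parsing a single YYYY-MM-DD string gives a pandas Timestamp.
ts = pd.to_datetime("2023-10-01")

# Parsing several strings and storing them in a Series gives a datetime64 dtype.
s = pd.Series(pd.to_datetime(["2023-10-01", "2023-10-02"]))
dtype_name = str(s.dtype)

# Formatting back to the YYYY-MM-DD string representation.
as_text = ts.strftime("%Y-%m-%d")

print(type(ts).__name__)
print(dtype_name)
print(as_text)
```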
Visualization
Both the DataFrame and Series have a plot() method, which delegates to matplotlib.
Datareader
pip3 install pandas_datareader