Latest revision as of 19:43, 20 May 2024


Overview

A Series is a one-dimensional array of values, where each value has a label. The labels are referred to as "axis labels" and are managed by the series's index. By default, in the absence of an explicit index, a series gets a monotonic integer range index that starts at 0 and has a step of 1, which allows retrieving data with 0-based integer indexes (see Accessing Elements of a Series below).

Every series has a name (None unless one is set) and a data type (dtype). The data type, and the name when one is set, are reported when the series is printed.

A Series is implemented with a NumPy ndarray.
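
As a minimal sketch of the points above, the following creates a series without an explicit index and inspects its index, name and data type. The sample values and the name "price" are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; no index is given, so pandas assigns a RangeIndex.
s = pd.Series([10.0, 20.0, 30.0], name="price")

print(s.index)  # the default range index
print(s.name)   # the series name
print(s.dtype)  # the data type of the values

# to_numpy() exposes the values as a NumPy ndarray.
values = s.to_numpy()
```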

Axis

The Series has just one axis, "axis 0", which runs along the Series values, pointing "downwards".

The Series axes property gives access to a one-element list containing the Series's Index:

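import pandas as pd
s = pd.Series(range(6))  # sample series with six values and the default RangeIndex
assert len(s.axes) == 1
print(s.axes)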

[RangeIndex(start=0, stop=6, step=1)]

Index

https://pandas.pydata.org/docs/reference/api/pandas.Series.index.html

Also see:

Pandas Concepts | Index

RangeIndex

RangeIndex

Time Series Index

A time series is a series whose index contains datetime objects. To create a time series, ensure that the method that creates the series performs the datetime conversion automatically, as shown in the Create a Time Series from CSV section.

Name

A series has a name, accessible with .name.

Investigate a Series

The total number of elements of a series, also known as its size or length, can be obtained with the series's size attribute, which returns the same value as the Python len() function applied to the series:

size = s.size
same_size = len(s)
assert size == same_size

Other commonly inspected properties are the number of elements, the value of the first index and the value of the last index.
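
One possible way to obtain the number of elements and the first and last index values is sketched below. The sample series and its string labels are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with explicit string labels.
s = pd.Series([5, 7, 9], index=["a", "b", "c"])

n = len(s)                 # number of elements, same as s.size
first_label = s.index[0]   # value of the first index
last_label = s.index[-1]   # value of the last index
print(n, first_label, last_label)
```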

Create a Series

Create a Series Programmatically

A series can be created from an in-memory list:

import pandas as pd

a = ['a', 'b', 'c']
s = pd.Series(a)

A series can also be created from data stored externally.

From a DataFrame
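
A minimal sketch, assuming a small hypothetical DataFrame: selecting one column with the index operator, or one row with iloc, yields a Series.

```python
import pandas as pd

# Hypothetical DataFrame used only for illustration.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})

col = df["y"]     # a column is a Series named after the column
row = df.iloc[0]  # a row is a Series indexed by the column labels
```

The column series shares the DataFrame's index, while the row series uses the column labels as its index.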

Create a Series from CSV

Pandas CSV | Create a Series from CSV

Create a Time Series from CSV

Pandas CSV | Create a Time Series from CSV

Create a Series from JSON

Parse: https://pandas.pydata.org/docs/reference/api/pandas.read_json.html#pandas.read_json

Also see:

datetime
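
As a sketch, a flat JSON object can be parsed into a series by passing typ="series" to pd.read_json(). The JSON text below is a made-up example, and it is wrapped in io.StringIO so that it is treated as file-like input rather than as a path.

```python
import io

import pandas as pd

# Hypothetical JSON document: keys become index labels, values become series values.
json_text = '{"alpha": 1, "beta": 2, "gamma": 3}'

s = pd.read_json(io.StringIO(json_text), typ="series")
print(s)
```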

Accessing Elements of a Series

This is known as indexing or subset selection.

The Index Operator `[...]`

Do not access an element by integer position with the indexing operator []. It may work, but this usage has been deprecated; use iloc instead.

`iloc[]`

Access using integer positions (0-based).

s.iloc[0]

`loc[]`

Access using index values (labels).

s.loc[0]
s.loc['2023-10-10']

`index[]`

Return the index value (label) found at a given integer position.

s.index[0]
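
The difference between the three accessors is easier to see on a series whose labels are not integers. The sketch below assumes a small hypothetical time series with daily dates.

```python
import pandas as pd

# Hypothetical time series with three daily observations.
s = pd.Series(
    [1.5, 2.5, 3.5],
    index=pd.to_datetime(["2023-10-09", "2023-10-10", "2023-10-11"]),
)

by_position = s.iloc[1]         # value at integer position 1
by_label = s.loc["2023-10-10"]  # value whose index label matches the date
label_at_0 = s.index[0]         # the index label itself, not the value
```

Here iloc and loc both return series values, while index[] returns a label (a Timestamp in this case).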

Operations on Series

Filtering

Index for Condition

Return the index values for which the series values meet a certain condition:

s.index[<condition>]

s.index[s == 0]

Will return:

DatetimeIndex(['2008-04-06', '2008-05-04', '2008-06-07', '2008-07-05',
               '2008-08-16', '2008-09-06', '2008-09-20', '2008-10-12',
               '2012-04-12'],
              dtype='datetime64[ns]', name='Date', freq=None)
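
A self-contained sketch of the same technique on a tiny hypothetical series (the dates and values are invented for illustration and are unrelated to the output above):

```python
import pandas as pd

# Hypothetical daily series containing two zero values.
dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"])
s = pd.Series([0, 3, 0], index=dates)

# The boolean mask (s == 0) selects the index labels whose values are zero.
zero_dates = s.index[s == 0]
print(zero_dates)
```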

Dropping Values

Keep only the elements whose values make the expression evaluate to true:

s = s[<expression>]

Drop all zero values:

s = ...
s = s[s != 0]
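
A runnable sketch of the zero-dropping filter, using invented sample values. Note that the surviving elements keep their original index labels.

```python
import pandas as pd

# Hypothetical sample data with the default RangeIndex.
s = pd.Series([0, 4, 0, 7])

# Keep only the elements for which the condition is true.
non_zero = s[s != 0]
print(non_zero)
```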

Extract Values Between Certain Index Limits

`loc[]`

For a time series, use loc[] to apply a slice to the index values.

s = s.loc['2023-09-17':'2023-10-05']

s = s.loc['2023-09-17':]
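
The sketch below applies both forms of the slice to a small hypothetical daily series. Unlike positional slicing, label-based slicing with loc[] includes both endpoints.

```python
import pandas as pd

# Hypothetical daily series: 2023-09-15 through 2023-09-19, values 0..4.
idx = pd.date_range("2023-09-15", periods=5, freq="D")
s = pd.Series(range(5), index=idx)

window = s.loc["2023-09-16":"2023-09-18"]  # both limits are included
tail = s.loc["2023-09-18":]                # from the limit to the end
```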

Transformation

This class of operations is referred to as transformations or conversions.

`apply()`

Each element of the series can be transformed by applying the function specified as argument to apply().

The function can be a named function or a lambda.

Note that apply() does not convert the elements in place; it creates a new series instead.

`apply()` a Named Function

For example, if the elements of the series are dollar values in the format "$1,234", to convert them to integers, use:

s = ...
def convert_dollar_str_to_int(value: str) -> int:
    # Drop the leading '$' and the thousands separators, then parse the digits.
    return int(value[1:].replace(',', ''))
s = s.apply(convert_dollar_str_to_int)
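
A self-contained version of the same conversion, run on a few invented dollar strings, might look as follows. The variable names prices and as_int are illustrative only.

```python
import pandas as pd


def convert_dollar_str_to_int(value: str) -> int:
    # Drop the leading '$' and the thousands separators, then parse the digits.
    return int(value[1:].replace(",", ""))


# Hypothetical sample data in the "$1,234" format.
prices = pd.Series(["$1,234", "$56", "$7,890,123"])

# apply() returns a new series; prices itself is left unchanged.
as_int = prices.apply(convert_dollar_str_to_int)
```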

`apply()` a Lambda

s = ...
s = s.apply(lambda x: x * 1.1)  # apply() returns a new series, so keep the result
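
As a sketch with invented values, the lambda form can be checked like this. The result is bound to a new name to show that the original series is not modified.

```python
import pandas as pd

# Hypothetical sample data.
s = pd.Series([100.0, 200.0])

# Scale each element by 1.1; the original series is left untouched.
scaled = s.apply(lambda x: x * 1.1)
print(scaled)
```

For simple arithmetic like this, the vectorized expression s * 1.1 gives the same values without calling a Python function per element.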

Interpolation

Time Series Resampling and Interpolation

Binary Operations with Series

Also see: https://www.geeksforgeeks.org/python-pandas-series/

Binary operations match elements by index label, so the series must be identically sampled, meaning they must have the same index values:

sp500_perc_diff = fid_slf.sub(sp500_perf).div(sp500_perf).mul(100)
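
The sketch below shows why identical sampling matters, using two small hypothetical series whose indexes only partly overlap. Labels present in just one of the series produce NaN in the result.

```python
import pandas as pd

# Hypothetical series; only the label "mon" is present in both.
a = pd.Series([110.0, 120.0], index=["mon", "tue"])
b = pd.Series([100.0, 100.0], index=["mon", "wed"])

# Percentage difference of a relative to b, aligned on index labels.
perc_diff = a.sub(b).div(b).mul(100)
print(perc_diff)
```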

Using a matplotlib Plot with Pandas Series

Using a matplotlib Plot with Pandas Series


Pandas Series: Difference between revisions
