Pandas Series: Difference between revisions

From NovaOrdis Knowledge Base

Revision as of 20:23, 8 October 2023

External

Internal

Overview

A series is a one-dimensional array of values, where each value has a label. The labels are referred to as "axis labels" and they are managed by the series' index. By default, in the absence of any explicit specification, a series gets a monotonic integer range index that starts at 0 and has a step of 1, which allows retrieving data with 0-based integer indexes (see Accessing Elements of a Series below).
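
As a minimal sketch of the default index described above, the following creates a series without specifying an index and inspects what it gets. The sample values are illustrative only.

```python
import pandas as pd

# No index is specified, so the series gets the default integer range index.
s = pd.Series(['a', 'b', 'c'])

default_index = s.index   # a RangeIndex starting at 0 with step 1
first = s[0]              # the label 0 exists in the default index
```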

Data stored in series can be

Index

https://pandas.pydata.org/docs/reference/api/pandas.Series.index.html

RangeIndex

RangeIndex

Time Series Index

A time series is a series whose index contains datetime objects. To create a time series, ensure that the method that creates the series performs the datetime conversion automatically, as shown in the Create a Time Series from CSV section.

Name

A series has a name, accessible with .name.
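
A short sketch of reading and setting the name. The names 'price' and 'cost' are made up for illustration.

```python
import pandas as pd

# The name can be given at construction time.
s = pd.Series([10, 20, 30], name='price')
series_name = s.name

# A series created without a name has the name None.
unnamed = pd.Series([1, 2, 3]).name

# rename() with a scalar argument returns a copy with a different name.
renamed = s.rename('cost')
```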

Create a Series

From an in-Memory List

A series can be created from an in-memory list:

import pandas as pd

a = ['a', 'b', 'c']
s = pd.Series(a)

A series can also be created from data stored externally.

From a DataFrame

Create a Series from CSV

https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html#pandas.read_csv

To create a series from a CSV file:

import pandas as pd

# TODO
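
One possible approach, sketched under assumptions: read_csv() returns a DataFrame, and a single column of that DataFrame is a series. The column names and the inline CSV text below are hypothetical. An in-memory buffer stands in for a file so that the example is self-contained; a file path can be passed to read_csv() in its place.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would usually be a file on disk.
csv_text = "item,price\napple,3\npear,5\n"

# read_csv() produces a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Selecting one column yields a Series named after that column.
prices = df['price']
```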

Create a Time Series from CSV
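
A sketch of one way to do this, assuming the CSV has a column of dates: read_csv() can parse that column as datetime with parse_dates and use it as the index with index_col. The column names and the inline data are hypothetical.

```python
import io
import pandas as pd

# Hypothetical CSV content with a date column and a value column.
csv_text = "date,value\n2023-10-01,1.5\n2023-10-02,2.5\n"

# parse_dates converts the 'date' column to datetime while reading,
# and index_col makes it the index of the resulting DataFrame.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'], index_col='date')

# The selected column is a series whose index holds datetime values.
ts = df['value']
second_day = ts.loc[pd.Timestamp('2023-10-02')]
```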

Create a Series from JSON

Parse: https://pandas.pydata.org/docs/reference/api/pandas.read_json.html#pandas.read_json

Also see:

datetime

Accessing Elements of a Series

This is known as indexing or subset selection.

iloc[]
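
iloc[] selects elements by 0-based integer position, regardless of the labels in the index. A minimal sketch with illustrative labels and values:

```python
import pandas as pd

# A series with string labels, so position and label are clearly distinct.
s = pd.Series(['a', 'b', 'c'], index=['x', 'y', 'z'])

first = s.iloc[0]      # element at position 0
last = s.iloc[-1]      # negative positions count from the end
head = s.iloc[0:2]     # a slice returns a series; the end position is excluded
```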

Operations on Series

Filtering

Transformation

This class of operations is referred to as transformations or conversions.

apply()

Each element of the series can be transformed by applying the function specified as argument to apply().

The function can be a named function or a lambda.

For example, if the elements of the series are dollar values in the format "$1,234", to convert them to integers, use:

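s = ...  # a series of strings such as "$1,234"
def convert_dollar_str_to_int(value: str) -> int:
    # Drop the leading "$" and the thousands separators, then parse as an integer.
    return int(value[1:].replace(',', ''))
s = s.apply(convert_dollar_str_to_int)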

TODO lambda.
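
The same dollar-string conversion can be sketched with a lambda instead of a named function. The sample values are made up for illustration.

```python
import pandas as pd

# Sample dollar values in the "$1,234" format.
s = pd.Series(['$1,234', '$56', '$7,890,123'])

# The lambda strips the leading "$" and the commas, then converts to int.
s = s.apply(lambda v: int(v[1:].replace(',', '')))
```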

Binary Operations

TO PROCESS: https://www.geeksforgeeks.org/python-pandas-series/