Pandas CSV

From NovaOrdis Knowledge Base
=External=
* https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
=Internal=
* [[Pandas_Concepts#CSV|Pandas Concepts]]

=Create a Series from CSV=
To create a series from a CSV file, load the file into a DataFrame with [https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html#pandas.read_csv read_csv()] and select one column:
<syntaxhighlight lang='py'>
import pandas as pd

# data_file is the path to the CSV file
df = pd.read_csv(data_file)
# selecting a single column of the DataFrame yields a Series
ts = df["Some Column"]
</syntaxhighlight>
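A minimal, self-contained sketch of the same idea, assuming a small in-memory CSV in place of <code>data_file</code>. The <code>Name</code> and <code>Score</code> columns are hypothetical. It also shows one possible variation: reading only the needed column with <code>usecols</code> and squeezing the resulting one-column DataFrame into a Series.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for data_file
csv_text = "Name,Score\nalice,10\nbob,20\n"

# Option 1: read the whole file, then select one column as a Series
df = pd.read_csv(io.StringIO(csv_text))
scores = df["Score"]

# Option 2: read only the needed column, then squeeze the
# one-column DataFrame into a Series
scores_only = pd.read_csv(io.StringIO(csv_text), usecols=["Score"]).squeeze("columns")

print(type(scores).__name__)
print(scores_only.equals(scores))
```

Option 2 may be preferable for wide files, since the columns that are not needed are never loaded.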


=Create a Time Series from CSV=
<syntaxhighlight lang='py'>
import pandas as pd

# data_file is the path to the CSV file; column_name is the column to convert
# parse the "Date" column as datetime
df = pd.read_csv(data_file, parse_dates=["Date"])

# index by date to make it a time series DataFrame
df = df.set_index('Date')

# convert a dollar amount string such as "$1,234" to an int
def dollar_to_int(s):
    if isinstance(s, str):
        return int(s[1:].replace(',', ''))
    # missing values (NaN) pass through unchanged; we will interpolate later
    return s

ts = df[column_name].apply(dollar_to_int)
</syntaxhighlight>
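The interpolation step mentioned in the code comment is not shown above. The following is a sketch of how the whole flow might look end to end, assuming a hypothetical in-memory CSV with a <code>Date</code> column and a <code>Price</code> column of dollar strings, one of which is missing. The sample data, the <code>Price</code> column name and the choice of time-weighted interpolation are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data; the middle row has no price
csv_text = (
    "Date,Price\n"
    '2023-01-01,"$1,200"\n'
    "2023-01-02,\n"
    '2023-01-03,"$1,400"\n'
)

# Parse "Date" as datetime and use it as the index in one call
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

def dollar_to_int(s):
    # "$1,200" -> 1200; anything that is not a string (NaN here) passes through
    if isinstance(s, str):
        return int(s[1:].replace(",", ""))
    return s

# Convert to numbers, then fill the gap using the spacing of the dates
ts = df["Price"].apply(dollar_to_int).astype(float)
ts = ts.interpolate(method="time")

print(ts)
```

Passing <code>index_col="Date"</code> to <code>read_csv</code> is an alternative to the separate <code>set_index('Date')</code> call. Either form produces a DataFrame indexed by date.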

Latest revision as of 03:19, 31 October 2023
