Pandas CSV

From NovaOrdis Knowledge Base

External

Internal

Create a Series from CSV

import pandas as pd

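# data_file is the path (or URL) of the CSV file to load
df = pd.read_csv(data_file)
# selecting a single column by label returns a Series
ts = df["Some Column"]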
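The snippet above can be tried without a file on disk. The following is a minimal sketch that reads an in-memory CSV instead of data_file; the CSV content and column names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data_file
csv_text = "Name,Some Column\na,1\nb,2\nc,3\n"

df = pd.read_csv(io.StringIO(csv_text))

# Selecting a single column by label returns a Series
s = df["Some Column"]

# Alternative: load only that column and squeeze the one-column DataFrame
s2 = pd.read_csv(io.StringIO(csv_text), usecols=["Some Column"]).squeeze("columns")
```

Both s and s2 are Series named "Some Column"; the second form avoids loading columns that are not needed.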

Create a Time Series from CSV

import math
import pandas as pd

# parse the "Date" column as datetime
df = pd.read_csv(data_file, parse_dates=["Date"])

# make it a time series DataFrame
df = df.set_index('Date')

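# declare the function that converts a dollar amount string such as "$1,234" to an int
def dollar_to_int(s):
    if isinstance(s, str):
        # drop the leading "$" and the thousands separators
        return int(s[1:].replace(',', ''))
    elif math.isnan(s):
        return s  # keep NaN as is, we will interpolate later

# column_name is the label of the column that holds the dollar amounts
ts = df[column_name].apply(dollar_to_int)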
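The comment in the converter mentions interpolating later. A minimal end-to-end sketch of that step follows, assuming a hypothetical "Price" column and made-up values; it uses a simplified converter that passes any non-string value through unchanged, and Series.interpolate with method="time" to fill the gap.

```python
import io

import pandas as pd

# Hypothetical data: dollar amounts as strings, with one missing value
csv_text = '''Date,Price
2021-01-01,"$1,000"
2021-01-02,
2021-01-03,"$3,000"
'''

# Parse "Date" as datetime and use it as the index in one call
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

def dollar_to_number(s):
    # Strings like "$1,000" become 1000; missing values (NaN) pass through
    if isinstance(s, str):
        return int(s[1:].replace(",", ""))
    return s

# Convert, then force a float dtype so NaN and numbers share one type
raw = df["Price"].apply(dollar_to_number).astype(float)

# Fill the missing value, weighting by the time distance between points
filled = raw.interpolate(method="time")
```

With evenly spaced dates, method="time" gives the same result as the default linear interpolation; it only differs when the gaps between timestamps are uneven.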