Pandas DataFrame
Revision as of 20:36, 8 October 2023
External
- https://pandas.pydata.org/docs/user_guide/dsintro.html#dataframe
- https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html#pandas.DataFrame
Internal
Overview
A DataFrame is a two-dimensional data structure with columns of potentially different types. The data structure also contains labeled axes, for both rows and columns.
A DataFrame can be thought of as a dict-like container for Series objects, where each column is a Series. The dimensionality of the DataFrame is given by its shape property.
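A minimal sketch of the dict-like view, assuming pandas is imported as pd. The column names and values are made up for illustration:

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "Date": pd.Series(["2023-10-06", "2023-10-07", "2023-10-08"]),
    "Price": pd.Series([10.5, 11.0, 10.8]),
})

# Like a dict, the DataFrame exposes its column labels as keys.
print(list(df.keys()))              # ['Date', 'Price']

# Each column is a Series, and columns may have different types.
date_column = df["Date"]
price_column = df["Price"]
print(type(date_column).__name__)   # Series
print(price_column.dtype)           # float64
```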
Shape
shape is a property of the DataFrame that holds a tuple with the dimensions of the DataFrame, in the order (rows, columns).
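For example, with a small made-up DataFrame of three rows and two columns:

```python
import pandas as pd

# Hypothetical data: 3 rows, 2 columns.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# shape is a property, not a method, so it is read without parentheses.
rows, columns = df.shape
print(df.shape)        # (3, 2)
print(rows, columns)   # 3 2
```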
Index
By default, the DataFrame gets a RangeIndex.
However, the index of the DataFrame can be replaced with set_index(), which by default returns a new DataFrame and leaves the original unchanged.
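A short sketch of replacing the default index, using a made-up 'Date' column as the new index:

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"Date": ["2023-10-06", "2023-10-07"], "Price": [10.5, 11.0]})
print(type(df.index).__name__)   # RangeIndex

# set_index() returns a new DataFrame by default; df itself is unchanged.
by_date = df.set_index("Date")
print(by_date.index.name)        # Date
print(list(by_date.columns))     # ['Price']
print(type(df.index).__name__)   # RangeIndex
```

The column used as the index is removed from the regular columns of the new DataFrame.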
Create a DataFrame
Create a DataFrame from a CSV File
Accessing Elements of a DataFrame
Accessing a Column
An individual column can be accessed with the []
operator, by specifying the column index (0-based) or the column name. The result is a Series:
df = ...
df[0]       # the column labeled 0
df['Date']  # the column named 'Date'
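A minimal sketch of both forms, with made-up data. In pandas, [] looks a column up by its label, so df[0] succeeds when a column is actually labeled 0, as happens when the DataFrame is built without column names:

```python
import pandas as pd

# Hypothetical data with named columns.
named = pd.DataFrame({"Date": ["2023-10-06", "2023-10-07"], "Price": [10.5, 11.0]})
dates = named["Date"]   # column selected by name, returns a Series

# Without explicit names, columns are labeled 0, 1, ... so [0] finds a label.
unnamed = pd.DataFrame([["2023-10-06", 10.5], ["2023-10-07", 11.0]])
first = unnamed[0]

# On named columns, 0 is not a label, so [] raises KeyError.
try:
    named[0]
    found = True
except KeyError:
    found = False
print(found)  # False
```

To select a column by position regardless of its label, iloc[] is the safer choice.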
iloc[]
A property that allows integer-based access (indexing). The location is specified as a 0-based index position. The property accepts a wide variety of arguments.
Used in the following situations:
Extract a Series from the DataFrame
iloc[] can be used to extract a Series from the DataFrame. The first argument selects rows by position: the slice : selects all rows, and therefore the entire column. The second argument specifies the 0-based position of the column in the DataFrame. The Series keeps the row index of the DataFrame, which is a RangeIndex unless the index has been replaced:
df = ...
# extract a series corresponding to DataFrame column 0
s = df.iloc[:, 0]
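A few of the argument forms iloc[] accepts, sketched on made-up data with a non-default row index. Note that the extracted Series carries the row index of the DataFrame:

```python
import pandas as pd

# Hypothetical data with a non-default row index.
df = pd.DataFrame(
    {"Date": ["2023-10-06", "2023-10-07", "2023-10-08"], "Price": [10.5, 11.0, 10.8]},
    index=["x", "y", "z"],
)

col = df.iloc[:, 1]        # all rows, column at position 1: a Series
cell = df.iloc[0, 1]       # row 0, column 1: a scalar
first_two = df.iloc[0:2]   # rows at positions 0 and 1: a DataFrame
last_row = df.iloc[-1]     # negative positions count from the end: a Series

print(col.name)            # Price
print(list(col.index))     # ['x', 'y', 'z']
print(cell)                # 10.5
print(first_two.shape)     # (2, 2)
```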
loc[]
A property that allows label-based access (indexing).
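A minimal sketch of label-based access, assuming a made-up DataFrame whose row index holds date strings:

```python
import pandas as pd

# Hypothetical data, indexed by date strings.
df = pd.DataFrame(
    {"Price": [10.5, 11.0, 10.8], "Volume": [100, 150, 120]},
    index=["2023-10-06", "2023-10-07", "2023-10-08"],
)

price = df.loc["2023-10-07", "Price"]        # row label, column label: a scalar
row = df.loc["2023-10-06"]                   # one row label: a Series
window = df.loc["2023-10-06":"2023-10-07"]   # label slices include both endpoints

print(price)          # 11.0
print(window.shape)   # (2, 2)
```

Unlike positional slices with iloc[], a label slice with loc[] includes its end label.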