Pandas DataFrame: Difference between revisions
Revision as of 18:47, 8 October 2023
External
- https://pandas.pydata.org/docs/user_guide/dsintro.html#dataframe
- https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html#pandas.DataFrame
Internal
Overview
A DataFrame is a two-dimensional data structure with columns of potentially different types. The data structure also contains labeled axes, for both rows and columns.
It can be thought of as a dict-like container for Series objects, where each column is a Series. The dimensionality of the DataFrame is given by its shape property.
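A minimal sketch of the dict-like view of a DataFrame. The column names and values are hypothetical and serve only as illustration:

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [1.5, 2.0, 3.5]})

# Selecting a column by key returns a Series, much like a dict lookup.
col = df["score"]
print(type(col).__name__)

# The column labels play the role of dict keys.
print(list(df.keys()))

# Each column carries its own dtype.
print(df.dtypes)
```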
Shape
shape is a property of the DataFrame. It holds a tuple with the dimensionality of the DataFrame, in the order (rows, columns).
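As a small illustration, assuming a hypothetical three-row, two-column DataFrame, the tuple can be read directly or unpacked:

```python
import pandas as pd

# Hypothetical data: 3 rows, 2 columns.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

# shape is an attribute, not a method: no parentheses.
print(df.shape)  # (3, 2)

# The tuple unpacks into row count and column count.
rows, cols = df.shape
```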
Index
By default, when no index is specified at creation, the DataFrame gets a RangeIndex.
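A short sketch of the default index, next to a DataFrame built with explicit labels for contrast. The data and labels are hypothetical:

```python
import pandas as pd

# No index given: pandas assigns a RangeIndex starting at 0.
df = pd.DataFrame({"x": [10, 20, 30]})
print(df.index)  # RangeIndex(start=0, stop=3, step=1)

# Explicit labels replace the default RangeIndex.
labeled = pd.DataFrame({"x": [10, 20, 30]}, index=["a", "b", "c"])
print(labeled.index)
```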
Create a DataFrame
Create a DataFrame from a CSV File
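One possible approach, sketched under assumptions, is pandas.read_csv(). The CSV content below is hypothetical and is read from an in-memory buffer so the example is self-contained; a file path can be passed in its place:

```python
import io

import pandas as pd

# Hypothetical CSV content; the first line is used as the header row.
csv_text = "name,score\na,1.5\nb,2.0\n"

# read_csv() accepts a path or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)  # (2, 2)
```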
Accessing Elements of a DataFrame
iloc[]
A property that allows integer-based access (indexing). The location is specified as a 0-based index position. The property accepts a wide variety of arguments.
Used in the following situations:
Extract a Series from the DataFrame
iloc[] can be used to extract a Series from the DataFrame. The first argument selects the rows: the slice : selects all of them, which extracts the entire column. The second argument specifies the 0-based position of the column in the DataFrame. The resulting Series keeps the row index of the DataFrame, which is a RangeIndex if the DataFrame uses the default index:
df = ...
# extract a series corresponding to DataFrame column 0
s = df.iloc[:,0]
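A runnable variant of the extraction, using a hypothetical DataFrame with string row labels to show that the positions passed to iloc[] are independent of the labels, and that the extracted Series carries the DataFrame's row index:

```python
import pandas as pd

# Hypothetical data with explicit row labels.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["a", "b", "c"])

# All rows, column at position 0: a Series named after the column.
s = df.iloc[:, 0]
print(s.name)         # x
print(list(s.index))  # ['a', 'b', 'c']

# A single integer selects a row by position, also returned as a Series.
first_row = df.iloc[0]

# Two integers select a single cell: row position 1, column position 1.
cell = df.iloc[1, 1]
print(cell)           # 5
```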
loc[]
A property that allows label-based access (indexing).
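A minimal sketch of label-based access, assuming a hypothetical DataFrame with string row labels:

```python
import pandas as pd

# Hypothetical data with explicit row labels.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["a", "b", "c"])

# A single row label returns that row as a Series.
row = df.loc["b"]

# Row label and column label together return a single cell.
cell = df.loc["b", "y"]
print(cell)  # 5

# All rows of one column, selected by column label.
col = df.loc[:, "x"]

# Unlike positional slices, label slices include both endpoints.
subset = df.loc["a":"b"]
print(subset.shape)  # (2, 2)
```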