Python for Data Analysis
=Internal=
* [[Python_Engineering#Subjects|Python Engineering]]
* [[Statistical_Concepts|Statistical Concepts]]
* [[Data_Science|Data Science]]
=Overview=
This article is loosely based on [https://www.amazon.com/Python-Data-Analysis-Wrangling-Jupyter/dp/109810403X/ Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter 3rd Edition] by Wes McKinney.
The packages referenced in this article focus on structured data. This includes tabular or spreadsheet-like data in which each column may have a different type (relational database tables, spreadsheets and CSV files), multidimensional arrays (matrices), multiple tables related by key columns, and evenly or unevenly spaced time series.
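As a rough illustration of the kinds of structured data listed above, the following sketch builds one small instance of each with pandas and NumPy. The table names, column names and values are invented for illustration and do not come from the book.

```python
import numpy as np
import pandas as pd

# Tabular data: each column may hold a different type (hypothetical values).
orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer_id": [10, 20, 10],
    "amount": [25.0, 40.5, 12.25],
})
customers = pd.DataFrame({
    "customer_id": [10, 20],
    "name": ["Ada", "Grace"],
})

# Multiple tables related by a key column: a left join on customer_id.
joined = orders.merge(customers, on="customer_id", how="left")

# Multidimensional array (a 2 x 3 matrix).
matrix = np.arange(6).reshape(2, 3)

# Evenly spaced time series: one observation per day.
even = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)

# Unevenly spaced time series: irregular gaps between timestamps.
uneven = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-05"]),
)
```

Both time series use a DatetimeIndex. Only the evenly spaced one carries a fixed frequency.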
Python is well suited to data analysis because many specialized data processing libraries are available:
* [[Numpy|NumPy]]
* [[Pandas|pandas]]
* [[scikit-learn]]
* [[SciPy]]
* [[statsmodels]]
* [[PyTorch]]
Visualization libraries are also available in Python:
* [[matplotlib]]
* [[plotly]]
Python programs can be run from [[IPython]], [[Jupyter Notebook]] and [[Jupyter Lab]].
In addition, Python's overall strength as a general-purpose software engineering language makes it a good glue language for data analysis applications.
Latest revision as of 23:50, 14 May 2024
=External=
* [https://www.amazon.com/Python-Data-Analysis-Wrangling-Jupyter/dp/109810403X/ Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter 3rd Edition] by Wes McKinney