Module 4

Pandas

Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labelled” data both easy and intuitive. It can help you clean, transform, and analyse your data: importing data from files, removing missing values, filtering rows, calculating summary statistics, and much more.
Pandas is built on top of the NumPy package and preserves much of its structure. Data processed in pandas is often used to feed statistical analysis in SciPy, plotting functions in Matplotlib, and machine learning algorithms in scikit-learn. We will learn what some of these packages do, and how to use them, in later modules.
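
As a rough sketch of the kind of workflow described above, the snippet below builds a small table in memory (rather than importing it from a file), removes rows with missing values, filters rows, computes a statistic, and hands a column over to NumPy. The column names and values are made up purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for this example
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cara", "Dev"],
        "age": [34, 27, np.nan, 45],
        "city": ["Lima", "Oslo", "Pune", "Oslo"],
    }
)

# Remove rows that contain missing values (Cara has no age)
cleaned = df.dropna()

# Filter rows: keep only people older than 30
over_30 = cleaned[cleaned["age"] > 30]

# Calculate a statistic on one column
mean_age = cleaned["age"].mean()

# Pandas sits on top of NumPy, so a column converts to a NumPy array
ages_array = cleaned["age"].to_numpy()

print(over_30)
print(mean_age)
print(type(ages_array))
```

Each of these operations is covered in more detail later; the point here is only that a few short lines are enough to go from raw data to a summary.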

Before starting this module, make sure pandas is installed. If it is not, you can install it by running pip install pandas in your terminal.
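
One simple way to check that the installation worked is to import the package and print its version. The alias pd is a widely used convention, not a requirement, and the version string you see will depend on your environment.

```python
import pandas as pd

# If this import succeeds, pandas is installed and available
installed_version = pd.__version__
print(installed_version)
```

If the import raises ModuleNotFoundError, pandas is not installed in the Python environment you are running.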

References:

  1. https://www.learndatasci.com/tutorials/python-pandas-tutorial-complete-introduction-for-beginners/

  2. https://pandas.pydata.org/docs/getting_started/overview.html