Module 7

Scikit-learn

Scikit-learn (imported as sklearn) is one of the most widely used and robust Python libraries for machine learning. It provides a wide selection of efficient tools for machine learning and statistical modeling, including classification, regression, clustering, and dimensionality reduction, that can be set up and applied to a dataset in just a few lines of code. It’s versatile and integrates well with other Python libraries, such as matplotlib, numpy, and pandas.
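To make the "few lines of code" claim concrete, here is a minimal sketch of a classification workflow. The choice of the bundled iris dataset, a k-nearest-neighbors classifier, the 25% test split, and the random seed are all assumptions made for illustration; they are not taken from the notebooks that follow.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small dataset that ships with scikit-learn (no download needed).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation (illustrative choice).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Create the model, fit it on the training data, and predict on the test data.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Fraction of test samples classified correctly.
accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators follow the same pattern of creating the object, calling fit, and then calling predict, so swapping in a different classifier usually means changing only the import and the line that constructs the model.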

Before we get started with this module, if you have not yet installed scikit-learn, you can do so by running pip install scikit-learn in your terminal.
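One quick way to confirm that the installation worked is to import the package and print its version. This is only a sanity check; the version you see depends on your environment.

```python
import sklearn

# The package is installed under the import name "sklearn".
installed_version = sklearn.__version__
print("scikit-learn version:", installed_version)
```

If the import fails with ModuleNotFoundError, the package was most likely installed into a different Python environment than the one running your notebook.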

The next few Jupyter notebooks (based on scikit-learn) are taken from the Python Data Science Handbook by Jake VanderPlas. It is a wonderful resource, and one that I urge you to check out whenever you have time. It is quite detailed and contains many examples, in-depth analyses, and case studies related to data science and data analysis.

References:

  1. https://www.tutorialspoint.com/scikit_learn/scikit_learn_introduction.htm

  2. https://www.activestate.com/resources/quick-reads/what-is-scikit-learn-in-python/