Introduction

Hi everyone! This is Arpit wishing you a warm welcome to the course on Python for Data Science!

The course requires no previous knowledge of programming or Python, but if you have some, it will give you a head start. In that case the first few classes might be a bit boring for you; I would still suggest you stick around and do the assignments, as they will test your grasp of certain things and help you refresh your memory. Before we jump into Python, here are a few key things that will help you set up Python on your system for the course and tell you where to find help if you are ever stuck (either in the course or in any other project you might be working on).

Installing Python

Many systems come with Python pre-installed, especially Linux-based systems and some Windows-based HP systems.
If you don’t have Python installed on your system, you can easily install it by following the instructions on this website.

For the course, we will be using Python 3.11.3, so I would suggest installing the same version. No worries if you have a different version, though; just make sure it is Python 3.x.x and not Python 2.x.x.
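
If you are not sure which version you have, a small check like the one below can tell you. This is only a minimal sketch: `is_python3` is a helper name made up for illustration, not something that ships with Python.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the major version number is 3."""
    return version_info[0] == 3


# sys.version starts with the version number, e.g. "3.11.3 (...)".
print("Running Python", sys.version.split()[0])

if is_python3():
    print("Python 3 detected; you are good to go.")
else:
    print("This looks like Python 2; please install Python 3.")
```

You can also pass a version tuple yourself, for example `is_python3((2, 7, 18))`, to see how the check behaves for an older interpreter.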

NOTE: Alternative to installing Python on your system

If, for some reason, you do not want to install Python on your system, you can still run all the code and get your work done online in your browser using Google Colab. Here is a link to get you started.

Getting started and looking for help

You can get started by reading the online documentation. There is a brief official tutorial that gives you basic information about the language and how to get started. You can also take a look at the library reference for a full description of Python’s many pre-installed libraries and the language reference for a complete explanation of Python’s syntax.

If you have any doubts about any of the above, or about something else, try looking for your question on the FAQ page. If you don’t find it there, you can just google it, including the word “Python” in your query. There are a lot of online resources that can help you out, like Stack Overflow, GeeksforGeeks, etc. There’s a high chance that most of the issues or doubts you have were already asked on Stack Overflow and that someone has answered them with a detailed explanation.
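
Python itself can also answer many quick questions without leaving the interpreter. The sketch below uses the built-in `help()` and `dir()` functions and the `__doc__` attribute; `len` and `str` are just example targets I picked, and `string_methods` is an illustrative variable name.

```python
# help() prints the documentation for a function, class, or module.
help(len)

# The same documentation text is available as a plain string via __doc__.
print(len.__doc__)

# dir() lists the names an object provides; here we keep only the
# public string methods (those not starting with an underscore).
string_methods = [name for name in dir(str) if not name.startswith("_")]
print(string_methods[:5])
```

This works the same way for the packages used in the course once they are imported, so it is a handy first stop before searching online.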

Where to edit and write files

Most lessons in the course are structured as Jupyter notebooks (interactive computational notebooks where you can write code alongside text, equations, and plots). It would be great if you used the same, at least for the course, and then later moved on to whatever makes you comfortable. I would encourage you to try out different text editors and IDEs (Integrated Development Environments) to see what works best for you.
The setup I currently use is VS Code. You can download and set up VS Code for your system easily using the link provided above. Once you have installed VS Code, I would recommend installing the Jupyter extension so that you can just download these notebooks and run them on your system.

Setting up the environment

Once you are done installing Python and setting up your editor/IDE, you should install the following Python packages:

  • jupyter

  • numpy

  • pandas

  • matplotlib

  • seaborn

  • scikit-learn

I would also highly recommend creating a virtual environment and installing all the packages in it. This keeps each project's packages separate from your system-wide Python installation, so your projects stay clean and don't interfere with each other.

Steps to set up the virtual environment

  1. Create a new folder titled “python-for-data-science”

  2. Open the terminal/command prompt in the same location

  3. Create a new virtual environment
    python -m venv venv

  4. Activate the virtual environment
    On Windows:
      venv\Scripts\activate
    On macOS/Linux:
      source venv/bin/activate

  5. Install the packages
    pip install jupyter numpy pandas matplotlib seaborn scikit-learn

  6. You are now set up to run all the code
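
To confirm the steps above worked, you can run a quick sanity check like the following from inside the activated environment. Treat it as a sketch rather than an official check: `in_virtual_env` and `installed_versions` are helper names made up for illustration, and the package names are the ones passed to `pip install` in step 5.

```python
import sys
from importlib import metadata

# Distribution names exactly as given to "pip install" in step 5.
PACKAGES = ["jupyter", "numpy", "pandas", "matplotlib", "seaborn", "scikit-learn"]


def in_virtual_env():
    """Return True when Python is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def installed_versions(names):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("Inside a virtual environment:", in_virtual_env())
for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any package shows up as not installed, check that the virtual environment is activated and run the `pip install` command from step 5 again.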

If you are using VS Code, just follow the steps here to configure VS Code to use the environment you just created. Make sure you select the environment you just created as the interpreter when VS Code prompts.

If you are using any other setup that you are comfortable with, please go ahead and look for resources on configuring it. A good place to start is the Jupyter Notebook website, as we will be using Jupyter notebooks for the course.

Voila! You are all set for the next steps in the course.