Python programming language

Python is a free programming language on a 12-month release cycle, which means a new minor version is released every year. At the time of writing, 3.8 is the current version and 3.9 is already in the beta (pre-release testing) stage. This tutorial is written for Python 3.
The Python programming language is supported by many individuals who largely donate their time to creating and maintaining this awesome product. There is limited support from the PSF (Python Software Foundation), which is a non-profit organization.
Lastly, Python is mostly about the community that contributes code and makes the ecosystem great for programmers of all levels. So feel free to just try things and fail. Especially fail.

Navigating Python installation

The easiest way to install Python for use in data science is through Anaconda. Not only will you be able to install packages using the Python package manager pip, but you will also have access to all the packages distributed by Anaconda through the conda command. If you are working on a cluster, this has been the easiest way to work with Python in my experience.

Installing Python this way might take some time and disk space on your machine, but it will put you on a good footing for future analysis.

Download a package with pip

Downloading packages (code others have written) from the Python Package Index (PyPI) is the most widely used method of package distribution. With access to many thousands of packages, you are likely to find code for whatever interests you. The command pip install numpy downloads and installs the numpy Python package. pip also detects and installs any other packages that the requested package needs, called dependencies.
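
After an install, it can be handy to confirm from Python itself that a package is present. The following is a minimal sketch, assuming Python 3.8 or newer (where the standard library module importlib.metadata is available). The helper names installed_version and declared_dependencies are made up for this illustration and are not part of pip.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def declared_dependencies(package_name):
    """Return the requirement strings a package declares (empty list if none)."""
    try:
        # metadata.requires returns None when no dependencies are declared.
        return metadata.requires(package_name) or []
    except metadata.PackageNotFoundError:
        return []


# Report on a couple of packages; the result depends on your own installation.
for name in ("numpy", "pip"):
    print(name, installed_version(name), declared_dependencies(name))
```

A result of None simply means the package is not installed in the Python you are currently running.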

Download a package with conda

conda is the command used to access the Anaconda package manager. Although we will only introduce Python packages, Anaconda also distributes packages for other languages. The command conda install numpy downloads and installs the numpy Python package.

Create environment

Using virtual environments is a great skill to familiarise yourself with. Some packages require different versions of other packages, or even different versions of Python. Creating environments allows these different versions to be installed side by side on the same machine. The conda create command creates an environment, and the -n argument gives it a name, in this case science: conda create -n science numpy ipython. The names that follow, numpy and ipython, are the packages you want installed within the new environment. You can then enter the environment with conda activate science. When you are finished, you can leave the environment with conda deactivate.
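
When you work with several environments, it is easy to lose track of which one is active. The sketch below shows one way to check from inside Python. It assumes conda's usual conventions: the CONDA_DEFAULT_ENV variable is set when an environment is activated, and a conda-meta folder sits in the environment's root directory. The function name describe_environment is invented for this example.

```python
import os
import sys


def describe_environment():
    """Collect a few facts about the Python installation that is running."""
    return {
        # Full path of the interpreter currently in use.
        "executable": sys.executable,
        # Root directory of the installation or environment.
        "prefix": sys.prefix,
        # Name of the activated conda environment, or None if not set.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # conda environments keep their package records in <prefix>/conda-meta.
        "is_conda": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If the executable path does not point inside the environment you expected, the environment is probably not activated.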

Running python code

One of the most attractive things about Python is the user's ability to interact with the code. This is done through the REPL (read-eval-print loop). Python is distributed with its own REPL, which you start by calling python. However, there is another one with a few additional features, called ipython, and it is one of the best ways to run Python code. Once it is installed, you can start it by typing ipython on the command line.

From here you can immediately start writing and running Python code.
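
As a small illustration, the lines below can be typed one at a time at the prompt. The variable names and values are arbitrary and chosen only for this example.

```python
import math

# Assign a value, compute with it, and print the result.
radius = 2.5
area = math.pi * radius ** 2
print(round(area, 2))

# Work with a list of measurements.
measurements = [1.5, 2.0, 2.5]
mean = sum(measurements) / len(measurements)
print(mean)
```

In the REPL you can also type a bare name such as area and press Enter to see its value without calling print.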

IPython gives you tab completion (Tab), access to previously entered code (Up arrow), and function documentation (?function).
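
The ?function syntax only works inside IPython. As a rough equivalent that also works in the plain python REPL, the sketch below uses the standard library to fetch a docstring and to list the names that tab completion would offer. It is an approximation of what IPython shows, not a reproduction of its behaviour.

```python
import inspect

# Similar in spirit to typing ?len in IPython: fetch the documentation text.
doc = inspect.getdoc(len)
print(doc)

# Similar in spirit to typing "st" after a string and pressing Tab:
# list the string methods whose names start with "st".
candidates = [name for name in dir(str) if name.startswith("st")]
print(candidates)
```

The built-in help function, for example help(len), prints a longer formatted version of the same documentation.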