At the end of this tutorial you will have Python installed and be able to quickly start experimenting with code.
Python programming language
Python is a free programming language on a 12-month release cycle, which means a new minor version is released every year. At the time of writing, 3.8 is the current version, with 3.9 already in the beta (developer testing) stage. This tutorial is written for Python 3.
The Python programming language is supported by many individuals who largely donate their time to create and maintain this awesome product. There is limited support from the PSF (Python Software Foundation), a non-profit organization.
Lastly, Python is mostly about the community that contributes code and makes the ecosystem great for programmers of all levels. So feel free to just try things and fail. Especially fail.
Navigating Python installation
The easiest way to install Python for data science is through Anaconda. Not only will you be able to install packages with pip, the Python package manager, but you will also have access to all the packages distributed by Anaconda through the conda command. If you are working on a cluster, this has been the easiest way to work with Python in my experience.
Installing Python in this way might take some time and space on your machine, but will set you up on the best footing for future analysis.
Download a package with pip
Downloading packages (code others have written) from the Python Package Index is the most widely used method of package distribution. With access to thousands of packages, you are likely to find code for whatever interests you. The following pip command downloads and installs the numpy Python package. Additionally, pip will detect and install any other packages that numpy needs, called dependencies.
pip install numpy
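Once the install finishes, you may want to confirm from Python that the package is actually available. The following is a minimal sketch using only the standard library. The helper names is_installed and installed_version are made up for this illustration and are not part of pip, and importlib.metadata assumes Python 3.8 or newer.

```python
import importlib.util
from importlib import metadata


def is_installed(module_name):
    """Return True if a top-level module can be imported in this environment.

    Hypothetical helper for illustration; not part of pip.
    """
    # find_spec returns None when no top-level module of that name is found
    return importlib.util.find_spec(module_name) is not None


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "numpy" is the package installed with pip above
    print("numpy importable:", is_installed("numpy"))
    print("numpy version:", installed_version("numpy"))
```

If the first line prints False, the package was most likely installed into a different Python than the one you are running.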
Download a package with conda
conda is the command used to access the Anaconda package manager. Although we will only introduce Python packages, Anaconda provides support for many more languages. The following command downloads and installs the numpy Python package.
conda install numpy
Using virtual environments is a great skill to familiarise yourself with. Some packages require different versions of other packages, or even different versions of Python. Creating environments allows these different versions to be installed side by side on the same machine. The conda create command creates an environment and the -n argument gives it a name, in this case science. The names that follow, numpy ipython, are the packages you want installed within the new environment. You can then enter the environment with conda activate. When you are finished, you can leave the environment with conda deactivate.
conda create -n science numpy ipython
conda activate science
conda deactivate
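When several environments exist on one machine, it is easy to lose track of which one a given Python session is using. The sketch below reports the running interpreter and the active conda environment. It assumes conda's usual behaviour of setting the CONDA_DEFAULT_ENV environment variable on activation, and the function name describe_environment is hypothetical.

```python
import os
import sys


def describe_environment(environ=None):
    """Summarise which interpreter and conda environment are in use.

    Hypothetical helper for illustration; `environ` defaults to os.environ.
    """
    if environ is None:
        environ = os.environ
    return {
        # Path of the running Python interpreter
        "executable": sys.executable,
        # Root directory of the Python installation or environment
        "prefix": sys.prefix,
        # Assumption: conda sets CONDA_DEFAULT_ENV on activation;
        # the value is None when the variable is not set
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run inside the activated science environment, the conda_env entry should read science and the prefix should point inside that environment's directory.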
Running python code
One of the most attractive things about Python is the user's ability to interact with the code. This is done through the REPL (Read-Eval-Print Loop). Python is distributed with its own REPL, which you start by calling python. However, there is another one with a few additional features, called ipython. One of the best ways to run Python code is through ipython, which you can start from the command line once it is installed.
conda activate science
ipython
Python 3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 18:53:43)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.13.0 -- An enhanced Interactive Python. Type '?' for help.
From here you can immediately start creating and running python code.
In [1]: print("hello world")
hello world
IPython gives you tab completion (press Tab), access to previously run code (press Up), and function documentation (type ?function).
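The ?function lookup is an IPython feature. In the plain python REPL, similar information is available through the standard library: help() and inspect.getdoc() show documentation, and dir() lists the names that tab completion would offer. The following is a small sketch of that idea. The helper names first_doc_line and public_names are invented for this example.

```python
import inspect


def first_doc_line(obj):
    """Return the first line of an object's docstring, or an empty string."""
    doc = inspect.getdoc(obj)  # cleaned-up docstring, or None if missing
    return doc.splitlines()[0] if doc else ""


def public_names(obj):
    """List attribute names that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


if __name__ == "__main__":
    # One-line summary of the built-in len function
    print(first_doc_line(len))
    # A few of the methods available on strings
    print(public_names(str)[:5])
```

For the full documentation of an object, calling help(len) directly in the REPL prints the complete text.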
It can be very helpful to listen to Python podcasts to get informed about the language and people coding in it. Try out a few of these:
Talk Python To Me
There are some great places to read up as well:
There are a lot of great courses out there too:
Talk Python Training