Part 4: Imports and packages
Imports
One of the powerful aspects of Python (and most other programming languages) is the ability to utilise external libraries and packages to extend its base functionality.
It is best practice to keep all your imports at the top of your file or Notebook.
You can either import a package or library directly:
import sklearn
or import a particular subpackage, function or object from a library:
from sklearn import datasets
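Once imported this way, you refer to the subpackage by its own name rather than through sklearn. As a minimal sketch (assuming scikit-learn is installed), you could load its built-in iris dataset:
iris = datasets.load_iris()
iris.data.shape
(150, 4)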
You can also provide an alias for an imported package using the import ... as ... syntax:
import numpy as np
Once a package has been imported, you can access its methods and objects via the package name (or its alias):
np.mean([1,2,3,4])
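Here the result is 2.5. The same pattern works for any other function or object the package exposes; for example, still using the np alias imported above:
np.array([1, 2, 3, 4]).sum()
10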
For example, after importing pandas you can check which version of the package is installed:
import pandas
print(pandas.__version__)
0.23.4
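The from ... import syntax from earlier works the same way for a single function, which can then be called without any package prefix. A minimal sketch:
from numpy import mean
mean([1, 2, 3, 4])
2.5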
You can also import functions and code from Python files (ending in .py) in your local directory.
In this example, I have created a file called random_numbers.py, and placed it in the same folder as my Jupyter Notebook.
The contents of this file are:
import random

def print_random_number():
    '''Returns a random number between 0 and 1'''
    return random.random()
We can import the function print_random_number from this file:
from random_numbers import print_random_number
print_random_number()
0.9319659068831242
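Equivalently, you could import the module itself and call the function through the module name, just like the package imports above:
import random_numbers
random_numbers.print_random_number()
This returns a different random number on each call.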
You can also import Python code from other directories; however, in the interests of time we won't discuss that further. If you are interested in how to structure a Python project or package, this is a good place to start.
Useful packages
While this list is by no means exhaustive, here are some of the key packages that you should be familiar with (i.e. they will make your life easier!); conventional import aliases for several of them are shown after the list:
- pandas: Python data analysis library. Built on top of NumPy.
- NumPy: Numerical Python library. Matrices, fast data processing, efficient subsetting of data arrays. Invaluable for any serious data work.
- SciPy: Routines for numerical integration, interpolation, optimization, linear algebra and statistics.
- Matplotlib: Plotting. More plotting. So many beautiful plots.
- Seaborn: Statistical data visualisation. Beautiful plots with minimal work. Built on top of Matplotlib.
- BioPython: A set of tools for dealing with biological data and databases.
- Scikit-Learn: Tools for data mining, data analysis and machine learning.
- StatsModels: Statistical tests and data exploration.
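As a final sketch, many of these packages are conventionally imported with short aliases at the top of a script or Notebook (assuming they are installed in your environment):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns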