Python Imports

Much of Python’s strength comes from reusing code that others have written. A module is a single file containing functions, classes, and variables. A package is a collection of related modules. You bring either into your program with the import statement.

Import Syntax

There are several ways to import:

Import a module and use its contents with dot notation:
```
import math
print(math.sqrt(25))
```
```
5.0
```

Import specific items with from:

from math import sqrt
print(sqrt(25))

5.0

Use an alias with as:

from math import sqrt as sr
print(sr(25))

5.0
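
Aliases also work on whole modules, which is the form the np and pd conventions rely on. A minimal sketch, where the alias m is an arbitrary choice made for this example:

```python
import math as m  # "m" is an arbitrary alias chosen for this example

# The module is now reached through its alias
root = m.sqrt(25)
print(root)  # 5.0
```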

Import multiple items in one line:

from math import sqrt, pi
print(sqrt(16), pi)

4.0 3.141592653589793

Import every public name with * (avoid this: imported names can silently overwrite names you have already defined or imported):
```
from math import *
print(sqrt(16), pi)
```
```
4.0 3.141592653589793
```
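
To make the name-conflict risk concrete, here is a small sketch using two standard-library modules, math and cmath, which both define sqrt. It uses explicit from imports so the clash is visible. A star import from both modules would cause the same overwrite, but without sqrt ever appearing in your code:

```python
from math import sqrt
real_root = sqrt(4)
print(real_root)       # math's version returns a float: 2.0

# A later import of the same name silently replaces the earlier one
from cmath import sqrt
complex_root = sqrt(4)
print(complex_root)    # cmath's version returns a complex number: (2+0j)
```

Nothing warns you when the second import takes over, which is why importing only the names you need is usually the safer habit.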

Built-in vs. Third-Party

Besides code you write yourself, the modules you import come from two places:

Built-in (the standard library): Included with Python. No installation needed. Examples: math, os, datetime, json, random.
Third-party (PyPI): Community-contributed packages installed with pip install package-name. Examples: numpy, pandas, geopandas.
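
If you are unsure whether a package is available in your environment, you can ask Python before importing it. The sketch below uses importlib.util.find_spec from the standard library. The helper name is_installed and the missing package name are hypothetical, chosen only for illustration:

```python
import importlib.util

def is_installed(name):
    """Return True if Python can find a top-level module or package called `name`."""
    return importlib.util.find_spec(name) is not None

print(is_installed("json"))                       # standard library, so True
print(is_installed("not_a_real_package_xyz123"))  # hypothetical name, expected False
```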

Built-in Module Examples

A few examples from the standard library:

import math
print(math.pi)
print(math.sqrt(25))

3.141592653589793
5.0

from datetime import datetime
print(datetime.now())

2026-03-11 12:56:02.673821

import json
data = {"name": "Alice", "age": 25}
print(json.dumps(data))

{"name": "Alice", "age": 25}
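
json.dumps returns a string, not a dictionary, even though the printed result looks like one. A short sketch of the round trip, using json.loads to turn the string back into a dictionary:

```python
import json

data = {"name": "Alice", "age": 25}

text = json.dumps(data)        # dictionary -> JSON string
restored = json.loads(text)    # JSON string -> dictionary

print(type(text).__name__)     # str
print(restored == data)        # True
```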

import random
print(random.randint(1, 100))
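
random.randint returns a different value on each run. When you need repeatable results, for example while debugging, you can set a seed first. A minimal sketch, where the seed value 42 is arbitrary:

```python
import random

random.seed(42)   # 42 is an arbitrary seed chosen for this example
first = [random.randint(1, 100) for _ in range(3)]

random.seed(42)   # same seed, so the same sequence follows
second = [random.randint(1, 100) for _ in range(3)]

print(first == second)  # True
```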

Third-Party Packages

Third-party packages extend Python into nearly every domain: data analysis, web development, machine learning, geospatial work.

Note

The numpy and pandas examples below are a starting point. The syntax becomes familiar with practice, so follow along without worrying about memorising everything.

Numpy

numpy handles arrays and mathematical operations on them efficiently. It appears everywhere in Python’s data science and scientific computing ecosystem.

To use a third-party package, you first install it with pip (Python’s package installer). In a notebook like Colab, prefix the command with ! to run it as a shell command:

# Uncomment to install numpy
# !pip install numpy

numpy arrays look like Python lists, but every element has the same type, which lets numpy store them compactly and run numerical operations on them quickly. You create one by passing a list to np.array():

import numpy as np  # by convention, numpy uses the alias np

arr = np.array([1, 2, 3, 4, 5])
print(arr)

[1 2 3 4 5]

Arithmetic applies to every element at once. Two arrays of the same length are combined element-by-element:

arr_1 = np.array([1, 2, 3])
arr_2 = np.array([4, 5, 6])

print(arr_1 + arr_2)

[5 7 9]

An operation with a single number is applied to every element:

arr = np.array([1, 2, 3])
print(arr * 2)

[2 4 6]
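
This is where arrays differ most visibly from plain lists, which treat + and * as joining and repeating. A side-by-side sketch (the variable name numbers is chosen for illustration):

```python
import numpy as np

numbers = [1, 2, 3]

# Plain lists: + joins two lists and * repeats the list
print(numbers + [4, 5, 6])   # [1, 2, 3, 4, 5, 6]
print(numbers * 2)           # [1, 2, 3, 1, 2, 3]

# numpy arrays: + and * act on each element
arr = np.array(numbers)
print(arr + np.array([4, 5, 6]))   # [5 7 9]
print(arr * 2)                     # [2 4 6]
```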

Arrays also have built-in statistical methods:

arr = np.array([1, 2, 3, 4, 5, 6])

print(arr.mean())  # average
print(arr.min())
print(arr.max())
print(arr.std())  # standard deviation

3.5
1
6
1.707825127659933
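
One detail worth knowing: by default, .std() on a numpy array computes the population standard deviation (dividing by n), which is the value shown above. Passing ddof=1 gives the sample standard deviation (dividing by n - 1). pandas uses ddof=1 by default, so the two libraries can report different numbers for the same data. A short sketch:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

population_std = arr.std()      # default ddof=0: divide by n
sample_std = arr.std(ddof=1)    # ddof=1: divide by n - 1

print(population_std)
print(sample_std)               # slightly larger than the population value
```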

Pandas

pandas is built on numpy and adds the DataFrame: a table with named columns, similar to a spreadsheet. It is the standard tool for data manipulation and analysis in Python.

# Uncomment to install pandas
# !pip install pandas

One way to create a DataFrame is from a dictionary: each key becomes a column name, and each value is a list holding that column's data, one entry per row:

import pandas as pd  # by convention, pandas uses the alias pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

df = pd.DataFrame(data)
print(df)

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago

You can access an individual column by name, which returns a pandas Series. Numeric columns behave like numpy arrays, so arithmetic applies to every element:

age_column = df['Age']

print(age_column)
print(age_column + 1)

0    25
1    30
2    35
Name: Age, dtype: int64
0    26
1    31
2    36
Name: Age, dtype: int64

This means you can use methods like .mean() directly on columns:

mean_age = df['Age'].mean()
print(mean_age)

30.0

You can filter rows with conditions. A condition like df['Age'] >= 30 produces a boolean series (True/False per row):

bool_idx = df['Age'] >= 30
print(bool_idx)

0    False
1     True
2     True
Name: Age, dtype: bool

Pass that boolean series back into the DataFrame to select only the matching rows:

df[bool_idx]

      Name  Age         City
1      Bob   30  Los Angeles
2  Charlie   35      Chicago

This is equivalent to passing an explicit list of booleans:

filtered_df = df[[False, True, True]]
filtered_df

      Name  Age         City
1      Bob   30  Los Angeles
2  Charlie   35      Chicago

In practice, you usually combine both steps into one line:

df[df['Age'] >= 30]

      Name  Age         City
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
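
Conditions can also be combined. In pandas you use & for "and" and | for "or" rather than the and/or keywords, and each condition needs its own parentheses. A sketch that rebuilds the same DataFrame so it runs on its own (the result variable names are chosen for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
})

# & means "and", | means "or"; wrap each condition in parentheses
older_in_chicago = df[(df['Age'] >= 30) & (df['City'] == 'Chicago')]
young_or_in_la = df[(df['Age'] < 30) | (df['City'] == 'Los Angeles')]

print(older_in_chicago['Name'].tolist())  # ['Charlie']
print(young_or_in_la['Name'].tolist())    # ['Alice', 'Bob']
```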

You can also add new columns by assigning to them:

df['Country'] = ['USA', 'USA', 'USA']
print(df)

      Name  Age         City Country
0    Alice   25     New York     USA
1      Bob   30  Los Angeles     USA
2  Charlie   35      Chicago     USA
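
When every row gets the same value, a single value is enough, and a new column can also be computed from an existing one. A minimal sketch on a smaller copy of the table, where the column name Age_next_year is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

df['Country'] = 'USA'                 # a single value is repeated for every row
df['Age_next_year'] = df['Age'] + 1   # computed from an existing column

print(df)
```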

Summary

Imports let you build on existing code so you can focus on the parts unique to your problem. The import syntax and common aliases (np, pd) will become second nature quickly.