How to create a Python library

Creating a Python library and uploading it to PyPI using CI/CD

The Python programming language is extremely popular not only because it is easy to use but also because it has lots of third-party libraries that really help shorten development time. But how do you make one? I wondered the same thing and decided to create a simple Python library. The process is not exactly straightforward, so I decided to write this article, which will hopefully help other people who are creating a Python library for the first time or who just want to refresh their memory on a certain step.

Introduction

This article goes over how to make a simple Python library: more specifically, a REST API client for FreeGeoIp's API [1]. It describes the project structure, how to use setup.py, how to add unit tests to the project, and how to publish the finished library to PyPI using GitHub Actions as the CI/CD solution.

Project structure

The project has a straightforward, basic structure. There is one module, freegeoip_client, which is the actual Python library we are building here. There is a tests folder, which contains all unit tests. Last but not least, there are the setup.py and setup.cfg files, which are used to create the Python package; more on them in the Setuptools section.

python-freegeoip-client  --> project root
    ├── freegeoip_client --> Python module
    │   ├── __init__.py
    │   ├── client.py
    │   └── data
    │       └── client.cfg
    ├── setup.cfg
    ├── setup.py
    └── tests
        ├── conftest.py
        └── test_client.py

Python's __init__.py file

__init__.py allows us to mark a directory as a Python package directory, which enables importing Python code from one Python file into another. It can also be useful for defining variables at the package level. Doing so is often convenient if a package defines something that will be imported frequently, in an API-like fashion. If you are interested in learning more about this, there is an interesting Reddit thread covering appropriate uses of the __init__.py file [2].

When creating a Python library, __init__.py is used to import key functions and/or classes from various modules directly into the package namespace, which lets end users use the package in an API-like fashion.

Example __init__.py:

"""
freegeoip_client package.

Provides FreeGeoIpClient object which enables consuming
FreeGeoIp's RESTful API by providing your own API key.

Usage
-----
from freegeoip_client import FreeGeoIpClient

client = FreeGeoIpClient(api_key="some_api_key")

geo_data = client.get_geo_location()
geo_data_by_ip = client.get_geo_location_for_ip_address("8.8.8.8")
"""

from .client import FreeGeoIpClient
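
To see what that re-export buys the end user, here is a small self-contained sketch. It does not touch the real freegeoip_client package: it writes a throwaway package to a temporary directory (the names demo_pkg and DemoClient are made up for this illustration) and checks that the class imported from the package root is the same object as the one defined in the inner module.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a throwaway package (hypothetical names) that mirrors the layout above."""
    pkg = root / "demo_pkg"
    pkg.mkdir()
    # The class lives in an inner module...
    (pkg / "client.py").write_text("class DemoClient:\n    name = 'demo'\n")
    # ...and __init__.py lifts it into the package namespace.
    (pkg / "__init__.py").write_text(
        "from .client import DemoClient\n\n__all__ = ['DemoClient']\n"
    )


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    try:
        demo_pkg = importlib.import_module("demo_pkg")
        # What the end user writes: from demo_pkg import DemoClient
        reexported = demo_pkg.DemoClient
        # What they would need without the re-export in __init__.py
        from_module = importlib.import_module("demo_pkg.client").DemoClient
    finally:
        sys.path.remove(tmp)

print(reexported is from_module)
```

Both import paths resolve to the same class, so users of the library never need to know which file inside the package actually defines it.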

Unit tests

In computer programming, unit testing is a software testing method by which individual units of source code—sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures—are tested to determine whether they are fit for use.

Adding unit tests is always highly recommended, since tests let us exercise edge cases, which in turn results in higher code quality and more robust code.

One of the more popular unit test frameworks for Python is pytest [3]. It makes it easy to write small, readable tests in a pythonic way. If you are not familiar with it, I would encourage you to check it out.

The python-freegeoip-client project uses the pytest framework for unit tests and the requests-mock library for mocking API responses [4].

from requests_mock import Mocker


def test_geo_data_output(client, current_geo_data):
    with Mocker() as mock:
        mock.get(f"{client.get_api_endpoint}/?apikey=", json=current_geo_data)
        response = client.get_geo_location()

        assert isinstance(response, dict)
        assert response == current_geo_data


def test_geo_data_output_for_ip_address(client, geo_data_for_ip_address):
    with Mocker() as mock:
        mock.get(
            f"{client.get_api_endpoint}/8.8.8.8?apikey=", json=geo_data_for_ip_address
        )
        response = client.get_geo_location_for_ip_address("8.8.8.8")

        assert isinstance(response, dict)
        assert response == geo_data_for_ip_address
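
The client class that these tests exercise is not listed in this article. Purely to illustrate the shape the tests rely on (an endpoint attribute plus two lookup methods that return a dict), here is a hypothetical stand-in. It is not the real FreeGeoIpClient: the class name SketchGeoClient, the example URL and the injectable fetch parameter are assumptions made for this sketch, and it uses the standard library instead of requests so that it runs without extra packages or network access.

```python
import json
from typing import Callable, Dict, List
from urllib.request import urlopen


def fetch_json(url: str) -> Dict:
    """Default transport: GET the URL and decode the JSON response body."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


class SketchGeoClient:
    """Hypothetical stand-in for illustration, not the real FreeGeoIpClient."""

    def __init__(
        self,
        api_key: str,
        endpoint: str = "https://api.example.invalid/json",  # made-up URL
        fetch: Callable[[str], Dict] = fetch_json,
    ) -> None:
        self._api_key = api_key
        self._endpoint = endpoint.rstrip("/")
        self._fetch = fetch

    @property
    def get_api_endpoint(self) -> str:
        # Assumed to be a property, since the tests read it without calling it.
        return self._endpoint

    def get_geo_location(self) -> Dict:
        """Look up geo data for the caller's own IP address."""
        return self._fetch(f"{self._endpoint}/?apikey={self._api_key}")

    def get_geo_location_for_ip_address(self, ip_address: str) -> Dict:
        """Look up geo data for an explicit IP address."""
        return self._fetch(f"{self._endpoint}/{ip_address}?apikey={self._api_key}")


# Usage with a fake transport, so nothing is sent over the network.
requested: List[str] = []


def fake_fetch(url: str) -> Dict:
    requested.append(url)
    return {"ip": "8.8.8.8"}


client = SketchGeoClient(api_key="some_api_key", fetch=fake_fetch)
result = client.get_geo_location_for_ip_address("8.8.8.8")
```

Swapping the transport for a fake plays the same role here that requests-mock plays in the tests above: the code under test runs unchanged while the HTTP layer is replaced with canned data.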

Setuptools

Setuptools is a collection of enhancements to the Python distutils that allow developers to more easily build and distribute Python packages, especially ones that have dependencies on other packages.

Using setuptools, you package Python code into, well, a package that can be imported in different scripts, either by installing it locally or by installing it from the PyPI repository. Packaging can be done using a setup.py file or, the preferred way, using pyproject.toml and setup.cfg files. I am, however, using a setup.py file because of a pip bug that makes it impossible to install packages locally when using pyproject.toml and setup.cfg files [5].

The setup.py file for this Python project:

from setuptools import find_packages, setup


def get_long_description():
    with open("README.md") as file:
        return file.read()


setup(
    name="freegeoip-client",
    version="1.0.5",
    description="FreeGeoIp's RESTful API client for Python",
    long_description=get_long_description(),
    long_description_content_type="text/markdown",
    author="Kevin Furjan",
    author_email="kfurjan@gmail.com",
    url="https://github.com/kfurjan/python-freegeoip-client",
    project_urls={
        "GitHub Project": "https://github.com/kfurjan/python-freegeoip-client",
        "Issue Tracker": "https://github.com/kfurjan/python-freegeoip-client/issues",
    },
    packages=find_packages(
        include=["freegeoip_client", "freegeoip_client.*"],
    ),
    package_data={
        "freegeoip_client": ["data/*.cfg"],
    },
    install_requires=[
        "requests==2.27.1",
    ],
    setup_requires=[
        "pytest-runner",
        "flake8==4.0.1",
    ],
    tests_require=[
        "pytest==7.1.2",
        "requests-mock==1.9.3",
    ],
    python_requires=">=3.6",
    classifiers=[
        "Intended Audience :: Developers",
        "Programming Language :: Python",
        "Programming Language :: Python :: 3",
        "Operating System :: OS Independent",
        "License :: OSI Approved :: MIT License",
        "Topic :: Software Development :: Libraries :: Python Modules",
    ],
    keywords=[
        "FreeGeoIp",
        "REST API Client",
    ],
    license="MIT",
)

If you are interested in setup.py file in more detail, there is an excellent article about it and how to use it [6].
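
The package_data entry in the setup.py above ships data/*.cfg inside the installed package. How client.py reads that file is not shown in this article; one possible approach, sketched here, combines pkgutil.get_data with configparser from the standard library. The throwaway package name demo_cfg_pkg, the [api] section, the endpoint key and the example URL are all assumptions made up for this sketch.

```python
import configparser
import pkgutil
import sys
import tempfile
from pathlib import Path


def read_packaged_cfg(package: str, resource: str) -> configparser.ConfigParser:
    """Parse an INI-style file that ships inside an importable package."""
    raw = pkgutil.get_data(package, resource)
    if raw is None:
        raise FileNotFoundError(f"{resource} could not be loaded from {package}")
    parser = configparser.ConfigParser()
    parser.read_string(raw.decode("utf-8"))
    return parser


# Demo: create a throwaway package with a data file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_cfg_pkg"
    (pkg / "data").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "data" / "client.cfg").write_text(
        "[api]\nendpoint = https://api.example.invalid/json\n"
    )
    sys.path.insert(0, tmp)
    try:
        config = read_packaged_cfg("demo_cfg_pkg", "data/client.cfg")
    finally:
        sys.path.remove(tmp)

endpoint = config["api"]["endpoint"]
print(endpoint)
```

Resolving the file relative to the package, rather than to the current working directory, is what keeps the lookup working after the library has been installed with pip.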

PyPI

The Python Package Index (PyPI) is a repository of software for the Python programming language.

By default, pip, the package installer for Python, uses the PyPI repository as the source for retrieving package dependencies. PyPI lets you find, install, and even publish your Python packages so that they are widely available to the public.

To publish packages to PyPI, you need to register an account [7] or log in with a previously registered one [8]. Once logged in to PyPI, head over to Account settings and add a new API token under the API tokens section. The API token will be required in the next step to publish the package via GitHub Actions, so make sure you save it.

Click on the Add API token button.

NOTE: For the purposes of this blog post, I've selected the token scope "Entire account (all projects)", but it is advisable to scope each PyPI API token to a single project.

GitHub Actions

Automate, customize, and execute your software development workflows right in your repository with GitHub Actions. You can discover, create, and share actions to perform any job you'd like, including CI/CD, and combine actions in a completely customized workflow.

GitHub Actions is available in every GitHub repository; it is free for public repositories, and private repositories get a monthly quota of free minutes. You can set it up from the Actions tab of the GitHub repository by clicking on the New workflow button, or you can add a .github/workflows folder directly to your project with a .yml file in it that defines a workflow.

Before creating the CI/CD workflow, we need to add the previously created API token to GitHub Secrets. Open your GitHub project and go to the Settings tab. Then select Secrets and finally Actions. Click on the New repository secret button at the top of the page. Name the repository secret PYPI_API_TOKEN and paste in the API token's value.

Once done with this step, we can move on to defining the workflow in a .yml file saved within the .github/workflows folder.

Setting up environment and dependencies

We need to define when the CI/CD workflow will run and what environment it will use.

name: Build, lint, test, and upload to PyPI

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2
      - name: Set up Python 3.8
        uses: actions/setup-python@v2
        with:
          python-version: '3.8'

      # Install dependencies
      - name: "Installs dependencies"
        run: |
          python3 -m pip install --upgrade pip
          python3 -m pip install setuptools wheel twine flake8

Here we defined the name of the workflow, when the workflow runs (on each push to the main branch), and what environment it uses (the latest version of Ubuntu Linux). We also defined the step that installs the necessary Python dependencies.

Linting and running unit tests

Before publishing the package to PyPI, we need to make sure that the code is formatted correctly and that all unit tests pass. This can be achieved by adding two more steps to the workflow.

      # Lint
      - name: Lint with flake8
        run: |
          python3 -m flake8 freegeoip_client/
          python3 -m flake8 tests/

      # Run unit tests
      - name: Test with pytest
        run: |
          python3 setup.py test

Building and publishing to PyPI

The last step is to build the Python package, creating a source distribution using sdist, and upload it to PyPI.

      # Build package
      - name: "Builds package"
        run: |
          python3 setup.py sdist

      # Publish package to PyPI
      - name: Publish a Python distribution to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}

Here we are using the pypa/gh-action-pypi-publish@release/v1 action from the GitHub Marketplace to make publishing to PyPI easier [9]. The action does all the heavy lifting of uploading the package to PyPI; we just need to provide our API token, saved in the repository secrets, using the ${{ secrets.PYPI_API_TOKEN }} syntax.

And that's it! We have a GitHub Actions workflow that publishes our Python library to PyPI, after making sure that the code is formatted correctly and that all unit tests pass.
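
Before trusting the workflow with a real release, it can be worth checking locally what the source distribution actually contains, for example whether the .cfg data file made it in. The sketch below is one way to do that with the standard library's tarfile module. It assumes the sdist is a .tar.gz archive, and it builds a tiny stand-in archive (the name demo-0.0.1.tar.gz and its contents are made up) so that it runs without a real build; the helper names are hypothetical as well.

```python
import io
import tarfile
import tempfile
from pathlib import Path
from typing import List


def list_sdist_members(archive_path: Path) -> List[str]:
    """Return the sorted member names stored in a .tar.gz source distribution."""
    with tarfile.open(str(archive_path), mode="r:gz") as archive:
        return sorted(archive.getnames())


def missing_files(archive_path: Path, expected_suffixes: List[str]) -> List[str]:
    """Return every expected suffix that no archive member ends with."""
    members = list_sdist_members(archive_path)
    return [
        suffix
        for suffix in expected_suffixes
        if not any(name.endswith(suffix) for name in members)
    ]


# Demo: a stand-in archive that deliberately lacks the data file.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "demo-0.0.1.tar.gz"
    with tarfile.open(str(archive_path), mode="w:gz") as archive:
        for name in ("demo-0.0.1/setup.py", "demo-0.0.1/demo/__init__.py"):
            payload = b"# placeholder\n"
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    members = list_sdist_members(archive_path)
    missing = missing_files(archive_path, ["setup.py", "data/client.cfg"])

print(missing)
```

Pointed at a real archive produced by the build step, a non-empty result would be a hint that something the library needs at runtime is not being packaged.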

Conclusion

Thank you for reading, and I hope this article was useful to you! In conclusion, this article went step by step through creating a Python library and publishing it to PyPI automatically using a GitHub Actions CI/CD workflow.


References

[1] - FreeGeoIp

[2] - What to include in __init__.py

[3] - pytest

[4] - requests-mock

[5] - pip bug - not possible to install packages locally

[6] - A Practical Guide to Using Setup.py

[7] - Register an account on PyPI

[8] - Log in to PyPI

[9] - pypi-publish