6. Python bindings

C++ and Python are a strong combination: with C++ we can program for maximal performance, and with Python we can work with objects in a very convenient way. Both share a similar object-oriented paradigm.

Python bindings allow us to use C++ functions and classes from Python.

A popular library for wrapping C++ objects to Python is pybind11.

First, install Python. Choose a recent version (Python 3.8 or newer should be fine). Using conda also works; in that case replace pip install by conda install -c conda-forge in the following. On macOS use pip3 instead of pip.

Install pybind11 as a Python package:

pip install pybind11

Clone the pybind branch from the ASC git repository (or switch to the pybind branch in VS Code):

git clone --branch pybind https://github.com/TUWien-ASC/ASC-bla.git

For building and installing our Python package we use scikit-build, which can be installed with

pip install scikit-build

Building, installing and testing ASC-bla should now work with

cd ASC-bla
pip install . -v
cd py_tests
python3 test_vector.py

The command pip install . reads information from the files setup.py and pyproject.toml, and calls cmake to build the project. The cmake file CMakeLists.txt needed some updates to find the Python installation and the pybind11 directories.

The Python file test_vector.py uses features of our Vector Python-class.

If this works, merge the pybind branch into the main branch of your repo.

6.1. Binding C++ classes to Python:

Python-wrapping makes our C++ functions and classes available in Python. All code for wrapping the classes and functions happens in src/bind_bla.cpp:

#include <sstream>              // std::stringstream, used in __str__ below
#include <pybind11/pybind11.h>
#include "vector.h"

using namespace ASC_bla;
namespace py = pybind11;

PYBIND11_MODULE(bla, m) {
    m.doc() = "Basic linear algebra module"; // optional module docstring
    
    py::class_<Vector<double>> (m, "Vector")
      .def(py::init<size_t>(), 
           py::arg("size"), "create vector of given size")
      .def("__len__", &Vector<double>::Size, 
           "return size of vector")
      .def("__str__", [](const Vector<double> & self)
      {
        std::stringstream str;
        str << self;
        return str.str();
      })
      ...
}
  • we include the pybind11 headers, and abbreviate the pybind11 namespace as py

  • PYBIND11_MODULE is a macro setting up the module bla, we can add members to it using the variable m.

  • py::class_<Vector<double>> (m, "Vector") wraps the C++ class Vector<double> to Python, where its name is Vector. Python has no templates, so each instantiation has to be wrapped under its own name.

  • With def we can implement member functions and operators. We give the name of the function (in Python), the C++ function (which may be an old-style function pointer, a member function pointer, or a lambda function), name the arguments, and provide the documentation.

  • py::init<size_t>() is a special syntax for the constructor, in this case for the ctor with one size_t argument.

  • the function __len__ is called by Python's built-in len(v) function

  • the function __str__ is called to convert the object to a string; it is used by the print(vec) function
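
To see which Python protocol the bindings plug into, here is a minimal pure-Python sketch. The class PyVector is a hypothetical stand-in, not part of ASC-bla; it only mimics how len() and print() look up __len__ and __str__ on an object.

```python
class PyVector:
    """Hypothetical pure-Python stand-in for the wrapped Vector class."""

    def __init__(self, size):
        # like py::init<size_t>(): one constructor argument, the size
        self._data = [0.0] * size

    def __len__(self):
        # called by the built-in len(v)
        return len(self._data)

    def __str__(self):
        # called by str(v), and therefore by print(v)
        return ", ".join(str(x) for x in self._data)


v = PyVector(3)
n = len(v)       # dispatches to PyVector.__len__
text = str(v)    # dispatches to PyVector.__str__
print(n, text)
```

The wrapped C++ class behaves the same way from the Python side: pybind11 registers the C++ callables under these special names.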

6.1.1. some more operators: vector+vector, vector*scalar, scalar*vector

.def("__add__", [](Vector<double> & self, Vector<double> & other)
    { return Vector<double> (self+other); })
.def("__mul__", [](Vector<double> & self, double scal)
    { return Vector<double> (scal*self); })
.def("__rmul__", [](Vector<double> & self, double scal)
    { return Vector<double> (scal*self); })

The complete list of Python operator methods is given in the Python data model documentation.
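
Why both __mul__ and __rmul__ are needed can be illustrated in pure Python. The sketch below uses a hypothetical class PyVector (not the wrapped C++ class): for 2.0 * x Python first asks float, which does not know PyVector, and then falls back to x.__rmul__(2.0).

```python
class PyVector:
    """Hypothetical pure-Python stand-in, used only to show operator dispatch."""

    def __init__(self, values):
        self.values = [float(x) for x in values]

    def __add__(self, other):            # x + y
        if not isinstance(other, PyVector):
            return NotImplemented
        return PyVector(a + b for a, b in zip(self.values, other.values))

    def __mul__(self, scal):             # x * scal
        if not isinstance(scal, (int, float)):
            return NotImplemented
        return PyVector(scal * a for a in self.values)

    def __rmul__(self, scal):            # scal * x, tried after float.__mul__ gives up
        return self.__mul__(scal)


x = PyVector([1, 2, 3])
y = PyVector([3, 3, 3])
s = x + y
left = x * 2.0
right = 2.0 * x
print(s.values, left.values, right.values)
```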

6.1.2. setter/getter functions:

.def("__setitem__", [](Vector<double> & self, int i, double v) {
    if (i < 0) i += self.Size();
    if (i < 0 || i >= self.Size()) throw py::index_error("vector index out of range");
    self(i) = v;
})
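.def("__getitem__", [](Vector<double> & self, int i) {
    if (i < 0) i += self.Size();   // same index handling as in __setitem__
    if (i < 0 || i >= self.Size()) throw py::index_error("vector index out of range");
    return self(i);
})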

.def("__setitem__", [](Vector<double> & self, py::slice inds, double val) {
    size_t start, stop, step, n;
    if (!inds.compute(self.Size(), &start, &stop, &step, &n))
        throw py::error_already_set();
    self.Range(start, stop).Slice(0,step) = val;
})

The bracket operators v[i] = val or print(v[j]) call the __setitem__ and __getitem__ methods with an int argument. In Python, v[-1] returns the last element, which is why negative indices are shifted by the vector size. The Python slice assignment v[3:7] = 0 calls the __setitem__ method with a py::slice argument.
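
The index arithmetic can be tried out in pure Python. In this sketch the helper names normalize_index and slice_positions are hypothetical; slice.indices is the built-in Python method that resolves a slice against a given length, which is roughly the job inds.compute does in the binding code.

```python
def normalize_index(i, size):
    """Map a possibly negative index to 0..size-1, or raise IndexError."""
    if i < 0:
        i += size
    if i < 0 or i >= size:
        raise IndexError("vector index out of range")
    return i


def slice_positions(s, size):
    """Return the positions addressed by slice s in a sequence of length size."""
    start, stop, step = s.indices(size)   # resolves None and negative bounds
    return list(range(start, stop, step))


last = normalize_index(-1, 5)                             # v[-1] for a vector of size 5
pos = slice_positions(slice(3, 7), 10)                    # v[3:7]
every_second = slice_positions(slice(None, None, 2), 5)   # v[::2]
print(last, pos, every_second)
```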

6.2. Importing the Python module

We can now import the Python module bla from the package ASCsoft, either in a plain .py Python file or in a Jupyter notebook:

from ASCsoft.bla import *
x = Vector(5)
y = Vector(5)

for i in range(len(x)):
    x[i] = i
y[:] = 3
print ("x+y =",x+y)
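
If the import fails with ModuleNotFoundError, the package was probably installed into a different Python environment than the one running the script. A small check, sketched here with a hypothetical helper package_available built on the standard importlib.util.find_spec, can make this visible before importing. Pass only top-level package names to it.

```python
import importlib.util


def package_available(name):
    """Return True if the top-level package `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


if package_available("ASCsoft"):
    from ASCsoft.bla import Vector
    print("ASCsoft found, Vector imported")
else:
    print("ASCsoft is not installed for this interpreter; run 'pip install .' first")
```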

6.3. Pickling support

Pickling is the standard Python serialization mechanism (file I/O, parallel communication). Python knows how to convert built-in data structures (strings, floating-point values, lists, tuples, …) to a stream of bytes. To support pickling for our user types as well, we have to convert our objects into standard Python objects. For this we use the py::pickle support function, which takes two lambda functions, one for pickling and one for unpickling:

.def(py::pickle(
    [](Vector<double> & self) { // __getstate__
        /* return a tuple that fully encodes the state of the object */
        return py::make_tuple(self.Size(),
                              py::bytes((char*)(void*)&self(0), self.Size()*sizeof(double)));
    },
    [](py::tuple t) { // __setstate__
        /* rebuild the vector from the (size, bytes) tuple; std::memcpy needs <cstring> */
        Vector<double> v(t[0].cast<size_t>());
        py::bytes mem = t[1].cast<py::bytes>();
        std::memcpy(&v(0), PYBIND11_BYTES_AS_STRING(mem.ptr()), v.Size()*sizeof(double));
        return v;
    }))

We serialize a Vector<double> as a 2-tuple containing the vector size and the values as a chunk of bytes in memory. For unpickling we first create a vector of the required size, and then copy the values from the py::bytes object into the vector. A more advanced version of pickling uses the NumPy buffer protocol.

v = Vector(3)
v[:] = 7

import pickle
with open("file.txt", 'wb') as f:    # leaving the with-block closes the file
    pickle.dump([2,"hello", v], f)
with open("file.txt", 'rb') as f2:
    val = pickle.load(f2)
print (val)
print (val[2])
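
The same (size, bytes) encoding can be mimicked in pure Python to see what pickle does with it. This is only a sketch: PyVector is a hypothetical stand-in class, and struct.pack plays the role of the raw memory copy, assuming 8-byte doubles in native byte order.

```python
import pickle
import struct


class PyVector:
    """Hypothetical pure-Python stand-in for the wrapped Vector<double>."""

    def __init__(self, size):
        self.data = [0.0] * size

    def __getstate__(self):
        # 2-tuple: element count, and the values packed as raw doubles
        n = len(self.data)
        return (n, struct.pack(f"{n}d", *self.data))

    def __setstate__(self, state):
        # rebuild the values from the packed bytes
        n, raw = state
        self.data = list(struct.unpack(f"{n}d", raw))


v = PyVector(3)
v.data = [7.0, 7.0, 7.0]

blob = pickle.dumps(v)    # calls __getstate__
w = pickle.loads(blob)    # creates a new object and calls __setstate__
print(w.data)
```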

6.4. Exercise

  • Wrap your Matrix<double,RowMajor> to Python. Add getter/setter functions and operators. Add a property shape.

     .def("__getitem__",
          [](Matrix<double, RowMajor> & self, std::tuple<int, int> ind) {
               // take self by reference to avoid copying the matrix on every access
               return self(std::get<0>(ind), std::get<1>(ind));
          })
     .def_property_readonly("shape",
          [](const Matrix<double, RowMajor> & self) {
               return std::tuple(self.Height(), self.Width());
          })
    
  • Measure timings for matrix-matrix multiplication called from Python (width=height=n, with n=10, n=100, n=1000). Split the times into the actual C++ computation and the overhead due to Python wrapping.

  • NumPy is the Python standard for data exchange in scientific computing. Try to convert your Python Vector/Matrix to a NumPy array using np.asarray(v). How does it work? How efficient is it?

  • For efficiency, add support for the buffer protocol. This is also the recommended technique for pickling.
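
For the timing exercise, a small harness based on the standard timeit module may be a useful starting point. This is a sketch under assumptions: matmul is a naive pure-Python stand-in for the wrapped matrix product, and best_time is a hypothetical helper. With the real bindings, one would replace matmul by the wrapped operator; timing a call that does almost no work gives a rough estimate of the per-call overhead.

```python
import timeit


def matmul(a, b):
    """Naive stand-in for the wrapped matrix product (hypothetical)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def best_time(func, number=5, repeat=3):
    """Best average wall-clock time per call, in seconds."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


# time of an (almost) empty call, as a rough baseline for call overhead
baseline = best_time(lambda: None)

results = {}
for n in (10, 40):
    a = [[1.0] * n for _ in range(n)]
    results[n] = best_time(lambda: matmul(a, a))

for n, t in results.items():
    print(f"n={n}: {t:.3e} s per call (empty call: {baseline:.3e} s)")
```

Taking the minimum over several repetitions reduces the influence of other processes on the measurement.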

6.5. Building the Python-package

You will have noticed two new files, setup.py and pyproject.toml. They are responsible for building and installing a Python package. If we call

pip install .

the setup function from the file setup.py gets called. It first triggers cmake, which installs everything in the directory ASCsoft. cmake itself knows nothing about the anatomy of a Python package; this is where scikit-build steps in.

When everything is uploaded properly to GitHub, everyone can build and install our library as a Python package by running pip install with the GitHub URL:

pip install git+https://github.com/TUWien-ASC/ASC-bla.git@pybind

some links: