6. Python bindings
C++ and Python are a strong combination: with C++ we can program for maximal performance, with Python we can work with objects in a very convenient way. Both share a similar object-oriented paradigm.
Python bindings allow us to use C++ functions and classes from Python.
A popular library for wrapping C++ objects to Python is pybind11.
Now you have to install Python. Choose a recent version (Python 3.8 or newer should be fine). Using conda is also fine; in that case replace pip install by conda install -c conda-forge in the following. On macOS use pip3 instead of pip.
Install pybind11 as a Python package:
pip install pybind11
Clone the pybind branch from the ASC git repository (or switch to the pybind branch in VS Code):
git clone --branch pybind https://github.com/TUWien-ASC/ASC-bla.git
For building and installing our Python package we use scikit-build which can be installed with
pip install scikit-build
Building, installing and testing ASC-bla should now work with
cd ASC-bla
pip install . -v
cd py_tests
python3 test_vector.py
The command pip install . reads information from the files setup.py and pyproject.toml, and calls cmake to build the project. The cmake file CMakeLists.txt needed some updates to find the Python installation and the pybind11 directories.
The Python file test_vector.py uses features of our Vector Python class.
If this works, merge the pybind branch into the main branch of your repo.
6.1. Binding C++ classes to Python:
Python-wrapping makes our C++ functions and classes available in Python. All code for wrapping the classes and functions happens in src/bind_bla.cpp:
#include <sstream>
#include <pybind11/pybind11.h>
#include "vector.h"

using namespace ASC_bla;
namespace py = pybind11;

PYBIND11_MODULE(bla, m) {
  m.doc() = "Basic linear algebra module"; // optional module docstring

  py::class_<Vector<double>> (m, "Vector")
    .def(py::init<size_t>(),
         py::arg("size"), "create vector of given size")
    .def("__len__", &Vector<double>::Size,
         "return size of vector")
    .def("__str__", [](const Vector<double> & self)
    {
      std::stringstream str;
      str << self;
      return str.str();
    })
    ...
}
We include the pybind11 headers, and abbreviate the pybind11 namespace as py.
PYBIND11_MODULE is a macro setting up the module bla; we can add members to it using the variable m.
py::class_<Vector<double>> (m, "Vector") wraps the C++ class Vector<double> to Python, where its name is Vector. Templates are not supported in Python.
With def we can implement member functions and operators. We give the name of the function (in Python), the C++ function (which may be an old-style function pointer, a member function pointer, or a lambda function), name the arguments, and provide the documentation.
py::init<size_t>() is a special syntax for the constructor, in this case for the ctor with one size_t argument.
The function __len__ is called from the Python built-in function len(v).
The function __str__ is called to convert the object to a string; it is used by the print(vec) function.
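To see what these bindings hook into on the Python side, here is a minimal pure-Python stand-in. The class PyVector is hypothetical and not part of ASC-bla; it only illustrates how len(v) and print(v) are dispatched to the special methods, which the wrapped class forwards to C++.

```python
class PyVector:
    """Hypothetical pure-Python analogue of the wrapped Vector<double>."""

    def __init__(self, size):
        # plays the role of py::init<size_t>(): one argument, the size
        self._data = [0.0] * size

    def __len__(self):
        # invoked by the built-in len(v)
        return len(self._data)

    def __str__(self):
        # invoked by str(v), and therefore by print(v)
        return ", ".join(str(x) for x in self._data)


v = PyVector(3)
print(len(v))   # dispatches to PyVector.__len__
print(v)        # dispatches to PyVector.__str__
```

The output format of __str__ is chosen freely here; the real Vector prints whatever its C++ operator<< produces.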
6.1.1. Some more operators: vector+vector, vector*scalar, scalar*vector
.def("__add__", [](Vector<double> & self, Vector<double> & other)
{ return Vector<double> (self+other); })
.def("__mul__", [](Vector<double> & self, double scal)
{ return Vector<double> (scal*self); })
.def("__rmul__", [](Vector<double> & self, double scal)
{ return Vector<double> (scal*self); })
Here is the list of Python-operators
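The following sketch shows, in plain Python, when each of the three operator methods is called. PyVector is again a hypothetical stand-in for the wrapped class, not ASC-bla code. The interesting case is scalar*vector: the number's own multiplication does not know our type, so Python falls back to __rmul__ of the right operand.

```python
class PyVector:
    """Hypothetical pure-Python stand-in for the wrapped Vector<double>."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # called for v + w
        return PyVector(a + b for a, b in zip(self.values, other.values))

    def __mul__(self, scal):
        # called for v * scal
        return PyVector(scal * a for a in self.values)

    def __rmul__(self, scal):
        # called for scal * v, after the number's own __mul__
        # has returned NotImplemented
        return self.__mul__(scal)


v = PyVector([1.0, 2.0])
w = PyVector([10.0, 20.0])
print((v + w).values)
print((v * 3).values)
print((3 * v).values)
```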
6.1.2. Setter/getter functions:
.def("__setitem__", [](Vector<double> & self, int i, double v) {
  if (i < 0) i += self.Size();
  if (i < 0 || i >= self.Size()) throw py::index_error("vector index out of range");
  self(i) = v;
})
.def("__getitem__", [](Vector<double> & self, int i) {
  // same index handling as in __setitem__, so that v[-1] is the last element
  if (i < 0) i += self.Size();
  if (i < 0 || i >= self.Size()) throw py::index_error("vector index out of range");
  return self(i);
})
.def("__setitem__", [](Vector<double> & self, py::slice inds, double val) {
  size_t start, stop, step, n;
  if (!inds.compute(self.Size(), &start, &stop, &step, &n))
    throw py::error_already_set();
  self.Range(start, stop).Slice(0,step) = val;
})
The bracket operators v[i] = val or print (v[j]) call the __setitem__ and __getitem__ methods with an int argument. In Python, v[-1] returns the last element.
The Python slice operator v[3:7] = 0 calls the __setitem__ method with a py::slice argument.
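The same dispatch can be sketched in plain Python, assuming a hypothetical stand-in class PyVector. Python has only one __setitem__ per class, so the int and slice cases are distinguished with isinstance, whereas pybind11 selects between the two registered overloads by argument type. The method slice.indices plays a role similar to inds.compute in the C++ code: it turns the slice into concrete start, stop and step values for a given length.

```python
class PyVector:
    """Hypothetical pure-Python stand-in for the wrapped Vector<double>."""

    def __init__(self, size):
        self._data = [0.0] * size

    def __len__(self):
        return len(self._data)

    def _check(self, i):
        # map negative indices to the end, then validate the range
        if i < 0:
            i += len(self._data)
        if i < 0 or i >= len(self._data):
            raise IndexError("vector index out of range")
        return i

    def __getitem__(self, i):
        return self._data[self._check(i)]

    def __setitem__(self, key, val):
        if isinstance(key, slice):
            # v[3:7] = val arrives here with key == slice(3, 7, None)
            start, stop, step = key.indices(len(self._data))
            for i in range(start, stop, step):
                self._data[i] = val
        else:
            # v[i] = val arrives here with an int
            self._data[self._check(key)] = val


v = PyVector(8)
v[-1] = 5.0      # last element
v[3:7] = 1.0     # entries 3, 4, 5, 6
print([v[i] for i in range(len(v))])
```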
6.2. Importing the Python module
We can now import the Python module bla from the package ASCsoft, either in a plain .py Python file or in a Jupyter notebook:
from ASCsoft.bla import *
x = Vector(5)
y = Vector(5)
for i in range(len(x)):
    x[i] = i
y[:] = 3
print ("x+y =",x+y)
6.3. Pickling support
Pickling is the standard Python serialization (file I/O, parallel communication). Python knows how to convert built-in data structures (strings, floating-point values, lists, tuples, …) to a stream of bytes. To support pickling also for our user types, we have to convert our objects into standard Python objects. For this we use the py::pickle support function, which takes two lambda functions, one for pickling and one for unpickling:
.def(py::pickle(
[](Vector<double> & self) { // __getstate__
/* return a tuple that fully encodes the state of the object */
return py::make_tuple(self.Size(),
py::bytes((char*)(void*)&self(0), self.Size()*sizeof(double)));
},
[](py::tuple t) { // __setstate__
Vector<double> v(t[0].cast<size_t>());
py::bytes mem = t[1].cast<py::bytes>();
std::memcpy(&v(0), PYBIND11_BYTES_AS_STRING(mem.ptr()), v.Size()*sizeof(double));
return v;
}))
We serialize a Vector<double> by a 2-tuple containing the vector size, and the values as a chunk of bytes in memory. For unpickling we first create a vector of the required size, and then copy the values from the py::bytes object into the vector.
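The encoding itself can be tried out in plain Python. The sketch below uses two hypothetical helper functions, get_state and set_state, that mirror the two lambda functions; it assumes the vector entries are given as a list of floats and uses the standard struct module to produce the raw bytes. The resulting tuple consists only of built-in types, which is why pickle can handle it without further help.

```python
import pickle
import struct


def get_state(values):
    """Encode a list of floats as (size, raw bytes), like the first lambda."""
    n = len(values)
    return (n, struct.pack(f"{n}d", *values))


def set_state(state):
    """Rebuild the list of floats from (size, raw bytes), like the second lambda."""
    n, raw = state
    return list(struct.unpack(f"{n}d", raw))


values = [7.0, 7.0, 7.0]
state = get_state(values)
print(state[0], len(state[1]))      # number of entries, number of raw bytes

# round trip through pickle: the state is made of built-in types only
restored = set_state(pickle.loads(pickle.dumps(state)))
print(restored)
```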
A more advanced version of pickling uses NumPy buffer protocols.
import pickle
v = Vector(3)
v[:] = 7
# the with-statement closes the file before we read it again
with open("file.txt", 'wb') as f:
    pickle.dump([2,"hello", v], f)
with open("file.txt", 'rb') as f2:
    val = pickle.load(f2)
print (val)
print (val[2])
6.4. Exercise
Wrap your Matrix<double,RowMajor> to Python. Add getter/setter functions and operators. Add a property shape.
.def("__getitem__",
     // take the matrix by reference to avoid copying it for every element access
     [](Matrix<double, RowMajor> & self, std::tuple<int, int> ind) {
       return self(std::get<0>(ind), std::get<1>(ind));
     })
.def_property_readonly("shape",
     [](const Matrix<double, RowMajor>& self) {
       return std::tuple(self.Height(), self.Width());
     })
Measure timings for matrix-matrix multiplication called from Python (width=height=n, with n=10, n=100, n=1000). Split the times into actual C++ computations, and overhead due to Python wrapping.
NumPy is the Python standard for data exchange in scientific computing. Try to convert your Python Vector/Matrix to a NumPy array using np.asarray(v). How does it work? How efficient is it?
For efficiency add a buffer protocol. This is also the recommended technique for pickling.
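To get a feeling for why a buffer protocol pays off, the following sketch compares two ways of accessing a block of doubles using only the Python standard library. It is an illustration under stated assumptions, not a measurement of ASC-bla: array.array stands in for our Vector, the list comprehension stands in for a conversion that goes through __getitem__ one entry at a time, and memoryview stands in for a consumer of the buffer protocol such as NumPy.

```python
import time
from array import array

n = 100_000
data = array("d", range(n))        # contiguous doubles that expose a buffer

# element-wise path: one __getitem__ call per entry, the values are copied
start = time.perf_counter()
copied = [data[i] for i in range(len(data))]
t_items = time.perf_counter() - start

# buffer path: the memoryview shares the memory, nothing is copied
start = time.perf_counter()
view = memoryview(data)
t_view = time.perf_counter() - start

view[0] = 42.0                     # writes through to the original storage
print(data[0])
print(view.format, view.itemsize, view.shape)
print(f"element-wise: {t_items:.6f} s, buffer view: {t_view:.6f} s")
```

The same timing pattern with time.perf_counter can be reused for the matrix-matrix multiplication exercise; calling the wrapped product for a very small n gives a rough estimate of the per-call wrapping overhead.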
6.5. Building the Python package
You found two new files, setup.py and pyproject.toml. They are responsible for building and installing a Python package. If we call
pip install .
the setup function from the file setup.py gets called. It first triggers cmake, which installs everything in the directory ASCsoft. cmake knows nothing about the anatomy of a Python package. Here, scikit-build steps in.
When everything is uploaded properly to GitHub, everyone can build and install our library as a Python package by running pip install with the GitHub URL:
pip install git+https://github.com/TUWien-ASC/ASC-bla.git@pybind
some links: