ImFusion SDK 4.3
Making C++ classes and functions available in Python is a great way to boost productivity. It allows us to automate workflows quickly and gives users who are not proficient in C++ easy access to our powerful framework. We usually refer to classes and functions exposed to Python as "Python bindings". The basic setup of such bindings is reasonably straightforward, but one needs to be aware of a few caveats and edge cases. This tutorial will illustrate the process of creating Python bindings step-by-step.
In this tutorial, we will be creating Python bindings for the BindingsWorkshopPlugin. Aside from the usual boilerplate code needed for creating a plugin, it contains four types that are of interest to us: the LogLevel enum and the classes Note, NoteComponent, and LogNotesAlgorithm.
To create a submodule in the "imfusion" package for our plugin, we need to do two things: add a .cpp file that defines the module, and register that file in the plugin's CMakeLists.txt.
We will call our .cpp file "PythonBindings.cpp", and it should only contain this:
The PYBIND11_MODULE macro will set up the module for us. The first argument will be the name of the .so / .dll created when building the bindings. The second argument is the name of the module variable we will use on the C++ side to define classes, variables, and functions.
After that, all we need to do is to add the following lines to the plugin's CMakeLists.txt:
Et voilà, we have created a submodule. Granted, it has nothing interesting inside, but we will deal with that in the next section.
- The imfusion_python_bindings CMake function creates a target for building Python bindings.
- The PYBIND11_MODULE macro defines a new Python module in your source code.
- py::class_ and py::enum_ make classes and enums known to Python.
- The .value method on py::enum_ instances adds an enum value.
- The various .def methods define attributes, methods, and properties for py::class_ instances:
  - def: defines a method.
  - def_readwrite: defines a simple attribute equivalent to a struct field.
  - def_readonly: same as above, but values cannot be assigned from Python. Mutable values can still be changed, so it is not the same as const.
  - def_property: defines a Python property. Use this when dealing with getters and setters.
  - def_property_readonly: same as above, but without a setter.

To make the classes known to Python, we create an instance of py::class_<>. The template argument specifies the class we want to bind (and its base classes, if applicable, separated by commas). The constructor of py::class_ takes a Python handle (usually the module) to which the class should belong, followed by its name in Python, then optionally some pybind11 tags, and, finally, an optional docstring. For enums, we use the py::enum_ class, which works analogously. I recommend making all classes and enums known to pybind at the top of your bindings code to avoid unknown return or argument types later on.
In our case, we expose the LogLevel enum and the classes Note, NoteComponent, and LogNotesAlgorithm. The PythonBindings.cpp now looks as follows:
pybind11 also enables us to attach docstrings to the Python objects we define. Docstrings are especially important for bindings as the boundary between C++ and Python is opaque. There is no way for a Python user to view the objects' definitions without access to the source code. Adding docstrings is simple; all we need to do is add a third argument to the py::class_ and py::enum_ constructors:
Accepting a docstring as a trailing argument is common to most functions and methods in the pybind11 API.
Now that we have made the classes known to the Python interpreter, we can start fleshing them out. It is usually a good idea to start with the enums, as they are often straightforward and used throughout our code. To add a value, we can use the .value method on the logLevel variable we created earlier. This method expects a string that defines the value's name on the Python side and the value itself:
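To get a feeling for the result on the Python side, here is a minimal sketch that uses a pure-Python stand-in instead of the compiled bindings. The member names DEBUG, INFO, and WARNING are hypothetical, and a pybind11 enum is not actually an enum.IntEnum subclass; the stand-in only mirrors the parts that both share (the name attribute, conversion to int, and the __members__ mapping).

```python
from enum import IntEnum


class LogLevel(IntEnum):
    """Pure-Python stand-in for the bound enum; member names are hypothetical."""

    DEBUG = 0
    INFO = 1
    WARNING = 2


# The string passed on the binding side becomes the member's name in Python.
level = LogLevel.__members__["INFO"]

level_name = level.name    # name chosen on the binding side
level_value = int(level)   # underlying integer value
```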
Calls to .value (and .def later on) can be chained because each call returns a reference to the object it was called on. It is very pythonic in that regard. Alternatively, we could also start with logLevel on each line.

With the enum out of the way, we can move on to the Note struct. This struct has two fields directly accessible by the user: author and contents. In such a case, we can use pybind's def_readwrite to expose these fields as simple attributes on the Python side:
If we wanted to make Note instantiable from Python, we would have to add .def(py::init()). The py::init() function is a convenient way of automatically implementing a constructor on the Python side, given the types of its parameters. Since the Note constructor does not take any arguments, we would have to call it with empty parentheses in this case. We will, however, not do this for reasons that will become clear very soon. Instead, we will also expose the comparison operator defined for this struct. For this, we can use .def to define the special method __eq__. Methods like this one are referred to as dunder methods (double underscores before and after) and are an integral part of the Python data model. Here, __eq__ is the equivalent of C++'s operator==. Inside the .def, we can pass a function reference like so:
Note the "other"_a argument between the passed function reference and the docstring. We need to add this to define a name for the method's argument, as pybind does not automatically carry over argument names. The method will also work fine if we omit "other"_a, but then the argument will be named arg0 in Python, which is not very helpful. The _a literal operator is part of pybind11, resides in pybind11::literals, and is already included in PythonHeaders.h.
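The effect of naming the argument can be sketched from the Python side with a pure-Python stand-in for Note. This is not the bound class: the constructor exists only so the sketch can run (the real Note is not constructible from Python), and comparing author and contents is an assumption about what the C++ operator== does.

```python
class Note:
    """Pure-Python stand-in for the bound Note struct."""

    def __init__(self, author: str = "", contents: str = "") -> None:
        # Constructor added for the sketch only.
        self.author = author
        self.contents = contents

    def __eq__(self, other: object) -> bool:
        """Return True if both notes have the same author and contents."""
        if not isinstance(other, Note):
            return NotImplemented
        return (self.author, self.contents) == (other.author, other.contents)


a = Note("Ada", "check calibration")
b = Note("Ada", "check calibration")

same_by_operator = a == b            # Python dispatches == to __eq__
same_by_keyword = a.__eq__(other=b)  # works because the argument has a name
```

With an unnamed argument, the keyword call in the last line would have to use the auto-generated name instead, and help(Note.__eq__) would be correspondingly less readable.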
Python is often used interactively (e.g., Python REPL or Jupyter notebooks), and users frequently print instances to better understand what they are working with. To support this and make our bindings much more pleasant to use, we also add another dunder method called __repr__:
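As a rough illustration of the behavior we are aiming for, the following sketch uses a pure-Python stand-in for Note. The exact output format is an assumption, not necessarily the format used by the plugin, and the constructor exists only to make the sketch runnable.

```python
class Note:
    """Pure-Python stand-in for the bound Note struct."""

    def __init__(self, author: str = "", contents: str = "") -> None:
        self.author = author
        self.contents = contents

    def __repr__(self) -> str:
        # Format the fields with !r so that strings keep their quotes.
        return f"Note(author={self.author!r}, contents={self.contents!r})"


note = Note("Ada", "check calibration")

text = repr(note)
# -> "Note(author='Ada', contents='check calibration')"

# Since no __str__ is defined, str() falls back to __repr__.
same_as_str = str(note) == text

# For this stand-in, the representation is a valid expression
# that recreates an equivalent object.
rebuilt = eval(text)
```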
__repr__ is meant to return an object's "official" string representation. Ideally, calling __repr__ gives you a valid Python expression that would recreate the object. Alternatively, there is also __str__, which does not have these expectations attached to it and can therefore be a more informal or concise string representation. By default, a Python object's __str__ method forwards to __repr__. Hence, if you only implement one of them, pick __repr__.

We first need to make the NoteComponent constructible from Python. For this, we can add .def(py::init()). We can also get the simple stuff out of the way at the same time:
Binding the notes method only works because Note, the return type, is already known to pybind11. If that were not the case, the bindings would compile just fine, but calling this method in Python would result in an error. To catch errors of this kind early, we have configured the stub generation, an additional automatic build step included in imfusion_python_bindings, to fail, which will interrupt the build. The error message is somewhat cryptic because stubgen will complain about invalid Python syntax in the stubs. What happens is that, since pybind11 does not know the correct type, it will use the C++ type instead. In this case, this will be ImFusion::Demo::Note, which is invalid Python syntax due to the colons.

Let's also not forget about adding a __repr__ method for our component. Since we cannot directly construct it with the state it has, we use angle brackets around it to indicate that. We can also use the fact that we already implemented __repr__ for Note and call py::repr() on NoteComponent::notes. We just need to add some casts to and from Python:
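From the Python side, the resulting representation could look like the output of the following sketch. Both classes are pure-Python stand-ins, the private _notes list is a hypothetical detail of the stand-in, and the concrete format is an assumption.

```python
class Note:
    """Pure-Python stand-in for the bound Note struct."""

    def __init__(self, author: str = "", contents: str = "") -> None:
        self.author = author
        self.contents = contents

    def __repr__(self) -> str:
        return f"Note(author={self.author!r}, contents={self.contents!r})"


class NoteComponent:
    """Pure-Python stand-in for the bound NoteComponent."""

    def __init__(self) -> None:
        self._notes = []  # hypothetical internal storage

    def notes(self) -> list:
        return list(self._notes)

    def __repr__(self) -> str:
        # Angle brackets signal that this is not an expression that
        # recreates the object. The repr of a list reuses Note.__repr__.
        return f"<NoteComponent notes={self.notes()!r}>"


component = NoteComponent()
component._notes.append(Note("Ada", "check calibration"))

text = repr(component)
# -> "<NoteComponent notes=[Note(author='Ada', contents='check calibration')]>"
```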
Now here is where it gets interesting. Notice that NoteComponent::addNote takes a unique_ptr as its argument. Unfortunately, there is no way to move a unique_ptr from Python to C++, as everything in Python is reference counted, similar to shared_ptrs in C++. One way to get around this is to clone the object, which is often wasteful and, more importantly, breaks the link between the instances in Python and C++. Sometimes, there is a better way:
We can avoid this issue entirely by creating the instance in C++ and only passing back a reference to Python. This requires a class higher up in the hierarchy responsible for creating the instances on the C++ side. In our case, this is NoteComponent.
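The difference between handing out a reference and cloning can be sketched in pure Python. The classes below are stand-ins, and the method name add_note and its signature are hypothetical; the point is only that the object returned by the component stays linked to the one the component stores, while a copy does not.

```python
import copy


class Note:
    """Pure-Python stand-in for the bound Note struct."""

    def __init__(self, author: str, contents: str) -> None:
        self.author = author
        self.contents = contents


class NoteComponent:
    """Stand-in for the owner that creates and stores the notes."""

    def __init__(self) -> None:
        self._notes = []

    def add_note(self, author: str, contents: str) -> Note:
        # The component creates the instance and hands back a reference.
        note = Note(author, contents)
        self._notes.append(note)
        return note

    def notes(self) -> list:
        return list(self._notes)


component = NoteComponent()
note = component.add_note("Ada", "draft")

# Edits through the returned reference are visible via the component.
note.contents = "final"
linked_contents = component.notes()[0].contents  # "final"

# A clone is detached: editing it leaves the stored note untouched.
detached = copy.copy(note)
detached.contents = "changed in the copy"
stored_contents = component.notes()[0].contents  # still "final"
```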
Note the py::return_value_policy::reference_internal! If we don't specify this, we will have a memory bug on our hands. By default, pybind assumes that raw pointers that are returned from C++ should be managed by the Python interpreter. There is no way for pybind to know that the object is already managed by a unique_ptr on the C++ side. Using reference_internal tells pybind that this resource is already accounted for and does not transfer ownership to Python. The 'internal' part here is to avoid a situation where we are left with a dangling reference. This happens when the reference count of the NoteComponent instance goes to zero while there is still a reference to a contained Note instance. You can read more about return_value_policy in the pybind11 documentation.

Note is not constructible from Python (since we did not implement py::init()). Also, note that unique_ptrs are only a problem as arguments. Returning a unique_ptr from C++ to Python is fine.

Now the last thing the bindings for this class are missing are the configure and configuration methods, but we are lucky. There already are bindings for DataComponentBase that implement those methods. Since the logic for the bindings is the same, all we have to do is let pybind know that our NoteComponent inherits from DataComponentBase by adding it as a template argument in the binding definition:
For the algorithm, we only need to bind the compute method and the filter parameter. Binding the compute method is straightforward:
However, in Python, it is common to make objects callable if there is one obvious meaning of calling them. To achieve this, we implement the __call__ dunder method (operator() would be the C++ equivalent) as well:
__call__ points to the same method as compute but offers a more pythonic way of interacting with the algorithm. If our algorithm had outputs, __call__ should also return those directly. Since there is no concept of unique_ptrs in Python, having to call compute and then takeOutput makes no sense to a Python user.

For the parameter, we use def_property. This will create a Python property, which looks like a regular attribute to the user but functions as a combined getter and setter method. Getters and setters are generally not used in Python, so you should avoid creating bindings that use this pattern. Always use def_property (or def_property_readonly if you only bind a getter) instead.
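Both ideas can be sketched with a pure-Python stand-in for the algorithm. What compute does here (collecting the strings that contain the filter text in a hypothetical logged attribute) is an assumption made for illustration only, as is passing the notes in as an argument; only the names compute, __call__, and filter come from the bindings described above.

```python
class LogNotesAlgorithm:
    """Pure-Python stand-in for the bound algorithm."""

    def __init__(self) -> None:
        self._filter = ""
        self.logged = []  # hypothetical: what the last run "logged"

    @property
    def filter(self) -> str:
        # Getter: reads like plain attribute access for the user.
        return self._filter

    @filter.setter
    def filter(self, value: str) -> None:
        # Setter: runs on plain attribute assignment.
        self._filter = str(value)

    def compute(self, notes: list) -> None:
        # Assumed behavior: keep only notes that contain the filter text.
        self.logged = [n for n in notes if self._filter in n]

    # Calling the instance is the same as calling compute.
    __call__ = compute


algo = LogNotesAlgorithm()
algo.filter = "todo"                  # no set_filter(...) call needed
algo(["todo: calibrate", "done: scan"])
result = algo.logged                  # ["todo: calibrate"]
```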
We can and should also attach a docstring directly to our module. This is very helpful for Python users in the IDE and when using the module interactively. In the module's docstring, we can introduce the user to the contents of our module and show usage examples:
And there we have it. With this, we have created a subpackage named bindings_workshop within the imfusion Python package. We can access it with from imfusion import bindings_workshop. Since we also added a helpful module docstring, typing help(bindings_workshop) will help users get started.
Now we can call our C++ algorithm from Python like so: