Executing python code from MySQL Server
Trying py_eval
Generate a list of integers from 0 to 10
> select py_eval('[i for i in range(10)]') list; +--------------------------------+ | list | +--------------------------------+ | [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] | +--------------------------------+
Generate a dictionary (json object) from a list of dicts
> select replace(py_eval('{ str(user["id"]) : user for user in [{"id": 33, "name": "John"}, {"id": 44, "name": "George"}] }'), "'", '"') dict; +------------------------------------------------------------------------+ | dict | +------------------------------------------------------------------------+ | {"33": {"id": 33, "name": "John"}, "44": {"id": 44, "name": "George"}} | +------------------------------------------------------------------------+Replace is needed here, because python uses single quotes for dictionaries.
Make a function that receives a json array and a key and sorts the array by key
DROP function IF EXISTS sort_by_key; DELIMITER $$ CREATE FUNCTION sort_by_key (arr json, k text) RETURNS json BEGIN RETURN replace(py_eval(CONCAT("sorted(", arr, ", key=lambda e: e['", k, "'])")), "'", '"'); END$$ DELIMITER ;Test
> select sort_by_key('[{"a":2}, {"a":1}, {"a": 722}, {"a": 0}]', 'a') sorted; +--------------------------------------------+ | sorted | +--------------------------------------------+ | [{"a": 0}, {"a": 1}, {"a": 2}, {"a": 722}] | +--------------------------------------------+
How to write a MySQL UDF
There is a pretty good guide at the MySQL 8.0 Reference Manual. I'll give you a brief explanation so you can start quickly, but reading the full guide is highly recomended.
MySQL's UDFs are written in C++ and need to follow certain conventions so they can be recognized as such.
In our case, we want our MySQL function to be called py_eval, so we have to define the following C++ functions:
py_eval_init or py_eval_deinit
py_eval
**py_eval_init**: (Optional) Initializes memory and data structures for the function execution.
**py_eval**: Executes the actual function, in our case evaluates a python expression.
**py_eval_deinit**: (Optional) If any memory was allocated in the init function, this is the place where we free it.
For py_eval we only need **py_eval_init** and **py_eval**.
Functions signatures
bool py_eval_init(UDF_INIT *initid, UDF_ARGS *args, char *message); char *py_eval(UDF_INIT *, UDF_ARGS *args, char *result, unsigned long *res_length, unsigned char *null_value, unsigned char *);These are the standard definitions for MySQL functions that return string, as is the case of py_eval. To be able to declare this functions, you need to have the definition of UDF_INIT and UDF_ARGS, you can find that at the source code of mysql server -> right here.
Evaluating python expression
For evaluating python expression, we'll be using pybind11. That gives us the ability to directly access the python interpreter and execute code.
Example
Make sure you have g++ installed. Try executing: g++ --help. And some version of python running of your system, for this tutorial I'll be using version _3.8_.
$ mkdir py_eval && cd py_eval $ git clone https://github.com/pybind/pybind11Create a new file called main.cpp with the following content:
#include "pybind11/include/pybind11/embed.h" #include "pybind11/include/pybind11/eval.h" #include <iostream> namespace py = pybind11; py::scoped_interpreter guard{}; // We need this to keep the interpreter alive int main(void) { auto obj = py::eval("[i for i in range(10)]"); std::cout << std::string(py::str(obj)) << std::endl; }To run the example we have to compile the file. First, we need the compilation flags.
$ pkg-config python-3.8 --libs --cflags -I/usr/include/python3.8Then, we can compile and run our code with the following.
$ g++ main.cpp -I/usr/include/python3.8 -lpython3.8 $ ./a.out [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Putting all together
Download udf types to the project folder.
$ wget https://raw.githubusercontent.com/mysql/mysql-server/8.0/include/mysql/udf_registration_types.hCreate a new file called py_eval.cpp, with the following content:
c++ #include "pybind11/include/pybind11/embed.h" #include "pybind11/include/pybind11/eval.h" #include "udf_registration_types.h" #include <string.h> namespace py = pybind11; py::scoped_interpreter guard{}; // We need this to keep the interpreter alive extern "C" bool py_eval_init(UDF_INIT *initid, UDF_ARGS *args, char *message) { // Here we can check if we received one argument if (args->arg_count != 1) { // The function returns true if there is an error, // the error message is copied to the message arg. strcpy(message, "py_eval must have one argument"); return true; } // Cast the passed argument to string args->arg_type[0] = STRING_RESULT; initid->maybe_null = true; /* The result may be null */ return false; } extern "C" char *py_eval(UDF_INIT *, UDF_ARGS *args, char *result, unsigned long *res_length, unsigned char *null_value, unsigned char *) { // Evaluate the argument as a python expression auto obj = py::eval(args->args[0]); // Cast the result to std::string std::string res_str = std::string(py::str(obj)); // Copy the output string from py::eval to the result argument strcpy(result, res_str.c_str()); // Set the length of the result string *res_length = res_str.length(); return result; }Then, we have to compile the project as a shared library, and move it to the plugin folder of mysql (in your case, it could be located in some other directory).
$ g++ -I/usr/include/python3.8 -lpython3.8 -shared -fPIC -o py_eval.so py_eval.cpp $ sudo cp py_eval.so /usr/lib/mysql/plugin/Now, it's time to try it from mysql. First, connect to your server as root.
$ sudo mysql -urootCreate and test the function.
> create function py_eval returns string soname 'py_eval.so'; Query OK, 0 rows affected (0.029 sec) > select py_eval('[i for i in range(10)]') list; +--------------------------------+ | list | +--------------------------------+ | [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] | +--------------------------------+ 1 row in set (0.001 sec)
Future
There is a lot to do, for example there is no error control on the function execution. The python expression that we are trying to evaluate could fail causing a server reboot. Also, there is some extra work to do to be able to use import. And there are many concerns regarding concurrency issues.
If you want to contribute to improve execution of python code on mysql server, please go to my github project and make a PR.
I hope you enjoyed this tutorial and come back soon for more.