The Noble Art of Writing PostgreSQL Extensions

I worked on the internals of MySQL for many years, but switched to work for Timescale a few years back. If you’re interested in MySQL, or the work I did for MySQL, you can check my old blog MySQL Musings, which contains posts from that time. If you’re a developer who is interested in database internals and database implementations in general—or PostgreSQL internals in particular—this blog is for you. It will obviously have a bias towards PostgreSQL, since that is what I am working with now, but I will look at other work as well.

The code bases for MySQL and PostgreSQL are very different, and switching from the MySQL code base to the PostgreSQL one required digging through a lot of code, something I am still doing on a daily basis. Some things in the PostgreSQL code base are very straightforward, while others can be more tricky to follow. In particular, some areas are not well documented, and understanding the rationale and reasoning behind a solution can require some research.

PostgreSQL Extensions

It is good to start from the beginning, so that you can see how all the pieces fit together. A good starting point is to write a PostgreSQL extension. This gives a good introduction to the PostgreSQL world and the structure of PostgreSQL, and offers many opportunities to discuss internal implementations as well as interfaces into the server. The description below assumes that you have moderate experience with programming in the C language—this is the language the PostgreSQL server is implemented in. In a later post, you will see how you can write your extensions in C++, but right now, make sure that your C programming skills are up to date.

Extension support in PostgreSQL was introduced with PostgreSQL 9.1, so it is a little more than 10 years old. The advantage of using extensions—in contrast to forking the server and adding your own features by modifying the code—is that you can write new features without having to re-compile PostgreSQL, since extensions are loaded dynamically into a running PostgreSQL server. There are already many extensions available.

There is plenty of material on how to write PostgreSQL extensions, so most of the beginning will be familiar to anybody who has written extensions before. Basic extension writing is pretty straightforward—and we will start with a very basic extension—but one area that I found a little hard to find material on is writing background workers. There is some material in the PostgreSQL documentation, and there are some examples in the code, but I find that it still leaves some questions unanswered.

To create a more interesting example, we’ll write an extension with a background worker that implements a receiver for the InfluxDB Line Protocol format and inserts the data into tables. This will allow you to see several useful internal details and also give you an idea of what you can do with an extension. It will also give you opportunities to see how you can measure performance and improve it. There are several different protocols that could be used for this, but the InfluxDB Line Protocol is a very simple format, so it is easy to parse and process, and it also lends itself naturally to use with a relational database. You can easily apply the same ideas to implement support for other IoT protocols like MQTT or AMQP.
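
As a taste of what is to come, each line in the InfluxDB Line Protocol consists of a measurement name, an optional set of tags, a set of fields, and an optional timestamp. A hypothetical example line could look like this:

weather,location=us-midwest temperature=82,humidity=71 1465839830100400200

Such a line maps naturally to a row in a table named after the measurement, with columns for the tags and the fields.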

Structure of an extension

Before embarking on writing the main extension, it is good to first take a look at the structure of an extension and how it is built, installed, and configured, using a very basic example: a single function that can split a string into a key-value pair. This will show you the structure of an extension, and give an example of how to work with a composite data type as well as how to write a non-trivial C function and integrate it with PostgreSQL.

PostgreSQL provides infrastructure (called PGXS) that helps you write an extension. It relies on make, and it is straightforward to write a Makefile for an extension. If you want to use CMake, there is a CMake package available in pg_extension on GitHub that you can use, but we start with PGXS for this introduction so that you become familiar with the existing support; a future post will cover how you can use CMake to write your extensions.

The directory structure of an extension built the standard way looks like this. The full source code is available in the pg_influx repository and will be updated as we implement and extend the extension.

$ tree pg_influx/
pg_influx/  
├── expected
│   └── parse.out
├── influx--0.1.sql
├── influx.c
├── influx.control  
├── Makefile  
├── README.md  
└── sql
    └── parse.sql

2 directories, 7 files

The files you see here are just an example demonstrating an extension that contains an installation script, a control file, a C implementation file, and some tests. Some of the files are required to build the extension—the Makefile, the control file, and the installation script—but the remaining files are usually added for documentation, for testing, or for implementation. It is customary to add a README file as well: in this case we use Markdown, since GitHub will display that nicely and give a good starting page for the extension.

The Makefile

The Makefile in the directory is used to build the extension and needs to contain some boilerplate code. A typical Makefile for a PostgreSQL extension using PGXS looks like this.

EXTENSION = influx
DATA = influx--0.1.sql
MODULES = influx

REGRESS = parse
REGRESS_OPTS += --load-extension=influx

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

The variables in the file provide information to PGXS about the extension. There are many variables that you can set to values appropriate for your extension, but this Makefile only uses a few of them:

  • The variable EXTENSION gives the name to use when installing the extension, and it is expected that there is a control file based on this name (more about that below).
  • The variable MODULES contains the basename for the shared library module that makes up the extension. It is expected that there is a C file with this basename in the extension—“influx.c” in this case—which will be used to build a shared library with the same basename.
  • The REGRESS variable contains a list of regression test files. They are expected to reside in the sql/ directory of the extension, and REGRESS_OPTS contains additional options to add when running the regression tests. In this case, we automatically load the extension so that we do not have to type an explicit CREATE EXTENSION influx at the beginning of each test file.

The last three lines of the Makefile are standard boilerplate that locates and includes the PGXS build system from the PostgreSQL installation.
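
Note that PG_CONFIG defaults to whatever pg_config is found in your path. To build against a specific PostgreSQL installation, you can override the variable on the command line, as in this sketch (the installation path is just an illustration):

make PG_CONFIG=/usr/local/pg/13.5/bin/pg_config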

The control file

The control file contains information about the extension that is necessary to load it into PostgreSQL. If the extension is named “influx”, the control file needs to have the name “influx.control”. An example of a control file would be:

default_version = '0.1'
relocatable = true
module_pathname = '$libdir/influx.so'

A full list of parameters that can be set in the control file can be found in the section Packaging Related Objects into an Extension in the PostgreSQL manual; here we just set three parameters:

  • The parameter default_version allows CREATE EXTENSION to be used without explicitly giving a version.
  • The parameter relocatable means that the extension can be moved to a different schema after installation (see the example after this list).
  • The parameter module_pathname is used to avoid having to give an explicit shared library in the installation script. The variable $libdir here is a placeholder for the library directory of the installation.
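
As an illustration of what relocatable means, the extension can be moved to a different schema after it has been created. A hypothetical example, assuming that the schema influx_schema exists:

ALTER EXTENSION influx SET SCHEMA influx_schema;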

The installation script

The installation script is executed when installing the extension, and it needs to have a file name like “influx--0.1.sql”, where “influx” is the extension name and “0.1” is the version. You can add any SQL statements here, but we create a composite type for the key-value pair and a function that splits a string into a key-value pair.

-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION influx" to load this file. \quit

CREATE TYPE item AS (key text, value text);

CREATE FUNCTION split_pair(text)
RETURNS item LANGUAGE C STRICT AS 'MODULE_PATHNAME';

Line 2 is there to make sure that if the script is sourced directly from psql, it will print a message and abort. On line 4, a composite type is created to represent a key-value item, and on lines 6 and 7 a function that splits a text into an item is defined. The function is implemented in C, and MODULE_PATHNAME will be replaced with the module path name of the shared library given in the control file, which is where this function is defined. The function is also declared STRICT, so that it immediately returns NULL on NULL input: the C implementation below does not check for null arguments.

Implementing PostgreSQL Functions in C

To write the C implementation, there is a need for some boilerplate code in addition to the actual function implementation. In the example directory structure above, we put all the code in the influx.c file, but later we will write an extension that uses several files.

#include <postgres.h>
#include <fmgr.h>

#include <funcapi.h>
#include <utils/builtins.h>

#include <stdbool.h>
#include <string.h>

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(split_pair);
Datum split_pair(PG_FUNCTION_ARGS) {
  TupleDesc tupdesc;
  char *key, *val;

  if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
    ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                    errmsg("function returning record called in context "
                           "that cannot accept type record")));

  key = text_to_cstring(PG_GETARG_TEXT_PP(0));
  val = strchr(key, '=');
  if (val) {
    Datum values[2];
    bool nulls[2] = {0};
    HeapTuple tuple;

    *val++ = '\0';
    values[0] = CStringGetTextDatum(key);
    values[1] = CStringGetTextDatum(val);
    tuple = heap_form_tuple(tupdesc, values, nulls);
    PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
  }
  PG_RETURN_NULL();
}

This is the full implementation, but let’s unravel it piece by piece. It is assumed that you’re familiar with programming in C, so standard C constructions are not covered here. In addition, some variable declarations and usual C constructs are glossed over: they will be discussed in more detail in later posts.

#include <postgres.h>
#include <fmgr.h>

#include <funcapi.h>
#include <utils/builtins.h>

#include <stdbool.h>
#include <string.h>

The first lines include the standard PostgreSQL headers, which contain all the definitions necessary to interface with the PostgreSQL code, and standard C header files containing definitions that we need.

  • The header file postgres.h contains basic definitions and should always be included first. If it is not, you will get strange errors.
  • The header file fmgr.h contains the function manager interface that you need to interface with the internals of PostgreSQL, including the PG_MODULE_MAGIC macro.
  • The header file funcapi.h contains support functions for functions that return composite types.
  • The header file utils/builtins.h contains support functions for the built-in types; in particular, text_to_cstring is declared here.

PG_MODULE_MAGIC;

Each shared library needs to contain a definition of PG_MODULE_MAGIC. This macro expands to a function that provides some basic information about the version of PostgreSQL that the library was built with. There should be only one such function in each shared library, so you should only add the macro in one of the files.
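
To give an idea of what is going on, this is roughly what the macro expands to (a simplified sketch of the definition in fmgr.h; the real magic record contains more fields):

/* Simplified sketch of the PG_MODULE_MAGIC expansion in fmgr.h */
extern PGDLLEXPORT const Pg_magic_struct *PG_MAGIC_FUNCTION_NAME(void);
const Pg_magic_struct *
PG_MAGIC_FUNCTION_NAME(void)
{
  static const Pg_magic_struct Pg_magic_data = PG_MODULE_MAGIC_DATA;
  return &Pg_magic_data;
}

The server calls this function when loading the library and refuses to load the library if the information does not match what the server expects.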

PG_FUNCTION_INFO_V1(split_pair);
Datum split_pair(PG_FUNCTION_ARGS) {
  ...
}

The C implementation of a PostgreSQL function needs to be written to interface correctly with PostgreSQL, as shown above.

  • All PostgreSQL C functions need a record with additional information about the function; right now, it only contains information about the calling convention used, to make sure that parameters are passed to the function the right way. The macro PG_FUNCTION_INFO_V1 adds this record for a function name: for the function name split_pair, the macro will create a function pg_finfo_split_pair that returns the record (see the sketch after this list).
  • PG_FUNCTION_ARGS is just a convenience macro that expands to FunctionCallInfo fcinfo. You will see fcinfo being used inside the function later, so remember that it is just a standard argument to all PostgreSQL functions.
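
For reference, this is roughly what PG_FUNCTION_INFO_V1(split_pair) expands to (a simplified sketch of the definition in fmgr.h):

/* Simplified sketch of the PG_FUNCTION_INFO_V1(split_pair) expansion */
extern Datum split_pair(PG_FUNCTION_ARGS);
extern PGDLLEXPORT const Pg_finfo_record *pg_finfo_split_pair(void);
const Pg_finfo_record *
pg_finfo_split_pair(void)
{
  static const Pg_finfo_record my_finfo = {1}; /* version 1 calling convention */
  return &my_finfo;
}

The function manager looks up this record to figure out how to call split_pair.
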
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
  ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("function returning record called in context "
                         "that cannot accept type record")));

To figure out what the function is expected to return, you can use the function get_call_result_type. There are multiple reasons why you might need to check this inside your function:

  1. If you do not return the data in the correct format, PostgreSQL will not be able to read the return value correctly, so checking that the result type of the function (as declared in the SQL file) matches what the C function expects prevents some subtle bugs. However, even assuming that the PostgreSQL function declaration and the C function definition agree on what should be returned, there are a few other reasons that you might want to check the result type.
  2. In PostgreSQL, functions can be overloaded, so you can actually have several PostgreSQL functions that refer to the same C function. In this case, it can be useful to distinguish the functions and make sure to return the data that is expected.
  3. Even if you don’t have an overloaded function, you might want to make the function flexible and, for example, fill in different output parameters depending on their types and names.

Here we use the function to check that the caller actually expects a composite type, and return an error otherwise. The information about the result type is stored in a tuple descriptor. The tuple descriptor contains information about the columns of the result type—the column name, the column type, collation, nullability, etc.—and can be inspected.
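
As an illustration (this snippet is not part of the extension), this is how you could iterate over the tuple descriptor and inspect the columns:

/* Illustrative sketch: log the name and type OID of each column */
for (int i = 0; i < tupdesc->natts; i++) {
  Form_pg_attribute attr = TupleDescAttr(tupdesc, i);
  elog(DEBUG1, "column %d: name=%s, type=%u",
       i, NameStr(attr->attname), attr->atttypid);
}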

key = text_to_cstring(PG_GETARG_TEXT_PP(0));

Access to the parameters of a PostgreSQL function is zero-based, so the first parameter has number 0, the second number 1, and so on. Here, the first parameter is of the TEXT type, which is a PostgreSQL-native type, so to get a pointer to a text object you should use PG_GETARG_TEXT_PP. To convert the TEXT pointer to a C string, you use the text_to_cstring function, which will copy the contents of the TEXT object and return a pointer to that copy. There is no need to free the memory after the call since it will be released automatically when the memory context it was allocated in is reset or deleted—memory contexts are something we will discuss in a later post.
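
There are corresponding PG_GETARG_XXX macros for other parameter types as well. A few examples, for a hypothetical function taking an integer, a boolean, and a double precision parameter:

int32 arg0 = PG_GETARG_INT32(0);   /* SQL integer parameter */
bool arg1 = PG_GETARG_BOOL(1);     /* SQL boolean parameter */
float8 arg2 = PG_GETARG_FLOAT8(2); /* SQL double precision parameter */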

Datum values[2];
bool nulls[2] = {0};
HeapTuple tuple;
  ...
values[0] = CStringGetTextDatum(key);
values[1] = CStringGetTextDatum(val);
tuple = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(tuple));

To return a composite type, it is necessary to construct it from its parts. This is done using heap_form_tuple, which takes the tuple descriptor that we extracted from the result type above and two arrays: one with the values (of appropriate types) and one indicating which columns are null. Since all our values are non-null, we can initialize the null array with zeroes.

But what about the values array?

Values in PostgreSQL are all passed around using the Datum type. A Datum can either contain a type that can be passed by value—such as a boolean or a 32-bit integer—or a pointer to a type that cannot be passed by value. In order to fill in the values array, you need to convert each actual value to a Datum. In the code above, the C string is copied into a TEXT object using CStringGetTextDatum, and the pointer is stored in the corresponding position of the values array. There are some types that can be passed either by value or by reference depending on the architecture—such as 64-bit integers or double-precision floats—but if you use the correct conversion function, either a reference or the actual value will be stored.
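
A few examples of such conversions (all of these are available in the standard PostgreSQL headers):

Datum d1 = BoolGetDatum(true);           /* passed by value */
Datum d2 = Int32GetDatum(42);            /* passed by value */
Datum d3 = Float8GetDatum(3.141);        /* by value or by reference, depending on the platform */
Datum d4 = CStringGetTextDatum("hello"); /* always a pointer to a newly allocated TEXT object */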

Since the function has to return a Datum itself, the last step converts the heap tuple into a Datum and returns it. The result is always returned using one of the PG_RETURN_XXX macros, in this case PG_RETURN_DATUM. In most cases, the PG_RETURN_XXX macro just returns the Datum, but for PG_RETURN_NULL (for example) the fcinfo structure passed as a parameter is manipulated to indicate that the result is null.
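
For example, these two macros are defined in fmgr.h roughly as follows:

#define PG_RETURN_DATUM(x) return (x)
#define PG_RETURN_NULL() \
  do { fcinfo->isnull = true; return (Datum) 0; } while (0)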

Building and installing the extension

Now everything is set up for building and installing the extension, which is trivial:

make
sudo make install

The first line builds the entire extension. This is not always necessary, but since there is C code in this extension, it needs to be built. The second line installs the extension and makes it available to the server. At this point it is available for use in the server, but not actually added to any of the databases.

To add the extension to a database, you connect to the database, create the extension, and reap the benefits of your newly written extension.

$ psql
psql (13.5)
Type "help" for help.

mats=# CREATE EXTENSION influx;
CREATE EXTENSION
mats=# \dx
                 List of installed extensions
  Name   | Version |   Schema   |         Description          
---------+---------+------------+------------------------------
 influx  | 0.1     | public     | 
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

mats=# SELECT * FROM split_pair('foo=bar');
 key | value 
-----+-------
 foo | bar
(1 row)

When creating the extension, PostgreSQL will install it in a few steps, somewhat simplified here:

  1. Look for and read the control file for the extension. The control file is expected to be in the share/extension/ directory of the installation.
  2. Figure out what version of the extension is being created (see the example after this list).
  3. Figure out the install script(s) to use. (PostgreSQL can actually do a more advanced form of installation by combining an install script with a set of update scripts, but for now, you can ignore this.)
  4. Figure out the schema to use for the installation and create the schema if it does not exist.
  5. Check that all required extensions are installed and optionally install them.
  6. Execute the commands in the install scripts.
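
For example, both the version and the schema can be given explicitly when creating the extension, in which case steps 2 and 4 use the given values:

CREATE EXTENSION influx VERSION '0.1' SCHEMA public;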

Adding and running tests

To make sure that the implementation works correctly, and that we do not introduce any bugs as we work on the extension, you should add regression tests that can be used to test the installation. The regression tests are executed using the pg_regress program, which is part of the PostgreSQL installation. To use the regression tests, it is necessary to have an SQL file with commands to execute, and it has to be placed in the sql/ directory. In this case, it is sufficient to add an sql/parse.sql file with this content.

SELECT split_pair('foo=bar');

The output of executing the SQL code is compared with an expected output, which has to be stored in the expected/ directory and needs to have a name corresponding to the SQL file: in this case, expected/parse.out.

SELECT split_pair('foo=bar');
 split_pair 
------------
 (foo,bar)
(1 row)

To run the regression tests on the installed extension you can use make installcheck, which will create a test database in a running PostgreSQL server, install the extension in it, and run the regression test scripts.

$ make installcheck
/usr/local/pg/13.5/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --bindir='/usr/local/pg/13.5/bin' --load-extension=influx --dbname=contrib_regression parse
(using postmaster on Unix socket, default port)
============== dropping database "contrib_regression" ==============
DROP DATABASE
============== creating database "contrib_regression" ==============
CREATE DATABASE
ALTER DATABASE
============== installing influx                      ==============
CREATE EXTENSION
============== running regression test queries        ==============
test parse                        ... ok            5 ms

=====================
 All 1 tests passed. 
=====================

Next steps

This post covers the basics of writing an extension, but so far nothing particularly special has been done. The next post will show you how to add a background worker to your extension, how it is implemented, and also add basic code that listens for data on a port and processes it.

Mats

dbmsdrops.kindahl.net

Long-time developer with a keen interest in databases, distributed systems, and programming languages. Currently working as Database Architect at Timescale.
