This is now a pretty decent endpoint for Influx Line Protocol, but there are a few features that are missing for something to be practical for normal usage. One of those is that the application should not lose data just because it does not recognize the metric, tag, or field. Fields and tags that are …
Continue reading pg_influx: Automatic Table Creation
Server Startup and Configuration Options
As discussed in the previous post, it would be nice to be able to automatically start background workers instead of having to do that manually each time you start the server, so it will be covered in this post. As always, the code is available in the timescale/pg_influx repository on GitHub. To automatically start background …
Continue reading Server Startup and Configuration Options
Processing packets in parallel through port reuse
The implementation of the Influx reader this far as been pretty straightforward: a single process that reads data from a single port and insert it into the database. Databases are, however, designed to be able to handle multiple inserts at the same time, so what is preventing us from using multiple processes to ingest data? …
Continue reading Processing packets in parallel through port reuse
Leveraging Prepared Statements to Improve Performance
In the post on prepared statement the statements were prepared each time a row is received, which seems like a waste of CPU. After all, the tables do not change, so it should be perfectly OK to just prepare the statement once, save it away in prepared form, and then reuse it for each row …
Continue reading Leveraging Prepared Statements to Improve Performance
Reading and writing types
One drawback of the current design is that all the tags and all the fields are stored in a JSONB structure. JSON is a great format if you want to have an open format where you can add new fields as necessary and even have more structured data than what you have in plain columns, …
Continue reading Reading and writing types
Using prepared statements through the SPI
In the previous post, you could see how to use the server programming interface to execute statements that modified the database. This was done by creating a statement containing the all the data that needed to be inserted, but that also meant generating a string from already parsed data, which seems like a waste of …
Continue reading Using prepared statements through the SPI
The Server Programming Interface
In the previous post you could see how to parse a packet and construct a complex data type from it by creating a set-returning function. This function returned a table of rows, but it did not insert it into the database. In this post you will see how to insert the data into the database …
Continue reading The Server Programming Interface
Creating JSON Values
In the previous post, you could see how to create set-returning functions that returns several rows of a table and we returned rows consisting of the timestamp, the metric, and two JSONB values: one for the tags and one for the fields. There were, however, no coverage of what JSONB is, how it differs from JSON, and how to construct them in your code. This post is going to answer those questions.
Parsing InfluxDB Line Protocol
In the previous post you could see how to create a background worker that received data over a socket as well as how to spawn a new background worker. In this post you will see how to write a simple parser for the InfluxDB Line Protocol and also get an introduction into PostgreSQL Memory Contexts and the Set-Returning Function (SRF) interface and learn how to write a function that returns multiple tuples.
It’s all in the Background
In contrast to MySQL—which is a multi-thread database system—PostgreSQL is a multi-process database system. In multi-process systems you have a process tree with several processes that interact to share the work, but only a single thread for each process. Multi-process and multi-thread systems both have advantages and disadvantages, some of which are discussed in this post. Since PostgreSQL is a multi-process system, the focus will be on PostgreSQL and multi-process systems.
In this post you will see how to add a background worker to the extension and how to spawn new background workers.
This post assume that you’re familiar with C programming and also familiar with programming for Linux. In particular, you need to know about processes, signals, and sockets in Linux: what they are, how they work, and what they are used for.