Hello!

My name is Lauri Keel and I design and build all kinds of IT systems.

Websites, mobile apps, integrations, even hardware prototypes. Let's talk

Projects

Datafisher Dolphin LMS 2018-2020

A learning management system that mainly serves SCORM packages. Started rebuilding an old PHP and MongoDB-based system feature by feature using modern technologies.

Things done: back-end (PHP, Groovy, MongoDB, PostgreSQL), front-end (React), continuous integration, deployment and monitoring systems, operations, integrations with external systems, iOS and Android apps (native navigation, content is the front-end React app).

graVeer 2019-2020

An online shop for customized presents (in Estonian).

Things done: design, back-end (Java, Groovy, PostgreSQL), front-end (React), integrations (payments, transport).

Pesutark 2015

A laundry service. We pick up your dirty laundry and in a few days bring back clean clothes. Defunct.

Things done: logo, design, back-end (Java, Groovy, PostgreSQL), front-end, integrations (payments, transport), Android & iOS apps.

PM24 2019

The client did not like his new WordPress-based website, so we refurbished the old one from 2004. As a result he has one of the fastest online shops there are.

Things done: back-end (PHP, MySQL), front-end.

Global Enterprises Tours 2020

The website and bookings management system of a Brussels-based tour company.

Things done: design, back-end (WordPress, PHP, MySQL), integrations (payments, resellers).

DentistControl 2016

A marketplace for dental supplies (in Spanish). Defunct.

Things done: logo, design, back-end (Java, Groovy, PostgreSQL, Sphinx), front-end (React, D3), integrations (payments, transport, invoices).

Wivo 2014

A dashboard application for viewing, comparing and analyzing different metrics (in Spanish).

Things done: initial design, front-end (Backbone, C3), custom graphs (D3).

SocialCull 2015

A private system for making detailed searches of Instagram users (like SocialBro) and automatically performing various actions on them. Had data for more than 80 million Instagram users.

Things done: back-end (Java, Kafka, ElasticSearch), front-end.

A weather station 2017

A weather station prototype based on the ESP8266 that uses the 915 MHz band for communication with a base station. Also had a real-time mode. Defunct.

Things done: design, back-end (Java, Groovy, PostgreSQL), front-end (React, D3), hardware (ESP8266, Arduino).

Open Source

Standalone DNS Authenticator plugin for Certbot
A plugin that uses an integrated DNS server to respond to the _acme-challenge records. Written in Python.
Gradle JavaScript Toolkit
A full solution for Gradle-based JavaScript development. Written in Groovy, JavaScript.
PostgreSQL VelocyPack extension
Adds VelocyPack format support to PostgreSQL. It is essentially a jsonb analogue with more data types and support for logical types or tags, and it is faster. Written in C.
PHP VelocyPack extension
Adds support for the VelocyPack format to PHP. Also ships with a tags-based type mapper for more precise type or object document mapping. Written in C.

Thesis

Design and Implementation of a Document Store Extension for PostgreSQL

tl;dr

Added VelocyPack format support to PostgreSQL, implemented a benchmarking framework and evaluated the effect of various ZFS filesystem, operating system and database parameters on the performance of different PostgreSQL and MongoDB workloads. PostgreSQL is at least twice as fast as MongoDB.

Summary

The objective of this thesis was to enable using PostgreSQL as a full-fledged document store and to evaluate the effect of different system configurations on its performance. The first chapter gave an overview of the background of the problem, studied different data serialization formats and evaluated their suitability for use in a document store. It also explained the importance of data statistics, studied different ways to evaluate the performance of document stores and gave an overview of the database, filesystem and operating system configuration parameters that affect performance.

In the second chapter the relevant data serialization formats were analyzed and one was chosen for implementation. Only one suitable format was found – Amazon Ion – but due to the lack of documentation for its C implementation it was found to be more efficient to modify another one – VelocyPack. Additionally, it was studied how to reuse the existing PostgreSQL statistics system, and rules for destructuring documents were created that form the basis of the statistics generation system. In the third part of the chapter it was found that none of the existing performance testing frameworks was suitable for the current use case, so it was decided to create a new one, and the requirements for the testing framework and for conducting the tests were specified.

In the third chapter the required changes to VelocyPack and the PostgreSQL extension for the new data type, relevant operators, indexing and document statistics were engineered along with the testing framework and the test automation system. Additionally, it was explained how to carry out the tests with minimal external interference.

In the fourth chapter the implementation of the PostgreSQL extension, the testing framework and the test automation system was described in more detail. Additionally, the results of the testing framework were validated by comparison with those of the YCSB framework and were found to be satisfactory.

In the fifth chapter the results of the performance tests were evaluated, and PostgreSQL was found to be significantly faster in all aspects than the current most popular document database – MongoDB. However, the objective of the performance tests was not to confirm that well-known fact, but to evaluate the effect of different database, filesystem and operating system configurations on performance and to give guidelines for further testing. It was found that:

Each application and its workloads are different and vary in complexity, so the settings need to be evaluated for those particular workloads and applications. The results of the performance tests in this thesis give guidelines on which aspects should be tested and how to choose the baseline scenario. The effect of many configuration options was already known, but exact estimates were missing.

All objectives were achieved: PostgreSQL can store complex documents while keeping the original data types of their values, the query planner can make smarter decisions based on data statistics, and the implementation has the best performance of the compared solutions.

Download: full thesis, PostgreSQL VelocyPack extension

Other services

Identity


Lauri Keel. Development of different IT solutions. Let's talk