Hello!
My name is Lauri Keel and I design and build all kinds of IT systems.
Websites, mobile apps, integrations, even hardware prototypes. Let's talk
A learning management system that mainly serves SCORM packages. Started rebuilding an old PHP and MongoDB-based system feature by feature using modern technologies.
Things done: back-end (PHP, Groovy, MongoDB, PostgreSQL), front-end (React), continuous integration, deployment and monitoring systems, operations, integrations with external systems, iOS and Android apps (native navigation, content is the front-end React app).
An online shop for customized presents (in Estonian).
Things done: design, back-end (Java, Groovy, PostgreSQL), front-end (React), integrations (payments, transport).
A laundry service. We pick up your dirty laundry and in a few days bring back clean clothes. Defunct.
Things done: logo, design, back-end (Java, Groovy, PostgreSQL), front-end, integrations (payments, transport), Android & iOS apps.
The client did not like his new WordPress-based website, so we refurbished the old one from 2004. As a result he has one of the fastest online shops there are.
Things done: back-end (PHP, MySQL), front-end.
The website and bookings management system of a Brussels-based tour company.
Things done: design, back-end (WordPress, PHP, MySQL), integrations (payments, resellers).
A marketplace for dental supplies (in Spanish). Defunct.
Things done: logo, design, back-end (Java, Groovy, PostgreSQL, Sphinx), front-end (React, D3), integrations (payments, transport, invoices).
A dashboard application for viewing, comparing and analyzing different metrics (in Spanish).
Things done: initial design, front-end (Backbone, C3), custom graphs (D3).
A private system for making detailed searches of Instagram users (like SocialBro) and automatically performing various actions on them. Had data for more than 80 million Instagram users.
Things done: back-end (Java, Kafka, ElasticSearch), front-end.
_acme-challenge records. Written in Python.
jsonb analogue with more data types and support for logical types or tags. And faster. Written in C.
Added VelocyPack format support to PostgreSQL, implemented a benchmarking framework and evaluated the effect of various ZFS filesystem, operating system and database parameters on the performance of different PostgreSQL and MongoDB workloads. PostgreSQL is at least twice as fast as MongoDB.
The objective of this thesis was to enable using PostgreSQL as a full-fledged document store and to evaluate the effect of different system configurations on its performance. The first chapter gave an overview of the background of the problem, studied different data serialization formats and evaluated their suitability for use in a document store. It also explained the importance of data statistics, studied different ways to evaluate the performance of document stores and gave an overview of the database, filesystem and operating system configuration parameters that affect performance.
In the second chapter the relevant data serialization formats were analyzed and one was chosen for implementation. Only one suitable format was found, Amazon Ion, but due to the lack of documentation for its C implementation it was found to be more efficient to modify another one, VelocyPack. Additionally, it was studied how to reuse the existing PostgreSQL statistics system, and rules for destructuring documents were created that form the basis of the statistics generation system. In the third part of the chapter it was found that none of the existing performance testing frameworks was suitable for the current use case, and it was decided to create a new one, so the requirements for the testing framework and for conducting the tests were specified.
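To make the idea of destructuring more concrete, here is a minimal sketch that flattens a nested document into (path, type, value) entries on which per-path statistics could be gathered. It does not reproduce the rules created in the thesis: the function name destructure, the dotted path syntax and the [] marker for array elements are all assumptions made for illustration.

```python
from typing import Any, Iterator, Tuple


def destructure(doc: Any, path: str = "$") -> Iterator[Tuple[str, str, Any]]:
    """Yield (path, type name, value) for every scalar in a nested document.

    Hypothetical rules: object keys extend the path with ".key" and all
    elements of an array share the path suffix "[]".
    """
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from destructure(value, f"{path}.{key}")
    elif isinstance(doc, list):
        for item in doc:
            yield from destructure(item, f"{path}[]")
    else:
        # Scalars keep their original type, which is what typed statistics need.
        yield path, type(doc).__name__, doc


if __name__ == "__main__":
    order = {"id": 7, "customer": {"name": "Mari"}, "tags": ["gift", "rush"]}
    for entry in destructure(order):
        print(entry)
```

Grouping the yielded entries by path and type would then give one value distribution per document field, roughly what a query planner needs for selectivity estimates.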
In the third chapter the required changes to VelocyPack and the PostgreSQL extension for the new data type, relevant operators, indexing and document statistics were designed, along with the testing framework and the test automation system. Additionally, it was explained how to carry out the tests with minimal external interference.
In the fourth chapter the implementation of the PostgreSQL extension, the testing framework and the test automation system was described in more detail. Additionally, the results of the testing framework were validated by comparison with those of the YCSB framework and were found to be satisfactory.
In the fifth chapter the results of the performance tests were evaluated, and PostgreSQL was found to be significantly faster in all aspects than the currently most popular document database, MongoDB. The objective of the performance tests, however, was not to confirm that well-known fact but to evaluate the effect of different database, filesystem and operating system configurations on performance and to give guidelines for further testing. It was found that:
- synchronous_commit=off gives a write performance benefit of more than 3 times;
- shared_buffers cause a significant performance decline when their hit rate is low;
- logbias=throughput for the database dataset does not bring any performance benefits nor changes in disk usage;
- the jsonb format is 4-15% slower compared to the one implemented in this thesis.

Each application and its workloads are different and have a different level of complexity, so the settings need to be evaluated for those particular workloads and applications. The results of the performance tests of this thesis give guidelines on which aspects should be tested and how to choose the baseline scenario. The effect of many configuration options was already known, but exact estimates were missing.
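As a rough illustration of evaluating settings against a baseline scenario, the sketch below times a workload under several named scenarios and reports each one's throughput relative to the baseline. It is not the benchmarking framework from the thesis: run_scenarios, relative_throughput and the stand-in workloads are hypothetical names. A real test would apply the setting under evaluation (for example synchronous_commit=off) and run queries against the database instead.

```python
import time
from typing import Callable, Dict


def run_scenarios(scenarios: Dict[str, Callable[[], int]],
                  repeats: int = 3) -> Dict[str, float]:
    """Return the best observed throughput (operations/second) per scenario.

    Each callable runs one pass of a workload and returns the number of
    operations it performed.
    """
    results = {}
    for name, workload in scenarios.items():
        best = 0.0
        for _ in range(repeats):
            start = time.perf_counter()
            operations = workload()
            # Guard against a zero reading on very coarse clocks.
            elapsed = max(time.perf_counter() - start, 1e-9)
            best = max(best, operations / elapsed)
        results[name] = best
    return results


def relative_throughput(results: Dict[str, float],
                        baseline: str) -> Dict[str, float]:
    """Express every scenario's throughput as a multiple of the baseline."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


if __name__ == "__main__":
    def stand_in_workload() -> int:
        # Placeholder for "apply one configuration, then run the queries".
        total = 0
        for i in range(100_000):
            total += i
        return 100_000

    measured = run_scenarios({"baseline": stand_in_workload,
                              "candidate": stand_in_workload})
    print(relative_throughput(measured, "baseline"))
```

Taking the best of several repeats is one simple way to reduce noise from external interference. Reporting medians or full distributions would be an equally reasonable choice.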
All objectives were achieved: PostgreSQL can store complex documents while keeping the original data types of their values, the query planner can make smarter decisions based on data statistics, and the implementation has the best performance of the compared solutions.
Download: full thesis, PostgreSQL VelocyPack extension