The Cost of SQL - First Measurements
These are preliminary benchmarks comparing different DB backends. I measured the pages-per-second rate for fetching different pages from a blog that includes 5,000 articles and 50,000 comments on them. The client and the server ran on the same PC.
The settings and the database are the same as those used in this article.
Backend        gzip   no-gzip   gzip %   no-gzip %
--------------------------------------------------
Berkeley DB     565       830      N/A         N/A
MySQL InnoDB    475       645     -16%        -22%
Sqlite3         410       515     -27%        -38%
PostgreSQL      305       360     -46%        -57%
We can see:
- The cost of using SQL databases is not negligible. However, the price is not too high with a fast database like MySQL.
- PostgreSQL badly surprised me with its real-world performance. Maybe I'm doing something wrong?
Berkeley DB Out, MySQL In...
After a long period of tests and thought, I finally decided to move from Berkeley DB to a traditional database of the kind commonly used in web development.
I chose the SOCI library as a backend: it provides a universal C++ driver that gives access to all the popular databases:
- MySQL
- PostgreSQL
- Sqlite3
- Firebird
- Oracle
- MS SQL via ODBC.
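Fetching a single post through SOCI might look roughly like this. This is only a sketch: the connection string, table, and column names are hypothetical, not this blog's actual schema, and it needs a running MySQL server to execute.

```cpp
#include <soci/soci.h>
#include <soci/mysql/soci-mysql.h>
#include <iostream>
#include <string>

int main()
{
    // Hypothetical connection parameters -- adjust to your own setup.
    soci::session sql(soci::mysql, "db=blog user=blog password=secret");

    int post_id = 1;
    std::string title, content;

    // SOCI binds input/output variables with use()/into(); the same
    // statement works unchanged against PostgreSQL, Sqlite3, etc. by
    // swapping the backend factory passed to the session.
    sql << "SELECT title, content FROM posts WHERE id = :id",
        soci::into(title), soci::into(content), soci::use(post_id);

    std::cout << title << "\n";
    return 0;
}
```

This backend-independence is exactly what makes it possible to benchmark MySQL, Sqlite3, and PostgreSQL with one code path.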
About SOCI
I used the latest CVS version of this library, which is moving towards "boostification". I implemented several patches to make it more useful for this project; I have sent them to the developers and they are waiting to be merged into the CVS tree.
Roadmap...
In this post I'll describe the roadmap of this project for the near term. There are several important points to deal with:
- Move from Berkeley DB to SQL. This involves integration with an external general-purpose SQL library; I will probably use SOCI or libdbi.
- Preparation of a caching system.
- Improvements to the text tools -- better Markdown or reStructuredText support.
- Code cleanup and initial alpha release of the framework and the CMS itself (this blog).
BDB or not BDB - that is the question...
At the beginning of the project I had several options for persistent storage for CppCMS:
- Use a standard SQL database like MySQL or PostgreSQL.
- Use an embedded SQL database like Sqlite3.
- Use Berkeley DB
- Implement my own storage model.
At the beginning I wanted to use MySQL. At a certain point I decided to switch to Berkeley DB, and now I come back to the original question: what data storage should I use?
In order to make a proper design I ran lots of micro-benchmarks and finally chose Berkeley DB. However, I had never run queries against a real DB. Now, having implemented a simple blog, I decided to run benchmarks in a situation closer to a real one.
I reimplemented the operation of fetching a single post using MySQL, imported a database of 5,000 articles with 10 comments each, and tested.
Next Step - Caching
As we saw in the previous article, the benchmarks showed that CppCMS is able to produce about 630 compressed pages per second, an output of about 20 Mbit/s. Is this enough?
For most cases it is... But as we have seen, I want to use every CPU cycle as smartly as I can. Even if the model I suggested served as a proof of concept, an important point was missed: why should I generate the same page so many times?
Caching
This is the next logical step in the development of a high-performance web framework.
First of all, we should understand the requirements of the caching system:
- Efficiency
- Support for dropping cache entries on update
- Support for dropping cache entries by timeout
- Work in three models: a single-process cache, a cache shared between processes, and a cache shared over the network.
- Support for caching at both the full-page level and the single-view level
- Transparent storage of compressed content
Let's describe each one of them: