Posts in category ‘Cache’.
Version 0.99.9 Released
New Features:
Clang support is provided; CppCMS was tested against Clang 2.8.
Now CppCMS supports 5 families of C++ compilers:
- GCC 3.4.x to 4.6.1
- Visual Studio 2005 - 2010
- Clang 2.8
- Intel 11
- Sun Studio 5.10
Significant performance improvements in XSS filtering by rewriting URI validation as a hand-written C++ parser rather than a complex regular expression.
Added support for fully custom validation of HTML attributes using callback functions in the XSS filter.
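To give an idea of what replacing a regular expression with a hand-written parser can look like, here is a minimal sketch. It is not the CppCMS code: the name `looks_like_safe_uri`, the scheme whitelist and the set of rejected characters are all assumptions made for this example. It walks the character range once and allocates no memory.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <string>

// Hypothetical helper, not part of CppCMS: accepts relative references and
// absolute URIs whose scheme is in a small whitelist.
inline bool looks_like_safe_uri(char const *begin, char const *end)
{
    assert(begin <= end);
    // Reject control characters, spaces and characters that could break
    // out of an HTML attribute value.
    for(char const *p = begin; p != end; ++p) {
        unsigned char c = static_cast<unsigned char>(*p);
        if(c <= 0x20 || c == 0x7F || c == '"' || c == '\'' || c == '<' || c == '>')
            return false;
    }
    // A scheme, if present, ends at the first ':' seen before any '/', '?' or '#'.
    char const *p = begin;
    while(p != end && *p != ':' && *p != '/' && *p != '?' && *p != '#')
        ++p;
    if(p == end || *p != ':')
        return true; // no scheme: treat as a relative reference
    // Compare the scheme case-insensitively against the (assumed) whitelist.
    static char const *const allowed[] = { "http", "https", "ftp", "mailto" };
    std::size_t len = static_cast<std::size_t>(p - begin);
    for(char const *scheme : allowed) {
        if(std::strlen(scheme) != len)
            continue;
        bool same = true;
        for(std::size_t i = 0; i < len; ++i) {
            if(std::tolower(static_cast<unsigned char>(begin[i])) != scheme[i]) {
                same = false;
                break;
            }
        }
        if(same)
            return true;
    }
    return false;
}

// Convenience overload for whole strings.
inline bool looks_like_safe_uri(std::string const &s)
{
    return looks_like_safe_uri(s.data(), s.data() + s.size());
}
```

A function of this shape, taking a character range and returning a bool, is also the kind of thing that could be plugged in as a custom attribute validation callback, although the exact callback signature is not stated here.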
Significant performance improvements in multiple places in the code by eliminating many memory allocations:
- HTTP, SCGI and FastCGI backends - improved memory allocation for CGI variables.
- Fetching values from JSON objects using get(...), find(...) APIs is now done with 0 memory allocation.
- URL mapping is now done with 0 or very low memory allocation.
- Various filters like escape, urlencode and some others now work with no or few memory allocations.
Performance improvements in caching by replacing the balanced binary tree with a hash table in the primary cache key index.
Breaking Changes:
json::object has changed from std::map<std::string,value> to std::map<string_key,value>. It should be fully transparent for almost all users.
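To illustrate why the key type matters for allocation-free lookups, here is a minimal sketch of a key that can either own its text or merely refer to a character range. The names `key_sketch`, `object_sketch` and `find_value` are hypothetical and exist only in this example; the real string_key class may look quite different.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Hypothetical key type for illustration only.
class key_sketch {
public:
    // Owning key: copies the text; used for keys stored inside the map.
    key_sketch(std::string const &s) : owned_(s), begin_(nullptr), end_(nullptr) {}

    // Borrowing key: only points at the caller's buffer; used for lookups.
    static key_sketch borrowed(char const *b, char const *e)
    {
        assert(b != nullptr && b <= e);
        key_sketch k;
        k.begin_ = b;
        k.end_ = e;
        return k;
    }

    char const *begin() const { return begin_ ? begin_ : owned_.data(); }
    char const *end() const { return begin_ ? end_ : owned_.data() + owned_.size(); }

    // Both kinds of key compare by content, so they can be mixed freely.
    bool operator<(key_sketch const &other) const
    {
        return std::lexicographical_compare(begin(), end(), other.begin(), other.end());
    }

private:
    key_sketch() : begin_(nullptr), end_(nullptr) {}
    std::string owned_;   // empty for borrowing keys
    char const *begin_;   // null for owning keys
    char const *end_;
};

typedef std::map<key_sketch, int> object_sketch;

// Look up a member by C string without building a std::string for the key.
inline int const *find_value(object_sketch const &obj, char const *name)
{
    object_sketch::const_iterator it =
        obj.find(key_sketch::borrowed(name, name + std::strlen(name)));
    return it == obj.end() ? nullptr : &it->second;
}
```

With a plain std::map<std::string,value>, the same lookup by `char const *` would first have to construct a temporary std::string, which is the kind of allocation this design avoids.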
Bugs:
- Fixed a crash in http::response when writing HTTP headers throws, for example due to incorrect file permissions.
- Fixed a bug in booster::regex that prevented some valid patterns from being matched against some regular expressions.
- Fixed a bug that may prevent booster::regex from working on big-endian 64-bit platforms.
- Added initial support of Python3 for the templates compiler.
- Added a workaround for systems that use python3 by default.
CppCMS benchmarks vs Java, C#, PHP
A long time ago I posted benchmarks comparing a CppCMS-based blog and a PHP-based one.
I wanted to compare real-life applications with each other. For a long time I had been searching for similar applications doing very similar jobs in the leading technologies: PHP, Asp.Net and Java/JSP. The last two were particularly important as they use static type systems and "compiled" languages, C# and Java, that are known to be faster than dynamically typed languages like PHP, Python, Ruby and Perl that are popular in web development.
Setup
Unfortunately I failed to find such applications, so I finally decided to write something small and representative on my own: an application with the following requirements:
- Uses simple time-out based page caching.
- Uses MySQL as the database and keeps open connections in a pool.
- For each request, accesses the database (if the page is not cached) and fetches the page content and comments for a "sample article" in a blog.
- Converts text to HTML using a Markdown filter and displays it on the page.
I used the following technologies:
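The first requirement, time-out based page caching, can be sketched roughly as follows. This is not the code of any of the benchmarked applications: the class name `page_cache_sketch` and its interface are assumptions for this example, and the current time is passed in explicitly so the behaviour is easy to follow.

```cpp
#include <cassert>
#include <ctime>
#include <string>
#include <unordered_map>

// Hypothetical page cache where every entry lives for a fixed time-out.
class page_cache_sketch {
public:
    explicit page_cache_sketch(std::time_t timeout_seconds) : timeout_(timeout_seconds)
    {
        assert(timeout_seconds > 0);
    }

    // Returns true and fills `page` if a non-expired entry exists.
    bool fetch(std::string const &url, std::time_t now, std::string &page)
    {
        auto it = pages_.find(url);
        if(it == pages_.end())
            return false;
        if(it->second.expires <= now) {
            pages_.erase(it); // stale entry: drop it and report a miss
            return false;
        }
        page = it->second.html;
        return true;
    }

    // Stores a rendered page that stays valid for the configured time-out.
    void store(std::string const &url, std::time_t now, std::string const &html)
    {
        pages_[url] = entry{html, now + timeout_};
    }

private:
    struct entry {
        std::string html;
        std::time_t expires;
    };
    std::time_t timeout_;
    std::unordered_map<std::string, entry> pages_;
};
```

A request handler built on this would call `fetch` first and only query the database, run the Markdown filter and call `store` when it reports a miss.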
CppCMS:
Version: 0.99.3
MySQL Connection: dbixx/libdbi library using libmysqlclient
Markdown library: discount
Connection: internal HTTP server
PHP:
Version: 5.2.6
MySQL Connection: internal driver
Markdown library: PHP-Markdown
Connection: Lighttpd 1.4.19 + FastCGI
Bytecode Cache: XCache
Asp.Net/Mono:
Version: 2.6.7
MySQL Connection: Connector/Net
Markdown library: MarkdownSharp
Connection: internal HTTP server XSP (found to be much faster than the FastCGI server)
JSP/Tomcat:
Version - Tomcat: 6.0.18
Version - Java: Sun Java 1.6.0_12
MySQL Connection: Connector/J
Markdown library: jmd-0.8.1
Caching: oscache 2.4.1
Connection: HTTP
I tested the following parameters:
- Pages generated per second for different cache hit/miss ratios: starting from a 0% miss ratio up to a 100% miss ratio.
- Memory usage
For each test the application was "warmed up" with 100 requests to fill the cache, and then 1000 requests were made with a maximum concurrency of 5 requests, where a certain percentage of them were new pages and the others were taken from the "warmed up" ones.
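One simple way to reason about such a mix of cached and new pages: if a cached page is served at R_hit pages per second and a new page at R_miss pages per second, the average time per page at miss fraction m is (1 - m)/R_hit + m/R_miss. The sketch below encodes this model. It is an assumption made for illustration, not something the benchmark itself computes, and the function name is hypothetical.

```cpp
#include <cassert>

// Hypothetical model: average pages per second for a given miss fraction,
// assuming cached and new pages are each served at a fixed rate.
inline double expected_pages_per_second(double hit_rate, double miss_rate, double miss_fraction)
{
    assert(hit_rate > 0 && miss_rate > 0);
    assert(miss_fraction >= 0 && miss_fraction <= 1);
    // Weighted average of the time spent per page.
    double seconds_per_page = (1.0 - miss_fraction) / hit_rate + miss_fraction / miss_rate;
    return 1.0 / seconds_per_page;
}
```

For example, taking the CppCMS Markdown figures from the Results Data section (3200.73 pages per second at 0% miss and 356.968 at 100% miss), the model gives about 642 pages per second at 50% miss, close to the measured 642.769.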
Notes:
I used the fastest Markdown implementations I could find.
The C# implementation is the same one that http://stackoverflow.com uses; it is actually a heavily optimized implementation based on an older C# implementation.
The Java implementation is based on the above C# one and was the fastest one I found.
Discount is the fastest C implementation of Markdown that I found.
Results:
Summary
- The C#, Java and PHP implementations behave very similarly, without significant differences.
- The memory usage of Java/Tomcat and Mono/Asp.Net was significantly higher: up to one or two orders of magnitude in comparison to CppCMS and PHP.
- Surprisingly, PHP behaves very well in comparison to "compiled" languages like Java and C#.
Revisiting
After doing some profiling it was clear that the C implementation of Markdown was significantly faster than all other implementations. So I decided to create my own mini-markdown that does some basic handling of titles, lists, paragraphs and quotes at one level only. It is a very simple syntax, but it is implemented similarly in all 4 languages using the same algorithm.
The results were as follows:
The difference between CppCMS and the other implementations was still significant, but much smaller than with the real Markdown implementations.
2nd Revision
In the last revision I decided not to use any text filters, but to fetch ready HTML-formatted content from the DB and display it on the web as is.
Such a comparison actually profiles the most basic stuff:
- Caching
- SQL Connection
- Request/Response handling
It ignores the hundreds of lines of code that are responsible for the actual business logic in any web application.
Conclusions
- Using C++ with CppCMS provides significant performance gains in developing web applications, even in a very basic case.
- The performance is affected not only by the framework itself but also by the many other libraries that are in use. Using highly optimized C and C++ libraries may give a significant performance gain in many cases.
- So-called "JIT-compiled" languages such as C# and Java, and the frameworks based on them, use a significant amount of memory and still provide much lower performance than can be achieved using truly compiled languages like C++.
- It is good to remember that these benchmarks are still quite synthetic, and in real life the actual performance depends on many factors. Still, using the high-quality, high-performance libraries available for C++ has a significant impact on performance.
Results Data
Markdown (pages per second)
---------
Miss %   CppCMS    Mono      PHP       JSP/Tomcat
0        3200.73   747.164   974.142   821.887
1        2891.2    427.727   724.173   337.736
2        2734.69   300.017   544.162   257.44
5        2285.95   162.686   301.507   130.023
10       1749.14   89.4447   174.724   68.5387
20       1247.86   47.7347   93.7919   25.7081
50       642.769   19.8311   38.979    15.1298
100      356.968   9.77116   20.1892   7.96328
Mini-Markdown (pages per second)
---------
Miss %   CppCMS    Mono      PHP       JSP/Tomcat
0        3103.14   763.222   1152.63   744.72
1        2933.97   728.971   1076.38   765.599
2        2944.42   726.338   1016.42   724.869
5        2804.44   661.613   866.32    822.927
10       2592.99   584.725   705.465   753.218
20       2239.03   471.576   507.021   674.488
50       1625.5    309.443   274.962   374.26
100      1156.09   197.123   159.974   164.515
HTML (pages per second)
-----
Miss %   CppCMS    Mono      PHP       JSP/Tomcat
0        3286.51   849.849   1147.21   808.038
1        3055.53   776.305   1137.35   748.829
2        2991.02   691.502   1122.88   693.439
5        2687.84   693.257   1074.22   756.618
10       2390.12   615.311   1016.27   604.452
20       1886.69   521.467   917.225   668.23
50       1947.93   346.672   669.693   289.656
System and Hardware
- OS: Linux, Debian Lenny, 64 bit
- Hardware: AMD Athlon XP 3000, 64 bit, 1GB memory
Code
The code can be downloaded from there. Note that to run it you will need to have some libraries installed and to adjust some hardcoded paths.
CppCMS 0.0.4 Released
Version 0.0.4 of CppCMS has been released.
It includes optimizations required for using it in embedded systems.
Normal Embedded Build:
- Caching is completely removed. A small memory footprint is very important for embedded systems, thus caching stuff in memory is quite useless.
- Zlib compression is removed, which removes the dependency on the boost::iostreams, zlib and bzip2 libraries.
- Removed mod-prefork.
- Removed dynamic loading of templates. This feature requires exporting symbols from the binary in order to make RTTI work, which increases its size. Thus, all templates should be statically compiled into the binary.
Embedded CGI Mode:
- FastCGI and SCGI APIs are removed
- Mod-thread and mod-process are removed, including all thread pool facilities.
- Changes in the file-based session backend to work properly in CGI mode, including garbage collection of sessions that have timed out.
Downloads are available from the Sf Project Page.
Preparing to Beta 2...
What is expected in the next beta version:
Now CppCMS is really ready for serving high-load sites, thanks to the new distributed cache module based on Boost.Asio.
Several CppCMS processes running on different computers can share the same cache distributed over several TCP/IP cache servers.
- Statically typed template system that is fully integrated with the framework and allows:
  - "Django style" template inheritance.
  - Powerful extension abilities using C++ code directly.
  - Static compilation with generated template code or loading templates as external shared objects.
  - Creates a potential for future "forms/widgets" integration.
- Various bugfixes and code cleanup.
Possibly included: form validation and generation modules.
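To give a feeling for how "Django style" template inheritance can map onto statically typed C++, here is a minimal sketch. It is not the code that the CppCMS template compiler generates; the class names `master_view` and `post_view` and the helper `render_to_string` are assumptions made for this example. The idea is that each template block becomes a virtual member function, and a child template overrides only the blocks it changes.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical "master" template: defines the page skeleton and named blocks.
class master_view {
public:
    virtual ~master_view() {}
    // Blocks with default content that child templates may override.
    virtual void title(std::ostream &out) const { out << "My Site"; }
    virtual void content(std::ostream &out) const { out << "Nothing here"; }
    // The skeleton itself is shared by all child templates.
    void render(std::ostream &out) const
    {
        out << "<html><head><title>";
        title(out);
        out << "</title></head><body>";
        content(out);
        out << "</body></html>";
    }
};

// Hypothetical child template: overrides one block and inherits the rest.
class post_view : public master_view {
public:
    explicit post_view(std::string const &text) : text_(text) {}
    // HTML escaping of the text is omitted to keep the sketch short.
    void content(std::ostream &out) const override { out << "<p>" << text_ << "</p>"; }
private:
    std::string text_;
};

inline std::string render_to_string(master_view const &view)
{
    std::ostringstream out;
    view.render(out);
    return out.str();
}
```

Because the templates are ordinary classes, they can either be compiled statically into the application or built into a shared object and loaded at run time.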
To be continued...
Caching System: Internals
One of the most recently implemented features of CppCMS is a caching system.
Each cached entry is stored using:
- A unique key that identifies the entry.
- The actual data.
- The entry lifetime period.
- The set of triggers. This is a feature that is not available in many cache systems, such as memcached.
For example: a main page that displays 5 recent posts may have the key main_page and the triggers post_123, post_124, ... , post_128. Moreover, each time you fetch some cached data during a page build, like a sidebar or a set of options, its set of triggers is automatically added to the set of triggers of the page you are building.
For example, when the page is created and the sidebar block is fetched from the cache, all its triggers are automatically added: if sidebar depends on options, then the triggers sidebar and options will be automatically added to the triggers of main_page.
Thus, when a certain trigger is raised, all pages that depend on it are automatically dropped from the cache. This makes the cache system quite powerful and makes it easy to keep the displayed data correct.
The developer is expected to create a rational model of data and triggers that represents the relations between parts of the internal data, and to raise these triggers when committing changes to the database.
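The trigger mechanism described above can be sketched as follows. This is a simplified illustration and not the actual CppCMS cache: the class `trigger_cache_sketch` and its member names are assumptions for this example, and entry lifetime handling is left out. As in the sidebar example, the key of an entry is treated as one of its own triggers.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical cache with trigger-based invalidation (no lifetime handling).
class trigger_cache_sketch {
public:
    typedef std::set<std::string> trigger_set;

    // Store data under `key`; the key itself always acts as a trigger.
    void store(std::string const &key, std::string const &data, trigger_set triggers)
    {
        remove(key); // replace any older entry and its index records
        triggers.insert(key);
        for(std::string const &t : triggers)
            by_trigger_[t].insert(key);
        entries_[key] = entry{data, triggers};
    }

    // Fetch data; on a hit, the entry's triggers are added to `collected`,
    // so the page being built inherits the dependencies of each block it uses.
    bool fetch(std::string const &key, std::string &data, trigger_set &collected) const
    {
        auto it = entries_.find(key);
        if(it == entries_.end())
            return false;
        data = it->second.data;
        collected.insert(it->second.triggers.begin(), it->second.triggers.end());
        return true;
    }

    // Raise a trigger: drop every entry that depends on it.
    void rise(std::string const &trigger)
    {
        auto it = by_trigger_.find(trigger);
        if(it == by_trigger_.end())
            return;
        std::set<std::string> keys = it->second; // copy: remove() edits the index
        for(std::string const &k : keys)
            remove(k);
    }

    bool has(std::string const &key) const { return entries_.count(key) != 0; }

private:
    struct entry {
        std::string data;
        trigger_set triggers;
    };

    // Erase one entry together with its records in the trigger index.
    void remove(std::string const &key)
    {
        auto it = entries_.find(key);
        if(it == entries_.end())
            return;
        for(std::string const &t : it->second.triggers) {
            auto bt = by_trigger_.find(t);
            if(bt == by_trigger_.end())
                continue;
            bt->second.erase(key);
            if(bt->second.empty())
                by_trigger_.erase(bt);
        }
        entries_.erase(it);
    }

    std::map<std::string, entry> entries_;                     // key -> entry
    std::map<std::string, std::set<std::string> > by_trigger_; // trigger -> keys
};
```

In this sketch, if main_page is built from a sidebar block that depends on options, storing main_page with the collected triggers means that raising options drops both sidebar and main_page.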