CppCMS Blog :: Cache
http://blog.cppcms.com/
A blog on CppCMS - C++ Web Development Framework

Version 0.99.9 Released
http://blog.cppcms.com/post/83
<div style="direction:ltr">
<h2>New Features:</h2>
<ul>
<li><p>Clang support is provided; CppCMS was tested against Clang 2.8.</p>
<p>CppCMS now supports 5 families of C++ compilers:</p>
<ul> <li>GCC 3.4.x to 4.6.1</li> <li>Visual Studio 2005 - 2010</li> <li>Clang 2.8</li> <li>Intel 11</li> <li>Sun Studio 5.10</li> </ul> </li>
<li><p>Significant performance improvements in XSS filtering: URI validation was rewritten as a C++ parser instead of a complex regular expression.</p>
<p>Added support for fully custom validation of HTML attributes using callback functions in the XSS filter.</p></li>
<li><p>Significant performance improvements in multiple places in the code by eliminating memory allocations:</p>
<ul> <li>HTTP, SCGI and FastCGI backends - improved memory allocation for CGI variables.</li> <li>Fetching values from JSON objects using the get(...) and find(...) APIs is now done with 0 memory allocations.</li> <li>URL mapping is now done with 0 or very few memory allocations.</li> <li>Various filters like <code>escape</code>, <code>urlencode</code> and some others now work with no or few memory allocations.</li> </ul> </li>
<li><p>Performance improvements in caching by replacing the balanced binary tree with a hash table in the primary cache key index.</p></li>
</ul>
<h2>Breaking Changes:</h2>
<ul>
<li><code>json::object</code> has changed from <code>std::map&lt;std::string,value&gt;</code> to <code>std::map&lt;string_key,value&gt;</code>.
It should be fully transparent for almost all users.</li>
</ul>
<h2>Bugs:</h2>
<ul>
<li>Fixed a crash in http::response when writing HTTP headers throws, for example due to incorrect file permissions.</li>
<li>Fixed a bug in <code>booster::regex</code> that prevented some valid patterns from being matched against some regular expressions.</li>
<li>Fixed a bug that could prevent <code>booster::regex</code> from working on big endian 64 bit platforms.</li>
<li>Added initial support of Python3 for the templates compiler.</li>
<li>Added a workaround for systems that use python3 by default.</li>
</ul>
</div>

CppCMS benchmarks vs Java, C#, PHP
http://blog.cppcms.com/post/67
<div style="direction:ltr">
<p>A long time ago I posted <a href="http://art-blog.no-ip.info/cppcms/blog/post/22">benchmarks</a> comparing a CppCMS based blog with a PHP based one.</p>
<p>I wanted to compare real life applications with each other. For a long time I had been searching for similar applications doing very similar jobs in the leading technologies: PHP, Asp.Net and Java/JSP.
The last two were particularly important as they use static type systems and "compiled" languages, C# and Java, which are known to be faster than dynamically typed languages like PHP, Python, Ruby and Perl that are popular in web development.</p>
<h2>Setup</h2>
<p>Unfortunately I failed to find such applications, so finally I decided to write a small, representative application of my own with the following requirements:</p>
<ol>
<li>Uses simple time-out based page caching.</li>
<li>Uses MySQL as the database and keeps open connections in a pool.</li>
<li>For each request, accesses the database (if the page is not cached) and fetches the page content and comments for a "sample article" in the blog.</li>
<li>Converts text to HTML using a markdown filter and displays it on the page.</li>
</ol>
<p>I used the following technologies:</p>
<ul>
<li><p>CppCMS:</p> <p>Version: 0.99.3<br/> MySQL Connection: dbixx/libdbi library using libmysqlclient<br/> Markdown library: <a href="http://www.pell.portland.or.us/~orc/Code/discount/">discount</a><br/> Connection: internal HTTP server</p></li>
<li><p>PHP</p> <p>Version: 5.2.6<br/> MySQL Connection: internal driver<br/> Markdown library: <a href="http://michelf.com/projects/php-markdown/">PHP-Markdown</a><br/> Connection: Lighttpd 1.4.19 + FastCGI<br/> Bytecode Cache: XCache</p></li>
<li><p>Asp.Net/Mono</p> <p>Version: 2.6.7<br/> MySQL Connection: Connector/Net<br/> Markdown library: <a href="http://code.google.com/p/markdownsharp">MarkdownSharp</a><br/> Connection: internal HTTP server XSP (found to be much faster than the fastcgi server)</p></li>
<li><p>JSP/Tomcat</p> <p>Version - Tomcat: 6.0.18<br/> Version - Java: Sun Java 1.6.0_12<br/> MySQL Connection: Connector/J<br/> Markdown library: jmd-0.8.1<br/> Caching: oscache 2.4.1<br/> Connection: HTTP</p></li>
</ul>
<p>I tested the following parameters:</p>
<ul>
<li>Pages generated per second for different cache hit/miss ratios: starting from a 0% miss ratio up to a 100% miss ratio.</li>
<li>Memory usage</li>
</ul>
<p>For each test the application was "warmed up" with 100 requests to fill the cache, and then 1000 requests were made with a maximum concurrency of 5 requests, where a certain percentage of them requested new pages and the rest were served from the "warmed up" ones.</p>
<h2>Notes:</h2>
<p>I used the fastest Markdown implementations I had found.</p>
<p>The C# implementation is the same one that <a href="http://stackoverflow.com">http://stackoverflow.com</a> uses - it is actually a heavily optimized implementation based on an older C# implementation.</p>
<p>The Java implementation is based on the above C# one and is the fastest one I had found.</p>
<p>Discount is the fastest C implementation of markdown that I had found.</p>
<h2>Results:</h2>
<p><img src="http://cppcms.com/files/test1-small.png" alt="Benchmarks Markdown" /></p>
<p><img src="http://cppcms.com/files/mem-use.png" alt="Memory" /></p>
<h2>Summary</h2>
<ol>
<li>The C#, Java and PHP implementations behave very similarly, without significant differences.</li>
<li>The memory usage of Java/Tomcat and Mono/Asp.Net was significantly higher - up to one or two orders of magnitude higher in comparison to CppCMS and PHP.</li>
<li>Surprisingly, PHP behaves very well in comparison to "compiled" languages like Java and C#.</li>
</ol>
<h2>Revisiting</h2>
<p>After doing some profiling it was clear that the C implementation of Markdown was significantly faster than all the other implementations. So I decided to create my own mini-markdown that does some basic handling of titles, lists, paragraphs and quotes at one level only. It is a very simple syntax, but it is implemented similarly in all 4 languages using the same algorithm.</p>
<p>The results were as follows:</p>
<p><img src="http://cppcms.com/files/test3-small.png" alt="Benchmarks Markdown" /></p>
<p>The difference between CppCMS and the other implementations was still significant, but much smaller than the difference seen with the real markdown implementations.
So the performance difference was less dramatic.</p>
<h2>2nd Revision</h2>
<p>In the last revision I decided not to use any text filters, but to fetch ready HTML-formatted content from the DB and display it on the web as is.</p>
<p>Such a comparison actually profiles the most basic stuff:</p>
<ol> <li>Caching</li> <li>SQL Connection</li> <li>Request/Response handling</li> </ol>
<p>It ignores the hundreds of lines of code that are responsible for the actual business logic in any web application.</p>
<p><img src="http://cppcms.com/files/sql-only.png" alt="Benchmarks HTML" /></p>
<h2>Conclusions</h2>
<ol>
<li>Using C++ with CppCMS provides significant performance gains in developing web applications even in the <strong>very</strong> basic case.</li>
<li>The performance is affected not only by the framework itself but also by the many other libraries that are in use. Using highly optimized C and C++ libraries may give a significant performance gain in many cases.</li>
<li>So-called "jit-compiled" languages such as C# and Java, and the frameworks based on them, use a significant amount of memory and still provide much lower performance than can be achieved using real compiled languages like C++.</li>
<li>It is good to remember that these benchmarks are still quite synthetic, and in real life the actual performance depends on many factors - but using the high quality, high performance libraries available for C++ has a significant impact on performance.</li>
</ol>
<h2>Results Data</h2>
<p>Pages generated per second at each cache miss ratio:</p>
<pre><code>Markdown
---------
Miss %  CppCMS   Mono     PHP      JSP/Tomcat
0       3200.73  747.164  974.142  821.887
1       2891.2   427.727  724.173  337.736
2       2734.69  300.017  544.162  257.44
5       2285.95  162.686  301.507  130.023
10      1749.14  89.4447  174.724  68.5387
20      1247.86  47.7347  93.7919  25.7081
50      642.769  19.8311  38.979   15.1298
100     356.968  9.77116  20.1892  7.96328

Mini-Markdown
---------
Miss %  CppCMS   Mono     PHP      JSP/Tomcat
0       3103.14  763.222  1152.63  744.72
1       2933.97  728.971  1076.38  765.599
2       2944.42  726.338  1016.42  724.869
5       2804.44  661.613  866.32   822.927
10      2592.99  584.725  705.465  753.218
20      2239.03  471.576  507.021  674.488
50      1625.5   309.443  274.962  374.26
100     1156.09  197.123  159.974  164.515

HTML
-----
Miss %  CppCMS   Mono     PHP      JSP/Tomcat
0       3286.51  849.849  1147.21  808.038
1       3055.53  776.305  1137.35  748.829
2       2991.02  691.502  1122.88  693.439
5       2687.84  693.257  1074.22  756.618
10      2390.12  615.311  1016.27  604.452
20      1886.69  521.467  917.225  668.23
50      1947.93  346.672  669.693  289.656
</code></pre>
<h2>System and Hardware</h2>
<ul> <li>OS: Linux, Debian Lenny, 64 bit</li> <li>Hardware: AMD Athlon XP 3000, 64 bit, 1GB memory</li> </ul>
<h2>Related:</h2>
<ul> <li><a href="http://www.reddit.com/r/programming/comments/ds69u/web_development_benchmarks_of_ccppcms_vs_php/">Comments on Reddit</a></li> <li><a href="http://art-blog.no-ip.info/cppcms/blog/post/22">CppCMS Based Blog vs WordPress Benchmarks</a></li> </ul>
<h2>Code</h2>
<p>The <a href="http://cppcms.com/files/benchmarks-code.tar.gz">code</a> is available for download. Note that to run it you will need to have some libraries installed and to adjust some hardcoded paths.</p>
<p> <a href="/post/67">more...</a> </p>
</div>

CppCMS 0.0.4 Released
http://blog.cppcms.com/post/41
<div style="direction:ltr">
<p>Version 0.0.4 of CppCMS has been released.</p>
<p>It includes optimizations required for using it in embedded systems.</p>
<p><em>Normal Embedded Build:</em></p>
<ul>
<li>Caching is completely removed. A small memory footprint is very important for embedded systems, so caching content in memory is of little use there.</li>
<li>Zlib compression is removed, which removes the dependency on the boost::iostreams, zlib and bzip2 libraries.</li>
<li>Removed mod-prefork.</li>
<li>Removed dynamic loading of templates: this feature requires exporting symbols from the binary and increases its size in order to make RTTI work.
Thus, all templates should be statically compiled into the binary.</li>
</ul>
<p><em>Embedded CGI Mode:</em></p>
<ul>
<li>FastCGI and SCGI APIs are removed.</li>
<li>Mod-thread and mod-process are removed, including all thread pool facilities.</li>
<li>Changes in the file-based session backend to make it work properly in CGI mode, including garbage collection of sessions that have timed out.</li>
</ul>
<p>Downloads are available from the Sf Project Page.</p>
</div>

Preparing to Beta 2...
http://blog.cppcms.com/post/29
<div style="direction:ltr">
<p>What is expected in the next beta version:</p>
<ol>
<li><p>CppCMS is now really ready for serving high-load sites, thanks to the new distributed cache module based on Boost.Asio.</p> <p> Several CppCMS processes running on different computers can share the same cache distributed over several TCP/IP cache servers.</p></li>
<li>A statically typed template system, fully integrated with the framework, that allows: <ul> <li>"Django style" template inheritance.</li> <li>Powerful extension abilities using C++ code directly.</li> <li>Static compilation with generated template code, or loading templates as external shared objects.</li> <li>A potential for future "forms/widgets" integration.</li> </ul> </li>
<li>Various bugfixes and code cleanup.</li>
</ol>
<p>Possibly included: form validation and generation modules.</p>
<p>To be continued...</p>
</div>

Caching System: Internals
http://blog.cppcms.com/post/21
<div style="direction:ltr">
<p>One of the latest implemented features of CppCMS is a caching system.</p>
<p>Each cached entry is stored with:</p>
<ul>
<li>A unique key that identifies the entry.</li>
<li>The actual data.</li>
<li>The entry lifetime period.</li>
<li>A set of triggers. This feature is not available in many cache systems like <a href="http://www.danga.com/memcached/">memcached</a>.</li>
</ul>
<p>For example: a main page that displays 5 recent posts may have the key
<code>main_page</code> and the triggers <code>post_123</code>, <code>post_124</code>, ... , <code>post_128</code>. Moreover, each time you fetch some cached data during the page build, like a sidebar or a set of options, its set of triggers is automatically added to the set of triggers of the page you are building.</p>
<p>For example, when the page is created and the sidebar block is fetched from the cache, all its triggers are automatically added: if <code>sidebar</code> depends on <code>options</code>, then the triggers <code>sidebar</code> and <code>options</code> are automatically added to the triggers of <code>main_page</code>.</p>
<p>Thus, when a certain trigger is raised, all pages that depend on it are automatically dropped from the cache. This makes the cache system quite powerful and makes it easy to keep the displayed data correct.</p>
<p>The developer is expected to create a rational model of data and triggers that represents the relations between the parts of the internal data, and to raise these triggers when committing changes to the database.</p>
<p> <a href="/post/21">more...</a> </p>
</div>

API Changes and mod-prefork
http://blog.cppcms.com/post/24
<div style="direction:ltr">
<p>A lot of work has been done in recent weeks to make deep internal changes in the framework. They now include:</p>
<ol>
<li>Transparent support of 3 web server APIs: fastcgi, cgi and scgi.</li>
<li>Support of the new mod-prefork that allows safer management of worker processes.</li>
<li>Implementation of a cache that is shared between forked processes.</li>
</ol>
<p> <a href="/post/24">more...</a> </p>
</div>

CppCMS vs WordPress
http://blog.cppcms.com/post/22
<div style="direction:ltr">
<h3>Setup</h3>
<p>I compared two blog systems: this one and WordPress 2.5 with a patched WP-Cache-2 addon.
I used the following configuration:</p>
<ol>
<li>Web Server lighttpd 1.4.13</li>
<li>Interface FastCGI</li>
<li>PHP 5.2</li>
<li>Bytecode cache: XCache 1.2.1</li>
<li>Database MySQL 5.0</li>
<li>Caching for WP: WP-Cache-2 with an additional <a href="http://art-blog.no-ip.info/cppcms/blog/post/20">performance patch</a></li>
<li>Hardware: AMD Athlon XP 64bit, 1G RAM</li>
<li>OS: Linux, Debian Etch 64bit.</li>
</ol>
<p>I prepared two blogs, each filled with 1000 articles. Each article had 10 comments, and the articles in each blog were organized in 10 categories.</p>
<p> <a href="/post/22">more...</a> </p>
</div>

Patch For WP-Cache-2 plugin
http://blog.cppcms.com/post/20
<div style="direction:ltr">
<p>I'm going to run heavy benchmarks comparing WordPress, the blog system I know very well, with the CppCMS based blog, the system I wrote.</p>
<p>The new caching system that was developed for CppCMS is quite smart: it stores entry pages twice, original and gzip compressed. Under heavy load this allows serving pages significantly faster, because the only thing that has to be done is to push the HTML or compressed HTML page directly from the cache. Otherwise, gzip compression (even the fastest) would take lots of resources and reduce the performance of the system.</p>
<p>When it came to benchmarks, I discovered that the WP-Cache-2 plugin does the job well, but it caches only the HTML version of the page; thus, even if the page is cached, it still must be compressed by Apache's mod_deflate or by the PHP engine itself.</p>
<p>I patched this plugin, and now it stores two versions of the same page: an original and a compressed one.
With the patch I was able to get a 60% performance improvement.</p>
<ul>
<li>WordPress native plugin: 450 requests per second</li>
<li>WordPress patched plugin: 720 requests per second</li>
</ul>
<p>After this patch I feel that the benchmarks will be fair: without it, it would be incorrect to compare the time required for fetching a page from a cache with the time required for compressing an entry page.</p>
<p>Links:</p>
<ul> <li><a href="http://mnm.uib.es/gallir/wp-cache-2/">WP Cache 2</a></li> <li><a href="http://cppcms.com/files/wp-cache.patch">Patch</a></li> </ul>
<p><strong>N.B.:</strong> The full benchmarks are coming soon.</p>
</div>

The Roadmap to The First Beta Version of CppCMS
http://blog.cppcms.com/post/15
<div style="direction:ltr">
<p>After quite a long period of development I have decided to prepare the first public beta release of CppCMS.</p>
<p>The major components of this blog and the framework that I want to introduce in the first beta are the following:</p>
<ul>
<li>Implementation of Django style template inheritance and filters (done 70%)</li>
<li>Introduce a powerful cache system (done 100%)</li>
<li>Replace <a href="http://soci.sourceforge.net">SOCI</a> by <a href="http://libdbi.sourceforge.net">LibDBI</a> (done 100%)</li>
<li>Improve the blog: true markdown, LaTeX equations, categories etc. (done 100%)</li>
<li>Write documentation (done 20%)</li>
<li>Migrate my Hebrew blog from WordPress to CppCMS (done 100%)</li>
</ul>
<p>There is lots of work to do, but CppCMS now looks much more mature than before.</p>
<p> <a href="/post/15">more...</a> </p>
</div>

Next Step - Caching
http://blog.cppcms.com/post/7
<div style="direction:ltr">
<p>As we saw in the previous article, the <a href="http://art-blog.no-ip.info/cppcms/blog/post/4">benchmarks</a> showed that CppCMS is able to produce about 630 compressed pages per second, with an output of about 20Mbit/s. Is this enough?</p>
<p>For most cases it is...
But, as we have seen, I want to use every CPU cycle as smartly as I can. Even if the model I suggested was able to show a "proof of concept", an important point was missed: "Why should I create the same page so many times?"</p>
<h2>Caching</h2>
<p>This is the next logical step in the development of a high performance web development framework.</p>
<p>First of all we should understand the requirements of the caching system:</p>
<ol>
<li>Efficiency</li>
<li>Support of "dropping cache on update"</li>
<li>Support of dropping the cache by timeout</li>
<li>Work using three models: single process cache, cache shared between processes, cache shared over the network.</li>
<li>Support of caching at the entry page level and at the single view level as well</li>
<li>Transparent storage of compressed content</li>
</ol>
<p>Let's describe each one of them:</p>
<p> <a href="/post/7">more...</a> </p>
</div>
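<p>The "drop on update" and "drop by timeout" requirements listed above, together with the key/lifetime/triggers model from "Caching System: Internals", can be illustrated with a minimal single-process sketch. This is not the CppCMS cache API: the class <code>trigger_cache</code> and its members <code>store</code>, <code>fetch</code> and <code>raise_trigger</code> are hypothetical names invented for this example, and the real implementation may differ in every detail.</p>

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <unordered_map>

// Hypothetical illustration only; NOT the CppCMS cache interface.
class trigger_cache {
public:
    using steady = std::chrono::steady_clock;

    // Store `data` under `key` for `lifetime`. The entry depends on `triggers`;
    // its own key is always registered as one of its triggers.
    void store(const std::string &key, const std::string &data,
               const std::set<std::string> &triggers,
               std::chrono::seconds lifetime,
               steady::time_point now = steady::now())
    {
        remove(key);                       // replace any previous entry
        entry e;
        e.data = data;
        e.triggers = triggers;
        e.triggers.insert(key);
        e.expires = now + lifetime;
        for (const std::string &t : e.triggers)
            by_trigger_[t].insert(key);    // reverse index: trigger -> keys
        entries_[key] = e;
    }

    // Fetch an entry unless it is missing or timed out. If `collect` is given,
    // the entry's triggers are added to it, so a page that embeds a cached
    // block inherits that block's triggers.
    bool fetch(const std::string &key, std::string &out,
               std::set<std::string> *collect = nullptr,
               steady::time_point now = steady::now())
    {
        auto it = entries_.find(key);
        if (it == entries_.end())
            return false;
        if (it->second.expires <= now) {   // drop by timeout
            remove(key);
            return false;
        }
        out = it->second.data;
        if (collect)
            collect->insert(it->second.triggers.begin(), it->second.triggers.end());
        return true;
    }

    // Drop on update: remove every entry that depends on `trigger`.
    void raise_trigger(const std::string &trigger)
    {
        auto it = by_trigger_.find(trigger);
        if (it == by_trigger_.end())
            return;
        std::set<std::string> keys = it->second;  // copy: remove() edits the index
        for (const std::string &k : keys)
            remove(k);
    }

    std::size_t size() const { return entries_.size(); }

private:
    struct entry {
        std::string data;
        std::set<std::string> triggers;
        steady::time_point expires;
    };

    void remove(const std::string &key)
    {
        auto it = entries_.find(key);
        if (it == entries_.end())
            return;
        for (const std::string &t : it->second.triggers) {
            auto bt = by_trigger_.find(t);
            if (bt == by_trigger_.end())
                continue;
            bt->second.erase(key);
            if (bt->second.empty())
                by_trigger_.erase(bt);
        }
        entries_.erase(it);
    }

    std::unordered_map<std::string, entry> entries_;
    std::map<std::string, std::set<std::string>> by_trigger_;
};
```

<p>In this sketch a page would pass its own trigger set as <code>collect</code> while fetching <code>sidebar</code>, store itself with the collected set, and the code that commits a change to the options would call <code>raise_trigger("options")</code>, which drops both the sidebar and every page built from it. The explicit <code>now</code> parameter exists only to make timeouts easy to exercise without sleeping.</p>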