Next Step - Caching
As we saw in the previous article, the benchmarks showed that CppCMS can produce about 630 compressed pages per second with an output of about 20 Mbit/s. Is this enough?
For most cases it is... But as we have seen, I want to use every CPU cycle as smartly as I can. Even if the model I suggested served as a proof of concept, there is an important point that was missed: why should I create the same page so many times?
Caching
This is the next logical step in the development of a high performance web development framework.
First of all, we should understand the requirements of the caching system:
- Efficiency
- Support of "dropping cache on update"
- Support of drop the cache by timeout
- Work using three models: single process cache, shared cache between processes, shared over the network.
- Support of caching on entry page level and single view level as well
- Transparent storage of compressed content
Let's describe each of them:
Efficiency is a straightforward requirement. However, for CppCMS it means the cache should be extremely efficient, which usually means direct access to in-memory data structures. No IPC should be used unless we have to.
Drop on update is one of the important parts of the cache. Let's assume we have an article that is cached in memory. If it is edited, or a new comment is posted, we should update the cache.
In some cases, that means we need to update more than a single page. For example, if I edit a category name, then the cache of every article that belongs to this category should be dropped, or we may even need to drop every cached page.
So the caching system should be flexible enough to support dropping a "range" of cached objects. This usually means that "hashed" keys, as used in memcached, are not an option, because a hash table does not allow such range searches.
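For example, if the keys live in an ordered tree such as std::map, all keys sharing a common prefix are adjacent, so a whole range can be dropped in one pass -- something a hash table cannot do. A minimal sketch (the "category/12/" key scheme is just an illustration):

```cpp
#include <map>
#include <string>

// Drop every cached object whose key starts with the given prefix,
// e.g. drop_prefix(cache, "category/12/") after the category is renamed.
// An ordered map keeps keys sorted, so the prefix forms one contiguous range.
void drop_prefix(std::map<std::string, std::string> &cache,
                 std::string const &prefix)
{
    auto first = cache.lower_bound(prefix);
    auto last = first;
    while (last != cache.end() &&
           last->first.compare(0, prefix.size(), prefix) == 0)
        ++last;
    cache.erase(first, last);
}
```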
Timeouts are an important feature. In many cases it is very complicated to describe exactly what should be dropped on a certain update, so it is simpler to define a timeout after which the cached entry expires. This works well for the main pages of high-load sites, which can, for example, refresh the statistics shown on the main page every 30 seconds.
Memory models -- as we have seen, CppCMS currently supports two models of process management -- multithreaded and multiprocess -- and, in the future, a distributed one. The cache should support all of them.
With the simple multithreaded model, it is much easier to implement the cache using STL data structures like map and list; however, with the multiprocess model, the cache must be stored in shared memory and managed by all processes, even as some of them go down or come up. The standard STL data structures will not work in shared memory, so careful design is required.
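One possible way to get map-like storage in shared memory (a sketch, not necessarily what CppCMS will end up using) is Boost.Interprocess, whose STL-compatible containers take allocators that draw from a named shared segment:

```cpp
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/containers/map.hpp>
#include <functional>
#include <utility>

namespace bip = boost::interprocess;

// An allocator that takes its memory from the shared segment, not the heap
typedef bip::allocator<std::pair<const int, int>,
                       bip::managed_shared_memory::segment_manager> shm_alloc;
typedef bip::map<int, int, std::less<int>, shm_alloc> shm_map;

int main()
{
    // Create (or attach to) a 64 KB segment visible to all worker processes
    bip::managed_shared_memory segment(bip::open_or_create, "cppcms_cache", 65536);
    shm_map *cache = segment.find_or_construct<shm_map>("entries")
                         (std::less<int>(), segment.get_segment_manager());
    (*cache)[1] = 42; // any process attached to the segment sees this entry
    return 0;
}
```

What happens to the segment when a worker dies while holding a lock is exactly the kind of careful design mentioned above.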
To support the "distributed" model, the multiprocess model may be extended to serve caching requests over TCP/IP, as memcached does.
Levels of caching should be flexible. In some cases I want to cache an entire page, and probably store it in already-compressed form to reduce the CPU load of "gzipping"; in other cases I want to store parts like the "sidebar" or "header" that rarely change but require quite complex DB queries. We should be able to store these as well.
The storage of "gzipped" content should be transparent to the web developer -- he should not have to care about it.
Design
The cache should be based on the following:
- An STL map or RB-tree as the key/value storage.
- An STL list to track the Least Recently Used (LRU) entries.
- An STL map or heap as the timeout index.
- A TCP/IP server module to support caching over the network.
- The data may be placed either in normal process memory or in shared memory.
The data should be protected with read-write locks, allowing the fastest possible concurrent access for readers and a safe exclusive lock on update.
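To make the design concrete, here is a minimal single-process sketch of these structures working together (illustrative names, modern C++ for brevity -- not the actual CppCMS API):

```cpp
#include <ctime>
#include <list>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

class basic_cache {
    struct entry {
        std::string data;                          // possibly pre-gzipped content
        std::list<std::string>::iterator lru_pos;  // position in the LRU list
        std::multimap<std::time_t, std::string>::iterator timeout_pos;
    };
    std::map<std::string, entry> index_;               // key/value storage (RB-tree)
    std::list<std::string> lru_;                       // most recently used at the front
    std::multimap<std::time_t, std::string> timeouts_; // expiry time -> key
    mutable std::shared_mutex lock_;                   // many readers, one writer
    std::size_t limit_;
public:
    explicit basic_cache(std::size_t limit) : limit_(limit) {}

    bool fetch(std::string const &key, std::string &out) {
        std::shared_lock<std::shared_mutex> guard(lock_); // readers run concurrently
        auto p = index_.find(key);
        if (p == index_.end() || p->second.timeout_pos->first < std::time(nullptr))
            return false;          // missing or expired
        out = p->second.data;      // a real cache would also promote the LRU position
        return true;
    }

    void store(std::string const &key, std::string const &data, int ttl_seconds) {
        std::unique_lock<std::shared_mutex> guard(lock_); // writers lock exclusively
        drop_unlocked(key);
        if (index_.size() >= limit_ && !lru_.empty()) {
            std::string victim = lru_.back();             // evict least recently used
            drop_unlocked(victim);
        }
        lru_.push_front(key);
        entry e;
        e.data = data;
        e.lru_pos = lru_.begin();
        e.timeout_pos = timeouts_.insert(
            std::make_pair(std::time(nullptr) + ttl_seconds, key));
        index_[key] = e;
    }

    void drop(std::string const &key) {
        std::unique_lock<std::shared_mutex> guard(lock_);
        drop_unlocked(key);
    }
private:
    void drop_unlocked(std::string const &key) {
        auto p = index_.find(key);
        if (p == index_.end())
            return;
        lru_.erase(p->second.lru_pos);
        timeouts_.erase(p->second.timeout_pos);
        index_.erase(p);
    }
};
```

A whole page could then live under a key like "page:/index" with a short timeout, while a rarely changing but expensive view such as the sidebar lives under its own key with a longer one.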
Can we do more?
Yes!
Let's assume each page has a big header, footer, and sidebar that almost never change, while the content of the page depends on the request; or take a main page that includes a small per-user part like:
Welcome Your Name, you have 3 personal messages
And the rest of the page is the same. So? The solution seems very simple: cache almost the entire page and update only the parts that changed.
So we do nothing twice! Really?
Let's take a simple example: www.whatsup.org.il
The main page is about 87K in size. Compressing it at the default compression level takes about 3.4 ms on an Athlon 3000. That means a server that updates only a single user name on the page, but then recompresses the whole thing, can serve at most about 290 pages per second (1000 ms / 3.4 ms ≈ 294).
Should we compress everything from scratch? No!!!
It is possible to "concatenate" already-compressed streams, as the sketch after the following list shows. So we can:
- Store a compressed part of the text before the "welcome..." line
- Store a compressed part of the text after it
- Compress the "welcome" line itself
- Concatenate them all together, saving lots of wasted CPU cycles.
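The trick works because raw deflate chunks flushed with Z_SYNC_FLUSH end on a byte boundary with no final block, so they can be spliced back to back, and zlib's crc32_combine() merges cached per-chunk checksums without touching the uncompressed text again. Here is a sketch of the idea with made-up page fragments (a real cache would have the compressed chunks, their CRCs, and their lengths precomputed and stored):

```cpp
// Build with: g++ concat_gzip.cpp -lz
#include <zlib.h>
#include <cstdio>
#include <cstring>
#include <string>

// Compress one fragment as a *raw* deflate stream flushed with Z_SYNC_FLUSH.
// Such a chunk ends on a byte boundary with no final block, so chunks can
// later be concatenated without recompressing anything.
static std::string deflate_chunk(std::string const &in)
{
    z_stream zs;
    std::memset(&zs, 0, sizeof zs);
    // windowBits = -15 selects raw deflate: no zlib or gzip wrapper
    deflateInit2(&zs, Z_DEFAULT_COMPRESSION, Z_DEFLATED, -15, 8, Z_DEFAULT_STRATEGY);
    std::string out(deflateBound(&zs, in.size()) + 16, '\0');
    zs.next_in = (Bytef *)in.data();
    zs.avail_in = in.size();
    zs.next_out = (Bytef *)&out[0];
    zs.avail_out = out.size();
    deflate(&zs, Z_SYNC_FLUSH);
    out.resize(out.size() - zs.avail_out);
    deflateEnd(&zs);
    return out;
}

int main()
{
    // In real life the "before"/"after" chunks come from the cache;
    // only the small middle part is compressed per request.
    std::string const parts[3] = {
        "<html><body><h1>big cached header...</h1>",
        "Welcome John, you have 3 personal messages",
        "<p>big cached footer...</p></body></html>"
    };

    std::string body;                   // concatenated deflate stream
    uLong crc = crc32(0L, Z_NULL, 0);   // CRC32 of all the uncompressed text
    uLong total = 0;
    for (int i = 0; i < 3; i++) {
        body += deflate_chunk(parts[i]);
        uLong part_crc = crc32(crc32(0L, Z_NULL, 0),
                               (const Bytef *)parts[i].data(), parts[i].size());
        // Merge the (cached) per-chunk CRC without re-reading the plain text
        crc = crc32_combine(crc, part_crc, parts[i].size());
        total += parts[i].size();
    }
    body.append("\x03\x00", 2); // final empty deflate block (BFINAL = 1)

    // Minimal gzip framing: 10-byte header + deflate data + CRC32 + ISIZE
    static unsigned char const head[10] =
        {0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 0xff};
    std::string gz((char const *)head, 10);
    gz += body;
    for (int i = 0; i < 4; i++) gz += (char)((crc   >> (8 * i)) & 0xff);
    for (int i = 0; i < 4; i++) gz += (char)((total >> (8 * i)) & 0xff);

    std::fwrite(gz.data(), 1, gz.size(), stdout); // pipe through gunzip to check
    return 0;
}
```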
You say I'm crazy? Maybe, but if all of this is already prepared for you, and you only need to mark the cached and non-cached parts while the framework does the rest... would you use it?
I think you would.
to be continued...
Comments
Very interesting.
The ability to automatically drop all the pages affected by a single modification is a killer feature for me (I currently use memcached). But how will you make this a reality?
What do you mean? The implementation? I described it: I just need to use a tree data structure instead of a hash table.
The implementation of a cache system like memcached is really very simple. It does a very simple job: accept a request, fetch the data from memory, and return it.
Today the simple cache (built on STL) implements all the required features.
I still need to implement support for process-shared memory and add a simple TCP/IP server (which is not a complex thing to do either).
The long story is how to connect the new cache system to other programs. The major advantage of memcached is its wide support across many languages, which simply call the required libraries.
After re-reading your message:
There is no "magic": you have to describe what you need to drop. I think I'll add support for "tags/multiple keys" as well -- something I still need to think about.
Anyway, the idea is that you can drop "ranges" of pages... but you have to define those ranges on your own. There is no magic -- nothing replaces the brain of the developer ;-)
Thank you for the refinement.
I thought you had planned to build a dependency tree for each cached resource.
Don't you think you will reinvent the wheel for each part of the project just to gain some speed? A layer on top of memcached could be useful, letting people use their existing infrastructure. It may be a bit slower, but it would help keep the project simple and easy to improve.
I am thinking of a cache with multiple keys (not yet described in this article).
So for each page you can define several keys (a primary one and possibly multiple secondary ones), and you will be able to drop all cached pages matching certain keys. I still need to work a bit on it and test several cases.
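Roughly, I picture it like this (just a sketch of the idea, not a final design): alongside the main key/value map, keep a reverse index from each secondary key, or "tag", back to the primary keys carrying it:

```cpp
#include <map>
#include <set>
#include <string>

// Illustrative only: the main key/value map plus a reverse index that
// maps each secondary key ("tag") back to the primary keys carrying it.
std::map<std::string, std::string> pages;          // primary key -> page
std::multimap<std::string, std::string> tag_index; // tag -> primary key

void store(std::string const &key, std::string const &page,
           std::set<std::string> const &tags)
{
    pages[key] = page;
    for (auto const &t : tags)
        tag_index.insert({t, key});
}

// drop_tag("category:12") removes every page stored with that tag.
// (A complete implementation would also clean the dropped pages'
// other tags out of the index.)
void drop_tag(std::string const &tag)
{
    auto range = tag_index.equal_range(tag);
    for (auto p = range.first; p != range.second; ++p)
        pages.erase(p->second);
    tag_index.erase(range.first, range.second);
}
```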
About a layer over memcached -- the problem is that you need a new API for functionality that memcached is missing, so in any case you have to build something new... And there is no reason the new thing would look similar to the memcached API, or that switching from one to the other would be easy.