Home  /  RSS  /  RSS Comments  /  Enter

SOCI or DBI

Saturday, March 8, 2008, by artyom ; Posted in: Storage, Benchmarks, Berkeley DB; 8 comments

One of the problematic issues in writing cross-SQL code is an API that differs from one SQL to another.

There are two open source libraries that provide unified API

  1. soci -- C++ library.
  2. dbi -- C library.

At the first point I had chosen soci as native solution of C++ programmer. After running some benchmarks on the new version of this blog I had found 20% performance reduction for MySQL database. But I also remembered that there should be negligible difference between MySQL and Berkeley DB. This was mostly due to incorrect design of my BDB database layout I had done.

That had seem to be strange and I stared benchmarking the system more and more.

First of all I had tested the SQL drivers: native driver and soci wrapper. I had found that soci was significantly slower. I could think that it was due to wrapping that can never be as fast as native version.

So I tested other library dbi and I had written a little C++ wrapper I had called "dbixx". This wrapper mostly provides syntactic sugar, integration with C++ exceptions and allows auto escaping of input parameters -- all the job that soci had done.

I created a simple table:

CREATE TABLE numbers
(
   id integer primary key not null,
   number varchar(10) not null
 );

Filled it with 10,000 rows of type (N,'N') And run 10,000 queries:

SELECT id,number
FROM   numbers
WHERE  id>?
ORDER BY id DESC
LIMIT 10

With "?" changing from 1 to 10,000.

I measured overall run time of 10,000 queries using 4 drivers: native, soci, dbi and dbixx.

There are results:

Driver   MySQL PgSQL  Sqlite3  MySQL PgSQL Sqlite3
--------------------------------------------------
native   0.62  2.04   0.76
--------------------------------------------------
soci     1.07  3.30   1.47     -42%  -38%  -48%
dbi      0.68  2.13   0.95     - 9%  - 4%  -20%
dbixx    0.73  2.22   0.97     -14%  - 8%  -22%

Any wrapper had introduced a performance reduction, however soci had reduced the performance by about 40%, when dbi had given much better result. C++ wrapper of dbi had increased a runtime a little by additional few percents however still had given much better results then soci.

I must admit that dbi hadn't given good results for sqlite3. But in overall it behaves better then soci.

When I had written dbixx wrapped I had taken may good ideas form soci, so it was quite simple to covert the code of this blog to dbixx from soci and I did it in about half an hour.

And then run new benchmarks (similar to these: one, two):

Data Base       gzip no gzip   gzip no gzip
-------------------------------------------
Berkeley DB     565  830                  

MySQL    (dbi)  580  800       + 3%   - 4%
MySQL    (soci) 475  645       -16%   -22%

Sqlite3  (dbi)  330  410       -42%   -51%
Sqlite3  (soci) 410  515       -27%   -38%

PgSQL    (dbi)  350  430       -38%   -48%
PgSQL    (soci) 305  360       -46%   -57%
-------------------------------------------

I still do not understand why, but sqlite3 does not behave well with dbixx driver, it had given lower results then soci one, the situation that contradicts driver benchmarks I did before.

However it is clear, soci is not as fast as I expected. It is probably well suited for many purposes, but for high performance web framework it is not good enough (when there are some other alternatives).

Comments

ik_5, at 3/8/08 1:36 PM

Object oriented is usually slower then regular procedural approach. For example exception handling takes some of the speed, so does references of the object itself.

So I think that it is problematic to compare two different approaches on the same level. I also believe that if you will develop a custom made engine that will suite only for your need, and you will write it in a very optimized way, it will be faster, so you must judge IMHO each engine on it's own criteria.

artyom, at 3/8/08 2:14 PM

Object oriented is usually slower then regular procedural approach

This is not correct in 99% of cases. Actually in order to do fair comparison I had written dbixx. More then that, dbi library is already object oriented (even it is not written in C++).

For example exception handling takes some of the speed

Not correct, exception are never used in normal flow. You need that only to manage true errors on different non standard situations.

So both soci and dbi(+xx) do the same job. So I do expect from them work fast in same way.

Shai, at 3/9/08 3:12 PM

For example exception handling takes some of the speed

Not correct, exception are never used in normal flow. You need that only to manage true errors on different non standard situations.

No, ik_5 is correct. Even when exceptions aren't actually thrown, code compiled with support for exceptions has to "prepare" itself for them -- every block that contains code which might trigger an exception essentially becomes a try-block; and because this is a compile-time decision, almost every non-inlined function call becomes a potential exception-thrower.

artyom, at 3/9/08 5:26 PM

code compiled with support for exceptions has to "prepare" itself for them — every block that contains code which might trigger an exception essentially becomes a try-block

Every try/catch blog does some code, but you should not include try catch in every function, but in top one.

If you want to check yourself take this code:

extern "C" void somefunc(void);

struct SOME_CLASS {
  SOME_CLASS();
  ~SOME_CLASS();
};

SOME_CLASS::SOME_CLASS() { somefunc(); };
SOME_CLASS::~SOME_CLASS() { somefunc(); };


void throw_me()
{
    int n=10;
    throw n;
}

int run_throw(int x)
{
    SOME_CLASS a;
    throw_me();
    return x*2;
}

void catch_me()
{
    try{
        run_throw(10);
    }
    catch(int n){
        //
    }
}

Produce an assembly with

gcc -O2 -S test.cpp

You will see that in "run_throw" there is no peparations preparations for "exceptions" for normal flow... So it would be even more effective then "C" style exception handling with checking return value for each call.

So... The thing you tell just not correct

Chris, at 4/14/08 5:30 PM

I am looking at using SOCI for a website back-end and I wanted to know if you tried ODBC vs. MySQL on the same database. Looking through the code it look like SOCI-MySQL does string processing of result sets and inputs whereas SOCI-ODBC binds the data directly. I would expect binding the data would be much faster.

artyom, at 4/14/08 7:23 PM

As far as I remember soci does not use binding, however in any case any ODBC solutions are usually slower because they use generic protocol rather than native driver. The major bottleneck of such applications is IPC rather then "copying memory".

IMHO, soci mysql driver has significant problems:

  1. It lacks support of RowID for mysql backend. See here
  2. You can't do processing row by row unless you use prepared statements (even soci mysql driver do not use them)

There are the reasons I prefered libdbi. I'd suggest to try it -- it seems to be much more mature driver.

You can also try my dbixx C++ wrapper that has quite complete support of libdbi features and has quite similar syntactic sugar to soci(BSD licensed).

Another option, if you use MySQL to take any other C++ wrappers for it, there are lots.

Sergey, at 11/14/11 10:50 AM

Artyom what you think about nosql? www.redis.io

artyom, at 11/14/11 4:35 PM

Sergey,

There are plenty of Key-Value stores, some fast some not, some transactional some not, some durable.

Bottle line. It depends on your tasks and needs.

Add Comment:

 
 the email would not displayed
 

You can write your messages using Markdown syntax.

You must enable JavaScript in order to post comments.

Pages

Categories