Monday, September 23, 2013

Scintilla, Sqlite and Boost - a simple syntax highlighting sql editor

Side projects. Tinkering with idea and thoughts. I have for a number of years packaged Sql scripts and resources into resource only dlls to install and update client systems. It has been a mixed bag of results. 'Install always' has been a decent workflow in a lot of scenarios. Gone are the days of burning a day on an issue that might have been thought was a binary problem to find that a script was out of date. I'm more than a little pleased with how the modules are broken up and really if I have one improvement for the system it is in how the resources are distributed. I'm not a huge fan of recompiling to get the script to the client.

I build what some may consider over engineered solutions. When asked to build the third custom import/export tool for a county and knowing a fourth and fifth would be commissioned with 'an aggressive time plan' I started a framework. I refactored the common bits and got to a place where extra projects were only a pair of dll away. Instead of boilerplate logging and application parameters I lived in a happy little world of templated classes and a small list of functions that needed to be overridden. More time was made for getting the high level planning done in time. There was peace in the kingdom.

Then maintenance cycles and occasionally 3 projects would need updated simultaneously. We brought on an additional developer, then another. An older, more senior developer took over the maintenance of one of the projects and my happy kingdom had started to become a more haphazard array of virtual servers where requirements needed to match but sometimes didn't. Boost would get updated on some machines but not others. A service pack for Visual Studio would get installed on one machine but then balk halfway thru another (that machine still runs what I call SP1 and a half). A new developer would change a mapping and chaos would ensue for a half day but the systems are more or less stable. Those are all sorted out right now, as far as I know, so I have a little space to tinker.

So back to packaging resources. In about a half dozen systems I have in production the only updates these days are tuning Sql scripts. I have been working with sqlite and scintilla for a few of my 'spare hours' and nearly have them working well enough to start loading the resources into. By the end of the week I should have them stable enough for a more formal code review. The test harness gui is just a simple CDialog project with a somewhat explorer style layout. Right side to browse available files packed into blob fields in the sqlite database and a simple sql editor on the left with Scintilla providing a rudimentary syntax highlighting. Ideally once in place the other developers and I won't have to recompile satellite dll for sql-only updates. Instead we'll just add the updated files into the .sqlite3 file and the update will be to just replace the old one.

Couple points of interest in getting a mini sql editor working. Have found boost signals2 to be an enormous lift. Not having to code the parent references into child classes, no longer having to work out and maintain lists of registered messages for communication, it's simple. Then, having moved most of the system system over to using mapped files from the boost::iostreams namespace has taken all but solved the large buffer issues. No more ballooning memory or wait to read in a 500+ Mb Xml file. It tickles me. I'm simple like that.

Getting Scintilla to play well with a CDialog was awkward. There is a better than even chance that the most awkward bit of getting that to work is sitting here typing right now. Turned out that PJ Naughter's wrapper ended up needing just the tiniest bit of editing to get the class to register for the CDialog message map. And getting the lexer to compile into a static library,.. that was worth the couple hours it took to get right. I just don
t like having a list of little dll that are needed with it can just be a part of the application

And then Sqlite. Coming from MsSql their syntax has taken a little getting used to, but the resources in blob fields looks stable.

All in all for the number of hours I have into the project I have no real complaints. I should have an alpha online either from here or over on codeplex this week.

Here is the current UI:



Sunday, September 15, 2013

sqlite bulk insert performance via native c++

getting the most out of sqlite on the desktop.

I have been working on a couple desktop applications with Sqlite and insert performance out of the box wasn't quite what I needed. For a simple table being fed from a flat file a little over 200Mb the inserts per second were around 38,000 per second. After tuning the project on my development machine comes up to around 230,000 inserts per second.

After some hints ( google is your friend ) and then some testing I have what seems to be more of less the best configuration :
+ Tune Sqlite
+ Commit after around 50,000 inserts
+ strtok directly into statement.Bind
+ depending on input file stronging consider using boost::iostreams::mapped_file_source for access to the file

SQLite3::CDB is provided of course by PJ naughter

CREATE TABLE IF NOT EXISTS stop_times 
( trip_id        INTEGER 
, arrival_time   TEXT 
, departure_time TEXT 
, stop_id        INTEGER 
, stop_sequence  INTEGER 
, stop_headsign  TEXT 
, pickup_type    TEXT 
, drop_off_type  TEXT 
, traveled       TEXT )

void SetDbParameters( SQLite3::CDB* pDbSQLite )
 {
  if ( NULL == pDbSQLite  )
  {
   // go home
   return;
  }

  std::string sSynchronous  = "PRAGMA synchronous   = OFF";
  std::string sJournalMode  = "PRAGMA journal_mode  = MEMORY";
  std::string sCountChanges = "PRAGMA count_changes = OFF";
  std::string sTempStore    = "PRAGMA temp_store    = MEMORY";
  std::string sPageSize     = "PRAGMA page_size     = 32768";  
  std::string sLockingMode  = "PRAGMA locking_mode  = EXCLUSIVE";

  pDbSQLite->Exec( sSynchronous.c_str() );
  pDbSQLite->Exec( sJournalMode.c_str() );
  pDbSQLite->Exec( sCountChanges.c_str() );  
  pDbSQLite->Exec( sTempStore.c_str() );  
  pDbSQLite->Exec( sPageSize.c_str() );   
  pDbSQLite->Exec( sLockingMode.c_str() );

 }


void PrepareBind( char* buffer )
 {
  SQLite3::CStatement stmt;
  stmt.Prepare(m_db, "INSERT INTO stop_times VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)"); 
  stmt.Bind( 1, strtok (buffer, ",") );         
  stmt.Bind( 2, strtok (NULL, ",") );  
  stmt.Bind( 3, strtok (NULL, ",") );  
  stmt.Bind( 4, strtok (NULL, ",") );  
  stmt.Bind( 5, strtok (NULL, ",") );  
  stmt.Bind( 6, strtok (NULL, ",") );  
  stmt.Bind( 7, strtok (NULL, ",") );  
  stmt.Bind( 8, strtok (NULL, ",") );  
  stmt.Bind( 9, strtok (NULL, ",") );  
  stmt.Step();
  stmt.ClearBindings();
  stmt.Reset();
 }