Enhanced JSON Support for iOS/macOS, Part 9: CSV and SQLite

When examining the MPWPlistStreaming protocol used in my JSON parsing series, you might have noticed it’s not specifically designed for JSON. It wasn’t even initially created for parsing, but for generating.

This leads to the question: can we use it for other deserialization tasks? Absolutely!

CSV parsing

In my performance book, one example focuses on parsing Comma Separated Values rapidly. The goal was to optimize the processing of a 139Mb GTFS file on a phone. Using CoreData/SQLite took 20 minutes, which was reduced to less than a second by implementing custom in-memory data structures that were significantly faster for querying.

The initial CSV parser in the project took roughly 18 seconds. While not a significant portion of the original 20 minutes, it became a bottleneck when the rest of the process was optimized to a few hundred milliseconds. The updated parser, MPWDelimitedTable ( .h .m ), utilizes a block-based approach for efficiency.

Here’s the basic interface:


` ``` -(void)do:(void(^)(NSDictionary* theDict, int anIndex))block;

``` `


The block is executed for each row, receiving a dictionary containing the header row as keys and the row contents as values. Adapting this to the MPWPlistStreaming protocol is simple:


``` -(void)writeOnBuilder:(id )builder { [builder beginArray]; [self do:^(NSDictionary* theDict, int anIndex){ [builder beginDictionary]; for (NSString *key in self.headerKeys) { [builder writeObject:theDict[key] forKey:key]; } [builder endDictionary]; }]; [builder endArray]; } ```


This direct implementation leverages the existing API. While not perfectly optimized, it functions with current builders and avoids storing the entire CSV in memory. A future enhancement could involve inverting this API usage. By modifying the low-level API to use MPWPlistStreaming and layering a higher-level block- and dictionary-based API on top, it can be made compatible with other MPWPlistStreaming clients.

SQLite

SQL databases represent another tabular data format. SQLite, commonly accessed through CoreData or the lightweight fmdb, is frequently used on macOS/iOS.

While my initial thought was to create a MPWPlistStreaming adapter for fmdb, a closer examination revealed it handled more than necessary for MPWPlistStreaming. I opted for a minimalistic approach directly utilizing the SQLite C-API. The result is remarkably concise and straightforward:


` ``` @interface MPWStreamQLite()

@property (nonatomic, strong) NSString *databasePath;

@end

@implementation MPWStreamQLite { sqlite3 *db; }

-(instancetype)initWithPath:(NSString*)newpath { self=[super init]; self.databasePath = newpath; return self; }

-(int)exec:(NSString*)sql { sqlite3_stmt res; int rc = sqlite3_prepare_v2(db, [sql UTF8String], -1, &res, 0); @autoreleasepool { [self.builder beginArray]; int step; int numCols=sqlite3_column_count(res); NSString keys[numCols]; for (int i=0;i < numCols; i++) { keys[i]=@(sqlite3_column_name(res, i)); } while ( SQLITE_ROW == (step = sqlite3_step(res))) { @autoreleasepool { [self.builder beginDictionary]; for (int i=0; i < numCols; i++) { const char text=(const char)sqlite3_column_text(res, i); if (text) { [self.builder writeObject:@(text) forKey:keys[i]]; } } [self.builder endDictionary]; } } sqlite3_finalize(res); [self.builder endArray]; } return rc; }

-(int)open { return sqlite3_open([self.databasePath UTF8String], &db); }

-(void)close { if (db) { sqlite3_close(db); db=NULL; } }

``` `


This implementation primarily focuses on reading data, omitting updates, inserts, or deletes. Despite its simplicity, it’s efficient and convenient, although there’s room for improvement.

Often in development, achieving simplicity, convenience, and speed simultaneously is challenging. Simple and convenient code can be slow, convenient and fast code can be complex, and simple and fast code might be inconvenient.

Let’s illustrate its ease of use. This code snippet transforms a table into an array of dictionaries:


``` #import int main(int argc, char* argv[]) { MPWStreamQLite *db=[[MPWStreamQLite alloc] initWithPath:@"chinook.db"]; db.builder = [MPWPListBuilder new]; if( [db open] == 0 ) { [db exec:@"select * from artists;"]; NSLog(@"results: %@",[db.builder result]); [db close]; } else { NSLog(@"Can't open database: %s\n", [db error]); } return(0); } ```


This example demonstrates good performance but utilizes a generic data structure (array of dictionaries) for the results, which might not be optimal. (One indication of CoreData’s predecessor, EOF, not being particularly fast was that fetching raw dictionaries was promoted as an optimization over fetching objects.)

What if we want to work with objects directly? The solution is straightforward: replace MPWPListBuilder with MPWObjectBuilder, providing the desired class as a parameter. Naturally, the class definition is needed, but it’s assumed to already exist for object conversion tasks. This process could also be automated.


` ```

#import @interface Artist : NSObject { }

@property (assign) long ArtistId; @property (nonatomic,strong) NSString *Name;

@end

@implementation Artist

-(NSString*)description { return [NSString stringWithFormat:@"<%@:%p id: %ld name: %@>",[self class],self,self.ArtistId,self.Name]; }

@end

int main(int argc, char* argv[]) { MPWStreamQLite *db=[[MPWStreamQLite alloc] initWithPath:@“chinook.db”]; db.builder = [[MPWObjectBuilder alloc] initWithClass:[Artist class]]; if( [db open] == 0) { [db exec:@“select * from artists”]; NSLog(@“results: %@”,[db.builder result]); [db close]; } else { NSLog(@“Can’t open database: %s\n”, [db error]); } return(0); } ``` `


Importantly, this approach bypasses creating an intermediate plist representation, directly converting database results into objects. The MPWPlistStreaming protocol acts as a generic intermediary “format,” treating plist and objects as peers.

TOC

Somewhat Less Lethargic JSON Support for iOS/macOS, Part 1: The Status Quo
Somewhat Less Lethargic JSON Support for iOS/macOS, Part 2: Analysis
Somewhat Less Lethargic JSON Support for iOS/macOS, Part 3: Dematerialization
Equally Lethargic JSON Support for iOS/macOS, Part 4: Our Keys are Small but Legion
Less Lethargic JSON Support for iOS/macOS, Part 5: Cutting out the Middleman
Somewhat Faster JSON Support for iOS/macOS, Part 6: Cutting KVC out of the Loop
Faster JSON Support for iOS/macOS, Part 7: Polishing the Parser
Faster JSON Support for iOS/macOS, Part 8: Dematerialize All the Things!
Beyond Faster JSON Support for iOS/macOS, Part 9: CSV and SQLite

Licensed under CC BY-NC-SA 4.0
Last updated on Nov 04, 2022 18:54 +0100