A Slightly More Energetic Approach to JSON Support for iOS/macOS, Part 1: The Current Situation

I recently watched Daniel Lemire’s talk about simdjson, a JSON parser that reaches an impressive 2.5 GB/s. It reminded me of a Twitter discussion in which I offered to develop a fast, Swift-compatible JSON parser modeled on MAX, my XML parser. Matching simdjson’s speed isn’t feasible, for reasons covered below, but I believe a much faster alternative to existing Swift parsers is achievable.

### Environmental assumptions and constraints

For this endeavor, we’ll focus on the Apple ecosystem, specifically macOS. Our goal is to parse data into usable domain objects (or structs) within our applications.

Our example uses a simple class containing two integers and a string, defined in both Swift:


```
@objc class TestClass: NSObject, Codable {
    let hi:Int
    let there:Int
    let comment:String
    …
}
```


and Objective-C:


```
@interface TestClass : NSObject

@property (nonatomic) long hi,there;
@property (nonatomic,strong) NSString *comment;

@end
```


We’ll use a 44MB JSON file containing a million instances of this class, populated with incrementing integers and the string “comment.” This file, consistent across different serialization methods, will be used to benchmark various parsers.
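A file of this kind can be produced with a few lines of Swift. The following is only a sketch: `makeTestJSON` is a hypothetical helper, and the key order, the lack of whitespace, and the starting value of 0 are assumptions rather than the layout of the actual test file, so the byte count for a million records will not necessarily come out at exactly 44 MB.

```swift
import Foundation

// Hypothetical helper: builds a JSON array of `count` records shaped like TestClass.
// Key order, compact formatting and counting from 0 are assumptions.
func makeTestJSON(count: Int) -> Data {
    var text = "["
    text.reserveCapacity(count * 50)
    for i in 0..<count {
        if i > 0 { text += "," }
        text += "{\"hi\":\(i),\"there\":\(i),\"comment\":\"comment\"}"
    }
    text += "]"
    return Data(text.utf8)
}

// Small sample; pass 1_000_000 to get a benchmark-sized payload.
let sample = makeTestJSON(count: 3)
print(String(decoding: sample, as: UTF8.self))
```

Building the text by hand rather than through an encoder keeps the generator independent of the parsers being measured.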

First, we need to establish a performance baseline by measuring the time it takes to create these objects directly in code, without parsing. This gives us an upper limit on achievable parsing performance:


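```
#define COUNT 1000000
-(void)createObjects
{
    NSMutableArray *objResult=[NSMutableArray arrayWithCapacity:COUNT+20];
    // Loop body reconstructed from the description above:
    // incrementing integers and the string "comment".
    for ( int i=0;i<COUNT;i++ ) {
        TestClass *cur=[TestClass new];
        cur.hi=i;
        cur.there=i;
        cur.comment=@"comment";
        [objResult addObject:cur];
    }
}
```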

* * *

On my system, this completes in 0.141 seconds, translating to a rate of 312 MB/s. Interestingly, this is considerably slower than the JSON parsing speed of Lemire's parser, highlighting that object creation overhead can be significant.
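The MB/s figures in this post are all the size of the 44 MB test file divided by the elapsed time, even for runs like this one that never read the file. A minimal sketch of that arithmetic, with `throughputMBps` and `measureSeconds` as hypothetical helper names:

```swift
import Foundation

// Hypothetical helper: converts a payload size and an elapsed time into MB/s.
func throughputMBps(megabytes: Double, seconds: Double) -> Double {
    return megabytes / seconds
}

// Hypothetical helper: times a closure and returns elapsed wall-clock seconds.
func measureSeconds(_ work: () -> Void) -> Double {
    let start = Date()
    work()
    return Date().timeIntervalSince(start)
}

// The object-creation baseline from the text: 44 MB in 0.141 s.
let baseline = throughputMBps(megabytes: 44, seconds: 0.141)
print(Int(baseline.rounded()))   // 312
```

Normalizing every measurement against the same file size is what makes the numbers for different approaches directly comparable.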

It's counterintuitive that object creation, not the low-level parsing itself, is often the bottleneck. Character-level processing sits in the innermost loop, but the cost differences between the levels are so large that the outer layers, which allocate and populate objects, can end up dominating the total.

### NSJSONSerialization

Apple's primary JSON handling mechanism is [`NSJSONSerialization`](https://developer.apple.com/documentation/foundation/nsjsonserialization), which does for JSON what `NSPropertyListSerialization` does for property lists. Its performance is decent: it converts our JSON file into an `NSArray` of `NSDictionary` instances in 0.421 seconds (105 MB/s) when called from Objective-C, and in 0.562 seconds (78 MB/s) from Swift.

However, this only gets us a property list representation, not the desired domain objects.
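From Swift, the same API is spelled `JSONSerialization`, and the untyped nature of the result is visible in the signature: `jsonObject(with:options:)` returns `Any`. A small sketch, where `parsePropertyList` is a hypothetical wrapper and the one-record input is made up for illustration:

```swift
import Foundation

let json = Data("[{\"hi\":1,\"there\":2,\"comment\":\"comment\"}]".utf8)

// JSONSerialization returns `Any`; the caller has to cast to the expected shape.
// Hypothetical wrapper: returns an empty array if parsing or casting fails.
func parsePropertyList(_ data: Data) -> [[String: Any]] {
    let parsed = try? JSONSerialization.jsonObject(with: data, options: [])
    return parsed as? [[String: Any]] ?? []
}

let records = parsePropertyList(json)
print(records.count)
print(records[0]["comment"] as? String ?? "missing")
```

Every value still has to be looked up by key and cast before it can be used as part of a domain object.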

![](https://dl.dropbox.com/s/tx10szymoh5kc97/json2dict2objects.png?dl=0)

As detailed in my book (more on that later!), dictionaries come with inherent performance costs in creation, memory, and access.  Creating dictionaries equivalent to our test objects takes 0.321 seconds (137 MB/s), significantly slower than creating the objects directly.

* * *

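```
-(void)createDicts
{
    NSMutableArray *objResult=[NSMutableArray arrayWithCapacity:COUNT+20];
    // Loop body reconstructed from the surrounding description:
    // one mutable dictionary per record, with the same three keys.
    for ( int i=0;i<COUNT;i++ ) {
        NSMutableDictionary *cur=[NSMutableDictionary dictionary];
        cur[@"hi"]=@(i);
        cur[@"there"]=@(i);
        cur[@"comment"]=@"comment";
        [objResult addObject:cur];
    }
}
```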


Using dictionary literals or immutable copies offers minor improvements, but the process remains expensive.

Bridging the gap from dictionaries to objects typically involves manually fetching values and invoking setters. For this benchmark, we’ll employ Key Value Coding (KVC) for simplicity:


```
-(void)decodeNSJSONAndKVC:(NSData*)json
{
    NSArray *keys=@[ @"hi", @"there", @"comment"];
    NSArray *plistResult=[NSJSONSerialization JSONObjectWithData:json options:0 error:nil];
    NSMutableArray *objResult=[NSMutableArray arrayWithCapacity:plistResult.count+20];
    for ( NSDictionary *d in plistResult) {
        TestClass *cur=[TestClass new];
        for (NSString *key in keys) {
            [cur setValue:d[key] forKey:key];
        }
        [objResult addObject:cur];
    }
    NSLog(@"NSJSON+KVC %@ with %ld objects",objResult[0],[objResult count]);
}
```


Keep in mind that KVC is notably slower than direct method calls, significantly impacting the overall performance. The combined parsing and object creation using this method takes 1.142 seconds (38 MB/s).
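For comparison, the manual alternative to KVC mentioned above (fetching each value by key and assigning it by hand) might look like this in Swift. It is a sketch only: `Record` is a hypothetical stand-in for `TestClass`, `makeRecord(from:)` is an illustrative name, and no timing is claimed for it.

```swift
import Foundation

// Hypothetical stand-in for TestClass, kept as a plain struct for brevity.
struct Record: Equatable {
    let hi: Int
    let there: Int
    let comment: String
}

// Fetch each value by key and cast it by hand instead of using key-value coding.
func makeRecord(from dict: [String: Any]) -> Record? {
    guard let hi = dict["hi"] as? Int,
          let there = dict["there"] as? Int,
          let comment = dict["comment"] as? String else {
        return nil
    }
    return Record(hi: hi, there: there, comment: comment)
}

let plist: [[String: Any]] = [
    ["hi": 1, "there": 2, "comment": "comment"],
    ["hi": "not a number"]          // malformed entry is skipped
]
let objects = plist.compactMap(makeRecord(from:))
print(objects.count)   // 1
```

The trade-off is the usual one: the mapping code has to be written and maintained per type, whereas the KVC loop works for any list of keys.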

### Swift JSON Coding

Swift’s initial JSON support relied on a wrapped NSJSONSerialization, inheriting its performance characteristics. Numerous third-party “parsers” emerged, but most (except for Big Nerd Ranch’s Freddy) merely transformed the output of NSJSONSerialization, resulting in significant overhead.

Swift’s Codable protocol promised a more efficient solution.


```
func readJSONCoder(data:Data) -> [TestClass] {
    NSLog("Swift Decoding")
    let coder=JSONDecoder( )
    let array=try! coder.decode([TestClass].self, from: data)
    return array
}
```


While Codable excels in convenience, its performance (10 MB/s) lags behind NSJSONSerialization and the KVC approach. However, it remains significantly faster than previous third-party solutions.

### Third Party JSON Parsers

Examining third-party parsers like JASON, STJSON, and ZippyJSON reveals varying performance levels. STJSON (10 MB/s) and JASON (59 MB/s) both fall short of NSJSONSerialization.

ZippyJSON stands out by utilizing simdjson and being Codable compatible. While I couldn’t compile it for testing, its claimed 3x speed increase over JSONDecoder would place it in a competitive position.

ZippyJSON’s documentation attributes its speed to bypassing the intermediate NSDictionary creation used by Apple’s implementation. While true, this isn’t the primary bottleneck. Our benchmarks show that Codable itself contributes significantly to the performance overhead.

To confirm this, I tested Flight School’s MessagePack, a Codable implementation for the MessagePack format. Despite not using NSJSONSerialization, it took 18 seconds to decode the equivalent data (2.4 MB/s when normalized to the size of the JSON file), further illustrating Codable’s performance limitations.

### MAX and MASON

Given simdjson’s demonstrated potential, the current state of JSON parsing in Swift appears bleak. The performance gap between simdjson’s 2.5 GB/s and JSONDecoder’s 10 MB/s is stark, highlighting the need for improvement.

This is where my experience with MAX comes in. MAX is a high-performance XML parser that is both very convenient to use and able to create object representations from XML data efficiently. (Details can be found in my book, mentioned earlier!)

This raises the question: can these techniques be applied to JSON parsing? I am cautiously optimistic. JSON’s lack of explicit structure adds complexity, but the core principles can be adapted.

The goal is to create a parser that approaches the 300 MB/s theoretical limit while maintaining a level of convenience comparable to Codable. Direct Codable support will be avoided due to its overheads, but integration possibilities, as suggested by ZippyJSON, will be explored.

This parser, currently named MPWMASONParser, is a work in progress. In its initial implementation, it parses JSON to dictionaries in 0.58 seconds (76 MB/s), slightly slower than NSJSONSerialization.

There’s a long way to go! Join me on this exploration to enhance JSON parsing performance in Swift.

### TOC

Somewhat Less Lethargic JSON Support for iOS/macOS, Part 1: The Status Quo
Somewhat Less Lethargic JSON Support for iOS/macOS, Part 2: Analysis
Somewhat Less Lethargic JSON Support for iOS/macOS, Part 3: Dematerialization
Equally Lethargic JSON Support for iOS/macOS, Part 4: Our Keys are Small but Legion
Less Lethargic JSON Support for iOS/macOS, Part 5: Cutting out the Middleman
Somewhat Faster JSON Support for iOS/macOS, Part 6: Cutting KVC out of the Loop
Faster JSON Support for iOS/macOS, Part 7: Polishing the Parser
Faster JSON Support for iOS/macOS, Part 8: Dematerialize All the Things!
Beyond Faster JSON Support for iOS/macOS, Part 9: CSV and SQLite

Licensed under CC BY-NC-SA 4.0
Last updated on Nov 26, 2022 10:41 +0100