Faster Prime Number Sieving Using Software Integrated Circuits

I’ve been revisiting Objective-Smalltalk, which led me to contemplate its underlying motivations. Those motivations probably aren’t immediately clear without understanding the project’s origins and the reasons behind them.

Objective-C, specifically the idea of Software-ICs (components interacting through dynamic messaging), played a significant role. This aligns with the Scripted Components pattern (pdf), where a single language, Objective-C, is used for both scripting and components.

Essentially, Objective-C is less a language than middleware with language features. Confusion arises when it is treated purely as a language, particularly when its “Objective” aspects are used for algorithmic programming, which they are not well suited for. That is what the “C” part, which constitutes about 90% of the language, is designed for.

To clarify, a Scripted Component system employs one language, often C or C++, for component implementation, and a more dynamic scripting language (e.g., Unix shell, Visual Basic) for connecting them. A prime example is using Tcl for computational steering in supercomputer applications. This highlights a key aspect: combining component language performance with scripting language flexibility and interactivity.

This brings us back to Objective-C. No one using Tcl for steering supercomputers would consider implementing computations in Tcl itself. That’s where C, C++, or potentially FORTRAN come in. Components are implemented in these languages, then combined, parameterized, and invoked from the scripting language.

However, with Objective-C, people often implement algorithmic parts using its scripting/middleware aspect, leading to code like this (from Vincent’s post):


```
NSArray *sieve(NSArray *sorted) {
    if ([sorted count] == 0) {
        return [NSArray new];
    }
    NSUInteger head = sorted[0].unsignedIntegerValue;
    NSArray *tail = [sorted subarrayWithRange:NSMakeRange(1, sorted.count - 1)];
    NSIndexSet *indexes = [tail indexesOfObjectsPassingTest:^BOOL(NSNumber *element, NSUInteger idx, BOOL *stop) {
        return element.unsignedIntegerValue % head > 0;
    }];
    NSArray *sieved = [tail objectsAtIndexes:indexes];
    return [@[@(head)] arrayByAddingObjectsFromArray:sieve(sieved)];
}

NSMutableArray *numbers = [NSMutableArray new];
for (NSUInteger i = 2; i <= 100; i++) {
    [numbers addObject:@(i)];
}
NSArray *primes = sieve(numbers);
NSLog(@"%@ -> %@", numbers, primes);
```


This code is nonsensical, akin to implementing computational fluid dynamics in Tcl and scripting it from C. The Swift equivalent looks cleaner:


```
func sieve(_ sorted: [Int]) -> [Int] {
    guard !sorted.isEmpty else { return [] }
    let (head, tail) = (sorted[0], sorted[1..<sorted.count])
    return [head] + sieve(tail.filter { $0 % head > 0 })
}

let numbers = Array(2...100)
let primes = sieve(numbers)
print("\(numbers) -> \(primes)")
```


The initial confusion stemmed from the fact that this code, while computing primes, doesn’t actually implement the Sieve of Eratosthenes!

The Sieve’s essence lies in computing primes without relying on expensive division (or multiplication beyond simple left shifts). It achieves this by initializing an array of candidate numbers (starting from 2, as all numbers are multiples of 1) and systematically eliminating non-primes (multiples of other numbers). For instance, eliminating multiples of 2 involves traversing the array with a step size of 2, and so on. This was highly effective on early microcomputers lacking hardware multiplication and division.

The algorithm shown above, however, merely checks each remaining candidate for divisibility by the current head using the modulo operator. That is trial division, not a sieve, and considerably less efficient.
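To make the distinction concrete, here is a minimal Swift sketch (the names are mine, not taken from any of the code above) contrasting the two core operations: the per-candidate divisibility test the code above performs, and the stride-based marking a real sieve performs, which needs no division at all.

```
// Trial division: what the code above does, one modulo operation per candidate.
let head = 2
let candidates = Array(3...20)
let survivors = candidates.filter { $0 % head > 0 }

// Sieve marking: what Eratosthenes' method does instead. Walk the array with a
// step of `head` and cross out every slot it lands on; no division needed.
let limit = 20
var isComposite = [Bool](repeating: false, count: limit + 1)
for j in stride(from: 2 * head, through: limit, by: head) {
    isComposite[j] = true
}
```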

Beyond the incorrect algorithm, the bizarre use of Objective-C Foundation objects for integer algorithms is striking.

This code appears strange because it is. Objective-C is designed for building and connecting Software-ICs. Algorithms should be implemented within these Software-ICs using C, then packaged and connected using the dynamic messaging provided by the “Objective” part.

Let’s demonstrate this approach. We can start with a C implementation of the (actual) Sieve of Eratosthenes, like this one from a DuckDuckGo search result here:


```
#include <stdio.h>
#define LIMIT 1500000 /*size of integers array*/
#define PRIMES 100000 /*size of primes array*/

int main(){
    int i,j,numbers[LIMIT];
    int primes[PRIMES];

    /*fill the array with natural numbers, starting at 2*/
    for (i=0;i<LIMIT;i++)
        numbers[i]=i+2;

    /*sieve: knock out the multiples of every remaining number by marking them -1*/
    for (i=0;i<LIMIT;i++)
        if (numbers[i]!=-1)
            for (j=2*numbers[i]-2;j<LIMIT;j+=numbers[i])
                numbers[j]=-1;

    /*collect the unmarked numbers (the primes) and print them*/
    for (i=0,j=0;i<LIMIT && j<PRIMES;i++)
        if (numbers[i]!=-1)
            primes[j++]=numbers[i];
    for (i=0;i<j;i++)
        printf("%d\n",primes[i]);
    return 0;
}
```

* * *

While longer than the previous examples, it correctly implements the Sieve of Eratosthenes. It works by filling an array of size N with the integers starting at 2, then marking the multiples of every remaining entry by setting the corresponding array elements to -1. Finally, all unmarked (non-negative) entries, which are the primes, are extracted into their own array and printed.

Since Objective-C is a superset of C, we could use this code directly. However, it lacks certain conveniences such as adjustable array sizes, reusability, and streamlined printing. Embracing the Software-IC concept, we recognize that an integer array is central here. Apple's Foundation doesn't provide one, but we can reuse the [one](https://github.com/mpw/MPWFoundation/blob/master/Collections.subproj/MPWIntArray.h) from [MPWFoundation](https://github.com/mpw/MPWFoundation).

As we'll be operating extensively on this array, extending [MPWIntArray](https://github.com/mpw/MPWFoundation/blob/master/Collections.subproj/MPWIntArray.h) with the core algorithm as a category seems appropriate.

Additionally, breaking down the Sieve algorithm into smaller, well-named functions and utilizing conveniences like `-select:` for filtering improves readability.  We can also incorporate functionality for printing the array contents.

* * *

```
@import MPWFoundation;

@implementation MPWIntArray(primes)

-(void)knockoutMultiplesOf:(int)n
{
    int *numbers=data;
    if (n > 0){
       for (int j=2*n-2,max=[self count];j<max;j+=n){
           numbers[j]=-1;
       }
    }
}

-(void)sievePrimes
{
    // knock out the multiples of every number still in the array
    int *numbers=data;
    for (int i=0,max=[self count];i<max;i++){
        [self knockoutMultiplesOf:numbers[i]];
    }
}

@end

int main(int argc, char *argv[])
{
    int n = argc > 1 ? atoi(argv[1]) : 100000;
    MPWIntArray *numbers=[[@(2) to:@(n-2)] asArray];
    [numbers sievePrimes];

    MPWIntArray *primes = [numbers select:^(int potentialPrime) {
        return (BOOL)(potentialPrime > 0);
    }];
    printf("primes: %s\n",[[primes description] UTF8String]);
    printf("number of primes: %d\n",(int)[primes count]);
}
```

* * *

Performance is a significant advantage of the Software-IC or Scripted Components style; otherwise, its use in supercomputer environments wouldn't be justifiable. I compared the Swift and Software-IC versions (ignoring the original Objective-C code) and observed a remarkable performance gap, making it challenging to find N values that yielded somewhat comparable results:

* * *

    

      

| **N** | **Swift Sieve (s)** | **Software-IC Sieve (s)** | **Ratio** |
|-------|---------------------|---------------------------|-----------|
| **100,000** | 0.501 | 0.006 | 83.5 |
| **200,000** | 1.78 | 0.007 | 254.3 |
| **300,000** | 3.78 | 0.008 | 472.5 |
| **400,000** | 6.43 | 0.009 | 714.4 |
| **500,000** | 9.74 | 0.010 | 974.0 |
![Sieve times](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimF-AEYr9DkgWDhame-y2fgfjQ9e05O9rGGtD3xf9fIWC4jU7mrO9d1J-rFHe-UWdGL1LQ9U6jGLKJsB2LtzQNUbfmO9DbJCfqIN5Vq1Td3H5Rj6fjpVd5ehU6RtCU4g1A_47mBK2VA8JT/s0/sieve-times.png "sieve-times.png")

* * *

I couldn't push the Swift version much beyond N=500,000 because it consumed around 8GB of memory (and increasing) at that point. Paging started on my relatively busy machine (with Xcode open), and I wanted to avoid further "benchmark-cleaning." The Software-IC version, on the other hand, was barely affected at these N values. Its execution time was dominated by startup overhead and random fluctuations. In fact, running it with N=1 took 6-8ms, similar to N=500,000.

So, while Swift might be a better choice for computations than misusing Objective-C's middleware, Objective-C excels when used correctly for building and interfacing with Software-ICs.  This doesn't imply we can't improve both aspects further: better algorithms and, more importantly, better middleware with clearer separation of concerns. Unsurprisingly, this is the general idea behind Objective-Smalltalk, but that's a topic for another time.

**UPDATE:**

My brief post sparked a Twitter discussion that proved more insightful.  While my focus was on the scripted components pattern and its relationship to Objective-C's Software-ICs, I may have overemphasized the performance aspect. It might have also come across as a "Swift vs. Objective-C" debate.  To be clear, the initial Objective-C code was likely the slowest, and I'm confident that a C-like Sieve implementation in Swift could achieve comparable performance.
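For reference, a C-like sieve in Swift is a straightforward transliteration of the imperative algorithm; a minimal sketch (mine, not the code benchmarked above) might look something like this:

```
func imperativeSieve(upTo n: Int) -> [Int] {
    guard n >= 2 else { return [] }
    // One flat Bool array and stride-based marking: no modulo, no per-element objects.
    var isComposite = [Bool](repeating: false, count: n + 1)
    var primes: [Int] = []
    for candidate in 2...n where !isComposite[candidate] {
        primes.append(candidate)
        // Cross out the multiples by stepping through the array.
        for multiple in stride(from: candidate * candidate, through: n, by: candidate) {
            isComposite[multiple] = true
        }
    }
    return primes
}

let n = CommandLine.arguments.dropFirst().first.flatMap { Int($0) } ?? 100_000
print("number of primes up to \(n): \(imperativeSieve(upTo: n).count)")
```

Whether such a version actually matches the Software-IC numbers above would of course have to be measured.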

My main point was to highlight the misunderstanding and misapplication of Objective-C, which I don't attribute to ignorance. It stems from the unresolved tension between scripted components and unified languages.

Helge succinctly captured the essence:

> ObjC is pretty awesome in how it manages to embed a “COM”. Swift doesn’t (currently) provide anything like it (and IMO it lost a chance not just using the ObjC runtime for that)
> 
> — Helge Heß (@helje5) [January 6, 2019](https://twitter.com/helje5/status/1081991902398988288?ref_src=twsrc%5Etfw)

  

Joe Groff offered another insightful perspective:  

> It’s kinda funny that in many ways pre-Foundation AppKit was better at this, since the only classes were for things that make sense as objects. As soon as you have data structure objects it becomes an attractive nuisance to do algorithms on them
> 
> — Joe Groff (@jckarter) [January 6, 2019](https://twitter.com/jckarter/status/1082019464663597056?ref_src=twsrc%5Etfw)

  

And a surprising revelation:  

> An early version of FoundationKit defined the geometry types as objects. There’s a reason that one didn’t ship 😬
> 
> — Jeff Nadeau (@jnadeau) [January 6, 2019](https://twitter.com/jnadeau/status/1082052868469739520?ref_src=twsrc%5Etfw)

  

Finally, apologies to Vincent for singling him out, even if slightly.
  
