## Collatz sequence generation performance profiling in Clojure (Project Euler Problem 14).

I investigate the performance of several *Collatz* sequence generation algorithms through profiling *Clojure* code, inspired by *Project Euler* Problem 14 (no spoilers as I won't show the solution number in this article, however, if you run the code you will get the solution as the return value). My approach focuses on the number of specific operations of interest rather than clock time (but clock figures are also discussed).

### Motivation

John Conway in 1972 proved that *Collatz*-type problems can be formally undecidable. Better yet, Paul ErdÅ‘s commented that *“mathematics is not yet ready for such problems.”* See the *Wolfram* *MathWorld* entry for the Collatz Problem for more historical context and extensive bibliography.

### Problem

The Collatz conjecture, proposed by Lothar Collatz, states that all *Hailstone* sequences as generated by the following piece-wise function eventually reach the number `1`

:

So for example, starting with `6`

, we obtain the following sequence:

`6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1`

The 14th problem on Project Euler challenges us to find the starting number, under one million, which results in the longest *Collatz* sequence.

### Solutions and Performance Analysis

Since the *Clojure* contrib package for profiling was removed, we will use the profiling facilities provided by the wonderful timbre library by Peter Taoussanis; hence all code examples are surrounded as such:

For clock time, we use versions of the code without any profiling macros and rely on the invaluable Criterium library by Hugo Duncan; hence all code examples are surrounded as such:

All benchmarks shown here were done on my machine, an iMac with 2.93 GHz Intel Core i7 and 1333 MHz DDR3 RAM, running x86_64 Mac OS X 10.9.3 with Oracle's JVM 1.8.0_05-b13 and HotSpot 64-Bit Server VM 25.5-b02 mixed mode.

#### A. Brute Force

The most staightforward strategy for solving this problem is to generate the *Collatz* sequence for each integer starting from `1`

up to `limit`

, and then find the longest one.

Of course, this strategy is pretty hopeless, which is not a surprise for a problem from *Project Euler*. The running time is in the order of *O(n * C(n))* as we calculate one complete sequence for each integer, and the length of each sequence, *C(n)*, is dependent on its first integer (this function “goes up and down”, hence the name *hailstone*). The space requirement is in *O(n)* as we only keep each integer and corresponding sequence length once.

Since the *Collatz conjecture* is an open problem, and we don't know whether the successor function is defined for every *n*, we cannot provide an asymptotic time complexity measure. The best we can do for now is to claim that the complexity of *C(m)* is proportional to *Θ(m)* for known sequences starting with *m* in the range `1`

to `limit`

.

From the profile names, we see that our algorithm has predictably generated one series less than the limit (since we started incrementing from `2`

, but has calculated over *131 million* *hailstone* successors! Clearly there is a lot of repetition. On my fairly modern i7 iMac this takes well over five minutes to complete and gobbles around *1 GB* of memory, causing over one thousand garbage collector sweeps.

Profiled Name | Instance Count |
---|---|

:user/collatz-seq | 999,998 |

:user/next-collatz | 131,434,272 |

#### B. Simple Memoization

The problem becomes more easily solvable once we notice that every *Collatz* sequence will eventually contain, as a suffix, a shorter *Collatz* sequence, and assuming the convergence hypothesis to be true, all such suffix sequences will converge to the integer `1`

. Therefore, we will employ a memoized algorithm which generates sequences starting from an integer `N`

but only up to an integer `N'`

for which a sequence has already been found, and proceed inductively (the base case is the sequence of length `1`

which contains `1`

itself); the length of the sequence corresponding to the integer `N`

is then the length of the sequence up to the integer `N'`

plus the length of the stored sequence for `N'`

.

We have now constrained the number of sequence generations to the order of *O(n)* because we will never generate a *hailstone* more than once, although given that *Clojure*'s lookup time in our memoization map is *O(log n)*, our algorithm is technically still in *O(n log n)*; however, the actual cost for *Clojure*'s lookup operations is so small (*O(log32 n)*) that this algorithm significantly outperforms the previous brute force one, as expected, and thus we may consider it near *O(n)*. The space requirements have grown, however, since we keep successors in memory, but only slightly as it is within *O(n * C(n))*.

Once again, the algorithm generated one series less than the limit, as before, but now it has calculated just over *two million* *hailstones*. On the same machine, this takes roughly one and half minutes, and *1 GB* of memory with only *27* garbage collector sweeps since the memoization does not lose references to its contents.

Profiled Name | Instance Count |
---|---|

:user/collatz | 999,998 |

:user/collatz-seq | 999,998 |

:user/next-collatz | 2,168,610 |

Benchmark Metric | Value |
---|---|

Execution time sample mean | 1.087031 min |

Execution time mean | 1.087147 min |

Execution time sample std-deviation | 881.593049 ms |

Execution time std-deviation | 887.078664 ms |

Execution time lower quantile | 1.072142 min ( 2.5%) |

Execution time upper quantile | 1.129129 min (97.5%) |

Overhead used | 3.385260 ns |

#### C. Sequences-Only Memoization

We can make a trade between calculating successor *hailstones* and memoizing all of them, by memoizing only entire sequences. Once a sequence reaches a number for which a sequence has previously been generated, its length is calculated by adding the memoized length to the current sequence length.

This time we have calculated over *five million* *hailstone* successors (again *O(n * C(n))*, but kept our space requirements to the order of *C(n)*. Unfortunately, we have now introduced *2 * (C(n) + n)* look ups in the memoization (*:user/existing-lookup-condition* and *:user/existing-lookup*), thanks to *Clojure*'s implementation of `split-with`

which uses `take-while`

and `drop-while`

resulting in the predicate being called more than once on many *hailstones*. Still, this algorithm takes roughly *4.1 seconds* to complete, and *0.5 GB* memory with *139* garbage collector sweeps.

Profiled Name | Instance Count |
---|---|

:user/collatz | 999,998 |

:user/next-collatz | 5,226,259 |

:user/existing-lookup-condition | 12,452,514 |

:user/existing-lookup | 999,998 |

:user/existing-assoc | 999,998 |

Benchmark Metric | Value |
---|---|

Execution time sample mean | 3.989335 sec |

Execution time mean | 3.989318 sec |

Execution time sample std-deviation | 21.328040 ms |

Execution time std-deviation | 22.139850 ms |

Execution time lower quantile | 3.950992 sec ( 2.5%) |

Execution time upper quantile | 4.030324 sec (97.5%) |

Overhead used | 3.147864 ns |

#### D. Sequences-Only Memoization Without Extra Lookups

We can make yet another trade and avoid the extra lookups by calculating a few more successors. In order to do this, we need to avoid `split-with`

from the previous algorithm and replace it with `take-while`

; unfortunately, `take-while`

does not return the last *hailstone* as it's the one satisfying the predicate `split`

, so we will have to call `next-collatz`

again for the last *hailstone* and thus duplicate a small amount of our work, specifically *n* additional successor calculations.

We got rid of all but *n* of the extra lookups, and got a small speed up: this algorithm runs in roughly *3.7 seconds*, just *400 milliseconds* faster than the previous one (and uses the same memory). Can we get rid of the duplication as well?

Profiled Name | Instance Count |
---|---|

:user/collatz | 999,998 |

:user/next-collatz | 6,226,257 |

:user/existing-lookup-condition | 6,226,257 |

:user/existing-lookup | 999,998 |

:user/existing-assoc | 999,998 |

Benchmark Metric | Value |
---|---|

Execution time sample mean | 3.630457 sec |

Execution time mean | 3.630670 sec |

Execution time sample std-deviation | 20.309249 ms |

Execution time std-deviation | 21.256671 ms |

Execution time lower quantile | 3.606157 sec ( 2.5%) |

Execution time upper quantile | 3.677730 sec (97.5%) |

Overhead used | 3.174435 ns |

#### E. Sequences-Only Memoization Without Extra Lookups (Another Attempt)

We can refactor the previous solution to avoid the `n`

lookups (named *:user/existing-lookups*) and the duplicated calls to `next-collatz`

. In order to do so, we note that *Clojure*'s `take-while`

will not yield the value for which the predicate is `false`

, which we need since it's the first memoized *hailstone*. We also don't want to use `split-with`

since the lookups in the predicate will be repeated many times, defeating our purpose. Furthermore we can't return more than one value from our successor `collatz-next`

.

We can, however, modify this successor to do two things: first, it will now be responsible for making the lookups and returning the memoized length needed by `collatz`

, appended to the sequence after the last new *hailstone*; second, it will employ a sentinel value in the form of a different data type for the last, special sequence entry, just before returning `nil`

and terminating `collatz`

's iteration.

We got rid of the extra lookups and the duplication, but this algorithm runs slower! I takes roughly *five and a half seconds*. Evidently, *Clojure*'s overhead is too large for these values of *n*.

Profiled Name | Instance Count |
---|---|

:user/collatz | 999,998 |

:user/next-collatz | 5,226,259 |

:user/existing-lookup-condition | 6,226,257 |

:user/existing-assoc | 999,998 |

Benchmark Metric | Value |
---|---|

Execution time sample mean | 4.821011 sec |

Execution time mean | 4.820981 sec |

Execution time sample std-deviation | 27.390670 ms |

Execution time std-deviation | 27.611627 ms |

Execution time lower quantile | 4.761616 sec ( 2.5%) |

Execution time upper quantile | 4.878438 sec (97.5%) |

Overhead used | 3.170384 ns |

#### F. Iterative Sequences-Only Memoization

Recall that the length of a sequence is actually what we're looking for: there is no need for our algorithm to calculate the sequence for *any* value of `n`

which appears as a *hailstone* in a memoized sequence, since the distance of that *hailstone* from the end of the sequence can be trivially measured. Therefore, the following algorithm 'processes' each sequence and keeps a mapping between each *hailstone* and its distance from the end. The larger the new sequence, the more lengths for values of `n`

we get (in near constant time as we `recur`

on the sequence).

We only needed to process roughly four hundred thousand sequences, and got away with skipping over half a million sequences as they involved a value of `n`

for which we had already calculated its length, mostly from the distance of its value from the end of a memoized sequence. We're back down to roughly *four seconds*, but only *0.3 GB* of memory required this time.

Profiled Name | Instance Count |
---|---|

:user/collatz | 999,998 |

:user/next-collatz | 2,168,610 |

:user/existing-lookup-last | 432,363 |

:user/existing-lookup-condition | 5,769,581 |

:user/existing-lookup-sequence | 2,168,610 |

:user/existing-assoc | 2,168,610 |

:user/existing-skip | 1,355,037 |

:user/existing-seen | 567,635 |

:user/process-collatz-sequence | 432,363 |

Benchmark Metric | Value |
---|---|

Execution time sample mean | 4.002129 sec |

Execution time mean | 4.001909 sec |

Execution time sample std-deviation | 98.338164 ms |

Execution time std-deviation | 99.362992 ms |

Execution time lower quantile | 3.897592 sec ( 2.5%) |

Execution time upper quantile | 4.229077 sec (97.5%) |

Overhead used | 3.047829 ns |

#### G. Iterative Sequences-Only Memoization with Heuristic Constraint

Do we need to memoize the lengths of all *hailstones* we encounter? Perhaps we can experiment with a very simple heuristic: only memoize lengths for *hailstones* less than `limit`

; if we are lucky, for the given values of `n`

we should more or less stay within the range of `1`

to `limit`

most of the time, and only miss out on a few sequences which contain *hailstones* beyond our limit (thus forcing us to calculate segments of sequences more than once instead of finding them in our memoization).

As expected, most of the profiled counts are the same with those of the previous algorithm; after all, we are computing the exact same sequences and memoizing the same length values for each `n`

. But crucially, we reduced the memoized values by a half (*:user/existing-assoc*), at the tiny cost of calculating roughly *11.6%* more successors (*:user/next-collatz*) and corresponding lookups (*:user/existing-lookup-condition*). This algorithm now runs in just over *three seconds* and uses only *0.3 GB* of memory.

Profiled Name | Instance Count |
---|---|

:user/collatz | 999,998 |

:user/next-collatz | 2,355,035 |

:user/existing-lookup-last | 432,363 |

:user/existing-lookup-condition | 6,142,431 |

:user/existing-lookup-sequence | 999,998 |

:user/existing-assoc | 999,998 |

:user/existing-skip | 1,355,037 |

:user/existing-seen | 567,635 |

:user/process-collatz-sequence | 432,363 |

Benchmark Metric | Value |
---|---|

Execution time sample mean | 3.187209 sec |

Execution time mean | 3.187221 sec |

Execution time sample std-deviation | 16.322258 ms |

Execution time std-deviation | 16.469264 ms |

Execution time lower quantile | 3.161468 sec ( 2.5%) |

Execution time upper quantile | 3.218255 sec (97.5%) |

Overhead used | 3.064180 ns |

### Further Exploration

A discussion of various other attempts at this problem in *Clojure*, including a native *Java* integer array, can be found at the Clojure Companion Cube by Ivar Thorson. Although none of those solutions perform better than the ones above, Ivar shares several important lessons learned about performance in *Clojure* in particular and the *JVM* in general.

As with all problems, this one is being discussed in the *Project Euler* forum.

### Conclusion

As Alan Perlis once suggested, *“A LISP programmer knows the value of everything, but the cost of nothing.”*

Or, as Randall Munroe from xkcd 710 puts it, slightly more insightfully:

### Acknowledgments

Gratitude to Dr Anastasia Niarchou for her feedback.

## Comments

comments powered by Disqus