A Lazy Definition of Primes
I explore a succinct, corecursive definition for the sequence of prime numbers in Clojure — and how it works.
In Clojure, it’s possible to define the sequence of prime numbers in a remarkably succinct way — just one line of code. This post explores how that’s possible using lazy evaluation and corecursion, even though the definition initially appears self-referential and paradoxical. We'll walk through how Clojure's lazy sequences are constructed, how self-referential bindings like this are possible, and why the implementation doesn't fall into an infinite loop.
The "one-line" definition
In a Stack Overflow discussion, a user shared the following compact definition of primes in Clojure:
(def primes (remove
(fn [x]
(some #(zero? (mod x %)) primes))
(iterate inc 2)))
This defines primes
as a lazy sequence that removes any number divisible by a prior element of primes
.
On its face, it looks circular — how can primes
refer to itself before it's fully defined?
But it works:
(take 10 primes)
;; => (2 3 5 7 11 13 17 19 23 29)
(nth primes (dec 10000))
;; => 104729
So what makes this definition valid?
The apparent paradox
The predicate used inside remove
checks whether a number is divisible by any prior prime.
But primes
is the very sequence we are defining.
Doesn't that create an infinite loop?
In eager languages, this would cause an immediate stack overflow.
But Clojure’s lazy evaluation defers computation until absolutely necessary.
That, combined with how the LazySeq
class works internally, allows this recursive reference to behave safely and correctly.
How lazy sequences work in Clojure
Many core sequence functions in Clojure — including map
, filter
, remove
, take
, iterate
, and others — return lazy sequences.
Internally, these functions use the lazy-seq
macro to defer computation.
This means that even though primes
appears to be defined with immediate recursion, the recursive reference to primes
is embedded inside a remove
call, which itself is lazy.
So the self-reference does not trigger immediate evaluation — it's safely delayed until needed.
Let's look at how Clojure defines lazy sequences.
The core building block is the lazy-seq
macro:
(defmacro lazy-seq
"Takes a body of expressions that returns an ISeq or nil..."
[& body]
(list 'new 'clojure.lang.LazySeq (list* '^{:once true} fn* [] body)))
This macro wraps a body of code in a LazySeq
object that defers evaluation until seq
is called.
Here’s a basic example:
(defn integers
([] (integers 1))
([n] (lazy-seq (cons n (integers (inc n))))))
Calling (integers)
returns a LazySeq
that holds an unevaluated closure.
This closure is invoked only once, upon demand.
Internally, a LazySeq
has three fields:
private IFn fn; // The unevaluated generator.
private Object sv; // Cached result of fn.
private ISeq s; // Final cached sequence.
When seq
is called on a LazySeq
(e.g., via take
), it executes the function once (if it hasn't already), caches the result, and returns it.
This happes in sval
:
final synchronized Object sval() {
if (fn != null) {
sv = fn.invoke();
fn = null;
}
if (sv != null)
return sv;
return s;
}
Why the prime definition works
So why doesn’t this seemingly circular definition blow up?
The key is that remove
is lazy, and so is everything it operates on.
The expression
(def primes (remove
(fn [x] (some #(zero? (mod x %)) primes))
(iterate inc 2)))
defines primes
as a lazy sequence that filters the infinite stream of integers starting at 2.
For each number x
, it checks whether x
is divisible by any earlier element of primes
.
Because the whole sequence is lazy, the inner reference to primes
in the predicate doesn’t trigger immediate evaluation.
Instead, when we evaluate something like (take 1 primes)
, the filtering process begins with 2.
At that point, primes
hasn’t produced any values yet, so the predicate checks divisibility against an empty sequence.
some
returns nil
, so 2 is accepted.
Then 3 is evaluated.
By now, primes
contains [2]
, so the predicate checks whether 3 is divisible by 2 — it isn’t, so 3 is included.
This process continues: each candidate is tested against only the primes that have already been realized.
The recursive reference always resolves to a finite, previously computed prefix.
This bounded visibility ensures that the recursive self-reference remains well-defined and terminates at each step.
Why :once
matters
In the definition of lazy-seq
, the closure is marked ^{:once true}
.
This tells the compiler that the function will be invoked at most once.
This isn’t strictly necessary for correctness in the primes definition, but it is important for performance and memory safety.
Here's a simple example to illustrate how :once
works:
(let [f (let [i (iterate inc 2)]
(^:once fn* [] (first i)))]
[(f) (f)])
;; => [2 nil]
The second call to f
returns nil
because the closure was already invoked and then invalidated.
If :once
were not used in lazy-seq
, the closure returned by lazy-seq
could be accidentally invoked multiple times.
That would allow recomputation of the sequence prefix, which is not only inefficient but could lead to infinite recursion or stack overflow, especially in recursive definitions like primes
.
More subtly, omitting :once
would prevent the compiler from discarding closed-over references.
In this example, it would mean the closure holds onto a reference to primes
itself.
As a result, the JVM’s garbage collector wouldn't be able to free earlier parts of the sequence, leading to unnecessary memory retention.
So while :once
is not the reason lazy evaluation works, it plays a crucial supporting role in making it safe and efficient for self-referential definitions.
Conclusion
This deceptively simple definition of primes relies on deep interactions between Clojure’s lazy evaluation model, self-referential bindings, and the one-shot semantics of fn*
with :once
.
While not the most efficient way to compute primes, it showcases the expressiveness and elegance of functional programming in Clojure.
It also serves as a gateway to understanding the subtleties of laziness, memoization, and evaluation in the language.
I’m @siclait on Twitter — reach out to continue the conversation.