performance - What's the ideal implementation for the Sieve of Eratosthenes between Lists, Arrays, and Mutable Arrays?

Question

Ask a Question

Welcome To Ask or Share your Answers For Others

performance - What's the ideal implementation for the Sieve of Eratosthenes between Lists, Arrays, and Mutable Arrays?

asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

In Haskell, I've found three simple implementations of the Sieve of Eratosthenes on the Rosetta Code page.

Now my question is, which one should be used in which situations?

Correcting my initial reasoning would be helpful too:

I'm assuming the List one is the most idiomatic and easy to read for a Haskeller. Is it correct, though? I'm wondering if it suffers from the same problems as another list-based sieve that I then learned was not actually implementing the algorithm:
(edit: shown here is the list-based sieve I know has problems, not the one from RosettaCode, which I pasted at the bottom)

primes = sieve [2..] where
         sieve (p:x) = p : sieve [ n | n <- x, n `mod` p > 0 ]

In terms of performance, the immutable Array seems to be the winner. With an upper bound m of 2000000, the times were about:

1.3s for List
0.6s for Array
1.8s for Mutable Array

So I'd pick Array for performance.

And of course, the Mutable Array one is also easy to reason about since I have a more imperative language background. I'm not sure why I would pick this one if I'm coding in Haskell, though, since it's both slower than the others and non-idiomatic.

Code copied here for reference:

List:

primesTo m = 2 : eratos [3,5..m] where
eratos (p : xs) | p*p>m = p : xs
                | True  = p : eratos (xs `minus` [p*p, p*p+2*p..])

minus a@(x:xs) b@(y:ys) = case compare x y of
     LT -> x : minus  xs b
     EQ ->     minus  xs ys
     GT ->     minus  a  ys
minus a        b        = a

Immutable Array:

import Data.Array.Unboxed

primesToA m = sieve 3 (array (3,m) [(i,odd i) | i<-[3..m]] :: UArray Int Bool)
  where
    sieve p a 
      | p*p > m   = 2 : [i | (i,True) <- assocs a]
      | a!p       = sieve (p+2) $ a//[(i,False) | i <- [p*p, p*p+2*p..m]]
      | otherwise = sieve (p+2) a

Mutable Array:

import Control.Monad (forM_, when)
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed

primeSieve :: Integer -> UArray Integer Bool
primeSieve top = runSTUArray $ do
    a <- newArray (2,top) True           -- :: ST s (STUArray s Integer Bool)
    let r = ceiling . sqrt $ fromInteger top
    forM_ [2..r] $ i -> do
      ai <- readArray a i
      when ai $ do
        forM_ [i*i,i*i+i..top] $ j -> do
          writeArray a j False
    return a

-- Return primes from sieve as list:  
primesTo :: Integer -> [Integer]
primesTo top = [p | (p,True) <- assocs $ primeSieve top]

EDIT

I showed Turner's Sieve at the top but that's not the list-based example I'm comparing with the other two. I just wanted to know if the list-based example suffers from the same "not the correct Sieve of Eratosthenes" problems as Turner's.
It appears the performance comparison is unfair because the Mutable Array example goes through evens as well and uses Integer rather than Int...

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

831 views

1 Answer

深蓝 · Answer 1 · 2021-10-23T19:24:52+0000

This

primes = sieve [2..] where
         sieve (p:x) = p : sieve [ n | n <- x, n `mod` p > 0 ]

is not a sieve. It's very inefficient trial division. Don't use that!

I'm curious about how you got your times, there is no way that the Turner "sieve" could produce the primes not exceeding 2,000,000 in mere seconds. Letting it find the primes to 200,000 took

MUT     time    6.38s  (  6.39s elapsed)
GC      time    9.19s  (  9.20s elapsed)
EXIT    time    0.00s  (  0.00s elapsed)
Total   time   15.57s  ( 15.59s elapsed)

on my box (64-bit Linux, ghc-7.6.1, compiled with -O2). The complexity of that algorithm is O(N2 / log2 N), almost quadratic. Letting it proceed to 2,000,000 would take about twenty minutes.

Your times for the array versions are suspicious too, though in the other direction. Did you measure interpreted code?

Sieving to 2,000,000, compiled with optimisations, the mutable array code took 0.35 seconds to run, and the immutable array code 0.12 seconds.

Now, that still has the mutable array about three times slower than the immutable array.

But, it's an unfair comparison. For the immutable array, you used Ints, and for the mutable array Integers. Changing the mutable array code to use Ints - as it should, since under the hood, arrays are Int-indexed, so using Integer is an unnecessary performance sacrifice that buys nothing - made the mutable array code run in 0.15 seconds. Close to the mutable array code, but not quite there. However, you let the mutable array do more work, since in the immutable array code you only eliminate odd multiples of the odd primes, but in the mutable array code, you mark all multiples of all primes. Changing the mutable array code to treat 2 specially, and only eliminate odd multiples of odd primes brings that down to 0.12 seconds too.

But, you're using range-checked array indexing, which is slow, and, since the validity of the indices is checked in the code itself, unnecessary. Changing that to using unsafeRead and unsafeWrite brings down the time for the immutable array to 0.09 seconds.

Then you have the problem that using

forM_ [x, y .. z]

uses boxed Ints (fortunately, GHC eliminates the list). Writing a loop yourself, so that only unboxed Int#s are used, the time goes down to 0.02 seconds.

{-# LANGUAGE MonoLocalBinds #-}
import Control.Monad (forM_, when)
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed
import Data.Array.Base

primeSieve :: Int -> UArray Int Bool
primeSieve top = runSTUArray $ do
    a <- newArray (0,top) True
    unsafeWrite a 0 False
    unsafeWrite a 1 False
    let r = ceiling . sqrt $ fromIntegral top
        mark step idx
            | top < idx = return ()
            | otherwise = do
                unsafeWrite a idx False
                mark step (idx+step)
        sift p
            | r < p     = return a
            | otherwise = do
                prim <- unsafeRead a p
                when prim $ mark (2*p) (p*p)
                sift (p+2)
    mark 2 4
    sift 3

-- Return primes from sieve as list:
primesTo :: Int -> [Int]
primesTo top = [p | (p,True) <- assocs $ primeSieve top]

main :: IO ()
main = print .last $ primesTo 2000000

So, wrapping up, for a Sieve of Eratosthenes, you should use an array - not surprising, its efficiency depends on being able to step from one multiple to the next in short constant time.

You get very simple and straightforward code with immutable arrays, and that code performs decently for not too high limits (it gets relatively worse for higher limits, since the arrays are still copied and garbage-collected, but that's not too bad).

When you need better performance, you need mutable arrays. Writing efficient mutable array code is not entirely trivial, one has to know how the compiler translates the various constructs to choose the right one, and some would consider such code unidiomatic. But you can also use a library (disclaimer: I wrote it) that provides a fairly efficient implementation rather than writing it yourself.

Categories

performance - What's the ideal implementation for the Sieve of Eratosthenes between Lists, Arrays, and Mutable Arrays?

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags