Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have written some Scala code to perform an element-wise operation on a collection. Here I defined two methods that perform the same task. One method uses zip and the other uses zipped.

def ES (arr :Array[Double], arr1 :Array[Double]) :Array[Double] = arr.zip(arr1).map(x => x._1 + x._2)

def ES1(arr :Array[Double], arr1 :Array[Double]) :Array[Double] = (arr,arr1).zipped.map((x,y) => x + y)

To compare these two methods in terms of speed, I wrote the following code:

def fun (arr : Array[Double] , arr1 : Array[Double] , f :(Array[Double],Array[Double]) => Array[Double] , itr : Int) ={
  val t0 = System.nanoTime()
  for (i <- 1 to itr) {
       f(arr,arr1)
       }
  val t1 = System.nanoTime()
  println("Total Time Consumed:" + ((t1 - t0).toDouble / 1000000000).toDouble + "Seconds")
}

I call the fun method and pass ES and ES1 as below:

fun(Array.fill(10000)(math.random), Array.fill(10000)(math.random), ES , 100000)
fun(Array.fill(10000)(math.random), Array.fill(10000)(math.random), ES1, 100000)

The results show that the method named ES1 that uses zipped is faster than method ES that uses zip. Based on these observations, I have two questions.

Why is zipped faster than zip?

Is there any even faster way to do element-wise operations on a collection in Scala?

question from:https://stackoverflow.com/questions/59598239/why-is-zipped-faster-than-zip-in-scala

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
229 views
Welcome To Ask or Share your Answers For Others

1 Answer

To answer your second question:

Is there any more faster way to do element wise operation on a collection in Scala?

The sad truth is that despite it's conciseness, improved productivity, and resilience to bugs that functional languages aren't necessarily the most performant - using higher order functions to define a projection to be executed against collections not free, and your tight loop highlights this. As others have pointed out, additional storage allocation for intermediate and final results will also have overhead.

If performance is critical, although by no means universal, in cases like yours you can unwind Scala's operations back into imperative equivalents in order to regain more direct control over memory usage and eliminating function calls.

In your specific example, the zipped sums can be performed imperatively by pre-allocating a fixed, mutable array of correct size (since zip stops when one of the collections runs out of elements), and then adding elements at the appropriate index together (since accessing array elements by ordinal index is a very fast operation).

Adding a third function, ES3 to your test suite:

def ES3(arr :Array[Double], arr1 :Array[Double]) :Array[Double] = {
   val minSize = math.min(arr.length, arr1.length)
   val array = Array.ofDim[Double](minSize)
   for (i <- 0 to minSize - 1) {
     array(i) = arr(i) + arr1(i)
   }
  array
}

On my i7 I get the following response times:

OP ES Total Time Consumed:23.3747857Seconds
OP ES1 Total Time Consumed:11.7506995Seconds
--
ES3 Total Time Consumed:1.0255231Seconds

Even more heineous would be to do direct in-place mutation of the shorter of the two arrays, which would obviously corrupt the contents of one of the arrays, and would only be done if the original array again wouldn't be needed:

def ES4(arr :Array[Double], arr1 :Array[Double]) :Array[Double] = {
   val minSize = math.min(arr.length, arr1.length)
   val array = if (arr.length < arr1.length) arr else arr1
   for (i <- 0 to minSize - 1) {
      array(i) = arr(i) + arr1(i)
   }
  array
}

Total Time Consumed:0.3542098Seconds

But obviously, direct mutation of array elements isn't in the spirit of Scala.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...