Can Scala collection’s seq/par/view/force be seen as a violation of the uniform return type principle?

Spread the love

Question Description

Most of the implementation complexity of the collection framework arises from the fact, that Scala can – unlike C#’s LINQ or other collection frameworks – return the “best” collection type for higher order functions:

val numbers = List(1,2,3,4,5)
numbers map (2*) // returns a List[Int] = List(2, 4, 6, 8)

val doubles = Array(1.0, 2.0, 3.0)
doubles filter (_ < 3) // returns Array[Double] = Array(1.0, 2.0)

Why does this principle not hold for methods like seq, par, view, force?

numbers.view.map(2*).force 
// returns Seq[Int] = List(2, 4, 6, 8)

numbers.seq 
// returns scala.collection.immutable.Seq[Int] = List(1, 2, 3, 4)

doubles.par.seq 
// returns scala.collection.mutable.ArraySeq[Double] = ArraySeq(1.0, 2.0, 3.0)

Is there a technical limitation which prevents it from working?
Or is this by design/intent?
Considering that LINQ is basically lazy, Scala's equivalent (view, force) isn't more type-safe (only when using the strict methods), right?

Practice As Follows

It is possible to embed more type information into the parallel collection classes so that you get back the collection you've started from, that's true. This would mean that after turning a List into a ParVector by calling par (in O(n), because elements are copied into the vector) and then calling seq, you would again get a List. To obtain the list with seq, you would have to copy all the elements from the vector back into the list. What happens instead is:

  • ParVector gets converted back into a Vector when seq is called - it gets converted in O(1)
  • calling par again on this vector will give you a ParVector in O(1), since they both vector and the parallel vector share the same underlying data

Notice that a collection such as a list has to be restructured when turned into a parallel collection, otherwise the operations on it cannot be efficiently parallelized.

Thus, you don't have to repetitively pay for copying when calling par and seq - the conversions become much more efficient. Since the primary goal of parallel collections was to increase efficiency, this was deemed more important than the uniform return type principle.

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.