1. Collections Pretty Slow
Scala World 2015, Penrith
Bill Venners
Artima, Inc.
Escalate Software
Monday, September 21, 2015
2. • Work in progress
• In the feature-equasets branch of the scalatest repo
• Goal is to explore ways to simplify the Scala standard collections
Scalactic Collections
3. • Want to keep things stable
• Want to make things better
• Deprecate..., then remove
• Very rarely, break source
• Use needles not machetes
The Straitjacket of Compatibility
4. // Case Insensitive String Wrapper
case class CIS(value: String) {
  override def equals(other: Any): Boolean = {
    other match {
      case CIS(s) => s.toLowerCase == value.toLowerCase
      case _ => false
    }
  }
  override def hashCode: Int = value.toLowerCase.hashCode
}
scala> val ciSet = Set(CIS("hi"), CIS("HI"), CIS(" hi "))
ciSet: scala.collection.immutable.Set[CIS] = Set(CIS(hi), CIS( hi ))
Driving use case for “EquaSet”s:
Set with custom equality
5. // Whitespace Insensitive String Wrapper
case class WIS(value: String) {
  override def equals(other: Any): Boolean = {
    other match {
      case WIS(s) => s.trim == value.trim
      case _ => false
    }
  }
  override def hashCode: Int = value.trim.hashCode
}
scala> val wiSet = Set(WIS("hi"), WIS("HI"), WIS(" hi "))
wiSet: scala.collection.immutable.Set[WIS] = Set(WIS(hi), WIS(HI))
6. scala> ciSet union Set(CIS("ha"), CIS("HA"), CIS(" ha "))
res1: scala.collection.immutable.Set[CIS] = Set(CIS(hi), CIS( hi ), CIS(ha), CIS( ha ))
scala> wiSet union Set(WIS("ha"), WIS("HA"), WIS(" ha "))
res2: scala.collection.immutable.Set[WIS] = Set(WIS(hi), WIS(HI), WIS(ha), WIS(HA))
scala> ciSet union wiSet
<console>:14: error: type mismatch;
found : scala.collection.immutable.Set[WIS]
required: scala.collection.GenSet[CIS]
ciSet union wiSet
^
What about union, intersect, diff?
9. You know how I’d do that...
Equality[E]
Collections[E]
Set[+T]
val ci: Collections[String] = ...
ci.Set[String]("hi")
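The three-layer idea on this slide (an equality, a Collections instance built from it, and a Set type nested inside that instance) can be sketched with plain path-dependent types. This is only an illustration under stated assumptions: the `PathDependentSketch` object, the `normalize` function standing in for `Equality[E]`, and the `Map`-backed representation are all hypothetical and are not the Scalactic implementation. The nested Set is also left invariant here to keep the sketch short.

```scala
// Hypothetical sketch: names echo the slide but this is NOT the Scalactic API.
object PathDependentSketch {

  // `normalize` stands in for Equality[E]: two elements are "equal"
  // when their normalized forms are equal.
  final class Collections[E](normalize: E => E) {

    // Set is an inner class, so ci.Set and wi.Set are different types.
    final class Set private (private val byKey: Map[E, E]) {
      // On a key collision the element already in this set is kept.
      def union(that: Set): Set = new Set(that.byKey ++ byKey)
      def contains(e: E): Boolean = byKey.contains(normalize(e))
      def size: Int = byKey.size
      def elements: Iterable[E] = byKey.values
    }

    object Set {
      def apply(elems: E*): Set =
        new Set(elems.foldLeft(Map.empty[E, E]) { (m, e) =>
          val k = normalize(e)
          if (m.contains(k)) m else m.updated(k, e) // first element wins
        })
    }
  }

  val ci = new Collections[String](_.toLowerCase) // case-insensitive
  val wi = new Collections[String](_.trim)        // whitespace-insensitive

  val ciSet = ci.Set("hi", "HI", " hi ")
  val wiSet = wi.Set("hi", "HI", " hi ")
  val ciUnion = ciSet.union(ci.Set("ha", "HA", " ha "))

  // ciSet.union(wiSet) // does not compile: wi.Set is not a ci.Set
}
```

Because `union` only accepts the Set type of the same Collections instance, mixing sets that use different equalities is rejected at compile time rather than producing a set with ambiguous semantics.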
10. scala> import org.scalactic._
import org.scalactic._
scala> import StringNormalizations._
import StringNormalizations._
scala> val ci = Collections(lowerCased.toHashingEquality)
ci: org.scalactic.Collections[String] = org.scalactic.Collections@258c0257
scala> val wi = Collections(trimmed.toHashingEquality)
wi: org.scalactic.Collections[String] = org.scalactic.Collections@2b62cef9
scala> val ciSet = ci.Set("hi", "HI", " hi ")
ciSet: ci.immutable.inhabited.Set[String] = Set(hi, hi )
scala> val wiSet = wi.Set("hi", "HI", " hi ")
wiSet: wi.immutable.inhabited.Set[String] = Set(hi, HI)
11. scala> ciSet union ci.Set("ha", "HA", " ha ")
res0: ci.immutable.Set[String] = Set(hi, hi , ha, ha )
scala> wiSet union wi.Set("ha", "HA", " ha ")
res1: wi.immutable.Set[String] = Set(hi, HI, ha, HA)
scala> ciSet union wiSet
<console>:24: error: type mismatch;
found : wi.immutable.inhabited.Set[String]
required: ci.immutable.Set[?]
ciSet union wiSet
^
What about union, intersect, diff?
12. What about that plus sign? Set[+T]
Fruit
Orange
Valencia
Apple
14. scala> val orangeSet = Set(Orange(true), Orange(true))
orangeSet: scala.collection.immutable.Set[Orange] = Set(Orange(true))
scala> orangeSet + Apple(true)
<console>:19: error: type mismatch;
found : Apple
required: Orange
orangeSet + Apple(true)
^
scala> val fruitSet = orangeSet.map(o => o: Fruit)
fruitSet: scala.collection.immutable.Set[Fruit] = Set(Orange(true))
scala> fruitSet + Apple(true)
res22: scala.collection.immutable.Set[Fruit] = Set(Orange(true), Apple(true))
Scala Sets are invariant
15. Intensional versus Extensional Sets
• Scala Set complects intensional and extensional,
so invariant
• Scalactic Set models extensional only,
so covariant
• Scalactic Membership models intensional only,
so contravariant
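The variance claims above can be checked with a small standalone sketch. It assumes nothing from Scalactic: `VarianceSketch`, `Listed`, and this `Membership` trait are hypothetical stand-ins written only to show why a rule-based (intensional) type can be contravariant while an element-listing (extensional) type can be covariant.

```scala
// Hypothetical sketch of the variance argument; not the Scalactic types.
object VarianceSketch {
  sealed trait Fruit { def ripe: Boolean }
  final case class Orange(ripe: Boolean) extends Fruit
  final case class Apple(ripe: Boolean) extends Fruit

  // Intensional: defined by a rule. T is only consumed, so -T is sound.
  trait Membership[-T] { self =>
    def apply(t: T): Boolean
    def complement: Membership[T] =
      new Membership[T] { def apply(t: T): Boolean = !self(t) }
    def union[U <: T](that: Membership[U]): Membership[U] =
      new Membership[U] { def apply(u: U): Boolean = self(u) || that(u) }
  }
  object Membership {
    def apply[T](p: T => Boolean): Membership[T] =
      new Membership[T] { def apply(t: T): Boolean = p(t) }
  }

  // Extensional: defined by listing elements. T is only produced, so +T is sound.
  final class Listed[+T](val elements: List[T])

  val ripe: Membership[Fruit] = Membership[Fruit](_.ripe)
  val ripeOranges: Membership[Orange] = ripe // contravariance: a rule on Fruit works on Orange
  val unripe: Membership[Fruit] = ripe.complement
  val all: Membership[Fruit] = ripe.union[Fruit](unripe)

  val oranges = new Listed[Orange](List(Orange(true)))
  val fruits: Listed[Fruit] = oranges // covariance: a list of Oranges is a list of Fruit
}
```

A type that offers both a membership test taking `T` and iteration producing `T` uses `T` in both positions, which is why the standard `Set` ends up invariant.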
16. U >: T <: E
Equality[E]
Collections[E]
Set[+T]
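One way to read the bound `U >: T <: E` is as the signature of an element-adding method on a covariant set nested in `Collections[E]`. The sketch below is an assumption-laden illustration, not the Scalactic source: `BoundedSketch`, the `List`-backed Set, and the plain (non-case) fruit classes are all hypothetical.

```scala
// Hypothetical sketch of a covariant Set with an upper bound; not the Scalactic API.
object BoundedSketch {
  trait Fruit
  class Orange extends Fruit
  class Valencia extends Orange
  class Apple extends Fruit

  final class Collections[E] {
    // Covariant in T, but T may never escape the equality's domain E.
    final class Set[+T <: E](val elements: List[T]) {
      // U may widen T (lower bound) but only up to E (upper bound).
      def +[U >: T <: E](u: U): Set[U] =
        if (elements.contains(u)) new Set[U](elements)
        else new Set[U](u :: elements)
    }
  }

  val fr = new Collections[Fruit]
  val valencias: fr.Set[Valencia] = new fr.Set[Valencia](List(new Valencia))

  val anOrange = new Orange
  val anApple = new Apple
  val oranges: fr.Set[Orange] = valencias + anOrange // widens to Orange
  val fruits: fr.Set[Fruit] = valencias + anApple    // widens to Fruit

  // valencias + 88 // does not compile: no U with U >: Valencia, U <: Fruit admits Int
}
```

The upper bound is what stops the element type from silently widening to `Any`, since the equality backing the collection is only defined for `E`.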
17. scala> val fr = Collections[Fruit]
fr: org.scalactic.Collections[Fruit] = org.scalactic.Collections@70798f1a
scala> val valenciaSet = fr.Set(Valencia(true))
valenciaSet: fr.immutable.inhabited.Set[Valencia] = Set(Valencia(true))
scala> valenciaSet + Orange(true)
res4: fr.immutable.inhabited.Set[Orange] = Set(Valencia(true), Orange(true))
scala> valenciaSet + Apple(true)
res5: fr.immutable.inhabited.Set[Fruit] = Set(Valencia(true), Apple(true))
Scalactic Sets are covariant
(but with an upper bound)
18. scala> valenciaSet + 88
<console>:30: error: inferred type arguments [Any] do not conform to method +'s
type parameter bounds [U >: Valencia <: Fruit]
valenciaSet + 88
^
<console>:30: error: type mismatch;
found : Int(88)
required: U
valenciaSet + 88
^
So, Any is not inferred here:
19. scala> val evenInts = Membership { (i: Int) => (i & 1) == 0 }
evenInts: org.scalactic.Membership[Int] = <membership>
scala> val oddInts = evenInts.complement
oddInts: org.scalactic.Membership[Int] = <membership>
scala> (evenInts(0), evenInts(1))
res0: (Boolean, Boolean) = (true,false)
scala> (oddInts(0), oddInts(1))
res1: (Boolean, Boolean) = (false,true)
scala> val allInts = oddInts union evenInts
allInts: org.scalactic.Membership[Int] = <membership>
scala> (allInts(0), allInts(1))
res2: (Boolean, Boolean) = (true,true)
Membership has Set operations: intersect, union, diff, complement
21. scala> Collections.default
res10: org.scalactic.Collections[Any] = org.scalactic.Collections@79eb4d3e
scala> import Collections.default._
import Collections.default._
scala> val dfSet = Set("hi", "HA", " ha ")
dfSet: org.scalactic.Collections.default.immutable.inhabited.Set[String] = Set(hi, HA, ha )
scala> dfSet + 88
res1: org.scalactic.Collections.default.immutable.inhabited.Set[Any] = Set(hi, HA, ha , 88)
Default collections behave like the Scala standard collections
22. scala> val ciView = ciSet.map(_.length)
ciView: org.scalactic.views.inhabited.SetView[Int] = FastSetView(2,4)
scala> val wiView = wiSet.map(_.length)
wiView: org.scalactic.views.inhabited.SetView[Int] = FastSetView(2,2)
scala> ciView.force
res2: org.scalactic.Collections.default.immutable.inhabited.Set[Int] = Set(2, 4)
scala> wiView.force
res3: org.scalactic.Collections.default.immutable.inhabited.Set[Int] = Set(2)
map, flatMap, etc. are lazy
.force gives you Collections.default
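The lazy-transform-then-force behavior can be sketched on top of the standard library `Set`. Everything here is an assumption made for illustration: `LazyViewSketch`, this two-parameter `SetView`, the `view` helper, and the `evaluations` counter are hypothetical and do not reflect how Scalactic's views are implemented.

```scala
// Hypothetical sketch of a lazy view over a standard Set; not Scalactic's SetView.
object LazyViewSketch {
  // Counts how many times the mapping function actually runs.
  var evaluations = 0

  // Holds the source set and the composed function; nothing runs until force.
  final class SetView[A, B](source: Set[A], f: A => B) {
    def map[C](g: B => C): SetView[A, C] = new SetView[A, C](source, f andThen g)
    // Applies the composed function and builds a strict default Set.
    def force: Set[B] = source.map(f)
  }

  def view[A](s: Set[A]): SetView[A, A] = new SetView[A, A](s, (a: A) => a)

  val lengths: SetView[String, Int] =
    view(Set("hi", "HI", " hi ")).map { s => evaluations += 1; s.length }

  val before = evaluations        // still 0: map only composed functions
  val forced: Set[Int] = lengths.force
  val after = evaluations         // 3: one call per source element
}
```

Forcing into a plain `Set[Int]` collapses the two length-2 results into one element, which matches the slide's point that the mapped result no longer carries the original custom equality.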
23. scala> val wiBang = wiSet.map(_ + "!")
wiBang: org.scalactic.views.inhabited.SetView[String] = FastSetView(hi!,HI!)
scala> wiBang.force
res11: org.scalactic.Collections.default.immutable.inhabited.Set[String] = Set(hi!, HI!)
scala> wiBang.forceInto(wi)
res12: wi.immutable.inhabited.Set[String] = Set(hi!, HI!)
scala> wiBang.forceInto(ci)
res13: ci.immutable.inhabited.Set[String] = Set(hi!)
.forceInto lets you specify a non-default
Collections instance for the result
24. CanBuildFrom shows up rarely
• Use overriding and covariant return types
where possible
• Because transformation methods return Views,
no need for CanBuildFrom in strict types
• Because of reduced inheritance, often don’t
need it in Views
• Sometimes, CanBuildFrom will show up in the
force and forceInto methods: e.g., MapView
• Puzzler: No BitSet in Scalactic Collections.
27. • No Collections.Seq, Set, or Map that could be
either mutable or immutable
• SeqView does not extend immutable.Seq
• Views are transient helpers, and don’t have a
hierarchy
• I.e., HashSetView does not extend SetView
• Instead, there’s one XView for each X
Less Inheritance
29. Focus design on user experience,
not implementer experience
• No types solely for implementation in public
interface
• All types in public interface are for users
• How do you prevent bugs given code
duplication?
• Use code generation where helpful
• Test
32. • Design for busy teams.
• Make it obvious, guessable, or easy to remember.
• Design for readers, then writers.
• Make errors impossible, or difficult.
• Exploit familiarity.
• Document with examples.
• Minimize redundancy.
• Maximize consistency.
• Use symbols when your users are already experts in them.
• Minimize the magic.
Simplicity in Scala Design