Posted on Sat 21 February 2015

Using More Expressive Types

As a Scala user, I spend a lot of my time thinking about my types. Making good use of types is the key to compile-time correctness checking, since compilers are not yet good at checking control flow. For example, consider the following:

def foo1(o: Option[String]) = {
  if(o.isDefined) {
    println(o.get)
  }
}

def foo2(o: Option[String]) = {
  o.foreach(println)
}

The first method, foo1, is basically a translation of the Java approach to null checking (if(o != null)), and foo2 is the more idiomatic Scala approach. As long as you make no mistakes, they do the same thing. The difference is that in the first approach correctness is achieved with manual control flow, while the second relies on our types for correctness.

If we made a mistake:

def foo3(o: Option[String]) = {
  if(o.isEmpty) {
    println(o.get) // runtime exception!!
  }
}

then this is a runtime error. Compilers are much better at saving us from type-correctness mistakes than from control flow mistakes. So then, how can we take better advantage of this? How can we move as much of our logic as possible into our compiler-verified types?
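As a small illustration of leaning on types rather than control flow, here is a sketch of a few ways to consume an Option without ever calling get, so there is no condition to write backwards. The object and method names (OptionSketch, describe, lengthOrZero, shout) are made up for this example.

```scala
object OptionSketch {
  // fold takes a value for the empty case and a function for the defined case,
  // so both cases are handled and there is no unchecked .get to misuse
  def describe(o: Option[String]): String =
    o.fold("nothing")(s => s"got $s")

  // the String is only reachable inside the function passed to map
  def lengthOrZero(o: Option[String]): Int =
    o.map(_.length).getOrElse(0)

  // pattern matching binds s only in the Some case
  def shout(o: Option[String]): String = o match {
    case Some(s) => s.toUpperCase
    case None    => ""
  }
}
```

For instance, OptionSketch.describe(None) returns "nothing" and OptionSketch.lengthOrZero(Some("abc")) returns 3.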

Perhaps the most important thing types do for our correctness is not the methods they provide, but actually the methods they do not provide. The more limitations our types have, the less possibility we have for bugs.

Consider this example, from Adelbert Chang's great talk, Reasoning with Types.

def sum(xs: List[Int]): Int

/*

All of the following implementations would compile!

*/

def sum(xs: List[Int]) = 40
def sum(xs: List[Int]) = xs.foldLeft(0)(_ + _)
def sum(xs: List[Int]) = xs.foldLeft(0)(_ + _) - 1

Since the type signature doesn't limit our implementation, we can return any integer, or have a bug that puts our answer off by some integer value. We can improve this a little bit by telling the method less about the types:

def sum[A](xs: List[A])(implicit A:Numeric[A]): A = xs.foldLeft(A.zero)(A.plus)

Here, we're saying we have a list of A, such that A has a Numeric typeclass defined. However, looking at the signature of Numeric[T]:

trait Numeric[T] {
  def plus(x: T, y: T): T
  def times(x: T, y: T): T
  def negate(x: T): T
  def abs(x: T): T
  ...
  /* a bunch more: http://www.scala-lang.org/api/current/index.html#scala.math.Numeric */
}

it's clear that there are still a bunch of possible wrong implementations that type-check:

def sum[A](xs: List[A])(implicit A:Numeric[A]): A = xs.foldLeft(A.zero)(A.times)
def sum[A](xs: List[A])(implicit A:Numeric[A]): A = A.negate(xs.foldLeft(A.zero)(A.plus))
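To see how far off those type-checking implementations are, here is a sketch that puts the correct version and the two wrong ones side by side. The names NumericSketch, sumPlus, sumTimes, and sumNegated are hypothetical; the bodies are the ones shown above.

```scala
object NumericSketch {
  // the intended implementation
  def sumPlus[A](xs: List[A])(implicit A: Numeric[A]): A =
    xs.foldLeft(A.zero)(A.plus)

  // wrong: multiplies, starting from zero, so the result is always zero
  def sumTimes[A](xs: List[A])(implicit A: Numeric[A]): A =
    xs.foldLeft(A.zero)(A.times)

  // wrong: computes the sum and then flips its sign
  def sumNegated[A](xs: List[A])(implicit A: Numeric[A]): A =
    A.negate(xs.foldLeft(A.zero)(A.plus))
}
```

For List(1, 2, 3), sumPlus returns 6, sumTimes returns 0, and sumNegated returns -6. All three have exactly the same signature.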

Take a look at an even more restrictive type-class:

trait Monoid[T] {
  def append(f1: T, f2: T): T
  def zero: T
}

Using Monoid instead of Numeric, we've limited the possible implementations that will compile down to basically just the correct one:

def sum[A](xs: List[A])(implicit A:Monoid[A]): A = xs.foldLeft(A.zero)(A.append)
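To make this runnable, here is a minimal sketch with two hand-written instances. The wrapper object MonoidSketch and the instance names intAddition and stringConcat are assumptions for this example; libraries such as Scalaz ship their own Monoid with different method names.

```scala
object MonoidSketch {
  trait Monoid[T] {
    def append(f1: T, f2: T): T
    def zero: T
  }

  // instances live in the companion object so implicit search finds them
  object Monoid {
    implicit val intAddition: Monoid[Int] = new Monoid[Int] {
      def append(f1: Int, f2: Int): Int = f1 + f2
      def zero: Int = 0
    }

    implicit val stringConcat: Monoid[String] = new Monoid[String] {
      def append(f1: String, f2: String): String = f1 + f2
      def zero: String = ""
    }
  }

  // the only tools available here are zero and append
  def sum[A](xs: List[A])(implicit A: Monoid[A]): A =
    xs.foldLeft(A.zero)(A.append)
}
```

With these instances, MonoidSketch.sum(List(1, 2, 3)) returns 6 and MonoidSketch.sum(List("a", "b")) returns "ab". Which instance is in scope still matters: the signature only guarantees that sum combines the elements using that instance.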

Clearly, if we can limit what the compiler allows us to do to only the correct things, we'll have fewer bugs. But can this apply to real code we write every day?

Consider something I deal with in telecom software. In the SMS ecosystem, every carrier is identified by two numbers: a "Mobile Country Code" (MCC) and a "Mobile Network Code" (MNC). These IDs just happen to be integers. However, when working with them, it's never valid to do any of the things programming languages let you do to integers. It makes no sense to add to or decrement an MCC. The MCC 210 has no relationship to 209 or 211. So how can I stop the compiler from letting me do this? Or, worse yet, from letting me pass an MCC into a field where an MNC was required? I can't count the number of times this has happened thanks to lazy copy/paste. It's embarrassing.

Here's a first attempt:

type Mnc = Int
type Mcc = Int
case class Rate(amount: Decimal, currency: Currency)
case class RouteCost(mcc: Mcc, mnc: Mnc, cost: Rate)
case class Message(id: UUID, to: Did, from: Did, body: String, mcc: Mcc, mnc: Mnc)

trait Db {
  def fetchMessage(id: UUID): Message
  def fetchCost(mcc: Mcc, mnc: Mnc): RouteCost
}

We've defined type aliases for MCC and MNC, which is a good start from both an abstraction and a documentation perspective. However, type aliases in Scala are really just that: aliases. To the compiler, Mcc, Mnc, and Int are all the same type.

val message = fetchMessage(someUuid)
fetchCost(message.mcc, message.mcc) // oops!! typo.

and this will compile. So we haven't really accomplished anything.

One solution is to wrap them in their own classes

case class Mnc(val i: Int)
case class Mcc(val i: Int)

and this is ok. It solves the problem above with passing an MCC where we need an MNC. But, on the JVM, it incurs a small cost of extra runtime objects. And, it's not nice to have to wrap numbers in constructors all the time.

What we really want is something like Haskell's newtype keyword, which is like a type alias at runtime, but like a new type at compile time. It lets us say "these are not the same thing" at compile time, even if they are the same at runtime.

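Luckily, Scala 2.10 added value classes! Value classes let you wrap a primitive type and, in most cases, keep the underlying runtime representation unchanged (the wrapper can still be allocated in some situations, such as when the value is used as a generic type argument or stored in an array). All you need to do is subclass with extends AnyVal. Now,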

case class Mnc(val i: Int) extends AnyVal
case class Mcc(val i: Int) extends AnyVal
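Here is a small sketch of what these two value classes buy us. The wrapper object ValueClassSketch and the describeRoute method are hypothetical stand-ins for a real lookup like fetchCost; the commented-out lines are ones the compiler would reject.

```scala
object ValueClassSketch {
  case class Mnc(i: Int) extends AnyVal
  case class Mcc(i: Int) extends AnyVal

  // hypothetical stand-in for a real lookup; echoes its arguments
  def describeRoute(mcc: Mcc, mnc: Mnc): String =
    s"mcc=${mcc.i} mnc=${mnc.i}"

  val mcc = Mcc(210)
  val mnc = Mnc(50)

  // arguments of the right types, in the right order
  val ok = describeRoute(mcc, mnc)

  // None of these compile if uncommented:
  // describeRoute(mnc, mnc)  // an Mnc is not an Mcc
  // describeRoute(210, 50)   // a bare Int is not an Mcc
  // mcc * 2                  // Mcc has no arithmetic methods
}
```

Here ValueClassSketch.ok evaluates to "mcc=210 mnc=50". Note that bare integer literals are rejected at this stage, so every call site has to wrap its numbers explicitly.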

We can even take it one step further with implicits, so that we don't always need to use a constructor:

implicit class Mnc(val i: Int) extends AnyVal {
  override def toString = i.toString
}

implicit class Mcc(val i: Int) extends AnyVal {
  override def toString = i.toString
}

which allows code like this:

val someMnc: Mnc = 23 // implicitly converts to Mnc

// and

fetchCost(210, 50)

// and

val msg = fetchMessage(someId)
fetchCost(msg.mcc, msg.mnc)

but will fail for code like this:

val msg = fetchMessage(someId)
fetchCost(msg.mnc, msg.mnc) // required Mcc, got Mnc!

Hurray! So to sum it up, implicit value classes:

  • Give us type safety between different types that both happen to be integers
  • Have zero runtime cost for the types; at runtime we have a regular primitive int
  • Let us write literal numbers and have them converted into the appropriate types
  • Prevent us from doing unrelated "integer things", like math, to the identifiers
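The points above can be sketched in one self-contained object. Everything beyond the two implicit classes is an assumption for illustration: the wrapper object CarrierIds, the simplified fetchCost that returns a BigDecimal instead of a RouteCost, and the made-up cost value. Implicit classes cannot be top-level definitions, which is why they sit inside an object here.

```scala
object CarrierIds {
  implicit class Mnc(val i: Int) extends AnyVal {
    override def toString = i.toString
  }

  implicit class Mcc(val i: Int) extends AnyVal {
    override def toString = i.toString
  }

  // hypothetical in-memory stand-in for Db.fetchCost; the cost is made up
  def fetchCost(mcc: Mcc, mnc: Mnc): BigDecimal =
    if (mcc.i == 210 && mnc.i == 50) BigDecimal("0.0075") else BigDecimal(0)

  // an Int literal is converted because the expected type is Mnc
  val someMnc: Mnc = 23

  // both literals are converted to the parameter types
  val literalCost = fetchCost(210, 50)

  // already-typed values are passed through unchanged
  val typedCost = fetchCost(new Mcc(210), someMnc)

  // Neither of these compiles if uncommented:
  // fetchCost(someMnc, someMnc)  // required Mcc, got Mnc
  // someMnc + 1                  // no arithmetic on Mnc
}
```

One caveat with this design: because any Int literal converts to whichever type is expected, fetchCost(50, 210) with the literals swapped still compiles. The protection applies to values that already carry the Mcc or Mnc type, such as fields read from a Message.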

Learning Scala was really a huge shift for me. I'd never spent much time in statically typed languages, and it had never occurred to me that the type system can actually be a benefit to you, instead of an adversary you fight when trying to make something compile.

To the extent these techniques are even possible in languages like Java, people seem to avoid using them. In open source Java code, I often see null being abused to get around type safety, or things being "stringly typed." Java 1.9 might be adding native support for value types, however.

In any case, whatever language you're in, use the type system to its fullest. It's really the only way a compiler can prevent your bugs.

Category: misc

Tags: scala, anyval, types, newtype

© Chad Selph.