Monday, February 18, 2008

Typesafe Ranged Integers in Scala

For the past few weeks I've been working on a basic numeric library for Scala for eventual contribution to the Scalax project. My goal is to create an extensible, type-safe library that provides numeric types that are easy to use and more closely resemble their mathematical foundations than the primitive operations on which they are built. My initial focus has been on building a working rational numbers implementation with unlimited precision integers. However, today on the Scala mailing lists, someone raised the issue that Scala's use of primitive numbers doesn't belong in such a high-level language. When I read this, I thought "Hmmmm, my library helps with that."

Integer Underflow and Overflow

First, let's consider integer underflow and overflow. The standard Java/Scala integer type is a 32-bit twos-complement signed integer. That means it can representing numbers between 2147483647 and -2147483648 inclusive. Often times this is plenty range, so what we really want is for exceptions to be thrown when underflow and overflow. So consider the following:

scala> import scalax.math.Int32Domain._
import scalax.math.Int32Domain._

scala> val a: Int32 = 5
a: scalax.math.Int32Domain.Int32 = 5

scala> val b: Int32 = 7
b: scalax.math.Int32Domain.Int32 = 7

scala> a + b
res0: scalax.math.Int32Domain.N = 12

scala> a * b
res1: scalax.math.Int32Domain.N = 35

scala> val i: Int32 = 2147483647
i: scalax.math.Int32Domain.Int32 = 2147483647

scala> i + 1
java.lang.ArithmeticException: result exceeds maximum value: 2147483647
at scalax.math.Int32Domain$Int32.checkRange(int32.scala:55)
at scalax.math.Int32Domain$Int32.$plus(int32.scala:60)
at .(:8)
at .()
at RequestResult$.(:3)
at RequestResult$.()
at RequestResult$result()
at sun.reflect.NativeMethodAccess...

Magic! Overflow has been detected. Well, not quite magic. The way it works (and I'm sure it could work much, much better, I just through this together tonight) is that it converts the 32-bit integers into 64-bit integers prior to calculations, performs the calculations, then checks the result to ensure that it is within range. If it is within range, then the result is converted back into a 32-bit integer and returned. Between the usage of an object instead of a primitive and all this extra work for range checking, this class is probably at least an order-of-magnitude slower that using primitive Ints.

Bounded Ranges

Now let's consider bounded ranges. In order to accomplish this, we simply create an object that extends Int32 and overrides its minimum and maximum value.

scala> object SmallDomain extends scalax.math.Int32Domain {
| override val max: N = 10
| override val min: N = -10
| }
defined module SmallDomain

scala> val d = SmallDomain.Integer(5)
d: SmallDomain.N = 5

Now we have a domain that is limited from -10 to 10 inclusive. Attempting to mix integers from this domain with integers from other domains will yield a type error:

scala> i + d
:9: error: type mismatch;
found : SmallDomain.N
required: scalax.math.Int32Domain.N
i + d
^

scala> d + i
:9: error: type mismatch;
found : scalax.math.Int32Domain.Int32
required: SmallDomain.N
i + a
^

Even if you had a Int32 within the range of SmallDomain, you would still receive the type error. Also, as you can see, Int32 will not allow itself to be mixed with an integer from SmallDomain. If you look at the code (which I'll publish soon, I promise) for Int32Domain and its superclasses, you will see a lot of complex stuff involving mixins and type constructors. However, the end class is easily extensible by the user and still provides good type safety.

Rationals

I mentioned rationals at the beginning of this blog, so I thought I would give a little sneak-peak:

scala> import scalax.math.BigRationalDomain._
import scalax.math.BigRationalDomain._

scala> val a = Integer(3)
a: scalax.math.BigRationalDomain.IS = 3

scala> val b = Integer(7)
b: scalax.math.BigRationalDomain.IS = 7

scala> a / b
res0: scalax.math.BigRationalDomain.N = 3/7

scala> a.inverse
res1: scalax.math.BigRationalDomain.N = 1/3

scala> a.inverse * a
res2: scalax.math.BigRationalDomain.N = 1

scala> (a / b)^2345
res3: scalax.math.BigRationalDomain.N = 70687450412160609527952431393691553815800419127671886519112946722799601077076407951140254943159198319665943578284183744314587093153151823879346566151398437630838022610858501398173872472014103905434680267558985498302895510754430260294086995134629615707980341285332639084519209821403179213183556756416404920769037158823806045214292550723098516538040...

scala> a * b
res4: scalax.math.BigRationalDomain.I = 21

scala> Integer(3) / Integer(6)
res5: scalax.math.BigRationalDomain.N = 1/2

Notice how an integer times an integer equals an integer, but an integer divided by an integer equals a rational. This is because integer is a subtype of rational, because in all integers are rationals, but not all rationals are integers.

More to come soon!

Sphere: Related Content

Friday, February 08, 2008

Linguistic Success

There's a new favorite pastime on the fringes of the Scala community. That pastime is blogging about aspects of the Scala language that will prevent it from being a "success" unless they are quickly addressed. "Success" is roughly defined as "widespread commercial use." A good metric might be: "How hard is it to find a job programming Scala?" The premise of the criticism usually revolves around one or more complicated, terse, or "foreign" (relative to the author) constructs that are common Scala, or at least favorites among frequent posters, and how these constructs will prevent "average" programmers from understanding Scala and thereby prevent commercial adoption. A recent favorite is symbolic function names (/: and :\ for folds).

The logic of this argument seems relatively sound. When choosing a technology for a project, it is very important to consider the availability of potential employees who know that technology. Forcing every new member to scale a potentially steep learning curve is a frightening prospect. I can imagine development managers lying awake at night fearing that their expert in some obscure technology will leave, and it will take months to replace him. It's a legitimate, although I think slightly exaggerated, fear.

That being said, I think it has little to do with the adoption of a new language. The majority of programmers simply do not spend their free time learning new languages. Many, maybe most, won't even use their free time to learn languages that they know have ready pre-existing demand. They learn when they are paid to learn, or when failing to learn will lead to immediate negative consequences for their career. I think the same can be said of most professions. Most people work because they have to, not because they want to.

Consequently, I think expanding beyond a core of enthusiasts is very difficult, if not impossible, simply be attracting people to the language. Right now some leading-edge Java people are taking a look at Scala, because they think it might be the next big thing in Java-land. These people are different than enthusiasts. Enthusiasts will learn a language for the sake of learning it. The leading-edge folks learn it as a high risk investment. If they can get a head start on the next big thing, it will be great for their careers (and businesses). These people constantly think "can I sell using this technology?" and "if I do sell it, while it come back to haunt me?" This is a very pragmatic perspective, and it is the perspective I take when I'm at work.

Confusing, odd-ball language features make the sell a lot harder. Pitching them as features increases personal risk.

But it doesn't matter.

Why? Because the vast majority of developers are not going to learn a new language because they want to, they are going to learn it because they have to. Not to mention that there are countless languages out there, so going after the enthusiasts and leading-edgers (who are mostly lookers anyway) is just fishing in an already over-fished pond.

So how does a language become a success?

Enough people use it for real projects. Let's say a consultant rapidly prototypes an application using the technology, and that application makes it into production. Now maintenance programmers have to learn that technology. Sure, it's complicated, but unlike the guy learning it on free weekends, the maintenance programmers have all-day every-day. It's hard to learn complex concepts a hour or two at a time, but these guys have all day, and the next day. It's hard to memorize something by looking at it a couple hours a week, but spend all day staring and it and it will click. Not to mention that their livelihoods depend on it. Any sort of "cowboy" development team can cause this to happen, and frankly such teams are pretty common.

So maybe one-in-five maintenance programmers actually like the technology, and admire the cowboys, so when they go get a chance to do new development, they use it, too.

The same thing can happen with products from startups. Let's say a startup builds a piece of enterprise software using Scala. They sell it to big, conservative companies by emphasizing the Java aspect. They sell customization services, too. And then it's back to the maintenance programmer, who has no choice.

Notice a pattern? The key to language success is making it powerful enough for a couple cowboys to do the work of an entire team in a shorter period of time. Selling fast and cheap is easy. If you have enough fast and cheap, the business people won't care if you are making it out of bubble-gum and duct-tape, because you are giving them what they want.

The key to success is making the reward justify the risk. Judging by what some people using Scala for real-world projects are saying, and my one hands-on experience, I think Scala offers it. It's just a matter of time before it sneaks its way into enterprises, just like Ruby has.

Sphere: Related Content