Monday, December 31, 2007

Double (Im)precision

Most computer users have, at one time or another, received an odd result from a spreadsheet or other program that performs calculations. Most programmers know that this is because of the impedance mismatch between the most common way for computers to "think" about numbers (base 2) and the way most people and businesses think about numbers (base 10). This issue receives a fair amount of attention, probably because we (we being programmers) have all had to explain to users why a piece of software can't seem to do arithmetic properly. If you're scratching your head right now, or want to know more, I suggest reading this FAQ about decimal arithmetic.

However, as I've long known but rarely contemplated, there is another form of floating point error that can cause significant problems. Standard floating point numbers only store so many digits of precision. The result is that if you add a very large number to a very small number, the very small number simply vanishes. Let me demonstrate:

scala> 10000000000000000.0 + 1.0
res62: Double = 1.0E16
scala> 

The 1.0 simply vanished, because standard double-precision floating point numbers don't have enough digits to store the 1.0 part of such a large number. Doubles are an approximation, and that's fine, because oftentimes we're just approximating things anyway, and the roughly 15 decimal digits of precision provided by a double are plenty, right?

Well, actually, no. It depends on what you're doing with them. A month or so ago my father convinced me that instead of spending my free time playing with AI problems and studying fringe programming languages, I should do something useful like predict stock price movements. Of course I can use AI and fringe programming languages to do this...

The first thing I decided to do was to smooth out stock price history and make it continuous by fitting a polynomial. So I wrote a quick polynomial class, grabbed Jama (a Java matrix library), downloaded some data, wrote a function to search for the appropriate degree, and this is what I got (supposedly with about 4% error):

Hmmm...that looks kind of funny. The blue line is the actual price. The red line is the polynomial, which has a degree of 44. That's a rather large degree, but certainly not enough to generate all those squiggles. Those squiggles are an artifact of double precision numbers not being precise enough to be used in calculating a degree-44 polynomial. They don't work that well for the matrix calculations that produce the polynomial, either.
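The fitting code itself isn't in this post, but for reference, a least-squares polynomial fit with Jama boils down to building a Vandermonde matrix and solving the overdetermined system. Roughly something like this sketch, in double precision, which is exactly where the trouble comes from (the names are mine, not from my actual code):

  import Jama.Matrix

  // Fit a polynomial of the given degree to the points (xs(i), ys(i)) by least
  // squares. The coefficients come back lowest order first.
  def fitPolynomial(xs: Array[Double], ys: Array[Double], degree: Int): Array[Double] = {
    val a = new Matrix(xs.length, degree + 1)
    for (i <- 0 until xs.length; j <- 0 to degree)
      a.set(i, j, java.lang.Math.pow(xs(i), j))   // Vandermonde matrix
    val b = new Matrix(ys, ys.length)             // column vector of observed prices
    a.solve(b).getColumnPackedCopy                // least-squares solution when rows > columns
  }

With a degree in the forties, the powers in that matrix span a huge range of magnitudes, so the rounding error piles up quickly.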

I think that red squiggly line is a very good illustration of what happens when you don't pay enough attention to how your software works under the hood. Anyway, here's what the results look like using floating point numbers (courtesy of JScience) with 80 digits of precision (keeping the error at about 4%).

Looks a lot better, doesn't it? One interesting thing is that the resulting polynomial has a much, much lower degree than what was produced by the same algorithm using double precision numbers. With double precision numbers, 4% error was about as good as it could get with this dataset. However, using the higher-precision numbers it can go much lower. Below is a graph at about 2% error:

At this point you might be wondering why anyone would use double precision numbers for this type of thing, or perhaps thinking I'm a moron for trying. I certainly felt a little stupid. But using doubles is relatively common. Matlab and everyone's favorite Excel use doubles. Many of the other toolkits and libraries that I found do as well.

Overall, either I'm missing something (because the use of doubles is so common), or this is very frightening. I'm not an expert in this area. I know that doubles use much, much less memory and computational resources. They are also "just there" in pretty much all programming environments, so they are the default. Making arbitrary precision floating point numbers work right wasn't a cakewalk, either. I spent the better part of a day tracing through my code, only to discover a bug in JScience. Also, once you have everything working, you have to figure out what precision to use. The default of 20 digits of precision in JScience wasn't noticeably better than regular doubles.
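As an aside, you don't even need JScience to see the difference on the toy example at the top of this post; plain java.math.BigDecimal, which is exact for addition, keeps the 1.0 around. A quick illustration, not the code used for the curve fitting:

  import java.math.BigDecimal

  val big = new BigDecimal("10000000000000000")
  println(big.add(BigDecimal.ONE))    // prints 10000000000000001, nothing lost
  println(10000000000000000.0 + 1.0)  // prints 1.0E16, the 1.0 vanishes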


Tuesday, December 11, 2007

Data Center Meltdown

Denial and the coming “data meltdown” by ZDNet's Michael Krigsman -- Subodh Bapat, Sun Microsystems eco-computing vice president, believes we’ll soon see the first world-class data center meltdown. According to News.com: “You’ll see a massive failure in a year,” Bapat said at a dinner with reporters on Monday. “We are going to see a data center failure of that scale.” “That scale” referred to the problems [...]

Now let's think about that. How can a datacenter fail?

  1. It can lose power for an extended period. It takes a lot of backup generators to keep a 50 megawatt datacenter humming.
  2. A virus or worm can shut it down.
  3. A natural disaster can destroy it or force a shutdown. Oftentimes this is due to a power failure rather than destruction.
  4. It can have its WAN/Internet connection(s) severed. This isn't quite catastrophic unless you're on the other side of the WAN.

Michael has pointed out that Subodh Bapat doesn't point to the expected cause of a major data center meltdown; he just says one is coming. That's because it's not really a matter of one cause. There are so many risks, and when you multiply them by the growing number of data centers, what you come up with is that there's bound to be a major failure soon. We just don't know what the precise cause will be.

Most of the threats involve a geographically localized destruction or disabling of the data center. This means you need off-site recovery, and it probably needs to be fast. That means you probably need more than recovery: you need one or more hot sites that can take over the load of one that fails. This is extremely expensive for a "normal" data center. I can hardly imagine how much it would cost for a 50+ megawatt facility. Basically what we have is too many eggs in one basket, with economies of scale pushing us to keep putting more eggs in the same basket.

What surprises me is that Subodh Bapat didn't say, oh, well, Sun has the solution, considering that Jonathan Schwartz put it forward over a year ago. Ok, well, he didn't exactly suggest Blackbox as a means of easily distributing data centers. But think about it. If you're a large corporation you are probably already geographically distributed. If you expand your data center in cargo-container sized units across the nation (or world), you are half..err..maybe one quarter of the way there. You still have to figure out how to make them be hot sites for each other, or at least recovery sites. But at least losses at any given site would be minimized.


Sunday, December 09, 2007

Why isn't enterprise software sexy?

Robert Scoble asks: Why isn't enterprise software sexy?

A couple of the Enterprise Irregulars and several commenters respond: Because it's supposed to be reliable and effective, not sexy.

I think the reasons are more fundamental than that. Consider:

  1. Most enterprise software customers have mutual non-disclosure agreements with the software vendors.
  2. Most bloggers have day jobs.
  3. Most companies do not want their employees violating the terms of contracts.
  4. Most companies do not want their employees airing the company's dirty laundry in a public forum.
  5. Many of the most interesting pieces of information surrounding enterprise software involve dirty laundry.

Personally, I have two ground rules for blogging:

  1. Keep it professional. No personal matters, politics, etc.
  2. Keep my employer, coworkers, and my employer's customers and suppliers out of it.

Let's face it. Most IT and software development projects don't go right. The commercial software that you use doesn't perform half as well as the salesman said it would. Consultants have mixed motives, and more importantly aren't miracle workers. The internal team is often overextended and working outside its core area of expertise. Goals are unclear, completely undefined, or changing on a regular basis. Politics are everywhere on all sides.

This is the reality of business, and talking about it in the general case is OK. People read it, nod their heads, and think "Been there, done that, doing it again right now." But specifics are quite different. Specifics can lead to embarrassment, contract violations, and lost sales.

More people don't blog about enterprise software because it strikes too close to home. I don't think it has anything to do with whether enterprise software is sexy or not.

Update:

It just occurred to me that I probably wasn't very clear here on a few points. What I mean by "enterprise software" is enterprise application software, like SAP, Oracle's various applications, PeopleSoft, etc. I don't mean infrastructure software like DBMSes, application servers, operating systems, etc. There is plenty of good information freely available on infrastructure, and people blog about it all the time.

Also, if you look at a lot of the blogs that are out there for, for example, SAP, they are almost all too high level to be particularly useful. It's pretty easy to find platitudes about how you need the right team, buy-in, executive sponsorship, etc. and how those (or a lack of them) caused an implementation to succeed or fail. That's all (for the most part) true, but everyone already knows it. But there aren't a lot of people (that I know of, please post links in the comments if I am wrong) out there posting technical discussions about implementations. Google for "ABAP programming blog" (ABAP is the programming language for SAP), and then do the same for Scala and Lisp. I bet there are more people earning their living off of ABAP than Scala and Lisp combined. Probably an order of magnitude more. So why aren't there more people writing about it? Ok, so Scala and Lisp are both interesting languages. So do the same for Ada and COBOL.

Update: December 10, 2007: Feedback on enterprise software

Enterprise software does receive feedback, oftentimes significant amounts, in forms much better thought out than most blogs. The difference is that the feedback is private between the customer and the software provider. Occasionally it is shared among customers through conferences or other types of "user group" meetings.

If the software supplier will listen (including acting), then this can work fairly well for software that is already deployed. The problem is that there is no solid information available to support purchasing decisions. There are no readily available sources of "tips and tricks" or "common pitfalls" for deployment or customization efforts. For example, someone could write:

We've spent the last 3 months trying to make Foomatic run on top of Oracle. All the sales material touts the Oracle support. But then your deployment tanks, the consultants come in, and they ask you why the heck you aren't using SQL Server. The software is developed against SQL Server and then "ported" to Oracle, and the Oracle port never works right.

Fill in your favorite or least favorite infrastructure product there, along with the name of an enterprise application, and there you have some useful information. The sales material lies. Once you deviate from their default stack you are screwed. That's the type of information that would be useful - before buying the software or before trying to run it on your "enterprise standard." Not from an over-priced consultant.


Running Linux apps on a Mac using X

John asks:

Please can you post something about how you set up and use X to run your linux apps from your mac? I've been using X to login and use my linux desktop on my mac laptop for a few things but what I would really love to know how to do is to run individual programs which I cannot get running on my mac or haven't got room for.
Certainly! I'll start from the beginning. The Linux distribution I'm using is Fedora Core 6. The instructions should be similar for any other Unix variant. I used the GUI administration screens to do the configuration (I know - LAME!) so that's what I'm going to use here.

1. Make sure your Linux box has a static IP address.

This step largely depends on your network setup, and technically isn't 100% necessary. In order to access your Linux box, you must either know its IP address or have a domain name for it that will be translated into an IP address. If you are on a corporate network there's a good chance that there's already a DNS entry for your Linux box so you just need to know its name. Here's what I did for my home network.

1.1 Log in to your router and reserve a range of IP addresses so that they will not be assigned out by DHCP.

Please refer to your router's manual for details on how to do this. Depending on your router you may be able to assign a static IP address explicitly to your Linux box based on its MAC address. I could not.

1.2 Configure your Linux box to use a static IP address.

Launch the network config utility. This will require the root password. Edit your ethernet adapter and configure it for a static IP address instead of DHCP. Choose an IP address within the range that you reserved in your router, or one that your network admin has provided.

2. Enable sshd on your Linux box.

You can do this using the "Services" applet or by editing your /etc/rcN.d (where N is the desired runlevel) scripts. Enable the service and start it. Notice that I am at runlevel 3. This is because I'm now running my Linux box in headless mode, so I just want it to boot up to a console login prompt, not a graphical prompt. If your Linux box boots up to a graphical prompt, and you intend to keep it that way, then you want runlevel 5.
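If you'd rather skip the GUI, the same thing can be done on Fedora from a root shell with something along these lines (enable sshd for runlevels 3 and 5, then start it):
  chkconfig --level 35 sshd on
  service sshd start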

3. Setup a DNS entry on your Mac.

As root, edit your /etc/hosts file and add an entry for your Linux box. So open a terminal window and type:
  cd /etc
  emacs hosts
The line for "gondolin" is the entry that I added: Save the file by hitting Control-X Control-S, then exit emacs by hitting Control-X Control-C.

4. On your Mac, launch X and ssh as follows to your Linux box.

  ssh -X gondolin
The "-X" flag tells ssh to automatically setup your DISPLAY environment variable. Once you are logged in, simply type the name of the application that you want to run, followed by an ampersand.

5. Be careful about some applications.

Most applications work fine when started from the command line, but starting the full KDE or GNOME desktop sessions does not work properly for me. I'm pretty sure that's because on my Linux box they are configured to run locally. There are some applications, such as Nautilus and Konqueror, that are set up (at least on my FC6 box) to plug themselves into the desktop manager session. This results in all sorts of weirdness when there is no KDE or GNOME session running. Nautilus tries to start a GNOME session, which brings up an unusable desktop, and Konqueror just acts weird. However, Nautilus works fine if you pass it the "--no-desktop" command-line option. The following opens Nautilus in your home directory.
  nautilus --no-desktop ~/
That's it! Feel free to post any questions.


Thursday, December 06, 2007

Adventures in Widefinding: Performance

I've done some basic performance tests against:

  1. Scala Actor-based parallel Widefinder
  2. Scala serial Widefinder using a BufferedReader
  3. Tim Bray's Ruby Widefinder
The test platform was my 2.2 GHz MacBook with 4GB of RAM using a 6 million line file.  The times were as follows:
Scala Parallel:
  • real 0m14.588s
  • user 0m24.541s
  • sys 0m1.383s
Scala Serial:
  • real 0m20.095s
  • user 0m18.821s
  • sys 0m1.441s
Ruby:
  • real 0m14.301s
  • user 0m12.485s
  • sys 0m1.813s
The good news is that the parallel Scala version is noticeably faster than the serial version.  The bad news is that it is roughly the same speed as the Ruby version, and takes significantly more CPU.  The Scala versions are doing substantially more work, because they have to transcode the 8-bit ASCII contained in the input file into 16-bit Unicode strings.  This requires a full scan of the data.  I believe a fair amount of performance could be gained by combining the transcoding pass over the input with the line-splitting pass.
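To make the transcoding cost concrete, here's roughly what that extra pass looks like using the java.nio.charset classes. This is just an illustration of the step, not the actual widefinder code:

  import java.nio.ByteBuffer
  import java.nio.charset.Charset

  // Every byte of the 8-bit input has to pass through a decoder like this to
  // become 16-bit Java chars before any line splitting or matching can happen.
  val decoder = Charset.forName("US-ASCII").newDecoder()
  val bytes = ByteBuffer.wrap("GET /ongoing/When/200x/2007/12/06/Wide\n".getBytes("US-ASCII"))
  val chars = decoder.decode(bytes)   // a CharBuffer of 16-bit characters
  println(chars.toString)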
For those that are curious, the source code for the parallel widefinder is available here: the parallel IO code and the actual widefinder code.


Wednesday, December 05, 2007

Adventures in Widefinding: Complexity

For the moment I'm going to ignore the complexity of my widefinder "under the hood," and just focus on the complexity of it at the top level.   The following is Tim Bray's goal.  The red asterisk indicates that the loop should be processed in parallel instead of in serial.

counts = {}
counts.default = 0

ARGF.each_line* do |line|
  if line =~ %r{GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) }
    counts[$1] += 1
  end
end

keys_by_count = counts.keys.sort { |a, b| counts[b] <=> counts[a] }
keys_by_count[0 .. 9].each do |key|
  puts "#{counts[key]}: #{key}"
end
So here's the first part of my Scala solution:
   val pat = Pattern.compile("GET /ongoing/When/\\d\\d\\dx/(\\d\\d\\d\\d/\\d\\d/\\d\\d/[^ .]+ )")
   val map = new ConcurrentHashMap[String, AtomicInteger]()
   for (file <- args; line <- LineReader(file)) {
     val mat = pat.matcher(line)
     if (mat.find()) {
       val s = mat.group().substring(23)
       map.putIfAbsent(s, new AtomicInteger(0))
       map.get(s).incrementAndGet()
     }
   }

Notice that this is not a map-reduce based solution. This is because I tried to make it look as similar to the original Ruby as possible. There are some minor differences due to the fact that regular expressions are not special in Scala the way they are in Ruby. The big, and in my opinion most intrusive, difference is that a ConcurrentHashMap has to be used to store the results, AtomicIntegers must be used for the counts, and that "putIfAbsent" call is needed to make sure a counter exists in the map before it is incremented.

So while the concurrency itself is implicit, and the code is certainly succinct, the programmer still needs to be very much aware of the fact that the block inside the for loop will be executed in a concurrent fashion. This is certainly less elegant than a pure functional approach such as map-reduce, where no synchronization would be necessary, but I also think it is less intrusive to a programmer accustomed to imperative programming.
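For comparison, a map-reduce flavored version would have each worker reduce its chunk of lines into its own immutable map and then merge the maps at the end, so no ConcurrentHashMap or AtomicIntegers would be needed. Here's a rough sketch of that idea (illustrative only; it is not the code this post is describing):

  import java.util.regex.Pattern

  // Reduce one chunk of lines to its own immutable count map.
  def countChunk(lines: List[String], pat: Pattern): Map[String, Int] =
    lines.foldLeft(Map.empty[String, Int]) { (counts, line) =>
      val mat = pat.matcher(line)
      if (mat.find()) {
        val key = mat.group().substring(23)
        counts + (key -> (counts.getOrElse(key, 0) + 1))
      } else counts
    }

  // Merge the per-chunk maps once all of the workers are done.
  def merge(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    b.foldLeft(a)((acc, kv) => acc + (kv._1 -> (acc.getOrElse(kv._1, 0) + kv._2)))

The rest of this post sticks with the ConcurrentHashMap version.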

Here's an ugly part. This is largely due to an impedance mismatch between Java collections and Scala collections. It would look a lot better if Scala had an equivalent to ConcurrentHashMap in its library. There are also probably lots of other ways to make it cleaner. I tried several different approaches and always ran into some problem.

    // Copy the (key, count) pairs out of the ConcurrentHashMap into an array
    // of tuples so they can be sorted below.
    val array = new Array[(String, Int)](map.size)
    val iter = map.entrySet.iterator
    var i = 0
    while (iter.hasNext) {
      val e = iter.next
      array(i) = (e.getKey, e.getValue.intValue)
      i += 1
    }
And here's the final part, once the ugliness is done:
    quickSort(array)((e) => OrderedTuple(e))
    val limit = if (array.length < 10) array.length - 1 else 9
    for(i <- 0 to limit) {
      print(array(i)._2)
      print(": ")
      println(array(i)._1)
    }

Notice having less than ten results is cleanly handled, unlike in TB's code (unless Ruby magically ignores invalid indices).

Overall I would say I came pretty close to TB's goal without requiring a fundamental change in programming methodology. I didn't alleviate the need for the programmer to be aware of concurrency, but I don't think that's possible. Sure, people can learn to program in ways that are fundamentally parallelizable, and that probably is what universities should start teaching, but that's going to require a major shift on the part of millions of programmers, or for several generations to retire and be replaced by ones trained in functional programming. Lots of cores are here NOW, so it's not practical to wait until skills catch up. We need to work with what we have.


Monday, December 03, 2007

Switch

After about two years of running Linux-only (Fedora, to be exact) at home I've switched to a Mac.  I wanted a laptop, needed a new computer desperately (the old one is almost 6 years old), didn't want Windows, and was tired of messing with Linux configuration stuff.

This weekend I bought a Macbook.  So far I love it.  I was hesitant about the keyboard, which is odd, but I actually like it.  I was nervous about using a touchpad instead of a pointer stick, but somehow the touchpad on the Macbook works substantially better than the one on the HP I use at work.  I was worried about quality problems...we'll see about that one.  I just read about some Macbooks and Macbook Pros having defective hard drives, but the model number on mine is different.
So what have I done so far?
  1. Booted the first time.  Wireless worked flawlessly on the first try.  I've spent way too much time making Windows PCs (my wife's, my work laptop, visitors' machines) work with my wireless network.  One of my friends tried to make his Linux laptop work with it and after many hours never succeeded.
  2. Installed Eclipse and the Scala plugin, played around with that.
  3. Installed AquaMacs - IMHO much better than Emacs on Linux.
  4. Surfed the net, poked around, reconfigured some stuff to my liking, etc.
  5. Setup NFS mounts to my old Linux box. This was amazingly painless on both sides.  The only hiccup was that I had to change my UID on my Mac from 501 to 500, which created some oddities before I logged out and back in, but nothing horrible.  The whole process probably took about 15-30 minutes, compared to the hours I spent making Samba work properly with my wife's Windows XP Thinkpad.
  6. Setup sshd on my Linux box and then ran XWindows applications on my Linux box from my Mac.  This whole process took about 10 minutes, and the XWindows apps actually looked good and were very responsive.  Possibly more responsive than running them locally.  Note that this and the above are a tribute to both Fedora and MacOS X.
  7. Upgraded my RAM from the stock 1GB to 4GB.
  8. Ran my Scala widefinder from Eclipse.  It spikes both CPUs once everything is cached.  Puts both at about half if it has to read from disk.
  9. Wrote this blog.
So all that is very positive.  And the UI is so beautiful.  The fonts all look just right...although if I didn't have good eyes I think a lot of them might be too small.
Ok, so what's wrong:
  1. I haven't figured out the keyboard shortcut to switch between windows of an open application.  Command tab switches between applications.
  2. No page down, page up, home, end, etc keys on the laptop keyboard.
  3. The caps lock key is WAY too big.  Key sizes should be proportionate to usefulness/frequency of use.  Spacebar is huge.  Enter and delete are big.  Caps lock being the size of enter makes no sense.
  4. I had to install ~200MB worth of updates fresh out of the box...over half of that for Leopard.  Why can't a new computer come all up-to-date?  Seems like that "genius" who sold it to me could have given me one that was up-to-date.
  5. After the reboot for the Leopard updates, everything froze.  Just a blue-grey screen with no cursor.  That's a bad first impression.  But I turned it off and back on again and everything was fine.
  6. I had to install codecs for various video formats.  That's mildly annoying.  Why can't they be preinstalled?  It's not like I had to pay to download and install them.  But what's more annoying is that the first link presented to me for XVid didn't seem to have any appropriate install files for Mac.  I used DivX instead, which worked.
So overall I am very impressed.  I think I'm going to keep my old Linux box around as a server and to run Linux-only stuff over X.  If anything goes wrong I'll be sure to complain about it.


Sunday, November 25, 2007

Adventures in Widefinding

Over the past few weeks in my copious free time I've been trying to come up with a fast Widefinder using Scala Actors where the parallelization magic is hidden behind what looks like a nice friendly API. Based on timings on my oh-so-fast Athlon XP 1900+ I seem to have failed on the first count (Ruby is faster), and the second part requires some awareness of parallel programming. But I'm hopeful that when Tim runs it on the T5120 it will be competitive with the other solutions, although I doubt that it will be the fastest.

For those of you who don't want to listen to me ramble, here's a jar containing all of the class files needed to run my solution along with the Scala source code. It's so big because I put the Scala libraries in there for Tim so he wouldn't have to worry about CLASSPATH errors or installing Scala.

Ok, so why isn't it the fastest? What's the point? Well, first let me define what I think a good widefinder solution should demonstrate.

  1. The "application code" should be similar in complexity and readability to Tim Bray's example in Ruby. Anything else should be hidden behind a general purpose API, preferably an out-of-the-box one.
  2. It should be environmentally friendly. This means that it shouldn't hog all the available RAM and make other processes start paging. It should also be well behaved on a machine under load. This is not relative to input file size. It should ALWAYS be environmentally friendly.
  3. It should maintain the abstractions to which programmers are accustomed. For example, it shouldn't bypass regexes or use non-standard string types. Also, if a language normally deals well with unicode and the potential for multiple encodings, it should deal well with unicode and multiple encodings as well.
  4. The person running it shouldn't have to tell it about details like how many worker processes to spawn. It should figure it out based on its environment. Ideally the programmer shouldn't have to worry about it either.
  5. It should be sufficiently performant on a single processor box.
  6. It should run faster as more processors are made available.
I think my solution is reasonable according to all of these measures. Hopefully we'll see about execution on lots of cores shortly.

#1 is a little tricky, because using a concurrent map was tricky and there is a definite impedance mismatch between Java collections and idiomatic Scala. Depending on your perspective a map-reduce based solution would have been more elegant, but unless map-reduce becomes required learning for all professional programmers, I'm not so sure.

#2 is one area where, in my opinion, many of the current leading solutions fall down. A favorite way of achieving easy parallel IO (well, IO on most disks can't really be done in parallel) is to spawn a bunch of processes (or threads) and have them each mmap a large chunk of the input file and process it. The issue with this is that, as Fredrik Lundh noticed (BTW - he has an excellent discussion - go read it), this can cause serious paging on some operating systems. Other solutions allocate buffers measured in megabytes for each worker thread/process. I personally don't see this as very environmentally friendly when only kilobytes are really needed. My solution reads the file sequentially using one set of buffers per worker (about 128k total per worker). Each worker reads in 32k and then passes the FileChannel to the next worker.

#3 is the real kicker, and what kicked down the performance of Scala (and, I believe, any other JVM based solution). Java strings are unicode and based on 16 bit characters. The input file is ASCII and based on 8 bit characters. Consequently, a String-oriented solution on the JVM has to transcode the characters. Furthermore, it should be able to transcode the characters from any common file format, not just ASCII or UTF8. This turns out to take a lot of cycles, around 10% depending on the charset and the cost of some other functions. The good news is that each buffer can be transcoded in parallel.

#4 is more about maintaining abstractions. My solution does it automatically, the other solutions don't seem to...although adding it probably wouldn't be a big deal.

#5 is probably true of most of the solutions...or true if you substitute dual-core for uniprocessor, because that's what most of the people used to develop their solutions. In the case of my solution I think there is about a 30% performance penalty on a single processor box (that and transcoding are enough to make it slower than Ruby).

#6 is interesting because this is an I/O bound task. Assuming the data is actually being read from disk (I believe it is cached in all or most of his trials), then this task should only be able to keep a small handful of processors busy. But that's not to say it isn't important. The reason it's important is because from the normal programmer's point of view, file I/O isn't just shuttling information from the disk to a buffer, it's everything that happens between him making the request and a string representing a line being returned. There's a lot of work that goes into making that string. Here's a list (more or less) for Java and Scala, with a rough sketch in code right after it:
  1. Read the data into a direct ByteBuffer.
  2. Copy the data into a heap ByteBuffer**
  3. Transcode the ByteBuffer into a CharBuffer
  4. Find the characters that represent a line
  5. Make a String from those characters
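In code, those steps look roughly like this (the file name and buffer size are made up, and the partial line at the end of the chunk is ignored; this is an illustration of the steps, not the widefinder internals):

  import java.io.FileInputStream
  import java.nio.ByteBuffer
  import java.nio.charset.Charset

  val channel = new FileInputStream("access.log").getChannel   // hypothetical input file
  val direct = ByteBuffer.allocateDirect(32 * 1024)
  channel.read(direct)                                         // 1. read into a direct ByteBuffer
  direct.flip()
  val heap = ByteBuffer.allocate(direct.remaining)             // 2. copy into a heap ByteBuffer
  heap.put(direct)
  heap.flip()
  val chars = Charset.forName("US-ASCII").decode(heap)         // 3. transcode into a CharBuffer
  val lines = chars.toString.split("\n")                       // 4. find the characters that make a line
  println(lines(0))                                            // 5. each element is now a String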
Your operating system probably uses something called readahead to asynchronously fetch the next N bytes of a file before you actually request it, on the assumption that you will. The Java or Scala library could asynchronously fill a fixed-sized queue of lines, assuming concurrency is cheap enough.

I learned a lot in this exercise, but my two big takeaways are:
  1. Abstraction has a real cost in terms of CPU cycles, and existing abstractions in mainstream languages (ok, Scala isn't mainstream yet, but Java is) are not optimized for multi-core environments. This will need to change.
  2. Abstraction has a big cost in terms of programmer cycles if the programmer wants something different. Without the source available from OpenJDK I would have been lost when I was trying to figure out why approaches that seem like they require less CPU work actually require more. The Client and Server VMs make a huge difference in performance that varies widely depending on file size and program structure. With the libraries, JVM, and OS all trying different things to make your code go faster it is extremely difficult to figure out exactly what is happening and why.
**I looked at the OpenJDK source, and if you use NIO to read into a non-direct buffer, what actually happens is that the Java library reads into an available direct buffer and then copies it into your non-direct buffer. This is essentially what happens with a regular FileInputStream as well. At first I thought I could speed things up by skipping the copy from the direct buffer to the non-direct buffer, but it doesn't work out that way. You see, decoders are optimized to use the array backing the buffer if it is available, because direct array accesses are much faster than function calls. Array copies are done using low-level native code. Consequently, copying the direct buffer and then transcoding the in-heap copy is much faster than directly transcoding the direct buffer.


Wednesday, October 10, 2007

Why test parallelism on a simple function?

On my last blog post, anonymous asked:

would a more expensive line-match-function make it more obvious if you are working in parallel?
I would say that one should be able to demonstrate that transparently supporting the potential for parallelism should be nearly free. If you can use a parallel algorithm to solve a problem that doesn't benefit much from parallelism with roughly the same or better performance characteristics as the serial code, then it should be a lot better when you actually give it a more complex problem. Basically, parallelism should be free.

A lot of people have commented on Tim Bray's blogs that his test is unsuited for demonstrating the benefits of parallelism because it is IO bound. Tim claims this isn't true, and I suspect there's some truth to that if you have really optimized IO, but I do think the benefits of parallelization for his problem are very limited. That being said, one thing that he has clearly demonstrated is that parallelism isn't free. His "obvious" newbie solution in Erlang performed horribly and was considerably longer than the Ruby solution. Others have greatly improved the performance with extremely long, complicated chunks of code, but have yet to match Ruby. I find that really sad.

So I would like to prove that parallelism can be almost free, meaning:
  1. Leveraging it does not impose a significant additional cognitive load on the programmer.
  2. Problems that are not effectively parallelizable should execute "about as fast" when parallel functionality is used as with serial code.
From an interface perspective I think I have it with the monadic interface to the file. I just need to work out some bugs or change the interface to make them go away. I'll write more on this when I've worked out some of the wrinkles. So that leaves the performance problem.

One of the big challenges with parallelization is that spawning new threads or processes is very expensive, and synchronization is somewhat expensive, so it's very easy for the cost of parallelization to overwhelm the cost of the actual solution. The most straightforward way to address this problem is to not parallelize when the function is not complex enough or the input data set isn't large enough to justify it, but that is back to imposing a cognitive load on the programmer because he has to figure that out. Either that or always "start serial" and use runtime profiling tricks to detect if the problem is worth parallelizing, which sounds expensive but probably has merit.

Another challenge is knowing how to divide up the problem to avoid excessive synchronization and/or messaging. When processing a file line-by-line, one could send each line out to be processed independently, but that requires a lot of messaging and synchronization if you don't have lock-free messaging. So really you want to break the problem into properly sized chunks and send each chunk as a message rather than simply use the most natural division. Figuring out how big a chunk should be (or how many chunks you should have) is a challenge because it is problem and runtime dependent. Again, this creates the potential to burden the programmer, use complex and potentially expensive runtime profiling, or somehow come up with a magic cheap heuristic.

So you can either solve the problems above, or you can have sufficiently cheap parallelism that you don't need good solutions. Right now I'm going after the sufficiently cheap approach. What I have so far is a mapreduce-style function using Scala Actors that breaks a file into chunks of lines and sends them off to be processed by Actors. I plan on adding a parallel foreach function that could be used for problems like Widefinder using a parallel hash map. Performance wise it's looking promising. Here are some numbers (using my 5+ year old machine):
Serial:
  • Count: 185300 Serial: 11592
  • real 0m12.107s
  • user 0m11.254s
  • sys 0m0.784s
Parallel:
  • Count: 185300 Serial: 11722
  • real 0m12.225s
  • user 0m11.441s
  • sys 0m0.723s
As you can see, the parallel code is slightly slower than the serial code. Across runs their times actually overlap a bit, but the serial version generally tends to be a tad faster. One thing I've noticed is that the deltas between the serial and parallel implementations don't really grow - and to some extent shrink - with increasing input sizes. I believe this is because there is a fixed penalty for setting up the thread pool for the actors. This only has to be done once per process invocation, and appears to cost about 200ms on my machine.

In other words, parallelization for file processing can be almost free. I actually think it could be better-than-free, even on a single processor box, if IO was more efficient. My current solution is using a BufferedReader to read in the file one line at a time. This means the IO is probably being done in a less-than-optimal way, and that a lot of work is being done in serial for each line (converting from 8-bit ASCII to 16-bit Unicode strings, splitting the input into lines). I'd like to use nio to read the file in a block at a time, and then let all this work be done in separate threads. I think then there would be a performance increase because one thread would be doing nothing but reading in buffers as fast as the OS and JVM can provide them, and others would be doing all the computation while the IO thread is blocking. But before that I'm going to get the interface cleaned up and solve the memory problem on large files.
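To make the chunking idea concrete, here is a stripped-down sketch of the structure using java.util.concurrent directly instead of Scala Actors. It is illustrative only - it is not the code that produced the numbers above:

  import java.util.concurrent.{Callable, Executors, Future}
  import java.util.regex.Pattern
  import scala.collection.mutable.ListBuffer

  // Group the lines into fixed-size chunks, count the matching lines of each
  // chunk on a pool thread, then sum the per-chunk counts.
  def parallelCount(lines: Iterator[String], pat: Pattern, chunkSize: Int): Int = {
    val pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors())
    val futures = new ListBuffer[Future[Int]]
    while (lines.hasNext) {
      val chunk = new ListBuffer[String]
      while (lines.hasNext && chunk.length < chunkSize) chunk += lines.next()
      val work = chunk.toList
      futures += pool.submit(new Callable[Int] {
        def call(): Int = work.filter(l => pat.matcher(l).find).length
      })
    }
    val total = futures.foldLeft(0)((sum, f) => sum + f.get)
    pool.shutdown()
    total
  }

The chunk size and the pool size are exactly the knobs discussed above; here they are just passed in or taken from the number of available processors.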


Sunday, October 07, 2007

Dangerous Monads?

I've been trying to write a parallelized version of Tim Bray's Widefinder problem in Scala where the parallelization is "hidden" in some library-like code, so that the programmer can write something that looks as clean...nay...cleaner than Tim's Ruby code using Scala. I'm also trying to figure out what the heck a monad is, and thinking about how I'll have to use ugly, imperative Java IO classes, so I decided to write some classes to make a file look like a list of strings, with each element representing one line.

One critical aspect of this list is that it is lazily generated, so you don't have to wait for the entire file to be loaded to start working with it. Another critical trait is that there are no "back references" from subsequent lines, so that if you no longer hold a reference to previous nodes in the list, those nodes can be garbage collected. The interface supports all the required methods for "for" comprehensions - map, foreach, filter, and flatMap. I think it qualifies as a monad, or at least is something close. I ended up with something where usage looked kind of like this:

def main(args : Array[String]) : Unit = {
  val pat = Pattern.compile("GET /ongoing/When/")
  // cnt is a counter var declared elsewhere
  for (file <- args; line <- Line.open(file) if pat.matcher(line).find) cnt = cnt + 1
}
Note that I'm just counting lines there, not performing the full Widefinder functionality. But still, that's pretty concise and easy on the eyes. The "for" comprehension is translated into calls to foreach. It works great for a lot of files, but it blows up with an OutOfMemoryError on large files. So why is that? I'll give you a hint: fully expanded, it would look kind of like this:
args.foreach((file) => Line.open(file).filter((line) => pat.matcher(line).find).foreach((line) => cnt = cnt + 1))
Can you see the problem? No? Try this:
args.foreach((file) => {
  val firstLine = Line.open(file)
  firstLine.filter((line) => pat.matcher(line).find).foreach((line) => cnt = cnt + 1)
})
Can you see it now? The problem is that there's a reference to the first line hanging around. Even if you don't declare the variable it's still there, lurking as the implicit "this" parameter for filter. That reference makes it so the first line is not collectable, and as a result none of the lines are collectable, because they are all reachable from the first line. So the whole file is loaded into memory, resulting in an OutOfMemoryError. That seems pretty dangerous to me.

So how can we solve this problem? Well, we have to make it so that references to lines disappear as we get done with them. There are a couple of ways to do it. The obvious imperative way is to use a var and a while loop, but then you might as well use BufferedReader directly or use an iterator. The functional way is to use tail recursion, so you write:
def scanLines(line: Line, pat: Pattern, cnt: Int): Int = {
  if (line == EndOfFile) cnt
  else {
    val v = if (pat.matcher(line.value).find) 1 else 0
    scanLines(line.next, pat, cnt + v)
  }
}
...and call it for each file like this:
for(file <- args) cnt = cnt + scanLines(Line.open(file), pat, 0)   
Notice that there is no val declaration holding the first line. This is critical, because if there is, it will run out of memory. So what do we do about it? Well, it would be easy enough to refactor methods like foreach that trigger immediate file traversal out of the normal interface and into a tail-recursive method on the companion object. Unfortunately that would break usage in for comprehensions, be inconsistent with other collection-like objects, and in general feel like poor OO. Another way to fix it would be to fix Scala so it supported full tail call optimization. Of course, that would also require adding full tail call support to the JVM. That way the unneeded "this" reference could silently disappear from the stack. This would also allow many methods to be expressed in a much cleaner way. For example:
  final def foreach(f: (String) => Unit): Unit = {
    def fe(line: Line): Unit = {
      if (line != EndOfFile) {
        f(line.value)
        fe(line.next)
      }
    }
    fe(this)
  }
Could be simplified down to:
  final def foreach(f: (String) => Unit): Unit = {
    f(value)
    next.foreach(f)
  }
For those of you who want to take a look at the code I have so far, here it is as an Eclipse project. I've tried to comment it a bit, but it's still a work-in-progress. There is a working (at least I think it works) mapreduce function hiding in there that allows lines to be processed in parallel using actors. Unfortunately it is slower (but not substantially slower) than just processing the lines serially. But then I'm running it on an old uniprocessor PC, so maybe with more cores it would do better. If I get some free time at work I'll try it out on a multicore machine and see what happens, but I suspect that unless (or until) I hack together something that uses nio in an optimal way the task will remain IO bound...and even then it may remain IO bound.


Monday, September 17, 2007

The Tar Pit

The editor of TSS has decided to run a series discussing The Mythical Man Month, by Frederick Brooks. Hopefully it will produce some good discussion. There are a lot of Agile advocates that hang out on TSS who could stand to (re)learn some of the lessons of software development. I make a point of rereading it every few years lest I forget the lessons learned by computing's pioneers. The first chapter - The Tar Pit - contains one of my favorite concepts, as illustrated by the graphic below (slightly changed from the original):

Programs

Let me explain. A program is what we all have developed. It's a simple piece of software that is useful to the programmer and/or to some set of users who are directly involved in defining its requirements. Most bespoke departmental applications and a substantial portion of enterprise applications fall into this category. They are sufficiently tested and documented to be useful within their originating context, but once that context is left their usefulness breaks down quickly. In addition, they are not solidly designed to be extensible, and certainly not to be used as a component in a larger system. Obviously this is a range, and I've really described a fairly well developed program - one almost bordering on a programming product. That script you wrote yesterday to scan the system log for interesting events, which has to be run from your home directory using your user account in order to work, is also just a program.

Programming Products

Moving up the graph, we hit the programming product. In theory, all commercial applications and mature bespoke applications are programming products. In practice this isn't really the case - but we'll pretend, because they are supposed to be and I increased the standard over what Brooks originally described. The big challenge with programming products is that, according to Brooks, they cost three times as much to develop as simple, fully-debugged programs, yet they contain the same amount of functionality. This is why it's so hard to get sufficient budget and schedule to do a project right. The difference between a solid piece of software and something just cobbled together is very subtle (you can't tell in a demo), yet the cost difference is quite astounding. Consequently, I think most commercial applications are released well before they hit this stage, and bespoke ones require years to mature or a highly disciplined development process to reach this point.

Programming Systems

Programming systems are programs intended to be reused as parts of larger systems. In modern terms, they are libraries, frameworks, middleware, and other such components that are all the rage in software development. Like programming products, programming systems are thoroughly tested, documented, and most importantly are useful outside of the context in which they were created. And, like programming products, according to Brooks they take three times as long to develop as a regular program. Developing programming systems for bespoke applications or niche use can be a tar pit all its own. For one, many programmers like building libraries and frameworks. The problems are more technically challenging, and there is no strange-minded user to consider. The programmer and his colleagues are the users. Programming systems are relatively common in groups that execute a lot of similar projects and/or that contain programmers who really want to build components.
Programming System Products

Amazingly, programming system products are relatively common - even if there really aren't that many of them. As you've probably guessed, a programming system product has all the traits of both a programming product and a programming system. It is useful to a wide range of users and can be effectively extended and/or embedded for the creation of larger systems. It has complete documentation and is extensively tested. Where are these wondrous things? Well, you are using one right now (unless you printed this). Your operating system is one. It both provides useful user-level functions and a huge amount of infrastructure for creating other programs. MS Office is one as well, because it has a pretty extensive API. Most commercially developed enterprise systems should be programming system products, because:

  1. They provide off-the-shelf functionality for regular users
  2. Customers always customize them
  3. They often must be integrated with other products
  4. Simply integrating their own components would go better with a programming system
The problem is that they are not, because of:

The Tar Pit

Brooks didn't explicitly write this definition of The Tar Pit, but I think he would agree. Put yourself in the position of a development manager at a startup or in a larger company about to launch a new product. On one hand, you want to make the product as good as possible. You know that what you develop today will serve as the base for the company/product line for years to come. It needs to be useful. It needs to be extendable. It needs to be thoroughly tested and documented... It needs to be cheap and delivered yesterday. The differences between a programming system product and a simple programming product are far more subtle than the differences between a program and a programming product. But the programming system product costs a full NINE TIMES as much to develop as the program with essentially the same "outward functionality" - at least if you are a sales guy or a potential customer sitting in a demo.

I think this is the struggle of all engineering teams. If the product is long lived, doing it right will pay major dividends down the line. But it can't be long lived if it is never released. It stands a worse chance if it comes out after the market is flooded with similar products (actually, that's debatable...). The ultimate result is a mish-mash of tightly coupled components that, as individuals, fall into one of the lesser categories but as a whole fall down. There is a documented API, but the documentation isn't really that good and the application code bypasses it all the time. The user documentation is out-of-date. Oh, and the application isn't really that general - hence why all the major customers need the API so they can actually make it work.

Escaping the Tar Pit

Ok, so if you develop a large system you are destined to fall into the tar pit, because cheap-and-now (well, over budget and past schedule) will override right-and-the-next-decade. You need a programming system product, but budget and schedule will never support much more than a program. So how do you escape it?

Accept It

Products that give the outward impression of being far more than they are often sorely lacking in conceptual integrity. If you are building an application - build it right for the users. Remember you can build it three times for the cost of building a programming system product. Maybe by the third time there will be budget and schedule for it. Or maybe, just maybe, you can evolve your current system. But pretending will just make a mess while wasting significant amounts of time and money.

Partition It

Some pieces of your system are probably more important than others. There are places where requirements will be volatile or highly diverse among customers - those are the places where you need a truly extensible system. You should also be able to reuse strong abstractions that run through your system. The code associated with those abstractions should be top-notch and well documented. Other pieces just need to be great for the users, while a few that remain just need to be convenient for you or administrators to extend.

Open Source It

Find yourself developing yet another web framework because what exists just isn't right? Open source it. This isn't really my idea. It's what David Pollak of CircleShare is doing with the lift web framework for Scala (actually, I'm guessing at David's reasoning, I could be wrong). The infrastructure for your application is essential, but it isn't your application. It is what Jeff Bezos refers to as muck.
You have to get it right, but it's distracting you from your core mission. So why not recruit others with similar needs to help you for free? That way you don't have to completely give up control, but you also don't have to do it alone. Theoretically the same could be done for applications. Many large customers of software companies have significant software development expertise - sometimes more than the software companies. I think it would be entirely feasible for a consortium of such companies to develop applications that would better serve them than commercial alternatives. But I have yet to convince anyone of that...


Tuesday, September 04, 2007

Is implementation important to a startup?

I'm not entirely qualified to answer this question, but here goes. Taken from the Scala mailing list:

Amir Michail:

On 9/2/07, Erik Engbrecht wrote:
> All-in-all I would say Scala would be perfect for a small team of extraordinarily smart developers, but I wouldn't turn your average Java or C# programmer pulled off the street loose on it. Not because they couldn't write working Scala code, they could, it would just be a more concise Java/C#. But I wouldn't start a company based on programmers grabbed off the street (or leased across the ocean), either.

I actually don't think implementation is much of an issue for internet startups, particularly for Facebook apps. This is off topic, but I would be interested in knowing what you think of it: http://weblog.fortnow.com/2007/08/impact-of-facebook-platform-on-cs.html http://weblog.fortnow.com/2006/07/science-and-art-of-computation.html
So here's what I wrote (off-list) in response:
Amir,

I'm not really qualified to comment on the skills required to write Facebook apps, because not only have I never written one, I've never even accessed Facebook. However, I think one of the most effective ways of achieving long-term success for a business is to have a product that is in demand (obviously) and that others find very difficult to replicate - or at least replicate to the point where people have difficulty deciding whether to use the new product or the old. The best way I can think of to do that is for the product to have an inherent internal complexity - meaning it requires either hard-core CS skills, substantial domain knowledge, or both to grok it.

For example, Google is really an AI company. They don't actively publicize it (nor do they hide it), and I bet the majority of their employees aren't working on anything AI related, but ultimately that's what the company is - and that's why they'll stay ahead. Replicating the core of their business is simply too hard. It certainly could be done, but you'd need to get a sufficiently large group of self-motivated geniuses together who would both cooperate and have enough entrepreneurial spirit to not be lured away by Google.

I'm not saying you can't build a successful startup simply by creating something cool and getting a lot of people to use it. You certainly can. In fact, I would say the vast majority of successful startups were built that way. But I also think the vast majority of startups don't have long-lasting products. They have products that fill a (possibly transient) need for a larger player and get acquired. They spend hundreds of thousands to maybe a few million dollars building something that would take a larger player >$10x plus months to replicate, not because the startup is better at engineering or product design - it may or may not be - but because the startup is 1 of 100 or even 1000, so the larger player would have to fund >10 initiatives and somehow pick the best result in order to replicate what the startup did, and it can't do that because it can't release >10 similar products all at the same time without looking stupid to investors and confusing its customers.

From what I've read about Facebook apps they are relatively easy to build. If you build a business around one, then essentially you are either trying to be that 1 in >100 that people latch onto long enough to be acquired by a larger player, or you are going to launch yourself on a never-ending cycle of re-inventing your application as people duplicate it and trends change. Of course, that's not to say you can't have something really sophisticated at the core of your Facebook app. I'm sure you can. But if you do, implementation matters a whole lot. Implementation tends to lag theory by years if not decades. Your ability to implement is your competitive discriminator.

As for the blog articles...well...I think CS should remain more pure. CS is not programming, but programming is an essential skill for computer scientists (as I actually believe it is for most scientists, but in CS it is a much stronger need). So I think adapting CS to "Web 2.0" is a rather foolhardy exercise. By the time programs adapt we'll be on Web 4.0, so the programs will be behind the times and will have lost the theoretical goodness of CS. That's not to say those programs are bad ideas, I just don't think they should be labeled CS. It dilutes the brand, so to speak.

I would say CS programs should offer far more electives that are cross-listed with other disciplines, such as business, economics, or the sciences. But these can't replace fundamental courses or even upper level technical electives, so they would need to be framed so as to replace what my alma mater called "humanities requirements."

-Erik
That's just my opinion, and to tell the truth my opinion has changed over the years. But this is closely related to the ongoing debate about whether a CS background is or is not important for being a software developer. I actually think it is, but I also think a good CS program is overkill (in the CS theory and programming sense) for what most software developers do and grossly inadequate from a communications/requirements analysis sense.


Tuesday, August 28, 2007

Scala Code for GPS

The link to the source code in the previous blog didn't work. I put it in the wrong directory. It's fixed now. You can get the code here: scala-gps.zip


Tuesday, August 21, 2007

Scala vs Lisp for GPS: Part I: Expressivity

Introduction

Before I found Scala I was looking for a language with the following characteristics:

  1. Expressivity on par with Python or Lisp
  2. Performance on par with Java
  3. Has a REPL or at least an interpreter
  4. Compiled
  5. Object-oriented
  6. Large number of modern libraries and frameworks
  7. Cross platform, including thick-client GUIs
  8. Open Source
Now I've found Scala and I think I have my language, but unfortunately it's hard to tell much about the expressivity without doing a real project, and doing a real project involves a measure of risk. Instead, I decided to port some Lisp programs from Norvig's Paradigms of Artificial Intelligence Programming to Scala and compare them to the original Lisp programs. As a second step, I'll "Scala-ize" each program to be more idiomatic Scala (mostly increasing object orientation) to see the result. Finally, I'll attempt to do some sort of performance comparison.

Background: What is GPS?
GPS stands for "General Problem Solver." The algorithm here comes from Allen Newell and Herbert Simon way back in 1957. The original attempt was to create a computer program that could solve any problem. It fell quite short of its goal, but the algorithm is still interesting, and its expression in Lisp or Scala is quite concise. Those of you familiar with planning problems will recognize the basic structure and the sample problems. For more information, see Chapter 4 of PAIP.

The Algorithm
A problem is described using a set of initial conditions, a set of operators that can change those conditions, and a set of goal conditions. Each operator has a name, a list of preconditions (conditions that must be true for it to be invoked), an add list (conditions it will add to the state when invoked), and a delete list (conditions it will remove from the state when invoked). Note that these conditions are simple; they are not based on propositional or predicate logic. GPS uses a technique called means-ends analysis. A brief explanation is contained here, about half way down the page. In a nutshell, the algorithm tries to achieve each goal one at a time, until it fails to achieve a goal or it has achieved all the goals. The preconditions for each action are achieved by recursively invoking the algorithm on them. For a detailed description of the implementation of the algorithm, see PAIP or look at the comments in my Scala code (a minimal sketch of the core structures appears after the list of differences below).

Comparison of Implementations
There are some substantial differences between the Lisp implementation and the Scala implementation. They are:
  1. The Scala implementation is more explicitly structured. Some of the structure, such as the static typing, is forced by Scala. Others, such as using classes instead of conventions on list contents to distinguish between normal conditions and "executing conditions," are not forced by Scala but are simply easier and more natural.
  2. The resulting plan is a list of operators rather than a list of lists obeying a convention. This is straightforward because "executing conditions" are a special class of condition and are explicitly tied back to the operators they represent.
  3. The core algorithm in Scala is completely and explicitly devoid of mutable state. The Lisp solution makes only one small use of mutation in achieve-all, and that occurrence could easily be removed in the same way as it was from Scala. However, cons cells, and the lists they form, are mutable, and therefore it is much more difficult to prove that the algorithm uses no mutable state.
  4. The Scala solution required fewer utility functions. This is due to both the richer standard library and case classes.
  5. The problem-definition DSL required significantly more code in Scala, especially in order to make it interpreter-friendly.
  6. The syntax of the problem-definition DSL in Scala is statically checked. For example, you cannot use preconditions in the place of the add list. Achieving this took a fair amount of work, and I almost just went with a looser version that was equally easy to type and read but less validated.
  7. In my subjective opinion, the Scala version is much easier to read. I find the fact that Lisp control structures like "if" implicitly return nil when no "else" is provided quite confusing.
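To make differences #1 and #2 concrete, here is a minimal, hypothetical sketch of how the operator representation and the core of means-ends analysis might look in Scala. The names (Condition, Op, Gps.achieveAll) are mine, not necessarily those in scala-gps.zip, and the sketch omits plan recording and protection against goal loops:

case class Condition(name: String)

case class Op(name: String,
              preconds: Set[Condition],
              addList: Set[Condition],
              delList: Set[Condition])

object Gps {
  // Achieve each goal in turn, threading the immutable state through.
  def achieveAll(state: Set[Condition], goals: List[Condition],
                 ops: List[Op]): Option[Set[Condition]] =
    goals.foldLeft(Option(state)) { (acc, goal) =>
      acc.flatMap(s => achieve(s, goal, ops))
    }

  // A goal is achieved if it already holds, or if some operator whose add list
  // contains it can be applied after recursively achieving its preconditions.
  def achieve(state: Set[Condition], goal: Condition,
              ops: List[Op]): Option[Set[Condition]] =
    if (state.contains(goal)) Some(state)
    else {
      val candidates = ops.filter(_.addList.contains(goal))
      candidates.foldLeft(Option.empty[Set[Condition]]) { (found, op) =>
        found.orElse {
          achieveAll(state, op.preconds.toList, ops).map { s =>
            s -- op.delList ++ op.addList
          }
        }
      }
    }
}

// Abbreviated version of PAIP's "drive son to school" example:
val ops = List(
  Op("drive-son-to-school",
     Set(Condition("son-at-home"), Condition("car-works")),
     Set(Condition("son-at-school")),
     Set(Condition("son-at-home"))))
Gps.achieveAll(Set(Condition("son-at-home"), Condition("car-works")),
               List(Condition("son-at-school")), ops)
// => Some(Set(Condition(car-works), Condition(son-at-school)))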
General observations:
  1. At least for this problem (which didn't use macros), Lisp code could be almost directly converted to Scala.
  2. Scala implicits make an effective substitute for special variables (i.e. defparameter) in Lisp.
  3. Overall LOC count is too close to matter.
  4. Lisp development is far more interactive than Scala development.
  5. The Scala compiler catches a lot of "stupid" mistakes that the Lisp compiler does not (but the Lisp compiler catches some, much more than Python, Ruby, et al.), reducing the number of edit/run cycles by a fair margin.
  6. Case classes are really useful. They allow you to add structure to your program without going to all the work of implementing the support methods, such as equals, hashCode, toString, and unapply for pattern matching (a tiny illustration follows this list). Lisp is worse than Java et al. because it has four different equality functions (eq, eql, equal, equalp), all with subtle (and not so subtle) differences. In Scala, if you are wise and don't have to deal with Java libraries (hah!), you pretend eq (reference equality) doesn't exist (indeed, pretending references don't exist, and null doesn't exist, seems like a good idea to me). In fact, I find case classes so useful, I think I'm going to write a Python metaclass so I can use them in Python, too.
  7. I retract previous negative statements about pattern matching on lists. It's useful. I get it now. But I still wouldn't convert a string to a list just for pattern matching.
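To illustrate point 6, here is a tiny standalone example (my own, not taken from the GPS code) of what the compiler generates for a case class:

case class Point(x: Int, y: Int)

object CaseClassDemo extends App {
  val p = Point(1, 2)
  assert(p == Point(1, 2))   // structural equality via the generated equals/hashCode
  println(p)                 // "Point(1,2)" via the generated toString
  p match {                  // the generated unapply enables pattern matching
    case Point(x, _) => println("x = " + x)
  }
}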
Update: finished point 6.

Conclusion
I'm not going to conclude anything yet. So far, Scala seems to be as expressive as Lisp, and its added structure and compiler definitely reduce the number of errors made while coding. However, this comparison was on a relatively simple piece of code, and the Scala implementation was mostly a port. This means that I can't say one way or another whether the added structure of Scala would be a help or a hindrance in the early stages of fleshing out code. Based on my experience with Python, I think without case classes the added structure would definitely be a hindrance. Keeping all the basic functions (comparison, hash, toString) up-to-date on rapidly changing objects is a real pain.


Friday, August 17, 2007

Syntactic Sugar for Composition/Delegation

Composition/Delegation is a common pattern in object oriented software. There are a lot of different definitions floating around for both composition and delegation, but basically it amounts to this, as demonstrated in Java code:

class MyStringList implements List<String> { 
    private List<String> backingList;
    public MyStringList(List<String> backingList) {
        if (backingList == null) throw new NullPointerException();
        this.backingList = backingList;
    }
    public String get(int i) { return backingList.get(i); }
    public boolean isEmpty() { return backingList.isEmpty(); }
    // ... lots of other methods like the ones above
    public boolean add(String s) {
        if (s == null || s.length() == 0)
            throw new IllegalArgumentException();
        return backingList.add(s);
    }
    // and some more slightly modified methods to complete List
}
So MyStringList is composed of backingList, and it delegates most of its methods to it. Furthermore, in this case it is using backingList to implement the List interface. There are other ways to use composition and delegation, both together and separately, but this is the pattern with which I'm concerned. If I had fully written out the code, the first thing you would notice is that there is a painfully large amount of boilerplate code (the List interface requires 25 methods) to accomplish very little. If you know Java or a similar language, you would then probably think that it would be easier to use inheritance, as follows:
class MyStringList extends ArrayList<String> {
    public MyStringList() {}
    public boolean add(String s) {
        if (s == null || s.length() == 0)
            throw new IllegalArgumentException();
        return super.add(s);
    }
    // do something for the other add methods as required...
}
That's a lot less typing. It's also strongly coupled to ArrayList. Depending on what you're trying to do, that may or may not be a problem. For example, let's say you are developing a web application using Hibernate, and you want Hibernate to lazily fetch the contents of the list. The only way would be to extend Hibernate's list instead, thereby breaking any code you already wrote that assumed ArrayList and strongly coupling you to a specific ORM. The other problem is that Java only supports single inheritance, so you can only save typing for one interface. If you have several, you are out of luck. Trust me, these are not good things. So, with Java at least, you are stuck between a rock and a hard place. You either get strong coupling or a lot of boilerplate code to type. If you don't believe me, ask Google. But the problem is very solvable. On one hand there are mixins, but these introduce their own set of restrictions, oddities, and coupling. It would be better to add something simple that preserves the pattern in an easily understandable way while eliminating the boilerplate code. Delegation should be made part of the language. So instead of writing the initial piece of code, you would write:
class MyStringList implements List<String> { 
    private delegate List<String> backingList;
    public MyStringList(List<String> backingList) {
        if (backingList == null) throw new NullPointerException();
        this.backingList = backingList;
    }
    public boolean add(String s) {
        if (s == null || s.length() == 0)
            throw new IllegalArgumentException();
        return backingList.add(s);
    }
    // and some more slightly modified methods to complete List
}
The delegate keyword tells the compiler that any methods from List that are not already implemented should be generated by delegating responsibility to the member backingList. This eliminates the boilerplate code without introducing excess coupling, reducing flexibility, or even being particularly confusing. Furthermore, this scheme could work in many languages. I'm not a compiler hacker, but concepts like this are always better with a demonstration, so I've thrown together some Python that enables this behavior. Code is here. It's far from production-quality, and I'm sure people will find plenty of flaws, but it does demonstrate the concept (along with showing how easily extensible Python is).
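As a rough JVM analogue of that Python demonstration (my own sketch, and a different technique from the compile-time delegate keyword proposed above), a JDK dynamic proxy can generate the forwarding methods at runtime, so only the customized method has to be written by hand. Here it is in Scala; the name ValidatingStringList is hypothetical:

import java.lang.reflect.{InvocationHandler, Method, Proxy}
import java.util.{List => JList}

object ValidatingStringList {
  // Wraps a backing java.util.List[String] so that add(String) rejects null or
  // empty strings, while every other List method is forwarded automatically.
  def apply(backing: JList[String]): JList[String] = {
    val handler = new InvocationHandler {
      override def invoke(proxy: AnyRef, method: Method, args: Array[AnyRef]): AnyRef = {
        if (method.getName == "add" && args != null && args.length == 1) {
          val s = args(0).asInstanceOf[String]
          if (s == null || s.isEmpty) throw new IllegalArgumentException()
        }
        val actualArgs = if (args == null) Array.empty[AnyRef] else args
        method.invoke(backing, actualArgs: _*)
      }
    }
    Proxy.newProxyInstance(getClass.getClassLoader,
                           Array[Class[_]](classOf[JList[_]]),
                           handler).asInstanceOf[JList[String]]
  }
}

// Usage: ValidatingStringList(new java.util.ArrayList[String]()).add("") throws.

The trade-off is that the proxy gives up compile-time checking and adds a little reflection overhead, which is exactly why a language-level delegate keyword would be nicer.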


Tuesday, August 14, 2007

Business Engagement and IT Project Failure

CIO.com recently ran an article by the CIO of GE Fanuc regarding "functional engagement" (i.e. "the business" or, more commonly, "the users") and project failure. Michael Krigsman later posted a more succinct summary on his ZDNet blog. Here's an even shorter version:

  1. Make sure a single business-side person has significant capability, responsibility, and authority for project success.
  2. Don't short-circuit important parts of the project.
  3. Make sure that person understands the "laws of IT" and defends them to his peers, subordinates, and superiors.
#1 is just plain common sense. #3 amounts to establishing a scapegoat for the inevitable failure caused by short-circuiting appropriate systems engineering or architectural activities. Notice that there is neither a definition of "failure" nor a definition of "success." I think it can be inferred that he's defining "failure" as "exceeding project or maintenance budget and/or schedule." Not once does he mention delivering real innovation to the business, or even real value - just avoiding causing massive amounts of pain. Consider the following statement:
Enterprise platforms like Siebel, Oracle and SAP are not intended to be heavily customized. When going from a niche, custom application to Siebel, you need a strong functional leader to push back on every “Yeah, but” statement. They must start at zero and make the business justify every customization. Saying “no” to customization is bitter medicine for your business partners. They will make contorted faces and whine ad nauseum. But it is for their own good.
Let's think for a moment. Your customer (internal or external) wants to spend millions of dollars implementing a CRM (or ERP, PLM, etc.) package. Earlier, it went to the trouble of building one from scratch. That probably means it considers CRM to be an extremely important part of its business, and it expects to derive a competitive advantage from having a shiny new system. The best way to be competitive is to copy what everyone else is doing, right? Also, repeatedly receiving "no" as an answer will really make your customer want to take an active role in the implementation, right? Hmmm...somehow I don't think so. Introducing a strong impedance mismatch between the organization and the software it uses is failure. Standing in the way of innovation is failure. Mr. Durbin wants you to fail.

There is actually a simple, albeit occasionally painful, solution to this problem: don't buy an off-the-shelf application that neither meets the business's requirements nor can be cost-effectively extended to meet them. First, you have to understand your business processes and requirements. I mean really understand them, not just understand the official process documentation that no one follows. All those winces and "yes, buts" are requirements and process details that absolutely must be understood and addressed. Second, you do a trade study. This is a key part of systems engineering. Replacing enterprise systems is expensive, often prohibitively expensive, so do a thorough analysis. Take some of your key users to training sessions and write down every wince and vaguely answered question, because those are all issues that will either stand in the way of delivering value or will require customization. Finally, keep your pen in your pocket. Don't be tempted to write checks, buy licenses, and sign statements of work before you really understand the product. Just ignore those promises of discounts if you get a big order in before the end of the quarter. The next quarter isn't that far away, and the end of the fiscal year may even be approaching. The discounts will reappear then. Instead, make sure the vendor really understands the requirements and business process, and structure any contracts so that they must be met or the vendor will face significant penalties. It's amazing what comes out of the woodwork when you do this. All of a sudden that 3x cost overrun from the original projection is sitting right there in front of your eyes in the form of a quote, all before you've sunk any real money into the project.

The purpose of engaging "the business" in an IT project is not to create a personal shield for all the deficiencies in the system you are deploying. It is to make sure you identify those deficiencies early enough in the project cycle to avoid cost and schedule overruns.


Friday, August 03, 2007

Minimal Software Development Process

It's not uncommon for me to find myself in debates regarding what constitutes a sufficient software development process. On one side are the Agile folks, who argue that code-and-fix, with user representatives determining what should be fixed, is sufficient. Of course they have lots of rules to regulate the code-and-fix, some of which are quite sensible and some of which are quite dubious, to make it appeal to closet process engineers. On the other side you have old-school developers who believe in such rarities as requirements signed in blood and requiring human sacrifice to alter. Ok, so perhaps I'm exaggerating a little bit. But that's how it often feels. So I'm going to propose my own process that is every bit as simple as what the Agile crowd pushes while being more all-encompassing than what the traditionalists push. Here it is:

  1. Define the problem
  2. Define the solution
  3. Apply the solution
  4. Validate the result
Now you are probably thinking that this is nothing novel. I'm just using weird words for:
  1. Develop requirements
  2. Design and Code to the Requirements
  3. uhh....Test?
  4. Customer Acceptance Testing
Wrong! Even if you were more creative than me, or pulled out RUP's phases, or anything like that. But that's expected, as I rarely see steps 1 and 4 completed. Let me explain.

Define the Problem
This is where most projects get in trouble. Their problem definitions look kind of like this:
  • We need to roll out a corporate standard ERP system.
  • We need a web-based foo tracking database accessible to the entire corporation.
  • We need an automated workflow for the bar process.
I could go on and on. Sometimes these are followed by phrases like "will save X million dollars" or "will reduce cycle time by X%." Really, the problem is that someone said a competitor was saving money by applying an automated workflow to the bar process in order to track foos in the ERP system with a web-based frontend, the company at hand doesn't have one, so that's a problem. Anyway, often these statements are really masking genuine problems such as:
  • My old college roommate is an ERP salesman and wants a new boat.
  • We have a ton of foo, but no one really knows where it is or if it's being used. So we keep on buying more foo, even though we probably have enough. The problem is that when we find some foo, the people with the foo always claim they need all of it, even though often they clearly aren't using it. We have some data, but it's massive and impossible to interpret. We need a way to find unused foo and prove that it is unused so that we can transfer it to where it is needed.
  • Some cowboys in department X keep on ignoring the bar process. They think they are heroes because it saves time and money upfront, but really they just end up creating costly, time-consuming problems down the line (for me). I need a way to force them to follow the bar process, but it can't be too hard, otherwise they'll convince upper management to let them ignore it.
So why is the difference important? Don't you just end up with the first set of statements as your project charter anyway? No. A couple months ago I faced a situation similar to the second item. A consultant had (unfortunately) convinced a couple of senior managers that they wanted a big, fancy database integrated with half of our internal systems and requiring a whole bunch of data maintenance. Fortunately the consultant also directed them to me. They had tons of data, and had spent countless hours fiddling with spreadsheets trying to turn it into actionable information. Having failed, they decided they needed a huge database that would cost hundreds of thousands of dollars to develop and then require staff to keep up-to-date. They also needed something in about a couple of weeks. So I poked and prodded until I finally understood what they needed to know, what data they had, and what they needed to decide based on that data. Then I wrote a few hundred lines of Python to analyze the data and make pretty graphs, along with a couple dozen lines of VBA to stick the outputs of the Python program into a PowerPoint presentation. They were thrilled with the result. Hundreds of thousands of data points were transformed into actionable charts that even the most impatient executive could correctly interpret. This took me about two weeks of effort. Their original requirements would have taken a couple of man-years of effort to implement, and the result would not have solved their problem. Traditionalists would have wasted the time to implement the requirements (which were actually fairly well developed), or at least a portion of them. Agilists would have fiddled around for a while and achieved the same result.

Now I'll admit that on the majority of projects it's the other way around: understanding the problem makes the cost of the solution grow by an order of magnitude rather than shrink. My guess is only 1 in 4 can actually be simplified by understanding the problem, and 2 in 4 become significantly more complex. But solid estimates that can be tied to solid business cases are extremely important. Delivering a cheap solution that doesn't deliver value is a waste of money. In my experience, development teams assume that "the customer" or "the business" already understands its problem and is defining requirements that will solve it. In reality, the problem is usually vaguely understood at best, and the description of requirements is every bit as much a design activity as coding is.

Define the Solution
This is where most of the traditional software engineering activities occur. You have a customer (or marketing) who has given you some high-level requirements defining the general shape of the system, and then you go about gathering more detailed requirements, followed by architecture, design, code, and test. Or maybe you are agile, so you do all of those activities at once and only bother writing down code (in the form of application code and test code). Either way, knowing the problem really helps. Some people probably object to lumping all of these activities under one heading because they take so much time. I would agree, but they are rarely done entirely sequentially. Prototyping is every bit as valid a method for eliciting requirements as interviews. Sometimes it is a lot more effective. Also, there are strong feedback loops among all of the elements. So really, they are all done pretty much at the same time. It just happens that requirements kick off the process and testing finishes it up.
Others would object because "the completed system is the solution." Well, no, it's not. You don't really know if you have a solution until after you've deployed and run the system long enough for the business to adjust itself.

Apply the Solution
This is just another way of saying "deploy," plus all the other things you have to do, like training. If you think of it at a really high level (too high for effective engineering), the organization is the problem, and you apply the software to the organization to see if the organization gets better.

Validate the Result
This is where you measure the impact of deploying the software on the problem to see if the problem has indeed been solved. I don't think anyone will disagree that a piece of software can meet all of its requirements and still fail to make a positive impact. So you need to measure the impact of the deployed system. In practice, success is declared as soon as a piece of software that meets "enough" of its requirements is rolled out. This puts the organization in a very bad position, because if the software subsequently fails to deliver value, then the declaration of success and those who made it are called into question. In most cases the organization will just end up either ignoring the system or limping along until enough time has passed to call the system a problem.


Tuesday, July 17, 2007

More Efficient User Interfaces

Over the weekend I ran into the following video about changing user interface paradigms. I found the language-based interface really interesting, but didn't much care for the ZUI. Unfortunately, the language-based interface reminds me of old games from about the time personal computers became powerful enough to run Eliza-like software (for hackers, go here for code, and here for an easy environment on Windows). Basically you typed in "natural language" commands, and the characters were supposed to carry them out. Personally, I thought it was about the most awful way to play a game. Maybe in the past couple decades the technology has progressed. It looks like it has potential.

Anyway, the first thing that struck me was that it reminded me of SciFi books that describe any advanced computer use as "programming," possibly because I largely stick to classic SciFi. Of course the vast majority of computer users, who indeed perform relatively complex tasks, are not programmers. Not by a long shot. But I think the language-based interface concept could bring "programming" one step closer to the masses (whether that is good or bad is an exercise for the reader).

The second thing I noticed was a striking similarity to Smalltalk, only Smalltalk is a lot better. You see, if all of your software was inside of a Smalltalk image, then you could easily interact with it programmatically. Furthermore, Smalltalk's syntax and dynamic nature make it fairly approachable for those who have not been corrupted by years of experience with less lofty languages (in other words, me). Given a "humanized," if you will, object model, I could see some real potential. At a minimum it would be a realization of much more componentized software. Alas, we can dream...

The paradigm for such a system would be something like this: someone builds an API that translates human-language-like typed commands into Smalltalk code. It could quite possibly be an internal DSL. Then you build the interface so that an easy keystroke lets you "talk" to the computer by typing. AI would be optional, but I think essential for really advanced use by lay users. As time progresses, the user and the computer would learn more and more complex tasks - the user would essentially be programming the computer, but it would feel more like training. Ultimately this could facilitate much more efficient and sophisticated computer use. Maybe.

There are counterpoints. Paul Murphy seems to think that everyone should learn keyboard shortcuts and Unix command-line utilities. While I agree on the utility of both, somehow I don't think the everyday computer user is ever going to do it. The more convincing counterpoint was made by my wife, an attorney, who wasn't even really trying to make a counterpoint. The exchange went something like this:

Me: I saw this video the other day about changing user interface paradigms to consist of unified functionality instead of distinct applications.
Her: Why would you want that?
Me: Well, different applications do different things, and so often you have to use more than one to get something done.
Her: Word is complicated enough already. Wouldn't that make it more complicated?
Me: Yeah, that's why this guy advocated switching to text-based commands instead of menus and dialogs and stuff.
Her: Huh?
Me: You know how Word has a ton of menus, and those have sub-menus, and those launch dialog boxes with a bunch of tabs in them?
Her: Yeah. It's confusing.
Me: Well, if there were easy text commands that resembled natural language, then you could just tell the computer what to do and not worry about the menus.
Her: Wouldn't that mean I have to learn all that? What's wrong with what we've got? It works. I know how to use it. Why change? People won't like the change. It will confuse them.
Me: Well, how about intelligent search....

And so on, to another defeat of the engineer. But that's the point. We dream of novel new things. We dream of users who care to really learn applications. We get people who want consistency. They've learned it, and they don't want to learn it again.

So why is this important? Well, two reasons. One is that people are amazingly hampered by most applications. By people, I mean everyone. How many of you have spent hours fighting Word or PowerPoint to get some document to look right? I know I've done it plenty of times. That's wasted time, and time is money. But there's a more important reason. I think we've hit a plateau in terms of technology innovation. We have all this shiny new processing power and networking, and most "innovation" seems to consist of things like "social networking" (many were doing that at 1200 bps or less years ago), sharing pictures (ditto), playing media (replacing the TV/VCR isn't innovation), and other such pursuits that, while they may be very beneficial, hardly represent technology advancements. Why? There are a lot of reasons. I think one of them is that computers are so complicated already that the chance of getting a user to accept them doing something more complicated is incredibly low. You can do it in specialized fields that have a technical leaning (science, engineering, finance, economics), but doing it for others is another ballgame. In the long run, in order to keep innovating, we have to be able to bring innovative software to people who today do not want it, because what they have today is complex enough. We have to change the user interface paradigm, and we have to do it in a way that doesn't scare people who already "know something" away.
