
Wednesday, April 09, 2008

Multiprocess versus Multithreaded...

...or why Java infects Unix with the Windows mindset.

Recently Paul Murphy, the king of the Sun zealots, blogged about Java bringing the Windows mentality to Unix, all the while slamming Java. In response, John Carroll, a Microsoft employee, rose to the defense of Sun's self-declared crown jewel. Talk about weird.

The funny thing is they are both right, although Murph's arguments are pretty weak.

A little history

Unix and Windows evolved with very different definitions of what the primary unit of isolation should be. On Windows, it is (or was) the node. Each Windows user (and DOS user before him) occupied exactly one node. The worst that could happen was that the user destroyed his own workspace, so interactive performance reigned supreme over system integrity. You have a node. You have a user. The node does what the user wants as fast as it can. Initially this applied to running a single application at a time, then to allowing several to be open at once but with the one in the foreground receiving primary resources, and finally to allowing several applications to run simultaneously. Multithreading reigned king because it had lower overhead and focused on making that foreground process more responsive. Threads were optimized, while processes were neglected.

Unix evolved to be fundamentally multiuser, and its primary unit of isolation is the process. Unix systems were intended to be shared, so it was important that one user could not dominate another. Furthermore, a slew of processes (daemons) all ran under the same account while providing services to multiple users, so in order for users to share, processes had to share. Unlike on Windows, one process crashing the entire system was not acceptable, because that would destroy multiple users' data. As a result, processes were designed to provide a strong level of isolation and were heavily optimized to make sure people used them. Threads were largely ignored, or simply treated as processes with a shared heap space, because several cheap processes could simply be chained together to accomplish the same thing in a simpler manner.

The Unix Way

I want you to consider good old-fashioned CGI programs for a moment. Imagine one written in C. First, you may think "Oh my God, running a web application in a non-managed environment. The resource leaks! The memory leaks! The memory consumption of all those processes! Oh, the horror!" Of course, you would be wrong. Repeatedly launching and terminating a Unix process is dirt cheap, especially for a simple program written in C. The OS will cache an image of the executable in memory, which can be shared among invocations. The individual process can leak all the resources it wants, because as soon as it terminates all of those resources are automatically freed by the OS, no matter how incompetent the programmer. If the process fails to terminate, your friendly neighborhood sysadmin can kill it without hurting any other process.
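
To make that concrete, here is a minimal sketch of such a CGI program (the response logic is made up for illustration). It is deliberately sloppy about freeing memory, because the kernel reclaims everything the instant the process exits:

```c
/* process-per-request: the web server fork/execs one copy of this per hit */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* A CGI response is just headers plus a body written to stdout. */
    printf("Content-Type: text/plain\r\n\r\n");

    /* Deliberately never freed: harmless, since exiting returns it to the OS. */
    char *buf = malloc(4096);
    if (buf == NULL)
        return 1;

    const char *query = getenv("QUERY_STRING");
    snprintf(buf, 4096, "Hello from pid %d, query: %s\n",
             (int)getpid(), query ? query : "(none)");
    fputs(buf, stdout);
    return 0;
}
```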

This method works for producing super-available applications despite incredibly crappy code. I've seen it, both in the form of CGI and in the form of much more sophisticated applications. It works. Users get upset about lost transactions, but the application as a whole almost never goes down.

Enter Java

Java took cheap Unix processes and made them expensive. To compensate, it provided primitives for multithreading. It provided a garbage collector to at least slow memory leaks. It turned all those transient application processes into one big JVM process not only serving all the transactions for a given user, but serving all the transactions for an entire application or even multiple applications. Java made it more difficult to make destructive program errors, but it also made the consequences much more severe. Your friendly neighborhood sysadmin is powerless against a runaway thread or a slow memory leak. All he can do is kill the process, bumping out all of the users, killing all of their sessions.

It's so bad, the process might as well be a node. Unix becomes Windows. The JVM is practically an operating system, but without all of the features of an operating system and a whole lot less mature.

Enter Java Frameworks

This is really what Murph was railing against, although he didn't name it and he conflated it with the core language by labeling it "Business Java." Frameworks evolved for a myriad of reasons, often summarized as "taking care of the plumbing so the developer can focus on the business logic." The "plumbing" is a lot of things, including managing certain resources and generally ensuring the application code executes within a well-defined life cycle where it is unlikely to do damage. In other words, instead of giving the developer a simple, uniform mechanism like a process to protect the world from his mistakes, he is given dozens of hooks where he can implement little snippets of focused and hopefully bug-free functionality. All this involves a lot of learning above and beyond "the Java you learned in school" (meaning the core language and libraries), putting a cognitive load on the programmer and an additional runtime load on the machine.

Multiprocess versus Multithreaded

Most Unixes have evolved efficient threading, and Windows has come a long way in becoming a multiprocess, multiuser environment. Consequently, developers need to be able to intelligently decide when to use multiple processes, when to use multiple threads, and when to use a hybrid approach. For example, Apache httpd has for quite a while now used a hybrid approach. On the one hand, on most operating systems threads involve less overhead than processes, so it is more efficient to use multiple threads than multiple processes. On the other hand, multiple processes will ultimately give you better reliability, because they can be spawned and killed independently of one another, so building a system that can run for months without stopping doesn't require writing a program that will run for months without stopping.
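
As a rough sketch of what such a hybrid looks like (loosely in the spirit of Apache's worker model, though the process and thread counts and the do-nothing request handler below are purely illustrative), the parent forks a handful of worker processes and each worker runs a small pool of threads:

```c
/* hybrid model: a few processes for isolation, a few threads each for cheap concurrency */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_PROCESSES    2   /* independent units of failure */
#define THREADS_PER_PROC 4   /* cheap concurrency within each process */

static void *worker_thread(void *arg)
{
    long id = (long)arg;
    /* Placeholder for real work, e.g. pulling requests off a queue. */
    printf("pid %d, thread %ld handling requests\n", (int)getpid(), id);
    return NULL;
}

static void run_worker_process(void)
{
    pthread_t threads[THREADS_PER_PROC];
    for (long i = 0; i < THREADS_PER_PROC; i++)
        pthread_create(&threads[i], NULL, worker_thread, (void *)i);
    for (int i = 0; i < THREADS_PER_PROC; i++)
        pthread_join(threads[i], NULL);
}

int main(void)
{
    for (int i = 0; i < NUM_PROCESSES; i++) {
        pid_t pid = fork();
        if (pid == 0) {          /* child: becomes a worker process */
            run_worker_process();
            _exit(0);
        } else if (pid < 0) {
            perror("fork");
            return 1;
        }
    }
    /* Parent: reap workers; a real server would respawn any that die. */
    while (wait(NULL) > 0)
        ;
    return 0;
}
```

If one worker process leaks or wedges, it can be killed and respawned without taking the threads in the other workers down with it.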

So how do you choose? My rule of thumb is to look at the amount of shared data or messaging required between concurrent execution paths and balance that against how long the "process" (not the OS process) is expected to live. Execution paths with lots of shared data, or that are chatty, will benefit from the lower overhead of threading, and threading allows you to avoid the complexities of shared memory or IPC. Of course, multiprocessing allows you to avoid the complexities of threading APIs, and there are libraries to address both, so the complexity issue could be a wash depending on your previous experience.
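
As a toy illustration of the multiprocess end of that spectrum (the numbers are made up), the parent below ships a tiny batch of work to a child process over a pipe and reads back a single result. When the data crossing the boundary is this small, the isolation of a separate process costs almost nothing; if both sides needed to pound on a large, frequently updated structure, threads or explicit shared memory would be the better fit:

```c
/* parent and child exchange small messages over pipes instead of sharing memory */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int to_child[2], to_parent[2];
    if (pipe(to_child) < 0 || pipe(to_parent) < 0) {
        perror("pipe");
        return 1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {                 /* child: receive numbers, send back the sum */
        long nums[3], sum = 0;
        read(to_child[0], nums, sizeof nums);
        for (int i = 0; i < 3; i++)
            sum += nums[i];
        write(to_parent[1], &sum, sizeof sum);
        _exit(0);
    }

    long nums[3] = {10, 20, 30}, sum = 0;   /* parent: a tiny, cheap message */
    write(to_child[1], nums, sizeof nums);
    read(to_parent[0], &sum, sizeof sum);
    printf("sum computed by child process %d: %ld\n", (int)pid, sum);
    wait(NULL);
    return 0;
}
```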

So why is Murph so wrong? Is JC right?

I think Murph wants to divide the world along nice clean lines. System programmers program in C. They don't need the hand-holding of managed runtimes or languages that treat them like impudent children. They do need lots of flexibility and lots of rope. Application programmers, on the other hand, need high-level abstractions that are close to the business domain that they are addressing. They need to be able to rapidly build software and rapidly change it as requirements evolve. They don't need lots of flexibility and should stay away from low-level details. So, in Murph's eyes, the problem with Java is that it doesn't do either particularly well. The managed runtime and object-orientation get in the system programmer's way, while the general-purpose nature of the language and mish-mash of libraries and frameworks just confuse application developers, or rather distract them from their true purpose. System programmers need C. Application developers need 4GLs.

The fatal flaw in Murph's reasoning is that it ignores the in-between. What happens when the systems programmer or 4GL creator fails to provide the right abstraction for the application developer? He's stuck, that's what happens. Software development is as much about creating abstractions as using them. Consequently, application developers need general-purpose languages.


Monday, July 16, 2007

Sun Ray Thin Clients

Last week I made a comment on Paul Murphy's blog about how the thin-ness of Sun Rays is really up to interpretation. Today he's decided to dedicate an entire post to explaining why I'm wrong, because he figures that if I have an incorrect understanding of Sun Rays, then a lot of people have an incorrect understanding of Sun Rays. He's probably right, although I don't think my understanding is that far off base, and he's been kind enough to let me see a draft copy of his post so I can get a head start on the response. Here's what I said:

Smart, Thick, Thin, Display

It's all word games. Depending on how you define "processing," there is processing going on. It still has to render graphics, translate keyboard and mouse events, etc. A Sun Ray is just a compacted Sun workstation of yesteryear without a hard drive, with special firmware designed to work solely as an X-Windows server.

The problem is that the attempt to make "smart displays" seem more fundamentally different from other similar solutions just muddies the waters. People like me groan because yet another term has been introduced that means almost the same as other terms and that will need to be explained to the higher-ups. The higher-ups get confused and either latch onto it or, more likely, have their eyes glaze over.

Anyway, enough with our industry's incredible ability to make sure words are completely meaningless...

The problem with Sun Ray and other similar solutions is that they are really a local optimum based on today's technology and practices for a relatively narrow range of priorities. Change the priorities and the solution is no longer optimum. Introduce distributed computing techniques with the same low administrative overhead and they lose out entirely.

As far as I can tell, the first part is technically accurate. Older Sun Rays ran a 100MHz UltraSparc II, had 8MB of RAM, and ran a microkernel. See here and here. Newer ones use an even beefier system-on-a-chip.

So the Sun Ray client is obviously processing something, and actually has a fair amount of processing power. Just because it is not maintaining any application state, doesn't mean it's not doing anything. Murph asserts that a Sun Ray is not an X-Terminal, but he'll have to explain the difference to me. He could be right...I don't know. It's been about 7 years since I've used a Sun Ray, but from what I remember it felt just like using Exceed on a PC under Windows, which is quite common at my employer. He did mention this:
Notice that the big practical differences between the Sun Ray and PC all evolve from the simplicity of the device in combination with the inherently multi-user nature of Unix. In contrast the differences between the Sun Ray and X-terminal arise because the X-terminal handles graphics computation and network routing, making it more bandwidth efficient, but marginally less secure.
But the Sun Ray quite clearly has a graphics accelerator and talks over the network, so while there is probably a subtle difference in there that I'm not grasping, it doesn't seem particularly material. But that's not really the meat of the debate; it's just a technical quibble over what constitutes processing and an operating system. He's diluting the debate by calling Sun Rays "smart displays" instead of "thin clients" and thus drawing a false dichotomy, and I'm doing the same by pointing at internal technical specs that have little to do with actual deployment.

The real debate is: "Where should processing take place?" I'll give you a trite answer: as close to the data as possible. Any computation involves a set of inputs and a set of outputs. It makes no sense to shuttle a million database rows from a database server to an application server or client machine in order to sum up a couple of fields. It makes much more sense to do it where the data is, and then ship the result over the network. Likewise, if you have a few kilobytes of input data and several megabytes or gigabytes of results, it makes sense to do the computation wherever the results are going to be needed.

So this is my first issue with the centralized computing paradigm. Right now I'm typing this post in Firefox on Linux, and my computer is doing a fair amount of work to facilitate that interaction with Blogger. I've also got a dozen other windows open. Most of the memory and CPU I'm consuming is dedicated to the local machine interacting with me, the local user. Only a couple of pages of text are being exchanged back and forth with Blogger. So why not let the Sun Ray run Firefox (and an email client, a word processor, etc.)? The new ones have the processing power. They would probably need $100 worth of RAM or so to keep a stripped-down Unix variant in memory, which could be loaded from the network. Intelligent configuration could make the client smart about whether to run an app locally, on a server, or on an idle workstation down the hall.

Murph gives seven reasons:

1. Portability. Murph asserts that with Sun Rays you gain portability, because you can halt a session in one place and immediately resume it in another. I don't doubt that is true, but I don't see any technical reason why the same could not be accomplished with a distributed architecture. All that happens is your terminal becomes the processing server for a remote application. Remember, in Unix there isn't a fundamental difference between a client and a server. I'm not going to address the laptop debate right now. Murph has made some very good arguments against laptops in the past based on the security concerns of their being stolen, despite strong encryption. I think he underestimates the value of laptops and is probably wrong, but there are a substantial number of people who could live with a "portable terminal" because their homes and hotels have sufficient bandwidth.

2. Reliability. This is where the distributed model really shines. In my experience, networks are generally one of the less reliable portions of the computing environment, especially WANs and my own internet connection. A pure thin-client solution simply stops working when the network goes down. In the past, Murph has asserted that everyone needs network connectivity to work, so this doesn't matter. But in my opinion most professionals can continue working for several hours, possibly at reduced productivity, when disconnected from the network. That buys time for IT to fix the network before the business starts bleeding money in terms of productivity. Keeping processing local, along with caching common apps and documents, increases the effective reliability of the system.

3. Flexibility. Murph lists nothing that cannot be done with a locally-processing workstation.

4. Security. Don't use x86 workstations, especially running Windows. The security gains come from a more secure operating system on a processor architecture designed for security and reliability. Eliminating permanent storage from the client does buy some security, because there is then no way to walk out the door with all the data, but distributed processing doesn't preclude centralized permanent storage. There are, of course, substantial advantages to having local storage, like being able to make a laptop that can be used in an entirely disconnected fashion. But I think that's a separate debate.

5. Processing power. There's nothing about a distributed computing model that says you can't install compute servers. Heck, this is done all the time with Windows (both to Windows servers and, more commonly, to Unix servers). Murph's example of a high-performance email server has nothing to do with the thin-client architecture, and everything to do with properly architecting your mail server.

6. Cost. There aren't significant cost savings in terms of hardware when switching to Sun Rays. Hardware is cheap, and you can throw out a lot of pieces of the common PC to reduce the cost. In fact, I bet Sun Rays cost more because of the servers. I don't doubt that when effectively administered they cost less to keep running than a Windows solution, but that's mostly because of Unix. I'll admit that it is probably cheaper to administer Sun Rays than my distributed model, because I think the latter will require greater skill and discipline (meaning higher-paid admins), so in the absence of detailed numbers I'll call it a wash.

7. User freedom. This is partially a consequence of using Unix instead of Windows, and mostly a consequence of changing culture.

So as I said before, Sun Rays, and centralized computing in general, represent a kind of local optimum for a given set of priorities and today's practices. But I don't think they make a solid generalized approach. Distributed computing can be done with all the advantages of Murph's Sun Ray architecture using today's technology; it just isn't common.

Now I've ignored the elephant in the room: much essential software only runs on Windows, and the minute you introduce Windows into the mix (local or centralized), you start compromising many of the advantages outlined above. Of course, what good is a computing environment if it won't run the desired software? Consequently, I think it will be a long time before anything like this flies in most enterprise environments.
