Erik Engbrecht's Blog: Adventures in Widefinding: Performance

Thursday, December 06, 2007

Adventures in Widefinding: Performance

I've done some basic performance tests against:

Scala Actor-based parallel Widefinder
Scala serial Widefinder using a BufferedReader
Tim Bray's Ruby Widefinder

The test platform was my 2.2 GHz MacBook with 4 GB of RAM, and the input was a 6-million-line file. The times were as follows:

Scala Parallel:

real 0m14.588s
user 0m24.541s
sys 0m1.383s

Scala Serial:

real 0m20.095s
user 0m18.821s
sys 0m1.441s

Ruby:

real 0m14.301s
user 0m12.485s
sys 0m1.813s
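
To make the three runs easier to compare, the `time` figures above can be reduced to a few ratios. The snippet below is only a sketch of that arithmetic. The names `Timing`, `coresBusy` and `speedup` are made up for illustration and are not part of any of the widefinder implementations.

```scala
object TimingSummary {
  // One `time` measurement in seconds, copied from the results above.
  final case class Timing(name: String, real: Double, user: Double, sys: Double) {
    def cpu: Double = user + sys        // total CPU time, summed over all cores
    def coresBusy: Double = cpu / real  // average number of cores kept busy
  }

  val parallel = Timing("Scala Parallel", 14.588, 24.541, 1.383)
  val serial   = Timing("Scala Serial",   20.095, 18.821, 1.441)
  val ruby     = Timing("Ruby",           14.301, 12.485, 1.813)

  // Wall-clock speedup of `fast` relative to `slow`.
  def speedup(slow: Timing, fast: Timing): Double = slow.real / fast.real

  def main(args: Array[String]): Unit = {
    for (t <- Seq(parallel, serial, ruby))
      println(f"${t.name}%-15s cpu=${t.cpu}%.3fs coresBusy=${t.coresBusy}%.2f")
    println(f"parallel vs serial speedup: ${speedup(serial, parallel)}%.2f")
    println(f"parallel vs Ruby CPU ratio: ${parallel.cpu / ruby.cpu}%.2f")
  }
}
```

Dividing CPU time by wall-clock time gives a rough measure of how many cores each run kept busy, which is a convenient way to see how much of the parallel version's extra CPU time actually turned into a shorter run.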

The good news is that the parallel Scala version is noticeably faster than the serial version: about 14.6 seconds of wall-clock time versus 20.1. The bad news is that it is roughly the same speed as the Ruby version while using about twice as much user CPU time (24.5 seconds versus 12.5). The Scala versions are doing substantially more work, because they have to transcode the 8-bit ASCII in the input file into the 16-bit Unicode strings used on the JVM. This requires a full scan of the data. I believe a fair amount of performance could be gained by combining the transcoding pass over the input with the line-splitting pass.
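
As a rough illustration of what combining the two passes might look like, the sketch below widens each byte to a `Char` and checks for a newline in the same loop, so the data is scanned once rather than once to decode and again to split. It is not the code from the parallel widefinder. The names `CombinedPass` and `linesFromAscii` are hypothetical, and it assumes the input is pure single-byte ASCII (or ISO-8859-1) with `\n` line endings.

```scala
object CombinedPass {
  private val Newline: Byte = 10 // '\n' in ASCII

  /** Splits the first `length` bytes of `buf` into lines, converting
    * bytes to chars in the same loop that looks for line breaks.
    * Assumes single-byte ASCII/ISO-8859-1 input with '\n' terminators.
    */
  def linesFromAscii(buf: Array[Byte], length: Int): Vector[String] = {
    val out = Vector.newBuilder[String]
    val chars = new Array[Char](length) // scratch space, reused for every line
    var n = 0                           // chars collected for the current line
    var i = 0
    while (i < length) {
      val b = buf(i)
      if (b == Newline) {
        out += new String(chars, 0, n)  // line complete: copy it out
        n = 0
      } else {
        chars(n) = (b & 0xFF).toChar    // widen one byte to one 16-bit char
        n += 1
      }
      i += 1
    }
    if (n > 0) out += new String(chars, 0, n) // last line without a trailing '\n'
    out.result()
  }
}
```

For example, `CombinedPass.linesFromAscii(bytes, bytes.length)` on the bytes of `"GET /a\nGET /b\n"` yields the two lines `"GET /a"` and `"GET /b"`. Whether this is actually faster than decoding first and splitting afterwards would need to be measured. It still allocates one `String` per line, and it would have to be adapted to cope with chunks that end in the middle of a line.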

For those who are curious, the source code for the parallel widefinder is available here: the parallel IO code and the actual widefinder code.
