On my last blog anonymous asked:
would a more expensive line-match-function make it more obvious if you are working in parallel?I would say that one should be able to demonstrate that transparently supporting the potential for parallelism should be near free. If you can use a parallel algorithm to solve a problem that doesn't benefit much from parallelism with roughly the same or better performance characteristics as the serial code then it should be a lot better when you actually give it a more complex problem. Basically, parallelism should be free. A lot of people have commented on Tim Bray's blogs that his test is unsuited for demonstrating the benefits of parallelism because it is IO bound. Tim claims this isn't true, and I suspect there's some truth to that if you have really optimized IO, but I do think the benefits of parallelization for him problem are very limited. That being said, one thing that it he has clearly demonstrated is that parallelism isn't free. His "obvious" newbie solution in Erlang performed horribly and was considerably longer than the Ruby solution. Others have greatly improved the performance with extremely long, complicated chunks of code, but have yet to match Ruby. I find that really sad. So I would like to prove that parallelism can be almost free, meaning:
- Leveraging it does not impose a significant additional cognitive load on the programmer.
- Problems that are not effectively parallelizable should execute "about as fast" when parallel functionality is used as with serial code.
Count: 185300 Serial: 11592 real 0m12.107s user 0m11.254s sys 0m0.784sParallel:
Count: 185300 Serial: 11722 real 0m12.225s user 0m11.441s sys 0m0.723sAs you can see the parallel code is slightly slower than the serial code. Across runs their times actually overlap a bit, but serial generally times to be a tad faster. One thing I've noticed is that the deltas between the serial and parallel implementations don't really grow - and to some extent shrink - with increasing input sizes. I believe this is because there is a fixed penalty for setting up the thread pool for the actors. This only has to be done once per process invokation, and appears to cost about 200ms on my machine. In other words, parallelization for file processing can be almost free. I actually think it could be better-than-free, even on a single processor box, if IO was more efficient. My current solution is using a BufferedReader to read in the file one line at a time. This means the IO is probably being done in a less-than-optimal way, and that a lot of work is being done in serial for each line (converting from 8bit ASCII to 16-bit Unicode strings, splitting it into lines). I'd like to use nio to read the file in a block at a time, and then let all this work be done in separate threads. I think then there would be a performance increase because one thread would be doing nothing but reading in buffers as fast as the OS and JVM and provide them, and others would be doing all the computation while the IO thread is blocking. But before that I'm going to get the interface cleaned up and solve the memory problem on large files. Sphere: Related Content