Happy new year, everyone. It’s a new year and time for new resolutions. Let’s talk about time. Specifically, measuring it in Haskell.
How do you measure how long something takes in Haskell? Here’s a naive attempt:
import Control.Exception
import Data.Time
main =
  do start <- getCurrentTime
     evaluate (sum [1 .. 1000000])
     end <- getCurrentTime
     print (diffUTCTime end start)
Running it, we see that it does what we expect:
λ> main
0.316653s
Here’s what’s wrong with this implementation:
If you’re on an Ubuntu desktop, your time is updated from NTP servers when you first boot up. If you’re on a server, there is likely a daily cron job to update your time, because you don’t tend to reboot servers. My laptop has been on for 34 days:
$ uptime
21:13:47 up 34 days, 2:06, 3 users, load average: 0.74, 0.83, 0.84
If I run a manual update, it adjusts my clock by 500 milliseconds:
$ sudo ntpdate ntp.ubuntu.com
5 Jan 21:11:53 ntpdate[4805]: adjust time server x.x.x.x offset 0.517166 sec
That’s because a certain amount of “drift” occurs over time.
Additionally, leap seconds can be introduced at any time and cannot be predicted systematically, though time servers get at least six months’ notice in advance. In 2015 an extra second will be added between the 30th of June and the 1st of July.
These factors mean that if our main function is run during an update, the reported time could be completely wrong. For something simple like the above, maybe it doesn’t matter. For long term logging and statistics gathering, this would represent an anomaly. For a one-off, maybe it’s forgivable, because it’s convenient. But above all, it’s simply inaccurate reporting.
Readers familiar with this problem will think back to measuring time in C; it requires inspecting the processor clock and dividing by clocks per second. In fact, there are a couple of solutions around that use this approach.
These are more reliable, because the time cannot be changed. But they are limited: both only measure CPU time, not IO time. So if your program takes 10 seconds but does only 5 seconds of CPU processing and 5 seconds of waiting for the disk, then you will not get the real time, also known as wall time.
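For reference, here’s a minimal sketch of that CPU-time-only approach using getCPUTime from base (just an illustration of the idea, not one of the packages referred to above); getCPUTime reports picoseconds of CPU time used by the program:

import Control.Exception
import System.CPUTime
import Text.Printf

main =
  do start <- getCPUTime
     evaluate (sum [1 .. 1000000])
     end <- getCPUTime
     -- getCPUTime is in picoseconds, so divide by 10^12 to get seconds.
     printf "%.6fs of CPU time\n" (fromIntegral (end - start) / 1e12 :: Double)

Note that this still only counts CPU time, so any time spent blocked on IO simply disappears from the measurement.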
In the Criterion package, there’s a need for fine-grained, fast, accurate measurement of both real and CPU time, so it includes its own cross-platform implementations.
That’s nice, but it’s embedded in a specific package built for benchmarking, which we may not necessarily be doing. For example, I am dabbling with a program to measure the speed of my key presses. It turns out there is a package that does something similar to Criterion, already prepared, similarly cross-platform, and depending only on base and ghc-prim.
I discovered this really nice package called clock which has the option for several time measurements:
Monotonic
: a monotonic but not-absolute time which never changes after start-up.

Realtime
: an absolute Epoch-based time (which is the system clock and can change).

ProcessCPUTime
: CPU time taken by the process.

ThreadCPUTime
: CPU time taken by the thread.

Let’s rewrite our example using this package and the formatting package (which provides a handy TimeSpec formatter as of 6.1):
{-# LANGUAGE OverloadedStrings #-}
import Control.Exception
import Formatting
import Formatting.Clock
import System.Clock
main =
  do start <- getTime Monotonic
     evaluate (sum [1 .. 1000000])
     end <- getTime Monotonic
     fprint (timeSpecs % "\n") start end
Running it, we see similar information to the above, but now it’s accurate.
λ> main
276.05 ms
If you just want CPU time for the process, or the OS thread, just provide a different argument to getTime.
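For example, here’s a sketch of the same program measuring only the CPU time used by the process (assuming the same pragma and imports as above):

-- Same program, but measuring CPU time used by the process rather than wall time.
main =
  do start <- getTime ProcessCPUTime
     evaluate (sum [1 .. 1000000])
     end <- getTime ProcessCPUTime
     fprint (timeSpecs % "\n") start end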
So next time you want to measure how long something takes, unless you’re doing benchmarking, check out the clock package!