Data.Typeable and Data.Data are rather mysterious. Starting out as a Haskell newbie you see them once in a while and wonder what use they are. Their Haddock pages are pretty opaque and scary in places. Here’s a quick rundown I thought I’d write to get people up to speed nice and quick so that they can start using it.1
It’s really rather beautiful as a way to do generic programming in Haskell. The general approach is that you don’t know what data types are being given to you, but you want to work upon them almost as if you did. The technique is simple when broken down.
First, there is a class exported by each module. The class
Typeable
and the class Data
. Your data types
have to be instances of these if you want to use the generic programming
methods on them.
Happily, we don’t have to write these instances ourselves (and in GHC
7.8 it is actually not possible to do so): GHC provides the extension
DeriveDataTypeable
, which you can enable by adding
{-# LANGUAGE DeriveDataTypeable #-}
to the top of your
file, or providing -XDeriveDataTypeable
to
ghc
.
Now you can derive instances of both:
data X = X
deriving (Data,Typeable)
Now we can start doing generic operations upon X
.
As a simple starter, we can trivially print the type of any instance
of Typeable
. What are some existing instances of
Typeable
? Let’s ask GHCi:
> :i Typeable
λclass Typeable a where typeOf :: a -> TypeRep
instance [overlap ok] (Typeable1 s, Typeable a) => Typeable (s a)
instance [overlap ok] Typeable TypeRep
instance [overlap ok] Typeable TyCon
instance [overlap ok] Typeable Ordering
instance [overlap ok] Typeable Integer
instance [overlap ok] Typeable Int
instance [overlap ok] Typeable Float
instance [overlap ok] Typeable Double
instance [overlap ok] Typeable Char
instance [overlap ok] Typeable Bool
instance [overlap ok] Typeable ()
That’s the basic Prelude types and the Typeable library’s own types.
There’s only one method in the Typeable
class:
typeOf :: a -> TypeRep
The TypeRep
value has some useful normal instances:
> :i TypeRep
λinstance [overlap ok] Eq TypeRep
instance [overlap ok] Ord TypeRep
instance [overlap ok] Show TypeRep
instance [overlap ok] Typeable TypeRep
So we can use this function on a Char
value, for
example, and GHCi can print it:
> :t typeOf 'a'
λ'a' :: TypeRep
typeOf > typeOf 'a'
λChar
This is mostly useful for debugging, but can also be useful when writing generic encoders or any tool which needs an identifier to be associated with some generic value.
We can also compare two type representations:
> typeOf 'a' == typeOf 'b'
λTrue
> typeOf 'a' == typeOf ()
λFalse
Any code which needs to allow any old type to be passed into it, but which has some interest in sometimes enforcing or triggering on a specific type can use this to compare them.
A common thing to need to do is when given a generic value, is to sometimes, if the type is right, actually work with the value as the concrete type, not a polymorphic type. For example, a printing function:
char :: Typeable a => a -> String
The specification for this function is: if given an
Char
, return its string representation, otherwise, return
"unknown"
. To do this, we need a function that will convert
from a polymorphic value to a concrete one:
cast :: (Typeable a, Typeable b) => a -> Maybe b
This function from Data.Typeable
will do just that. Now
we can implement char
:
> let char x = case cast x of
λJust (x :: Char) -> show x
Nothing -> "unknown"
> char 'a'
λ"'a'"
> char 5
λ"unknown"
> char ()
λ"unknown"
That’s more or less where the interesting practical applications of
the Typeable
class ends. But it becomes more interesting
once you have that, the Data
class can take advantage of
it. The Data
class is much more interesting. The point is
to be able to look into a data type’s constructors, its fields and
traverse across or fold over them. Let’s take a look at the class.
Again, there are some basic instances provided:
instance Data a => Data [a]
instance Data Ordering
instance Data a => Data (Maybe a)
instance Data Integer
instance Data Int
instance Data Float
instance (Data a, Data b) => Data (Either a b)
instance Data Double
instance Data Char
instance Data Bool
It’s a rather big class, so I’ll just cover some methods that demonstrate the key use-cases.
Similar to the TypeRep
, you can use
dataTypeOf
to get a unique representation of a data
type:
dataTypeOf :: Data a => a -> DataType
For example:
> dataTypeOf (Just 'a')
λDataType {tycon = "Prelude.Maybe", datarep = AlgRep [Nothing,Just]}
There aren’t any other interesting instances for this type, but we’ll
look at uses for this type later. Representations (so-called
FooRep
) tend to be references from which you can reify into
more concrete values.
The most common thing to want to do is to get a list of constructors
that a type contains. So, the Maybe
type contains two.
> :t dataTypeConstrs
λdataTypeConstrs :: DataType -> [Constr]
> dataTypeConstrs (dataTypeOf (Nothing :: Maybe ()))
λNothing,Just] [
We’ll look at what we can do with constructors later.
It’s also surprisingly common to want to see what the constructor is at a particular index. We could write this function ourself, but there’s already one provided:
> indexConstr (dataTypeOf (Nothing :: Maybe ())) 2
λJust
Sometimes you want to know whether a data type is algebraic (in other words, does it have constructors and is it not one of the built-in types like Int/Float/etc)?
> isAlgType (dataTypeOf (Just 'a'))
λTrue
> isAlgType (dataTypeOf 'a')
λFalse
We have the method
toConstr :: a -> Constr
Which given any instance of Data
will yield a
constructor.
> :i Constr
λdata Constr
instance Eq Constr
instance Show Constr
You can’t do much with a constructor as-is, but compare and print it:
> toConstr (Just 'a')
λJust
> toConstr (Just 'a') == toConstr (Nothing :: Maybe Char)
λFalse
However, those operations by themselves can be useful.
By the way, we can also get back the DataRep
of a
constructor:
> constrType (toConstr (Just 'a'))
λDataType {tycon = "Prelude.Maybe", datarep = AlgRep [Nothing,Just]}
Another typical thing to want to do is to use the field names of a constructor. So for example:
> data X = X { foo :: Int, bar :: Char } deriving (Typeable,Data)
λ> toConstr (X 0 'a')
λX
> constrFields (toConstr (X 0 'a'))
λ"foo","bar"] [
It’s a good use-case for serializing and debugging.
It’s actually possible to produce a value from its constructor. We have this function
fromConstr :: Data a => Constr -> a
Example:
> fromConstr (toConstr (Nothing :: Maybe ())) :: Maybe ()
λNothing
But what do you do when the constructor has fields? No sweat. We have this function:
fromConstrB :: forall a. Data a
=> (forall d. Data d => d) -> Constr -> a
Haskell beginners: Don’t fear the rank-N type. What it’s saying is
merely that the fromConstrB
function determines what the
type of d
will be by itself, by looking at
Constr
. It’s not provided externally by the caller, as it
would be if the forall d.
were at the same level as the
a
. Think of it like scope.
let a = d in let d = …
doesn’t make sense: the
d
is in a lower scope. That means we can’t just write:
5 :: Int) (toConstr (Just 1 :: Maybe Int)) :: Maybe Int fromConstrB (
The Int
cannot unify with the d
because the
quantification is one level lower. It basically doesn’t exist outside of
the (forall d. Data d => d)
(nor can it escape). That’s
okay, though. There is a type-class constraint which lets us be generic.
We already have a function producing a value of that type:
> :t fromConstr (toConstr (1 :: Int))
λ1 :: Int)) :: Data a => a fromConstr (toConstr (
So we can just use that:
> fromConstrB (fromConstr (toConstr (1 :: Int)))
λJust 1 :: Maybe Int)) :: Maybe Int
(toConstr (Just 1
Tada! But wait… What if there’re more fields? How do we provide more than one, and of different types?
Enter fromConstrM
:
fromConstrM :: forall m a. (Monad m, Data a)
=> (forall d. Data d => m d) -> Constr -> m a
Because it’s monadic we can use a state monad to keep an index! Observe:
> :t execState
λexecState :: State s a -> s -> s
> :t execState (modify (+1))
λ+1)) :: Num s => s -> s
execState (modify (> :t execState (forM_ [1..5] (const (modify (+1))))
λ1..5] (const (modify (+1)))) :: Num s => s-> s
execState (forM_ [> execState (forM_ [1..5] (const (modify (+1)))) 5
λ10
Let’s put this to use with fromConstrM
:
> evalState
λ
(fromConstrMdo i <- get
(+1)
modify (return
case i of
(0 -> fromConstr (toConstr (5::Int))
1 -> fromConstr (toConstr 'b')))
Foo 4 'a')))
(toConstr (0 :: Foo
Foo 5 'b'
> λ
In other words, keep an index starting at 0. Increase it each
iteration that fromConstrM
does. When we’re at index 0,
return an Int
, when we’re at index 1, return a
Char
. Easy! Right?
A common thing to want is to map over a value in a
structure-preserving way, but changing its values. For that we have
gmapT
:
gmapT :: forall a. Data a
=> (forall b. Data b => b -> b) -> a -> a
Similar to fromConstr*
, there is a rank-n type
b
that refers to each type in the constructor of type
a
. It’s easy enough to use:
> gmapT
λ->
(\d case cast d of
Nothing -> d
Just x ->
if isUpper x then '!' else x)))
fromJust (cast (Foo 4 'a')
(Foo 4 'a'
> gmapT
λ->
(\d case cast d of
Nothing -> d
Just x ->
if isUpper x then '!' else x)))
fromJust (cast (Foo 4 'A')
(Foo 4 '!'
Here I’m doing a little check on any field in the constructor of type
Char
and if it’s upper case, replacing it with
!
, otherwise leaving it as-is. The first trick is to use
the cast
function we used earlier to reify the generic
d
into something real (Char
). The second trick
is to cast our concrete Char
back into a generic
d
type.
Just like fromConstrM
earlier, if you want to operate on
exact indices of the constructor rather than going by type, you can use
gmapM
and use a state monad to do the same thing as we did
before.
Another slightly different use-case is to walk over the values of a
data structure, collecting the result. You can do this with
gmapM
and a state monad or a writer, but there’s a handy
function already to do this:
gmapQ :: forall a. Data a => (forall d. Data d => d -> u) -> a -> [u]
Trivial example:
> gmapQ (\d -> toConstr d) (Foo 5 'a')
λ5,'a'] [
A more useful example can be found in structured-haskell-mode which walks over the Haskell syntax tree and collects source spans into a flat list. Another decent example is in the present package. There’s also an example in Fay to encode types to JSON with a specific Fay-runtime-specific encoding.
Here’s a trivial (not very good, but something I wrote once) generic printer:
gshows :: Data a => a -> ShowS
= render `extQ` (shows :: String -> ShowS) where
gshows
render t| isTuple = showChar '('
. drop 1
. commaSlots
. showChar ')'
| isNull = showString "[]"
| isList = showChar '['
. drop 1
. listSlots
. showChar ']'
| otherwise = showChar '('
. constructor
. slots
. showChar ')'
where constructor = showString . showConstr . toConstr $ t
= foldr (.) id . gmapQ ((showChar ' ' .) . gshows) $ t
slots = foldr (.) id . gmapQ ((showChar ',' .) . gshows) $ t
commaSlots = foldr (.) id . init . gmapQ ((showChar ',' .) . gshows) $ t
listSlots = all (==',') (filter (not . flip elem "()") (constructor ""))
isTuple = null (filter (not . flip elem "[]") (constructor ""))
isNull = constructor "" == "(:)" isList
I wrote it because the GHC API doesn’t have Show
instances for most of its data types, so it’s rather hard to actually
inspect any data types that you’re working with in the REPL. It
has instances for pretty printing, but pretty printing confuses
presentation with data.
Example:
> data Foo = Foo Char Int deriving (Data,Typeable)
λ> gshow ([Just 2],'c',Foo 'a' 5)
λ"([(Just (2))],('c'),(Foo ('a') (5)))"
Note: no Show
instance for Foo
.
We’ve briefly covered how to query types, how to cast them, how to walk over them or generate from them. There’re other things one can do, but those are the main things. The real trick is understanding how to make the types work and that comes with a bit of experience. Fiddle around with the concepts above and you should gain an intution for what is possible with this library. See also: Data.Generics.Aliases.
Hope it helps!
I’ll migrate this to the HaskellWiki when it doesn’t look so, uh, shall we say, unattractive.↩︎