Friday, June 27, 2008

Deserializing JSON to Haskell Data objects

Since last week I've found a few other posts and libraries on the subject of JSON serialization and deserialization: here and here, for example. Nonetheless I've continued on my own path, since the best way of learning is doing. It took me a while to figure out how to reconstruct a Haskell object, and I've only got limited functionality working so far, but it works. I have to say I looked into the source code of gread for clues.

So what we want is simple enough:

jsonToObj :: forall a. Data a => JSON.Value -> a


Given a JSON value we want a Haskell object, and hopefully the type will match... OK, I'm a bit terse on the error handling side of things at the moment!
(Note: for the code to work you need to import Control.Monad.State, Data.Generics, Data.List, qualified Data.Map as M, and Data.Maybe.)

It's simple enough for Strings and Bools:

jsonToObj (JSON.String s) = fromJust $ cast s
jsonToObj (JSON.Bool b) = fromJust $ cast b


Of course, if you expected a Foo and parsed a JSON.String, this will fail: cast returns a Maybe, and fromJust blows up on Nothing. You have to cast since the signature doesn't force Strings or Bools, only Data.
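To make the Maybe behaviour concrete, here is a small sketch that uses only cast from Data.Typeable. The three top-level names are made up for illustration and are not part of the deserializer.

```haskell
import Data.Typeable (cast)

-- cast succeeds only when the source and target types are identical.
stringToString :: Maybe String
stringToString = cast "hello"          -- Just "hello"

-- A Bool is not a String, so the cast fails.
boolToString :: Maybe String
boolToString = cast True               -- Nothing

-- No numeric conversion happens: a Double never casts to an Int.
doubleToInt :: Maybe Int
doubleToInt = cast (3.0 :: Double)     -- Nothing
```

Wrapping any of the failing casts in fromJust is what makes jsonToObj crash when the expected type doesn't match the JSON value.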

For other types of objects (including numbers, because I found that cast doesn't perform number conversion) it's a bit more complicated. Basically we have to figure out the type we want, find a constructor to build an instance of that type, and pass that constructor the proper values from the JSON object.
To find the type we want, I use a funny bit of code found in gread:

myDataType = dataTypeOf (getArg getType)
getArg :: a' -> a'
getArg = undefined
getType :: a
getType = undefined


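This creates a dummy value of my result type, and a dummy function returning what it gets as a parameter. They don't need to be implemented; the compiler only cares about the types. Note that the explicit forall a in the signature is what lets getType :: a refer to the result type, and that requires GHC's scoped type variables extension.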
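The same idea can be written without the two dummy helpers. This is only a sketch, assuming the ScopedTypeVariables extension; constructorNames is a hypothetical helper, not something from gread or from the code in this post.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Data

-- Hypothetical helper: list the constructor names of a type chosen by the caller.
-- The Proxy argument only carries the type. `undefined` is never evaluated here,
-- because the standard dataTypeOf instances ignore the value of their argument.
-- Only meaningful for algebraic types; dataTypeConstrs fails on Int and friends.
constructorNames :: forall a. Data a => Proxy a -> [String]
constructorNames _ = map showConstr (dataTypeConstrs (dataTypeOf (undefined :: a)))
```

For example, constructorNames (Proxy :: Proxy Bool) gives ["False","True"].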

The meat of the function, which decides what constructor to invoke and what data to use as constructor parameters, is a bit more involved:
(values, cons) = case x of
    JSON.Object m ->
        let c = indexConstr myDataType 1
        in (map (\x -> fromJust $ M.lookup x m) (constrFields c), c)
    JSON.Number fl ->
        ( []
        , if isPrefixOf "Prelude.Int" (dataTypeName myDataType)
            then mkIntConstr myDataType (round fl)
            else mkFloatConstr myDataType fl )
    JSON.Array [] -> ([], indexConstr myDataType 1)
    JSON.Array (x:xs) -> ([x, JSON.Array xs], indexConstr myDataType 2)


This snippet calculates the values to iterate over and the constructor to use. There's a bit of hand waving there: for objects we take the first constructor we defined (I'll implement multiple constructor support later) and the JSON values contained in the map. We just ensure we get the values in the order the fields are defined (given by the constrFields function). For numbers we use the int constructor if our result type looks like an int, otherwise we use the float constructor. There is probably a better way to construct a number from a Double, but I still need to find it. For the moment we check whether the type name starts with "Prelude.Int", which is arguably not very Haskelly. For arrays, we need to recreate the (head,rest) pair that gave me trouble when serializing, so we deal first with the head of the list, and put the rest afterwards. The empty list is the first constructor, a non-empty list the second.
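One possible alternative to matching on the type name is to ask the DataType for its representation. The sketch below assumes a recent base, where (as far as I know) mkIntegralConstr and mkRealConstr are the exported replacements for mkIntConstr and mkFloatConstr. Both numberConstr and fromNumber are hypothetical names introduced here.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Data

-- Pick the constructor from the type's representation rather than its name.
numberConstr :: DataType -> Double -> Constr
numberConstr dt d = case dataTypeRep dt of
  IntRep   -> mkIntegralConstr dt (round d :: Integer)
  FloatRep -> mkRealConstr dt d
  _        -> error ("not a numeric type: " ++ dataTypeName dt)

-- Hypothetical wrapper: build a number of whatever type the caller expects.
fromNumber :: forall a. Data a => Double -> a
fromNumber d = fromConstr (numberConstr (dataTypeOf (undefined :: a)) d)
```

With this, (fromNumber 3 :: Int) and (fromNumber 2.5 :: Double) both work, and the check no longer depends on how the type happens to be named.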

Then to actually create the object we pass the values inside a State monad:
State f=(fromConstrM (State (\(x:xs) -> (jsonToObj x,xs))) cons)


For the list of values we convert the head from JSON and keep the rest as state. This simple line was the result of intense thinking: I implemented my own State monad first to really understand what I needed to do, and then figured out that the standard State monad did the exact same thing. The final working code is always shorter than all the previous failed attempts!
We then only need to call the State function with the values we calculated earlier, and that gives us our full code:
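To see fromConstrM and State working together in isolation, here is a reduced sketch that fills a record from a plain list of Ints instead of JSON values. Point and pointFromInts are made up for the example, and it uses get and put rather than the State constructor, which current versions of mtl no longer expose as a plain data constructor.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Control.Monad.State
import Data.Data

data Point = Point { px :: Int, py :: Int } deriving (Show, Eq, Data)

-- Fill each field of the first Point constructor from a list of Ints,
-- threading the remaining inputs through State, as the post does with JSON values.
pointFromInts :: [Int] -> Point
pointFromInts =
    evalState (fromConstrM next (indexConstr (dataTypeOf (undefined :: Point)) 1))
  where
    -- Called once per constructor field, in field order.
    next :: Data d => State [Int] d
    next = do
      inputs <- get
      case inputs of
        (i : rest) | Just v <- cast i -> put rest >> return v
        _ -> error "pointFromInts: missing input or field type mismatch"
```

pointFromInts [1, 2] yields Point {px = 1, py = 2}: the first field consumes the head of the list and the second field sees what is left.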

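jsonToObj :: forall a. Data a => JSON.Value -> a
jsonToObj (JSON.String s) = fromJust $ cast s
jsonToObj (JSON.Bool b) = fromJust $ cast b
jsonToObj x =
    fst $ f values
    where
        -- dummy definitions: only their types matter
        getArg :: a' -> a'
        getArg = undefined
        getType :: a
        getType = undefined
        myDataType = dataTypeOf (getArg getType)
        -- the JSON values to consume and the constructor to build
        (values, cons) = case x of
            JSON.Object m ->
                let c = indexConstr myDataType 1
                in (map (\x -> fromJust $ M.lookup x m) (constrFields c), c)
            JSON.Number fl ->
                ( []
                , if isPrefixOf "Prelude.Int" (dataTypeName myDataType)
                    then mkIntConstr myDataType (round fl)
                    else mkFloatConstr myDataType fl )
            JSON.Array [] -> ([], indexConstr myDataType 1)
            JSON.Array (x:xs) -> ([x, JSON.Array xs], indexConstr myDataType 2)
        -- each constructor field takes the next JSON value from the state
        State f = fromConstrM (State (\(x:xs) -> (jsonToObj x, xs))) cons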

Friday, June 20, 2008

Serializing Haskell objects to JSON

I'm trying to do some simple serialization of Haskell objects to save some state to disk. After tearing some of my hair out debugging parse errors due to silly code I'd written in Read instance declarations, I've decided to use another approach. I'm going to save objects as JSON, since I've already used that JSON library.
The first task is to write generic code that can serialize a Data instance to JSON. This has probably been done somewhere else, but I need to learn, right? So I took a deep breath and dived into Data.Generics.
I quickly rounded up the issues I would face. Most notably, Strings and lists are accessed without any syntactic sugar, so to speak: Strings are lists of Char, and lists are made of two fields, the head and the rest. Of course, you say, but from that I need to regenerate Strings and lists of JSON Value objects.
So, to start, how do we recognize Strings so we can treat them differently from other algebraic types?

isString :: Data a => a -> Bool
isString a = (typeOf a) == (typeOf "")


There are probably better ways to do that, just tell me (-:
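One candidate, offered only as a sketch: since Data implies Typeable, the check can be phrased as "does this value cast to a String?". The name isString' is hypothetical, chosen so it doesn't clash with the version above.

```haskell
import Data.Data
import Data.Maybe (isJust)

-- Alternative check: a value is a String exactly when it casts to one.
isString' :: Data a => a -> Bool
isString' a = isJust (cast a :: Maybe String)
```

It behaves the same as the typeOf comparison (True for "abc", False for 'c' or a list of Ints), so which one reads better is mostly a matter of taste.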

Lists (that are not strings) can be recognized through the abstract representation of their type constructor, which is equal regardless of the actual type of the elements in the list:

isList :: Data a => a -> Bool
isList a
    | isString a = False
    | otherwise = (typeRepTyCon (typeOf a)) == (typeRepTyCon $ typeOf ([]::[Int]))


Now, transforming a list of the form (head,rest) to [head1,head2...]:

jsonList :: Data a => a -> [JSON.Value]
jsonList l =
    concat $ gmapQ f l
    where f b
            | isList b = jsonList b
            | otherwise = [objToJson b]


For each element (the actual number depends on whether we're the empty list or not) we either reapply the same method, if it's the inner list, or simply transform it to JSON.
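The "actual number" of elements can be checked directly. In this small sketch, childCount is a hypothetical helper that counts the immediate children gmapQ visits.

```haskell
import Data.Data

-- Count the immediate children that gmapQ visits.
childCount :: Data a => a -> Int
childCount = length . gmapQ (const ())
```

A non-empty list such as [1,2,3] has two children (the head and the rest of the list), the empty list has none, and a String behaves like any other list. That is why jsonList has to recurse into the second child to flatten the structure.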

And then the actual method on objects:

objToJson :: Data a => a -> JSON.Value
objToJson o | isString o = JSON.String (fromJust $ ((cast o)::(Maybe String)))
objToJson o | isList o = JSON.Array (jsonList o)
objToJson o | otherwise =
    let
        c = toConstr o
    in
        case (constrRep c) of
            AlgConstr _ -> JSON.Object (Data.Map.fromList (zip (constrFields c) (gmapQ objToJson o)))
            StringConstr s -> JSON.String s
            FloatConstr f -> JSON.Number f
            IntConstr i -> JSON.Number (fromIntegral i)


We first handle Strings, then lists, then general objects, using constrRep to distinguish between algebraic types, which create JSON objects with the proper field names (using constrFields), and other types, which map to JSON primitives.

And that's it for the serialization! The Generics package is not that hard to use, but you have to look up examples to figure out how to actually use functions like gmapQ and such...
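To see what constrRep reports for different values, here is a sketch written against a recent base. As far as I know, ConstrRep there has a CharConstr case instead of the StringConstr used above, and FloatConstr carries a Rational rather than a Double. Person and jsonShape are hypothetical names for illustration.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data

data Person = Person { name :: String, age :: Int } deriving (Show, Data)

-- Hypothetical helper: say which JSON shape a value would map to.
jsonShape :: Data a => a -> String
jsonShape v = case constrRep c of
    AlgConstr _   -> "object with fields " ++ show (constrFields c)
    IntConstr _   -> "number (integral)"
    FloatConstr _ -> "number (fractional)"
    CharConstr _  -> "string (single character)"
  where
    c = toConstr v
```

jsonShape (Person "Ann" 30) reports an object with the fields name and age, while jsonShape (1 :: Int) and jsonShape (2.5 :: Double) report the two kinds of number.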

Now, I have to work on the opposite process: given a type and JSON data, reconstruct the objects... More Haskell fun!