Types of TODO
Now that we have a very basic webserver, we must decide what it is we
actually want to create to proceed any further. On request by David
Wickes I decided to make a “todo-mvp” for the purpose of this
tutorial. We won’t be following the specifications exactly, because
I couldn’t be arsed to read them and because there are things I want to
discuss beyond what makes sense for a todo-mvp.
Basic types
First, let’s create the data type that will hold the todo list’s information. We need a list of tasks, and each task must have a state: “done” or “not done”. Let’s first create a type to differentiate between “done” and “not done”. And no, we won’t be using a boolean. Haskell lets us be more specific than that.
data TaskStatus = Done | NotDone
This creates a type called TaskStatus, which is inhabited by Done and
NotDone. We can use our TaskStatus in the definition of a Task:
data Task = Task String TaskStatus
The first Task is the type name, the second Task is the constructor
name. Done and NotDone are also constructors; they just don’t take any
arguments. The Task constructor, on the other hand, requires a String
and a TaskStatus as arguments. A constructor can be used as a function,
so we can create a value of type Task like so:
myTask :: Task
myTask = Task "create todo list" NotDone
When defining a type with only one constructor, it is common for the type’s name and the constructor name to be the same. Since types and terms can’t occur in the same place, the compiler will have no trouble figuring out which one we mean.
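Because a constructor behaves like a function, it can be handed to other functions such as map. The following is a minimal sketch rather than part of the tutorial's code: the deriving clauses are an assumption added only so the values can be printed and compared, and newTasks is a hypothetical name.

```haskell
data TaskStatus = Done | NotDone
  deriving (Show, Eq) -- assumption: derived only so we can print and compare

data Task = Task String TaskStatus
  deriving (Show, Eq)

-- Use the Task constructor like any other function:
-- every description becomes a task that is not done yet.
newTasks :: [Task]
newTasks =
  map (\description -> Task description NotDone)
    [ "invent terror drones"
    , "achieve world domination"
    ]

main :: IO ()
main = mapM_ print newTasks
```

In the type signature, Task refers to the type; inside the lambda, Task refers to the constructor.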
note
We can also create types as records:
data Task = Task
  { description :: String
  , status :: TaskStatus
  }
This can be useful to name the fields of your type, but I won’t go into records yet.
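For the curious, here is a small sketch of what the record form gives you, assuming the record definition above. The deriving clauses and the name finishedTask are additions for illustration only.

```haskell
data TaskStatus = Done | NotDone
  deriving (Show, Eq) -- assumption: derived only for printing and comparing

data Task = Task
  { description :: String
  , status :: TaskStatus
  } deriving (Show, Eq)

myTask :: Task
myTask = Task { description = "create todo list", status = NotDone }

-- Record update syntax builds a new value; myTask itself is unchanged.
finishedTask :: Task
finishedTask = myTask { status = Done }

main :: IO ()
main = do
  -- Field names double as accessor functions.
  putStrLn (description finishedTask)
  print (status myTask, status finishedTask)
```

The rest of this tutorial keeps using the plain, positional definition of Task.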
Finally, we want to have a list of tasks:
myTodoList :: [Task]
myTodoList =
  [ Task "create todo list" Done
  , Task "invent terror drones" NotDone
  , Task "achieve world domination" NotDone
  ]
Just like IO, List is a higher-kinded type with kind * -> *. It takes a
type and produces the type for a list of elements of that type. List Int
is a list of integers, List String is a list of strings,
List (List Bool) is a list of lists of booleans, and so on. In Haskell
source these are written [Int], [String] and [[Bool]].
note
List is also a monad. We get return by putting a single element in a
list, fmap by map, and we can get >>= using concatenation
(concat :: [[a]] -> [a]) by doing an fmap followed by a concat. Many
other data structures are monads too.
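The claim in this note can be checked directly. A minimal sketch: bindList and withNegation are hypothetical names introduced here, and bindList is only meant to mirror what the built-in >>= does for lists.

```haskell
-- A hypothetical helper: each number expands into itself and its negation.
withNegation :: Int -> [Int]
withNegation n = [n, negate n]

-- >>= for lists, written as an fmap followed by a concat.
bindList :: [a] -> (a -> [b]) -> [b]
bindList xs f = concat (fmap f xs)

main :: IO ()
main = do
  print (return 1 :: [Int])            -- a single element in a list
  print (fmap (+ 1) [1, 2, 3])         -- fmap is map
  print (bindList [1, 2] withNegation) -- our hand-written bind
  print ([1, 2] >>= withNegation)      -- the built-in bind gives the same list
```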
We’ll want to send our todo list as HTML, so we must make a function
to convert [Task] into HTML. Let’s start with a conversion function for
a single task:
toHTML :: Task -> String
toHTML task =
  case task of
    Task description status ->
      case status of
        NotDone -> "<p>" ++ description ++ "</p>"
        Done -> "<p><strike>" ++ description ++ "</strike></p>"
What’s going on here? We’re using a case expression. A case
expression is similar to a switch in imperative programming; it
evaluates a different expression based on the value of a variable.
Rather than matching an exact value though, we match on patterns. The
pattern NotDone matches a TaskStatus created with the NotDone
constructor. Similarly, the pattern Task description status matches any
value of type Task created with the Task constructor. In that case
description and status bind to the first and second argument of the
constructor, and we can use them on the right side of the arrow.
++ is the string concatenation operator.
Oh, and Haskell is indentation-sensitive.
Finally, let’s make a function that turns our snippets of HTML into a web page.
toHTMLPage :: String -> String
toHTMLPage innerHTML =
  "<!DOCTYPE html>\
  \<html xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"\" xml:lang=\"\">\
  \<head>\
  \ <meta charset=\"utf-8\" />\
  \ <title>TODO list </title>\
  \</head>\
  \<body>\
  \ " ++ innerHTML ++
  "</body>"
We use \ to escape a newline, and stop escaping at the next \. This
might seem obnoxious at first because it’s a lot more characters than
using a multi-line string delimiter, but this allows us to have the
right indentation in both code and output. We also use \ to escape the
" string delimiter.
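To see exactly which characters survive, here is a small sketch comparing a string written with these escapes to the same string on one line. The names gapped and flat are hypothetical. Note that the escaped stretch contributes no characters at all, so no newline ends up in the string.

```haskell
-- Written with a backslash at the end of each line and another backslash
-- where the string resumes. Everything in between is dropped.
gapped :: String
gapped =
  "<head>\
  \  <title>TODO list</title>\
  \</head>"

-- The same string written on one line.
flat :: String
flat = "<head>  <title>TODO list</title></head>"

main :: IO ()
main = print (gapped == flat)
```

The two spaces before the title tag are kept because they come after the second backslash, which is what lets us control indentation in the output.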
Run stack build to ensure you have no type errors.
Hoogle
But there’s a problem: the responseLBS function from last chapter only
accepts a lazy ByteString, whereas here we have a String. So what is the
difference? A String is a list of Chars, whereas a ByteString is an
array of bytes. A Char is a representation of a Unicode code point; a
byte is (generally) 8 bits with no meaning attached to it.
So what is the distinction between a lazy and a strict ByteString? Isn’t
everything in Haskell lazy? Well, this is a different sort of lazy. A
strict ByteString is a single vector of bytes. But because Haskell is
pure, we cannot mutate it. That means that if we need to change or
append 1 byte, we have to deep copy everything else in the ByteString.
A lazy ByteString consists of multiple vectors of bytes, so we can copy
references to the parts that didn’t change.
So, how do we change a list of code points into a sequence of bytes?
We have to specify the right encoding. We could use the
Data.ByteString.Lazy.Char8 module that comes with the bytestring
package, but the documentation tells us “all Chars will be truncated to
8 bits”, which is not what we want! HTML supports more than just ASCII
characters.
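The truncation is easy to observe. The sketch below assumes the bytestring package is available as a dependency; the names lambda and truncated are hypothetical.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as Char8

-- The Greek letter lambda is code point 955 (0x3BB), which needs more than 8 bits.
lambda :: String
lambda = "\955"

-- Char8.pack keeps only the low 8 bits of each Char: 0x3BB becomes 0xBB (187).
truncated :: BL.ByteString
truncated = Char8.pack lambda

main :: IO ()
main = do
  print (BL.unpack truncated)              -- the raw bytes that were stored
  print (Char8.unpack truncated == lambda) -- False: the round trip loses the character
```

A single byte comes out, and it no longer identifies the original character, so anything outside the first 256 code points would be corrupted on its way to the browser.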
Let’s Hoogle it!
Visit hoogle.haskell.org and search for “String -> ByteString”. These are the results I get:
We’ve already rejected the first result, but fromString in the utf8-string package looks promising.
Add the right version of utf8-string under dependencies in
package.yaml. My dependencies now look like this (you may have
different versions):
dependencies:
- base >= 4.7 && < 5
- warp >= 3.2.28 && < 4
- wai >= 3.2.2.1 && < 4
- http-types >= 0.12.3 && < 0.13
- utf8-string >= 1.0.1.1 && < 2
We’ll need to import a new module, namely Data.ByteString.Lazy.UTF8:
import qualified Data.ByteString.Lazy.UTF8 as UTF8 (fromString)
Now, we can change our requestHandler to send a task as response:
requestHandler :: Request -> (Response -> IO ResponseReceived) -> IO ResponseReceived
requestHandler request respond =
  let
    htmlPage = UTF8.fromString $ toHTMLPage $ toHTML myTask
    response = responseLBS status200 [] htmlPage
  in
    do
      putStrLn "Received an HTTP request!"
      respond response
Now you can stack run to start your server and navigate to
http://localhost:8080 to see a rather disappointing website.
Typeclasses
Okay, so we can now send a single Task as an HTML element, but we want
to send more than just that! We need to be able to
at least convert a list of tasks to HTML and possibly other stuff too.
We want different types to be able to have similar functions available,
and be able to write code based on those functions, rather than for each
individual type. Enter typeclasses! A typeclass defines a set of
functions that must be available for a type for it to be considered a
member of that class.
For instance, for our toHTML function, we might make an HTML class:
class HTML a where
  toHTML :: a -> String
Here we create a class HTML, we use a as a placeholder for the yet
unknown type, and declare that a member of the class HTML must specify
a function called toHTML with type signature a -> String (where a will
be replaced by the type in question).
We can declare that Task is an instance of the HTML class:
instance HTML Task where
  toHTML task =
    case task of
      Task description status ->
        case status of
          NotDone -> "<p>" ++ description ++ "</p>"
          Done -> "<p><strike>" ++ description ++ "</strike></p>"
By providing an implementation of the requirements for being a member
of the HTML class, we have essentially proven that Task is a member of
HTML.
We can do something similar for lists:
instance HTML a => HTML [a] where
  toHTML listOfElements =
    let
      elementToListElement :: HTML a => a -> String
      elementToListElement element = "<li>" ++ toHTML element ++ "</li>"
    in
      "<ul>" ++ (concat $ map elementToListElement listOfElements) ++ "</ul>"
There is a lot of new stuff in there, so let’s look at it bit by bit.
What is this weird => arrow? Well, it’s a sort of restriction on the
type signature that comes after it. instance HTML a => HTML [a] is
saying that [a] is an instance of the HTML class, but only if the type
variable a is as well. So [Task] is an instance of HTML, because Task
is, but [Char] is not, because Char is not an instance of HTML (not yet
anyway). This restriction on the type variable a tells us a must follow
the rules of the HTML class. Specifically, there must be a function
toHTML :: a -> String, and we will be able to use that function when
declaring toHTML for [a].
Similarly, in elementToListElement, HTML a tells us this function only
works for instances of the HTML class, letting us use the toHTML
function.
Finally, we find the functions map - which you’re likely familiar with,
it creates a new list by applying a function to all elements of another
list - and concat, which concatenates a list of Strings into a single
String.
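One consequence of the HTML a => HTML [a] instance is that it applies to itself: a list of lists of tasks is also an instance, with no extra code. The sketch below repeats the definitions in a slightly condensed form (a lambda instead of the local helper) so it can be run on its own. groupedTasks is hypothetical data invented for the example.

```haskell
data TaskStatus = Done | NotDone

data Task = Task String TaskStatus

class HTML a where
  toHTML :: a -> String

instance HTML Task where
  toHTML (Task description status) =
    case status of
      NotDone -> "<p>" ++ description ++ "</p>"
      Done -> "<p><strike>" ++ description ++ "</strike></p>"

instance HTML a => HTML [a] where
  toHTML elements =
    "<ul>" ++ concat (map (\element -> "<li>" ++ toHTML element ++ "</li>") elements) ++ "</ul>"

-- Hypothetical data: tasks grouped into two separate lists.
groupedTasks :: [[Task]]
groupedTasks =
  [ [Task "create todo list" Done]
  , [Task "invent terror drones" NotDone]
  ]

main :: IO ()
main = putStrLn (toHTML groupedTasks)
```

The compiler picks the list instance for the outer list, the list instance again for each inner list, and the Task instance for the elements, producing nested ul elements.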
The open world assumption
Now, you might want to also make a String an instance of HTML.
Unfortunately, you cannot do that, because String is defined as being a
list of characters [Char]. We already have an instance for [a], and
[Char] is also a [a]. If we were to declare:
instance HTML [Char] where -- Bad!
  toHTML = ...
the compiler would complain, because it is no longer clear which
implementation of toHTML to use. Even if we were to remove
the instance declaration for [a], it would still be
illegal, as we might add it later, potentially changing the behavior of
code in a completely different place. We say it violates the open
world assumption. We do not know what else there is, so we are only
allowed to match instances on data declarations.
But how can a String not be a data declaration? We can create aliases
for existing types:
type String = [Char]
This defines the String type to be a different name for the [Char]
type. If we had:
data String = String [Char]
We would not have this problem. It is not a popular opinion, so take
it with a grain of salt, but I personally consider type
declarations to be bad style: things that are the same should be called
by the same name, otherwise we’re just obfuscating the types. Aliases
can save you some time though, and I have used them myself, but I’m
never proud of them.
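To make the difference concrete, here is a sketch of the wrapper approach. PlainText is a hypothetical type that does not appear elsewhere in this tutorial. It uses newtype, which is a variant of data for types with exactly one constructor and one field, and the span element is an arbitrary choice for illustration.

```haskell
class HTML a where
  toHTML :: a -> String

instance HTML a => HTML [a] where
  toHTML elements =
    "<ul>" ++ concat (map (\element -> "<li>" ++ toHTML element ++ "</li>") elements) ++ "</ul>"

-- Hypothetical wrapper: a distinct type, so its instance cannot clash with [a].
newtype PlainText = PlainText String

instance HTML PlainText where
  toHTML (PlainText text) = "<span>" ++ text ++ "</span>"

main :: IO ()
main = putStrLn (toHTML [PlainText "hello", PlainText "world"])
```

Because PlainText is its own type rather than another name for [Char], both instances coexist and the list instance still works for [PlainText].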
side note 1
Instance declarations match on the right side of the =>. That means that even if you were to use advanced type programming to make the left sides of two instance declarations mutually exclusive, you still would not be able to have overlapping instances.
side note 2
There is a language extension that will let you have overlapping instances, but I do not recommend it.
Putting it all together
We can now also change our toHTMLPage to be defined in terms of the
HTML class:
toHTMLPage :: HTML a => a -> String
toHTMLPage a =
  "<!DOCTYPE html>\
  \<html xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"\" xml:lang=\"\">\
  \<head>\
  \ <meta charset=\"utf-8\" />\
  \ <title>TODO list </title>\
  \</head>\
  \<body>\
  \ " ++ toHTML a ++
  "</body>"
And of course don’t forget to change requestHandler to work with our
new definition! And let’s send the entire list over while we’re at it.
htmlPage = UTF8.fromString $ toHTMLPage myTodoList
stack run it and let’s see our website:
Great success!