As part of recent type-refactoring efforts in Haskell, a discussion about adding
Semigroup as a parent class of
Monoid has been bouncing around the mailing list. From a theoretical point of view, this is a great idea: it is more flexible than the current approach that would allow for greater expressibility.
From a practical point of view, however, I am inclined to oppose it. Not because it is in itself a bad change—it’s a very reasonable change that has advantages for new code—but because I have, in the past, had to update large systems written in Haskell after GHC updates, and therefore I know that this kind of change has a cost. The Applicative-Monad changes seem to have made way for the Foldable-Traversable Prelude, but those have in turn inspired a number of suggestions for modifications to the Haskell standard library, each one of which, taken on its own, is reasonable, but taken as a mass, mean much more work for maintainers. This is especially true if we continue on this path of making minor refactorings at each release: each year a project will need changes, or it will quickly bit-rot beyond utility.
Default Superclass Instances
There is, however, an alternative I would like to discuss—one which has also been brought up on the mailing list, but which I’d like to provide more concrete motivation for. Some of these changes—especially the
Monoid split—seem to involve taking the functionality of a class and splitting it into multiple smaller classes. For example, we can think of the
Monoid change as converting
Something that has been proposed before (in a few different forms) and which I suggest be more actively considered if changes like these are to become common is to allow superclass instances to be declared within a subclass declaration. This would allow you to write a single
instance declaration for a class, and in that body also include implementations for methods belong to a superclass of that class. As an example, perhaps we could be able to write2:
Semigroup proposal, this would mean that any
Monoid instances that exist would continue to work unchanged, but new instances could (optionally) split apart their declarations. Under this proposal, either of these would be acceptable:
And because the
Monoid declaration for
 is already written like the former, we can make the
Monoid split without having to rewrite the instance declarations!
Exciting Additional Opportunities
Because this lowers the cost of updating for new versions, various other useful changes might be considered that would otherwise involve far too much breakage. For example, we could consider splitting
Num apart into small constituent parts. With Default Superclass Instances, we could refactor
Num into the following without changing any instance declarations:3
class Add a where (+) :: a -> a -> a class Sub a where (-) :: a -> a -> a class Add a => Zero a where zero :: a class Mul a where (*) :: a -> a -> a class Mul a => One a where one :: a class (Zero a, One a) => FromInteger a where fromInteger :: Integer -> a instance Zero a where zero = fromInteger 0 instance One a where one = fromInteger 1 class Signed a where negate :: a -> a abs :: a -> a signum :: a -> a class ( Eq a, Show a, Add a, Sub a , Mul a, FromInteger a, Signed a) => Num a where
which would allow certain numeric types to only implement a subset of the relevant operations:
and also allow current
Num functions to have a looser set of constraints than they do at present:
We could also consider splitting
Arrow4 into distinct components, again without having to change any instance declarations:
or even (dare I dream) splitting
Bits into something that is not a 22-method monstrosity!
On the other hand, this proposal does have some down-sides:
Grepping for Instance Declarations
Right now, I can often find an instance declaration for a type
T by grepping for
instance C T (modulo some line noise) whereas with this change, it’s possible that there is no declaration for
instance C T, because all of
C’s methods are declared by a subclass
C' instead. The compiler ought to be able to deal with this without problem, which means that tools like Haddock documentation should somewhat alleviate this problem, but a user might be confused.
Introduces New Possible Errors
The declarations below are of course nonsensical, and would be rejected by the compiler—but the fact that this change would introduce new failure conditions at all is a drawback of the proposal.
A Pragma-Less Extension
In order to be really useful, we’d want to use this without a
LANGUAGE pragma. After all, we’re arguing for it on the basis of preserving backwards-compatibility, but that argument is much less compelling if we still have to change the source files to make use of it! On the other hand, that means we’d have included a GHC extension that takes effect despite not being enabled, which is also a worrying precedent!
It still might be a useful extension even if it had to be enabled by a
LANGUAGE pragma, as it is easier to add said pragma to a source file than to manually break apart dozens of instance declarations, but it makes this feature less compelling in general.
As I said before, my primary objection to typeclass changes like those above—even if they are good changes for newly written code—is that they break existing code. I don’t want to have to modify a handful of miscellaneous instance declarations on a yearly basis as people discover new levels of abstraction or become dissatisfied with current choices, especially as those changes will grow more extensive as I build more projects in Haskell! But with an extension like this, we could grow the typeclass ecosystem gradually and fix what we see as past warts while maintaining backwards-compatibility, which would be a very powerful tool to have.
This is perhaps more simplistic than we want: we can also use the existing
Semigroupclass from the
semigrouppackage and then, in the
Monoidclass declaration, explain how to derive the methods of the superclass if no superclass instance is present. This, according to the linked proposal, would look like:
The example above is slightly simpler, which is why I relegated this version to a footnote.↩
This isn’t a concrete proposal, so maybe the actual syntax or semantics of these things should be changed! I want to focus on the feature and not the instantiation.↩
For this example, I added
Oneclasses so that a given type might implement an additive and multiplicative unit while not necessarily having a sensible
FromIntegerimplementation. For example, it might not make sense to implement
fromIntegerfor a complex number, but complex numbers clearly have both an additive and multiplicative unit:
This means that the
Productmonoids could be rewritten like:
Notice that I added
Onein such a way that an existing
Numinstance declarations needn’t be changed!↩
I have on numerous occasions had reason to use the
Arrowabstraction, but haven’t had a sensible implementation of
arr. To use a slightly contrived example, I could define a GADT that can describe the structure of boolean circuits in a way that resembles an Arrow, but has no way of expressing
data Circ a b where BoolFunc :: (Bool -> Bool) -> Circ Bool Bool Ident :: Circ a a Compose :: Circ a b -> Circ b c -> Circ a c First :: Circ a b -> Circ (a, d) (b, d) Second :: Circ a b -> Circ (d, a) (d, b) Parallel :: Circ a b -> Circ a' b' -> Circ (a, a') (b, b') Fanout :: Circ a b -> Circ a b' -> Circ a (b, b') instance Category Circ where id = Ident (.) = flip Compose instance Arrow Circ where first = First second = Second (***) = Parallel (&&&) = Fanout arr = error "Nothing sensible to put here!" -- for example, using this definition, we can write xor: xor :: BoolCircuit (Bool, Bool) Bool xor = ((first Not >>> And) &&& (second Not >>> And)) >>> Or -- ...and using xor, write a half-adder: halfAdder :: BoolCircuit (Bool, Bool) (Bool, Bool) halfAdder = xor &&& And
This is not an unreasonable definition—it would be nice to abstract over such a definition using existing tools—but the structure of the
Arrowtypeclass makes it difficult!↩