Commit 6795954

Add docs
1 parent 7be0099 commit 6795954

2 files changed

+374
-0
lines changed

docs/docs/reference/derivation.md

Lines changed: 372 additions & 0 deletions
@@ -0,0 +1,372 @@
1+
---
2+
layout: doc-page
3+
title: Typeclass Derivation
4+
---
5+
6+
Implicit instances for some typeclass traits can be derived automatically. Example:
7+
```scala
8+
enum Tree[T] derives Eq, Ordering, Pickling {
9+
case Branch(left: Tree[T], right: Tree[T])
10+
case Leaf(elem: T)
11+
}
12+
```
13+
The `derives` clause automatically generates typeclass instances for
14+
`Eq`, `Ordering`, and `Pickling` in the companion object `Tree`:
15+
```scala
16+
impl [T: Eq] of Eq[Tree[T]] = Eq.derived
17+
impl [T: Ordering] of Ordering[Tree[T]] = Ordering.derived
18+
impl [T: Pickling] of Pickling[Tree[T]] = Pickling.derived
19+
```
20+
21+
**Note**: This page uses the new syntax proposed for implicits that is explored in #5448. This is not yet an endorsement of that syntax, but rather a way to experiment with it.
22+
23+
### Deriving Types
24+
25+
Besides `enum`s, typeclasses can also be derived for other sets of classes and objects that form an algebraic data type. These are:
26+
27+
- individual case classes or case objects
28+
- sealed classes or traits that have only case classes and case objects as children.
29+
30+
Examples:
31+
32+
```scala
33+
case class Labelled[T](x: T, label: String) derives Eq, Show
34+
35+
sealed trait Option[T] derives Eq
36+
case class Some[T](x: T) extends Option[T]
37+
case object None extends Option[Nothing]
38+
```
39+
40+
The generated typeclass instances are placed in the companion objects `Labelled` and `Option`, respectively.
41+
42+
### Derivable Traits
43+
44+
A trait can appear in a `derives` clause as long as
45+
46+
- it has a single type parameter,
47+
- its companion object defines a method named `derived`.
48+
49+
These two conditions ensure that the synthesized derived instances for the trait are well-formed. The type and implementation of a `derived` method are arbitrary, but typically it has a definition like this:
50+
```scala
51+
def derived[T, S <: Shape] with (ev: Shaped[T, S]) = ...
52+
```
53+
That is, the `derived` method takes an implicit parameter of type `Shaped` that determines the _shape_ `S` of the deriving type `T` and it computes the typeclass implementation according to that shape. Implicit `Shaped` instances are generated automatically for all types that have a `derives` clause.
54+
55+
This is all a user of typeclass derivation has to know. The rest of this page contains the information needed to write a typeclass that can be used in a `derives` clause. In particular, it details the means provided for implementing data-generic `derived` methods.
56+
57+
58+
### The Shape Type
59+
60+
For every class with a `derives` clause, the compiler generates a type named `Shape` in the companion object of that class. For instance, here is the generated `Shape` type for the `Tree` enum:
61+
```scala
62+
type Shape[T] = Cases[(
63+
  Case[Branch[T], (Tree[T], Tree[T])],
64+
  Case[Leaf[T], T *: Unit]
65+
)]
66+
```
67+
Informally, this states that
68+
69+
> The shape of a `Tree[T]` is one of two cases: either a `Branch[T]` with two
70+
elements of type `Tree[T]`, or a `Leaf[T]` with a single element of type `T`.
71+
72+
The type constructors `Cases` and `Case` come from the companion object of a class
73+
`scala.compiletime.Shape`, which is defined in the standard library as follows:
74+
```scala
75+
sealed abstract class Shape
76+
77+
object Shape {
78+
79+
/** A sum with alternative types `Alts` */
80+
case class Cases[Alts <: Tuple] extends Shape
81+
82+
/** A product type `T` with element types `Elems` */
83+
case class Case[T, Elems <: Tuple] extends Shape
84+
}
85+
```
86+
87+
Here is the `Shape` type for `Labelled`:
88+
```scala
89+
type Shape[T] = Case[Labelled[T], (T, String)]
90+
```
91+
And here is the one for `Option`:
92+
```scala
93+
type Shape[T] = Cases[(
94+
  Case[Some[T], T *: Unit],
95+
  Case[None.type, Unit]
96+
)]
97+
```
98+
Note that an empty element tuple is represented as type `Unit`. A single-element tuple
99+
is represented as `T *: Unit` since there is no direct syntax for such tuples: `(T)` is just `T` in parentheses, not a tuple.
100+
101+
The `Shape` type generation is suppressed if the companion object already contains a type member named `Shape`.
102+
103+
### The Shaped TypeClass
104+
105+
For every class `C[T_1,...,T_n]` with a `derives` clause, the compiler also generates a type class instance like this:
106+
```scala
107+
impl [T_1, ..., T_n] of Shaped[C[T_1,...,T_n], Shape[T_1,...,T_n]] ...
108+
```
109+
This instance is generated together with the `Shape` type in the companion object of the class.
110+
For instance, the definition
111+
```scala
112+
enum Result[+T, +E] derives Logging {
113+
  case Ok[T](result: T)
114+
  case Err[E](err: E)
115+
}
116+
```
117+
would produce the following members:
118+
```scala
119+
object Result {
120+
import scala.compiletime.Shape._
121+
122+
type Shape[T, E] = Cases[(
123+
Case[Ok[T], T *: Unit],
124+
Case[Err[E], E *: Unit]
125+
)]
126+
127+
impl [T, E] of Shaped[Result[T, E], Shape[T, E]] = ...
128+
}
129+
```
130+
131+
The `Shaped` class is defined in package `scala.reflect`.
132+
133+
```scala
134+
abstract class Shaped[T, S <: Shape] extends Reflected[T]
135+
```
136+
It is a subclass of class `scala.reflect.Reflected`, which defines two methods that map between a type `T` and a generic representation of `T`, which we call a `Mirror`:
137+
```scala
138+
abstract class Reflected[T] {
139+
140+
/** The mirror corresponding to ADT instance `x` */
141+
def reflect(x: T): Mirror
142+
143+
/** The ADT instance corresponding to given `mirror` */
144+
def reify(mirror: Mirror): T
145+
146+
/** The companion object of the ADT */
147+
def common: ReflectedClass
148+
}
149+
```
150+
151+
The `reflect` method maps an instance value of the ADT `T` to its mirror whereas the `reify` method goes the other way. There's also a `common` method that returns a value of type `ReflectedClass` which contains information that is the same
152+
for all instances of a class (right now, this consists of essentially just the names of the cases and their parameters).
153+
154+
### Mirrors
155+
156+
A mirror is a generic representation of an instance value of an ADT. `Mirror` objects have three components:
157+
158+
- `reflected: ReflectedClass`: The representation of the ADT class
159+
- `ordinal: Int`: The ordinal number of the case among all cases of the ADT, starting from 0
160+
- `elems: Product`: The elements of the instance, represented as a `Product`.
161+
162+
The `Mirror` class is defined in package `scala.reflect` as follows:
163+
164+
```scala
165+
class Mirror(val reflected: ReflectedClass, val ordinal: Int, val elems: Product) {
166+
167+
/** The `n`'th element of this generic case */
168+
def apply(n: Int): Any = elems.productElement(n)
169+
170+
/** The name of the constructor of the case reflected by this mirror */
171+
def caseLabel: String = reflected.label(ordinal)(0)
172+
173+
/** The label of the `n`'th element of the case reflected by this mirror */
174+
def elementLabel(n: Int) = reflected.label(ordinal)(n + 1)
175+
}
176+
```
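
To make the three components concrete, the following is a minimal sketch of how a mirror for a `Leaf` of the `Tree` ADT could be built and queried. It uses simplified stand-ins written in plain Scala: the one-argument `ReflectedClass` constructor, the `treeClass` value, and the `reflectTree` function are assumptions made for illustration, not the generated code or the library definitions.

```scala
// Simplified, hypothetical stand-ins for the classes described above;
// these are NOT the actual scala.reflect definitions.
final class ReflectedClass(val label: Array[Array[String]])

final class Mirror(val reflected: ReflectedClass, val ordinal: Int, val elems: Product) {
  /** The `n`'th element of this generic case */
  def apply(n: Int): Any = elems.productElement(n)
  /** Column 0 of a label row holds the case name */
  def caseLabel: String = reflected.label(ordinal)(0)
  /** Columns 1.. of a label row hold the element names */
  def elementLabel(n: Int): String = reflected.label(ordinal)(n + 1)
}

sealed trait Tree[T]
final case class Branch[T](left: Tree[T], right: Tree[T]) extends Tree[T]
final case class Leaf[T](elem: T) extends Tree[T]

// One row per case: the case label followed by its element labels.
val treeClass = new ReflectedClass(Array(
  Array("Branch", "left", "right"),
  Array("Leaf", "elem")
))

// Hand-written guess at what a generated `reflect` might do:
// the case instance itself serves as the `Product` of elements.
def reflectTree[T](t: Tree[T]): Mirror = t match {
  case b: Branch[T] => new Mirror(treeClass, 0, b)
  case l: Leaf[T]   => new Mirror(treeClass, 1, l)
}

val m = reflectTree(Leaf(42))
```

Here `m.ordinal` selects the row of the label table, so `m.caseLabel` and `m.elementLabel(0)` read the case name and the first element name from that row, while `m(0)` reads the element value from the underlying `Product`.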
177+
178+
### ReflectedClass
179+
180+
Here's the API of `scala.reflect.ReflectedClass`:
181+
182+
```scala
183+
class ReflectedClass(val runtimeClass: Class[_], labelsStr: String) {
184+
185+
/** A mirror of case with ordinal number `ordinal` and elements as given by `Product` */
186+
def mirror(ordinal: Int, product: Product): Mirror =
187+
new Mirror(this, ordinal, product)
188+
189+
/** A mirror with elements given as an array */
190+
def mirror(ordinal: Int, elems: Array[AnyRef]): Mirror =
191+
mirror(ordinal, new ArrayProduct(elems))
192+
193+
/** A mirror with an initial empty array of `numElems` elements, to be filled in. */
194+
def mirror(ordinal: Int, numElems: Int): Mirror =
195+
mirror(ordinal, new Array[AnyRef](numElems))
196+
197+
/** A mirror of a case with no elements */
198+
def mirror(ordinal: Int): Mirror =
199+
mirror(ordinal, EmptyProduct)
200+
201+
202+
/** Case and element labels as a two-dimensional array.
203+
* Each row of the array contains a case label, followed by the labels of the elements of that case.
204+
*/
205+
  val label: Array[Array[String]] = ...
}
206+
```
207+
208+
The class provides four overloaded methods to create mirrors. The first of these is invoked by the `reflect` method that maps an ADT instance to its mirror. It simply passes the
209+
instance itself (which is a `Product`) as the second argument of `mirror`. That operation does not involve any copying and is thus quite efficient. The second and third versions of `mirror` are typically invoked by typeclass methods that create instances from mirrors. An example would be an `unpickle` method that first creates an array of elements, then creates
210+
a mirror over that array, and finally uses the `reify` method in `Reflected` to create the ADT instance. The fourth version of `mirror` is used to create mirrors of instances that do not have any elements.
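
The array-based path can be sketched in plain Scala as follows. `ArrayProduct` is referenced in the API above but its definition is not shown, so the version here is a guess at a minimal implementation. `reifyLabelled` is a hypothetical, hand-written counterpart of a generated `reify`, and it works on the `Product` directly rather than on a full `Mirror` to keep the example short.

```scala
// Assumed minimal definition: a Product view over an array, without copying.
final class ArrayProduct(val elems: Array[AnyRef]) extends Product {
  def productArity: Int = elems.length
  def productElement(n: Int): Any = elems(n)
  def canEqual(that: Any): Boolean = true
}

final case class Labelled[T](x: T, label: String)

// Hypothetical hand-written reify step: rebuild the ADT instance from
// weakly typed elements, casting where the static type is known.
def reifyLabelled(elems: Product): Labelled[Any] =
  Labelled(elems.productElement(0), elems.productElement(1).asInstanceOf[String])

// What an `unpickle`-style method might do: fill an array first ...
val buffer = new Array[AnyRef](2)
buffer(0) = Int.box(7)
buffer(1) = "seven"

// ... then wrap it as a Product and rebuild the instance.
val restored = reifyLabelled(new ArrayProduct(buffer))
```

The array is filled before it is wrapped, which matches the description of the third `mirror` overload: it allocates an empty array "to be filled in".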
211+
212+
### How to Write Generic Typeclasses
213+
214+
Based on the machinery developed so far it becomes possible to define type classes generically. This means that the `derived` method will compute a type class instance for any ADT that has a `Shaped` instance, recursively.
215+
The implementation of these methods typically uses three new typelevel constructs in Dotty: inline methods, inline matches and implicit matches. As an example, here is one possible implementation of a generic `Eq` type class, with explanations. Let's assume `Eq` is defined by the following trait:
216+
```scala
217+
trait Eq[T] {
218+
def eql(x: T, y: T): Boolean
219+
}
220+
```
221+
We need to implement a method `Eq.derived` that produces an instance of `Eq[T]` provided
222+
there exists evidence of type `Shaped[T, S]` for some shape `S`. Here's a possible solution:
223+
```scala
224+
inline def derived[T, S <: Shape] with (ev: Shaped[T, S]): Eq[T] = new Eq[T] {
225+
def eql(x: T, y: T): Boolean = {
226+
val mx = ev.reflect(x) // (1)
227+
val my = ev.reflect(y) // (2)
228+
inline erasedValue[S] match {
229+
case _: Cases[alts] =>
230+
mx.ordinal == my.ordinal && // (3)
231+
eqlCases[alts](mx, my, 0) // [4]
232+
case _: Case[_, elems] =>
233+
eqlElems[elems](mx, my, 0) // [5]
234+
}
235+
}
236+
}
237+
```
238+
The implementation of the inline method `derived` creates an instance of `Eq[T]` and implements its `eql` method. The right hand side of `eql` mixes compile-time and runtime elements. In the code above, runtime elements are marked with a number in parentheses, i.e.
239+
`(1)`, `(2)`, `(3)`. Compile-time calls that expand to runtime code are marked with a number in brackets, i.e. `[4]`, `[5]`. The implementation of `eql` consists of the following steps.
240+
241+
1. Map the compared values `x` and `y` to their mirrors using the `reflect` method of the implicitly passed `Shaped` evidence `(1)`, `(2)`.
242+
2. Match at compile-time against the type `S`. Dotty does not have a construct for matching types directly, but we can emulate it using an `inline` match over an `erasedValue`. Depending on the actual type `S`, the match will reduce at compile time to one of its two alternatives.
243+
3. If `S` is of the form `Cases[alts]` for some tuple `alts` of alternative types, the equality test consists of comparing the ordinal values of the two mirrors `(3)` and, if they are equal, comparing the elements of the case indicated by that ordinal value. That second step is performed by code that results from the compile-time expansion of the `eqlCases` call `[4]`.
244+
4. If `S` is of the form `Case[_, elems]` for some tuple `elems` of element types, the elements of the case are compared by code that results from the compile-time expansion of the `eqlElems` call `[5]`.
245+
246+
Here is a possible implementation of `eqlCases`:
247+
```scala
248+
inline def eqlCases[Alts <: Tuple](mx: Mirror, my: Mirror, n: Int): Boolean =
249+
inline erasedValue[Alts] match {
250+
case _: (Shape.Case[_, elems] *: alts1) =>
251+
if (mx.ordinal == n) // (6)
252+
eqlElems[elems](mx, my, 0) // [7]
253+
else
254+
eqlCases[alts1](mx, my, n + 1) // [8]
255+
case _: Unit =>
256+
throw new MatchError(mx.ordinal) // (9)
257+
}
258+
```
259+
The inline method `eqlCases` takes as type arguments the alternatives of the ADT that remain to be tested. It takes as value arguments mirrors of the two instances `x` and `y` to be compared and an integer `n` that indicates the ordinal number of the case that is tested next. It produces an expression that compares these two values.
260+
261+
If the list of alternatives `Alts` consists of a case of type `Case[_, elems]`, possibly followed by further cases in `alts1`, we generate the following code:
262+
263+
1. Compare the `ordinal` value of `mx` (a runtime value) with the case number `n` (a compile-time value translated to a constant in the generated code) in an if-then-else `(6)`.
264+
2. In the then-branch of the conditional we have that the `ordinal` value of both mirrors
265+
matches the number of the case with elements `elems`. Proceed by comparing the elements
266+
of the case in code expanded from the `eqlElems` call `[7]`.
267+
3. In the else-branch of the conditional we have that the present case does not match
268+
the ordinal value of both mirrors. Proceed by trying the remaining cases in `alts1` using
269+
code expanded from the `eqlCases` call `[8]`.
270+
271+
If the list of alternatives `Alts` is the empty tuple, there are no further cases to check.
272+
This place in the code should not be reachable at runtime, so an appropriate
273+
implementation is to throw a `MatchError` or some other runtime exception `(9)`.
274+
275+
The `eqlElems` method compares the elements of two mirrors that are known to have the same
276+
ordinal number, which means they represent the same case of the ADT. Here is a possible
277+
implementation:
278+
```scala
279+
inline def eqlElems[Elems <: Tuple](xs: Mirror, ys: Mirror, n: Int): Boolean =
280+
inline erasedValue[Elems] match {
281+
case _: (elem *: elems1) =>
282+
tryEql[elem]( // [12]
283+
xs(n).asInstanceOf[elem], // (10)
284+
ys(n).asInstanceOf[elem]) && // (11)
285+
eqlElems[elems1](xs, ys, n + 1) // [13]
286+
case _: Unit =>
287+
true // (14)
288+
}
289+
```
290+
`eqlElems` takes as arguments the two mirrors of the elements to compare and a compile-time index `n`, indicating the index of the next element to test. It is defined in terms of another compile-time match, this time over the tuple type `Elems` of all element types that remain to be tested. If that type is
291+
non-empty, say of form `elem *: elems1`, the following code is produced:
292+
293+
1. Access the `n`'th elements of both mirrors and cast them to the current element type `elem`
294+
`(10)`, `(11)`. Note that because of the way runtime reflection mirrors compile-time `Shape` types, the casts are guaranteed to succeed.
295+
2. Compare the element values using code expanded by the `tryEql` call `[12]`.
296+
3. "And" the result with code that compares the remaining elements using a recursive call
297+
to `eqlElems` `[13]`.
298+
299+
If type `Elems` is empty, there are no more elements to be compared, so the comparison's result is `true` `(14)`.
300+
301+
Since `eqlElems` is an inline method, its recursive calls are unrolled. The end result is a conjunction `test_1 && ... && test_n && true` of test expressions produced by the `tryEql` calls.
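
For a concrete picture of that conjunction, here is a hand-unrolled sketch for `Labelled[Int]` in plain Scala. The instances `intEq` and `stringEq` and the function `eqlLabelled` are hypothetical names chosen for this example. The real expansion would read the elements from mirrors, whereas this sketch reads them from the case class directly, which is itself a `Product`.

```scala
trait Eq[T] {
  def eql(x: T, y: T): Boolean
}

// Hypothetical element instances, standing in for implicits found by `tryEql`.
val intEq: Eq[Int] = new Eq[Int] {
  def eql(x: Int, y: Int): Boolean = x == y
}
val stringEq: Eq[String] = new Eq[String] {
  def eql(x: String, y: String): Boolean = x == y
}

final case class Labelled[T](x: T, label: String)

// Roughly the shape `test_1 && test_2 && true` that unrolling produces:
// one test per element, then the `true` from the empty-tuple case.
def eqlLabelled(a: Labelled[Int], b: Labelled[Int]): Boolean =
  intEq.eql(a.productElement(0).asInstanceOf[Int],
            b.productElement(0).asInstanceOf[Int]) &&
  stringEq.eql(a.productElement(1).asInstanceOf[String],
               b.productElement(1).asInstanceOf[String]) &&
  true
```

The casts mirror steps `(10)` and `(11)`: the elements are weakly typed when read from the `Product`, and the element type known at compile time is reasserted before the `Eq` instance is applied.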
302+
303+
The last, and in a sense most interesting part of the derivation is the comparison of a pair of element values in `tryEql`. Here is the definition of this method:
304+
```scala
305+
inline def tryEql[T](x: T, y: T) = implicit match {
306+
case ev: Eq[T] =>
307+
ev.eql(x, y) // (15)
308+
case _ =>
309+
error("No `Eq` instance was found for $T")
310+
}
311+
```
312+
`tryEql` is an inline method that takes an element type `T` and two element values of that type as arguments. It is defined using an `implicit match` that tries to find an implicit instance of `Eq[T]`. If an instance `ev` is found, it proceeds by comparing the arguments using `ev.eql`. On the other hand, if no instance is found,
313+
this signals a compilation error: the user tried a generic derivation of `Eq` for a class with an element type that does not support an `Eq` instance itself. The error is signalled by
314+
calling the `error` method defined in `scala.compiletime`.
315+
316+
**Note:** At the moment, our error diagnostics for metaprogramming do not yet support interpolated string arguments for the `scala.compiletime.error` method that is called in the second case above. As an alternative, one can simply leave off the second case; a missing typeclass would then result in a "failure to reduce match" error.
317+
318+
**Example:** Here is a slightly polished and compacted version of the code that's generated by inline expansion for the derived `Eq` instance of class `Tree`.
319+
320+
```scala
321+
impl Eq_Tree_impl[T] with (elemEq: Eq[T]) of Eq[Tree[T]] {
322+
def eql(x: Tree[T], y: Tree[T]): Boolean = {
323+
val ev = implOf[Shaped[Tree[T], Tree.Shape[T]]]
324+
val mx = ev.reflect(x)
325+
val my = ev.reflect(y)
326+
mx.ordinal == my.ordinal && {
327+
if (mx.ordinal == 0) {
328+
derived$Eq.eql(mx(0).asInstanceOf[Tree[T]], my(0).asInstanceOf[Tree[T]]) &&
329+
derived$Eq.eql(mx(1).asInstanceOf[Tree[T]], my(1).asInstanceOf[Tree[T]])
330+
}
331+
else if (mx.ordinal == 1) {
332+
elemEq.eql(mx(0).asInstanceOf[T], my(0).asInstanceOf[T])
333+
}
334+
else throw new MatchError(mx.ordinal)
335+
}
336+
}
337+
}
338+
```
339+
340+
One important difference between this approach and Scala-2 typeclass derivation frameworks such as Shapeless or Magnolia is that no automatic attempt is made to generate typeclass instances of elements recursively using the generic derivation framework. There must be an implicit instance of `Eq[T]` (which can of course be produced in turn using `Eq.derived`), or the compilation will fail. The advantage of this more restrictive approach to typeclass derivation is that it avoids uncontrolled transitive typeclass derivation by design. This keeps code sizes smaller, compile times lower, and is generally more predictable.
341+
342+
### Derived Instances Elsewhere
343+
344+
Sometimes one would like to derive a typeclass instance for an ADT after the ADT is defined, without being able to change the code of the ADT itself.
345+
To do this, simply define an instance with the `derived` method of the typeclass as right hand side. E.g., to implement `Ordering` for `Option`, define:
346+
```scala
347+
impl [T: Ordering] of Ordering[Option[T]] = Ordering.derived
348+
```
349+
Usually, the `Ordering.derived` method has an implicit parameter of type `Shaped[Option[T], Option.Shape[T]]`. Since the `Option` trait has a `derives` clause, the necessary implicit instance is already present in the companion object of `Option`. If the ADT in question does not have a `derives` clause, an implicit `Shaped` instance would still be synthesized by the compiler at the point where `derived` is called. This is similar to the situation with type tags or class tags: If no implicit instance is found, the compiler will synthesize one.
350+
351+
### Syntax
352+
353+
```
354+
Template          ::=  InheritClauses [TemplateBody]
355+
EnumDef           ::=  id ClassConstr InheritClauses EnumBody
356+
InheritClauses    ::=  [‘extends’ ConstrApps] [‘derives’ QualId {‘,’ QualId}]
357+
ConstrApps        ::=  ConstrApp {‘with’ ConstrApp}
358+
                    |  ConstrApp {‘,’ ConstrApp}
359+
```
360+
361+
### Discussion
362+
363+
The typeclass derivation framework is quite small and low-level. There are essentially two pieces of infrastructure that are generated by the compiler:
364+
365+
- The `Shape` type representing the shape of an ADT
366+
- A way to map between ADT instances and generic mirrors
367+
368+
Generic mirrors make use of the already existing `Product` infrastructure for case classes, which means they are efficient and their generation requires little code.
369+
370+
Generic mirrors can be so simple because, just like `Product`s, they are weakly typed. On the other hand, this means that code for generic typeclasses has to ensure that type exploration and value selection proceed in lockstep and it has to assert this conformance in some places using casts. If generic typeclasses are correctly written these casts will never fail.
371+
372+
It could make sense to explore a higher-level framework that encapsulates all casts in the framework. This could give more guidance to the typeclass implementer. It also seems quite possible to put such a framework on top of the lower-level mechanisms presented here.
