parametric polymorphism
1. intro
A parameterized type is really a function from parameter types to concrete types (see also kinds in Types in Haskell). How can we get parametric polymorphism or generics?
2. Templates
- Approach taken by by C++
- A generic class is written with type variables, which are to be later substituted with specific types when the class is instantiated
- What gets compiled? You can think of it like this: if an object of type MyGenericClass<Int> ever gets instantiated, then the compiler will create a new source file, based on the template MyGenericClass<E>, where \(E\) is everywhere substituted for <Int>. Then this is the file that actually gets compiled. This will be done for every such concrete \(T\) that appears in the program as MyGenericClass<T>.
- Downside: this can lead to coad bloat: we're essentially making multiple copies of code from a template, and the copies all share a good amount of duplicated material
- Upside: the compiler can be aggressive about optimizing performance for each such copy, since it knows specifics about
T
. This will not be the case for unconstrained polymorphism
3. Unconstrained polymorphism
- The approach taken by the ML family
- For example, consider a generic data structure that we are writing that will hold values of type \(E\)
- We can write our implementation, everywhere using the type \(E\)
- Any operations that take an
E
value as an argument will also need to be passed in as a first-class value - Then, our implementation can be compiled once (no copies!)
- Downside: typically, since we want our implementation to work with every possible parameter type, we need to make some simplifying assumptions for the compiler. For example, if our datastructure needs to store things, then we need to know the size of the parameter type values before hand. However, we can't possibly know that! So instead, we assume that it's a fixed word-length size, and if it's not, then we assume it's a fixed word-length pointer to a value on the heap. This is called boxing. The downside of all this is more things stored on the heap and more work for the garbage collector
4. subtype constrained polymorphism
- Can be seen in Java
- You can constrain your parameter type to implement an interface
- The interface is a contract of functions that the parameter type must implement
- Then, we can statically check whether the parameter type implements the operations needed by our generic class
- This is done using type erasure: in the source code the parameter type is replaced by the interface it implements
- question: As a consequence of this replacement we get type erasure. But is type erasure necessary to this process?
- This is done using type erasure: in the source code the parameter type is replaced by the interface it implements
- Downside: The subtype needs to be explicitly declared to implement an interface ahead of time. So if we're getting our subtype from a library that someone else wrote, we can't make it implement an interface, unless we wrap it another class
- Also uses boxing
5. Type class constraints
- Approach taken by haskell, ocaml
- To address the last bullet above, Haskell uses typeclasses
- So given some types, we can write a type-class that provides the methods necessary for a certain constraint
- We can do this, even if we don't have access to the source code for the types. This is retroactive modeling
- Each typeclass is specific to the parameter type, meaning that we can get run-time information about the type parameter
- contrast this with the Java approach above, in which we lose information about the type parameter through type erasure. We know that we have a list of Comparables, but we don't know what implements the comparable interface.
- question: why is this so good? I guess it lets us pattern match on the generic class, and then we can do something like dispatch on the parameter type to get more efficient implementation?