Table of Contents ¶
- Introductory Example
- Writing a Generic Class
- Writing a Generic Trait and Interface
- Writing a Generic Method
- Generics and Type Inference
- Override on Return Type
- T<mixed> Compatibility
- Constraints
- Open and Closed Types
- Style Guidelines
Hack introduces generics to PHP (in the same vein as statically typed languages such as C# and Java). Generics allow classes and methods to be parameterized (i.e., a type is associated when a class is instantiated or a method is called). Here is an example:
<?hh
class Box<T> {
protected T $data;
public function __construct(T $data) {
$this->data = $data;
}
public function getData(): T {
return $this->data;
}
}
The Box class has a generic type parameter, T, associated with it. For any instance of Box, T can be given any available type. In addition, the method getData() returns a T, meaning that the method will return the same type as associated with Box at instantiation. Note, though, that the getData() method is not itself generic; it is just accessing the top-level type parameter of Box. A client use of Box would look like the following:
<?hh
function main_gen() {
$gi = new Box(3);
$gs = new Box("Hi");
$ga = new Box(array());
echo $gi->getData()."\n";
echo $gs->getData()."\n";
echo $ga->getData()."\n";
}
main_gen();
The above example will output:
3
Hi
Array
Parameterization of a class or method provides the following, related, benefits:
- Generic code can be statically checked (in this case using the Hack type checker). Without generics, code that does something like adding similarly typed items to a list would need instanceof checks and casts at runtime.
- Clients of generic code can (currently through inference) specify the particular data type to be used, making code more readable and maintainable.
Note: HHVM allows syntax such as $x = Vector<int>{5,10};, but Hack disallows the <int> syntax in this situation, instead opting to infer it.
Introductory Example ¶
An example of a generic class is Vector<T>, from the Hack collections implementation. The T is what makes Vector generic, as it can hold any type of object, from int to a user-defined class. However, for any instantiation of the class, once a type has been associated with T, it cannot hold any other type.
<?hh
/* Signature of Vector
*
* class Vector<Tv> implements MutableCollection<Tv> {
* :
* }
*
*/
function main_vec() {
$x = Vector {1, 2, 3, 4}; // T is associated with int
$y = Vector {'a', 'b', 'c', 'd'}; // T is associated with string
}
main_vec();
$x is a Vector<int>, while $y is a Vector<string>. A Vector<int> and a Vector<string> are not the same type.
Methods can also be generic, even when the class is not. The usual cases are a generic class with non-generic methods, or a non-generic class with generic methods. A generic class with generic methods is possible, but it is the exception rather than the rule; it can be useful when the method is something like a compare method whose type parameter differs from that of the class. Here is a contrived example of a generic method in a non-generic class:
<?hh
// Testing generic methods in a non-generic class.
class Box<T> {
public T $value;
public function __construct(T $v) {
$this->value = $v;
}
}
class FooGenMethod {
public function swap<T>(Box<T> $a, Box<T> $b) : void {
$temp = $a->value;
$a->value = $b->value;
$b->value = $temp;
}
}
function main_genmeth() {
$f = new FooGenMethod();
$y = new Box(3);
$z = new Box(4);
echo $y->value." ".$z->value."\n";
$f->swap($y, $z);
echo $y->value." ".$z->value."\n";
}
main_genmeth();
The above example shows a generic method swap<T>() in a non-generic class FooGenMethod.
Note: In the swap<T>() method, an object was used instead of a primitive for the parameter. This is because references to primitives are not typed (e.g., int &$a).
Generics allow developers to write one class or method that can be parameterized to any type, all while preserving type safety. Without generics, a similar model would require treating everything as a top-level object, along with many instanceof checks and casts to the appropriate type.
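To make the contrast concrete, here is a minimal sketch of what a Box might look like without generics. UntypedBox, add_one() and main_untyped() are hypothetical names introduced only for this illustration; the point is that the holder has to fall back to mixed and every consumer has to re-check the type at runtime.

```hack
<?hh
// Hypothetical non-generic counterpart of Box<T>: everything is mixed.
class UntypedBox {
  protected mixed $data;
  public function __construct(mixed $data) {
    $this->data = $data;
  }
  public function getData(): mixed {
    return $this->data;
  }
}

function add_one(UntypedBox $box): int {
  $d = $box->getData();
  // The consumer must re-discover the type at runtime.
  if (!is_int($d)) {
    throw new InvalidArgumentException('Expected a box holding an int');
  }
  return $d + 1;
}

function main_untyped(): void {
  echo add_one(new UntypedBox(3))."\n";
  // Nothing stops the following call statically; it would only fail at runtime:
  // add_one(new UntypedBox("Hi"));
}
main_untyped();
```

With Box<T>, a function declared to take a Box<int> needs no such check, because the type checker rejects a Box<string> argument before the code runs.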
Writing a Generic Class ¶
When writing a generic class, keep the following guidelines in mind:
- The generic parameter must start with a capital letter T.
- The generic class must not collide with any existing, non-generic class name (i.e., class Vec and class Vec<T>).
- Inside the generic class, T can be referred to in class method arguments as well as the return type.
- Nullable T is supported (i.e., ?T).
- Hack does not currently support casting to T or creating new instances of T.
Here is an example of a generic class:
<?hh
// The mailbox can contain any type, but, per instantiation, once associated
// with a type, it cannot change. Mailbox<int>, Mailbox<string>, Mailbox<mixed>
class Mailbox<T> {
private ?T $data;
public function __construct() {
$this->data = null;
}
public function put(T $mail): void {
$this->data = $mail;
}
public function check(): ?T {
if ($this->data !== null) {
return $this->data;
}
return null;
}
}
Writing a Generic Trait and Interface ¶
Like classes, traits and interfaces can be generic as well. The guidelines are similar to those for classes. Here is an example:
<?hh
// Copyright 2004-present Facebook. All Rights Reserved.
// generic interface
interface Box<T> {
public function add(T $item): void;
public function remove(): T;
}
// generic trait
trait Commerce<T> {
public function buy(T $item): void {
echo 'Bought a '.get_class($item)."\n";
}
public function sell(T $item): void {
echo 'Sold a '.get_class($item)."\n";
}
}
// generic class that uses generic trait and implements generic interface
class BestBuy<T> implements Box<T> {
protected Vector<T> $vec;
private int $last;
public function __construct() {
$this->vec = Vector{};
$this->last = -1;
}
use Commerce<T>;
public function add(T $item): void {
$this->vec->add($item);
$this->last++;
}
public function remove(): T {
$item = $this->vec->at($this->last);
$this->vec->removeKey($this->last--);
return $item;
}
}
// For example purposes
abstract class Computer {}
class Apple extends Computer {}
class Lenovo extends Computer {}
class Dell extends Computer {}
function main_gti() {
$store = new BestBuy();
$store->add(new Lenovo());
$store->add(new Apple());
$store->add(new Dell());
echo get_class($store->remove())."\n";
$store->sell($store->remove());
}
main_gti();
The above example will output:
Dell
Sold a Apple
Writing a Generic Method ¶
When writing a generic method, keep the following guidelines in mind:
- The generic parameter must start with a capital letter T.
- The generic method must not collide with any existing, non-generic method name (i.e., public function swap and public function swap<T>).
- Inside the generic method, T can be referred to for the return type.
- Nullable T is supported (i.e., ?T).
- Hack does not currently support casting to T or creating new instances of T.
Here is an example of a generic method:
<?hh
class FooGenMethod {
public function swap<T>(Box<T> $a, Box<T> $b) : T {
$temp = $a->value;
$a->value = $b->value;
$b->value = $temp;
return $temp;
}
}
Generics and Type Inference ¶
Generics may have some surprising semantics when it comes to type inference. This code example will be used to discuss how generics and type inference are handled within Hack.
<?hh // strict
class A {}
class Z extends A {}
function foo_gi(): void {
$x = Vector {};
$x->add(new A());
$x->add(new Z());
}
function foo_gi2(): void {
$x = Vector {};
$x->add(new Z());
$x->add(new A());
foo_gi4($x);
}
function foo_gi3(bool $b): void {
$x = Vector {};
$x->add(new Z());
$x->add(new A());
foo_gi5($x);
}
function foo_gi4(Vector<Z> $vec): void {}
function foo_gi5(Vector<A> $vec): void {}
function foo_gi6(Vector<mixed> $vec): void {}
As probably expected, Hack allows an element of type Z to be added to a Vector<A>. Possibly surprisingly, the reverse is also true, to a point. Some programming languages require the type associated with the generic class to be known at instantiation time. Hack, on the other hand, can begin determining the type of the generic upon first usage.
In function foo_gi2(), a new, empty Vector is assigned to $x. However, since the Vector was empty, the type associated with $x was not declared. After the first call to $x->add(), the type association for $x begins. Now the type associated with $x is still unresolved, but allows a Z. After the second call to $x->add(), the type association is still unresolved, but allows a Z or an A. For all intents and purposes, at this point $x is a Vector<Unresolved[Z, A]>.
Things get interesting once the ->add() calls are completed. As soon as the created Vector is exposed to the outside world, either by passing it to a method/function or by returning it from one, its type becomes established. Effectively, $x is now a Vector<mixed>, but with inheritance properties. Calling foo_gi4($x) becomes problematic to the type checker. foo_gi4() takes a Vector<Z>. However, $x has an A in it, and a Vector<Z> cannot hold an element whose type is a parent of Z.
Calling foo_gi5($x) would not cause a type checker error since foo_gi5() takes a Vector<A> and since Z is a child of A, all is fine. Calling foo_gi6($x) would be ok in all circumstances since foo_gi6() takes a Vector<mixed>.
The bottom line here is that until this generic collection is exposed to code that takes or returns a type that will cause incompatibility, the type checker will be relaxed as to what it allows to be added.
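The exposure point can also be a return statement. The following is a minimal sketch that reuses the A and Z classes from the example above under a hypothetical function name, foo_gi7(). Going by the rules described in this section, the type checker would be expected to accept it, since every element added is an A or a child of A; this has not been checked against a particular version of the type checker.

```hack
<?hh // strict
class A {}
class Z extends A {}

// Hypothetical helper, not part of the original example.
function foo_gi7(): Vector<A> {
  $x = Vector {};
  $x->add(new Z()); // $x is unresolved, but allows a Z
  $x->add(new A()); // $x is unresolved, but allows a Z or an A
  // Returning $x exposes it; Vector<A> is compatible with both elements.
  return $x;
}
```

Declaring the return type as Vector<Z> instead would be expected to fail for the same reason that the call to foo_gi4($x) does.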
Override on Return Type ¶
Hack allows a method to be overridden on return type in one particular circumstance: a subclass may override a method with a different return type if that type is strictly compatible with the return type of the same method in the superclass. So, this works:
<?hh
class Foo {}
class FooChild extends Foo {}
class AA {
protected function bar(): Foo { return new Foo(); }
}
class BB extends AA {
protected function bar(): FooChild { return new FooChild(); }
}
Take a look at this seemingly similar example:
<?hh
class FooG {}
class FooGChild extends FooG {}
class AAG {
protected function bar(): Vector<FooG> {
$x = new Vector();
$x->add(new FooG());
return $x;
}
}
class BBG extends AAG {
protected function bar(): Vector<FooGChild> {
$x = new Vector();
$x->add(new FooGChild());
return $x;
}
}
On the surface, it may seem like the above should be accepted by the Hack type checker. However, a closer look indicates otherwise. The parent, AAG, has a bar() that returns a vector containing FooG. The child, BBG, has a bar() that returns a vector containing FooGChild. In other words, both return a Vector, not individual instances of FooG or FooGChild. Since the return type is a Vector, it is impossible to guarantee that the vector returned from BBG::bar() won't contain, for example, a mix of FooGChild and FooG, even to a caller which called it as AAG::bar().
The above example will output:
File "overriding_generics.php", line 15, characters 7-9:
This object is of type BBG
File "overriding_generics.php", line 15, characters 19-21:
It is incompatible with this object of type AAG
Because some of their methods are incompatible, Read the following to see why:
File "overriding_generics.php", line 8, characters 36-39:
This is an object of type FooG
File "overriding_generics.php", line 16, characters 36-44:
It is incompatible with an object of type FooGChild
T<mixed> Compatibility ¶
mixed is sometimes a source of confusion when it comes to compatibility with other types. This confusion can become exacerbated when it comes to generics. For example, should this pass the Hack type checker?
<?hh
class Mailbox<T> {
private ?T $data;
public function __construct() {
$this->data = null;
}
public function put(T $mail): void {
$this->data = $mail;
}
public function check(): ?T {
if ($this->data !== null) {
return $this->data;
}
return null;
}
}
function mbint(): Mailbox<int> {
$mbi = new Mailbox();
$mbi->put(3);
return $mbi;
}
function mbmixed(Mailbox<mixed> $mbm): void {}
function main() {
$m = mbint();
mbmixed($m);
}
Reading the code, one might believe that passing a Mailbox<int> to a function that takes a Mailbox<mixed> should pass the type checker since mixed should be a superset of int. However, that belief would be incorrect. A Mailbox<int> is not a subtype of Mailbox<mixed>.
Imagine if the code above was modified to look like this:
<?hh
// The mailbox can contain any type, but, per instantiation, once associated
// with a type, it cannot change. Mailbox<int>, Mailbox<string>, Mailbox<mixed>
class Mailbox<T> {
private ?T $data;
public function __construct() {
$this->data = null;
}
public function put(T $mail): void {
$this->data = $mail;
}
public function check(): ?T {
if ($this->data !== null) {
return $this->data;
}
return null;
}
}
function mbint(): Mailbox<int> {
$mbi = new Mailbox();
$mbi->put(3);
return $mbi;
}
function mbmixed(Mailbox<mixed> $mbm): void {
// Put a string into the mixed Mailbox
$mbm->put("Hello");
}
function main() {
$m = mbint();
// This function puts a string into the Mailbox
mbmixed($m);
// Now what was a Mailbox<int> becomes a Mailbox<string>. Probably not expected behavior.
var_dump($m);
}
main();
Since generic objects are passed by reference, adding a string to the Mailbox<mixed> in mbmixed() actually transforms the Mailbox<int> in main() to one that can now take a string. This is probably not expected behavior and should not be able to pass the type checker. And, in fact, this code does not pass the Hack type checker!
The above example will output:
File "/tmp/mailbox.php", line 36, characters 11-12:
Invalid argument
File "/tmp/mailbox.php", line 30, characters 26-30:
This is a mixed value
File "/tmp/mailbox.php", line 24, characters 27-29:
It is incompatible with an int
The above code does run on HHVM, however. And that is why this type of code needs to be checked before runtime.
The above example will output:
object(Mailbox)#1 (1) {
  ["data":"Mailbox":private]=>
  string(5) "Hello"
}
Even though Hack is able to catch a T<int> to T<mixed> conversion attempt before runtime, a cleaner way to write the above code is to use a generic method:
<?hh
// The mailbox can contain any type, but, per instantiation, once associated
// with a type, it cannot change. Mailbox<int>, Mailbox<string>, Mailbox<mixed>
class Mailbox<T> {
private ?T $data;
public function __construct() {
$this->data = null;
}
public function put(T $mail): void {
$this->data = $mail;
}
public function check(): ?T {
if ($this->data !== null) {
return $this->data;
}
return null;
}
}
function mbint(): Mailbox<int> {
$mbi = new Mailbox();
$mbi->put(3);
return $mbi;
}
function mbgen<T>(Mailbox<T> $mbm, T $item): void {
$mbm->put($item);
}
function main() {
$m = mbint();
mbgen($m, 4);
var_dump($m);
}
main();
Constraints ¶
Generic classes and methods can restrict the types that may be used for the type parameter upon instantiation or call, respectively. These restrictions are called constraints on type parameters. Here is an example:
<?hh
abstract class Identity<T> {
private T $id;
public function __construct(T $id) {
$this->id = $id;
}
public function getID(): T {
return $this->id;
}
}
interface IFoo {}
class CA {}
class CB extends CA {}
class Bar implements IFoo {}
class Baz implements IFoo {}
class Biz {}
final class AnyIdentity<T> extends Identity<T> {}
final class CAIdentity<T as CA> extends Identity<T> {}
final class CBIdentity<T as CB> extends Identity<T> {}
final class FooIdentity<T as IFoo> extends Identity<T> {}
function main_constraints(): void {
$ai = new AnyIdentity("Hello");
$ai2 = new AnyIdentity(new Biz());
$cb = new CBIdentity(new CB());
$cb2 = new CBIdentity(new CA()); // HACK ERROR!
$ca = new CAIdentity(new CA());
$ca2 = new CAIdentity(new CB());
$fi = new FooIdentity(new Bar());
$fi2 = new FooIdentity(new Baz());
$fi3 = new FooIdentity(new Biz()); // HACK ERROR!
}
main_constraints();
Constraints are declared using the as keyword. In the above example, Identity is an abstract class with a parameterized type T. This class cannot be instantiated; thus, concrete class implementations must be created in order to access the functionality of Identity. Four final classes were created that extend Identity. AnyIdentity is a concrete implementation of Identity. It has no constraints. Any type may be passed in for T. The other three classes do have constraints:
- CAIdentity has a constraint that only the type CA (and its children!) may be used for T.
- CBIdentity has a constraint that only the type CB (and its children!) may be used for T.
- FooIdentity has a constraint that only those types that implement the IFoo interface (and their children!) may be used for T.
For the above example, there will be two errors thrown by the Hack type checker. The first is on this line of code:
$cb2 = new CBIdentity(new CA());
The above example will output:
File "hack_constraints.php", line 27, characters 29-30:
This is an object of type CB
File "hack_constraints.php", line 38, characters 25-32:
It is incompatible with an object of type CA
File "hack_constraints.php", line 17, characters 31-31:
Considering the constraint on the type 'T'
This is because CBIdentity has a CB constraint and CA is NOT a child of CB. It is actually a parent.
The second Hack error is on this line of code:
$fi3 = new FooIdentity(new Biz());
The above example will output:
File "hack_constraints.php", line 28, characters 30-33:
This is an object of type IFoo
File "hack_constraints.php", line 42, characters 26-34:
It is incompatible with an object of type Biz
File "hack_constraints.php", line 17, characters 31-31:
Considering the constraint on the type 'T'
This error occurs because Biz does not implement the interface IFoo, where IFoo is a constraint on FooIdentity.
Here is an example of a constraint on a generic method:
<?hh
interface IFaz {}
interface IFar extends IFaz {}
class Faz implements IFaz {}
class Far implements IFar {}
class NoIFaz {}
class Bip {
public function get<T as IFaz>(T $x): T {
return $x;
}
}
function main_constraints_gm(): void {
$bip = new Bip();
$bip->get(new Faz());
$bip->get(new Far());
$bip->get(new NoIFaz()); // Hack error
}
main_constraints_gm();
The generic method get() has a constraint that only types that implement IFaz (or their children!) may be used as the type parameter T to the method. Since the class Far implements IFar, and IFar in turn derives from IFaz, a Far is able to be passed to get(). However, a NoIFaz is not able to be passed to get() since it has no chain that leads to an IFaz. Here is the Hack error that will occur:
The above example will output:
File "hack_constraints_method.php", line 11, characters 28-31:
This is an object of type IFaz
File "hack_constraints_method.php", line 20, characters 13-24:
It is incompatible with an object of type NoIFaz
File "hack_constraints_method.php", line 11, characters 41-41:
Considering the constraint on the type 'T'
A possible question that one may have when reading about constraints is, "Why have a constraint? Why not just have the class in question extend the class that is being used as the constraint?" Here is an example that addresses that question:
<?hh
class A {
public function getVal(): int {
return 1;
}
}
class B extends A {
public function getVal(): int {
return 2;
}
public function foo(): void {}
}
class Box<T> {
private T $data;
public function __construct(T $x) {
$this->data = $x;
}
public function get(): T {
return $this->data;
}
}
class BoxOfA<T> extends Box<A> {
private int $sum = 0;
public function __construct(T $e) {
parent::__construct($e);
$this->sum += $e->getVal();
}
}
function main_con(): void {
$b = new B();
$box = new BoxOfA($b);
$b2 = $box->get();
$b2->foo();
}
The above example will output:
File "constraints.php", line 42, characters 8-10:
The method foo is undefined in an object of type A
Since BoxOfA<T> extends Box<A>, the type of $box returned upon instantiation is a Box<A> even when BoxOfA is passed a B. Thus, trying to call foo() on $b2 cannot work since there is no foo() defined in A.
Note: Since generic types are dropped at runtime, HHVM will not throw an error when $b2->foo() is called. Instead, it will provide the expected output:
The above example will output:
object(B)#1 (0) {
}
object(BoxOfA)#2 (2) {
  ["sum":"BoxOfA":private]=>
  int(2)
  ["data":"Box":private]=>
  object(B)#1 (0) {
  }
}
object(B)#1 (0) {
}
bool(true)
Constraints can remedy this situation. With a few small changes to the example, a BoxOfA is now constrained to taking an A or its children! So when a B (a child of A) is passed to BoxOfA, a B is returned when calling $box->get(). Thus, $b2->foo() is a perfectly legitimate call. No Hack errors!
<?hh
class A {
public function getVal(): int {
return 1;
}
}
class B extends A {
public function getVal(): int {
return 2;
}
public function foo(): bool {
return true;
}
}
class Box<T> {
private T $data;
public function __construct(T $x) {
$this->data = $x;
}
public function get(): T {
return $this->data;
}
}
// class BoxOfA<T> extends Box<A>
class BoxOfA<T as A> extends Box<T> {
private int $sum = 0;
public function __construct(T $e) {
parent::__construct($e);
$this->sum += $e->getVal();
}
}
function main_con(): void {
$b = new B();
var_dump($b);
$box = new BoxOfA($b);
var_dump($box);
$b2 = $box->get();
var_dump($b2);
var_dump($b2->foo());
}
main_con();
A couple of important notes about constraints and the Hack type checker:
- Currently, Hack only allows one constraint on a type or a method.
- Constraints cannot be used with primitive types. Since primitives are not inheritable, it wouldn't make sense in such a context. Just have the method take the primitive directly.
Open and Closed Types ¶
With generics, it is useful to discuss the concept of open and closed types. A type is open if it is a type parameter (e.g., T). Obviously enough, a type is closed if it is not open (e.g., int). Here is an example of using open and closed types when it comes to generics:
<?hh
class Foo<T> {
public function getOpen(Vector<T> $vec): T {
return $vec[0];
}
public function getClosed(Vector<int> $vec): int {
return $vec[0];
}
}
The method getOpen() takes an open type and returns an open type. The method getClosed() takes a closed type and returns a closed type.
What happens, though, when a method takes an open type and tries to return a closed type using that open type? Take this example using two generic methods:
<?hh
function foo<T>(T $a): void {
echo "foo<T>()";
}
function bar<T>(T $a): int {
echo "bar<T>()";
return $a;
}
function main_oct() {
$x = 5;
foo($x);
bar($x);
}
main_oct();
The above example will output:
File "open_type_arguments.php", line 9, characters 10-11:
Invalid return type
File "open_type_arguments.php", line 7, characters 24-26:
This is an int
File "open_type_arguments.php", line 7, characters 17-17:
It is incompatible with of type T
Since bar() takes an open type argument, $a is not guaranteed to be compatible with an int. So it cannot be returned as an int, even if $a is an int when passed in. foo() returns void. Thus, as written, foo() can type check correctly. However, what happens if foo() attempts to use $a as an int?
<?hh
function foo<T>(T $a): void {
$x = $a + 10;
echo "foo<T>()";
}
/*
function bar<T>(T $a): int {
echo "bar<T>()";
return $a;
}*/
function main_oct() {
$x = 5;
foo($x);
//bar($x);
}
main_oct();
The above example will output:
File "open_type_arguments.php", line 5, characters 8-9:
Typing error
File "open_type_arguments.php", line 5, characters 8-9:
This is an int/float because this is used in an arithmetic operation
File "open_type_arguments.php", line 4, characters 17-17:
It is incompatible with of type T
As probably expected, $a cannot be guaranteed to be an int since it is passed as an open type T. Thus this arithmetic operation cannot be performed.
How can these issues be resolved? These are possibilities, but should be used with caution:
- Casting: bar() could cast $a as int before returning.
- Return an open type: bar() can return a T instead of an int.
- // UNSAFE: Use Hack's UNSAFE annotation to get around the type checker. Use with extreme caution.
- Constraints: Assuming primitives aren't involved (they are not supported), constraints can be used to help guarantee a return type.
- Rework code: Maybe if this situation arises, the code can be redesigned better.
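Two of these options, returning an open type and using a constraint, can be sketched as follows. This is only an illustration under stated assumptions: bar_open(), bar_constrained(), the HasVal interface, the Five class and main_oct_fixed() are hypothetical names that do not appear in the examples above.

```hack
<?hh
// Option "return an open type": bar_open() only promises to give back
// whatever type it was given, so no int guarantee is required.
function bar_open<T>(T $a): T {
  return $a;
}

// Option "constraints": with "T as HasVal", the type checker knows that
// every T offers getVal(), so an int can be produced from it.
interface HasVal {
  public function getVal(): int;
}
class Five implements HasVal {
  public function getVal(): int {
    return 5;
  }
}
function bar_constrained<T as HasVal>(T $a): int {
  return $a->getVal();
}

function main_oct_fixed(): void {
  echo bar_open(5)."\n";
  echo bar_constrained(new Five())."\n";
}
main_oct_fixed();
```

Neither version relies on // UNSAFE or a cast, so the type checker stays fully involved.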
Style Guidelines ¶
The following are recommended style guidelines for when using generics:
- Begin all type parameters with T.
- Name generic parameters with descriptive names (e.g., class Foo<TWrappedObject>), unless a single letter is self-explanatory or the generic parameter is a generic-generic parameter type.
- When using a non-descriptive name, a generic class or method with one parameter should generally be named <T>.
- When using a non-descriptive name, a generic class or method with two or more enumerated parameters should be named in the form of <Ta, Tb, ...> or <T1, T2,...>.
- A collection type with a key and value (such as a Map) should have its generic parameters named <Tk, Tv>.
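As a brief illustration of these naming guidelines, the declarations below apply each form. Holder, Wrapper, Pairing and Lookup are hypothetical interfaces made up for this sketch; they are not part of the Hack library.

```hack
<?hh
// One self-explanatory parameter: plain <T>.
interface Holder<T> {
  public function get(): T;
}

// A descriptive name, still starting with T.
interface Wrapper<TWrappedObject> {
  public function unwrap(): TWrappedObject;
}

// Two enumerated parameters: <Ta, Tb>.
interface Pairing<Ta, Tb> {
  public function first(): Ta;
  public function second(): Tb;
}

// Key and value parameters of a collection-like type: <Tk, Tv>.
interface Lookup<Tk, Tv> {
  public function find(Tk $key): ?Tv;
}
```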