Table of Contents ¶
- Introductory Example
- Why?
- What types can be used?
- How to type annotate
- Examples
- Type Casting
- Annotating Arrays
- Annotating Closures
- Annotating Constructors
- Annotating with this
- Mixed Types
- Passing By Reference
- Typing Generators
- Summary
Type annotations allow for PHP code to be explicitly typed on parameters, class member variables and return values (types are inferred for locals). These annotated types are checked via a type checker. Here are examples of the same code without and with type annotations:
<?php
class MyClass {
  const MyConst = 0;
  private $x;
  public function increment($x) {
    $y = $x + 1;
    return $y;
  }
}

<?hh
class MyClass {
  const int MyConst = 0;
  private string $x = '';
  public function increment(int $x): int {
    $y = $x + 1;
    return $y;
  }
}
It is clear that the second example provides more description and insight into the intention of the code. Type annotations provide three primary code improvements:
- Readability by helping other developers understand the purpose and intention of the code. Many use comments to express such annotations. Hack formalizes such annotations.
- Correctness by forbidding unsafe coding practices (e.g., sketchy null checks) as well as allowing tools to check type annotations before runtime.
- Refactorability by allowing Hack to inherit the reliable, automatic refactoring of a statically typed language. It is quite difficult to refactor a dynamically typed language such as PHP: renaming a function or changing its number of parameters requires a manual search of the code base to find all call sites. A type checker, however, will report a list of errors when a "breaking" change is made, and these can then be fixed one by one (or automatically with tooling).

Introductory Example ¶

<?hh
class AnnotatedClass {
  public int $x;
  private string $s;
  protected array $arr;
  public AnotherClass $ac;
  function bar(string $str, bool $b): float {
    if ($b && $str === "Hi") {
      return 3.2;
    }
    return 0.3;
  }
}

What should be noticeably different from PHP is that AnnotatedClass has type information on all member variables (whether public, private or protected), as well as on its function parameters and return types. The annotated types in this example are:
- int for $x.
- string for $s.
- array for $arr.
- AnotherClass for $ac.
- string for $str in bar().
- bool for $b in bar().
- float as the return type of bar().

Why? ¶

Normally, dynamically typed languages allow variables to take on any type representation and allow this type representation to be changed on the fly. So, in PHP, a variable $x can be assigned an int and then, down the line, be assigned a string, all in the same local scope. In other words, not only is the value of a variable mutable, the type of a variable is mutable too. This ability allows for rapid prototyping, more concise code, and a lot of flexibility.

Dynamic typing also comes with a cost. Errors are only caught at runtime. There is no compile-time analysis for code optimization. (Virtual machines like HHVM can mitigate the optimization disadvantage of a dynamically typed language by providing an intermediate translation step before runtime, time that can be used for optimizations.) Bugs can go undetected for years and rear their ugly heads when, for example, a call to a method with an unexpected parameter type is made.

"Wait a second! Facebook.com, with its billion+ users, is written with tens of millions of lines of PHP! A dynamic language worked well for Facebook."

Yes, Facebook has done quite well with dynamically typed PHP. However, it is possible to bring some statically typed language features to PHP without affecting functionality and performance. With statically typed languages, errors can be caught before runtime. Code becomes more readable and self-explanatory. The likelihood of the rogue "calling a method with an unexpected parameter type" bug becomes very small, as these are caught before execution.

Hack helps bridge the gap between dynamically and statically typed languages by bringing to PHP features normally seen in statically typed languages. A primary goal of Hack was to bring these features while remaining as compatible as possible with current PHP codebases. Type annotations are a big step toward accomplishing this goal.

What types can be used? ¶

With Hack, almost every PHP type can be used for type annotations. These types can be used to annotate function arguments, return types or member variables.
- Primitive, basic types: int, float, string, bool, array (however, do not use the aliases double, integer, boolean, real)
- User-defined classes: Foo, Vector<some type>
- Mixed: mixed
- Void: void
- Nullable or optional: ?someType (e.g., ?int, ?bool)
- Typed arrays: array<Foo>, array<string, array<string, Foo>>
- Tuples: tuple(type1, type2, ...) to create a value, annotated as (type1, type2, ...) (e.g., (string, int))
- XHP elements: :x:frag, :x:base, :div, and the catch-all :xhp
- Generics: Foo<T>
- Closures: (function(type1, type2, ...): return_type)
- Resources: resource
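
As a rough sketch of how a few of these annotation forms look in practice, the following partial-mode code uses a typed array, a nullable return, a tuple return and a closure parameter. The function names are made up for illustration and are not part of Hack.

```hack
<?hh
// Hypothetical helpers, written only to illustrate the annotation forms above.

// Typed array parameter and nullable return: null signals "not found".
function find_index(array<string> $names, string $needle): ?int {
  foreach ($names as $i => $name) {
    if ($name === $needle) {
      return $i;
    }
  }
  return null;
}

// Tuple return: annotated as (string, int), created with tuple().
function name_and_length(string $name): (string, int) {
  return tuple($name, strlen($name));
}

// Closure parameter: annotated as (function(int): int).
function apply_twice((function(int): int) $f, int $x): int {
  return $f($f($x));
}
```

With these definitions, a caller of find_index() has to handle the null case before using the result as an int, and apply_twice() accepts any closure that takes an int and returns an int.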
The array type may only be used in the Hack default (i.e., // partial) or // decl modes. When arrays are used in // strict mode, Hack will report an error suggesting a collection class such as Vector or Map instead.

How to type annotate ¶

Hack annotates return types at the end of a function/method declaration rather than near the beginning, where languages like C# put them. This was done mainly for readability, but closures and searchability are further reasons for the position. With respect to closures, if the return type were annotated at the beginning of a function, PHP could interpret the return type as a constant string, thus ignoring the return type altogether. With respect to searchability, searching for "function foo" will produce more usable results than having to use wildcards or some other mechanism to find all the functions named foo(), regardless of return type.

Here is a matrix of many of the types available in Hack. It shows how the types are defined and used as a class member, a parameter and a return type.

| Type | Definition | Example Class Member Usage | Example Parameter Usage | Example Return Usage |
|---|---|---|---|---|
| Boolean | bool | bool $b = false; | function foo(bool $b) | : bool |
| Integer | int | int $i = 3; | function foo(int $i) | : int |
| Float | float | float $f = 3.14; | function foo(float $f) | : float |
| String | string | string $s = "Hello"; | function foo(string $s) | : string |
| Untyped Array (partial mode only) | array | array $x = array(); | function foo(array $arr) | : array |
| Array-as-vector | array<someType> | array<string> $arrs = array("hi", "bye"); | function foo(array<string> $arrs) | : array<string> |
| Array-as-map | array<keyType, valueType> | array<int, string> $arrs = array(42 => "answer"); | function foo(array<int, string> $arrs) | : array<int, string> |
| Generic Type | NonPrimitiveType<T> | T $t; | function foo(T $t) or function foo<T>(T $t) | : T |
| Vector | Vector<T> | protected Vector<int> $vec = Vector {3, 4}; | function foo(Vector<int> $vec) | : Vector<int> |
| Map | Map<Tk, Tv> | protected Map<string, int> $map = Map {"A" => 1, "B" => 2}; | function foo(Map<string, int> $map) | : Map<string, int> |
| Set | Set<Tv> | protected Set<int> $set = Set {1, 2}; | function foo(Set<int> $set) | : Set<int> |
| Pair | Pair<Tv1, Tv2> | protected Pair<int, string> $pair = Pair {7, 'a'}; | function foo(Pair<int, string> $pair) | : Pair<int, string> |
| User Object | FooClass | protected FooClass $a; | function foo(FooClass $a) | : FooClass |
| Void | void | N/A | N/A | : void |
| Mixed Type | mixed | protected mixed $m = 3; | function foo(mixed $m) | : mixed |
| Nullable | ?someType | protected ?int $ni = null; | function foo(?int $ni) | : ?int |
| Tuple | tuple(type1, type2) | protected (string, int) $tup = tuple("4", 4); | function foo((string, int) $tup) | : (string, int) |
| Closure | (function(type1, type2, …): returnType) | protected (function(int, int): string) $x; | function foo((function(int, int): string) $x) | : (function(int, int): string) |
| Resource | resource | $r = fopen('/dev/null', 'r'); | function foo(resource $r) | : resource |

Most of the time, initialization of the class members will be done in a constructor (i.e., __construct()). For brevity, most of the initialization in the table above was done inline. However, sometimes brevity doesn't work very well. For the generic type, user object, nullable and closure class members, here are example initializations in the constructor:

<?hh
class FooClass {}
class MyClass<T> {
  protected T $t;
  protected FooClass $a;
  protected ?int $ni;
  protected (function(int, int): string) $x;
  // $val is passed in so that the generic member can be initialized.
  public function __construct(T $val) {
    $this->t = $val;
    $this->a = new FooClass();
    $this->ni = $val === 3 ? null : 4;
    $this->x = function(int $n, int $m): string {
      $r = '';
      for ($i = 0; $i < $n + $m; $i++) {
        $r .= "hi";
      }
      return $r;
    };
  }
}

Examples ¶

Below are some basic, contrived examples using some of the above types within the context of the Hack type annotation framework:

Annotating With Basic Types

<?hh
function increment(int $x): int {
  return $x + 1;
}
function average(float $x, float $y): float {
  return ($x + $y) / 2;
}
function say_hello(string $name): string {
  return "Hello ".$name;
}
function invert(bool $b): bool {
  if ($b) {
    return false;
  } else {
    return true;
  }
}
// Named sort_copy rather than sort so that it does not clash with the built-in sort().
function sort_copy(array $arr): array {
  sort($arr);
  return $arr;
}
// A piece of code that computes the average of three numbers
function avg(int $n1, int $n2, int $n3): float {
  $s = $n1 + $n2 + $n3;
  return $s / 3.0;
}

Annotating with void

<?hh
// void is used to indicate that a function does not return anything.
function say_hello(): void {
  echo "hello world";
}

Annotating with Nullable

<?hh
// The nullable type is used to indicate that a parameter can be null.
// It is also useful as a return type, where the error case returns null.
// The type checker will force you to handle the null case explicitly.
function f1(int $x): ?string {
  if ($x == 0) {
    return null;
  }
  return "hi";
}
function f2(int $x): void {
  $y = f1($x);
  // $y here has a type of ?string
  if ($y !== null) {
    // $y can be used as a string. No casts required.
  }
}

Annotating with mixed

<?hh
// The mixed type should be used for function parameters where the behavior depends on the type.
// The code is forced to check the type of the parameter before using it.
function encode(mixed $x): string {
  if (is_int($x)) {
    return "i:".($x + 1);
  } else if (is_string($x)) {
    return "s:".$x;
  } else {
    ...
  }
}

Annotating Classes

<?hh
class A {}
function foo(A $x): void {
  ...
}
function sum(Vector<int> $arr): int {
  $s = 0;
  foreach ($arr as $v) {
    $s += $v;
  }
  return $s;
}

Annotating Tuples

<?hh
class TupleTest {
  // This is a Vector of tuples. Notice how the "tuple" reserved
  // word is not used when annotating.
  private Vector<(string, string)> $test = Vector {};
  // The return type is a tuple. Again, the "tuple" reserved
  // word is not used.
  public function bar(): (string, string) {
    return $this->test[0];
  }
  public function foo(): void {
    // But to create an actual tuple, use the "tuple" reserved word
    $this->test->add(tuple('hello', 'world'));
  }
}

Annotating Resources

<?hh
function f1(): ?resource {
  // UNSAFE
  return fopen('/dev/null', 'r');
}
function f2(resource $x): void {
}
function f3(): void {
  $x = f1();
  if (is_resource($x)) {
    f2($x);
  }
}

Type Casting ¶

HHVM allows type casting, so a variable can be cast to another appropriate type (e.g., int to bool). Hack allows type casts as well. For example, the type checker will give no errors for this code:

<?hh
function foo(): bool {
  $foo = 10; // $foo is an integer
  $bar = (bool) $foo; // $bar is a boolean
  return $bar;
}
foo();

That said, there are types that are synonyms for other types (e.g., double for float). Hack generally disallows these. For consistency, Hack allows one name per type:

| Allowed | Not Allowed |
|---|---|
| float | double, real |
| bool | boolean |
| int | integer |
| | binary |

Here is an example of how the type checker will report an error if you try to use a type synonym that is not supported:

<?hh
function foo(): bool {
  $foo = 10; // $foo is an integer
  $bar = (boolean) $foo; // "boolean" is a disallowed synonym for bool
  return $bar;
}
foo();

The above example will output:

File "test.php", line 4, characters 11-17:
Invalid Hack type. Using "boolean" in Hack is considered an error. Use "bool" instead, to keep the codebase consistent.

Annotating Arrays ¶

Annotating arrays deserves a bit more of a mention. Arrays in Hack can take the following forms:
- Untyped array (partial mode only): array
- Explicitly typed array with integer keys: array<someType>
- Explicitly typed array with string or integer keys: array<int, someType> or array<string, someType>

Here is an example of how various arrays are annotated. Remember that, in Hack, the use of arrays is more restricted in // strict mode.

<?hh
class FooFoo {}
class HackArrayAnnotations {
  private array<FooFoo> $arr;
  private array<string, FooFoo> $arr2;
  public function __construct() {
    $this->arr = array();
    $this->arr2 = array();
  }
  public function bar<T>(T $val): array<T> {
    return array($val);
  }
  public function sort(array<int, float> $a): array<int, float> {
    sort($a);
    return $a;
  }
  public function baz(FooFoo $val): array<FooFoo> {
    $this->arr[] = $val;
    return $this->arr;
  }
}
function main_aa() {
  $haa = new HackArrayAnnotations();
  var_dump($haa->bar(3));
  var_dump($haa->bar(new FooFoo()));
  var_dump($haa->sort(array(1.3, 5.6, 2.3, 0.2, 1.4)));
  var_dump($haa->baz(new FooFoo()));
}
main_aa();

The above example will output:

array(1) {
  [0]=>
  int(3)
}
array(1) {
  [0]=>
  object(FooFoo)#2 (0) {
  }
}
array(5) {
  [0]=>
  float(0.2)
  [1]=>
  float(1.3)
  [2]=>
  float(1.4)
  [3]=>
  float(2.3)
  [4]=>
  float(5.6)
}
array(1) {
  [0]=>
  object(FooFoo)#2 (0) {
  }
}

Examining a typed array a bit more...

<?hh
class BarBar {}
class ABCD {
  private array<BarBar> $arr;
  private int $i;
  public function __construct() {
    $this->arr = array(new BarBar());
    $this->i = 4;
  }
  public function getBars(): array<BarBar> {
    if ($this->i < 5) {
      return array();
    } else if ($this->i < 10) {
      return $this->arr;
    } else {
      return array(null); // Type Error
    }
  }
}

An empty array can be returned from a method that is annotated to return a typed array. However, an array whose first element is null is not compatible. In order to make that work, an array of nullable elements must be used as the annotation (e.g., array<?BarBar>).

Annotating Closures ¶

Annotating closures and callables requires its own callout beyond the brief summary above about using PHP types with Hack. Take this unannotated, non-Hack PHP code that uses a closure:

<?php
function foo_closure($adder_str) {
  return function($to_str) use ($adder_str) {
    return strlen($to_str) + strlen($adder_str);
  };
}
function main_closure_example() {
  $hello = foo_closure("Hello");
  $facebook = foo_closure("Facebook");
  $fox = foo_closure("Fox");
  echo $hello("World") . "\n";
  echo $facebook("World") . "\n";
  echo $fox("World") . "\n";
}
main_closure_example();

How are the function foo_closure() and the closure it returns actually annotated? Here is the proper Hack type annotation for such a function:

<?hh
function foo_closure(string $adder_str): (function (string): int) {
  return function($to_str) use ($adder_str) {
    return strlen($to_str) + strlen($adder_str);
  };
}
function main_closure_example() {
  $hello = foo_closure("Hello");
  $facebook = foo_closure("Facebook");
  $fox = foo_closure("Fox");
  echo $hello("World") . "\n";
  echo $facebook("World") . "\n";
  echo $fox("World") . "\n";
}
main_closure_example();

Note: The return type annotation of foo_closure() is a skeleton signature of the closure being returned. Thus, for example, trying to return true from the closure will cause a Hack error, since the return type annotation clearly specifies that the closure returns an int. Note that the actual closure is not type annotated, nor are any use parameters part of the type annotation for a closure.

The same style of closure annotation is used for function/method parameters (if the function/method takes a closure as a parameter), as well as for class member variables. Here is a final example:

<?hh
// Completely contrived
function f1((function(int, int): string) $x): string {
  return $x(2, 3);
}
function f2(): string {
  $c = function(int $n, int $m): string {
    $r = '';
    for ($i = 0; $i < $n + $m; $i++) {
      $r .= "hi";
    }
    return $r;
  };
  return f1($c);
}

Annotating Constructors ¶

With constructors, parameters are annotated as normal. It may also be tempting to annotate the return type of __construct() with : void. However, this is misleading (and technically incorrect). While there is no explicit return statement in __construct() (a general hint that void is correct), the constructor does implicitly return the instantiated type for which __construct() was called.

Therefore, to avoid any confusion, do not annotate the return type of __construct(). The Hack type checker will report an error if an annotation is present.

Annotating with this ¶

The this type is a useful type that you'll usually see as a return type. Here are some examples of it being used:

<?hh
class Base {
  private int $x = 0;
  public function setX(int $new_x): this {
    $this->x = $new_x;
    // $this has type "this"
    return $this;
  }
  public static function newInstance(): this {
    // new static() has type "this"
    return new static();
  }
  public function newCopy(): this {
    // This would not typecheck with self::, but static:: is ok
    return static::newInstance();
  }
  // You can also say Awaitable<this>
  public async function genThis(): Awaitable<this> {
    return $this;
  }
}
// Child extends Base here, matching the $child->setX() discussion that follows.
final class Child extends Base {
  public function newChild(): this {
    // This is OK because Child is final.
    // However, if Grandchild extended Child, then this would be wrong, since
    // $grandchild->newChild() would return a Child instead of a Grandchild
    return new Child();
  }
}

this is the type of $this and new static(). If Base::setX() returns this, then at call sites $child->setX() is known to return an instance of Child.

Here are some invalid uses of this:

COUNTER EXAMPLES

<?hh
class Base {
  public static function newBase(): this {
    // ERROR! The "this" return type means that $child->newBase()
    // should return a Child, but it always returns a Base!
    return new Base();
  }
  public static function newBase2(): this {
    // ERROR! This is wrong for the same reason that new Base() is wrong
    return new self();
  }
  // This function is fine
  abstract public static function goodNewInstance(): this;
  public static function badNewInstance(): this {
    // ERROR! Child::badNewInstance() would call Base::goodNewInstance() which is wrong
    return self::goodNewInstance();
  }
}

this can only be used in covariant locations, which means you cannot use this as a function parameter typehint or as a member variable typehint. When Hack has proper covariance support, you will be able to use this to instantiate any covariant type variable, like Awaitable<this> and ImmVector<this>. Until then, you can only use this with Awaitable.

At the moment, there is no return type that means $this and only $this. The this type can be satisfied by any object with the same type as $this.

Mixed Types ¶

Sometimes the type of a function parameter or a return type can be "various", and this could be quite intentional. When confronted with this situation, there are two choices. One is to leave the type blank and let the Hack type checker assume the engineer knows what they are doing. The other is to use the PHP-provided mechanism called mixed, so that the type checker forces the engineer to check the type before using the value. The following example shows mixed being used as the parameter type, along with the if check that it then requires.

<?hh
function sum(mixed $x): int {
  if (is_array($x) || $x instanceof Vector) {
    $s = 0;
    foreach ($x as $v) {
      $s += $v;
    }
    return $s;
  }
  //... do something else or throw an exception...
}

Passing By Reference ¶

The Hack typechecker largely does not understand references and pretends that they do not exist. For example, the following code passes the typechecker:

<?hh
function swap(int &$x, int &$y): void {
  $x = $y;
  $y = 'boom';
}
function main(): void {
  $x = 1;
  $y = 1;
  swap($x, $y);
  var_dump($x);
  var_dump($y);
  swap($x, $y);
}
main();

It seems pretty clear that this shouldn't be done (trying to swap an int but writing in a string, when the swap() function takes two ints). In fact, HHVM will balk at runtime when trying to execute this code. But getting an error at runtime defeats the whole purpose of Hack.

Hack allows passing primitives by reference in // partial mode only. Hack will not allow this in // strict mode. The only reason this is even allowed by Hack in // partial mode is to allow for easier migration of various PHP codebases.

Why are these types of references bad? Consider the following PHP code snippet:

<?php
function foo() {
  $arr = array(1,2,3,4);
  foreach ($arr as &$k) {
    echo $k;
  }
  echo "\n";
  $k = 'foo';
  var_dump($arr);
}
foo();

After the foreach completes, there will still be a dangling reference to the last element in the array. So, if a developer then writes $k = 'foo';, the array will be mutated, which is probably not the desired behavior.

The above example will output:

1234
array(4) {
  [0]=>
  int(1)
  [1]=>
  int(2)
  [2]=>
  int(3)
  [3]=>
  &string(3) "foo"
}

Furthermore, references are just incredibly difficult to typecheck. They can arbitrarily change the types of the arguments passed into them at a distance. Being sound would require either looking into the called function (too expensive perf-wise) or completely forgetting the type of any reference parameters (too restrictive). Thus this compromise was reached.

Typing Generators ¶

Generators can be somewhat tricky to type check. For example:

<?hh
function gen() {
  yield 1;
  yield "a";
}

Code using gen() must match the right type, in the right order. Static checking of this is not possible. Therefore, Hack implements two kinds of generators:

| Type | Definition | How to Yield | Notes |
|---|---|---|---|
| Continuation<T> | The interface for items that are generators. They can be looped over. | yield directly (e.g., yield $foo;) | Generators must always yield the same type. Continuations yield items of a type T. Continuations are used only with yield, not async/await, etc. |
| Awaitable<T> | The interface for items that can be prepared. | In an async function, return or await directly | The return types of async functions are all Awaitables. They can all be prepared using prep() and await. In this example, $foo is of type T for every expression yield result($foo);. Not all Awaitables are Continuations. |

The following two examples are correctly typed:

<?hh
function gen(): Continuation<int> {
  yield 1;
  yield 2;
  yield 3;
}
function foo(): void {
  foreach (gen() as $x) {
    echo $x, "\n";
  }
}

<?hh
async function f(): Awaitable<int> {
  return 42;
}
async function g(): Awaitable<string> {
  $f = await f();
  $f++;
  return 'hi test ' . $f;
}

Summary ¶

Changing PHP code to Hack code has been purposely made simple. As can be gleaned from the examples above, first change <?php to <?hh. Then annotate types or use other Hack features. Then run the type checker.

Of course, developers must not be unnecessarily stymied when it comes to pushing out code. Hack implements type annotations in a way that is not only PHP compatible but also gives engineers ways to bypass the type checker. Some code might just be inherently dynamic in nature, code might need to be tested quickly, or the type checker could have a bug. In these cases there are options, such as "unsafe" code, for getting around the type checker's grip. These options shouldn't be used often, as they defeat the overall purpose of having a reliable codebase, but they are there as needed.
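
As one illustration of the escape hatches mentioned above, the resource example earlier on this page places a // UNSAFE comment inside a function body. A minimal sketch of the same idea follows. It assumes that the marker asks the type checker to skip the remainder of the enclosing block, which should be confirmed against the type checker version in use, and legacy_lookup() is a hypothetical untyped PHP function, not a real API.

```hack
<?hh
// legacy_lookup() is a hypothetical, unannotated PHP function defined elsewhere.
function load_setting(string $key): int {
  // UNSAFE
  // Assumption: the rest of this block is not type checked, so the
  // unverified return value below is accepted as an int.
  return legacy_lookup($key);
}
```

Because nothing verifies that legacy_lookup() really returns an int, a marker like this is best kept to the smallest block possible and removed once the called code is annotated.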
 
 