Wednesday, March 26, 2014

Other Hack Rules and Features

Table of Contents 

  • Variable Number of Arguments
  • Number Handling
  • Type Inference
  • Class Initialization
  • Callbacks and fun()
  • Overriding on Return Type
  • invariant()
  • Function Signature Ordering
  • Calling Non-Private Methods During Initialization
  • Union Types
  • Heredocs
  • Nowdocs
  • Casting
  • Class Name Resolution
There are some other Hack rules and features that are worth noting and knowing.

Variable Number of Arguments 

Hack provides the capability to indicate that a function takes a variable number of arguments. This is done using ... (three dots).
<?hh
function var_foo(int $x, ...) : int {
  // use func_get_args() to get the actual arguments returned as an array
  // Remember, in this case, the $x will be counted as an argument too.
  $arg_arr = func_get_args();
  return $x + count($arg_arr);
}

function main_vna() {
  $y = var_foo(3, 2, 'hi', array(), 1.3); // $y = 8 (3+5)
  $z = var_foo(3); // $z = 4 (3+1)
  echo $y."\n";
  echo $z;
}
main_vna();
The general gotcha here will be that any explicitly defined parameters are counted in the call to func_get_args().
Here is another example:
<?hh
function sum(...): int {
  $s = 0;
  foreach (func_get_args() as $e) {
    $s += $e;
  }
  return $s;
}

function main_vna() {
  $x = sum(3, 2, 5, 3, 1.3);
  echo $x; // $x = 14.3
}
main_vna();
With the above example, HHVM will give an error (probably Invalid Operand Type) if anything "non-summable" (e.g., array()) is passed to sum() as a variable argument.
Note: Functions defined with a variable number of arguments cannot be type checked statically. Hack does not currently allow a variable number of arguments to be type annotated.
Note:
There is discussion around implementing a way to type annotate a variable number of arguments. For example, function foo(int $x, string ...). This would imply that all of the variable arguments are of the string type. However, this could not cover the case where the variable arguments have different types or where they are based on some sort of format string (e.g. sprintf).

Number Handling 

In PHP, float and int are incompatible at the type hint level:
<?php
function f1(float $x) {
}
f1(10);
The above example will output:
HipHop Fatal error: Argument 1 passed to f1() must be an instance of float, int given
This incompatibility implies that the distinction between int and float must be tracked. However, some math operators have a return type which depends on their input:
<?php
$x = 10;
echo gettype($x/2), "\n";
echo gettype($x/3), "\n";
The above example will output:
integer
double
Hack has an internal num type. For example, if Hack throws an error saying something to the effect of "It is incompatible with an int/float because it is the result of a division (/)", this means that the type in the error message is either an int or a float, but which one cannot be determined statically.
Note: At this point, num is not an annotatable type. For example, a function cannot be declared as function foo(): num {...}. This may be implemented in the future. This discussion concerns num only as a type internal to Hack.
In Hack, mathematical operations are handled in the following way:
  1. int & float are subtypes of num.
  2. Basic math operators (+, -, *, /) return a float if one of the inputs is a float.
  3. Basic math operators (+, -, *) return int if both the inputs are ints.
  4. In all other cases, basic math operators return a num, and may need to be explicitly cast to an appropriate type. See next bullet.
  5. When returning from or calling another function, num needs to be explicitly cast to the correct type (either int or float).
<?hh
function f1(float $x) {
}

function f2(int $x, float $y): int {
  $a = $x; // $a is an int
  $b = $y; // $b is a float
  $c = $x / $a; // $c is a num
  $d = $c * $y; // $d is a float
  f1($d);  // this is allowed
  return (int) $c;  // $c needs to be cast here
}

Using Equivalent Types 

It is important to note that Hack does not support the use of certain types that are equivalent to other types:
  • double (use float)
  • real (use float)
  • boolean (use bool)
  • integer (use int)
Take double as an example. double is an alias for float. It is the exact same data type as float. Using double can lead to confusion (i.e., thinking that a double has more precision than a float) and make the readability of the codebase inconsistent, so double was removed from Hack.
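As a rough illustration of the list above, the sketch below annotates a function with the canonical names only. The function name average() is hypothetical, and the rejected signature is shown as a comment rather than as real output from the type checker.

```hack
<?hh
// Accepted: canonical type names (int, float).
function average(int $total, int $count): float {
  // The cast makes one operand a float, so the division yields a float.
  return (float) $total / $count;
}

// Not accepted, per the list above, because integer and double are
// equivalent names that Hack does not support:
//   function average(integer $total, integer $count): double { ... }
```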

Type Inference 

Most people are familiar with statically typed languages (like C or Java). Having a type inference system may, however, be a new concept. A type inference system brings many pros (e.g., write less and more readable code), but there are also a few gotchas.

Block level inference 

Hack implements a unique block level inference system.
Assume a nullable ?int variable and a function which takes an int. There are two choices for calling this function with a nullable. ?int can be cast to an int, in which case null will be converted to 0. Or code can be written like this:
<?hh
function foo(int $y): void { }

function bar(?int $x): void {
  if (is_int($x)) {
    foo($x);
  } else {
    ...
  }
}
When is_int($x) is written inside a conditional statement, the Hack type checker will know that within the if branch, $x is an int. In the else branch, the type will be null. Here is another example:
<?hh
class Foo {}
class Bar extends Foo { public function blah(): void {} }

function foo(Foo $x): void {
  if ($x instanceof Bar) {
    // $x is now a Bar
    $x->blah();
    ...
  } else {
    // $x is still just a Foo
    ...
  }
}

Local vs. member variables 

protected, private and public member variables are not typed in the same way as local variables. The reason is that protected and private variables might change when calling other functions.
This code is fine:
<?hh
class Foo {
  public function f1(?int $x): void {
    if (is_int($x)) {
      $this->doSomething();
      $y = $x + 2;
    }
  }

  public function doSomething(): void {}
}
This code however is incorrect:
<?hh
class Foo {
  protected ?int $x;

  public function f1(): void {
    if (is_int($this->x)) {
      $this->doSomething();
      // can no longer assume $this->x is an int, doSomething might have changed it back to null.
      // note: can't analyze doSomething() because a child class of Foo might change its behavior.
      $y = $this->x + 2;
    }
  }

  public function doSomething(): void {}
}
Here is a possible fix to the above problem:
<?hh
class Foo {
  protected ?int $x;

  public function f1(): void {
    $x = $this->x;
    if (is_int($x)) {
      $this->doSomething();
      $y = $x + 2;
    }
  }

  public function doSomething(): void {}
}

Class Initialization 

Hack enforces strict rules regarding class initialization. The following piece of code is correct:
<?hh
abstract class A {
  protected int $x;
}

class B extends A {
  public function __construct() {
    $this->x = 10;
  }
}
For an abstract class, member variables need not be initialized. For concrete child classes, however, member variables must be initialized, including any abstract class parent variables. If B fails to initialize $x, the type checker will complain about the "class member is not always properly initialized".
In the non-abstract case, the following piece of code is accepted:
<?hh
class Foo {
  protected int $x = 10;
}
However, this piece of code is not accepted:
<?hh
class Foo {
  protected int $x;
}
Again, the type checker will complain about the "class member is not always properly initialized". The solution is to write the accepted case above or use __construct():
<?hh
class Foo {
  protected int $x;

  public function __construct() {
    $this->x = 10;
  }
}
There is another implication here. A protected or a public instance method cannot be called before the constructor has finished initializing the member variables. The following piece of code is correct:
<?hh
class Foo {
  protected int $x;

  public function __construct() {
    $this->x = 10;
    $this->foo();
  }
  protected function foo() {
    ...
  }
}
Finally, note that class initialization does not apply to nullable types. So, the following code is correct and acceptable:
<?hh
class Foo {
  protected ?int $x;

  // no __construct() needed

  protected function foo() {
    ...
  }
}

Callbacks and fun() 

A standard way to call a function is by using a callback. There are two primary PHP functions for calling callbacks: call_user_func() and call_user_func_array().
Note:
eval() can be used to execute string-based code, but it can be extremely dangerous and is not advised (even though HHVM does support it).
Take the following example:
<?php
function cufun1(string $x): string {
  return $x.$x;
}

function cufun2(): (function(string): string) {
  return function($x) { return $x.$x; };
}

class CUFunFunFun {
  public function testFun1() {
    var_dump(call_user_func('cufun1', "blah"));
  }

  public function testFun2() {
    $x = cufun2();
    var_dump($x('blah'));
  }
}

function main_cufff() {
  $f = new CUFunFunFun();
  $f->testFun1();
  $f->testFun2();
}
main_cufff();
The above code shows the use of call_user_func() to call cufun1() from the function testFun1(). This code runs perfectly fine in HHVM. But, if <?hh was used instead of <?php, this code would not pass the Hack type checker:
The above example will output:
File "cuf.php", line 14, characters 14-45:
This call is invalid, this is not a function, it is a string
Hack throws an error here because its type inference and safety cannot be guaranteed when using a string as the function name to call_user_func(). For example, imagine changing the call_user_func() line to be:
var_dump(call_user_func('cufun1', 3));
Normally Hack would catch such a problem (passing an int to a function that takes a string), but Hack cannot provide these guarantees with call_user_func(). Thus an HHVM fatal error will be thrown *at runtime*.
The above example will output:
[~/www/tests] php cuf.php
HipHop Fatal error: Argument 1 passed to cufun1() must be an instance of string, int given in cuf.php on line 6
In order to make these types of callbacks type-checkable and type-safe, Hack has introduced fun(). fun() is a special function used to create a "pointer" to a function in a type-safe way. fun() takes a string corresponding to the name of the function to be called. It returns a "type-safe" string that can be used in, for example, call_user_func(). Building upon the example above to use fun():
<?hh
function fun1(string $x): string {
  return $x.$x;
}

function fun2(): (function(string): string) {
  return fun('fun1');
}

class FunFunFun {
  public function testFun1() {
    $x = fun('fun1');
    var_dump(call_user_func($x, "blah"));
  }

  public function testFun2() {
    $x = fun2();
    var_dump($x('blah'));
  }
}

function main_fff() {
  $f = new FunFunFun();
  $f->testFun1();
  $f->testFun2();
}
main_fff();
fun() is called before call_user_func(). It returns a string that the Hack type checker can use and analyze. This code runs exactly the same in HHVM as the first example, but now the Hack type checker can catch typing errors. In testFun1(), imagine passing an int instead of a string. Hack will throw an error.
var_dump(call_user_func($x, 3));
The above example will output:
File "fun.php", line 20, characters 17-17:
Invalid argument
  File "fun.php", line 9, characters 15-20:
  This is a string
  File "fun.php", line 20, characters 17-17:
  It is incompatible with an int
It is important to note that the argument to fun() must be a single-quoted, constant literal string representing a valid function name. For example:
<?hh
function foo(): void {}

function main(): void {
  $func = fun('foo');
  $str = "foo";
  // $wontwork = fun($str);
}
The bottom line is to use fun() before using call_user_func() for these types of function calls.

Overriding on Return Type 

Hack provides the ability to annotate return types on functions. This brings with it another added feature. Under a specific circumstance, Hack provides the ability to override functions based on just return type. That circumstance is a subclass overriding a method in a parent class with a return type compatible with the return type of the parent class. For example:
<?hh
class Foo {}
class FooChild extends Foo {}

class AA {
  protected function bar(): Foo { return new Foo(); }
}

class BB extends AA {
  protected function bar(): FooChild { return new FooChild(); }
}
In the above example, Hack is happy because BB::bar() returns FooChild, which is a child of the Foo that AA::bar() returns.
Here are some unsupported examples:
<?hh
class Foo {}
class FooChild extends Foo {}

class AA {
  protected function bar(): Foo { return new Foo(); }
}

class BB extends AA {
  protected function bar(): int { return 1; }
}
In the above, Hack will balk since int and Foo are not compatible types.
<?hh
class Foo {}
class FooChild extends Foo {}

class AA {
  protected function bar(): FooChild { return new FooChild(); }
}

class BB extends AA {
  protected function bar(): Foo { return new Foo(); }
}
In the above, Hack will again balk since returning Foo from BB::bar() is not compatible with AA::bar() returning FooChild. The other way around, as shown in the first example, does, of course, make Hack happy.
Note:
Overloading on return type within the same class is not supported by Hack (in fact, like PHP, no overloading exists at all; thus, this holds for function arguments as well). For example:
<?hh
class AA {
  function bar(): string { return 's'; }
  function bar(): int { return 1; }
}


invariant() 

There are times when it is desirable to have an object be type-checked as a more specific type than it is currently declared. For example, an interface needs to be type-checked as one of its implementing classes. invariant() is used to help the Hack type-checker make this more specific type determination.
<?hh
interface I {
  public function foo();
}

class A implements I {
  public function foo(): void {
    echo "A";
  }
}

class B implements I {
  public function foo(): void {
    echo "B";
  }

  public function yay(): void {
    echo "B->yay!";
  }
}

function baz(int $a): I {
  return $a === 1 ? new A() : new B();
}

function bar(): I {
  $iface = baz(2);
  invariant($iface instanceof B, "must be a B");
  $iface->yay();
  return $iface;
}
bar();
Without the invariant(), Hack will give an error similar to the following:
The above example will output:
File "in2.php", line 30, characters 11-13:
The method yay is undefined in an object of type I
File "in2.php", line 23, characters 23-23:
Check this out
Here is another example where a variable can be mixed and invariant() is used to help Hack understand that $untyped_array is a typed array.
<?hh
class Foo {
  public function foo_method(): void {}
}

function mixed_method(int $x): mixed {
  if ($x === 3) {
    $a = array();
    $a[0] = new Foo();
    return $a;
  } else {
    return false;
  }
}

function bar(): bool {
  $untyped_array = mixed_method(3); // Let's assume that this method can return an array of Foo
  invariant(is_array($untyped_array), "must be an array of Foo()");
  $untyped_array[0]->foo_method(); // Hack now understands that $untyped_array is an array
  return true;
}
bar();
Without the invariant(), the Hack type checker would give an error similar to the following:
The above example will output:
File "h5.php", line 20, characters 5-21:
This is not a container, this is a mixed value
  File "h5.php", line 7, characters 32-36:
  You might want to check this out
In many ways, invariant() puts the onus on the programmer to be correct; otherwise, bad things may happen if the invariant is not satisfied (e.g., exceptions or PHP fatal errors).

Function Signature Ordering 

PHP allows some flexibility when it comes to the ordering of a function signature. For example, the following ways of defining the foo() signature are perfectly valid:
<?php
class A {
  public static function foo() {}
}
<?php
class A {
  static public function foo() {}
}
However, in Hack, there are rules for function signature ordering:
  1. The keywords can be in any order.
  2. A keyword cannot be repeated (e.g. no public public function foo()).
  3. Specific keywords cannot collide (i.e., no collision between public/protected/private, and no collision between abstract/final).
Here is a sane function signature ordering for Hack.
<?hh
class A {
  // (access) [static] function (name) {}
  public static function foo() {}
}
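The three rules can be sketched in one class. This is only an illustration: the class name SignatureOrdering and its method names are hypothetical, and the rejected signatures are left as comments because they are based on the rules as stated here, not on actual type checker output.

```hack
<?hh
class SignatureOrdering {
  // Accepted: each keyword appears once and none of them collide.
  final public static function ok(): void {}

  // Would be rejected under rule 2 (repeated keyword):
  //   public public function repeated(): void {}

  // Would be rejected under rule 3 (colliding access modifiers):
  //   public private function collide(): void {}
}
```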

Calling Non-Private Methods During Initialization 

The Hack type checker will throw an error when trying to call non-private methods during the initialization of an object. Here is an example:
<?hh
class PettingZoo {
  private FluffyBunny $fluffy;

  public function __construct(FluffyBunny $bunny) {
    $this->doOtherInit();
    $this->fluffy = new FluffyBunny();
  }

  protected function doOtherInit(): void { }
}
The above example will output:
File "PettingZoo.php", line 8, characters 5-22:
Until the initialization of $this is over, you can only call private methods
The initialization is not over because $this->fluffy can still potentially be null
It is very important for Hack to be able to check the codebase quickly. In order to do this, Hack needs to be able to parallelize itself and check classes and files separately as much as possible. Therefore, it is not desirable to have to look at a whole inheritance hierarchy in order to figure out if someone is using a member variable before it is initialized. By limiting the user to only calling private methods, it can be ensured that member variables are always initialized by the time they are used.
There are two ways to solve this problem:
  • Calling only private methods: As discussed, until all the member variables have been initialized, only call private methods. Why? The Hack checker is relatively lazy, and does not want to have to check more than one class. Since the method is private, its definition must be in the instantiated class, and so the Hack checker is able to read it and make sure no one is using an uninitialized member variable.
  • Moving the method call until initialization is done: Sometimes by moving around method calls, the issue can be avoided altogether.
Here is an example of the first fix:
<?hh
class PettingZoo {
  private FluffyBunny $fluffy;

  public function __construct(FluffyBunny $bunny) {
    $this->doOtherInit();
    $this->fluffy = new FluffyBunny();
  }

  private function doOtherInit(): void { }
}
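The second fix, moving the method call until initialization is done, might look like the sketch below. It reuses the PettingZoo class from the examples above and only reorders the two statements in the constructor, following the same pattern as the correct Foo constructor in the Class Initialization section.

```hack
<?hh
class PettingZoo {
  private FluffyBunny $fluffy;

  public function __construct(FluffyBunny $bunny) {
    // Initialize every member variable first...
    $this->fluffy = new FluffyBunny();
    // ...and only then call the non-private method.
    $this->doOtherInit();
  }

  protected function doOtherInit(): void { }
}
```

With this ordering, doOtherInit() can stay protected, since no member variable is uninitialized by the time it runs.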

Union Types 

Hack does not support union types. The following piece of code will run in HHVM. However it will not pass the Hack type checker, even though many would believe that it should:
<?hh
interface UTI {
  public function utBar();
}

class UTA {
  public function utFoo(): void {
    echo "UTA:ut_foo()\n";
  }
}

class UTB extends UTA implements UTI {
  public function utBar(): void {
    echo "UTB:utBar()\n";
  }
}

function ut_xyz(UTA $a) {
  if ($a instanceof UTI) {
    $a->utBar();
  }
  $a->utFoo();
}

function main_ut() {
  $b = new UTB();
  ut_xyz($b);
}
main_ut();
The above example will output:
UTB:utBar()
UTA:ut_foo()
The type checker will balk, however:
File "union_types.php", line 24, characters 7-11:
The method utFoo is undefined in an object of type UTI
At first glance, the function ut_xyz() looks very reasonable. It takes a UTA, and since UTB is a child of UTA, utFoo() can be called successfully on either a UTA or a UTB. The instanceof check allows a passed-in UTB to call utBar().
So what is the issue? It is not necessarily an issue, but rather how Hack deduces the types of the variables in scope. After the instanceof check, Hack starts resolving the real type of $a. First, Hack determines that $a can be a UTI. Then at the call to utFoo(), Hack determines that $a can be a UTA. UTA and UTI are not compatible types (i.e., UTA does not implement UTI). Thus, Hack decides that $a cannot be both UTI and UTA.
The scenario described above represents union types. Union types allow a variable to possibly have several, distinct representations. For performance reasons, Hack does not support this paradigm. To alleviate the type error presented above, local variables can be used.
<?hh
interface UTI {
  public function utBar();
}

class UTA {
  public function utFoo(): void {
    echo "UTA:ut_foo()\n";
  }
}

class UTB extends UTA implements UTI {
  public function utBar(): void {
    echo "UTB:utBar()\n";
  }
}

function ut_xyz(UTA $a) {
  $a_local = $a;
  if ($a_local instanceof UTI) {
    $a_local->utBar();
  }
  $a->utFoo();
}

function main_ut() {
  $b = new UTB();
  ut_xyz($b);
}
main_ut();
The value of $a is assigned to the local variable $a_local. Since these two variables are distinct, each can represent a different type. Hack will not throw a type error in this case.

Heredocs 

Hack supports the type checking of the PHP string building technique called heredocs (e.g., $x = <<<EOF ..... EOF;). Hack does not balk when encountering a heredoc within PHP code, and throws an error when using heredocs incorrectly. Here is an example of Hack catching a heredoc error:
<?hh
function heredoc(): void {
  $x = <<<MYHD
The above example will output:
File "heredoc.php", line 4, characters 8-10:
Unterminated heredoc
Of course, to fix this error, one needs to terminate the heredoc:
<?hh
function heredoc(): void {
  $x = <<<MYHD
Hello, I am in here
MYHD;
}

echo heredoc();
There is a known current limitation with Hack's support of heredocs. The following code will not parse:
<?hh
function heredoc(): void {
  $foo = 3;
  $x = <<<MYHD
{$foo}
MYHD;
}

echo heredoc();
The above example will output:
File "heredoc.php", line 6, characters 7-11:
Syntax error

Nowdocs 

Hack supports nowdocs. In summary, nowdocs are similar to heredocs but without the parsing. Here is an example:
<?hh
function foo(): void {
  $name = 'MyName';

  echo <<<'EOT'
My name is "$name". I am printing some text.
Now, I am printing some {$foo->bar[1]}.
This should not print a capital 'A': \x41
EOT;
}
foo();
The above example will output:
My name is "$name". I am printing some text.
Now, I am printing some {$foo->bar[1]}.
This should not print a capital 'A': \x41
All text is printed exactly as written and is not replaced with variable values. Escape sequences such as \x41 are not replaced with the characters they represent.

Casting 

Hack currently follows the PHP rules for casting, with a few minor exceptions. Hack does not allow the use of certain primitives where there is an equivalent primitive type (e.g. float instead of double or real). There is discussion about providing the capability to cast to non-primitive types, but that is not implemented as of yet.
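As a small sketch of the PHP-style casts that carry over, the example below uses only the canonical primitive names. The function name main_cast() is hypothetical, and the remark about (double) is based on the restriction described above rather than on observed type checker output.

```hack
<?hh
function main_cast(): void {
  $i = (int) "42";    // string to int
  $f = (float) 3;     // int to float
  $s = (string) 1.5;  // float to string
  $b = (bool) 0;      // int to bool
  var_dump($i, $f, $s, $b);

  // Per the restriction above, the equivalent spelling is not allowed:
  //   $d = (double) 3;
}
main_cast();
```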

Class Name Resolution 

Class name resolution provides a way of accessing the name of a class, interface or trait.
Note:
This is compatible with the PHP5 RFC around class name resolution (PHP: rfc:class_name_scalars).
Here is an example:
<?hh
$obj = newv(FooClass::class, $constructor_args);
BaseFacebookTestCase::assertInstanceOf(FooClass::class, $obj, 'I hope that constructor worked');
As you might imagine, Foo::class is significantly easier to statically typecheck than 'Foo'; we know that Foo::class is a class name, but interpreting the literal string 'Foo' requires divining that it is in fact a class name “pointer” based on how that string is used several hops away. As such, there’s now a better way than using string literals like 'ClassName' or 'InterfaceName' in your code.
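For a version that does not depend on helpers such as newv(), here is a minimal sketch. FooClass is declared locally for illustration and main_cnr() is a hypothetical name; the class lives in the global namespace, so its resolved name carries no namespace prefix.

```hack
<?hh
class FooClass {}

function main_cnr(): void {
  // FooClass::class resolves to the class name as a string.
  $name = FooClass::class;
  var_dump($name === 'FooClass');
}
main_cnr();
```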
