Table of Contents ¶
- Why Nullable?
- Why Not Always Use Nullable?
- Case Study
- Examples
- Various Null Handling Scenarios
Hack introduces a safer way to deal with nulls through a concept known as the "Nullable" type (sometimes called "maybe" or "option"). Nullable allows any type to have null assigned and checked on it. Nullable is very useful for primitive types that don't generally allow null as one of their values, such as bool and int, but it can also be useful for user-defined classes. Below is an example. Note the ? prefix on the type, which marks it as nullable.
check_not_null() takes a nullable int as a parameter. This int can now be checked for null within the code, and appropriate action can be taken.
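A minimal sketch of what such a function might look like. Only the name check_not_null() and its nullable int parameter come from the text; the body and the -1 fallback are assumptions made for illustration.

```hack
<?hh
// Hypothetical body: the text only names check_not_null() and its ?int parameter.
function check_not_null(?int $x): int {
  if ($x === null) {
    // The null case is handled explicitly; -1 is an arbitrary fallback.
    return -1;
  }
  // Past the check, the type checker treats $x as a plain int.
  return $x;
}

var_dump(check_not_null(null)); // int(-1)
var_dump(check_not_null(5));    // int(5)
```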
Nullables are useful when a primitive (value) type needs a possible invalid state. A common scenario is a bool whose true or false value has not been determined yet. Another is a database that may return null for a column mapped to a primitive type.
Why Nullable? ¶
It is true that in PHP null can currently be assigned to any variable. For example:
So what exactly is the Hack nullable feature trying to solve? In a statically-typed language such as C#, null cannot be assigned to a primitive like an int. One of Hack's primary purposes is to type check annotated code, bringing a statically-typed feel to the dynamic language that is PHP. Taking the above example and "hackifying" it, the Hack type checker will complain, as it should since null is not in the range of values for int.
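As an illustration of the kind of code the type checker rejects, here is a sketch with an int return annotation. The function name is hypothetical, and the snippet is intentionally not type correct.

```hack
<?hh
// Hypothetical function; without the int annotation, the same body
// would be accepted silently.
function assign_null(): int {
  $x = 5;
  $x = null; // allowed at this point: a local variable may change type
  return $x; // rejected by the type checker: null is not an int
}
```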
Nullables are useful when a possible invalid state for a primitive (value) type is required. For example, a bool could be a common scenario where true or false has possibly not been determined yet. Here is an example:
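One way this could look, as a sketch only. The class and method names are hypothetical; null stands for "not determined yet".

```hack
<?hh
// Hypothetical class: records the outcome of a check that may not have run yet.
class HealthCheck {
  private ?bool $passed = null; // null means "not determined yet"

  public function record(bool $result): void {
    $this->passed = $result;
  }

  public function describe(): string {
    // Copy the member to a local so the null check applies to a stable value.
    $passed = $this->passed;
    if ($passed === null) {
      return 'not determined';
    }
    return $passed ? 'passed' : 'failed';
  }
}

$check = new HealthCheck();
var_dump($check->describe()); // string(14) "not determined"
$check->record(true);
var_dump($check->describe()); // string(6) "passed"
```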
A real-world scenario when a primitive type may not have a defined state is in a database. For example, if there is a database column that is defined as an int, there could be times when values in that column may not have been set yet. Nullable allows code to handle those cases.
Why Not Always Use Nullable? ¶
It may seem like using nullable should just be the default for primitives, just in case there is some sort of null state. However, many times it is reasonable to guarantee that a variable will be initialized, and, if somehow the variable is null, a serious problem has occurred. For example:
<?hh
class NullableNotAppropriate {
  public function noNullableNeeded(): ?int {
    $vec = Vector {3};
    return $vec->count();
  }
}

function main_nt() {
  $nna = new NullableNotAppropriate();
  $y = $nna->noNullableNeeded();
  var_dump($y);
}

main_nt();
There is certainly nothing technically wrong with the above code. ?int can be used as the return type annotation and neither Hack nor HHVM will complain. However, the ?int is too relaxed. While this example is contrived, there is no way that $vec->count() will ever be null. And if it ever is, then something is wrong with the Vector implementation or the runtime.
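The tighter alternative is to annotate the return type as a plain int. The sketch below reuses the class from the example above with only the annotation changed.

```hack
<?hh
class NullableNotAppropriate {
  // int instead of ?int: callers no longer have to consider a null result.
  public function noNullableNeeded(): int {
    $vec = Vector {3};
    return $vec->count();
  }
}

function main_nt() {
  $nna = new NullableNotAppropriate();
  var_dump($nna->noNullableNeeded()); // int(1)
}

main_nt();
```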
Case Study ¶
Do not use nullable solely for convenience or to just get rid of those "annoying" Hack errors. Be judicious, particularly in core code. Here is a real world case study:
I've seen several core functions that have been typed to accept and return nullable types that handle null solely for convenience (array_invert was the most recent). This effectively taints all values going through these functions as nullable even though the caller may have had a stronger guarantee.
For example, if I take care to have a non-null array and pass it to array_invert (whose signature is function array_invert(?array $a): ?array), Hack now has to deal with nullability of the return value. In practice, this leads to nullthrows/invariant calls in the callers of array_invert, spreading the $&*% everywhere as we've come to affectionately call it. In this particular case it also places extra burden on callers that are working with (statically) non-null values and leveraging Hack's powerful analysis.
Sometimes it is appropriate to allow null. If a healthy majority of the callers expect to pass in and receive back nullable values, that is a good case to make for allowing null. That's where the judgement call comes in. Otherwise, pruning out nullability helps us write more cohesive functions and reduces the state space complexity of our programs.
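The burden described above can be sketched as follows. The two invert functions are hypothetical stand-ins (their bodies use array_flip purely as a placeholder and are not the real array_invert implementation); the point is the difference in what each caller has to write.

```hack
<?hh
// Hypothetical stand-in with the nullable signature quoted in the case study.
function invert_nullable(?array $a): ?array {
  return $a === null ? null : array_flip($a);
}

// Hypothetical stand-in with nullability pruned from the signature.
function invert_strict(array $a): array {
  return array_flip($a);
}

function caller_with_nullable(array $known_non_null): array {
  $inverted = invert_nullable($known_non_null);
  // The caller must discharge a null that cannot actually occur here.
  invariant($inverted !== null, 'input was non-null, so the result must be too');
  return $inverted;
}

function caller_with_strict(array $known_non_null): array {
  // No extra check: the non-null guarantee flows through the signature.
  return invert_strict($known_non_null);
}
```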
Examples ¶
The following show some examples of using nullables, including showing what happens when null is passed to or returned from a method where null is not an expected value.
Returning a Nullable ¶
Here is a quick example that type checks correctly in Hack and produces valid output with HHVM:
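A sketch of a method with a nullable return type. The class and method names are hypothetical and chosen only to illustrate returning either an int or null.

```hack
<?hh
// Hypothetical class: returns the position of a value, or null if it is absent.
class NullableReturn {
  public function findIndex(Vector<string> $haystack, string $needle): ?int {
    foreach ($haystack as $i => $value) {
      if ($value === $needle) {
        return $i;
      }
    }
    return null; // not found
  }
}

$nr = new NullableReturn();
var_dump($nr->findIndex(Vector {'a', 'b'}, 'b')); // int(1)
var_dump($nr->findIndex(Vector {'a', 'b'}, 'z')); // NULL
```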
Nullable and XHP ¶
Nullable works with XHP elements as well.
Nullable Method Parameter ¶
Here is the example above enhanced with a new function that takes a nullable as a parameter:
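A self-contained sketch of a function that takes a nullable parameter. The function name and the fallback argument are assumptions for illustration. Note that the parameter is annotated ?int; annotating it as int with a default value of null would not be type correct.

```hack
<?hh
// Hypothetical function: substitutes a fallback when the nullable argument is null.
function value_or_default(?int $x, int $default): int {
  return $x === null ? $default : $x;
}

var_dump(value_or_default(null, 10)); // int(10)
var_dump(value_or_default(3, 10));    // int(3)
```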
User-defined Types ¶
It is worth noting that nullable can be used on user-defined types. Take this example:
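A sketch of nullable applied to a user-defined class. UserProfile and both functions are hypothetical names introduced for this illustration.

```hack
<?hh
// Hypothetical user-defined type.
class UserProfile {
  public function __construct(private string $name) {}

  public function getName(): string {
    return $this->name;
  }
}

// Map::get() returns null when the key is absent, so ?UserProfile fits naturally.
function find_profile(Map<int, UserProfile> $profiles, int $id): ?UserProfile {
  return $profiles->get($id);
}

function profile_name(?UserProfile $profile): string {
  return $profile === null ? 'unknown' : $profile->getName();
}

$profiles = Map {1 => new UserProfile('alice')};
var_dump(profile_name(find_profile($profiles, 1))); // string(5) "alice"
var_dump(profile_name(find_profile($profiles, 2))); // string(7) "unknown"
```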
Real world code ¶
There was an example at a company where nullable came into play (there are probably many examples, but this is a specific one that can demonstrate the use of nullable quite well). The original code looked like this:
Caller
<?hh
function updateA($key,
                 $value,
                 $reviewer_unixname = null,
                 $comments = null,
                 $value_check = null,
                 $set_reviewer_required = false,
                 $diff = null,
                 $diff_id = null,
                 $author_unixname = null) {
  // Assume, after a bunch of code above this statement, that $comments is still null
  create_updateA($key, $comments);
}
Callee
<?hh
function create_updateA(string $key, string $comments) {
  $comments = ($comments === '')
    ? 'Updated without comments.'
    : 'Updated with the following comments: ' . $comments;
  $vc = VC();
  $ua = get_update_array($key, $comments);
  regsiter(array('A', 'B'), $vc, $ua, 'update');
}
Hack complained about this code with the following message (line numbers may not be accurate):
The above example will output:
File "RealWorldNullable.php", line 13, characters 43-51:
Invalid argument
File "RealWorldNullable.php", line 23, characters 57-62:
This is a string
File "RealWorldNullable.php", line 7, characters 38-41:
It is incompatible with a nullable type
And, in fact, HHVM would complain about this code as well, since passing null to a function with a string type annotated parameter will cause an exception.
The above example will output:
HipHop Fatal error: Argument 2 passed to create_updateA() must be an instance of string, null given in RealWorldNullable.php on line 37
The $comments parameter in create_updateA() has been type annotated as a string. However, the caller function updateA() passes null for that argument. There were two ways to solve this problem: a good way and a bad way. The "good" way (and the one that was actually implemented and accepted at this company) is to use nullable.
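<?hh
function create_updateA(string $key, ?string $comments) {
  // Treat null the same as an empty comment; otherwise a null value would
  // fall through to the "with the following comments" branch.
  $comments = ($comments === null || $comments === '')
    ? 'Updated without comments.'
    : 'Updated with the following comments: ' . $comments;
  $vc = VC();
  $ua = get_update_array($key, $comments);
  regsiter(array('A', 'B'), $vc, $ua, 'update');
}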
The above solution resolves both the Hack and HHVM complaints that were seen without nullable. The "bad" way would solve the HHVM problem, but not the Hack problem: give $comments in create_updateA() a default value of null.
<?hh
function create_updateA(string $key, string $comments = null) {
  $comments = ($comments === '')
    ? 'Updated without comments.'
    : 'Updated with the following comments: ' . $comments;
  $vc = VC();
  $ua = get_update_array($key, $comments);
  regsiter(array('A', 'B'), $vc, $ua, 'update');
}
Again, as shown in the example above on nullable method parameters, using a default value of null in this case is not type correct and should not be done under normal circumstances.
Various Null Handling Scenarios ¶
This section will discuss various use-cases and scenarios that one may come across using nullables.
Null Member Variables ¶
These can be common Hack typing errors, and actually quite frustrating.
Take this example:
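One common form of this error, offered as an illustration rather than a definitive account: a null check on a nullable member is no longer trusted by the type checker once another method has been called, because that call could have reassigned the member. Copying the member into a local variable avoids the problem. The class and method names below are hypothetical.

```hack
<?hh
// Hypothetical class with a nullable member variable.
class Counter {
  private ?int $count = null;

  public function set(int $count): void {
    $this->count = $count;
  }

  private function logAccess(): void {
    // Placeholder for any method call that might modify $this->count.
  }

  public function getCount(): int {
    // Copy to a local: the null check on $count stays valid across the
    // call to logAccess(), whereas a check on $this->count would not.
    $count = $this->count;
    if ($count === null) {
      return 0;
    }
    $this->logAccess();
    return $count;
  }
}

$c = new Counter();
var_dump($c->getCount()); // int(0)
$c->set(7);
var_dump($c->getCount()); // int(7)
```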
Nullable and Non-Nullable Interaction ¶
Sometimes a nullable type needs to be passed to a method that takes the non-nullable variant of that same type. For example, assume the following piece of code and the output from the Hack type checker:
There are legitimate times when a variable is expected never to be null and, if it is null, a runtime exception should be thrown. For these cases, use a library function such as nullthrows(). Its definition is roughly a generic function that accepts a ?T and returns a T.
nullthrows() checks if the value associated with a type T is null. If the value is null, an exception is thrown. If not, the value is returned unmodified. An optional exception message can be supplied to nullthrows() as well. The code above can be rewritten to use nullthrows(), making the Hack type checker happy:
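A sketch of a nullthrows()-style helper matching the description above, together with a caller that passes a nullable value to a function expecting the non-nullable variant. The helper body, the default message, and the takes_int() and caller() functions are assumptions; a codebase that already provides nullthrows() should use that one instead of redefining it.

```hack
<?hh
// Sketch of the helper as described: throws on null, otherwise returns the
// value unmodified. The default message text is an assumption.
function nullthrows<T>(?T $x, ?string $message = null): T {
  if ($x === null) {
    throw new Exception($message === null ? 'Unexpected null' : $message);
  }
  return $x;
}

// Hypothetical function that only accepts a non-nullable int.
function takes_int(int $x): int {
  return $x + 1;
}

// Passing $maybe directly would be a type error; nullthrows() converts
// the ?int to an int or fails loudly at runtime.
function caller(?int $maybe): int {
  return takes_int(nullthrows($maybe, '$maybe must be set'));
}

var_dump(caller(41)); // int(42)
```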
Uninitialized Member Variables ¶
Take this piece of code and associated Hack error message:
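A sketch of the situation, with a hypothetical class: every non-nullable member has to be assigned in the constructor, and leaving one out is the kind of mistake the type checker reports as an uninitialized member variable.

```hack
<?hh
// Hypothetical class: both members are non-nullable ints.
class Dimensions {
  private int $width;
  private int $height;

  public function __construct(int $width, int $height) {
    $this->width = $width;
    // Omitting this assignment would leave $height uninitialized,
    // which the type checker flags as an error.
    $this->height = $height;
  }

  public function area(): int {
    return $this->width * $this->height;
  }
}

var_dump((new Dimensions(2, 3))->area()); // int(6)
```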
Uninitialized Member Variables and No Constructor ¶
Take this scenario:
By a reading of the code, the class member variables are indeed being initialized. However, Hack doesn't support this initialization paradigm in the type checker. As it stands, there is no way for the Hack type checker to infer that the class members are being initialized through a forwarded call to setup() from call_user_func_array() in __construct(). What are the options to avoid these Hack errors (noting that something like // UNSAFE will not work in this situation)? Here are the possible options:
1. Redesign/Refactor the code.
2. Make all member variables nullable.
3. This is a Hack problem that needs fixing. This type of pattern is not recognized by the Hack type checker.
Without fully understanding how prevalent this design pattern is, and what side effects supporting it in Hack might have, options (1) and (2) should be examined before assuming option (3) must be the answer.

Is there any way to redesign or refactor the code to avoid calling late-bound methods from the parent constructor? With the current structure, there are two possible problems. First, even if Hack supported the paradigm, another Hack error is looming: setup() is overridden with a different number of parameters than in the parent. This error will occur when the parent code is converted to <?hh. Second, using call_user_func_array to initialize variables may not necessarily be supported in Hack moving forward. Even if this design had to be maintained for legacy reasons, finding another way to call setup() might be preferred.

Is there any way to use nullable (the ? prefix) on the class members? This may not be ideal when it is known that a member variable will never be null, but it is worth considering as a stopgap until a better design can be found.
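A sketch of the nullable stopgap from option (2), under the assumption that members are assigned in a setup() method rather than in a constructor. The class name and the invariant message are hypothetical, and the parent-constructor forwarding is left out to keep the example small.

```hack
<?hh
// Hypothetical class: $name is assigned in setup(), so it is declared
// nullable and checked at the point of use.
class LateInit {
  private ?string $name = null;

  public function setup(string $name): void {
    $this->name = $name;
  }

  public function getName(): string {
    $name = $this->name;
    // Fails loudly if setup() was never called; afterwards $name is a string.
    invariant($name !== null, 'setup() must be called before getName()');
    return $name;
  }
}

$li = new LateInit();
$li->setup('a');
var_dump($li->getName()); // string(1) "a"
```

The cost of this approach is that every reader of the member has to handle or assert away the null case, which is why it is better treated as temporary.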