The worst mistake of computer science

Uglier than a Windows backslash, odder than ===, more common than PHP, more unfortunate than CORS, more disappointing than Java generics, more inconsistent than XMLHttpRequest, more confusing than a C preprocessor, flakier than MongoDB, and more regrettable than UTF-16, the worst mistake in computer science was introduced in 1965.

I call it my billion-dollar mistake…At that time, I was designing the first comprehensive type system for references in an object-oriented language. My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
– Tony Hoare, inventor of ALGOL W.

In commemoration of the 50th anniversary of Sir Hoare’s null, this article explains what null is, why it is so terrible, and how to avoid it.

What is wrong with NULL?

The short answer: NULL is a value that is not a value. And that’s a problem.

It has festered in the most popular languages of all time and is now known by many names: NULL, nil, null, None, Nothing, Nil, nullptr. Each language has its own nuances.

Some of the problems caused by NULL apply only to a particular language, while others are universal; a few are simply different facets of a single issue.

NULL…

subverts types
is sloppy
is a special case
makes poor APIs
exacerbates poor language decisions
is difficult to debug
is non-composable

1. NULL subverts types

Statically typed languages check the uses of types in the program without actually executing it, providing certain guarantees about program behavior.

For example, in Java, if I write x.toUpperCase(), the compiler will inspect the type of x. If x is known to be a String, the type check succeeds; if x is known to be a Socket, the type check fails.

Static type checking is a powerful aid in writing large, complex software. But for Java, these wonderful compile-time checks suffer from a fatal flaw: any reference can be null, and calling a method on null produces a NullPointerException. Thus,

toUpperCase() can be safely called on any String…unless the String is null.
read() can be called on any InputStream…unless the InputStream is null.
toString() can be called on any Object…unless the Object is null.

Java is not the only culprit; many other type systems have the same flaw, including, of course, ALGOL W.

In these languages, NULL is above type checks. It slips through them silently, waiting for runtime, to finally burst free in a shower of errors. NULL is the nothing that is simultaneously everything.
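To make this concrete, here is a minimal Java sketch. The class NullSubvertsTypes and its methods are invented for illustration; the only library call is String.toUpperCase(). The compiler accepts both calls, but only one of them survives at runtime.

```java
final class NullSubvertsTypes {
    // Type-checks for every String argument; the check says nothing about null.
    static String shout(String s) {
        return s.toUpperCase();
    }

    // Returns true if calling shout(s) throws a NullPointerException.
    static boolean failsAtRuntime(String s) {
        try {
            shout(s);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(shout("hello"));       // HELLO
        System.out.println(failsAtRuntime(null)); // true
    }
}
```

Nothing in the signature of shout hints that one value of the declared type can never work.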

2. NULL is sloppy

There are many times when it doesn’t make sense to have a null. Unfortunately, if the language permits anything to be null, well, anything can be null.

Java programmers risk carpal tunnel from writing

if (str == null || str.equals("")) {}

It’s such a common idiom that C# adds String.IsNullOrEmpty

if (string.IsNullOrEmpty(str)) {}

Abhorrent.

Every time you write code that conflates null strings and empty strings, the Guava team weeps.
– Google Guava

Well said. But when your type system (e.g. Java, or C#) allows NULL everywhere, you cannot reliably exclude the possibility of NULL, and it’s nearly inevitable it will wind up conflated somewhere.

The ubiquitous possibility of null posed such a problem that Java 8 added support for type annotations, so that tools such as the Checker Framework can offer a @NonNull annotation to try to retroactively fix this flaw in the type system.

3. NULL is a special case

Given that NULL functions as a value that is not a value, NULL naturally becomes the subject of various forms of special treatment.

Pointers

For example, consider this C++:

char c = 'A';
char *myChar = &c;
std::cout << *myChar << std::endl;

myChar is a char *, meaning that it is a pointer—i.e. the memory address—to a char. The compiler verifies this. Therefore, the following is invalid:

char *myChar = 123; // compile error
std::cout << *myChar << std::endl;

Since 123 is not guaranteed to be the address of a char, compilation fails. However, if we change the number to 0 (which is NULL in C++), the compiler passes it:

char *myChar = 0;
std::cout << *myChar << std::endl; // runtime error

As with 123, NULL is not actually the address of a char. Yet this time the compiler permits it, because 0 (NULL) is a special case.

Strings

Yet another special case happens with C’s null-terminated strings. This is a bit different than the other examples, as there are no pointers or references. But the idea of a value that is not a value is still present, in the form of a char that is not a char.

A C-string is a sequence of bytes, whose end is marked by the NUL (0) byte.

76  117  99  105  100  32  83  111  102  116  119  97  114  101  0
L   u    c   i    d        S   o    f    t    w    a   r    e    NUL

Thus, each character of a C-string can be any of the possible 256 bytes, except 0 (the NUL character). Not only does this make string length a linear-time operation; even worse, it means that C-strings cannot be used for ASCII or extended ASCII. Instead, they can only be used for the unusual ASCIIZ.

This exception for a singular NUL character has caused innumerable errors: API weirdness, security vulnerabilities, and buffer overflows.
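The cost of the reserved byte can be sketched even outside of C. The following Java snippet imitates strlen over a byte array; NulTerminated, ascii, and cStyleLength are hypothetical names, and this only illustrates the scanning rule, not any real C library code.

```java
import java.nio.charset.StandardCharsets;

final class NulTerminated {
    // Encode text as ASCII bytes, as a C program might hold it in a buffer.
    static byte[] ascii(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    // Imitates strlen: walk the buffer until the first 0 byte (linear time).
    static int cStyleLength(byte[] buffer) {
        int length = 0;
        while (length < buffer.length && buffer[length] != 0) {
            length++;
        }
        return length;
    }

    public static void main(String[] args) {
        byte[] data = ascii("Lucid\0Software");
        System.out.println(data.length);        // 14 bytes are really there
        System.out.println(cStyleLength(data)); // 5, the rest is hidden behind NUL
    }
}
```

A single embedded 0 byte silently truncates the data, which is exactly the kind of disagreement about length that leads to the errors listed above.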

NULL is the worst CS mistake; more specifically, NUL-terminated strings are the most expensive one-byte mistakes.

4. NULL makes poor APIs

For the next example, we will journey to the land of dynamically-typed languages, where NULL will again prove to be a terrible mistake.

Key-value store

Suppose we create a Ruby class that acts as a key-value store. This may be a cache, an interface for a key-value database, etc. We’ll make the general-purpose API simple:

class Store
  ##
  # associate key with value
  #
  def set(key, value)
    ...
  end

  ##
  # get value associated with key, or return nil if there is no such key
  #
  def get(key)
    ...
  end
end

We can imagine an analog in many languages (Python, JavaScript, Java, C#, etc.).

Now suppose our program has a slow or resource-intensive way of finding out someone’s phone number—perhaps by contacting a web service.

To improve performance, we’ll use a local Store as a cache, mapping a person’s name to his phone number.

store = Store.new()
store.set('Bob', '801-555-5555')
store.get('Bob') # returns '801-555-5555', which is Bob’s number
store.get('Alice') # returns nil, since it does not have Alice

However, some people won’t have phone numbers (i.e. their phone number is nil). We’ll still cache that information, so we don’t have to repopulate it later.

store = Store.new()
store.set('Ted', nil) # Ted has no phone number
store.get('Ted') # returns nil, since Ted does not have a phone number

But now the meaning of our result is ambiguous! It could mean:

the person does not exist in the cache (Alice)
the person exists in the cache and does not have a phone number (Ted)

One circumstance requires an expensive recomputation, the other an instantaneous answer. But our code is insufficiently sophisticated to distinguish between these two.

In real code, situations like this come up frequently, in complex and subtle ways. Thus, simple, generic APIs can suddenly become special-cased, confusing sources of sloppy nullish behavior.

Patching the Store class with a contains() method might help. But this introduces redundant lookups, causing reduced performance, and race conditions.
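The same trap exists in Java's standard library: HashMap.get returns null both for a missing key and for a key that is mapped to null. A small sketch, with AmbiguousCache and sampleCache as made-up names for this example:

```java
import java.util.HashMap;
import java.util.Map;

final class AmbiguousCache {
    // Builds a cache in which Ted is present but has no phone number.
    static Map<String, String> sampleCache() {
        Map<String, String> phones = new HashMap<>();
        phones.put("Bob", "801-555-5555");
        phones.put("Ted", null); // cached fact: Ted has no phone number
        return phones;
    }

    public static void main(String[] args) {
        Map<String, String> phones = sampleCache();
        System.out.println(phones.get("Ted"));           // null
        System.out.println(phones.get("Alice"));         // null
        System.out.println(phones.containsKey("Ted"));   // true
        System.out.println(phones.containsKey("Alice")); // false
    }
}
```

containsKey does tell the two cases apart, but only at the price of the second lookup described above.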

Double trouble

JavaScript has this same issue, but with every single object.
If a property of an object doesn’t exist, JS returns a value to indicate the absence. The designers of JavaScript could have chosen this value to be null.

But instead they worried about cases where the property exists and is set to the value null. In a stroke of ungenius, JavaScript added undefined to distinguish a null property from a non-existent one.

5. NULL exacerbates poor language decisions

Java silently converts between reference and primitive types. Add in null, and things get even weirder.

For example, this does not compile:

int x = null; // compile error

This does compile:

Integer i = null;
int x = i; // runtime error

though it throws a NullPointerException when run.

It’s bad enough that member methods can be called on null; it’s even worse when you never even see the method being called.
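Here is a minimal sketch of that hidden call. UnboxingNull and its methods are hypothetical names; the point is that unbox contains no visible method call, yet converting an Integer to an int still has to read the value out of the object.

```java
final class UnboxingNull {
    // No method call is written here, but the return unboxes the Integer.
    static int unbox(Integer boxed) {
        return boxed;
    }

    // Returns true if unboxing the argument throws a NullPointerException.
    static boolean throwsOnUnbox(Integer boxed) {
        try {
            unbox(boxed);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(unbox(7));            // 7
        System.out.println(throwsOnUnbox(null)); // true
    }
}
```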

6. NULL is difficult to debug

C++ is a great example of how troublesome NULL can be. Calling member functions on a NULL pointer won’t necessarily crash the program. It’s much worse: it might crash the program.

#include <iostream>

struct Foo {
    int x;
    void bar() {
        std::cout << "La la la" << std::endl;
    }
    void baz() {
        std::cout << x << std::endl;
    }
};

int main() {
    Foo *foo = NULL;
    foo->bar(); // okay
    foo->baz(); // crash
}

When I compile this with gcc, the first call succeeds; the second call fails.

Why? foo->bar() is known at compile-time, so the compiler avoids a runtime vtable lookup, and transforms it to a static call like Foo_bar(foo), with this as the first argument. Since bar doesn’t dereference that NULL pointer, it succeeds. But baz does, which causes a segmentation fault.

But suppose instead we had made bar virtual. This means that its implementation may be overridden by a subclass.

 ... virtual void bar() { ...

As a virtual function, foo->bar() does a vtable lookup for the runtime type of foo, in case bar() has been overridden. Since foo is NULL, the program now crashes at foo->bar() instead, all because we made a function virtual.

int main() {
    Foo *foo = NULL;
    foo->bar(); // crash
    foo->baz();
}

NULL has made debugging this code extraordinarily difficult and unintuitive for the programmer of main.

Granted, dereferencing NULL is undefined by the C++ standard, so technically we shouldn’t be surprised by whatever happened. Still, this is a non-pathological, common, very simple, real-world example of one of the many ways NULL can be capricious in practice.

7. NULL is non-composable

Programming languages are built around composability: the ability to apply one abstraction to another abstraction. This is perhaps the single most important feature of any language, library, framework, paradigm, API, or design pattern: the ability to be used orthogonally with other features.

In fact, composability is really the fundamental issue behind many of these problems. For example, the Store API returning nil for nonexistent values was not composable with storing nil for nonexistent phone numbers.

C# addresses some problems of NULL with Nullable<T>. You can include the optionality (nullability) in the type.

int a = 1; // integer
int? b = 2; // optional integer that exists
int? c = null; // optional integer that does not exist

But it suffers from a critical flaw: Nullable<T> cannot be applied to an arbitrary T, only to a non-nullable T. For example, it doesn’t make the Store problem any better.

string is nullable to begin with; you cannot make a non-nullable string
Even if string were non-nullable, thus making string? possible, you still wouldn’t be able to disambiguate the situation. There isn’t a string??

The solution

NULL has become so pervasive that many just assume that it’s necessary. We’ve had it for so long in so many low- and high-level languages, it seems essential, like integer arithmetic or I/O.

Not so! You can have an entire programming language without NULL. The problem with NULL is that it is a non-value value, a sentinel, a special case that was lumped in with everything else.

Instead, we need an entity that contains information about (1) whether it contains a value and (2) the contained value, if it exists. And it should be able to “contain” any type. This is the idea of Haskell’s Maybe, Java’s Optional, Swift’s Optional, etc.

For example, in Scala, Some[T] holds a value of type T. None holds no value. These are the two subtypes of Option[T], which may or may not hold a value.

The reader unfamiliar with Maybes/Options may think we have substituted one form of absence (NULL) for another form of absence (None). But there is a difference — subtle, but crucially important.

In a statically typed language, you cannot bypass the type system by substituting a None for any value. A None can only be used where we expected an Option. Optionality is explicitly represented in the type.

And in dynamically typed languages, you cannot confuse the usage of Maybes/Options and the contained values.

Let’s revisit the earlier Store, but this time using ruby-possibly. The Store class returns Some with the value if it exists, and a None if it does not. And for phone numbers, Some is for a phone number, and None is for no phone number. Thus there are two levels of existence/non-existence: the outer Maybe indicates presence in the Store; the inner Maybe indicates the presence of the phone number for that name. We have successfully composed the Maybes, something we could not do with nil.

cache = Store.new()
cache.set('Bob', Some('801-555-5555'))
cache.set('Tom', None())

bob_phone = cache.get('Bob')
bob_phone.is_some # true, Bob is in cache
bob_phone.get.is_some # true, Bob has a phone number
bob_phone.get.get # '801-555-5555'

alice_phone = cache.get('Alice')
alice_phone.is_some # false, Alice is not in cache

tom_phone = cache.get('Tom')
tom_phone.is_some # true, Tom is in cache
tom_phone.get.is_some # false, Tom does not have a phone number
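For comparison, roughly the same composition can be sketched in Java with java.util.Optional. PhoneCache and its methods are hypothetical names chosen for this example, not part of any library, and keeping Optional values in a map is a simplification for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

final class PhoneCache {
    // Inner Optional: does this person have a number? Stored values are never null.
    private final Map<String, Optional<String>> entries = new HashMap<>();

    void set(String name, Optional<String> phone) {
        entries.put(name, Objects.requireNonNull(phone));
    }

    // Outer Optional: is this person in the cache at all?
    Optional<Optional<String>> get(String name) {
        return Optional.ofNullable(entries.get(name));
    }

    // Same data as the Ruby example: Bob has a number, Tom has none, Alice is unknown.
    static PhoneCache sample() {
        PhoneCache cache = new PhoneCache();
        cache.set("Bob", Optional.of("801-555-5555"));
        cache.set("Tom", Optional.empty());
        return cache;
    }

    public static void main(String[] args) {
        PhoneCache cache = sample();
        System.out.println(cache.get("Bob").isPresent());       // true, Bob is in cache
        System.out.println(cache.get("Alice").isPresent());     // false, Alice is not in cache
        System.out.println(cache.get("Tom").isPresent());       // true, Tom is in cache
        System.out.println(cache.get("Tom").get().isPresent()); // false, Tom has no number
    }
}
```

The return type Optional<Optional<String>> spells out both levels of absence, so a caller cannot mistake "not cached" for "no phone number".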

The essential difference is that there is no more union, statically typed or dynamically assumed, between NULL and every other type; no more nonsensical union between a present value and an absence.

Manipulating Maybes/Options

Let’s continue with more examples of non-NULL code. Suppose in Java 8+, we have an integer that may or may not exist, and if it does exist, we print it.

Optional<Integer> option = ...
if (option.isPresent()) {
    System.out.println(option.get());
}

This is good. But most Maybe/Optional implementations, including Java’s, support an even better functional approach:

option.ifPresent(x -> System.out.println(x));
// or option.ifPresent(System.out::println)

Not only is this functional way more succinct, but it is also a little safer. Remember that option.get() will produce an error if the value is not present. In the earlier example, the get() was guarded by an if. In this example, ifPresent() obviates our need for get() at all. It makes the code obviously free of bugs, rather than free of obvious bugs.

Options can be thought about as a collection with a max size of 1. For example, we can double the value if it exists, or leave it empty otherwise.

option.map(x -> 2 * x)

We can optionally perform an operation that returns an optional value, and “flatten” the result.

option.flatMap(x -> methodReturningOptional(x))

We can provide a default value if none exists:

option.orElse(5)
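These operations chain naturally. The sketch below is one way to combine them in Java 8+; OptionalOps, parseNonNegative, and doubledOrDefault are hypothetical names introduced only for this example.

```java
import java.util.Optional;

final class OptionalOps {
    // Hypothetical helper: parses a non-negative integer, or returns empty.
    static Optional<Integer> parseNonNegative(String text) {
        try {
            int n = Integer.parseInt(text.trim());
            return n >= 0 ? Optional.of(n) : Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // flatMap flattens the nested Optional, map doubles, orElse supplies the default.
    static int doubledOrDefault(Optional<String> input, int fallback) {
        return input
            .flatMap(OptionalOps::parseNonNegative)
            .map(x -> 2 * x)
            .orElse(fallback);
    }

    public static void main(String[] args) {
        System.out.println(doubledOrDefault(Optional.of("21"), 5));  // 42
        System.out.println(doubledOrDefault(Optional.of("abc"), 5)); // 5
        System.out.println(doubledOrDefault(Optional.empty(), 5));   // 5
    }
}
```

At no point does the code call get(), so there is no unguarded access to forget about.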

In summary, the real value of Maybe/Option is

reducing unsafe assumptions about what values “exist” and which do not
making it easy to safely operate on optional data
explicitly declaring any unsafe existence assumptions (e.g. with a .get() method)

Down with NULL!

NULL is a terrible design flaw, one that continues to cause constant, immeasurable pain. Only a few languages have managed to avoid its terror.

If you do choose a language with NULL, at least possess the wisdom to avoid such awfulness in your own code and use the equivalent Maybe/Option.

NULL in common languages:

Language	NULL	Maybe
C	NULL
C++	NULL	boost::optional, from Boost.Optional
C#	null
Clojure	nil	java.util.Optional
Common Lisp	nil	maybe, from cl-monad-macros
F#	null	Core.Option
Go	nil
Groovy	null	java.util.Optional
Haskell		Maybe
Java	null	java.util.Optional
JavaScript (ECMAScript)	null, undefined	Maybe, from npm maybe
Objective C	nil, Nil, NULL, NSNull	Maybe, from SVMaybe
OCaml		option
Perl	undef
PHP	NULL	Maybe, from monad-php
Python	None	Maybe, from PyMonad
Ruby	nil	Maybe, from ruby-possibly
Rust		Option
Scala	null	scala.Option
Standard ML		option
Swift		Optional
Visual Basic	Nothing

“Scores” are according to:

Does not have NULL.
Has NULL. Has an alternative in the language or standard libraries.
Has NULL. Has an alternative in a community library.
Has NULL.
Programmer’s worst nightmare. Multiple NULLs.

Edits

Ratings

Don’t take the “ratings” too seriously. The real point is to summarize the state of NULL in various languages and show alternatives to NULL, not to rank languages generally.

The info for a few languages has been corrected. Some languages have some sort of null pointer for compatibility with their runtimes, but it isn’t really usable in the language itself.

Example: Haskell’s Foreign.Ptr.nullPtr is used for FFI (Foreign Function Interface), for marshalling values to and from Haskell.
Example: Swift’s UnsafePointer must be used with unsafeUnwrap or !.
Counter-example: Scala, while idiomatically avoiding null, still treats null the same as Java, for increased interop. val x: String = null

When is NULL okay

It deserves mention that a special value of the same size, like 0 or NULL, can be useful when cutting CPU cycles, trading code quality for performance. This is handy in low-level languages like C when it really matters, but it really should be left there.

The REAL problem

The more general issue of NULL is that of sentinel values: values that are treated the same as others, but which have entirely different semantics. Returning either an integer index or the integer -1 from indexOf is a good example. NUL-terminated strings are another. This post focuses mostly on NULL, given its ubiquity and real-world effects, but just as Sauron is a mere servant of Morgoth, so too is NULL a mere manifestation of the underlying problem of sentinels.
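The indexOf sentinel can be replaced in the same way as NULL. As a rough sketch, the wrapper below turns the -1 returned by Java's List.indexOf into an OptionalInt; FindIndex is a hypothetical name for this example.

```java
import java.util.List;
import java.util.OptionalInt;

final class FindIndex {
    // Wraps List.indexOf so that "not found" is an empty OptionalInt, not -1.
    static OptionalInt indexOf(List<String> items, String target) {
        int i = items.indexOf(target); // the sentinel -1 means "absent"
        return i >= 0 ? OptionalInt.of(i) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        List<String> names = List.of("Bob", "Ted");
        System.out.println(indexOf(names, "Ted").getAsInt());   // 1
        System.out.println(indexOf(names, "Alice").isPresent()); // false
    }
}
```

With the sentinel, a caller can accidentally use -1 as if it were an index; with OptionalInt, the absence has to be handled before any index is available.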

The worst mistake of computer science - Lucidchart (2024)
