The Javascript type system, part #1: language types

In this four-part series I will try to give a summary, as streamlined as possible, of what I have learned about the concept of “types” or “classes” of objects and values in Javascript. The topic can get fairly extensive, as the quirks and substantial differences compared to other mainstream programming languages become quite apparent on closer inspection, so I will try to keep things clear rather than cover every corner case.

When it comes to types in Javascript, some people simply don’t seem to care a lot. After all, it’s possible to create working software in this language following vague assumptions like “Everything is an object” or even “Javascript is a dynamic language, you don’t need to think about types”. While both statements are based on the experience that “it just works”, they are false, strictly speaking. The dynamic nature of Javascript does not imply the absence of types at all. Personally, when I write Javascript code, I try to always be very aware of the types I’m dealing with.

Objects and values in Javascript can be categorized according to no fewer than three different type taxonomies:

  • The actual type (like string, number, object …), provided by the Javascript language (i.e. the “language type”)
  • The inherited classification an object exposes by means of its prototype chain (i.e. the “class”)
  • The internal meta type provided by the ECMAScript specification (i.e. the [[Class]])

To keep the individual posts short, I intend to discuss each of these systems separately, closing the series with a discussion of type coercion, and beginning today with the actual language types.

The taxonomy of Javascript’s language types

There are two fundamental categories of types in Javascript: primitives and objects. Primitives are value types, whereas objects are reference types. Therefore, when passed as an argument to a function, a primitive is passed as a copy of the value, whereas an object is passed as a copy of the reference (to the very same object). Javascript shares these characteristics with languages like C# and Java. Now let’s move on to the differences:

While the group of primitives consists of five different types (as of ECMAScript 5; later editions of the language added Symbol and BigInt) …

  • The Undefined type
  • The Null type
  • The Boolean type
  • The Number type
  • The String type

… the group of objects is represented by just one single type …

  • The Object type

So, we need to rephrase the aforementioned statement “Everything is an object” to:

“Everything that is not undefined, null, true, false, any number or any string, is an object.”
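To make the value-versus-reference distinction from the beginning of this section a bit more tangible, here is a minimal sketch. The function names reassign, mutate and replace are made up for this illustration; the point is only that a function receives a copy of the value (for primitives) or a copy of the reference (for objects).

```javascript
// Hypothetical helper functions, named for this illustration only.
function reassign(value) {
    value = 42;              // only rebinds the local parameter
}

function mutate(obj) {
    obj.name = "Bert";       // changes the object that both references point to
}

function replace(obj) {
    obj = { name: "Elmo" };  // only rebinds the local copy of the reference
}

var num = 3;
reassign(num);
console.log(num);            // prints 3

var puppet = { name: "Ernie" };
mutate(puppet);
console.log(puppet.name);    // prints "Bert"

replace(puppet);
console.log(puppet.name);    // still prints "Bert"
```

Note that replace has no visible effect for the caller: the reference itself was copied, so assigning a new object to the parameter leaves the original variable untouched.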

Let’s talk about objects first, as they’re really easy and straightforward:

var ernie = {}; // ernie is of type "Object"
ernie.name = "Ernie"; // name is a string property
console.log(ernie.name); // prints "Ernie"

var bert = { name: "Bert" }; // properties can be defined inline
ernie.friend = bert; // friend is an object property
console.log(ernie.friend.name); // prints "Bert"

ernie["home"] = "Sesame Street"; // alternative syntax for property access
console.log(ernie["home"]); // prints "Sesame Street"

As you can see, an object in Javascript is just an unordered collection of key-value pairs representing its properties. Each key is an arbitrary string, unique within the same object, of course; the corresponding value can be any primitive value or object. As functions are also objects in Javascript, the value of an object’s property can be a function, which then looks like what is called a “method” in classical object-oriented languages. However, apart from being held as the value of an object property, such functions do not have any connection to the respective object whatsoever. They can be called in the context of any other object or primitive value just as well, as you can see in the following example:

var ernie = { name: "Ernie" };
ernie.printName = function () {
    console.log(this.name);
};
ernie.printName(); // prints "Ernie"

var bert = { name: "Bert" };
bert.printName = ernie.printName; // use ernie's function also on bert
bert.printName(); // prints "Bert"
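A function doesn’t even have to be stored in a property to be called in the context of some value. The following sketch uses the built-in call method of functions; getName and getLength are hypothetical names chosen for this illustration.

```javascript
// getName and getLength are hypothetical helpers for this illustration.
var getName = function () {
    return this.name;
};
var getLength = function () {
    return this.length;
};

var ernie = { name: "Ernie" };
var bert = { name: "Bert" };

// Neither object holds the function as a property; "this" is supplied per call.
console.log(getName.call(ernie));      // prints "Ernie"
console.log(getName.call(bert));       // prints "Bert"

// The same works with a primitive value as "this".
console.log(getLength.call("Ernie"));  // prints 5
```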

In contrast to objects, primitives are low-level entities: they’re immutable and cannot have properties. The distinction between primitives and objects is therefore fundamental. Let’s do a few experiments to get a grip on primitives:

var num = 3; // num is of type "Number"
num.someProperty = "foo"; // seems to fail silently (in non-strict code; strict mode throws a TypeError)
console.log(num.someProperty); // prints "undefined", as num has no property called "someProperty"

var nil = null; // nil is of type "Null"
nil.someProperty = "foo"; // throws a TypeError

var boo = true; // boo is of type "Boolean"
console.log(boo.toString()); // prints "true"

Looking at these examples raises a few questions: Why does setting a property on a number value apparently fail without errors whereas doing the same on a null value throws an error? And why does calling toString on boo nevertheless return the correct result although I stated that primitive values cannot have properties?

The answer to both questions is: type coercion. Behind the scenes, Javascript tries really hard to translate values from one type to another in order to successfully execute the requested operations on them. In doing so it follows a set of rules which is beyond the scope of this article, but the one important thing to consider for now in the context of primitive types is:

Three of the five primitive types have a counterpart in the object world that acts like a wrapper.

Boolean values, numbers and strings can be enclosed in an object by calling the constructor functions Boolean, Number or String with new. This procedure is called boxing. Once you have a box containing a value, you can treat it like any other object: define properties on it, call functions (like toString) on it and so on. The one thing you can’t do is change the content of the box, as the constructor function is the only way to set the underlying value.

Considering these type coercions, what’s really going on behind the scenes of our last example can be imagined like this:

var num = 3; // num is of type "Number"
(new Number(num)).someProperty = "foo"; // creates a box and defines "someProperty" on it
console.log(num.someProperty); // "someProperty" was defined on the box, not on num!

var boo = true; // boo is of type "Boolean"
console.log((new Boolean(boo)).toString()); // creates a box and calls "toString" on it

I intentionally stated that it could be “imagined” like this, because it’s not the whole truth: a strict-mode function called on a primitive value receives the primitive itself as this rather than a box, most likely for performance reasons. Nevertheless, the mental model stays the same, at least from the caller’s perspective. To verify this behavior, we can call a function in the context of a primitive value (i.e. using the value as the this “parameter”) and evaluate its type inside the function (for details about the typeof operator see the next section):

var printMeNonStrict = function () {
    console.log(typeof this);
};

var printMeStrict = function () {
    "use strict";
    console.log(typeof this);
};

var boo = true; // boo is a primitive of type "Boolean"
printMeNonStrict.call(boo); // prints "object" (in "printMeNonStrict", "this" is a box)
printMeStrict.call(boo); // prints "boolean" (in "printMeStrict", "this" is a primitive)

Boxed primitives in Javascript are of more academic than practical significance. As a developer, you should never manually box a primitive value! The resulting objects are quite useless and also very dangerous:

var disguise = new String("Foo");
console.log(disguise === "Foo"); // prints "false", as object !== string

var eraseMyHarddrive = new Boolean(false);
if (eraseMyHarddrive) {
    // this code will get executed because "eraseMyHarddrive"
    // evaluates to "true" (all objects are truthy, even a boxed "false")!
    console.log("Done!");
}
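If you ever end up with a box anyway (for example, one handed to you by someone else’s code), the following sketch shows one way to get back to safe ground. It relies on two things: the valueOf method of a box returns the enclosed primitive, and Boolean, Number and String return primitives when called without new. The variable names are just examples.

```javascript
var boxed = new Number(3);
console.log(typeof boxed);            // prints "object"
console.log(typeof boxed.valueOf());  // prints "number"

// Called without "new", the same functions convert and return primitives.
console.log(typeof Number("42"));     // prints "number"
console.log(typeof String(42));       // prints "string"
console.log(typeof Boolean(0));       // prints "boolean"

// Unboxing restores the expected truthiness of a boxed "false".
var flag = new Boolean(false);
console.log(flag.valueOf() ? "truthy" : "falsy");  // prints "falsy"
```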

The typeof operator

To determine the type of a value in Javascript, we can use the typeof operator. As it is a unary operator, it is applied without parentheses; adding them doesn’t change the meaning, however:

typeof "foo" === typeof("foo") // true

The result of this operator is a string containing the lowercased type name:

typeof undefined // "undefined"
typeof null // "object"
typeof true // "boolean"
typeof 42 // "number"
typeof "foo" // "string"
typeof {} // "object"
typeof [] // "object"
typeof new Date() // "object"
typeof Math // "object"
typeof function () {} // "function"

As you might have noticed from these examples, the typeof operator is seriously broken:

  • It returns “object” instead of “null” when applied to null
  • It returns “function” instead of “object” when applied to an object that is a function

The first bug is rather obvious: apparently null was so strongly intended to indicate the absence of an object that it itself “became” an object. The second one is more subtle: functions in Javascript are objects, just like arrays or regular expressions. Although they’re Javascript’s mightiest weapon, they do not form a language type of their own alongside Object. TC39, the committee at Ecma International responsible for the development of Javascript (technically: ECMAScript), has done nothing (and will do nothing) to fix these bugs, and rightly so, because doing so would probably break every major Javascript application ever written.
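To see that functions really behave like ordinary objects, here is a quick sketch. The function greet and its author property are made up for this illustration.

```javascript
// greet and its "author" property are hypothetical examples.
var greet = function () {
    return "Hello";
};

greet.author = "Ernie";                // functions accept properties like any object
console.log(greet.author);             // prints "Ernie"
console.log(greet instanceof Object);  // prints true
console.log(typeof greet);             // prints "function", not "object"
console.log(typeof null);              // prints "object", not "null"
```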

On the flip side, the “function” bug is a blessing in disguise: there aren’t any other good ways in Javascript to determine whether an object is a function (and can therefore be called), so typeof comes in handy.

Concerning the everyday use of the typeof operator the following practices are recommended:

  • Use it to identify an object as a function
  • Use it to distinguish primitives, but never forget the extra check for null
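These two practices could be wrapped up roughly like this. It’s just a minimal sketch: typeOf and isFunction are hypothetical helper names, not built-ins.

```javascript
// typeOf and isFunction are hypothetical helper names, not built-ins.
function typeOf(value) {
    if (value === null) {
        return "null";  // typeof alone would report "object" here
    }
    return typeof value;
}

function isFunction(value) {
    return typeof value === "function";
}

console.log(typeOf(null));        // prints "null"
console.log(typeOf(undefined));   // prints "undefined"
console.log(typeOf(42));          // prints "number"
console.log(typeOf("foo"));       // prints "string"
console.log(typeOf([]));          // prints "object"
console.log(isFunction(typeOf));  // prints true
console.log(isFunction({}));      // prints false
```

Note that typeOf still returns "function" for functions, as it simply passes on the result of typeof for everything except null.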

This is the end for today, stay tuned for part #2 …