Understanding Objects and References in PHP Memory

This article was initially written during my preparation for the PHP certification exam. I wanted to gain a deeper understanding of how PHP handles variables and objects in memory. My research revealed that finding clear answers to my questions was surprisingly difficult. Therefore, I decided to compile my findings into this comprehensive guide, hoping to provide others with a centralized resource on this topic.

This article focuses on the intricacies of object and variable reference management in memory within PHP. This is a topic that often sparks debates and diverse viewpoints. One common question is: “In PHP, are objects passed by reference or by copy by default?” To address this, we’ll first clarify what references are not in PHP, then delve into their true nature, and finally, explore the workings of PHP’s garbage collector.

When PHP encounters a statement like $a = new Foo();, how does it create objects in memory? While memory is no longer as costly or limited as it once was, it remains crucial for good PHP developers to grasp the internal mechanisms governing variables and objects during application execution.


Objects and References in PHP

There’s a lot of conflicting information out there – in PHP books and online – about how objects are handled. Some claim that objects in PHP are passed by reference by default, while others argue that they’re copied on assignment. To determine the accuracy of these statements, we first need to establish a clear understanding of what constitutes a reference in PHP (and what doesn’t).

What References in PHP Aren’t

Before we define what references are in PHP, it’s more crucial to understand what they’re not. Unlike C-style pointers, references in PHP don’t allow for arithmetic operations. This is because, unlike in C, references in PHP don’t directly represent memory addresses as numerical values indicating a memory location. So, if they’re not memory addresses, what are they?
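To make the contrast with pointers concrete, here is a minimal sketch. The variable names are hypothetical and chosen only for illustration. Arithmetic applied through a reference changes the shared value; there is no address to increment.

```php
<?php
// Hypothetical variable names, used only for this illustration.
$count = 10;
$ref = &$count;     // $ref refers to the same value as $count

$ref += 5;          // arithmetic acts on the value, not on a memory address

var_dump($count);   // int(15)
var_dump($ref);     // int(15)
```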

What References in PHP Are

In essence, references in PHP act as “aliases.” They provide a way for two distinct variables to access and modify the same underlying value. Think of them as mechanisms that enable access to a single value using different variable names. This means that these variables effectively behave as if they were one and the same.

It’s important to remember that, in PHP, variable names and their corresponding content are distinct entities, connected through the “symbol table.” Creating a reference simply adds an alias for that variable within this table.
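The alias behavior can be sketched with plain strings. The names $original and $alias are assumptions made for this example. Note how unset() removes only the name, while the value stays reachable through the other one.

```php
<?php
// $original and $alias are hypothetical names for this sketch.
$original = 'hello';
$alias = &$original;      // a second name for the same value

$alias = 'changed';       // writing through either name updates the shared value
var_dump($original);      // string(7) "changed"

unset($alias);            // removes the name only; the value lives on
var_dump($original);      // string(7) "changed"
var_dump(isset($alias));  // bool(false)
```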

Let’s illustrate this with an example:

$a = new Foo();

Upon executing this statement, PHP creates the variable $a in memory, along with an object of type Foo. An entry is added to the symbol table, signifying that the variable $a “references” (or points to, or is linked to – the terminology isn’t strict) the Foo object. However, it’s essential to note that this is not a true pointer relationship like you might find in C. A conceptual representation is shown below:

[Figure: the symbol table entry for $a linked to the Foo object in memory]

Quick question: What happens when we execute the following line of code?

$b = $a;

Contrary to what some might think, $b doesn’t become a reference to $a, nor does it receive a copy of the Foo object. What actually transpires is the creation of a new variable $b in memory. Subsequently, a new entry is made in the symbol table, indicating that $b also references the same Foo object as $a. Visually, this can be depicted as follows:

[Figure: $a and $b as two separate symbol table entries, both linked to the same Foo object]
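One way to observe that plain assignment shares the object rather than copying it is the short sketch below. The body of Foo is an assumption (the article only names the class), and spl_object_id() requires PHP 7.2 or later.

```php
<?php
// Foo is the class name used in the article; its body here is assumed.
class Foo
{
    public $value = 1;
}

$a = new Foo();
$b = $a;                 // $b now refers to the same object as $a

$b->value = 42;          // change the object through $b ...
var_dump($a->value);     // int(42): ... and the change is visible through $a
var_dump($a === $b);     // bool(true): both variables hold the same instance
var_dump(spl_object_id($a) === spl_object_id($b)); // bool(true)
```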

Now, if we execute this line:

$c = &$a;

A third variable $c is created in memory. However, instead of a new symbol table entry for $c, the table records $c as an alias for $a. This means $c behaves identically to $a, but it isn’t a pointer to $a. This differs from C, which would create a “pointer to a pointer.” We can visualize it like this:

[Figure: $a and $b as separate symbol table entries for the same Foo object, with $c recorded as an alias of $a]

When we assign a new value to $b, PHP has to intervene: it creates a new zval structure in memory to separate the content of $b from the $a/$c pair. From then on, $b can be modified independently, while $a and $c, being aliases of each other, continue to change together. Let’s add the following line to our previous code snippet:

$b = new Bar();

Now, our memory layout would resemble this:

[Figure: $a and its alias $c still linked to the Foo object, while $b is now linked to a new Bar object]
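The four statements discussed so far can be combined into one runnable sketch. Foo and Bar are declared as empty classes here purely as an assumption, so that the script is self-contained.

```php
<?php
// Empty placeholder classes, assumed for this illustration.
class Foo {}
class Bar {}

$a = new Foo();
$b = $a;         // second symbol table entry for the same Foo object
$c = &$a;        // alias of $a

$b = new Bar();  // only $b is re-linked

var_dump(get_class($a)); // string(3) "Foo"
var_dump(get_class($b)); // string(3) "Bar"
var_dump(get_class($c)); // string(3) "Foo"

$c = new Bar();          // writing through the alias re-links $a as well
var_dump(get_class($a)); // string(3) "Bar"
```

The last two lines show the difference between the two kinds of link: assigning to $b left $a untouched, whereas assigning to the alias $c changed $a too.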

Let’s expand our understanding with a more elaborate example:

<?php

class myClass {
    public $var;

    public function __construct() {
        $this->var = 1;
    }

    public function inc() { return ++$this->var; }
}

$a = new myClass(); // $a "references" a myClass object
$b = $a; // $b also references the same myClass object as $a
// ($a) == ($b) == <id> of the myClass object, but $a and $b are different entries in the symbol table

// Single quotes keep the variable names literal; inside double quotes PHP would
// try to convert the objects to strings and fail
echo '$a = '; var_dump($a);
echo '$b = '; var_dump($b);

$c = &$a; // $c is an alias of $a
// ($a, $c) == <id> of the myClass object; $c is an alias of $a in the symbol table
echo '$c = '; var_dump($c);

$a = NULL;
// The symbol table entry for "$a" no longer links to the myClass object
// Since $c is an alias of $a, $c is not related to the object anymore either
// The object still exists in memory because it is still linked by $b
echo '$a = '; var_dump($a);
echo '$b = '; var_dump($b);
echo '$c = '; var_dump($c);
echo '$b->var: ' . $b->inc() . PHP_EOL;
echo '$b->var: ' . $b->inc() . PHP_EOL;

$b = NULL;
// The symbol table entry for "$b" no longer links to the myClass object
// There are no more entries in the symbol table linked to the object,
// so it is not referenced anymore and its memory can be reclaimed

echo '$b = '; var_dump($b);

Executing this script produces the following output (var_dump’s line breaks are collapsed here for brevity):

$a = object(myClass)#1 (1) { ["var"]=> int(1) } 
$b = object(myClass)#1 (1) { ["var"]=> int(1) } 

$c = object(myClass)#1 (1) { ["var"]=> int(1) } 
$a = NULL 
$b = object(myClass)#1 (1) { ["var"]=> int(1) } 
$c = NULL 
$b->var: 2
$b->var: 3

$b = NULL
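The same rules answer the question of how objects are passed to functions. The sketch below uses a hypothetical Counter class and three hypothetical functions. A parameter behaves like the $b = $a assignment above, while a parameter declared with & behaves like the $c = &$a alias.

```php
<?php
// Counter and the three functions are hypothetical, for illustration only.
class Counter
{
    public $n = 0;
}

// The parameter links to the caller's object, so changes to it are shared.
function mutate(Counter $c): void
{
    $c->n++;
}

// Re-linking the parameter affects only the local variable.
function rebind(Counter $c): void
{
    $c = new Counter();
    $c->n = 100;
}

// With &, the parameter is an alias of the caller's variable.
function rebindByReference(Counter &$c): void
{
    $c = new Counter();
    $c->n = 100;
}

$counter = new Counter();

mutate($counter);
var_dump($counter->n); // int(1)

rebind($counter);
var_dump($counter->n); // int(1): the caller's variable is unchanged

rebindByReference($counter);
var_dump($counter->n); // int(100): the caller's variable was re-linked
```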

PHP Garbage Collection

Finally, let’s look at how PHP reclaims memory. PHP uses reference counting for objects and variables: every value keeps a count of how many symbol table entries (and other values) point to it. When that count drops to zero, nothing can reach the value anymore, and PHP frees its memory right away. Reference counting alone cannot handle circular references, where two objects point to each other and therefore never reach a count of zero. The garbage collector introduced in PHP 5.3 addresses this case: it periodically looks for such unreachable cycles and frees them.

To put it simply, PHP keeps track of how many variables are “pointing” to an object. Once that number drops to zero because the object is no longer being used, PHP cleans it up. For a more comprehensive look at PHP’s garbage collection, refer to the garbage collection chapter of the PHP manual.
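Reference counting can be observed with a destructor. The following is a minimal sketch, assuming a hypothetical Tracked class that records its own destruction and assuming the collector is enabled (the default). It also shows a circular reference, which is released only once gc_collect_cycles() runs.

```php
<?php
// Tracked is a hypothetical class that logs when its instances are destroyed.
class Tracked
{
    public static $log = [];
    public $partner = null;
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function __destruct()
    {
        self::$log[] = $this->name;
    }
}

gc_enable(); // the cycle collector is on by default; this just makes it explicit

// Case 1: no cycle. The object is destroyed as soon as its last link is removed.
$a = new Tracked('solo');
$b = $a;
$a = null;                           // $b still links to the object
var_dump(count(Tracked::$log));      // int(0)
$b = null;                           // count reaches zero, destructor runs now
var_dump(count(Tracked::$log));      // int(1)

// Case 2: a cycle. Each object keeps the other one alive.
$x = new Tracked('x');
$y = new Tracked('y');
$x->partner = $y;
$y->partner = $x;
unset($x, $y);                       // unreachable, but the counts are not zero
var_dump(count(Tracked::$log));      // int(1)

gc_collect_cycles();                 // force a collection run
var_dump(count(Tracked::$log));      // int(3)
```

In a real application the collector runs on its own once enough candidate values have accumulated; calling gc_collect_cycles() by hand is done here only to make the moment of collection visible.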

Closing Thoughts

This article aimed to shed light on the internal mechanisms of how PHP manages objects and variables in memory, and in particular on how it decides when an object’s memory can be reclaimed.

Armed with this understanding of PHP’s internal memory management for variables and objects, I encourage you to experiment with code to solidify your knowledge. Play around with variables and references, and observe how modifying a variable’s value impacts others that reference it. As a final exercise, consider the following code snippet. What will be the values of $a and $b after its execution?

$a = '1';
$b = &$a;
$b = "2$b";
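If you want to check your answer, a small script along these lines prints both values. It is the exercise above with a var_dump() call added, and the output is deliberately left for you to discover.

```php
<?php
$a = '1';
$b = &$a;      // $b is an alias of $a
$b = "2$b";    // the string is built from the current value of $b, then assigned

var_dump($a, $b);
```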

For those interested in diving deeper into PHP performance optimization, I recommend checking out this insightful post by my esteemed Toptaler colleague, Vilson Duka.

Licensed under CC BY-NC-SA 4.0