Does Java "pass-by-reference" or "pass-by-value"?

Introduction

This question comes up often, both online and whenever someone wants to check your knowledge of how Java treats variables:

Does Java "pass-by-reference" or "pass-by-value" when passing arguments to methods?

It seems like a simple question (it is), but many people get it wrong by saying:

Objects are passed by reference and primitive types are passed by value.

A correct statement would be:

Object references are passed by value, as are primitive types. Thus, Java passes by value, not by reference, in all cases.

This may sound unintuitive for some, as it's common for lectures to showcase the difference between an example like this:

public static void main(String[] args) {
    int x = 0;
    incrementNumber(x);
    System.out.println(x);
}

public static void incrementNumber(int x) {
    x += 1;
}

and an example like this:

public static void main(String[] args) {
    Number x = new Number(0);
    incrementNumber(x);
    System.out.println(x.value);
}

public static void incrementNumber(Number x) {
    x.value += 1;
}

public class Number {
    int value;
    // Constructor, getters and setters
}

The first example will print:

0

While the second example will print:

1

This difference is often attributed to "pass-by-value" in the first example (a copy of the value of x is passed, so operations on the copy don't affect the original) and "pass-by-reference" in the second (a reference is passed, so changes made through it are reflected in the original object).

In the following sections, we'll explain why this is incorrect.

How Java Treats Variables

Let's have a refresher on how Java treats variables, as that's the key to understanding the misconception. The misconception is based on true facts, but a little warped.

Primitive Types

Java is a statically-typed language. It requires us to first declare a variable, then initialize it, and only then can we use it:

// Declaring a variable and initializing it with the value 5
int i = 5;

// Declaring a variable and initializing it with a value of false
boolean isAbsent = false;

You can split up the process of declaration and initialization:

// Declaration
int i;
boolean isAbsent;

// Initialization
i = 5;
isAbsent = false;

But if you try to use an uninitialized variable:

public static void printNumber() {
    int i;
    System.out.println(i);
    i = 5;
    System.out.println(i);
}

You're greeted with an error:

Main.java:10: error: variable i might not have been initialized
System.out.println(i);

Local variables such as i have no default value. However, if you declare i as a static field instead (the closest thing Java has to a global variable), as in this example:

static int i;

public static void printNumber() {
    System.out.println(i);
    i = 5;
    System.out.println(i);
}

Running this, you'll then see the following output:

0
5

The variable i was output as 0, even though it wasn't yet assigned.

Each primitive type has a default value when it's declared as a field rather than a local variable: 0 for the integer types, 0.0 for float and double, false for boolean, and the null character for char.
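To see these defaults for ourselves, here's a minimal sketch that declares one static field per primitive type and prints it without assigning anything first. The class and field names are hypothetical and only serve the illustration:

```java
// Hypothetical demo class; the field names are illustrative only.
class DefaultValuesDemo {
    // Fields (unlike local variables) receive a default value automatically.
    static byte b;
    static short s;
    static int i;
    static long l;
    static float f;
    static double d;
    static boolean flag;
    static char c;

    public static void main(String[] args) {
        System.out.println("byte:    " + b);
        System.out.println("short:   " + s);
        System.out.println("int:     " + i);
        System.out.println("long:    " + l);
        System.out.println("float:   " + f);
        System.out.println("double:  " + d);
        System.out.println("boolean: " + flag);
        // The default char is the null character, so we print its numeric code instead.
        System.out.println("char code: " + (int) c);
    }
}
```

Moving any of these declarations into main() as a local variable brings back the "might not have been initialized" compile error.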

There are 8 primitive types in Java:

  • byte: Ranges from -128 to 127 inclusive, 8-bit signed integer
  • short: Ranges from -32,768 to 32,767 inclusive, 16-bit signed integer
  • int: Ranges from -2,147,483,648 to 2,147,483,647 inclusive, 32-bit signed integer
  • long: Ranges from -2^63 to 2^63-1 inclusive, 64-bit signed integer
  • float: Single precision, 32-bit IEEE 754 floating-point number with 6-7 significant decimal digits
  • double: Double precision, 64-bit IEEE 754 floating-point number with 15-16 significant decimal digits
  • boolean: Binary values, true or false
  • char: Ranges from 0 to 65,535 inclusive, 16-bit unsigned integer representing a Unicode character
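The ranges above don't have to be memorized, since the wrapper classes expose them as constants. As a small sketch (the class and the range() helper are hypothetical names), we can print them like this:

```java
class PrimitiveRangesDemo {
    // Formats one line of the overview; the helper name is hypothetical.
    static String range(String type, long min, long max) {
        return type + ": " + min + " to " + max;
    }

    public static void main(String[] args) {
        System.out.println(range("byte", Byte.MIN_VALUE, Byte.MAX_VALUE));
        System.out.println(range("short", Short.MIN_VALUE, Short.MAX_VALUE));
        System.out.println(range("int", Integer.MIN_VALUE, Integer.MAX_VALUE));
        System.out.println(range("long", Long.MIN_VALUE, Long.MAX_VALUE));
        // char is unsigned; widening it to long exposes its numeric bounds.
        System.out.println(range("char", Character.MIN_VALUE, Character.MAX_VALUE));
        // The floating-point types are easier to compare by their size in bits.
        System.out.println("float bits: " + Float.SIZE + ", double bits: " + Double.SIZE);
    }
}
```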

Passing Primitive Types

When we pass primitive types as method arguments, they're passed by value. Or rather, their value is copied and then passed to the method.

Let's go back to the first example and break it down:

public static void main(String[] args) {
    int x = 0;
    incrementNumber(x);
    System.out.println(x);
}

public static void incrementNumber(int x) {
    x += 1;
}

When we declare and initialize int x = 0;, we tell Java to reserve 4 bytes in the stack for the int. A small value such as 0 doesn't need all 32 bits (only values approaching Integer.MAX_VALUE do), but all 4 bytes are reserved regardless.

This spot in the memory is then referenced by the compiler when you want to use the integer x. The x variable name is what we use to access the memory location in the stack. The compiler has its own internal references to these locations.

Once we pass x to the incrementNumber() method, a new stack frame is created for the call, and the int x parameter in the method signature gets its own, separate memory location in the stack, initialized with a copy of the argument's value.

The variable name we use, x, has little meaning to the compiler. We can even go as far as to say that the int x we've declared in the main() method is x_1 and the int x we've declared in the method signature is x_2.

We then increase the value of x_2 in the method and print x_1. Naturally, the value stored in the memory location for x_1 is printed, and we see the following:

0

Here's a visualization of the code:

[Figure: Java stack and primitive variables]

In conclusion, each primitive variable refers to its own memory location, and passing it to a method copies its value into a new location.
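To make the x_1 and x_2 distinction visible, here's a minimal sketch of the same example in which the parameter is renamed to copy and its final value is returned. Both the rename and the return value are our own additions, made purely so the copy can be observed:

```java
class PrimitivePassingDemo {
    // The parameter is a separate variable (x_2 in the text), so we call it copy.
    static int incrementNumber(int copy) {
        copy += 1;      // changes only the method's own copy
        return copy;    // returned only so the caller can inspect the copy
    }

    public static void main(String[] args) {
        int x = 0;                                  // x_1 in the text
        int seenInside = incrementNumber(x);
        System.out.println("inside the method: " + seenInside); // 1
        System.out.println("back in main():    " + x);          // 0
    }
}
```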

Every running thread has its own stack. It holds local variables of primitive types, as well as references to objects in the heap (more on the heap in later sections).

This is likely what you already knew, and what everyone who gave the initial incorrect answer knows too. The biggest misconception lies in the next kind of type.

Reference Types

The other kind of type in Java is the reference type, which is what a variable holds whenever we work with objects.

When we declare and instantiate/initialize an object, a reference to it is created and stored in the variable. The syntax is very similar to that of primitive types:

// Declaration and Instantiation/initialization
Object obj = new Object();

Again, we can also split this process up:

// Declaration
Object obj;

// Instantiation/initialization
obj = new Object();

Note: There's a difference between instantiation and initialization. Instantiation refers to the creation of the object and assigning it a location in memory. Initialization refers to the population of this object's fields through the constructor, once it's created.

Once the object has been instantiated, the obj variable holds a reference to the new object in memory. The object itself is stored in the heap, unlike local primitive variables, which are stored in the stack.

Whenever an object is created, it's put in the heap. The Garbage Collector sweeps this heap for objects which have lost their references and removes them as we can't reach them anymore.

The default value of a reference-type field is null. null is not an instance of any type: the expression null instanceof T evaluates to false for every type T. If no value is assigned to a reference field, it holds null. A local reference variable, just like a local primitive, gets no default value and must be assigned before it's used.
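As a quick check of how null behaves, here's a small sketch. The class, the obj field, and the isInstanceOfObject() helper are hypothetical names chosen for this illustration:

```java
class NullDefaultDemo {
    // A reference-type field that is never assigned defaults to null.
    static Object obj;

    // instanceof evaluates to false whenever its left operand is null.
    static boolean isInstanceOfObject(Object candidate) {
        return candidate instanceof Object;
    }

    public static void main(String[] args) {
        System.out.println(obj == null);              // true
        System.out.println(isInstanceOfObject(obj));  // false
        obj = new Object();
        System.out.println(isInstanceOfObject(obj));  // true
    }
}
```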

Let's say we have a class such as an Employee:

public class Employee {
    String name;
    String surname;
    int salary; // used in the next section
}

And instantiate the class as:

Employee emp = new Employee();
emp.name = new String("David");
emp.surname = new String("Landup");

Here's what happens in the background:

[Figure: Java heap memory and object creation]

The emp reference points to an object in the heap space. This object contains references to two String objects which hold the values David and Landup.

Every time the new keyword is used, a new object is created.
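We can verify this with a short sketch that creates two String objects with the same contents and compares their references with ==. The class name and the sameObject() helper are hypothetical:

```java
class NewKeywordDemo {
    // == on references checks whether both point to the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String first = new String("David");
        String second = new String("David");

        System.out.println(sameObject(first, second)); // false: two separate objects
        System.out.println(first.equals(second));      // true: same contents
    }
}
```

Each use of new produced its own object in the heap, even though both objects hold the same characters.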

Passing Object References

Let's see what happens when we pass an object as a method argument:

public static void main(String[] args) {
    Employee emp = new Employee();
    emp.salary = 1000;
    incrementSalary(emp);
    System.out.println(emp.salary);
}

public static void incrementSalary(Employee emp) {
    emp.salary += 100;
}

We've passed our emp reference to the method incrementSalary(). The method accesses the int salary field of the object and increments it by 100. In the end, we're greeted with:

1100

This surely means that the object was passed by reference, since the object we wanted to access has indeed been changed.

Wrong. Just as with primitive types, there are two emp variables once the method has been called: emp_1 in main() and emp_2 in incrementSalary().

The difference between the primitive x we've used before and the emp reference we're using now is that both emp_1 and emp_2 point to the same object in memory.

Using either of these two references, the same object is accessed and the same information is changed.

[Figure: Java heap memory and multiple references to the same object]
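The same effect can be reproduced without any method call at all. In this sketch, Employee is a trimmed-down stand-in with only a salary field, and emp1 and emp2 play the roles of emp_1 and emp_2. All of these names are assumptions made for the illustration:

```java
class SharedReferenceDemo {
    // Minimal stand-in for the article's Employee; only salary is needed here.
    static class Employee {
        int salary;
    }

    static void incrementSalary(Employee emp) {
        emp.salary += 100; // modifies the object the copied reference points to
    }

    public static void main(String[] args) {
        Employee emp1 = new Employee();
        emp1.salary = 1000;

        Employee emp2 = emp1;   // copies the reference, not the object
        incrementSalary(emp2);

        System.out.println(emp1.salary);   // 1100
        System.out.println(emp1 == emp2);  // true: one object, two references
    }
}
```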

That being said, this brings us to the initial question.

Does Java "pass-by-reference" or "pass-by-value"?

Java passes by value. Primitive types get passed by value, object references get passed by value.

Java doesn't pass objects. It passes object references, so if anyone asks how Java passes objects, the answer is: "it doesn't". [1]

In the case of primitive types, once passed, the value is copied into a new space in the stack, and all further operations on the parameter affect only that new memory location.

In the case of object references, once passed, a copy of the reference is made, and the copy points to the same object in memory.
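A useful way to tell the two models apart is to reassign the parameter inside the method. If Java passed by reference, the caller's variable would change too. Here's a minimal sketch of that test, in which the class, the methods, and the Employee stand-in with its constructor are all hypothetical names of our own:

```java
class ReassignParameterDemo {
    // Minimal stand-in for the article's Employee, with a convenience constructor.
    static class Employee {
        int salary;

        Employee(int salary) {
            this.salary = salary;
        }
    }

    static void replaceEmployee(Employee emp) {
        // Rebinds only the method's local copy of the reference.
        emp = new Employee(5000);
    }

    static void swap(Employee a, Employee b) {
        // Swaps the two local copies; the caller's variables are untouched.
        Employee tmp = a;
        a = b;
        b = tmp;
    }

    public static void main(String[] args) {
        Employee emp = new Employee(1000);
        replaceEmployee(emp);
        System.out.println(emp.salary); // 1000

        Employee first = new Employee(1);
        Employee second = new Employee(2);
        swap(first, second);
        System.out.println(first.salary + " " + second.salary); // 1 2
    }
}
```

Neither call has any effect on the caller's variables, which is exactly what we'd expect when the references themselves are passed by value.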

1. According to Brian Goetz, the Java Language Architect working on the Valhalla and Amber projects. You can read more about this here.