Introduction
The question comes up often, both online and when someone wants to check your knowledge of how Java treats variables:
Does Java "pass-by-reference" or "pass-by-value" when passing arguments to methods?
It seems like a simple question (it is), but many people get it wrong by saying:
Objects are passed by reference and primitive types are passed by value.
A correct statement would be:
Object references are passed by value, as are primitive types. Thus, Java passes by value, not by reference, in all cases.
This may sound unintuitive for some, as it's common for lectures to showcase the difference between an example like this:
public static void main(String[] args) {
    int x = 0;
    incrementNumber(x);
    System.out.println(x);
}
public static void incrementNumber(int x) {
    x += 1;
}
and an example like this:
public static void main(String[] args) {
    Number x = new Number(0);
    incrementNumber(x);
    // Print the field; printing x itself would rely on toString()
    System.out.println(x.value);
}
public static void incrementNumber(Number x) {
    x.value += 1;
}
public class Number {
    int value;
    // Constructor, getters and setters
}
The first example will print:
0
While the second example will print:
1
The reason for this difference is often understood to be "pass-by-value" (in the first example, a copy of the value of x is passed, and operations on the copy aren't reflected in the original) and "pass-by-reference" (in the second example, a reference is passed, and changes made through it are reflected in the original object).
In the following sections, we'll explain why this is incorrect.
How Java Treats Variables
Let's have a refresher on how Java treats variables, as that's the key to understanding the misconception. The misconception is based on true facts, but a little warped.
Primitive Types
Java is a statically-typed language. It requires us to first declare a variable, then initialize it, and only then can we use it:
// Declaring a variable and initializing it with the value 5
int i = 5;
// Declaring a variable and initializing it with a value of false
boolean isAbsent = false;
You can split up the process of declaration and initialization:
// Declaration
int i;
boolean isAbsent;
// Initialization
i = 5;
isAbsent = false;
But if you try to use an uninitialized variable:
public static void printNumber() {
    int i;
    System.out.println(i);
    i = 5;
    System.out.println(i);
}
You're greeted with an error:
Main.java:10: error: variable i might not have been initialized
System.out.println(i);
There are no default values for local variables such as i. However, if you declare i as a field of the class (the closest thing Java has to a global variable), as in this example:
static int i;
public static void printNumber() {
    System.out.println(i);
    i = 5;
    System.out.println(i);
}
Running this, you'll then see the following output:
0
5
The variable i was output as 0, even though it hadn't been assigned yet.
Each primitive type has a default value when it's declared as a field (static or instance): 0 for the numeric types, false for boolean, and the null character '\u0000' for char.
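As a quick check of the defaults described above, here is a minimal sketch. The class name FieldDefaults and its field names are made up for illustration; the fields are deliberately never assigned.

```java
public class FieldDefaults {
    // Hypothetical fields, intentionally left unassigned.
    // Fields (unlike local variables) receive a default value automatically.
    static byte b;
    static short s;
    static int i;
    static long l;
    static float f;
    static double d;
    static boolean flag;
    static char c;

    public static void main(String[] args) {
        System.out.println(b);       // 0
        System.out.println(s);       // 0
        System.out.println(i);       // 0
        System.out.println(l);       // 0
        System.out.println(f);       // 0.0
        System.out.println(d);       // 0.0
        System.out.println(flag);    // false
        System.out.println((int) c); // 0, since the char default is '\u0000'
    }
}
```

The char is cast to int before printing because the null character itself isn't visible in most terminals.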
There are 8 primitive types in Java:
byte: Ranges from -128 to 127 inclusive, 8-bit signed integer
short: Ranges from -32,768 to 32,767 inclusive, 16-bit signed integer
int: Ranges from -2,147,483,648 to 2,147,483,647 inclusive, 32-bit signed integer
long: Ranges from -2^63 to 2^63-1 inclusive, 64-bit signed integer
float: Single precision, 32-bit IEEE 754 floating-point number with 6-7 significant decimal digits
double: Double precision, 64-bit IEEE 754 floating-point number with 15-16 significant decimal digits
boolean: Binary values, true or false
char: Ranges from 0 to 65,535 inclusive, 16-bit unsigned integer representing a Unicode character (a UTF-16 code unit)
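The integer ranges listed above can be read straight from the constants on the standard wrapper classes. The sketch below does that; the class name PrimitiveRanges and the range() helper are hypothetical, while the MIN_VALUE, MAX_VALUE and BYTES constants are part of java.lang.

```java
public class PrimitiveRanges {
    // Hypothetical helper: formats an inclusive range as "min to max".
    static String range(long min, long max) {
        return min + " to " + max;
    }

    public static void main(String[] args) {
        System.out.println("byte:  " + range(Byte.MIN_VALUE, Byte.MAX_VALUE));
        System.out.println("short: " + range(Short.MIN_VALUE, Short.MAX_VALUE));
        System.out.println("int:   " + range(Integer.MIN_VALUE, Integer.MAX_VALUE));
        System.out.println("long:  " + range(Long.MIN_VALUE, Long.MAX_VALUE));
        // char is unsigned, so its constants are widened to a number for printing
        System.out.println("char:  " + range(Character.MIN_VALUE, Character.MAX_VALUE));
        // Size of an int in bytes
        System.out.println("int size in bytes: " + Integer.BYTES);
    }
}
```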
Passing Primitive Types
When we pass primitive types as method arguments, they're passed by value. Or rather, their value is copied and then passed to the method.
Let's go back to the first example and break it down:
public static void main(String[] args) {
    int x = 0;
    incrementNumber(x);
    System.out.println(x);
}
public static void incrementNumber(int x) {
    x += 1;
}
When we declare and initialize int x = 0;, we've told Java to reserve a 4-byte slot on the stack for the int. The value doesn't have to use the full range of those 4 bytes (up to Integer.MAX_VALUE), but all 4 bytes are reserved.
This spot in memory is then referenced by the compiler whenever you use the integer x. The variable name x is what we use to access that memory location on the stack. The compiler has its own internal references to these locations.
Once we've passed x to the incrementNumber() method and the compiler reaches the method signature with the int x parameter, it creates a new memory location on the stack that holds a copy of the value.
The variable name we use, x, has little meaning to the compiler. We can even go as far as to say that the int x we've declared in the main() method is x_1 and the int x we've declared in the method signature is x_2.
We then increment the value of x_2 in the method and print x_1. Naturally, the value stored in the memory location for x_1 is printed and we see the following:
0
Here's a visualization of the code:
In conclusion, the compiler makes a reference to the memory location of primitive variables.
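To make the two separate memory locations visible, here is a sketch of the same example with the variables renamed to x1 and x2. The renaming and the class name PrimitiveCopyDemo are illustrative only, and unlike the original, the method returns its copy so we can inspect it.

```java
public class PrimitiveCopyDemo {
    // x2 is the method's own copy of whatever value the caller passed in
    static int incrementNumber(int x2) {
        x2 += 1;   // modifies only the copy
        return x2; // returned here purely so the caller can inspect the copy
    }

    public static void main(String[] args) {
        int x1 = 0;
        int copyAfterIncrement = incrementNumber(x1);
        System.out.println(x1);                 // 0, the original is untouched
        System.out.println(copyAfterIncrement); // 1, the copy was incremented
    }
}
```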
A stack exists for every thread we're running and it's used for static memory allocation of simple variables, as well as references to the objects in the heap (More on the heap in later sections).
This is likely what you already knew, and what everyone who answered with the initial incorrect statement knows. The biggest misconception lies in the next data type.
Reference Types
The type used for passing objects around is the reference type.
When we declare and instantiate/initialize an object, a reference to it is created, much like with primitive types:
// Declaration and Instantiation/initialization
Object obj = new Object();
Again, we can also split this process up:
// Declaration
Object obj;
// Instantiation/initialization
obj = new Object();
Note: There's a difference between instantiation and initialization. Instantiation refers to the creation of the object and assigning it a location in memory. Initialization refers to the population of this object's fields through the constructor, once it's created.
Once we're done with the instantiation, the obj variable holds a reference to the new object in memory. This object is stored in the heap, unlike local primitive variables, which are stored on the stack.
Whenever an object is created, it's put in the heap. The Garbage Collector sweeps this heap for objects which have lost their references and removes them as we can't reach them anymore.
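The default value of a reference-type field is null. (A local reference variable, like a local primitive, has no default and must be assigned before use.) There is no type that null is an instanceof: the expression null instanceof T evaluates to false for every type T. If no object is assigned to a reference field such as obj, it holds null rather than pointing to any object.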
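A small sketch of the null default, assuming a hypothetical class NullDefaultDemo with a field named obj that is never assigned:

```java
public class NullDefaultDemo {
    // Reference-type field, intentionally never assigned
    static Object obj;

    public static void main(String[] args) {
        System.out.println(obj == null);           // true
        // instanceof is false for null, even against Object
        System.out.println(obj instanceof Object); // false
    }
}
```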
Let's say we have a class such as Employee:
public class Employee {
    String name;
    String surname;
    int salary; // used by the salary example further down
}
And instantiate the class as:
Employee emp = new Employee();
emp.name = new String("David");
emp.surname = new String("Landup");
Here's what happens in the background:
The emp reference points to an object in the heap space. This object contains references to two String objects which hold the values David and Landup.
Every time the new keyword is used, a new object is created.
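One way to observe that each use of new yields a distinct object is to compare references with ==, which checks identity rather than content. This is a sketch; the class NewObjectDemo and the sameObject() helper are hypothetical names.

```java
public class NewObjectDemo {
    // Hypothetical helper: true only if both references point to the very same object
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String first = new String("David");
        String second = new String("David");
        String alias = first; // copies the reference, creates no new object

        System.out.println(sameObject(first, second)); // false, two separate objects
        System.out.println(first.equals(second));      // true, same contents
        System.out.println(sameObject(first, alias));  // true, one object, two references
    }
}
```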
Passing Object References
Let's see what happens when we pass an object as a method argument:
public static void main(String[] args) {
    Employee emp = new Employee();
    emp.salary = 1000;
    incrementSalary(emp);
    System.out.println(emp.salary);
}
public static void incrementSalary(Employee emp) {
    emp.salary += 100;
}
We've passed our emp reference to the method incrementSalary(). The method accesses the int salary field of the object and increments it by 100. In the end, we're greeted with:
1100
This surely means that the reference has been passed between the method call and the method itself, since the object we wanted to access has indeed been changed.
Wrong. Just as with primitive types, there are two emp variables once the method has been called, emp_1 and emp_2, in the eyes of the compiler.
The difference between the primitive x we've used before and the emp reference we're using now is that both emp_1 and emp_2 point to the same object in memory.
Using either of these two references, the same object is accessed and the same information is changed.
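A way to convince yourself that emp_2 is only a copy of the reference is to reassign it inside the method. The sketch below assumes a trimmed-down Employee with just a salary field, and replaceEmployee() is a hypothetical method added for illustration.

```java
public class ReassignDemo {
    // Minimal stand-in for the Employee class, only the field needed here
    static class Employee {
        int salary;
    }

    // Mutates the object that both references point to
    static void incrementSalary(Employee emp) {
        emp.salary += 100;
    }

    // Hypothetical method: rebinds the local copy of the reference to a new object
    static void replaceEmployee(Employee emp) {
        emp = new Employee(); // only the method's copy now points elsewhere
        emp.salary = 5000;    // changes the new object, not the caller's
    }

    public static void main(String[] args) {
        Employee emp = new Employee();
        emp.salary = 1000;

        replaceEmployee(emp);
        System.out.println(emp.salary); // 1000, the caller's reference is unaffected

        incrementSalary(emp);
        System.out.println(emp.salary); // 1100, the shared object was modified
    }
}
```

If Java passed references by reference, the first println would show 5000, because the caller's emp would have been rebound to the new object.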
That being said, this brings us to the initial question.
Does Java "pass-by-reference" or "pass-by-value"?
Java passes by value. Primitive types get passed by value, object references get passed by value.
Java doesn't pass objects. It passes object references - so if anyone asks how Java passes objects, the answer is: "it doesn't".[1]
In the case of primitive types, once passed, they get allocated a new space in the stack and thus all further operations on that parameter are linked to the new memory location.
In the case of object references, once passed, a new reference is made, but pointing to the same memory location.
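The classic test of this rule is a swap method. The sketch below uses a hypothetical Employee stand-in with only a name field. Because the method receives copies of the two references, swapping them has no effect on the caller.

```java
public class SwapDemo {
    // Minimal stand-in class, hypothetical and reduced to one field
    static class Employee {
        String name;

        Employee(String name) {
            this.name = name;
        }
    }

    // Swaps only the method's local copies of the two references
    static void swap(Employee a, Employee b) {
        Employee tmp = a;
        a = b;
        b = tmp;
    }

    public static void main(String[] args) {
        Employee first = new Employee("David");
        Employee second = new Employee("Landup");

        swap(first, second);

        System.out.println(first.name);  // David
        System.out.println(second.name); // Landup
    }
}
```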
[1] According to Brian Goetz, the Java Language Architect working on the Valhalla and Amber projects.