Notes on Equality Comparison in Java

TL;DR

🞿 These are my personal notes, made while studying Java, so the content below should not be taken seriously as a technical reference.

🞿 I will be spending 70% of the note building up the basic understanding of the Java memory model... to explain a difference that could be summarized in a single sentence:

== checks same value for primitives and same object for references; equals() checks whatever the class defines as equality.

The Context #

In Java, we have two ways to compare values for equality: the equals() method and the == operator.

The equals() method has existed since JDK 1.0 ¹ as a public method of the class Object.
The == operator is common in most high-level programming languages, and in Java it belongs to the category of Equality & Relational Operators ².

So what's the difference between using == and equals() in Java? Turns out there's a lot.

The Mechanics #

To understand how primitive values and objects are compared for equality in Java, I think we must first roughly understand how they are managed memory-wise.

For context, Java is a statically and strongly typed language. Static here means that every variable and expression has a type that is known at compile time, and strong here means that types limit the values a variable can hold and the operations supported on those values. See Chapter 4 of the Java Spec for more detail on the type system.

Java Type System #

Java is object-oriented, but not everything is an object since there are primitive values. In other words, there are two kinds of types in Java: Primitive types and Reference types ³.

It is clear that a primitive value such as an integer 5 is simply represented as ...000101 in binary format (bits) in memory.

However, the JVM Spec (section 2.3) only guarantees that the semantics (logic/math) of int values are 32-bit. The specification grants JVM implementations such as HotSpot, OpenJ9, etc. freedom in how they map the abstract machine to physical hardware.
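A small sketch of what those guaranteed 32-bit semantics look like from the program's side, whatever the hardware does underneath. The class name IntSemantics and its helper methods are made up for illustration; Integer.SIZE, Integer.toBinaryString() and the wrap-around on overflow are standard Java behavior.

```java
final class IntSemantics {
    // Binary digits of an int, without leading zeros.
    static String bits(int value) {
        return Integer.toBinaryString(value);
    }

    // int arithmetic wraps around at 32 bits (two's complement).
    static int addWithWrap(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(Integer.SIZE);                      // 32
        System.out.println(bits(5));                           // 101
        System.out.println(addWithWrap(Integer.MAX_VALUE, 1)); // -2147483648
    }
}
```

Note that nothing here reveals how the JVM actually stores the value; only the observable behavior is specified.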

Meanwhile, reference type (class, array, interface, String, etc.) values are really just values that point to somewhere in the heap space, hence the name reference.

Most simply, reference type values are values that the JVM can use to find the actual object during runtime.

Less simply, a reference is a value that identifies an object; how it is represented (pointer, handle (pointers to pointers), compressed pointer, etc.) is JVM dependent.

In the HotSpot implementation, a reference is a bit pattern representing a virtual memory address (a pointer, or a compressed offset decoded into a pointer) that indicates the starting byte of the JVM Object Header on the Heap.
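To make "a reference is a value that identifies an object" concrete, here is a minimal sketch. ReferenceDemo and sameObject() are hypothetical names used only for this note. Assigning one reference variable to another copies the reference, so both variables identify the same object, while clone() produces a second object.

```java
final class ReferenceDemo {
    // Returns true when both variables refer to the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        int[] first = {1, 2, 3};
        int[] alias = first;          // copies the reference, not the array
        int[] copy = first.clone();   // a new array object with equal content

        alias[0] = 99;                // the change is visible through `first`
        System.out.println(first[0]);                 // 99
        System.out.println(copy[0]);                  // 1
        System.out.println(sameObject(first, alias)); // true
        System.out.println(sameObject(first, copy));  // false
    }
}
```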

Q&A

Q: Is Java's reference a pointer?

A: Short answer, no.

There are multiple JVM implementations, and not all of them have to represent a reference as a raw address. On top of that, the GC can move objects (compacting collectors, generational collectors) to reduce fragmentation and improve locality. So we cannot expect the address behind a reference to stay the same over time.

This means Java references are GC-aware and should not be addresses/pointers that the programmers can directly control.

Q&A

Q: What's in the JVM header/metadata?

A: See JVM.

Pass-by-value #

Now that we know something about how Java manages its objects, we can look at why Java is actually pass-by-value and not pass-by-reference.

The assumption: When passing an object to a method, we are passing the object itself, or rather the direct link to the original one.

The reality: Java copies the bits stored in the original variable. If a variable p holds a reference (an address, say), that reference is copied into the new method parameter; the object itself is not copied.

So in short, Java is pass-by-value, and the values passed are reference values for reference types.
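The following sketch shows both halves of that statement. Person, rename() and replace() are hypothetical names invented for this illustration. Because the parameter is a copy of the caller's reference, mutating the object through it is visible to the caller, but reassigning the parameter is not.

```java
final class PassByValueDemo {
    // Hypothetical class, used only for this illustration.
    static final class Person {
        String name;
        Person(String name) { this.name = name; }
    }

    // Mutates the object the parameter refers to; the caller sees this.
    static void rename(Person p) {
        p.name = "Bob";
    }

    // Reassigns the local copy of the reference; the caller's variable is unaffected.
    static void replace(Person p) {
        p = new Person("Carol");
    }

    public static void main(String[] args) {
        Person p = new Person("Alice");

        rename(p);
        System.out.println(p.name); // Bob

        replace(p);
        System.out.println(p.name); // still Bob
    }
}
```

If Java were pass-by-reference, the second call would have made the caller's p refer to the new Person.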

Conclusion on Equality Comparison #

Now that we have clarified the concept of references in Java, we can finally produce a clearer answer on how == and .equals() differ:

Operator	Primitive type	Reference type
`==`	Compares the values directly	Compares the reference values, i.e. whether both refer to the same object (the reference could be a heap address)
`.equals()`	Not applicable: primitive types have no methods	Depends on the class's implementation. The default inherited from `Object` behaves like `==`, while the `String` class's implementation compares the underlying content. An example of a more complex `equals()` implementation can be seen in the HashMap class, which we will discuss in another note.
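The table can be checked with a short sketch. EqualityDemo and Point are hypothetical names made up for this note. Point overrides equals() (and hashCode(), to keep the two consistent) to compare content, while a plain Object falls back to identity.

```java
final class EqualityDemo {
    // Hypothetical value class that defines its own notion of equality.
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Point)) return false;
            Point that = (Point) other;
            return x == that.x && y == that.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    public static void main(String[] args) {
        int a = 5, b = 5;
        System.out.println(a == b);          // true: same primitive value

        String s1 = new String("java");
        String s2 = new String("java");
        System.out.println(s1 == s2);        // false: two distinct objects
        System.out.println(s1.equals(s2));   // true: same content

        Point p1 = new Point(1, 2);
        Point p2 = new Point(1, 2);
        System.out.println(p1 == p2);        // false
        System.out.println(p1.equals(p2));   // true

        Object o1 = new Object();
        Object o2 = new Object();
        System.out.println(o1.equals(o2));   // false: Object.equals defaults to ==
    }
}
```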
