Monday, September 27, 2021

How to Clone a Collection in Java? Deep copy of ArrayList and HashSet Example

Programmers often mistake the copy constructors provided by various collection classes for a means of cloning a Collection such as a List, Set, ArrayList, HashSet, or any other implementation. What is worth remembering is that the copy constructor of a Collection in Java provides only a shallow copy, not a deep copy, which means the objects stored in the original List and the cloned List are the same and point to the same memory location in the Java heap. One thing that adds to this misconception is the shallow copy of Collections holding immutable objects. Since immutable objects can't be changed, it's OK even if two collections point to the same object. This is exactly the case with String objects, including those in the String pool: a String cannot be modified in place, so replacing an element in one collection will not affect the other.

The problem arises when we use the copy constructor of ArrayList to create a clone of a List of Employees, where Employee is not immutable. In this case, if an employee is modified through the original collection, that change will also be reflected in the cloned collection.

Similarly, if an employee is modified in the cloned collection, it will also appear as modified in the original collection. This is not desirable; in almost all cases, the clone should be independent of the original object. A solution to this problem is deep cloning of the collection, which means recursively cloning objects until you reach a primitive or an immutable object.
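The aliasing described above can be seen in a minimal sketch. The names `ShallowCopyDemo` and `Worker` are hypothetical stand-ins chosen for this illustration only; they are not part of the example used later in this article.

```java
import java.util.ArrayList;
import java.util.List;

public class ShallowCopyDemo {

    // Hypothetical mutable element type, used only for this illustration.
    static class Worker {
        String designation;

        Worker(String designation) {
            this.designation = designation;
        }
    }

    // Returns true if the list created by the copy constructor
    // shares its element objects with the original list.
    static boolean copySharesElements() {
        List<Worker> original = new ArrayList<>();
        original.add(new Worker("Developer"));

        // The copy constructor builds a new list but copies only the references.
        List<Worker> copy = new ArrayList<>(original);

        // Mutate the element through the original list.
        original.get(0).designation = "Staff";

        // Same object in both lists, so the change is visible through the copy.
        return copy.get(0) == original.get(0)
                && "Staff".equals(copy.get(0).designation);
    }

    public static void main(String[] args) {
        System.out.println(copySharesElements()); // prints "true"
    }
}
```

The two lists themselves are separate objects, so adding or removing elements in one does not affect the other; only the elements are shared.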

In this article, we will take a look at one approach to deep copying Collection classes like ArrayList or HashSet in Java. By the way, if you know the difference between a shallow copy and a deep copy, it will be very easy to understand how deep cloning of a collection works.

And if you are new to the Java world, I also recommend you go through The Complete Java MasterClass on Udemy to learn Java in a better and more structured way. It is one of the best and most up-to-date courses for learning Java online.





Deep Cloning of Collection in Java

In the following example, we have a Collection of Employee, a mutable object with a name and a designation field. The employees are stored inside a HashSet. We create a copy of this collection using the copy constructor of HashSet. After that, we modify the designation of each Employee object stored in the original Collection.

Ideally, this change should not affect the copied Collection, because the clone and the original object should be independent of each other, but it does. The solution to this problem is deep cloning of the elements stored in the Collection.

import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** 
  * Java program to demonstrate copy constructor of Collection provides shallow
  * copy and techniques to deep clone Collection by iterating over them.
  * @author http://javarevisited.blogspot.com
  */
public class CollectionCloningTest {
    private static final Logger logger = LoggerFactory.getLogger(CollectionCloningTest.class);
  
   
    public static void main(String args[]) {
       
        // original Collection of mutable Employee objects
        Collection<Employee> org = new HashSet<>();
        org.add(new Employee("Joe", "Manager"));
        org.add(new Employee("Tim", "Developer"));
        org.add(new Employee("Frank", "Developer"));
       
        // creating copy of Collection using copy constructor
        Collection<Employee> copy = new HashSet<>(org);
       
        logger.debug("Original Collection {}", org);
        logger.debug("Copy of Collection  {}", copy );
       
        Iterator<Employee> itr = org.iterator();
        while(itr.hasNext()){
            itr.next().setDesignation("staff");
        }
       
        logger.debug("Original Collection after modification  {}", org);
        logger.debug("Copy of Collection without modification {}", copy );
       
        // both collections show "staff" because the copy is shallow
    }
  
}

class Employee {
    private String name;
    private String designation;

    public Employee(String name, String designation) {
        this.name = name;
        this.designation = designation;
    }

    public String getDesignation() {
        return designation;
    }

    public void setDesignation(String designation) {
        this.designation = designation;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return String.format("%s: %s", name, designation );
    }  
   
}

Output :
- Original Collection [Joe: Manager, Frank: Developer, Tim: Developer]
- Copy of Collection  [Joe: Manager, Frank: Developer, Tim: Developer]
- Original Collection after modification  [Joe: staff, Frank: staff, Tim: staff]
- Copy of Collection without modification [Joe: staff, Frank: staff, Tim: staff]

You can clearly see that modifying the Employee objects in the original Collection (changing the designation to "staff") is also reflected in the cloned collection, because the clone is a shallow copy and points to the same Employee objects in the heap.

In order to fix this, we need to deep clone each Employee object while iterating over the Collection, and before that, we need to override the clone() method in the Employee class.

1) Let Employee implement the Cloneable interface
2) Add the following clone() method to the Employee class

    @Override
    protected Employee clone() {
        Employee clone = null;
        try {
            clone = (Employee) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new RuntimeException(e); // won't happen
        }
        return clone;
    }


3) Instead of using the copy constructor, use the following code to deep copy the Collection in Java

Collection<Employee> copy = new HashSet<Employee>(org.size());
       
Iterator<Employee> iterator = org.iterator();
while(iterator.hasNext()){
    copy.add(iterator.next().clone());
}
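The same element-by-element approach can be applied to an ArrayList. The following is a minimal sketch, assuming a hypothetical `Worker` class with a public `clone()` method and a hypothetical helper named `deepCopy`; neither appears in the article's own code. It uses the Stream API (Java 8 or later) instead of an explicit Iterator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class DeepCopyListDemo {

    // Hypothetical element type. Its only field is an immutable String,
    // so the field-by-field copy made by super.clone() is sufficient.
    static class Worker implements Cloneable {
        private String designation;

        Worker(String designation) {
            this.designation = designation;
        }

        String getDesignation() {
            return designation;
        }

        void setDesignation(String designation) {
            this.designation = designation;
        }

        @Override
        public Worker clone() {
            try {
                return (Worker) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen, Worker is Cloneable
            }
        }
    }

    // Builds a new ArrayList that holds a clone of every element in the source.
    static List<Worker> deepCopy(List<Worker> source) {
        return source.stream()
                .map(Worker::clone)
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<Worker> original = new ArrayList<>();
        original.add(new Worker("Developer"));

        List<Worker> copy = deepCopy(original);
        original.get(0).setDesignation("Staff");

        System.out.println(copy.get(0).getDesignation()); // prints "Developer"
    }
}
```

Unlike a HashSet, the resulting ArrayList keeps the elements in the same order as the source list.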

4) Running the same code that modifies the original collection will now result in different output:

- Original Collection after modification  [Joe: staff, Tim: staff, Frank: staff]
- Copy of Collection without modification [Frank: Developer, Joe: Manager, Tim: Developer]

You can see that the clone and the original Collection are independent of each other and that they point to different objects.

Shallow copy vs Deep Copy of Collection Java


That's all on how to clone a Collection in Java. Now we know that the copy constructors of various collection classes, as well as the addAll() method of List or Set, create only a shallow copy of a Collection, so both the original and the cloned Collection point to the same objects. To avoid this, we can deep copy the collection by iterating over it and cloning each element. This does require that every object stored in the Collection supports a deep cloning operation.
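When an element itself holds a reference to another mutable object, its clone() method has to clone that field as well, otherwise the copy is still shallow one level down. The sketch below assumes two hypothetical classes, `Staff` and `Address`, which are not part of the article's example.

```java
public class NestedCloneDemo {

    // Hypothetical mutable type held as a field by Staff.
    static class Address implements Cloneable {
        String city;

        Address(String city) {
            this.city = city;
        }

        @Override
        public Address clone() {
            try {
                return (Address) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen, Address is Cloneable
            }
        }
    }

    // Hypothetical element type with one immutable and one mutable field.
    static class Staff implements Cloneable {
        String name;      // String is immutable, safe to share
        Address address;  // mutable, must be cloned explicitly

        Staff(String name, Address address) {
            this.name = name;
            this.address = address;
        }

        @Override
        public Staff clone() {
            try {
                Staff copy = (Staff) super.clone();
                // super.clone() copied only the reference; replace it with a clone.
                copy.address = address.clone();
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen, Staff is Cloneable
            }
        }
    }

    public static void main(String[] args) {
        Staff original = new Staff("Joe", new Address("London"));
        Staff copy = original.clone();

        original.address.city = "Paris";

        System.out.println(copy.address.city); // prints "London"
    }
}
```

If a nested type cannot be made Cloneable, a copy constructor or a factory method that builds a new instance from the existing field values can serve the same purpose.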


9 comments :

Santi said...

That's all on How to clone Collection in Java. Now we know that copy constructor or various collection classes e.g. addAll() method of List or Set, only creates shallow copy of Collection. I think last statement should be like that.

javin paul said...

You are absolutely correct Santosh. Thanks for pointing this out.

Unknown said...

Is a deep copy faster than creating a new object and setting its values?

Unknown said...

Shallow cloning copies the top level of a tree, but anything pointed to from that top level (e.g., object properties) remains shared by both copies. Deep cloning copies all levels of the tree, leaving no links between the source and the copy.

For instance, say you have a Person object (a) with a spouse property, which is also a Person object:

+-------------+
| Person: a   |
+-------------+
| name: "Joe" |
| spouse      |-------------->+---------------+
+-------------+               | Person        |
                              +---------------+
                              | name: "Mary"  |
                              +---------------+

If you do a shallow clone of a to b, both a and b point to the same Person from their spouse properties:

+-------------+
| Person: a   |
+-------------+
| name: "Joe" |
| spouse      |------+
+-------------+      |
                     |
+-------------+      +------->+---------------+
| Person: b   |      +------->| Person        |
+-------------+      |        +---------------+
| name: "Joe" |      |        | name: "Mary"  |
| spouse      |------+        +---------------+
+-------------+

If you do a deep clone, you not only clone a to b, but you clone a.spouse to b.spouse so that they each end up having their own copy.

+-------------+
| Person: a   |
+-------------+
| name: "Joe" |
| spouse      |-------------->+---------------+
+-------------+               | Person        |
                              +---------------+
+-------------+               | name: "Mary"  |
| Person: b   |               +---------------+
+-------------+
| name: "Joe" |
| spouse      |-------------->+---------------+
+-------------+               | Person        |
                              +---------------+
                              | name: "Mary"  |
                              +---------------+

Anonymous said...

The copy constructor of List/Set provides a different List/Set, and hence deep cloning can be achieved using the copy constructor, provided that the List/Set being cloned does not contain any mutable elements, e.g.

Collection<String> org = new HashSet<>();
org.add("Joe");
org.add("Tim");
org.add("Frank");

Collection<String> collection = new HashSet<>(org);
collection.add("shekhar");
System.out.println(org);
System.out.println(collection);

Output:
[Joe, Tim, Frank]
[Joe, shekhar, Tim, Frank]

Please correct me if I am wrong.

Thanks.

javin paul said...

@Anonymous, both collections are actually pointing to the same String objects, e.g. "Joe", "Tim" and "Frank", which means this is not exactly deep cloning. But given that Strings are immutable and cannot be changed, one collection will not be affected by the other when someone replaces its elements.

Unknown said...

Great post. Can you please tell me:
1. If the Employee class has an Address class object inside it, what should we do to deep clone?
2. Do we always need the Address class to be Cloneable? What if it is not?

Rahul g said...

anybody knows why my shallow copy is not working ?
List<Employee> original = new ArrayList<>();

Employee employee1 = new Employee("Rahul", "Engineer");
Employee employee2 = new Employee("Anil", "Engineer");
Employee employee3 = new Employee("Raju", "Engineer");

List<Employee> copy = new ArrayList<>(original);
original.add(employee1);
original.add(employee2);
original.add(employee3);
System.out.println("original :" + original);

System.out.println("copy :" + copy);

Iterator<Employee> iterator = original.iterator();
while (iterator.hasNext()) {
    iterator.next().setDesignation("Staff");
}

System.out.println("original :" + original);

System.out.println("copy :" + copy);



===============O/P======================
original :[Employee [name=Rahul, designation=Engineer], Employee [name=Anil, designation=Engineer], Employee [name=Raju, designation=Engineer]]
copy :[]
original :[Employee [name=Rahul, designation=Staff], Employee [name=Anil, designation=Staff], Employee [name=Raju, designation=Staff]]
copy :[]

========================================



Unknown said...

@Rahul - you created the copy before adding any elements to the original list, so the copy is empty.
