Counting Duplicates with Java Stream API

Count Duplicates

We can use the distinct method from java.util.stream.Stream to detect duplicates. It drops repeated elements, so counting what remains gives the number of distinct values:

var test = List.of(1, 3, 3, 4);
var distinctSize = test.stream().distinct().count();

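To check whether the list has duplicates, we compare the distinct count with the size of the list. If the two values are equal, there are no duplicates:

System.out.println(test.size() == distinctSize); // false: 4 elements but only 3 distinct values, so the list has duplicates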
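
The comparison above can be wrapped in a small helper. This is a minimal sketch: the class name DuplicateCheck and the methods hasDuplicates and duplicateCount are illustrative names, not part of the JDK.

```java
import java.util.Collection;
import java.util.List;

final class DuplicateCheck {

    // Hypothetical helper: true when at least one element occurs more than once.
    static <T> boolean hasDuplicates(Collection<T> items) {
        long distinctSize = items.stream().distinct().count();
        return distinctSize != items.size();
    }

    // Hypothetical helper: number of surplus elements, i.e. how many
    // elements would disappear if the duplicates were removed.
    static <T> long duplicateCount(Collection<T> items) {
        return items.size() - items.stream().distinct().count();
    }

    public static void main(String[] args) {
        var test = List.of(1, 3, 3, 4);
        System.out.println(hasDuplicates(test));  // true
        System.out.println(duplicateCount(test)); // 1
    }
}
```

Note that duplicateCount reports surplus elements, so a list such as 3, 3, 3 yields 2 rather than 1.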

If the list contains instances of our own class instead of built-in types such as Integer, as in the first example, then the class has to override the equals and hashCode methods:

import lombok.AllArgsConstructor;
import lombok.EqualsAndHashCode;

@AllArgsConstructor
@EqualsAndHashCode
public class Devlabs {
    long id;
}

var test = List.of(
    new Devlabs(1),
    new Devlabs(3),
    new Devlabs(3),
    new Devlabs(4));
var distinctSize = test.stream().distinct().count();
System.out.println(test.size() == distinctSize); // false: the two Devlabs(3) instances are equal

In this example we use Lombok's @EqualsAndHashCode to generate the equals and hashCode methods, which by default compare all non-static, non-transient fields. With both methods in place, the two new Devlabs(3) objects are counted as duplicates.
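
Without Lombok, the same effect can be achieved by writing the two methods by hand. The following is a minimal sketch, not the exact code Lombok generates; the class name PlainDevlabs is made up for this illustration.

```java
import java.util.List;

final class PlainDevlabs {
    private final long id;

    PlainDevlabs(long id) {
        this.id = id;
    }

    // Two instances are equal when their ids match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof PlainDevlabs)) {
            return false;
        }
        return id == ((PlainDevlabs) other).id;
    }

    // Must be consistent with equals: equal objects return the same hash code.
    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }

    public static void main(String[] args) {
        var test = List.of(
            new PlainDevlabs(1),
            new PlainDevlabs(3),
            new PlainDevlabs(3),
            new PlainDevlabs(4));
        System.out.println(test.stream().distinct().count()); // 3
    }
}
```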

If we want to remove the duplicates, we can combine distinct with the collect method to accumulate the remaining elements into a new List:

var noDuplicates = test.stream()
    .distinct()
    .collect(Collectors.toList());
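
For an ordered source such as a List, distinct keeps the first occurrence of each element and preserves the encounter order. A small sketch to illustrate this, where the class RemoveDuplicates and the method withoutDuplicates are illustrative names only:

```java
import java.util.List;
import java.util.stream.Collectors;

final class RemoveDuplicates {

    // Hypothetical helper: returns a new list with repeated elements dropped.
    static <T> List<T> withoutDuplicates(List<T> items) {
        return items.stream()
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The first 4 and the first 3 are kept; later repeats are dropped.
        System.out.println(withoutDuplicates(List.of(4, 3, 3, 1, 4))); // [4, 3, 1]
    }
}
```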

Calculate the Frequency

If we want to count the frequency of the items in the list, we can use the java.util.stream.Stream.collect method in combination with Collectors.groupingBy:

var test = List.of(1, 3, 3, 4);
Map<Integer, Long> freq = test
        .stream()
        .collect(Collectors.groupingBy(
            Function.identity(),
            Collectors.counting()
        ));
// {1=1, 3=2, 4=1}
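
The frequency map also tells us which values are duplicated: they are the keys with a count greater than one. A minimal sketch of that idea, assuming we only need the duplicated values and not their positions; the names DuplicateValues and duplicatedValues are hypothetical:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

final class DuplicateValues {

    // Hypothetical helper: the set of values that occur more than once.
    static <T> Set<T> duplicatedValues(List<T> items) {
        Map<T, Long> freq = items.stream()
            .collect(Collectors.groupingBy(
                Function.identity(),
                Collectors.counting()));

        return freq.entrySet().stream()
            .filter(entry -> entry.getValue() > 1) // keep values seen at least twice
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Prints a set containing 3 and 4; a Set makes no ordering promise.
        System.out.println(duplicatedValues(List.of(1, 3, 3, 4, 4, 4)));
    }
}
```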

Another possibility is to use the java.util.Collections.frequency method:

import static java.util.Collections.frequency;

// Calculate the frequency for a single item in the list
System.out.println(frequency(test, 3)); // 2

// Calculate the frequency for all objects in the list
Map<Integer, Integer> freqTest = test.stream()
    .collect(Collectors.toMap(
        Function.identity(),
        v -> frequency(test, v),
        (v1, v2) -> v1
    ));

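In the last example, we use the frequency method to calculate the frequency of every item in the list. The toMap collector processes each element of the list, so a value that occurs more than once produces the same key several times. Without a merge function, toMap throws an IllegalStateException on such a duplicate key. Since both entries for a key carry the same frequency, we can simply keep the first one with (v1, v2) -> v1. Keep in mind that frequency scans the whole list on every call, so this approach does more work than groupingBy on large lists.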
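
The role of the merge function can be checked with a short experiment. The sketch below contrasts the two overloads of Collectors.toMap; the class ToMapMergeDemo and its method names are made up for this illustration.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class ToMapMergeDemo {

    // With a merge function: duplicate keys are resolved by keeping the first value.
    static Map<Integer, Integer> withMerge(List<Integer> items) {
        return items.stream()
            .collect(Collectors.toMap(
                Function.identity(),
                v -> Collections.frequency(items, v),
                (v1, v2) -> v1));
    }

    // Without a merge function: returns true if toMap rejects the input
    // because the same key was produced twice.
    static boolean failsWithoutMerge(List<Integer> items) {
        try {
            items.stream()
                .collect(Collectors.toMap(
                    Function.identity(),
                    v -> Collections.frequency(items, v)));
            return false;
        } catch (IllegalStateException duplicateKey) {
            return true;
        }
    }

    public static void main(String[] args) {
        var test = List.of(1, 3, 3, 4);
        System.out.println(withMerge(test).get(3));    // 2
        System.out.println(failsWithoutMerge(test));   // true
    }
}
```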