Channel: Baeldung

Finding the Index of the First Duplicate Element in an Array


1. Overview

Finding the index of the first duplicate element in an array is a common coding problem that can be approached in various ways. While the brute-force method is straightforward, it can be inefficient, especially for large datasets.

In this tutorial, we’ll explore different approaches to solve this problem, ranging from a basic brute-force solution to more optimized techniques using data structures like HashSet and array indexing.

2. Problem Statement

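Given an array of integers, we want to find the index at which a previously seen value appears again for the first time. In other words, among all the elements that repeat, we return the smallest index of a second occurrence. If no element repeats, we return -1.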

Input: arr = [2, 1, 3, 5, 3, 2]
Output: 4
Explanation: The first element that repeats is 3, and its second occurrence is at index 4. So the correct answer is 4.

Input: arr = [1, 2, 3, 4, 5]
Output: -1
Explanation: Since no elements repeat, the correct answer is -1.

3. Brute Force Approach

In this approach, we check each element of the array and compare it with all subsequent elements to find the first duplicate.

Let’s first look at the implementation and then understand it step by step:

int firstDuplicateBruteForce(int[] arr) {
    int minIndex = arr.length;
    for (int i = 0; i < arr.length - 1; i++) {
        for (int j = i + 1; j < arr.length; j++) {
            if (arr[i] == arr[j]) {
                minIndex = Math.min(minIndex, j);
                break;
            }
        }
    }
    return minIndex == arr.length ? -1 : minIndex;
}

Let’s review this code:

  • We loop over every index i, and for each element arr[i], we look for its duplicate at a later index j.
  • When we find a duplicate, we compare j (the index of the second occurrence) with minIndex, which tracks the earliest second occurrence of any duplicate.
  • The break ensures that, for each element, we only consider its nearest later occurrence and ignore any further repeats.
  • In the end, if minIndex still equals arr.length, no duplicates were found, so we return -1.

Now, let’s discuss the time and space complexity. We use two nested loops, each iterating over the array, leading to a quadratic time complexity. So, the time complexity is O(n^2). Since the extra space we used is independent of the size of the input, the space complexity is O(1).
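As a variation on the same brute-force idea, we can let the outer loop run over the candidate index of the second occurrence and scan backwards for a match. The first index that repeats an earlier value is the answer, so no minIndex bookkeeping is needed. The class and method names below are illustrative only; this is a minimal sketch rather than the article's implementation, and it keeps the same O(n^2) worst-case time and O(1) space:

```java
class BruteForceBackwardScan {

    // Returns the smallest index j such that arr[j] equals some earlier element,
    // or -1 if all elements are distinct.
    static int firstDuplicateBackwardScan(int[] arr) {
        for (int j = 1; j < arr.length; j++) {
            for (int i = 0; i < j; i++) {
                if (arr[i] == arr[j]) {
                    return j; // earliest index that repeats an earlier value
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstDuplicateBackwardScan(new int[] {2, 1, 3, 5, 3, 2})); // 4
        System.out.println(firstDuplicateBackwardScan(new int[] {1, 2, 3, 4, 5}));    // -1
    }
}
```

Because the method returns as soon as it finds a match, it can stop early when a duplicate appears near the start of the array.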

4. Using HashSet

We can use a HashSet to store the elements we’ve seen so far. As we iterate through the array, we check if the current element has already been seen. If so, the current index is the earliest second occurrence of any value, and we return it.

Let’s first look at the implementation and then understand it step by step:

int firstDuplicateHashSet(int[] arr) {
    HashSet<Integer> firstDuplicateSet = new HashSet<>();
    for (int i = 0; i < arr.length; i++) {
        if (firstDuplicateSet.contains(arr[i])) {
            return i;
        }
        firstDuplicateSet.add(arr[i]);
    }
    return -1;
}

Let’s review this code:

  • Create an empty HashSet.
  • Loop through the array.
  • For each element, check if it exists in the HashSet.
    • If it does, return the index (first duplicate).
    • If it doesn’t, add the element to the set.
  • If no duplicate is found, return -1.

We’ll now consider the example we mentioned at the start and have a dry run:

[Figure: First Duplicate HashSet dry run]
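To follow the dry run in text form, here is a small tracing sketch of the same logic that prints the set after each step. The class and method names are hypothetical, and a LinkedHashSet is swapped in purely so that the printed set keeps insertion order; the lookup logic is unchanged:

```java
import java.util.LinkedHashSet;
import java.util.Set;

class HashSetDryRun {

    // Same logic as the HashSet approach, with the state printed at each step.
    static int firstDuplicateTraced(int[] arr) {
        Set<Integer> seen = new LinkedHashSet<>();
        for (int i = 0; i < arr.length; i++) {
            if (seen.contains(arr[i])) {
                System.out.println("i=" + i + ", value=" + arr[i] + ", seen=" + seen + " -> duplicate");
                return i;
            }
            seen.add(arr[i]);
            System.out.println("i=" + i + ", value=" + arr[i] + ", seen=" + seen);
        }
        return -1;
    }

    public static void main(String[] args) {
        int result = firstDuplicateTraced(new int[] {2, 1, 3, 5, 3, 2});
        System.out.println("result=" + result);
    }
}
```

For the input [2, 1, 3, 5, 3, 2], the set grows to [2, 1, 3, 5] over the first four steps. At index 4, the value 3 is already present, so the method returns 4.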

Now, let’s look at the time and space complexity. We iterate through the array once, and checking for or inserting an element in a HashSet takes O(1) time on average, so the time complexity is O(n). In the worst case, we may need to store all elements in the HashSet, so the space complexity is O(n).
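As a side note, Set.add() returns false when the set already contains the element, so the separate contains() call can be folded into the insertion. The sketch below shows that variation; the class and method names are illustrative, and the complexity is the same as above:

```java
import java.util.HashSet;
import java.util.Set;

class HashSetAddIdiom {

    // Uses the boolean result of add() instead of a separate contains() check.
    static int firstDuplicateSingleLookup(int[] arr) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < arr.length; i++) {
            // add() returns false when the value was already in the set
            if (!seen.add(arr[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstDuplicateSingleLookup(new int[] {2, 1, 3, 5, 3, 2})); // 4
        System.out.println(firstDuplicateSingleLookup(new int[] {1, 2, 3, 4, 5}));    // -1
    }
}
```

This performs one hash lookup per element instead of two, which is a small constant-factor saving rather than a change in asymptotic cost.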

5. Using Array Indexing

If all elements are positive and lie within the range [1, n] for an array of size n, we can avoid using extra space by modifying the array itself. Each value maps to a valid index (value - 1), and we use the sign of the element at that index as a "seen" marker. Note that this approach changes the input array: the elements it marks stay negative after the method returns.

Let’s first look at the implementation and then understand it step by step:

int firstDuplicateArrayIndexing(int[] arr) {
    for (int i = 0; i < arr.length; i++) {
        int val = Math.abs(arr[i]) - 1;
        if (arr[val] < 0) {
            return i;
        }
        arr[val] = -arr[val];
    }
    return -1;
}

Let’s review this code:

  • Iterate through the array.
  • For each element, take its absolute value minus one as an index into the same array.
  • If the element at that index is already negative, the value has been encountered before, and we return the current index as the first duplicate.
  • Otherwise, negate the element at that index to mark the value as seen.
  • If no duplicates are found, return -1.

We’ll now consider the example we mentioned at the start and have a dry run:

[Figure: First Duplicate Array Indexing dry run]
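To follow the dry run in text form, the sketch below repeats the same sign-marking logic and prints the array after each step. The class and method names are hypothetical, and the input is assumed to satisfy the [1, n] precondition:

```java
import java.util.Arrays;

class ArrayIndexingDryRun {

    // Same logic as the array indexing approach, with the array printed at each step.
    // Assumes every element is in the range [1, arr.length].
    static int firstDuplicateTraced(int[] arr) {
        for (int i = 0; i < arr.length; i++) {
            int val = Math.abs(arr[i]) - 1;
            if (arr[val] < 0) {
                System.out.println("i=" + i + ", slot=" + val + " already negative -> duplicate");
                return i;
            }
            arr[val] = -arr[val]; // mark the value (val + 1) as seen
            System.out.println("i=" + i + ", slot=" + val + ", arr=" + Arrays.toString(arr));
        }
        return -1;
    }

    public static void main(String[] args) {
        int result = firstDuplicateTraced(new int[] {2, 1, 3, 5, 3, 2});
        System.out.println("result=" + result);
    }
}
```

For the input [2, 1, 3, 5, 3, 2], the first four steps negate slots 1, 0, 2, and 4, leaving [-2, -1, -3, 5, -3, 2]. At index 4, the value 3 points at slot 2, which is already negative, so the method returns 4.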

Now, let’s discuss the time and space complexity. Since we iterate through the array once, the time complexity is O(n). Because we reuse the input array to mark seen values instead of allocating additional storage, the space complexity is O(1).
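Since the sign marks are left in the input array, callers that still need the original values may prefer a variant that checks the precondition first and restores the array afterwards. The following is a sketch under those assumptions, not part of the original implementation; the class name, the method name, and the allInRange helper are all hypothetical:

```java
class ArrayIndexingRestoring {

    // Hypothetical helper: verifies that every element lies in [1, arr.length].
    static boolean allInRange(int[] arr) {
        for (int value : arr) {
            if (value < 1 || value > arr.length) {
                return false;
            }
        }
        return true;
    }

    static int firstDuplicateRestoring(int[] arr) {
        if (!allInRange(arr)) {
            throw new IllegalArgumentException("all elements must be in [1, " + arr.length + "]");
        }
        int result = -1;
        for (int i = 0; i < arr.length; i++) {
            int val = Math.abs(arr[i]) - 1;
            if (arr[val] < 0) {
                result = i; // value (val + 1) was seen before
                break;
            }
            arr[val] = -arr[val];
        }
        // Undo the sign marks so the caller gets the original values back.
        for (int i = 0; i < arr.length; i++) {
            arr[i] = Math.abs(arr[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] arr = {2, 1, 3, 5, 3, 2};
        System.out.println(firstDuplicateRestoring(arr)); // 4
    }
}
```

The validation and restoration passes each add one more linear scan, so the overall cost stays at O(n) time and O(1) extra space.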

6. Conclusion

In this article, we saw that identifying the first duplicate in an array can be done using different strategies, each with its time and space complexity trade-offs.

The brute-force approach, while simple, isn’t ideal for larger datasets due to its O(n^2) time complexity. Optimized approaches, such as using a HashSet or marking visited values in the array itself, reduce the time complexity to O(n).

Choosing the right solution depends on the problem constraints, such as whether extra space is allowed or if the array can be modified. Understanding these techniques equips us with versatile tools for tackling similar problems efficiently.

As always, the code presented in this article is available over on GitHub.

       
