How to Find Duplicates in a List in Python

Finding duplicates in a list can be approached in several ways, depending on your requirements and constraints (e.g., time complexity, memory usage). Here are several common methods in Python:


1. Using a Dictionary (Efficient Approach)

This approach uses a dictionary (or a collections.Counter) to count occurrences of each element.

def find_duplicates(lst):
    element_counts = {}
    duplicates = []

    for item in lst:
        if item in element_counts:
            element_counts[item] += 1
        else:
            element_counts[item] = 1

    for item, count in element_counts.items():
        if count > 1:
            duplicates.append(item)

    return duplicates

# Example usage
lst = [1, 2, 3, 2, 4, 5, 1, 6, 1]
print(find_duplicates(lst))  # Output: [1, 2]

Time Complexity: O(n)
Space Complexity: O(n) (for the dictionary)


2. Using collections.Counter

A cleaner way to count occurrences is to use collections.Counter.

from collections import Counter

def find_duplicates(lst):
    counts = Counter(lst)
    return [item for item, count in counts.items() if count > 1]

# Example usage
lst = [1, 2, 3, 2, 4, 5, 1, 6, 1]
print(find_duplicates(lst))  # Output: [1, 2]

Time Complexity: O(n)
Space Complexity: O(n)
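If you also need to know how many times each duplicate occurs, the same Counter can produce a mapping instead of a plain list. This is a small variant not shown above; the helper name duplicate_counts is illustrative.

```python
from collections import Counter

def duplicate_counts(lst):
    """Return a dict mapping each duplicated item to its occurrence count."""
    return {item: count for item, count in Counter(lst).items() if count > 1}

# Example usage
print(duplicate_counts([1, 2, 3, 2, 4, 5, 1, 6, 1]))  # Output: {1: 3, 2: 2}
```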


3. Using a Set (Efficient Memory-wise)

This approach uses a set to track seen elements and another set to store duplicates.

def find_duplicates(lst):
    seen = set()
    duplicates = set()

    for item in lst:
        if item in seen:
            duplicates.add(item)
        else:
            seen.add(item)

    return list(duplicates)

# Example usage
lst = [1, 2, 3, 2, 4, 5, 1, 6, 1]
print(find_duplicates(lst))  # Output: [1, 2] (order may vary, since sets are unordered)

Time Complexity: O(n)
Space Complexity: O(n) (for the sets)
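Because sets are unordered, the result above may not reflect the original list order. A sketch of an order-preserving variant (the name find_duplicates_ordered and the second "reported" set are additions for illustration) records each duplicate the first time it repeats:

```python
def find_duplicates_ordered(lst):
    seen = set()      # every element encountered so far
    reported = set()  # duplicates already added, for O(1) membership checks
    duplicates = []

    for item in lst:
        if item in seen and item not in reported:
            duplicates.append(item)
            reported.add(item)
        else:
            seen.add(item)

    return duplicates

# Example usage
print(find_duplicates_ordered([1, 2, 3, 2, 4, 5, 1, 6, 1]))  # Output: [2, 1]
```

Note the output order here follows the position of each element's first repeat, which is why 2 comes before 1.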


4. Using Sorting (In-Place)

This approach first sorts the list, then checks adjacent elements for duplicates.

def find_duplicates(lst):
    lst.sort()  # Sort the list (in-place)
    duplicates = []

    for i in range(1, len(lst)):
        if lst[i] == lst[i - 1] and lst[i] not in duplicates:
            duplicates.append(lst[i])

    return duplicates

# Example usage
lst = [1, 2, 3, 2, 4, 5, 1, 6, 1]
print(find_duplicates(lst))  # Output: [1, 2]

Time Complexity: O(n log n) (due to sorting)
Space Complexity: O(1) extra (the sort is in place, though Python's Timsort may use up to O(n) auxiliary space; note that this approach mutates the input list)
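If mutating the caller's list is undesirable, a non-mutating sketch of the same idea uses sorted() (which returns a new list) together with itertools.groupby; the helper name find_duplicates_sorted is an illustrative addition:

```python
from itertools import groupby

def find_duplicates_sorted(lst):
    # sorted() returns a new list, so the input is left unmodified;
    # groupby clusters equal adjacent elements of the sorted copy
    return [key for key, group in groupby(sorted(lst)) if len(list(group)) > 1]

# Example usage
print(find_duplicates_sorted([1, 2, 3, 2, 4, 5, 1, 6, 1]))  # Output: [1, 2]
```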


5. Brute Force (Inefficient)

This naive approach compares each element to every other element in the list.

def find_duplicates(lst):
    duplicates = []

    for i in range(len(lst)):
        for j in range(i + 1, len(lst)):
            if lst[i] == lst[j] and lst[i] not in duplicates:
                duplicates.append(lst[i])

    return duplicates

# Example usage
lst = [1, 2, 3, 2, 4, 5, 1, 6, 1]
print(find_duplicates(lst))  # Output: [1, 2]

Time Complexity: O(n^2)
Space Complexity: O(n) (for the duplicates list)


Summary of Approaches

| Method              | Time Complexity | Space Complexity | Best For                                  |
|---------------------|-----------------|------------------|-------------------------------------------|
| Dictionary          | O(n)            | O(n)             | Large lists where speed matters           |
| collections.Counter | O(n)            | O(n)             | Concise, Pythonic implementations         |
| Set-based           | O(n)            | O(n)             | Avoiding duplicates during search         |
| Sorting             | O(n log n)      | O(1)             | Memory constraints or sorted output needed |
| Brute force         | O(n^2)          | O(n)             | Small lists or educational purposes       |

For most practical purposes, dictionary-based or set-based approaches are preferred due to their efficiency.
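Finally, if you only need to know whether any duplicates exist, not which ones, a one-line check suffices (the helper name has_duplicates is illustrative): converting to a set removes duplicates, so the lengths differ exactly when duplicates are present.

```python
def has_duplicates(lst):
    # A list contains duplicates iff deduplicating it shrinks it
    return len(set(lst)) != len(lst)

# Example usage
print(has_duplicates([1, 2, 3]))     # Output: False
print(has_duplicates([1, 2, 2, 3]))  # Output: True
```

This runs in O(n) time and O(n) space, and is often the fastest option in practice because all the work happens inside the built-in set constructor.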
