Quantcast
Channel: Baeldung
Viewing all articles
Browse latest Browse all 3550

A Guide to ConcurrentMap

$
0
0

1. Overview

Maps are naturally one of the most widely style of Java collection.

And, importantly, HashMap is not a thread-safe implementation, while Hashtable does provide thread-safety by synchronizing operations.

Even though Hashtable is thread safe, it is not very efficient. Another fully synchronized Map, Collections.synchronizedMap, does not exhibit great efficiency either. If we want thread-safety with high throughput under high concurrency, these implementations aren’t the way to go.

To solve the problem, the Java Collections Framework introduced ConcurrentMap in Java 1.5.

The following discussions are based on Java 1.8.

2. ConcurrentMap

ConcurrentMap is an extension of the Map interface. It aims to provides a structure and guidance to solving the problem of reconciling throughput with thread-safety.

By overriding several interface default methods, ConcurrentMap gives guidelines for valid implementations to provide thread-safety and memory-consistent atomic operations.

Several default implementations are overridden, disabling the null key/value support:

  • getOrDefault
  • forEach
  • replaceAll
  • computeIfAbsent
  • computeIfPresent
  • compute
  • merge

The following APIs are also overridden to support atomicity, without a default interface implementation:

  • putIfAbsent
  • remove
  • replace(key, oldValue, newValue)
  • replace(key, value)

The rest of actions are directly inherited with basically consistent with Map.

3. ConcurrentHashMap

ConcurrentHashMap is the out-of-box ConcurrentMap implementation. For better performance, it consists of a set of tables (segments), each of which can be independently locked, thus read and update operations mostly do not block.

The number of segments required is relative to the number of threads accessing the table so that the update in progress per segment would be no more than one most of time.

That’s why constructors, compared to HashMap, provides the extra concurrencyLevel argument to control the number of estimated threads to use:

public ConcurrentHashMap(
  int initialCapacity, float loadFactor, int concurrencyLevel)

The other two arguments: initialCapacity and loadFactor, work quite the same as HashMap.

3.1. Thread-Safety

ConcurrentMap guarantees memory consistency on key/value operations in a multi-threading environment.

Actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object in another thread.

To confirm, let’s have a look at a memory inconsistent case:

@Test
public void givenHashMap_whenSumParallel_thenError() throws Exception {
    Map<String, Integer> map = new HashMap<>();
    List<Integer> sumList = parallelSum100(map, 100);

    assertNotEquals(1, sumList
      .stream()
      .distinct()
      .count());
    long wrongResultCount = sumList
      .stream()
      .filter(num -> num != 100)
      .count();
    
    assertTrue(wrongResultCount > 0);
}

private List<Integer> parallelSum100(Map<String, Integer> map, 
  int executionTimes) throws InterruptedException {
    List<Integer> sumList = new ArrayList<>(1000);
    for (int i = 0; i < executionTimes; i++) {
        map.put("test", 0);
        ExecutorService executorService = 
          Executors.newFixedThreadPool(4);
        for (int j = 0; j < 10; j++) {
            executorService.execute(() -> {
                for (int k = 0; k < 10; k++)
                    map.computeIfPresent(
                      "test", 
                      (key, value) -> value + 1
                    );
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(5, TimeUnit.SECONDS);
        sumList.add(map.get("test"));
    }
    return sumList;
}

For each map.computeIfPresent action in parallel, HashMap does not provide a consistent view of what should be the present integer value, leading to inconsistent and undesirable results.

As for ConcurrentHashMap, we can get a consistent and correct result:

@Test
public void givenConcurrentMap_whenSumParallel_thenCorrect() 
  throws Exception {
    Map<String, Integer> map = new ConcurrentHashMap<>();
    List<Integer> sumList = parallelSum100(map, 1000);

    assertEquals(1, sumList
      .stream()
      .distinct()
      .count());
    long wrongResultCount = sumList
      .stream()
      .filter(num -> num != 100)
      .count();
    
    assertEquals(0, wrongResultCount);
}

3.2. Null Key/Value

Most APIs provided by ConcurrentMap does not allow null key or value, for example:

@Test(expected = NullPointerException.class)
public void givenConcurrentHashMap_whenPutWithNullKey_thenThrowsNPE() {
    concurrentMap.put(null, new Object());
}

@Test(expected = NullPointerException.class)
public void givenConcurrentHashMap_whenPutNullValue_thenThrowsNPE() {
    concurrentMap.put("test", null);
}

However, for compute* and merge actions, the computed value can be null, which indicates the key-value mapping is removed if present or remains absent if previously absent.

@Test
public void givenKeyPresent_whenComputeRemappingNull_thenMappingRemoved() {
    Object oldValue = new Object();
    concurrentMap.put("test", oldValue);
    concurrentMap.compute("test", (s, o) -> null);

    assertNull(concurrentMap.get("test"));
}

3.3. Performance

Under the hood, ConcurrentHashMap is somewhat similar to HashMap, with data access and update based on a hash table (though more complex).

And of course the ConcurrentHashMap should yield much better performance in most concurrent cases for data retrieval and update.

Let’s write a quick micro-benchmark for get and put performance and compare that to Hashtable and Collections.synchronizedMap, running both operations for 500,000 times in 4 threads.

@Test
public void givenMaps_whenGetPut500KTimes_thenConcurrentMapFaster() 
  throws Exception {
    Map<String, Object> hashtable = new Hashtable<>();
    Map<String, Object> synchronizedHashMap = 
      Collections.synchronizedMap(new HashMap<>());
    Map<String, Object> concurrentHashMap = new ConcurrentHashMap<>();

    long hashtableAvgRuntime = timeElapseForGetPut(hashtable);
    long syncHashMapAvgRuntime = 
      timeElapseForGetPut(synchronizedHashMap);
    long concurrentHashMapAvgRuntime = 
      timeElapseForGetPut(concurrentHashMap);

    assertTrue(hashtableAvgRuntime > concurrentHashMapAvgRuntime);
    assertTrue(syncHashMapAvgRuntime > concurrentHashMapAvgRuntime);
}

private long timeElapseForGetPut(Map<String, Object> map) 
  throws InterruptedException {
    ExecutorService executorService = 
      Executors.newFixedThreadPool(4);
    long startTime = System.nanoTime();
    for (int i = 0; i < 4; i++) {
        executorService.execute(() -> {
            for (int j = 0; j < 500_000; j++) {
                int value = ThreadLocalRandom
                  .current()
                  .nextInt(10000);
                String key = String.valueOf(value);
                map.put(key, value);
                map.get(key);
            }
        });
    }
    executorService.shutdown();
    executorService.awaitTermination(1, TimeUnit.MINUTES);
    return (System.nanoTime() - startTime) / 500_000;
}

Keep in mind micro-benchmarks are only looking at a single scenario and aren’t always a good reflection of real world performance.

That being said, on an OS X system with an an average dev system, we’re seeing an average sample result for 100 consecutive runs (in nanoseconds):

Hashtable: 1142.45
SynchronizedHashMap: 1273.89
ConcurrentHashMap: 230.2

In a multi-threading environment, where multiple threads are expected to access a common Map, the ConcurrentHashMap is clearly preferable.

However, when the Map is only accessible to a single thread, HashMap can be a better choice for its simplicity and solid performance.

3.4. Pitfalls

Retrieval operations generally do not block in ConcurrentHashMap and could overlap with update operations. So for better performance, they only reflect the results of the most recently completed update operations, as stated in the official Javadoc.

There are several other facts to bear in mind:

  • results of aggregate status methods including size, isEmpty, and containsValue are typically useful only when a map is not undergoing concurrent updates in other threads:
@Test
public void givenConcurrentMap_whenUpdatingAndGetSize_thenError() 
  throws InterruptedException {
    Runnable collectMapSizes = () -> {
        for (int i = 0; i < MAX_SIZE; i++) {
            mapSizes.add(concurrentMap.size());
        }
    };
    Runnable updateMapData = () -> {
        for (int i = 0; i < MAX_SIZE; i++) {
            concurrentMap.put(String.valueOf(i), i);
        }
    };
    executorService.execute(updateMapData);
    executorService.execute(collectMapSizes);
    executorService.shutdown();
    executorService.awaitTermination(1, TimeUnit.MINUTES);

    assertNotEquals(MAX_SIZE, mapSizes.get(MAX_SIZE - 1).intValue());
    assertEquals(MAX_SIZE, concurrentMap.size());
}

If concurrent updates are under strict control, aggregate status would still be reliable.

Although these aggregate status methods do not guarantee the real-time accuracy, they may be adequate for monitoring or estimation purposes.

  • hashCode matters: note that using many keys with exactly the same hashCode() is a sure way to slow down a performance of any hash table.

To ameliorate impact when keys are Comparable, ConcurrentHashMap may use comparison order among keys to help break ties. Still, we should avoid using the same hashCode() as much as we can.

  • iterators are only designed to use in a single thread as they provide weak consistency rather than fast-fail traversal, and they will never throw ConcurrentModificationException.
  • the default initial table capacity is 16, and it’s adjusted by the specified concurrency level:
public ConcurrentHashMap(
  int initialCapacity, float loadFactor, int concurrencyLevel) {
 
    //...
    if (initialCapacity < concurrencyLevel) {
        initialCapacity = concurrencyLevel;
    }
    //...
}
  • keys in ConcurrentHashMap are not in sorted order, so for cases when ordering is required, ConcurrentSkipListMap is a suitable choice.

4. ConcurrentNavigableMap

For cases when ordering of keys is required, we can use ConcurrentSkipListMap, a concurrent version of TreeMap.

As a supplement for ConcurrentMap, ConcurrentNavigableMap supports total ordering of its keys (in ascending order by default) and is concurrently navigable. Methods that return views of the map are overridden for concurrency compatibility:

  • subMap
  • headMap
  • tailMap
  • subMap
  • headMap
  • tailMap
  • descendingMap

keySet() views’ iterators and spliterators are enhanced with weak-memory-consistency:

  • navigableKeySet
  • keySet
  • descendingKeySet

5. ConcurrentSkipListMap

Previously, we have covered NavigableMap interface and its implementation TreeMap. ConcurrentSkipListMap can be seen a scalable concurrent version of TreeMap.

In practice, there’s no concurrent implementation of the red-black tree in Java. A concurrent variant of SkipLists is implemented in ConcurrentSkipListMap, providing an expected average log(n) time cost for the containsKey, get, put and remove operations and their variants.

In addition to TreeMap‘s features, key insertion, removal, update and access operations are guaranteed with thread-safety. Here’s a comparison to TreeMap when navigating concurrently:

@Test
public void givenSkipListMap_whenNavConcurrently_thenCountCorrect() 
  throws InterruptedException {
    NavigableMap<Integer, Integer> skipListMap
      = new ConcurrentSkipListMap<>();
    int count = countMapElementByPollingFirstEntry(skipListMap, 10000, 4);
 
    assertEquals(10000 * 4, count);
}

@Test
public void givenTreeMap_whenNavConcurrently_thenCountError() 
  throws InterruptedException {
    NavigableMap<Integer, Integer> treeMap = new TreeMap<>();
    int count = countMapElementByPollingFirstEntry(treeMap, 10000, 4);
 
    assertNotEquals(10000 * 4, count);
}

private int countMapElementByPollingFirstEntry(
  NavigableMap<Integer, Integer> navigableMap, 
  int elementCount, 
  int concurrencyLevel) throws InterruptedException {
 
    for (int i = 0; i < elementCount * concurrencyLevel; i++) {
        navigableMap.put(i, i);
    }
    
    AtomicInteger counter = new AtomicInteger(0);
    ExecutorService executorService
      = Executors.newFixedThreadPool(concurrencyLevel);
    for (int j = 0; j < concurrencyLevel; j++) {
        executorService.execute(() -> {
            for (int i = 0; i < elementCount; i++) {
                if (navigableMap.pollFirstEntry() != null) {
                    counter.incrementAndGet();
                }
            }
        });
    }
    executorService.shutdown();
    executorService.awaitTermination(1, TimeUnit.MINUTES);
    return counter.get();
}

A full explanation of the performance concerns behind the scenes is beyond the scope of this article. The details can be found in ConcurrentSkipListMap’s Javadoc, which is located under java/util/concurrent in the src.zip file.

6. Conclusion

In this article, we mainly introduced the ConcurrentMap interface and the features of ConcurrentHashMap and covered on ConcurrentNavigableMap being key-ordering required.

The full source code for all the examples used in this article can be found in the GitHub project.


Viewing all articles
Browse latest Browse all 3550

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>