Topic of this chapter requires us to familiar with the Collection interface and its sub interfaces -- what's the usage, what's the methods. We also need to compare a few implementations of the interfaces.
For experienced java programers, it is better to study this chapter with a later chapter:
Use java.util.concurrent collections and classes including CyclicBarrier and CopyOnWriteArrayList
Collection and Map interface |
There are five main interfaces in the Java Collections Framework:
- Collection: a group of objects contained in a single object.
- Set: a collection that does not allow duplicate entries.
- List: an ordered collection of elements the allows duplicate entries, which can be accessed by an int index.
- Queue: a collection that orders its elements in a specific order for processing.
- Map: a group of key/value pairs that maps keys to values. No duplicate keys are allowed.
The following video gives an overview of java Collection framework.
OCPJP tested interface methods are:
Collections Methods:
- boolean add(E element) -- insert an element and returns whether it was successful.
- boolean remove(Object o) -- removes a matching value and returns whether it was successful.
- boolean isEmpty()
- boolean size()
- void clear() -- remove all elements of the Collection.
- boolean contains(Object o)
List Methods:
- void add(E element) -- add element to end
- void add(int index, E element) -- add element at index and moves the rest toward the end
- E get(int index)
- int indexOf(Object o) -- returns first matching index or -1 if not found
- int lastIndexOf(Object o)
- void remove(int index) -- remove element at index and moves the rest toward the front
- E set(int index, E e) -- replace element at index and returns original
Set Methods:
The Set interface don't have extra methods in the OCPJP scope.
NavigableSet Methods:
- E lower(E e) -- return greatest element that is < e, or null if no such element.
- E floor(E e) -- return greatest element that is <= e, or null if no such element.
- E ceiling(E e) -- return smallest element that is >= e, or null if no such element.
- E higher(E e) -- return smallest element that is > e, or null if no such element.
Queue Methods:
Throws exception | Returns null | |
Insert | boolean add(E e) |
boolean offer(E e) |
Remove | E remove() |
E poll() |
Examine | E element() |
E peek() |
Deque Methods:
- void push(E e) -- add an element at the head of the deque.
- void pop() -- removes and returns the first element of the deque.
Map Methods:
- V get(Object key) -- return the value mapped by key or null if none is mapped.
- V put(K key, V value) -- add or replace key/value pair. Returns previous value or null.
- V remove(Object key) -- remove and returns value mapped to key or null if absent.
- Set<K> keySet() -- return set of all keys.
- Collection<V> values() -- return collection of all values.
- boolean containsKey(Object key)
- boolean containsValue(Object value)
- void clear() -- remove all keys and values from the map.
- boolean isEmpty()
- int size()
The OCPJP Exam topics explicitly mentioned the following implementations, we need to know them as well as being able to compare them with other imlementations.
ArrayList
ArrayList is a topic of OCAJP, however on OCPJP, the emphasis are:- It is Resizable-Array implementation of List. When elements are added, the capacity automatically grows.
- Accessing is fast and add/remove is slow.
TreeSet
TreeSet is a sorted tree implementation of NavigableSet interface. It is always ordered. The basic operations have O(log n) time cost.If you want to know more about Tree set, watch the following video for an introduction about TreeSet's methods and usage.
TreeMap
TreeMap stores the keys in a sorted tree data structure. It is always ordered. The basic operations have O(log n) time cost.ArrayDeque
Resizable-Array implementation of Deque interface. Null elements are prohibited. This class is likely to be faster than Stack when used as a stack, and faster than LinkedList when used as a queue.Comparison of Implementations
+ means fast O(1)
* means normal O(log n)
- means slow O(n)
modify and retrieve performance measured at worst scenario (for example, if add/remove middle nodes is slow, but add/remove head/tail node is fast, it is marked as modify - )
mouse over for tips in the table below
Class | Interfaces | Sorted? | use hashCode? | allow null? | performance | memory usage |
---|---|---|---|---|---|---|
ArrayList | List | No | No | Yes |
modify -, retrieve
|
normalEach ArrayList instance has a capacity. The capacity is the size of the array used to store the elements in the list. It is always at least as large as the list size. As elements are added to an ArrayList, its capacity grows automatically.
|
LinkedList | List, Queue | No | No | Yes |
modify +, retrieve -
All of the operations perform as could be expected for a doubly-linked list.
|
normal |
HashSet | Set | No | Yes | Yes |
modify +, retrieve +
Backed by a HashMap instance. The add, remove, contains and size operations run in constant time. Iterating over this set requires time proportional to the sum of the number of elements plus the number of buckets.
|
low |
TreeSet | Set | Yes | No | No |
modify *, retrieve *
The add, remove and contains operations take log(n) time cost.
|
high |
ArrayDeque | Queue | No | No | No |
modify +, retrive +
Most ArrayDeque operations run in amortized constant time. Exceptions include remove, removeFirstOccurrence, removeLastOccurrence, contains, iterator.remove(), and the bulk operations, all of which run in linear time.
|
normal |
HashMap | Map | No | Yes | Yes |
modify +, retrieve +
The get and put operations run in constant time. Iterating over this set requires time proportional to the sum of the number of elements plus the number of buckets. Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
|
low
The initial capacity and load factor effect HashMap performance. The initial capacity is the initial number of buckets. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the (number of entries) > (load factor) X (number of buckets), the hash table is rehashed and the number of buckets are roughly doubled. The default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost. Creating it with a sufficiently large capacity to avoid costly rehashing.
|
TreeMap | Map | Yes | No | No |
modify *, retrieve *
This implementation provides guaranteed log(n) time cost for the containsKey, get, put and remove operations.
|
high |
OCPJP Pro: Tons of glossary to remember, it is really boring!
SwimFish: I know, I know, I feel the same...
OCPJP Pro: better way to learn Collection framework?
SwimFish: maybe learn java Collection framework in a game?