Example of implement limited size priorityqueue using Guava's MinMaxPriorityQueue
For example, Let’s say if I want to keep track of the top 10 most frequent elements. The following would be the implementation if I use Java’s native PriorityQueue.
//using Java's priorityQueue
public List<Integer> getTopKFrequent(int[] nums, int k) {
List<Integer> result = new ArrayList<>();
List<Integer> distinctList = new ArrayList<>(); //distinct numbers
HashMap<Integer, Integer> map = new HashMap<Integer, Integer>(); //number, frequency
for(int num: nums){
if(map.containsKey(num)){
map.put(num, map.get(num)+1);
}else{
map.put(num, 1);
distinctList.add(num);
}
}
//PriorityQuueue with custom comparator, with the head of queue being the element with highest frequency.
PriorityQueue<Integer> maxHeap = new PriorityQueue<>((a,b) -> (map.get(b)-map.get(a)));//offer first k from distinct number list
for(int i =0; i<distinctList.size(); i++){
maxHeap.offer(distinctList.get(i));
}
for(int i=0; i<k; i++){
result.add(maxHeap.poll());
}
return result;
}
|
the priorityqueue will hold N elements ( n being the number of distinct elements in the list). it can get very memory intensive when N gets big (let’s say 200k items).
so, We currently have two options if we need to save on memory.
Method 1: use minHeap. we add the elements to the queue until the size of queue reaches k, then we can compare each of the rest of the elements with the head of queue. ( see code_snippet_2 below)
Method 2: use Guava’s minMaxQueue , which can be configured with a maximum size. each time the size of queue exceeds the configured maximum size, the queue automatically removes its greatest element according to its comparator. ( see code_snippet_3 below)
code_snippet_2
//using Java's priorityQueue -minHeap
public List<Integer> getTopKFrequent(int[] nums, int k) {
List<Integer> result = new ArrayList<>();
List<Integer> distinctList = new ArrayList<>(); //distinct numbers
HashMap<Integer, Integer> map = new HashMap<Integer, Integer>(); //number, frequency
for(int num: nums){
if(map.containsKey(num)){
map.put(num, map.get(num)+1);
}else{
map.put(num, 1);
distinctList.add(num);
}
}
//PriorityQuueue with custom comparator, with the head of queue being the element with lowest frequency.
PriorityQueue<Integer> minHeap = new PriorityQueue<>((a, b) -> (map.get(a)-map.get(b)));//offer first k from distinct number list
int index = 0;
for(int i =0; i<k; i++){
minHeap.offer(distinctList.get(i));
index++;
}
while(index < distinctList.size()){
if(map.get(minHeap.peek())< map.get(distinctList.get(index))){
minHeap.poll();
minHeap.offer(distinctList.get(index));
}
index ++;
}
//add to the result in reverse order
for(int i=0; i<k; i++){
result.add(0, minHeap.poll());
}
return result;
}
|
code_snippet_3
//using Guava's minMaxPriorityQueue
public List<Integer> getTopKFrequent(int[] nums, int k) {
List<Integer> result = new ArrayList<>();
List<Integer> temp = new ArrayList<>(); //distinct numbers
HashMap<Integer, Integer> map = new HashMap<>(); //number, frequency
for(int num: nums){
if(map.containsKey(num)){
map.put(num, map.get(num)+1);
}else{
map.put(num, 1);
temp.add(num);
}
}
//Guava's MinMaxPriorityQueue
MinMaxPriorityQueue<Integer> maxHeap = MinMaxPriorityQueue.orderedBy(new Comparator<Integer>(){
public int compare(Integer a, Integer b){
return Integer.compare(map.get(b), map.get(a));
}
}).maximumSize(k).create();
//offer all elements, MinMaxPriorityQueue will envict based on comparators
for(int i =0; i<temp.size(); i++){
maxHeap.add(temp.get(i));
}
//add to the result
for(int i=0; i<k; i++){
result.add(maxHeap.poll());
}
return result;
}
|
At my test, when the number of distinct element is 7, the running time using minHeap is 38ms, while the running time using Guava’s MinMaxPriorityQueue is 8 ms.