Algorithms - Heaps / Priority Queues - Efficient top-k or priority processing
Intro
I am currently looking for a job, and in my research into coding-test platforms I found that the problems which cannot be solved with just a little thinking and some loops are about 80% covered by just 8 categories:
- Optimization with overlapping subproblems (coin change, longest increasing subsequence, knapsack variants, matrix chain) - solved with Dynamic Programming (memoization/tabulation).
- Pathfinding, connectivity, or level-order processing in grids/graphs (number of islands, shortest path in maze, rotten oranges) - solved with Graph Traversal: BFS and DFS.
- Hierarchical data processing (tree diameter, in-order traversal, BST validation, lowest common ancestor) - solved with Tree Traversals and Properties (pre/in/post-order, level-order, recursion/DFS on trees).
- Exhaustive search with pruning (permutations, subsets, N-Queens, sudoku solver, word search) - solved with Backtracking.
- Local optimal choices that work globally (jump game, task scheduler, fractional knapsack) - solved by Greedy Algorithms.
- Efficient "top-k" or priority processing (merge k sorted lists, kth largest element, task scheduler with cooldown) - solved with Heaps / Priority Queues.
- Efficient union of sets / cycle detection in graphs (redundant connection, number of provinces, Kruskal's MST) - solved with Union-Find (Disjoint Set Union) with path compression & union-by-rank.
- Searching in sorted/monotonic spaces (search in rotated sorted array, minimum in rotated array, binary search on answer for capacity problems) - solved by Binary Search (advanced variants).
Bonus:
- Efficient prefix and pattern matching - solved by Trie and KMP String Search
- Efficient range queries and updates - solved by Segment Tree and Fenwick Tree
- Shortest paths with negative weights and all-pairs - solved by Bellman-Ford and Floyd-Warshall
Today I cover Heaps and Priority Queues.
When to use Heaps / Priority Queues
We already saw a heap in action in the Greedy post (Task Scheduler), but today we'll zoom in on the data structure itself and why it's the perfect tool whenever you need the "best" or "worst" item right now without sorting the whole collection.
Recognize the problems that can be solved by heaps / priority queues by the following signs:
- You need repeated access to the current minimum or maximum element while the collection keeps changing (insertions, deletions, updates).
- You only care about the top-k (best/worst) elements, not the full sorted order.
- You have to process tasks, events or nodes in priority order (highest frequency, earliest deadline, smallest value, etc.).
- The data is dynamic - elements arrive over time and you need O(log n) updates.
- Brute force would require sorting the entire set repeatedly (O(n log n) per operation) but you can do much better.
- The problem mentions "k largest/smallest", "merge k sorted", "median finder", "task scheduler with cooldown", "Dijkstra shortest path", etc.
If the collection is static and you need everything sorted, just .Sort() or .OrderBy. Heaps shine when you only ever need the extreme values and you need them fast - O(log n) per insert or extract.
Classic patterns appear in LeetCode/HackerRank: Kth Largest Element, Merge K Sorted Lists, Task Scheduler, Find Median from Data Stream, Top K Frequent Elements, etc.
One-sentence rule: If you need the "best" (or "worst") item right now, repeatedly, while the set changes, and you don't need the full sorted list, use a heap (priority queue).
Example - Kth largest element
The problem is called Kth Largest Element.
Problem statement: Given an integer array nums and an integer k, return the k-th largest element in the array.
Naïve solution: sort the array, complexity O(n log n).
Better solution: keep a min-heap of size exactly k. At the end the root of the heap is the kth largest.
int FindKthLargest(int[] nums, int k)
{
// Min-heap: smallest of the k largest candidates sits at the top
var pq = new PriorityQueue<int, int>();
foreach (var num in nums)
{
if (pq.Count < k)
{
pq.Enqueue(num, num); // priority = value (min-heap)
}
else if (num > pq.Peek())
{
pq.Dequeue();
pq.Enqueue(num, num);
}
}
return pq.Peek();
}
var nums = new[] { 3, 2, 1, 5, 6, 4 };
Console.WriteLine(FindKthLargest(nums, 2)); // 5
It's basically the classic find-min/max loop, but tracking a window of k values instead of a single value.
We only ever keep the k largest numbers seen so far. Because it's a min-heap, the smallest among them is always at the root. Any number smaller than that root can be discarded forever - it can never be part of the top-k. At the end we have exactly the k largest elements and the root is the kth.
Works perfectly for streaming data or when k is much smaller than n.
Complexity:
- Time: O(n log k) - each of the n elements costs at most one heap operation
- Space: O(k)
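Because the heap never looks at elements it has already discarded, the exact same idea works when the numbers arrive one at a time. A minimal sketch - the class name KthLargestStream and its Add API are my own illustration (mirroring the classic "Kth Largest Element in a Stream" problem), not part of the solution above:

```csharp
// Sketch: the same size-k min-heap, kept alive between arrivals.
// Every Add costs O(log k); the root is always the current kth largest.
public class KthLargestStream
{
    private readonly int _k;
    private readonly PriorityQueue<int, int> _pq = new(); // min-heap

    public KthLargestStream(int k, int[] seed)
    {
        _k = k;
        foreach (var num in seed) Add(num);
    }

    // Insert num and return the current kth largest seen so far
    public int Add(int num)
    {
        if (_pq.Count < _k)
        {
            _pq.Enqueue(num, num);
        }
        else if (num > _pq.Peek())
        {
            _pq.Dequeue();   // drop the smallest of the top-k candidates
            _pq.Enqueue(num, num);
        }
        return _pq.Peek();
    }
}
```

For example, `new KthLargestStream(3, new[] { 4, 5, 8, 2 }).Add(3)` returns 4 - inserting 3 doesn't displace anything in the current top three { 4, 5, 8 }.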
Theory
A (binary) heap is a complete binary tree that satisfies the heap property:
- Max-heap: every parent ≥ children
- Min-heap: every parent ≤ children
Because it's complete, it can be stored in an array with beautiful cache locality (parent at i, children at 2i+1 and 2i+2).
C# gives us PriorityQueue<TElement, TPriority> (available since .NET 6). By default it is a min-heap on the priority value.
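To make the array mapping concrete, here is a minimal hand-rolled min-heap - an illustration of the index arithmetic only; in real code you'd reach for PriorityQueue:

```csharp
// Minimal array-backed min-heap demonstrating the layout:
// parent of i is (i - 1) / 2, children of i are 2i + 1 and 2i + 2.
public class MinHeap
{
    private readonly List<int> _a = new();

    public int Count => _a.Count;
    public int Peek() => _a[0];

    public void Push(int value)
    {
        _a.Add(value);
        int i = _a.Count - 1;
        // Sift up: swap with the parent while smaller than it
        while (i > 0 && _a[i] < _a[(i - 1) / 2])
        {
            (_a[i], _a[(i - 1) / 2]) = (_a[(i - 1) / 2], _a[i]);
            i = (i - 1) / 2;
        }
    }

    public int Pop()
    {
        int root = _a[0];
        _a[0] = _a[^1];          // move the last leaf to the root
        _a.RemoveAt(_a.Count - 1);
        int i = 0;
        // Sift down: swap with the smaller child while out of order
        while (true)
        {
            int left = 2 * i + 1, right = 2 * i + 2, smallest = i;
            if (left < _a.Count && _a[left] < _a[smallest]) smallest = left;
            if (right < _a.Count && _a[right] < _a[smallest]) smallest = right;
            if (smallest == i) break;
            (_a[i], _a[smallest]) = (_a[smallest], _a[i]);
            i = smallest;
        }
        return root;
    }
}
```

Both operations touch only one root-to-leaf path, which is why they are O(log n).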
A quick detour: QuickSelect is an alternative to the heap for the kth-largest problem, and it is built on QuickSort's partition step. QuickSort works like this:
- Pick a pivot.
- Partition the array so everything smaller is on the left, everything larger on the right.
- Recurse on both sides.
QuickSelect does exactly the same partition, but then throws away the side that cannot contain the k-th element and recurses on only one side. That single decision is what drops the average time from O(n log n) to O(n).
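A possible sketch of QuickSelect for the kth largest - the function name and the randomized pivot are my own choices; average O(n), worst case O(n²):

```csharp
// QuickSelect: partition like QuickSort, but recurse (here: loop) into
// only the side that can contain the target index.
int QuickSelectKthLargest(int[] nums, int k)
{
    var a = (int[])nums.Clone();       // don't mutate the caller's array
    int target = a.Length - k;         // index of the kth largest in ascending order
    int lo = 0, hi = a.Length - 1;
    var rng = new Random();
    while (true)
    {
        // Random pivot avoids the O(n^2) worst case on already-sorted input
        int pivotIndex = rng.Next(lo, hi + 1);
        (a[pivotIndex], a[hi]) = (a[hi], a[pivotIndex]);
        int pivot = a[hi], store = lo;
        for (int i = lo; i < hi; i++)  // Lomuto partition
        {
            if (a[i] < pivot)
            {
                (a[i], a[store]) = (a[store], a[i]);
                store++;
            }
        }
        (a[store], a[hi]) = (a[hi], a[store]); // pivot lands in its final position
        if (store == target) return a[store];
        if (store < target) lo = store + 1; else hi = store - 1;
    }
}
```

On the earlier example, `QuickSelectKthLargest(new[] { 3, 2, 1, 5, 6, 4 }, 2)` returns 5, same as the heap version, but in O(1) extra space.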
Heaps / Priority Queues template
Even though the concept is simple, there is a reusable pattern:
- Decide min-heap or max-heap (C# PriorityQueue is min-heap by default on the priority value).
- Create the queue: new PriorityQueue<TElement, TPriority>() (or with custom comparer).
- Seed the heap with initial elements (often first k or first node of each list).
- While you need the next "best" item:
- Peek() or Dequeue() the top
- If more data exists for that source, Enqueue the next one
- Stop when you have extracted what you need.
Common patterns:
- Same value for element and priority -> PriorityQueue<int, int>
- Max-heap -> use negative priorities or a custom IComparer<TPriority>
- Complex objects -> store a tuple or a small record as the element and the sorting key as the priority
In C# it looks like this:
var comparer = Comparer<TPriority>.Create(/* custom logic */);
var pq = new PriorityQueue<TElement, TPriority>(comparer);
// Enqueue / Dequeue / Peek are all O(log n)
// Count and Peek are O(1)
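For the max-heap pattern from the list above, both standard tricks look like this (a sketch with made-up values, not from the original post):

```csharp
// 1) Negate the priority: the min-heap on -value behaves like a max-heap on value
var maxByNegation = new PriorityQueue<int, int>();
maxByNegation.Enqueue(3, -3);
maxByNegation.Enqueue(7, -7);
maxByNegation.Enqueue(5, -5);
Console.WriteLine(maxByNegation.Dequeue()); // 7

// 2) Reversed comparer: safer when negation could overflow (e.g. int.MinValue)
var desc = Comparer<int>.Create((a, b) => b.CompareTo(a));
var maxByComparer = new PriorityQueue<int, int>(desc);
maxByComparer.Enqueue(3, 3);
maxByComparer.Enqueue(7, 7);
maxByComparer.Enqueue(5, 5);
Console.WriteLine(maxByComparer.Dequeue()); // 7
```

The comparer version reads more clearly in interviews, and it generalizes to any TPriority, not just numbers.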
Common Pitfalls & Trade-Offs:
- Choosing min-heap vs max-heap - double-check with a small example.
- Custom comparers on complex types - make sure they are consistent (same as IComparable if you mix approaches).
- Using a heap when you actually need the full sorted list later - sometimes two heaps (or a sorted set) are cheaper.
- Memory pressure when k is huge - O(k) space is fine for k ≤ n/2; otherwise QuickSelect can be better (see the theory chapter).
- Ties - if two items have the same priority, the heap doesn't guarantee FIFO order unless you add an extra index to the priority.
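The last pitfall has a one-line fix: extend the priority with a monotonically increasing insertion index. A sketch - the default tuple comparer compares the first item, then the second, so equal priorities dequeue in FIFO order:

```csharp
// Stable priority queue: (priority, insertionIndex) as the composite priority.
var pq = new PriorityQueue<string, (int priority, long index)>();
long counter = 0;
pq.Enqueue("first", (1, counter++));
pq.Enqueue("second", (1, counter++));  // same priority, later index
pq.Enqueue("urgent", (0, counter++));

Console.WriteLine(pq.Dequeue()); // urgent
Console.WriteLine(pq.Dequeue()); // first
Console.WriteLine(pq.Dequeue()); // second
```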
Example - Merge k sorted lists
A more impressive use case is Merge K Sorted Lists.
Problem statement: You are given an array of k linked lists, each sorted in ascending order. Merge them into one sorted linked list and return it.
Example: lists = [ [1,4,5], [1,3,4], [2,6] ]
Output: [ 1, 1, 2, 3, 4, 4, 5, 6 ]
The naive way (repeatedly merging one list at a time into an accumulated result) is painfully slow - roughly O(N * k) overall.
Heap way:
ListNode MergeKLists(ListNode[] lists)
{
if (lists == null || lists.Length == 0) return null;
// Min-heap: priority = node.Value
var pq = new PriorityQueue<ListNode, int>();
// Seed with the head of every non-empty list
for (int i = 0; i < lists.Length; i++)
{
if (lists[i] != null)
{
pq.Enqueue(lists[i], lists[i].Value);
}
}
ListNode dummy = new ListNode(0);
ListNode tail = dummy;
while (pq.Count > 0)
{
ListNode node = pq.Dequeue();
tail.Next = node;
tail = tail.Next;
if (node.Next != null)
{
pq.Enqueue(node.Next, node.Next.Value);
}
}
return dummy.Next;
}
ListNode[] lists = [
new [] {1, 4, 5},
new [] {1, 3, 4},
new [] {2, 6},
];
Console.WriteLine(MergeKLists(lists)); // 1, 1, 2, 3, 4, 4, 5, 6
The ListNode class is this:
public class ListNode(int value)
{
public int Value { get; set; } = value;
public ListNode Next { get; set; }
/// <summary>
/// Define a linked list from an int array
/// </summary>
/// <param name="values"></param>
public static implicit operator ListNode(int[] values)
{
if (values == null || values.Length == 0)
return null;
var head = new ListNode(0);
var current = head;
foreach(var value in values)
{
current.Next = new ListNode(value);
current = current.Next;
}
return head.Next;
}
public override string ToString()
{
var list = new List<int>();
var node = this;
while (node is not null)
{
list.Add(node.Value);
node = node.Next;
}
return "[ " + string.Join(", ", list) + " ]";
}
}
This is optimal:
- Every node is enqueued and dequeued exactly once - O(N log k) where N is total nodes
- At any moment the heap holds at most k nodes (one per list)
- Space: O(k)
- Beats the naive O(N * k) or repeated pairwise merging dramatically
More theory
Dijkstra's algorithm computes the shortest paths from a single source vertex to all other vertices in a graph with non-negative edge weights. It works exactly like a "greedy best-first" expansion: keep a min-heap (priority queue) of nodes ordered by their current known distance from the source. Repeatedly extract the node with the smallest distance, then relax (update) all its neighbors. Because we always process the closest unsettled node next, the first time a node is dequeued its distance is guaranteed to be optimal. The heap is what makes the repeated "extract the current best candidate" operation blazingly fast - turning the algorithm into the go-to solution for routing, GPS, network delay, and hundreds of other real-world problems.
Don't hate me. We need this for the next example.
Example - Dijkstra shortest path
Problem name is Dijkstra's Shortest Path.
Problem statement: Given a weighted graph with non-negative edges represented as an adjacency list (List<List<(int node, int weight)>>) and a source node, return the shortest distance from the source to every other node.
Example graph (4 nodes):
// 0 ──4──► 1
// │ ▲
// │ │
// 1 2
// │ │
// ▼ ▼
// 2 ──5──► 3
// (plus 2→1 weight 2 and 1→3 weight 2)
var graph = new List<List<(int node, int weight)>>
{
new() { (1, 4), (2, 1) }, // from 0
new() { (3, 2) }, // from 1
new() { (1, 2), (3, 5) }, // from 2
new() {} // from 3
};
int source = 0;
Expected output: [0, 3, 1, 5]
(0 -> 2 -> 1 costs 1+2=3, better than direct edge 4; 0 -> 2 -> 1 -> 3 costs 5)
int[] Dijkstra(List<List<(int node, int weight)>> graph, int source)
{
var n = graph.Count;
var dist = Enumerable.Repeat(int.MaxValue, n).ToArray();
dist[source] = 0;
// Min-heap: (current_distance, node)
var pq = new PriorityQueue<(int distance, int node), int>();
pq.Enqueue((0, source), 0);
while (pq.Count > 0)
{
var (distance, node) = pq.Dequeue();
if (distance > dist[node]) continue; // outdated entry - ignore
foreach (var (destNode, destDistance) in graph[node])
{
var newDistance = dist[node] + destDistance;
if (newDistance < dist[destNode])
{
dist[destNode] = newDistance;
pq.Enqueue((newDistance, destNode), newDistance);
}
}
}
return dist;
}
var graph = new List<List<(int node, int weight)>>
{
new() { (1, 4), (2, 1) }, // from 0
new() { (3, 2) }, // from 1
new() { (1, 2), (3, 5) }, // from 2
new() {} // from 3
};
int source = 0;
var distances = Dijkstra(graph, source);
Console.WriteLine(string.Join(", ", distances)); // 0, 3, 1, 5
The priority queue always hands us the next "best" (closest) unsettled node. The check if (distance > dist[node]) continue is the classic lazy-deletion trick that handles duplicate entries efficiently. With non-negative weights, the first time we dequeue a node its distance can never be improved later - exactly the same "process the current global best" pattern we saw in Merge K Lists and Kth Largest.
Complexity:
- Time: O((V + E) log V)
- Space: O(V)
Conclusion
Heaps / Priority Queues turn repeated "find the current best" operations from O(n log n) into O(log k) or O(log n) and power everything from task schedulers to Dijkstra's shortest path to median finders. Once you get it, you stop reaching for .Sort() when you only need the extremes.