python – What's the time complexity of functions in the heapq library

heapq is a binary heap, with O(log n) push and O(log n) pop. See the heapq source code.

The algorithm you show takes O(n log n) to push all the items onto the heap, and then O((n-k) log n) to find the kth largest element. So the complexity would be O(n log n). It also requires O(n) extra space.

You can do this in O(n log k), using O(k) extra space, by modifying the algorithm slightly. I'm not a Python programmer, so you'll have to translate the pseudocode:

# create a new min-heap
# push the first k nums onto the heap
for the rest of the nums:
    if num > heap.peek()
        heap.pop()
        heap.push(num)

# at this point, the k largest items are on the heap.
# The kth largest is the root:

return heap.pop()

The key here is that the heap contains just the k largest items seen so far. If an item is no larger than the kth largest seen so far, it's never put onto the heap. The worst case is O(n log k).
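For anyone translating the pseudocode above, here is a minimal Python sketch. The function name `kth_largest` and the range check on `k` are my own additions, not part of the original answer. heapq has no `peek()`; the smallest item of a heapq heap is simply `heap[0]`.

```python
import heapq

def kth_largest(nums, k):
    """Return the kth largest item, keeping a min-heap of at most k items."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")

    heap = []
    for num in nums[:k]:              # push the first k nums: O(k log k)
        heapq.heappush(heap, num)

    for num in nums[k:]:              # the rest of the nums
        if num > heap[0]:             # heap[0] plays the role of heap.peek()
            heapq.heappop(heap)       # drop the smallest of the k kept so far
            heapq.heappush(heap, num)

    # The heap now holds the k largest items; its root is the kth largest.
    return heap[0]
```

Returning `heap[0]` instead of popping gives the same value and leaves the heap intact, which is handy if you also want the other k-1 items.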

Actually, heapq has a heapreplace method, so you could replace this:

    if num > heap.peek()
        heap.pop()
        heap.push(num)

with

    if num > heap.peek()
        heap.replace(num)

Also, an alternative to pushing the first k items is to create a list of the first k items and call heapify. A more optimized (but still O(n log k)) algorithm is:

# create array of first `k` items
heap = heapify(array)
for remaining nums
    if (num > heap.peek())
        heap.replace(num)
return heap.pop()
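In Python, the heapify-then-replace version might look like the sketch below. The function name `kth_largest_heapreplace` and the range check are assumptions of mine; `heapq.heapify` and `heapq.heapreplace` are the real library calls.

```python
import heapq

def kth_largest_heapreplace(nums, k):
    """Return the kth largest item using heapify on the first k items."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")

    heap = list(nums[:k])             # array of the first k items
    heapq.heapify(heap)               # O(k), in place

    for num in nums[k:]:              # remaining nums
        if num > heap[0]:
            # Pop the smallest item and push num in a single O(log k) step.
            heapq.heapreplace(heap, num)

    return heap[0]                    # root of the min-heap = kth largest
```

heapreplace does one sift instead of the two that a separate pop and push would need, which is where the constant-factor saving comes from.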

You could also call heapify on the entire array, then pop the first n-k items, and then take the top:

heapify(nums)
for i = 0 to n-k-1
    heapq.heappop(nums)
return heapq.heappop(nums)

That's simpler. I'm not sure whether it's faster than my previous suggestion, and it modifies the original array. The complexity is O(n) to build the heap, then O((n-k) log n) for the pops, so the total is O(n + (n-k) log n). The worst case is O(n log n).
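If modifying the caller's list is a problem, one option is to heapify a copy instead. This is only a sketch: the name `kth_largest_copy` is made up, and the copy costs O(n) extra space, so you trade memory for leaving the input untouched.

```python
import heapq

def kth_largest_copy(nums, k):
    """Return the kth largest item without modifying nums."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")

    heap = list(nums)                 # O(n) extra space for the copy
    heapq.heapify(heap)               # O(n) to build the min-heap

    for _ in range(len(heap) - k):    # pop the n-k smallest items
        heapq.heappop(heap)           # O(log n) per pop

    return heap[0]                    # the kth largest is now the root
```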

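heapify() actually takes linear time because its approach is different from calling heapq.heappush() N times.

heapq.heappush() and heapq.heappop() each take O(log n) time because they move a single item up or down the tree, and the height of the tree is about log n.

When you pass an array to heapify(), it works from the bottom of the tree upward, so by the time a node is processed, its left and right subtrees already satisfy the heap property. The same argument holds for a min-heap or a max-heap (heapq.heapify() builds a min-heap). Most nodes sit near the bottom of the tree and can move only a level or two, which is why the total work is O(n).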
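To make the bottom-up idea concrete, here is a textbook-style sketch of building a min-heap. It is not heapq's actual source (CPython's sift routine is organized a bit differently), and the names `sift_down`, `build_min_heap`, and `is_min_heap` are mine.

```python
def sift_down(a, i, n):
    """Move a[i] down until both of its children are >= it."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return                    # heap property holds at i
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest

def build_min_heap(a):
    """Heapify in place, visiting parents from the last one up to the root."""
    n = len(a)
    # Leaves are already one-item heaps, so start at the last parent.
    # When index i is processed, both of its subtrees are already heaps.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)

def is_min_heap(a):
    """Check that every item is >= its parent."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))
```

About half the nodes are leaves and are never touched, a quarter can move at most one level, an eighth at most two, and so on. Summing that series is what gives O(n) rather than O(n log n).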

Hope this would help.

To summarize @Shivam purbia's post:

  1. Using heapq.heapify() can reduce both time and space complexity, because heapq.heapify() works in place and runs in linear time.
  2. Both heapq.heappush() and heapq.heappop() take O(log N) time.

Final code will be like this …

import heapq

def findKthLargest(self, nums, k):
    heapq.heapify(nums)            # in-place min-heapify -> O(N) time
    for _ in range(len(nums)-k):   # pop the (N-k) smallest items
        heapq.heappop(nums)        # each pop costs O(log N) time
    return heapq.heappop(nums)     # the root is now the kth largest
  • Total time complexity is O(N + (N - k) log N)
  • Extra space complexity is O(1), because nums is modified in place
