Okay, this is a quick video on building a binary heap, in worst-case linear time, with an elegant proof. Before watching the video, you should understand the introduction to heaps video, where we saw the insertion and heapify methods. Of course, we can build a heap by iterated insertion, that looks great at first, because the heap is small, and insertion is quick. But as the heap gets bigger, elements are inserted deeper down. In the worst case, we insert elements in increasing order, and it will take order n log n time. The last half of the elements inserted are all at about log n depth, and each might need to bubble all the way up to the root. To be fair, iterated insertion gets a bit of a bum rap: for a set of values in random order, most don’t really bubble up too much. Most values end up near the bottom of the tree, and each insertion will, on average, take just a couple of swaps. But, that analysis is definitely not basic. Maybe I will make an advanced video sometime on that. On the other hand, there is a worst-case linear time build-heap operation. It uses that same observation, that most nodes have a small height. It works bottom-up. For our heap definition, we know that every node in a heap will root a valid sub-heap, now no matter what values you give me, any single node with no children looks like a valid heap, taking whatever is given to you, the values that fall into the leaf positions look good on their own. Half of the nodes are leaf nodes, so that ain’t nothin’. Going from the last node, towards the first, we eventually get to a node that is the parent of some other node. This one here in this case. Now considering it as the root of a subheap, it still has the shape of a heap, but its value isn’t bigger than the value in each of its children. It only has one child here. But, that is exactly the place where our max-heapify from the intro video can be used to fix the heap. So use it. For the next n/4 nodes, we will do the same thing, we will use max-heapify to fix all of these small heaps of height 1. Continuing, we fix the n/8 heaps of height 2, the n/16 heaps of height 3, and so on. The one case that was easiest for the iterated insertion, is now the node that looks worst, the root. But for the n/2 nodes that were worst for the iterated insertion, they are now trivial, because they are already heaps to begin with. For our worst-case analysis, we are now dealing with the sum of node heights in the final heap, rather than the sum of node depths, and fewer nodes have a large height. Note, these two methods might not give the same heap, and in fact, given a random starting permutation, they won’t end up even having the same probability distribution over different heaps. Now for the analysis: for each heapify operation, the worst-case runtime is proportional to the starting height of that node. Here, we start off very well, because half of the nodes start at height 0 and can be skipped altogether. A quarter of the nodes move down one level, an eighth move down two levels, and we can generalize the number of nodes at any given height, and sum over all node heights. With a bit of manipulation, this gives us order n worst-case runtime. By magic. Or calculus. Okay, well if want to make this more formal, but also simplify the math, we should first prove something about the structure of the heap. In a heap of n values, there are floor of (n/2^i) nodes of height i or greater. I’m actually going to leave that as an exercise for the viewer. It can be proved with induction. Now, when being heapified, only the values that start at height at least 1 can be moved down even one level. In the worst case, all of those nodes do move down one level. Nodes starting at height 2 [or higher] can each move down a second level. But we have already moved them down one level, so we only have to count one extra to move down that second level. Nodes starting at height 3 [or higher] can move down a third level. This gives us a summation that is easier to bound than the original summation, and a simple proof of the linear runtime for building a heap. Okay, the next standard video to watch is going to be the one on heapsort. At some point, I also expect to get up a few advanced videos, better looking at iterated insertion, as well as a more optimized heapify method. as well as a more optimized heapify method. But for now I’ve got to get my earwax removed. See you next time.


34 Comments

Leo · May 5, 2015 at 6:53 pm

Hey, first year of CS & math first degree here.
What method do you use for removal of ear wax?

SOS · August 15, 2015 at 7:03 pm

Hi David, thanks a lot. These are great set of videos. Could you number the videos, so that a learning trail is easier? For example binary heap lectures, it was difficult to find the next lecture. Numbering could help a lot.

Sandro Winkler · September 24, 2015 at 4:18 pm

Love your explaining style! +1

Shikhar Gupta · November 10, 2015 at 4:32 am

The animation is great. Helped me to quickly grasp the working of the algorithm.

soham lawar · November 30, 2015 at 4:28 am

Awesome videos , proper explanation , thanks a lot !

arvanaghi · December 11, 2015 at 8:28 pm

You are the best at explaining algorithms I have seen. Well done!

Lawrence Dizon · February 7, 2016 at 11:19 pm

OMG. Clear and precise animations! Thanks

PRANJALI PAI · February 19, 2016 at 1:03 am

Please do Red black trees as well!!!
Understood the concept s soo well!!

Alexander Kuptsov · April 20, 2016 at 6:27 pm

Dear Sir,

What you do is absolutely top notch! Thanks a million!

May I ask you what software you use to do such beautifully designed videos?

liljan777 · April 29, 2016 at 4:34 pm

Hello. Can you say the source of this algorithm ?Where i can read about this algorithm?

Ali Caglayan · May 28, 2016 at 4:02 pm

I love this channel. Keep up the good work!

numanumaro · June 14, 2016 at 3:24 am

In the previous video, didn't the iterated insertion take the least number of operations when the number was small? Because in that case, wouldn't the root take the most operations in both insertion and linear heap building?

Shubham Pathak · July 13, 2016 at 9:49 pm

Great work, Thank you ! Looking forward for more videos, Please keep making such awesome videos !

Amarnath Anand · August 3, 2016 at 5:35 am

I love this channel.. Precise explanations.. Thanks for your good work!!

vipul sharma · August 10, 2016 at 4:02 pm

aren't you the best 🙂

corporalwaffles · August 11, 2016 at 7:51 pm

In this video it looks like you are doing siftUp starting from the very last element of the array. From this link, (http://stackoverflow.com/questions/13025163/why-siftdown-is-better-than-siftup-in-heapify), they say that siftDown is faster. Can you please clarify this situation?

Nikhil Shaw · August 20, 2016 at 5:20 pm

Really an awesome channel! Just don;t stop making more interesting videos 😀

Nikhil Shaw · August 20, 2016 at 5:20 pm

Really an awesome channel! Just don;t stop making more interesting videos 😀

Deltoidz · September 11, 2016 at 6:48 am

These videos are making me learn algorithms at worst case linear time, instead of n^2 from textbooks!

SIR_Vampire · November 5, 2016 at 5:18 pm

Great video, thank you! Sometimes it can be hard to find its subsequent videos, would you consider putting down the links of subsequent videos in description?

videofountain · December 5, 2016 at 5:38 pm

I get this sense you are semi shouting into the microphone.

Filip Kopecký · January 6, 2017 at 2:34 am

I went from having no idea to understanding how it works and why it works in about 5 minutes, this is a great video.

Adrian Lundberg · January 9, 2017 at 7:37 pm

Thank you very much! However, I would like to point out a typo at 4:50. It should say 1/2^h, otherwise it would sum up to n * infinity, not n *unity.

Wes Winn · January 30, 2017 at 7:31 pm

Thank you for making these! I'm a CS graduate and after spending enough years using tools that other people have made, it's sometimes nice to go grab a reminder without a 60 minute lecture (turns out application development doesn't involve building heaps from scratch too often).

Anyways, if you decided to run on Patreon or somesuch, I'll happily subscribe 🙂

Paul Hinker · February 27, 2017 at 3:00 pm

Excellent Job! Thank you for these. Also noticed the 2^n vs 1/2^n typo as mentioned by Adrian Lunberg and saw the comment. Not familiar with YouTube comments, can you pin comments so they stay near or at the top?

WiseSageBum · March 6, 2017 at 12:54 am

Nerdy Hugh Jackman… is that you?

Yu-Cheng Lin · April 16, 2017 at 3:38 pm

Your tutorials are really among the best I've ever seen. The beautiful animated graphics make it feels easy enough to be comprehended by self-learners, yet the content is really deep and abstract. Many thanks!

Daniil Belyakov · May 10, 2017 at 7:52 am

Here is what it might look like:
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

for i in range(len(a) // 2, -1, -1):
parent = i

while True:
candidates = [parent, 2 * parent + 1, 2 * parent + 2]
candidates = [e for e in candidates if e < len(a)]
largest = max(candidates, key=lambda e: a[e])

if largest == parent:
break
else:
a[parent], a[largest], parent = a[largest], a[parent], largest

David Lim · March 6, 2018 at 12:56 am

Hello from Georgia Tech
Good luck with the test 2 on Wesnesday CS1332 Kidos

Prarabdh Joshi · March 17, 2018 at 5:13 am

San jose state University!! YAY

Aditya Roy · February 24, 2019 at 5:59 am

Please keep making
It was one worth of a watch. Though the mathematics was good for understanding, I just tried out this algorithm and I believe I can use it in my code. Thanks again

X Qiu · May 8, 2019 at 6:17 am

hope you will be back uploading videos.

Richard Opoku · May 20, 2019 at 1:33 am

wolverine teaching data structures.

Mona Yin · December 10, 2019 at 9:49 am

Briefly and very clear

Leave a Reply

Your email address will not be published. Required fields are marked *