A painting showing hands painting other hands

Everybody Hates Recursion

Everybody hates recursion. Well, maybe I'm exaggerating, but the technique is often dismissed as useless and complicated. You may be thinking: "I'll just keep my loops, please." And there's nothing wrong with that. In many situations, loops or higher-order functions are the natural solution. Not to mention, some languages have poor support for recursion—but I'll dive into that part later.

But Why?

So, why should you even bother with this esoteric technique? I'll list a few reasons. For one, some problems can easily be expressed recursively—the Tower of Hanoi puzzle quickly comes to mind. Second, some programming languages emphasize recursion as a key language feature. This is true for many functional programming languages. For example, it is one of the things that tripped me up when learning Racket, a modern implementation of Scheme. The language favors recursion over traditional looping statements, so you're often nudged into using recursion. Finally, I think it's good to learn new things. Why stay stagnant? The world is worth exploring.

Explain It Please

To solve a problem using recursion, it must take on a pattern, which I'll describe here. For one, the technique assumes that the problem can be broken down into smaller versions of the same problem. At some point, the breakdown process can no longer continue because an axiomatic form of the problem is reached. We'll call this fully simplified form a base case. These base cases—there can be more than one—are hardcoded by the programmer. Every recursive function needs at least one base case. Otherwise, it's as if you wrote an infinite loop. Of course, you also need a self call—that is, the function calls itself within its own body. Usually, the self call grapples with a reduced version of the problem—this ties into the idea of working on a smaller problem. Once the recursive function is called with a state corresponding to a base case, the function returns some special value, and the chain of function calls is finitely capped.

Factorials

I am going to interrupt our train of thought to inject an example. So, let's stare at this code representing the factorial function. In case your high school math teacher failed you, I'll explain this super quick. Say we want to compute one factorial. Well, that's just one—and hint, it's one of the base cases. But what if you want to compute two factorial? Then, you simply take 2 * 1. Following the pattern, three factorial is 3 * 2 * 1. Yeah, there's not much to this. The formal definition is that the factorial of a positive integer n, denoted by n!, is the product of all positive integers less than or equal to n. As an aside, by convention, zero factorial is defined to equal one—this is the other base case. Ok, some code stuff.

function factorial($n) {
    if ($n < 0) {
        throw new Exception("\$n parameter cannot be negative");
    }
    else if ($n <= 1) {
        // base cases
        return 1;
    }
    else {
        // self call; note how the problem is reduced
        return $n * factorial($n - 1);
        // also, the multiplication is the last operation
        // this is not tail recursive
    }
}

assert(factorial(0) === 1);
assert(factorial(1) === 1);
assert(factorial(2) === 2);
assert(factorial(11) === 39916800);

Hopefully, the base case and the self call are obvious. If you're confused about how the code works, I recommend getting out a sheet of paper and tracing out what happens for different inputs. Also, please note that the $n parameter must be greater than or equal to zero. Otherwise, everything will break because the factorial sequence is only defined for non-negative integers.

The Call Stack

It's important to note that each time a function is called, information about it must be pushed onto the call stack. For the uninitiated, I'll briefly explain this process. To start, there's nothing special about this stack—we are still pushing and popping. In this case, we are pushing frames onto the stack that contain information about the function such as its parameters, its local variables, and a return address.[1] The call stack's implementation is language and machine specific, so let's not get lost in the details. Basically, frames are pushed onto and then popped from the call stack as functions begin and end execution, respectively. Cool, but why do we care? Well, these frames take up memory—and the computer only has a finite amount of memory. Therefore, there is going to be some hard cap on the number of function calls that can be nested. If the computer runs out of space for a new frame, then a stack overflow error will be thrown by most languages. Recursive functions are prone to these types of errors—not because the programmer made a mistake—but due to the natural limitations of nesting large numbers of function calls.
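To make the push and pop order concrete, here's a small sketch. The `traced_factorial` function and its `$log` parameter are my own additions for illustration, not part of the original example—it's the same factorial, just instrumented to record when each frame is entered and exited:

```php
<?php

// Hypothetical helper: a factorial that logs when each call's frame is
// "pushed" (function entered) and "popped" (function returned).
function traced_factorial($n, array &$log) {
    $log[] = "push factorial($n)";
    $result = ($n <= 1) ? 1 : $n * traced_factorial($n - 1, $log);
    $log[] = "pop factorial($n)";
    return $result;
}

$log = [];
traced_factorial(3, $log);

// The innermost call is the last frame pushed and the first popped (LIFO):
// push factorial(3), push factorial(2), push factorial(1),
// pop factorial(1),  pop factorial(2),  pop factorial(3)
```

Notice that `factorial(3)`'s frame stays alive the entire time—it can't be popped until every nested call beneath it has finished.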

Tail Call Optimization

While many people simply hate recursion due to personal preference, there are times when using the technique is a bad idea. As I just mentioned, information about function calls has to be pushed onto the call stack, which takes up memory. Many languages place arbitrary limits on the depth of the call stack. For example, Python sets its limit at about 1000 calls before raising an error. Thus, memory usage is a huge issue when using recursion. In contrast, iterative techniques, such as for loops, don't have this problem because they only require a constant amount of memory regardless of the number of repetitions. But don't worry about this sad news. Some languages implement an idea called tail call optimization that can save us.
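For comparison, here's what the constant-memory iterative version of our factorial looks like. The `factorial_iterative` name is my own, used only for this sketch:

```php
<?php

// Iterative factorial: no matter how large $n gets, only one stack frame
// and a couple of local variables are ever in use.
function factorial_iterative($n) {
    if ($n < 0) {
        throw new Exception("\$n parameter cannot be negative");
    }
    $result = 1;
    for ($i = 2; $i <= $n; $i++) {
        $result *= $i;
    }
    return $result;
}

assert(factorial_iterative(11) === 39916800);
```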

Assuming the language has tail call optimization, a tail recursive function will only take up constant space on the call stack. Although simple, the concept seems to confuse many people. The self call has to be the last operation of the function. Importantly, this does not mean the last line. For an example, review the factorial function from earlier. Its last operation is the multiplication, not the function call.

To convert our factorial function to be tail recursive, a few changes have to be made. Importantly—and this is worth repeating—the self call has to be the last operation. To pull this off, it turns out an extra parameter has to be added to track the accumulated answer. Let's just look at the code.

function factorial_tail($n, $acc = 1) {
    if ($n < 0) {
        throw new Exception("\$n parameter cannot be negative");
    }
    else if ($n <= 1) {
        // base cases
        return $acc;
    }
    else {
        // self call
        return factorial_tail($n - 1, $n * $acc);
        // function call is last operation; tail recursion!
    }
}

assert(factorial_tail(0) === 1);
assert(factorial_tail(1) === 1);
assert(factorial_tail(2) === 2);
assert(factorial_tail(11) === 39916800);

Proving the Constant Space Hypothesis

So what makes tail recursive functions noteworthy? After all, like all functions, they must be allocated on the call stack.[2] Yet, something special happens when nesting these types of functions: the final nested function call's return value is the final answer. To prove I'm not making stuff up, study the following diagram of a call stack for the tail recursive factorial function.
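The same stack can also be reproduced in code. The `factorial_tail_traced` helper and its `$frames` parameter below are my own sketch—it's `factorial_tail` instrumented to record the `($n, $acc)` pair of every frame it creates for an input of 3:

```php
<?php

// Hypothetical sketch: record every frame the tail recursive factorial
// creates, so we can see the stack the diagram describes.
function factorial_tail_traced($n, $acc, array &$frames) {
    $frames[] = "factorial_tail($n, $acc)";
    if ($n <= 1) {
        return $acc;
    }
    return factorial_tail_traced($n - 1, $n * $acc, $frames);
}

$frames = [];
factorial_tail_traced(3, 1, $frames);

// $frames now holds:
//   factorial_tail(3, 1)
//   factorial_tail(2, 3)
//   factorial_tail(1, 6)   <- this frame already returns the answer, 6
```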

From the diagram, it's obvious that the final frame will have a return value of six, the final answer. If we bothered to maintain all the frames and bubble the value back down, then the six would get funneled straight through. Thus, there is no point in keeping the early frames. We can safely discard them.

Now, let's imagine a bad situation. If the function isn't tail recursive, then you'll have to store all the frames and backsubstitute to get the answer. We can't prematurely pop off the early frames because we need information from later calls to complete the last operation. Try getting out a piece of paper and tracing the non-tail recursive factorial function to see why this is the case.

Now, disappointingly perhaps, a lot of programming languages do not implement tail call optimization—and this includes most imperative languages. In most cases, you can find out if a language has tail call optimization with a Google search. Alternatively, I guess you could execute a deeply nested recursive function and see if you get a stack overflow error.
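That empirical check might look something like the sketch below. The `countdown` function is my own, and the exact depth at which a non-optimizing runtime gives up varies by language, runtime, and configuration:

```php
<?php

// A trivially tail recursive countdown. On a runtime with tail call
// optimization, this uses constant stack space at any depth; without it,
// a large enough $n will eventually exhaust the call stack.
function countdown($n) {
    return $n <= 0 ? 0 : countdown($n - 1);
}

countdown(100);          // shallow depth: safe everywhere
// countdown(10000000);  // deep enough to blow the stack on most
//                       // non-optimizing runtimes
```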

Conclusion

Thanks for reading my ramblings about recursion. On one level, it's not so hard. Your function only needs three properties to be recursive and useful: it must have at least one base case, it must call itself, and the input must be reduced with each subsequent self call. Recursive functions are a great way to solve problems that are built from smaller, identical subproblems. Yet, I wouldn't force the technique when it's not needed. Without tail call optimization, recursion is way less powerful. In fact, I would recommend experimenting with recursion using a functional programming language. These languages are built around function calls and often depend on recursion for iteration. Anyway, I hope this guide inspires you to play around with this concept.


  1. To note, the return address is like a bookmark that tells the computer where to jump back to after the function call ends. ↩︎

  2. Ok, so in real life, there are multiple different ways to implement tail call optimization. The programming language isn't necessarily prematurely dropping stack frames. Yet, the ideas I show here give an intuition about why we can even attempt the optimization. ↩︎