PDA

View Full Version : tail recursion 101



Maarten Witberg
07-20-2006, 01:20 PM
thread compiled from posts originally here (http://filemakertoday.com/filemakertoday.com_non_ssl/com/showthread.php?t=10173).



...the following custom function:

NumRange ( From ; To )
----
If(
From and To > From;
From & ", " & NumRange ( From + 1; To);
From
)

...You will be able to use it to generate ranges of up to 10,000 integers...

Hi Ray,
Why 10,000? I thought FM allows up to 50,000 for a single recursion?

kjoe

CobaltSky
07-20-2006, 04:47 PM
Hi Ray,
Why 10.000? I thought FM allows up to 50,000 for a single recursion?

kjoe
Hi kjoe,
The stack overflow for recursion occurs after 10,000 iterations, so most common forms of recursion max out at this point.

The limit of dependent operations for evaluation of an expression is 50,000 so if the syntax uses tail-recursion (ie constructed so that it is not required to return values from the stack) then the higher limit applies.

Maarten Witberg
07-20-2006, 10:49 PM
Thank you Ray.

To make the terminology fully clear, tail recursion would mean subtracting from the max. amount of recursions, for instance a loop through a value list that starts at the bottom? (like in post #6 here (http://filemakertoday.com/filemakertoday.com_non_ssl/com/showthread.php?t=10118)).
I'm just starting using recursive cfs. In trying to get them to work or understanding other peoples I pretend they're scripts but that helps only a bit... :p

kjoe

CobaltSky
07-21-2006, 07:49 AM
Thank you Ray.

To make the terminology fully clear, tail recursion would mean subtracting from the max. amount of recursions, for instance a loop through a value list that starts at the bottom? (like in post #6 here (http://filemakertoday.com/filemakertoday.com_non_ssl/com/showthread.php?t=10118)).
I'm just starting using recursive cfs. In trying to get them to work or understanding other peoples I pretend they're scripts but that helps only a bit... :p

kjoe
Hi kjoe,
Tail recursion has nothing to do with whether you 'loop' backwards from the end or forward from the beginning. What matters is how values accumulate and are passed down through successive function calls (and in particular, whether the function must feed values via a temporary memory 'stack' in the course of its operation). In the simplest terms, recursive functions that are constructed so that they don't depend on the stack are tail recursive.

The function at post 6 of the thread you've referenced is not tail recursive and is therefore subject to the 10,000 stack limit. The reason is that the core of the expression:

FindLinesContainingString2 ( LeftValues(SampleText;lines-1) ; string ) & include

...leaves the result from the current iteration ("include") dangling while the CF passes through a further evaluation of itself. That further evaluation in turn leaves its own "include" dangling as yet another call is evaluated. It is the succession of dangling values (which must be held in memory until the final pass when the exit condition is met) that form the contents of the stack.

To create a tail recursive code architecture, it's necessary to rework the syntax so that nothing is left in memory as evaluation progresses. That way each pass of the CF fully resolves and delivers its result entirely to the next iteration without the requirement for anything to be left 'dangling'. The most common (though not the only) way to do this is to pass the accumulating result/s between iterations of the function (ie successive calls of the CF) via one of the function's parameters. This can (for instance) be one of the existing parameters, or (clumsy work-around) an additional 'dummy' parameter defined solely for this purpose.

As an example, (using existing parameters) one might rework the function you mentioned around a slightly different logic, as:

FindLinesContainingString_Tail ( SampleText; String)
-------------
Let([
Lines = ValueCount ( SampleText );
ValNo = GetAsNumber(RightValues(string; 1));
ValCurr = GetValue( SampleText; ValNo );
ChkStrg = GetValue(String; 1);
Exclude = PatternCount( ValCurr; ChkStrg ) = 0];

Case(
ValueCount(String) = 1; FindLinesContainingString_Tail ( SampleText; String & ¶ & Lines );
ValNo > 0 ;
FindLinesContainingString_Tail (
LeftValues(SampleText; ValNo - Exclude) &
RightValues(SampleText; Lines - ValNo);
ChkStrg & ¶ & (ValNo - 1)
);
SampleText
)

)

This complicates matters slightly, but by doing so arrives at a syntax that enables the calls prior to the exit condition to pass their results cleanly and completely to the next iteration, with nothing held in memory - ie the result accumulates within (rather than outside) the parameters of the calls the CF makes to itself.

Thus the recursion proceeds in a linear (or spiral) fashion, with no requirement to pass back via the stack retrieving results from the successive (preceding) iterations after completing the sequence of function calls. Tail recursion. smiley_cool

Maarten Witberg
07-21-2006, 12:24 PM
Hi Ray,

Thank you very much. I'm not going to pretend I've got that cf down yet, but the theory is lucid. You're a teacher. smiley-smile

kjoe

CobaltSky
07-22-2006, 04:28 AM
...I'm not going to pretend I've got that cf down yet...
Okay, well seeing as these posts have been moved and we've got our own special sand-box for this conversation now, let me see if I can drill down a bit on this TRCF (tail recursive custom function) thing.

The CF I posted (see post#4 above) is not the most simple example of tail recursion, for two reasons:

A. I was responding to your query concerning whether or not your CF (at post #6 (http://filemakertoday.com/filemakertoday.com_non_ssl/com/showpost.php?p=39098&postcount=6) on the How to pull one line out of a mass of text? thread) was a TRCF - so it made some sense to show (for comparison) a CF that would deliver equivalent functionality using tail recursion, and

B. I have an 'allergic' reaction to the practice of adding dummy parameters to CFs as a convenience for the developer - even though doing so can slightly simplify the syntax of the CF itself (but at the expense of a more complex and less transparent/logical syntax for the end-user).

So, for the sake of comparison, here is a reworked version of the original CF that started this discussion, set up to support tail recursion:

NumRange_Tail ( From ; To )
----
Let(
pN = GetAsNumber(RightWords(From; 1));
If(pN < To; NumRange_Tail(From & ", " & (pN + 1); To); From)
)

I'm still avoiding adding a dummy parameter (after all, I don't want to break out in hives...), but perhaps you'll find it easier to focus on the stripped down logic in this example. As you can see, in the original, the payload was delivered (and recursion invoked) by the expression:

From & ", " & NumRange ( From + 1; To)

...which leaves the From & ", " bit dangling while a further "call to self" is resolved at the next iteration - and all those From & ", " bits accumulate on the stack until the exit expression From and To > From is satisfied.

In the tail recursive version, however, the From & ", " part of the logic is tucked inside the call which invokes recursion, thus:

NumRange(From & ", " & (pN + 1); To)

...leaving nothing dangling to require the use of the stack. Doing this necessitates a bit of extra work to parse out the closing number of the range-in-progress at each call, so hence the need for the GetAsNumber(RightWords(From; 1)) part of the TRCF. The "From" parameter is now serving a dual purpose as the vehicle to track progress and as a repository for the accumulating string which will become the result (so it is, if you will, providing a quasi-stack for us as the evaluation proceeds).

Because in this sense we have 'made our own stack' and don't need to employ FileMaker's memory stack in the TRCF syntax, we're no longer constrained by the 10,000 stack overflow limit imposed by the engineers at FMI. Instead we get the benefit of the higher 50,000 limit of dependent operations.

But we may have to wait a bit to enjoy it - 'cos 50,000 passes is going to take a while to compute! smiley-surprised smiley-wink

Maarten Witberg
07-22-2006, 10:10 AM
Hi Ray

at the risk of making a fool of myself, let's see if i'm a good student. :p
here's the cf you reworked, this time with my comments added to indicate what I think the various bits do:

FindLinesContainingString_Tail ( SampleText; String)
-------------
Let
(
[
Lines = ValueCount ( SampleText );
ValNo = GetAsNumber(RightValues(string; 1));
/* Valno picks the line number to be tested */
ValCurr = GetValue( SampleText; ValNo );
/* ValCurr is the line number picked by ValNo,
either a number or the original user entered string */
ChkStrg = GetValue(String; 1);
/*returns the original string the user has entered */
Exclude = PatternCount( ValCurr; ChkStrg ) = 0
/*a boolean that is true if 'string' has more than
1 value (the user entered search string) */
];

Case
(
ValueCount(String) = 1;
FindLinesContainingString_Tail ( SampleText; String & ¶ & Lines );
/* this should be the result on the first pass.
"lines" is included into the recursion*/
ValNo > 0 ;
FindLinesContainingString_Tail ( LeftValues(SampleText; ValNo - Exclude)
& RightValues(SampleText; Lines - ValNo); ChkStrg & ¶ & (ValNo - 1))
/*the result on subsequent passes. leftvalues does the current test,
rightvalues is the stack that is piggybacking.
I cannot quite see through the juggling with valno, exclude and lines. */
;
SampleText
/* default -> final pass when ValNo has become 0 */
) /*end case */
) /*end let */

kjoe

ps if you reply i hope it starts with "no not quite" rather than "no not at all"........

CobaltSky
07-22-2006, 11:43 AM
...ps if you reply i hope it starts with "no not quite" rather than "no not at all"........
Happy to oblige with a "not quite", kjoe. ;)

Actually there are only two or three points that could use some clarification. So - here are some comments on your comments:

Exclude = PatternCount( ValCurr; ChkStrg ) = 0
/*a boolean that is true if 'string' has more than 1 value (the user entered search string) */
Yes, it's boolean, so you have that bit right, but it is not checking the number of values within 'String'. Rather, it is checking whether the user-entered string value is present in the line currently being tested.

ValueCount(String) = 1;
FindLinesContainingString_Tail ( SampleText; String & ¶ & Lines );
/* this should be the result on the first pass. "lines" is included into the recursion*/
Yes, except that lines is not so much becoming 'part of the recursion' as being piggy-backed onto the String parameter so that it can be used as a counter value (and decremented on subsequent calls).

..and further down:

ValNo > 0 ;
FindLinesContainingString_Tail ( LeftValues(SampleText; ValNo - Exclude)
& RightValues(SampleText; Lines - ValNo); ChkStrg & ¶ & (ValNo - 1))
/*the result on subsequent passes. leftvalues does the current test, rightvalues is the stack that is piggybacking. I cannot quite see through the juggling with valno, exclude and lines. */
You're partly right here. This is indeed the result on the second and subsequent calls to the CF (excepting the last). LeftValues( ) splits the received list either after or before the line currently being tested, depending on the value of "Exclude". Ie if the PatternCount( ) test passes, the boolean will be false so Exclude will be equal to zero and the LeftValues( ) argument will grab the line currrently being tested - otherwise Exclude will evaluate to 1 and the line currently being tested will be omitted. RightValues( ) then back-fills with the lines (if any) already checked (so yes, it is performing the function of the stack in your conventional formulation - passing the checked values intact via the next function call).

Since in this case (as it happens - it's purely a matter of choice) we are working backwards from the last value, we then decrement ValNo by 1 so that the next pass will check the line prior to the one checked on the current pass.

This process continues until *all* the lines have been checked - at which point the second Case( ) argument - ValNo > 0 - fails and the default result kicks in - returning only those lines (if any) which contained the string being checked.

So... the default value on the last call of recursion (SampleText) is returning a final and complete value (representing the cumulative result of all preceding calls) which it pulls entirely from one of the parameters in play at the tail of the process - nothing need be retrieved at this point from the memory stack. Which is what qualifies this construction as tail recursion - and gets it around the 10,000 stack overflow limit. So Cool!

Maarten Witberg
07-22-2006, 01:12 PM
hi Ray

Cf's like this require rather more insight than my usual 'press the button and see what happens' mode of learning. And with hindsight it is rather silly that I did not see that Exclude does the actual test. But this has been a good learning experience! thank you once again for patience and clear explanation.

kjoe

forbade
01-28-2008, 10:32 AM
So basically the secret is passing a modified exit condition to itself, in the form of a special input to one of the function parameters.

The Recursive function keeps track of itself, without having to access the stack...

noice :D