Four Kinds of Scoping in R
In the first and second parts of this blog series, I defined lexical and dynamic scope, and demonstrated interesting cases of scoping in R.
In this third and final part of my blog series, I’d like to discuss a paper by the creators of R, where they motivate the need for lexical scoping in a statistical programming language.
This is a “bonus” blog post, because I’m going to dive into some of the hairier R features to show how four different kinds of scoping can be simulated in R.
Lexical scope and statistical computation
In Lexical Scope and Statistical Computation,^{1} Robert Gentleman and Ross Ihaka, the creators of R, discuss why they designed R with lexical scoping. The paper is written for a statistics audience, and they provide motivating examples for having lexical scoping in R.
For the purpose of their discussion, they define four (slightly different) kinds of scoping rules:
 trivial: no free variables allowed
 static: a free variable takes its value from a set of global variables
 lexical: a free variable takes the value of the binding that was in effect when the function was defined
 dynamic: a free variable takes the value of the most recent assignment to that variable
Note that under this set of definitions, static scoping is a separate scoping rule and not another name for lexical scoping.
It is possible to simulate each of strategies in R. For fun, we can even construct “factories” that take a function, and then modify it to use the desired scoping rule! (Jan Ječmen originally provided these examples to me, and I adapted them for this blog post after some feedback from Artem Pelenitsyn.)
Template
Our examples will follow the template given below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 
factory < function(fun) { <???> } x < 0 h < function() { x < 1 factory(function(a) x+a) } g < h() f < function() { x < 2 g(0) } f() # error, 0, 1, or 2 
We want to define a factory
that takes a function literal and returns a closure that implements the desired scoping rule.
Our example consists of three definitions of x
. On line 5, we assign 0
to x
at the top level. On line 7, we assign 1
to x
inside function h
, where we also create the closure. On line 12, we assign 2
to x
inside the function f
and right before we call g
, which is the closure.
Finally, we call f
and observe the result:
 Under trivial scoping, no free variables are allowed, so
f()
should result in an error.  Under static scoping, free variables may only refer to global variables, so
f()
should return0
.  Under lexical scoping, free variables refer to the variables in scope when the function was defined, so
f()
should return1
.  Under dynamic scoping, free variables take the value from the most recent assignment, so
f()
should return2
.
We will implement the body of factory
in only 3–5 lines of code. The rest of the code snippet, from lines 7 to the end, will remain the same, other than the call to factory
on line 10.
Trivial scoping
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 
makeTrivial < function(fun) { res < eval(substitute(fun)) environment(res) < baseenv() res } x < 0 h < function() { x < 1 makeTrivial(function(a) x+a) } g < h() f < function() { x < 2 g(0) } f() # Error in f(0) : object 'x' not found 
substitute
returns the unevaluated parse tree for fun
. In other words, it obtains the literal argument that was passed for fun
. This works because of callbyneed semantics in R: function arguments are packaged up into promises. As a result, the syntax tree of arguments is available for metaprogramming. A recent paper by Goel and Vitek^{2} discusses laziness in R in more detail.
In this example, on line 8, we call factory
with function(a) x+a
as the argument for the formal parameter fun
. Then, we evaluate that parse tree with eval
.
At this point, res
is the closure with expression function(a) x+a
and a reference to the environment of makeTrivial
. On line 3, we change that reference to baseenv()
, which is the environment containing library definitions. Since this environment is above the (user) toplevel environment, global variables are not available.
Therefore, variable lookup in the function literal will only search the base environment, so f()
results in an error.
Static scoping
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 
makeStatic < function(fun) { res < eval(substitute(fun)) environment(res) < globalenv() res } x < 0 h < function() { x < 1 makeStatic(function(a) x+a) } g < h() f < function() { x < 2 g(0) } f() # 0 
For this example, on line 3, we update the environment of res
to refer to globalenv()
, which is the toplevel environment where globals are defined.
Therefore, variable lookup searches the toplevel environment, so f()
returns 0
.
Lexical scoping
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 
makeLexical < function(fun) { res < eval(substitute(fun)) environment(res) < parent.frame() res } x < 0 h < function() { x < 1 makeLexical(function(a) x+a) } g < h() f < function() { x < 2 g(0) } f() # 1 
Although lexical scoping is the default for R, our factory template requires some metaprogramming to work properly. We need to set the environment of res
to parent.frame()
, which is the environment of the function (h
) that called the current function (makeLexical
). This allows us to simulate lexical scoping, as if the function literal was evaluated inside h
, rather than makeLexical
.
Therefore, variable lookup searches the environment of h
, so f()
returns 1
.
Dynamic scoping
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 
makeDynamic < function(fun) { function(...) { res < eval(substitute(fun)) environment(res) < parent.frame() res(...) } } x < 0 h < function() { x < 1 makeDynamic(function(a) x+a) } g < h() f < function() { x < 2 g(0) } f() # 2 
For this example, we need another level of indirection. makeDynamic
returns an anonymous function literal. The anonymous function takes ...
, which represents an arbitrary list of arguments, and then on line 5 we call res
with those exact arguments. Note that we set the environment of res
to be the environment of the caller of the anonymous function. Because of the multiple levels of indirection, the caller is f
, on line 17.
On line 12, makeDynamic
returns a closure for the anonymous function. h
returns that closure when it is called, and assigns it to g
. When g
is called on line 17, the function literal function(a) x+a
is finally evaluated, and its environment is set to the environment of f
, the caller of g
.
Therefore, variable lookup searches the environment of f
, so f()
returns 2
.
Conclusion
Hopefully this blog post has shown another way of looking at scoping definitions. As discussed in the previous post, it’s very easy to get confused because different definitions are used by different people. Here, Gentleman and Ihaka very clearly state what definitions they are using.
And finally, while I am far from an expert on metaprogramming in R, I hope this post has given a taste of what is possible.
I would like to thank Jan Ječmen for coming up with and showing me the original versions of these code examples, and Artem Pelenitsyn for his feedback to improve and not discard these examples from an earlier blog draft.
References

R. Gentleman and R. Ihaka. "Lexical Scope and Statistical Computing, Journal of Computational and Graphical Statistics, vol. 9, no. 3, 2000. [DOI][Available online] ↩

A. Goel and J. Vitek. “On the Design, Implementation and Use of Laziness in R,” in Proceedings of the ACM in Programming Languages (PACMPL), vol. 3, no. OOPSLA, 2019. To appear. [Available online] ↩