Variable scope

One of the important things to know when learning a programming language is its variable scope: in other words, which variables can be seen where. This can vary greatly from language to language.

Even though we always talk of variables, Python does not really have variables in the traditional sense (see Variables and integers). Rather, it has names in namespaces. In the current namespace a certain number of names are available. You can use dir() to see what is in the current namespace.

>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__',
 '__package__', '__spec__']

>>> number1 = 42
>>> number2 = 43

>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__',
 '__package__', '__spec__', 'number1', 'number2']

>>> vars()
{'__doc__': None, '__package__': None, '__loader__':
<class '_frozen_importlib.BuiltinImporter'>, 'number1':
42, 'number2': 43, '__spec__': None, '__builtins__':
<module 'builtins' (built-in)>, '__name__': '__main__'}
>>>

After we initialize the two global variables number1 and number2, we find them in the global namespace. Note also that dir(__builtins__) shows all the names which are available wherever you are in the code.
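As a small illustration of that lookup, the sketch below uses the standard builtins module (the module that __builtins__ refers to in an interactive session) to show that a built-in name such as len is visible everywhere unless a closer namespace defines the same name. The function shadow_demo is a made-up name for this example.

```python
import builtins

# Built-in names live in their own namespace, which is searched last.
builtin_has_len = 'len' in dir(builtins)

def shadow_demo():
    # Hypothetical example: a local name hides the built-in of the same
    # name, but only inside this function.
    len = lambda sequence: -1
    return len([1, 2, 3])

shadowed_result = shadow_demo()    # uses the local 'len'
builtin_result = len([1, 2, 3])    # the built-in 'len' is untouched
```

Outside shadow_demo(), len still resolves to the built-in, because the shadowing name only exists in that function's local namespace.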

A namespace can be modified in several ways:

  • variable assignment / initialization
  • new function
  • new class
  • import
  • del
  • globals()[]
  • locals()[]
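The items in this list can be tried out in isolation. The following is a minimal sketch, assuming an ordinary dict (here called namespace) stands in for a module's global namespace and that exec() is used to run each statement against it; the names helper, Widget and user_names are invented for the example.

```python
# An ordinary dict stands in for a module's global namespace
# (an assumption made for this sketch, to keep the experiment isolated).
namespace = {}

def user_names(ns):
    """Return the non-dunder names currently bound in ns, sorted."""
    return sorted(name for name in ns if not name.startswith('__'))

exec("number1 = 42", namespace)          # variable assignment
exec("def helper(): pass", namespace)    # new function
exec("class Widget: pass", namespace)    # new class
exec("import string", namespace)         # import
namespace['number2'] = 43                # direct write, like globals()[...]
after_adding = user_names(namespace)

exec("del number1", namespace)           # del removes the name again
after_del = user_names(namespace)
```

Here after_adding contains number1, number2, helper, Widget and string, and after_del is the same list without number1.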

Note that the instruction del does NOT delete an object. Rather, it removes its reference from the current namespace. It’s up to the Python runtime to delete the object (immediately or later on) if need be. A future post will look at the CPython garbage collector in more detail.
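The difference between removing a name and deleting an object can be seen with two names bound to the same list. This is only a sketch; data and alias are arbitrary names chosen for the example.

```python
data = [1, 2, 3]
alias = data      # a second name bound to the same list object
del data          # removes the name 'data', not the list itself

try:
    data
    name_still_bound = True
except NameError:
    # The name is gone from the namespace...
    name_still_bound = False

# ...but the object is still alive and reachable through 'alias'.
surviving_value = alias
```

The list can only be reclaimed by the runtime once alias is removed or rebound as well.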

Global and local namespaces

Just like there is a global namespace, there is also a local namespace (accessible through locals()). Consider the following code:

>>> number = 42
>>> def func1():
...     return number + 1
...
>>> def func2():
...     number += 1
...     return number
...
>>> func1()
43
>>> func2()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in func2
UnboundLocalError: local variable 'number' referenced before assignment

How come func1() can access the global variable number but func2() cannot? Disassembling the code provides some answers:

>>> import dis
>>> dis.dis(func1)
  2           0 LOAD_GLOBAL              0 (number)
              3 LOAD_CONST               1 (1)
              6 BINARY_ADD
              7 RETURN_VALUE

>>> dis.dis(func2)
  2           0 LOAD_FAST                0 (number)
              3 LOAD_CONST               1 (1)
              6 INPLACE_ADD
              7 STORE_FAST               0 (number)

  3          10 LOAD_FAST                0 (number)
             13 RETURN_VALUE

func1 treats “number” as a global variable (it uses LOAD_GLOBAL) whereas func2 treats it as local (it uses LOAD_FAST), and thus tries to increment a local variable which hasn’t been initialized yet. This is because func1() only reads the variable, whereas func2() assigns to it. The decision is made when the function is compiled, not when it runs, and it prevents functions from unknowingly updating global variables.
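The same classification can be checked without reading bytecode, by looking at the code object's co_varnames (local names) and co_names (names looked up globally or as attributes). The sketch below assumes two stand-in functions, reads_only and assigns, that mirror func1 and func2.

```python
def reads_only():
    # 'number' is only read, so it is compiled as a global lookup.
    return number + 1

def assigns():
    # 'number' is assigned, so it is compiled as a local variable.
    number += 1
    return number

# Inspecting the code objects does not call the functions, so 'number'
# does not even need to exist for these checks.
read_locals = reads_only.__code__.co_varnames    # no local variables
read_globals = reads_only.__code__.co_names      # contains 'number'
assign_locals = assigns.__code__.co_varnames     # contains 'number'

try:
    assigns()
    error_name = None
except UnboundLocalError as exc:
    # Raised because the local 'number' has no value yet.
    error_name = type(exc).__name__
```

The result is the same whether or not a global number exists, since the choice between local and global is fixed at compile time.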

As a general rule, any variable assigned inside the function is handled as a local variable (the “global” statement can override that), and all the other variables are handled as global variables (or, inside nested functions, as variables of the enclosing function; see Closures below). If we add “global number” then the function works without error. The global statement applies to the whole function body, but it must appear before the first use of the name: since Python 3.6, using or assigning the name before its global statement is a SyntaxError (earlier versions only issued a SyntaxWarning).

>>> def func3():
...     global number
...     number += 1
...     return number
...
>>> func3()
43
>>> func3()
44

This can however have some unintended consequences when trying to modify the local namespace through locals()[]:

>>> globals()['var1'] = 42
>>> var1
42
>>> def func():
...     locals()['var2'] = 42
...     print(dir())
...     print(var2)
...
>>> func()
['var2']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in func
NameError: global name 'var2' is not defined

A call to dir() inside func() shows that var2 has been added to the local namespace. However, trying to print that variable fails. This is because the compiler doesn’t know that var2 has been initialized (how would it know? It was done dynamically), so it treats var2 as a global variable.
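The sketch below shows how the compiler classified var2, and one possible alternative for dynamically named values: an explicit dictionary. The function names write_through_locals and use_explicit_dict are invented for this example. What dir() or locals() report after such a write may differ between Python versions, so the sketch only relies on the name lookup itself.

```python
def write_through_locals():
    locals()['var2'] = 42    # dynamic write that the compiler cannot see
    try:
        return var2          # compiled as a global lookup, so it fails
    except NameError:
        return 'NameError'

def use_explicit_dict():
    # Keeping dynamic names in an ordinary dict avoids the problem.
    dynamic = {}
    dynamic['var2'] = 42
    return dynamic['var2']

code = write_through_locals.__code__
compiled_as_local = 'var2' in code.co_varnames    # not a local variable
compiled_as_global = 'var2' in code.co_names      # looked up as a global
outcome = write_through_locals()
explicit_value = use_explicit_dict()
```

With the explicit dictionary, the value is read back through the same object it was written to, so nothing depends on how the compiler classified a name.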

How imports modify the namespace

The import statement gives access to a library by adding some names to the namespace:

>>> import string
>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__',
'number1', 'number2', 'string']
>>> dir(string)
['ChainMap', 'Formatter', 'Template', '_TemplateMetaclass', '__builtins__', '__cached__',
 '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_re',
 '_string', 'ascii_letters', 'ascii_lowercase', 'ascii_uppercase', 'capwords', 'digits',
 'hexdigits', 'octdigits', 'printable', 'punctuation', 'whitespace']

The official Style Guide for Python, known as PEP 8, discourages wildcard imports. Not only can they overcrowd the namespace, but they can also lead to name clashes:

>>> whitespace = ' '
>>> whitespace
' '
>>> from string import *
>>> whitespace
' \t\n\r\x0b\x0c'
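A minimal sketch of two ways to avoid that clash: keep the module name as a qualifier, or import the name under an alias. The alias string_whitespace is an arbitrary name chosen for this example.

```python
import string
from string import whitespace as string_whitespace

whitespace = ' '    # our own name, left untouched by the imports above

own_value = whitespace              # still the single space
qualified_value = string.whitespace # the library's value, via the module
aliased_value = string_whitespace   # the same value under an alias
```

Both forms add exactly one known name to the namespace, so it stays clear which whitespace is meant.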

Closures

A closure is a variable environment specific to a function that contains non-local variables. Closures are supported by a lot of languages these days, including Python. Let’s see an example:

>>> def incrementer(number1):
...     return lambda number2: number1 + number2
...
>>> inc4 = incrementer(4)
>>> inc4(8)
12
>>> inc6 = incrementer(6)
>>> inc6(8)
14

The incrementer() function returns a function which has its own variable context. In the code above, inc4 sees number1 as set to 4 whereas inc6 sees it as set to 6, even when the functions are called later. We can use the Python bytecode disassembler to verify a closure is being defined:

>>> import dis
>>> dis.dis(incrementer)
  2           0 LOAD_CLOSURE             0 (number1)
              3 BUILD_TUPLE              1
              6 LOAD_CONST               1 (<code object <lambda> at 0x0000000002A0B390, file "<stdin>", line 2>)
              9 LOAD_CONST               2 ('incrementer.<locals>.<lambda>')
             12 MAKE_CLOSURE             0
             15 RETURN_VALUE

You can also check the closure at runtime. The inc4 variable has a __closure__ attribute:

>>> dir(inc4)
['__annotations__', '__call__', '__class__',
 '__closure__',
 '__code__', '__defaults__',
 ...
 '__str__', '__subclasshook__']

>>> inc4.__closure__
(<cell at 0x00000000021C7708: int object at 0x000000001E39C770>,)

>>> inc4.__closure__[0].cell_contents
4

There is however an interesting twist with closures. Consider the following code:

>>> def incrementer(number1):
...     def inc(number2):
...             return number1 + number2
...     number1 += 1
...     return inc
...
>>> inc4 = incrementer(4)
>>> inc4(8)
13

The value kept for number1 is not the value at the time inc() was defined (even though that is when the closure was created) but the value of number1 at the end of the incrementer() function. The closure references the variable number1 rather than copying its value, so number1 stays live after inc() is defined and can still change until incrementer() completes. (Code inside inc() can also rebind it, but only if it declares the name with the “nonlocal” statement.)
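Two consequences of closures holding a reference rather than a copy are sketched below. All function names (make_adders_late, make_adders_bound, make_counter) are invented for the example. The first pair shows functions created in a loop all seeing the final value of the loop variable, with a default argument as one common way to capture the value at definition time. The second shows “nonlocal” letting an inner function rebind the shared variable.

```python
def make_adders_late():
    adders = []
    for amount in range(3):
        # Each lambda references the variable 'amount', not its value.
        adders.append(lambda value: value + amount)
    return adders

def make_adders_bound():
    adders = []
    for amount in range(3):
        # The default argument is evaluated now, freezing the value.
        adders.append(lambda value, amount=amount: value + amount)
    return adders

late_results = [add(10) for add in make_adders_late()]      # [12, 12, 12]
bound_results = [add(10) for add in make_adders_bound()]    # [10, 11, 12]

def make_counter(start):
    count = start
    def increment():
        nonlocal count    # rebind the enclosing function's variable
        count += 1
        return count
    def current():
        return count      # reads the same shared variable
    return increment, current

increment, current = make_counter(4)
increment()
increment()
counter_value = current()    # 6
```

increment() and current() see the same count because both closures refer to the same variable, not to separate copies of it.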

Addendum: there is no block scope

When you assign a local variable for the first time, it is implicitly declared for the whole function. This is different from languages such as C++ where it is possible to declare a variable only inside a “{ }” block:

int number = 42;
if (true)
{
    int number = 43;
}
cout << number << endl;  // 42

When “number” gets declared again on line 4, the declaration (and its associated value) is only valid inside the surrounding “{ }” block (lines 3 to 5). This is why “number” on line 6 is equal to 42 and not 43. The notion of block scope exists in languages such as C++, in JavaScript since ES2015 (using the “let” keyword), and in Java or C# (although in the latter two cases, the compiler refuses to redeclare a local variable that is already in scope in the same function).

In Python however, all declarations are implicit, and there is no concept of block scope.
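For comparison, here is a rough Python counterpart of the C++ snippet, as a sketch with an invented function name (block_scope_demo). The assignment inside the if block rebinds the function-level name, and names first bound inside a for loop remain available after the loop ends.

```python
def block_scope_demo():
    number = 42
    if True:
        number = 43          # rebinds the same function-level name
    for index in range(3):
        last_seen = index    # first bound inside the loop body
    # 'index' and 'last_seen' are still visible after the loop.
    return number, index, last_seen

result = block_scope_demo()  # (43, 2, 2)
```

Unlike the C++ version, number ends up as 43, because the if block does not introduce a new scope.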
