Ownership Explained with Python

16 Sep 2018
Paul Kernfeld dot com

What does this Python program print out?

squares = (x * x for x in range(10))  # 1
print(min(squares))                   # 2
print(max(squares))                   # 3

I would like it to print:

0
81

It actually prints:

0
Traceback (most recent call last):
  File "ownership-problem.py", line 3, in <module>
    print(max(squares))
ValueError: max() arg is an empty sequence

Argh! Line 2 worked correctly, but line 3 failed. Also, this error message is kind of confusing, because (x * x for x in range(10)) is clearly not an empty sequence. What happened is that, since squares is an iterator, calling min exhausted the iterator, leaving no elements in it. When we tried to call max, no elements were left.

It’s not immediately obvious that calling min(squares) modifies squares. If squares were a list or even a range, we would be able to call min and max on it with no problem. It would be nice if the language prevented us from trying to use something twice that can only be used once. Almost all modern languages, both statically and dynamically typed, will fail at runtime in these situations.

Once more, with ownership

Here is an equivalent program written in Rust:

fn main() {                              // 1
   let squares = (0..10).map(|x| x * x); // 2
   println!("{:?}", squares.min());      // 3
   println!("{:?}", squares.max());      // 4
}                                        // 5

When we try to compile this program, we get the following error:

error[E0382]: use of moved value: `squares`
 --> ownership.rs:4:21
  |
3 |    println!("{:?}", squares.min());
  |                     ------- value moved here
4 |    println!("{:?}", squares.max());
  |                     ^^^^^^^ value used here after move
  |
  = note: move occurs because `squares` has type `std::iter::Map<std::ops::Range<i32>, [closure@ownership.rs:2:31: 2:40]>`, which does not implement the `Copy` trait

First, an important distinction that I did not know before writing this blog post: a value is a chunk of data in the computer’s memory, whereas a variable, like squares, is a way to refer to that value from the code. In most modern programming languages, many variables can refer to a single value. In Rust, though, each value is owned by exactly one variable, meaning that no other variable can access that value. This is the core of ownership.

Rust functions can take ownership of values that are passed into them; this is called moving. Any values that are moved into a function can no longer be used in the outer function.

In this error, the compiler is saying that the value for squares was moved on line 3 but we tried to use it on line 4. This is because the min function moves the iterator. After calling min, the iterator can no longer be used in the outer function. This is a nice way to prevent us from accidentally trying to use an iterator that may have been exhausted.

Python solutions

How would we fix this program? Well, the original way of writing the program is just invalid, because we’re trying to make two passes over an iterator. The simplest solution would be to use a list rather than an iterator, i.e. to store the whole data set in memory. If you’re interested in processing large amounts of data (I am!), storing the whole data set in memory is something to avoid whenever you can.

I can think of two solutions that fix the original program without storing the whole data set in memory. First, we could construct a new iterator each time:

def squares():
    return (x * x for x in range(10))

print(min(squares()))
print(max(squares()))

Or, we could compute both the min and the max in a single pass:

def min_and_max(numbers):
    current_min = current_max = next(numbers)
    for number in numbers:
        current_min = min(current_min, number)
        current_max = max(current_max, number)
    return (current_min, current_max)

squares = (x * x for x in range(10))
minimum, maximum = min_and_max(squares)
print(minimum)
print(maximum)

This program is more efficient, but the code is longer and less elegant. Streaming is hard!

Zooming out a bit

Because an iterator contains mutable state, you should only access it from one place. With ownership, this isn’t just a good idea; it’s the law.

Can ownership be useful outside of iterators? Yes! In Rust, ownership strongly affects all mutable data. I really like this, because it’s easier to reason about the state of a mutable value without worrying that the value is being sneakily mutated somewhere else in the code.

Hopefully this post convinces you that ownership is a tool that can help programmers to avoid mistakes with mutable data. If you want to fill in some of the gaps that I blithely papered over, the Rust Book has a good explanation of ownership.

Thanks to the following Sams for comments: Helman, Zimmerman

Update 2018-09-19

Thank you to Reddit user codec-abc, who pointed out that you can implement this incorrect program in Rust! Here is, in my opinion, the most elegant way to do it:

fn main() {
   let mut squares = (0..10).map(|x| x * x);
   println!("{:?}", squares.by_ref().min());
   println!("{:?}", squares.max());
}

When run, this program prints out:

Some(0)
None

In this case, there are a few things that help to clarify how this program works:

  1. The programmer must use let mut squares instead of let squares. This indicates that squares is being changed somewhere.
  2. The programmer needs to call by_ref or use some other technique to turn a mut Iterator in a &mut Iterator. This indicates that something is going on. Explaining mutable references is beyond the scope of this post, but the by_ref documentation explains that it is intended exactly for cases where you want to consume part of an iterator without taking ownership of it.
  3. Since the min and max methods return Option, the programmer is forced to think about what should happen if the iterator is empty.