In Rust Iterator pattern with iter(), into_iter() and iter_mut() methods I explained why attempting to use a variable holding a Vec
after iterating through it using the for … in
syntax leads to a compilation error.
The post explains why the following code won't compile:
fn main() {
let some_ints = vec![1,2,3,4,5];
// iterating through a vec
for i in some_ints {
dbg!(i);
}
// attempting to use the vec will
// lead to compile error after the iteration
dbg!(some_ints);
}
I then showed 3 methods that can be called before iterating using the for … in
and how 2 of these methods allow the Vec
to still be used even after iteration.
These 3 methods are into_iter()
, iter()
, and iter_mut()
. That is:
#[test]
fn into_iter_demo() {
let v1 = vec![1, 2, 3];
// the .into_iter() method creates an iterator, v1_iter
// which takes ownership of the values being iterated.
let mut v1_iter = v1.into_iter();
assert_eq!(v1_iter.next(), Some(1));
assert_eq!(v1_iter.next(), Some(2));
assert_eq!(v1_iter.next(), Some(3));
assert_eq!(v1_iter.next(), None);
// If the line below is uncommented, the code won't compile anymore
// this is because, after the iteration, v1 can no longer be used
// since the iteration moved ownership
//dbg!(v1);
}
The two other methods that allow the Vec
to still be used after iteration via for … in
are:
#[test]
fn iter_demo() {
let v1 = vec![1, 2, 3];
// the .iter() method creates an iterator,
// v1_iter which borrows value immutably
let mut v1_iter = v1.iter();
// iter() returns an iterator over references to the elements.
assert_eq!(v1_iter.next(), Some(&1));
assert_eq!(v1_iter.next(), Some(&2));
assert_eq!(v1_iter.next(), Some(&3));
assert_eq!(v1_iter.next(), None);
// because values were borrowed immutably,
// it is still possible to use
// the vec after iteration is done
dbg!(v1);
}
And
#[test]
fn iter_mut_demo() {
let mut v1 = vec![1, 2, 3];
// the .iter_mut() method creates an iterator,
// v1_iter which borrows value and can mutate it.
let mut v1_iter = v1.iter_mut();
// access the first item and multiply it by 2
let item1 = v1_iter.next().unwrap();
*item1 = *item1 * 2;
// access the second item and multiply it by 2
let item2 = v1_iter.next().unwrap();
*item2 = *item2 * 2;
// access the third item and multiply it by 2
let item3 = v1_iter.next().unwrap();
*item3 = *item3 * 2;
// end of the iteration
assert_eq!(v1_iter.next(), None);
// this will print out [2,4,6]
dbg!(v1);
}
In this post, we are going to dive a little bit deeper into understanding some of the machinery that makes the above work.
We start again by talking about the Iterator
trait.
Iterator pattern and Iterator trait.
An Iterator
represents the ability to retrieve elements from another data structure in sequence. In Rust, it is any data structure that implements the Iterator
trait.
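To make this concrete, here is a minimal sketch of a data structure that implements the Iterator trait. The Countdown type and its remaining field are hypothetical names made up for this illustration; they are not part of the standard library.

```rust
// Hypothetical type for illustration: counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // The type of value produced on each step.
    type Item = u32;

    // `next` is the only required method: it returns Some(value)
    // while there are values left, and None once the sequence ends.
    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

fn main() {
    let mut countdown = Countdown { remaining: 3 };
    assert_eq!(countdown.next(), Some(3));
    assert_eq!(countdown.next(), Some(2));
    assert_eq!(countdown.next(), Some(1));
    assert_eq!(countdown.next(), None);
}
```

Once next() is defined, the other iterator methods such as map and for_each come for free as provided methods of the trait.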
It is important to note that the Vec
data structure by itself is not an Iterator
and hence cannot be iterated.
To make this more obvious, let us forget about the for … in
syntax for a second, and try to perform an operation that should be possible on a data structure that supports being iterated.
An example of such an operation is the for_each
method.
In the code example below, we attempt to loop directly over a Vec
of numbers using for_each
and print each item. The code won't compile:
fn main() {
let some_ints = vec![1,2,3];
// calling for_each directly on a Vec won't compile
some_ints.for_each(|item| {
dbg!(item);
});
}
The compile error below gives us a clue as to why the code does not compile:
error[E0599]: `Vec<{integer}>` is not an iterator
--> src/main.rs:60:14
|
60 | some_ints.for_each(|item| {
| ^^^^^^^^ `Vec<{integer}>` is not an iterator; try calling `.into_iter()` or `.iter()`
|
= note: the following trait bounds were not satisfied:
`Vec<{integer}>: Iterator`
which is required by `&mut Vec<{integer}>: Iterator`
`[{integer}]: Iterator`
which is required by `&mut [{integer}]: Iterator`
For more information about this error, try `rustc --explain E0599`.
error: could not compile `playground` due to a previous error
The compile error contains the line:
Vec<{integer}>
is not an iterator; try calling .into_iter()
or .iter()
This proves the point that a data structure like Vec
by itself is not an iterator.
But instead of using the for_each
method, we can actually perform an iteration directly on the Vec
using the for … in
syntax.
For example, the following code compiles and runs as expected:
fn main() {
let some_ints = vec![1,2,3];
for item in some_ints {
dbg!(item);
}
}
What gives?!
Did we not just prove that a Vec
is not an Iterator by itself?
We even showed this by trying to call a method that should work on an iterator and confirming that it fails. Yet here we are, still able to iterate over something that is not an Iterator using the for … in
syntax.
How is that possible?
To understand why this works, we need to look into another trait called IntoIterator
.
What is the IntoIterator trait?
The IntoIterator
is a trait that specifies how a data structure can be converted into an Iterator. The basic structure of the trait looks like this:
pub trait IntoIterator {
type Item;
type IntoIter: Iterator<Item = Self::Item>;
fn into_iter(self) -> Self::IntoIter;
}
As can be seen, the main method specified by the trait is into_iter()
. The result of calling it is an Iterator
. That is, Self::IntoIter
, the return type, is an Iterator
, because it is an associated type that the trait body requires to be an iterator. This is what the line type IntoIter: Iterator
above means.
So any data structure that by itself is not an iterator can define how it can be transformed into an iterator by implementing the IntoIterator
trait.
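As a rough sketch of what that looks like for a custom type, the example below implements IntoIterator for a hypothetical Playlist struct. The Playlist type, its songs field and the titles_in_caps helper are invented for this illustration. The sketch assumes that delegating to the iterator of the inner Vec is acceptable, which is a common choice but not the only one.

```rust
// Hypothetical wrapper type, used only for illustration.
struct Playlist {
    songs: Vec<String>,
}

impl IntoIterator for Playlist {
    // Each step of the iteration yields an owned String.
    type Item = String;
    // Reuse the owning iterator that Vec<String> already provides.
    type IntoIter = std::vec::IntoIter<String>;

    // Consumes the Playlist and hands back an iterator over its songs.
    fn into_iter(self) -> Self::IntoIter {
        self.songs.into_iter()
    }
}

// Playlist is not an Iterator, but after into_iter() we can use map, collect, etc.
fn titles_in_caps(playlist: Playlist) -> Vec<String> {
    playlist.into_iter().map(|s| s.to_uppercase()).collect()
}

fn main() {
    let playlist = Playlist {
        songs: vec!["intro".to_string(), "outro".to_string()],
    };
    let titles = titles_in_caps(playlist);
    assert_eq!(titles, vec!["INTRO", "OUTRO"]);
}
```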
The Vec
data structure defined in the Rust standard library implements the IntoIterator
trait, which means it has the method into_iter()
which when called returns an Iterator
.
To see this in action, let's go back to the code that did not compile, where we directly called for_each
on a Vec
, but instead of calling for_each
directly, we first call into_iter()
, before calling for_each
.
fn main() {
let some_ints = vec![1,2,3];
// first calling into_iter() works
some_ints.into_iter().for_each(|item| {
dbg!(item);
});
}
This works, because the first call to into_iter()
returns an Iterator
, which allows iterating over the underlying Vec
.
So how does this help answer why it is possible to use for … in
directly on a Vec
without first turning it into an Iterator
by calling into_iter
?
Well, the answer is that when the for … in
syntax is used, the compiler automagically first calls into_iter()
, getting an Iterator
back and using that to do the iteration.
According to the documentation, the for … in
syntax actually desugars to something like this:
let values = vec![1, 2, 3, 4, 5];
{
let result = match IntoIterator::into_iter(values) {
mut iter => loop {
let next;
match iter.next() {
Some(val) => next = val,
None => break,
};
let x = next;
let () = { println!("{x}"); };
},
};
result
}
Where the into_iter
is first called on the Vec
value, and then the iteration is continuously done, calling next()
, until None
is reached, signifying the end of the iteration.
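The same two steps can be written by hand in a slightly more readable form. The sketch below is not the compiler's actual expansion; it is a simplified equivalent using while let, and the two function names are made up for this illustration.

```rust
// Hand-written version of what `for … in` does behind the scenes.
fn sum_with_manual_loop(values: Vec<i32>) -> i32 {
    // Step 1: turn the Vec into an Iterator, as `for … in` does.
    let mut iter = IntoIterator::into_iter(values);
    let mut total = 0;
    // Step 2: keep calling next() until it returns None.
    while let Some(x) = iter.next() {
        total += x;
    }
    total
}

// The usual `for … in` version, for comparison.
fn sum_with_for_loop(values: Vec<i32>) -> i32 {
    let mut total = 0;
    for x in values {
        total += x;
    }
    total
}

fn main() {
    assert_eq!(sum_with_manual_loop(vec![1, 2, 3, 4, 5]), 15);
    assert_eq!(sum_with_for_loop(vec![1, 2, 3, 4, 5]), 15);
}
```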
So the into_iter
method, which is part of the IntoIterator
trait, explains how the for … in
syntax can be used for iterating over a Vec
: the compiler calls into_iter
by default whenever the for … in
syntax is used.
But what about the other two similar methods that we saw at the beginning of this post? That is iter()
and iter_mut()
.
It is also possible to use these two methods to turn a Vec
into an iterator
.
That is:
fn main() {
let some_ints = vec![1,2,3];
// first calling iter() also works
some_ints.iter().for_each(|item| {
dbg!(item);
});
}
and
fn main() {
let mut some_ints = vec![1,2,3];
some_ints.iter_mut().for_each(|item| {
dbg!(item);
});
}
What is going on when iter_mut()
and iter()
are used? And how is this different from the into_iter()
that comes from the IntoIterator
trait?
3 different kinds of iteration
Iterators can come in different forms. Nothing is stopping a developer from implementing an Iterator that has other custom behavior that defines how it iterates.
In Rust's standard library, most collections have 3 different kinds of Iterators. We can have one which takes ownership of the value being iterated, one that borrows the value immutably, and another that borrows the value and can mutate it.
An iterator that takes ownership can be created by calling into_iter()
, one that borrows immutably can be created by calling iter()
, and one that borrows mutably can be created by calling iter_mut()
. This is the crux of the Rust Iterator pattern with iter(), into_iter() and iter_mut() methods post.
It turns out that the Rust compiler by default goes for the into_iter()
version when it de-sugars the for … in
syntax.
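One way to see the difference between the three kinds of iteration is to look at the type of the items each one yields. The three helper functions below are hypothetical names used only for this sketch.

```rust
// iter(): items are &i32, so the Vec is only borrowed immutably.
fn largest(values: &Vec<i32>) -> Option<&i32> {
    values.iter().max()
}

// iter_mut(): items are &mut i32, so each element can be changed in place.
fn double_all(values: &mut Vec<i32>) {
    for item in values.iter_mut() {
        *item *= 2;
    }
}

// into_iter(): items are i32, and the Vec is consumed by the iteration.
fn into_strings(values: Vec<i32>) -> Vec<String> {
    values.into_iter().map(|n| n.to_string()).collect()
}

fn main() {
    let mut some_ints = vec![1, 2, 3];

    assert_eq!(largest(&some_ints), Some(&3));

    double_all(&mut some_ints);
    assert_eq!(some_ints, vec![2, 4, 6]);

    // some_ints is moved here and cannot be used afterwards.
    let strings = into_strings(some_ints);
    assert_eq!(strings, vec!["2", "4", "6"]);
}
```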
One important thing to point out here is that the for … in
syntax relies on IntoIterator
rather than on Iterator
directly: without an into_iter()
method, there would be nothing for the for … in
syntax to call. It might therefore seem possible to have a custom data structure that is an iterator, i.e. has all the familiar iteration-related methods such as map
and for_each
, but which is not usable with the for … in
syntax because it implements Iterator
without implementing IntoIterator
. In practice this situation cannot arise, because the standard library implements IntoIterator
for every Iterator
, as we will see shortly.
Another interesting point is what happens if we manually call any of into_iter()
, iter()
, or iter_mut()
ourselves as part of a for … in
loop. This is what was shown in the Rust Iterator pattern with iter(), into_iter() and iter_mut() methods post.
How come this works:
fn main() {
let mut some_ints = vec![1,2,3];
// manually calling iter in a for … in
for i in some_ints.iter() {
dbg!(i);
}
// manually calling iter_mut in a for … in
for i in some_ints.iter_mut() {
dbg!(i);
}
// manually calling into_iter in a for … in
for i in some_ints.into_iter() {
dbg!(i);
}
}
We are manually converting the Vec
ourselves to an iterator by calling iter()
, iter_mut()
, and into_iter()
and yet it works.
Why does this work?
Was it not already stated that the for … in
syntax works with anything that implements IntoIterator
, which allows it to call into_iter()
? And here we are, manually converting the Vec
into an iterator
ourselves, and yet the for … in
syntax works. How come?
The answer lies in a little trick in the standard library, which contains this blanket implementation of IntoIterator
:
impl<I: Iterator> IntoIterator for I
This basically means any Iterator
implements IntoIterator
and the implementation is such that the Iterator
returns itself when into_iter()
is called. Which makes sense if you think about it. If something is already an Iterator, what else can be done when you attempt to turn it again into an Iterator
other than it returning itself?
And this is what happens. Even though the iter()
, into_iter()
and iter_mut()
methods are called directly, for … in
still works, because the iterator
created by calling any of these methods automatically has an implementation of IntoIterator
that returns itself, which is what the for … in
syntax needs.
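The sketch below illustrates this with a custom iterator. The UpTo type, its fields and the collect_with_for helper are hypothetical names invented for the example. Note that the code only implements Iterator and never writes an IntoIterator implementation by hand.

```rust
// Hypothetical iterator type: yields current, current + 1, ... up to limit.
struct UpTo {
    current: u32,
    limit: u32,
}

// Only Iterator is implemented; there is no hand-written IntoIterator.
impl Iterator for UpTo {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.current > self.limit {
            return None;
        }
        let value = self.current;
        self.current += 1;
        Some(value)
    }
}

fn collect_with_for(numbers: UpTo) -> Vec<u32> {
    let mut seen = Vec::new();
    // This compiles because the standard library's implementation of
    // IntoIterator for every Iterator also covers UpTo.
    for n in numbers {
        seen.push(n);
    }
    seen
}

fn main() {
    assert_eq!(collect_with_for(UpTo { current: 1, limit: 3 }), vec![1, 2, 3]);

    // Calling into_iter() on something that is already an iterator
    // hands back the same iterator, with the same type and state.
    let same: UpTo = UpTo { current: 5, limit: 9 }.into_iter();
    assert_eq!(same.current, 5);
    assert_eq!(same.limit, 9);
}
```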
The above shows how implementing the IntoIterator
trait can be done in such a way as to provide interesting functionalities.
Another interesting utility that is achieved via providing different implementations for IntoIterator
is how it is possible to use for … in
with a collection and yet be able to still use the collection after iteration, without having to call iter()
or iter_mut()
.
This is a more succinct syntax than the solution provided in the Rust Iterator pattern with iter(), into_iter() and iter_mut() methods post.
We look at this in the next section.
The 3 Implementations of IntoIterator for Vec
There are three different implementations of IntoIterator
for the Vec
type. These 3 implementations cover 3 different types, depending on how the Vec
is accessed: by value, by shared reference, or by mutable reference.
There are implementations of IntoIterator
for the bare Vec<T>
type, the immutable reference &Vec<T>
type and the mutable reference &'a mut Vec<T>
type.
The implementation of IntoIterator
for bare Vec<T>
returns an Iterator
that takes ownership of the values as they are iterated. The implementation of IntoIterator
for &Vec<T>
borrows the values being iterated immutably, while the implementation of IntoIterator
for &'a mut Vec<T>
makes it possible to mutate the value as part of the iteration.
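To see which implementation is picked in each case, the sketch below calls IntoIterator::into_iter explicitly and annotates the iterator type that comes back. The three function names are hypothetical. The iterator types std::slice::Iter, std::slice::IterMut and std::vec::IntoIter are the ones the standard library uses for these three implementations.

```rust
fn sum_by_ref(values: &Vec<i32>) -> i32 {
    // Selects the implementation for &Vec<T>; items are &i32.
    let iter: std::slice::Iter<'_, i32> = IntoIterator::into_iter(values);
    iter.sum()
}

fn double_by_mut(values: &mut Vec<i32>) {
    // Selects the implementation for &mut Vec<T>; items are &mut i32.
    let iter: std::slice::IterMut<'_, i32> = IntoIterator::into_iter(values);
    for item in iter {
        *item *= 2;
    }
}

fn reversed(values: Vec<i32>) -> Vec<i32> {
    // Selects the implementation for Vec<T>; items are i32 and the Vec is consumed.
    let iter: std::vec::IntoIter<i32> = IntoIterator::into_iter(values);
    iter.rev().collect()
}

fn main() {
    let mut some_ints = vec![1, 2, 3];

    assert_eq!(sum_by_ref(&some_ints), 6);

    double_by_mut(&mut some_ints);
    assert_eq!(some_ints, vec![2, 4, 6]);

    assert_eq!(reversed(some_ints), vec![6, 4, 2]);
}
```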
This means one can iterate over a Vec
type and still be able to use it afterward if the iteration is done over &Vec<T>
or &'a mut Vec<T>
. For example:
fn main() {
let some_ints = vec![1,2,3];
for i in &some_ints { // same as calling some_ints.iter()
dbg!(i);
}
}
and
fn main() {
let mut some_ints = vec![1,2,3];
for i in &mut some_ints { // same as calling some_ints.iter_mut()
*i = *i * 2;
}
dbg!(some_ints);
}
The above syntax can be used as a more succinct way to iterate over a data structure like Vec
using the for … in
syntax without taking ownership of the Vec
.
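The same approach can be applied to our own types. As a sketch, assuming a hypothetical Playlist struct that wraps a Vec<String>, implementing IntoIterator for &Playlist is enough to make the for … in syntax work on a reference without giving up ownership. The type, its field and the helper function are illustrative names only.

```rust
// Hypothetical collection type, used only for illustration.
struct Playlist {
    songs: Vec<String>,
}

// Implemented for a reference to Playlist, not for Playlist itself.
impl<'a> IntoIterator for &'a Playlist {
    type Item = &'a String;
    type IntoIter = std::slice::Iter<'a, String>;

    // Borrows the inner Vec instead of consuming it.
    fn into_iter(self) -> Self::IntoIter {
        self.songs.iter()
    }
}

fn total_title_length(playlist: &Playlist) -> usize {
    let mut total = 0;
    // `playlist` is a &Playlist, so the implementation above is used.
    for song in playlist {
        total += song.len();
    }
    total
}

fn main() {
    let playlist = Playlist {
        songs: vec!["intro".to_string(), "outro".to_string()],
    };
    assert_eq!(total_title_length(&playlist), 10);

    // The playlist was only borrowed, so it is still usable here.
    assert_eq!(playlist.songs.len(), 2);
}
```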
Summary
- IntoIterator is a trait that defines how an Iterator can be created for a data structure. It defines an into_iter() method that, when called, should return an Iterator.
- The for … in syntax requires an implementation of IntoIterator, because the compiler automagically first calls into_iter() to retrieve the Iterator it uses for its iteration.
- The Rust standard library also contains an implementation of IntoIterator for every Iterator. This implementation just returns the Iterator itself. This makes sense, because if something is already an iterator, then returning it satisfies the contract defined by IntoIterator.
- The fact that there is an IntoIterator for Iterator that returns that iterator means that methods like iter() or iter_mut() can still be used within the for … in syntax. The for … in syntax calls into_iter(), gets the Iterator itself back, and uses that for its iteration.
- The standard library also contains 3 different implementations of IntoIterator, for Vec<T>, &Vec<T>, and &'a mut Vec<T>. The implementation for Vec<T> takes ownership of the values being iterated, the implementation for &Vec<T> borrows the values being iterated immutably, and the implementation for &'a mut Vec<T> borrows the values mutably.
- Given a variable some_var holding a Vec, one can iterate over it using for … in and still be able to use it after the iteration if the iteration is done over &some_var or &mut some_var.