Improving our I/O Project
We can improve our implementation of the I/O project in Chapter 12 by using
iterators to make places in the code clearer and more concise. Let's take a
look at how iterators can improve our implementation of both the Config::new
function and the search function.
Removing a clone Using an Iterator
In Listing 12-13, we had code that took a slice of String values and created
an instance of the Config struct by checking for the right number of
arguments, indexing into the slice, and cloning the values so that the Config
struct could own those values. We've reproduced the code here in Listing 13-23:
Filename: src/main.rs
impl Config {
    fn new(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }
        let query = args[1].clone();
        let filename = args[2].clone();
        Ok(Config {
            query, filename
        })
    }
}
Listing 13-23: Reproduction of the Config::new function
from Listing 12-13
At the time, we said not to worry about the inefficient clone calls here
because we would remove them in the future. Well, that time is now!
The reason we needed clone here in the first place is that we have a slice
with String elements in the parameter args, but the new function does not
own args. In order to be able to return ownership of a Config instance, we
need to clone the values that we put in the query and filename fields of
Config, so that the Config instance can own its values.
With our new knowledge about iterators, we can change the new function to
take ownership of an iterator as its argument instead of borrowing a slice.
We'll use the iterator functionality instead of the code we had that checks the
length of the slice and indexes into specific locations. This will clear up
what the Config::new function is doing since the iterator will take care of
accessing the values.
Once Config::new takes ownership of the iterator and no longer uses indexing
operations that borrow, we can move the String values from the iterator into
Config rather than calling clone and making a new allocation.
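To make the difference between cloning and moving concrete, here is a minimal sketch. The helper functions first_cloned and first_moved are hypothetical names used only for illustration; they are not part of the I/O project. The sketch compares the address of each string's heap buffer to show that a clone allocates a new buffer, while a value moved out of an owned iterator keeps the original one.

```rust
// Hypothetical helpers for illustration only; not part of the I/O project.

// Borrows a slice, so the only way to hand back an owned String is to clone it.
fn first_cloned(args: &[String]) -> Option<String> {
    args.first().cloned()
}

// Owns the iterator, so the String can be moved out without a new allocation.
fn first_moved(mut args: impl Iterator<Item = String>) -> Option<String> {
    args.next()
}

fn main() {
    let args = vec![String::from("needle"), String::from("haystack.txt")];
    // Remember where the first string's heap buffer lives.
    let original_ptr = args[0].as_ptr();

    // Cloning copies the bytes into a freshly allocated buffer.
    let cloned = first_cloned(&args).unwrap();
    assert_ne!(cloned.as_ptr(), original_ptr);

    // Moving out of the owned iterator reuses the original buffer.
    let moved = first_moved(args.into_iter()).unwrap();
    assert_eq!(moved.as_ptr(), original_ptr);

    // Both approaches produce the same text.
    assert_eq!(cloned, moved);
}
```

The contents are identical either way; the only difference is whether a second heap allocation was needed to produce them.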
Using the Iterator Returned by env::args Directly
In your I/O project's src/main.rs, let's change the start of the main
function from this code that we had in Listing 12-23:
fn main() {
    let args: Vec<String> = env::args().collect();
    let mut stderr = std::io::stderr();
    let config = Config::new(&args).unwrap_or_else(|err| {
        writeln!(
            &mut stderr,
            "Problem parsing arguments: {}",
            err
        ).expect("Could not write to stderr");
        process::exit(1);
    });
    // ...snip...
}
To the code in Listing 13-24:
Filename: src/main.rs
fn main() {
    let mut stderr = std::io::stderr();
    let config = Config::new(env::args()).unwrap_or_else(|err| {
        writeln!(
            &mut stderr,
            "Problem parsing arguments: {}",
            err
        ).expect("Could not write to stderr");
        process::exit(1);
    });
    // ...snip...
}
Listing 13-24: Passing the return value of env::args to
Config::new
The env::args function returns an iterator! Rather than collecting the
iterator values into a vector and then passing a slice to Config::new, now
we're passing ownership of the iterator returned from env::args to
Config::new directly.
Next, we need to update the definition of Config::new. In your I/O project's
src/lib.rs, let's change the signature of Config::new to look like Listing
13-25:
Filename: src/lib.rs
impl Config {
    fn new(args: std::env::Args) -> Result<Config, &'static str> {
        // ...snip...
Listing 13-25: Updating the signature of Config::new to
expect an iterator
The standard library documentation for the env::args function shows that the
type of the iterator it returns is std::env::Args. We've updated the
signature of the Config::new function so that the parameter args has the
type std::env::Args instead of &[String].
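As a quick check of what the documentation says, the following sketch binds the return value of env::args to an explicit std::env::Args annotation and then passes it to a function that accepts any iterator of String values. The function count_args is a hypothetical helper introduced only for this illustration. If the types did not line up, the program would fail to compile.

```rust
use std::env;

// Hypothetical helper: accepts any iterator that yields String values
// and reports how many items it produced.
fn count_args(args: impl Iterator<Item = String>) -> usize {
    args.count()
}

fn main() {
    // The explicit annotation compiles only because env::args
    // really does return std::env::Args.
    let args: env::Args = env::args();

    // Passing it here compiles only because Args implements
    // Iterator with Item = String.
    let n = count_args(args);
    println!("received {} argument(s), including the program name", n);
}
```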
Using Iterator Trait Methods Instead of Indexing
Next, we'll fix the body of Config::new. The standard library documentation
also mentions that std::env::Args implements the Iterator trait, so we know
we can call the next method on it! Listing 13-26 has the new code:
Filename: src/lib.rs
impl Config {
    fn new(mut args: std::env::Args) -> Result<Config, &'static str> {
        args.next();
        let query = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a query string"),
        };
        let filename = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a file name"),
        };
        Ok(Config {
            query, filename
        })
    }
}
Listing 13-26: Changing the body of Config::new to use
iterator methods
Remember that the first value the env::args iterator produces is the name of
the program. We want to ignore that and get to the next value, so first we call
next and do nothing with the return value. Second, we call next to get the
value we want to put in the query field of Config. If next returns a
Some, we use a match to extract the value. If it returns None, it means
not enough arguments were given and we return early with an Err value. We do
the same thing for the filename value.
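The body described above is hard to exercise without real command line arguments, so here is a sketch of one possible variation. It assumes we loosen the parameter type from std::env::Args to any iterator of String values, which is our own change for illustration and not the signature used in Listing 13-25. The program name "minigrep" and the derive attribute are likewise placeholders added so the results can be compared.

```rust
#[derive(Debug, PartialEq)]
struct Config {
    query: String,
    filename: String,
}

impl Config {
    // Assumption: the parameter is generic over any iterator of String,
    // not std::env::Args as in the listing. env::args() still satisfies it.
    fn new(mut args: impl Iterator<Item = String>) -> Result<Config, &'static str> {
        // The first value is the program name; skip it.
        args.next();

        // Each call to next moves the next String out of the iterator.
        let query = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a query string"),
        };
        let filename = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a file name"),
        };

        Ok(Config { query, filename })
    }
}

fn main() {
    // "minigrep" stands in for the program name; it is a placeholder.
    let full = vec!["minigrep", "needle", "haystack.txt"]
        .into_iter()
        .map(String::from);
    let config = Config::new(full).unwrap();
    assert_eq!(config.query, "needle");
    assert_eq!(config.filename, "haystack.txt");

    // With the file name missing, the second match returns early.
    let short = vec!["minigrep", "needle"].into_iter().map(String::from);
    assert_eq!(Config::new(short), Err("Didn't get a file name"));
}
```

Because each missing argument has its own match arm, the error message can say which argument was absent, something the single length check in Listing 13-23 could not do.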
Making Code Clearer with Iterator Adaptors
The other place in our I/O project we could take advantage of iterators is in
the search function, as implemented in Listing 12-19 and reproduced here in
Listing 13-27:
Filename: src/lib.rs
fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    let mut results = Vec::new();
    for line in contents.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }
    results
}
Listing 13-27: The implementation of the search
function from Listing 12-19
We can write this code in a much shorter way by using iterator adaptor methods
instead. This also lets us avoid having a mutable intermediate results
vector. The functional programming style prefers to minimize the amount of
mutable state to make code clearer. Removing the mutable state might also make
it easier to add parallel searching in the future, since we wouldn't have to
manage concurrent access to the results vector. Listing 13-28 shows this
change:
Filename: src/lib.rs
fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    contents.lines()
        .filter(|line| line.contains(query))
        .collect()
}
Listing 13-28: Using iterator adaptor methods in the
implementation of the search function
Recall that the purpose of the search function is to return all lines in
contents that contain the query. Similarly to the filter example in
Listing 13-18, we can use the filter adaptor to keep only the lines for which
line.contains(query) returns true. We then collect the matching lines into a
new vector with collect. Much simpler!
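To see that the two versions return the same results, the sketch below places them side by side and runs both on the same input. The names search_loop and search_iter and the sample text are our own, chosen only so that both functions can live in one file.

```rust
// The loop-based version, renamed for this comparison.
fn search_loop<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    let mut results = Vec::new();
    for line in contents.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }
    results
}

// The iterator adaptor version, renamed for this comparison.
fn search_iter<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    contents
        .lines()
        .filter(|line| line.contains(query))
        .collect()
}

fn main() {
    // Sample text made up for this example.
    let contents = "Rust:\nsafe, fast, productive.\nPick three.";

    // Only the second line contains "duct".
    assert_eq!(search_iter("duct", contents), vec!["safe, fast, productive."]);

    // Both styles agree, whether or not anything matches.
    assert_eq!(search_loop("duct", contents), search_iter("duct", contents));
    assert_eq!(search_loop("missing", contents), search_iter("missing", contents));
    assert!(search_iter("missing", contents).is_empty());
}
```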
The next logical question is which style you should choose in your own code: the original implementation in Listing 13-27, or the version using iterators in Listing 13-28. Most Rust programmers prefer to use the iterator style. It's a bit tougher to get the hang of at first, but once you get a feel for the various iterator adaptors and what they do, iterators can be easier to understand. Instead of fiddling with the various bits of looping and building new vectors, the code focuses on the high-level objective of the loop. This abstracts away some of the commonplace code so that it's easier to see the concepts that are unique to this code, like the filtering condition each element in the iterator must pass.
But are the two implementations truly equivalent? The intuitive assumption might be that the more low-level loop will be faster. Let's talk about performance.