Rust Optimization: `Vec::into_iter().collect()`

arendjr@programming.dev · 4 months ago

Rust Optimization: `Vec::into_iter().collect()`

Schmeckinger@feddit.de · 4 months ago

I don’t know, but I don’t think so. Either way, what you are doing is better done with retain.

arendjr@programming.dev · 4 months ago

I mean, the actual operation is just an example, of course. Feel free to make it a .map() operation instead. The strings couldn’t be reused then, but the vector’s allocation still could… in theory.

Ich, einfach anders@lemmings.world · edit-2 4 months ago

map() can still be used with Vec::iter_mut(), filter_map() can be replaced with Vec::retain_mut().
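A minimal sketch of both in-place variants, assuming the elements are Strings. The function names and the concrete filter/map operations are made up for illustration:

```rust
/// In-place "map": uppercases every string through iter_mut().
/// The buffer that holds the elements is never reallocated.
fn uppercase_in_place(items: &mut [String]) {
    for s in items.iter_mut() {
        s.make_ascii_uppercase();
    }
}

/// In-place "filter + map": drops empty strings and appends "!" to the rest,
/// all in a single pass over the existing buffer.
fn keep_non_empty_and_mark(items: &mut Vec<String>) {
    items.retain_mut(|s| {
        if s.is_empty() {
            false // element is removed and dropped
        } else {
            s.push('!'); // mutate the kept element through the &mut reference
            true
        }
    });
}
```

Both work because the update can be expressed through a mutable reference; neither needs to move the element out of the vector.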

arendjr@programming.dev · edit-2 4 months ago

Yeah, that would be helpful if I were optimizing a hot loop right now. But I was really just using it as an example. Also, retain_mut() doesn’t compose as well.

I’d much rather write:

let vec_a: Vec<String> = /* ... */;
let vec_b: Vec<String> = vec_a
    .into_iter()
    .filter(some_filter)
    .map(some_map_fn)
    .collect();

Over:

let mut vec_a: Vec<String> = /* ... */;
vec_a.retain_mut(|x| if some_filter(x) {
    *x = some_map_fn(*x); // Yikes, cannot move out of reference.
    true
} else {
    false
});

And it would be nice if both were optimized the same way. After all, the point of Rust’s iterators is to provide zero-cost abstractions. In my opinion, functions like retain_mut() are a leak in that abstraction, because the iterator-based alternative turns out not to be zero cost.

taladar@sh.itjust.works · 4 months ago

https://blog.polybdenum.com/2024/01/17/identifying-the-collect-vec-memory-leak-footgun.html might be relevant to your question.

along with the related https://github.com/rust-lang/rust/issues/120091
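As I understand those links, the concern is that collect() may reuse the source vector's allocation, so the result can keep far more capacity than it needs. Here is a rough sketch for checking this on your own toolchain. The reuse is an implementation detail of std rather than a documented guarantee, so treat the reported values as observations only. The function name is hypothetical:

```rust
/// Filters a vector through the iterator pipeline and returns
/// (result, whether the buffer pointer stayed the same, capacity before shrinking).
fn filter_and_inspect(input: Vec<String>) -> (Vec<String>, bool, usize) {
    let old_ptr = input.as_ptr();

    let mut output: Vec<String> = input
        .into_iter()
        .filter(|s| !s.is_empty())
        .collect();

    // Same pointer means the old allocation was reused (not guaranteed by std).
    let same_buffer = std::ptr::eq(old_ptr, output.as_ptr());
    let capacity_before_shrink = output.capacity();

    // If the result is long-lived, give back any excess capacity explicitly.
    output.shrink_to_fit();

    (output, same_buffer, capacity_before_shrink)
}
```

If the capacity before shrinking is much larger than the final length, the allocation was most likely carried over from the input.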

arendjr@programming.dev · 4 months ago

Thanks! That’s very much what I was looking for!

porgamrer@programming.dev · 4 months ago

Is it really fair to say retain doesn’t compose as well just because it requires reference-based update instead of move-based? I also think using move semantics for in-place updates makes it harder to optimise things like a single field being updated on a large struct.

It also seems harsh to say iterators aren’t a zero-cost abstraction if they miss an optimisation that falls outside what the API promises. It’s natural to expect collect to allocate, no?

But I’m only writing this because I wonder if I haven’t understood your point fully.

(Side note: I think you could implement the API you want on top of retain_mut by using std::mem::replace with a default value, but you’d be hoping that the compiler optimises away all the replace calls when it inlines and sees the code can’t panic. Idk if that would actually work.)
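Something like this is what I had in mind. It is an untested-in-production sketch with a made-up helper name, and it assumes T: Default so that std::mem::take (shorthand for std::mem::replace with a default value) has something to leave behind in the slot:

```rust
/// Hypothetical helper: a filter plus move-based map, built on retain_mut.
fn filter_map_in_place<T, P, F>(items: &mut Vec<T>, mut keep: P, mut map: F)
where
    T: Default,
    P: FnMut(&T) -> bool,
    F: FnMut(T) -> T,
{
    items.retain_mut(|slot| {
        if keep(&*slot) {
            // Move the value out, leaving T::default() in the slot for a moment.
            let value = std::mem::take(slot);
            // Write the mapped value back into the same slot.
            *slot = map(value);
            true
        } else {
            false
        }
    });
}
```

Called as `filter_map_in_place(&mut vec_a, some_filter, some_map_fn)`, it keeps the move-based signature for the map function. Whether the compiler actually removes the temporary default value is something I have not verified.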
