Rust study notes

searchstar 2020-08-15 10:41:03


¶ Installation

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

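Alternatively, see: non-interactive installation of Rust on Linux

After installing Rust, run rustup doc and the documentation opens in the browser. Click The Rust Programming Language there to read the web version of the introductory book.

Upgrade: rustup update

Install the nightly toolchain: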

rustup toolchain install nightly

List the installed toolchains:

rustup show

References:

https://rust-lang.github.io/rustup/basics.html#keeping-rust-up-to-date

https://rust-lang.github.io/rustup/concepts/channels.html

https://stackoverflow.com/questions/66681150/how-to-tell-cargo-to-use-nightly

¶ Uninstallation

rustup self uninstall

¶ cargo

¶ Documentation

cargo doc --open

Generates the project's documentation and opens it in the browser.

¶ Creating a new project

cargo new <project-name>

¶ `Cargo.toml`

cargo build with optimizations (cargo build --release)

¶ `version`

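Specifies the version of the crate. If the crate is hosted on GitHub and several consecutive commits carry the same version, only the earliest of those commits is effectively used as that version of the crate. So if you push a version to GitHub and then make further changes, you need to bump the version before users can pick up the new changes.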

¶ `[dev-dependencies]`

Defines dependencies that are used only in tests: https://stackoverflow.com/questions/29857002/how-to-define-test-only-dependencies

¶ Blocking waiting for file lock on package cache

rm -rf ~/.cargo/registry/index/*
rm ~/.cargo/.package-cache

https://stackoverflow.com/questions/47565203/cargo-build-hangs-with-blocking-waiting-for-file-lock-on-the-registry-index-a#answer-53066206

¶ publish

Publishing on crates.io

cargo login
cargo publish

¶ Standard library

https://stackoverflow.com/questions/45384928/is-there-any-way-to-look-up-in-hashset-by-only-the-value-the-type-is-hashed-on

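¶ String methods

trim
Removes leading and trailing whitespace.
parse
Converts the string into a specific type, which must implement FromStr. The target type is inferred from context, such as the type annotation of the variable being assigned to, or given explicitly as parse::<T>().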
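A minimal sketch of the two ways the target type of parse can be chosen. The helper name parse_trimmed is made up for this note and is not part of the standard library.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: trims surrounding whitespace, then parses an i32.
fn parse_trimmed(input: &str) -> Result<i32, ParseIntError> {
    // The target type is inferred from the return type of this function.
    input.trim().parse()
}

fn main() {
    // Target type taken from the annotation on the binding.
    let a: u64 = "42".parse().unwrap();
    // Target type given explicitly with the turbofish syntax.
    let b = " 3.5\n".trim().parse::<f64>().unwrap();
    println!("{} {} {:?}", a, b, parse_trimmed("  -7 \n"));
}
```

Without the trim call, the surrounding whitespace would make the integer parse fail.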

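¶ Sorting

Sorting comes in two kinds, stable and unstable. A stable sort keeps equal elements in their original relative order; an unstable sort does not guarantee this.

For a stable sort, use sort_by: https://doc.rust-lang.org/std/primitive.slice.html#method.sort_by

For an unstable sort, use sort_unstable_by: https://doc.rust-lang.org/std/primitive.slice.html#method.sort_unstable_by

Both have a worst-case time complexity of O(n log n).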
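A small sketch of when stability matters. The helper names and the sample data are made up for illustration.

```rust
/// Sorts (name, score) pairs by score only. Because sort_by is stable,
/// pairs with the same score keep their original relative order.
fn sort_by_score_stable(items: &mut [(&str, u32)]) {
    items.sort_by(|a, b| a.1.cmp(&b.1));
}

/// Sorts integers in descending order. Equal integers are
/// indistinguishable, so an unstable sort is good enough here.
fn sort_desc_unstable(items: &mut [i32]) {
    items.sort_unstable_by(|a, b| b.cmp(a));
}

fn main() {
    let mut scores = vec![("b", 2), ("a", 1), ("c", 2), ("d", 1)];
    sort_by_score_stable(&mut scores);
    println!("{:?}", scores);

    let mut numbers = vec![3, 1, 2];
    sort_desc_unstable(&mut numbers);
    println!("{:?}", numbers);
}
```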

¶ Entry API

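Take the Entry API of BTreeMap as an example. For basic usage, see the standard library documentation: https://doc.rust-lang.org/stable/std/collections/struct.BTreeMap.html#method.entry

However, the basic and_modify and or_insert_with interfaces have a problem: although they are mutually exclusive, you cannot pass ownership of one object to both of them. To work around this, if the object has an empty state, you can first use or_insert to insert an empty value and then modify it. Source: https://users.rust-lang.org/t/hashmap-entry-api-and-ownership/81368

A more general approach is to match on the returned Entry to tell whether it is Occupied or Vacant, so that the compiler knows the two cases are mutually exclusive.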
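A minimal sketch of the match approach, in which both arms take ownership of the same value. The function name append_or_insert is made up for this note.

```rust
use std::collections::btree_map::{BTreeMap, Entry};

/// Appends `values` to the list stored under `key`, or stores `values`
/// itself if the key is absent. Both arms take ownership of `values`.
fn append_or_insert(m: &mut BTreeMap<u32, Vec<u32>>, key: u32, values: Vec<u32>) {
    match m.entry(key) {
        // Key present: move `values` into the existing Vec.
        Entry::Occupied(mut entry) => entry.get_mut().extend(values),
        // Key absent: move `values` into the map directly.
        Entry::Vacant(entry) => {
            entry.insert(values);
        }
    }
}

fn main() {
    let mut m = BTreeMap::new();
    append_or_insert(&mut m, 1, vec![1, 2]);
    append_or_insert(&mut m, 1, vec![3]);
    println!("{:?}", m);
}
```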

Another example of match: modify and optionally remove

use std::collections::btree_map::{self, BTreeMap};

fn pop(m: &mut BTreeMap<u32, Vec<u32>>, key: u32) -> Option<u32> {
    match m.entry(key) {
        btree_map::Entry::Occupied(mut entry) => {
            let values = entry.get_mut();
            let ret = values.pop();
            if values.is_empty() {
                entry.remove();
            }
            ret
        }
        btree_map::Entry::Vacant(_) => {
            None
        }
    }
}
fn main() {
    let mut m: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    m.insert(1, vec![2, 3]);
    assert_eq!(pop(&mut m, 1), Some(3));
    assert_eq!(pop(&mut m, 1), Some(2));
    assert!(m.is_empty());
}

References:

https://doc.rust-lang.org/stable/std/collections/btree_map/enum.Entry.html

https://doc.rust-lang.org/stable/std/collections/btree_map/struct.VacantEntry.html

¶ mpsc

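Requirement: read data in one thread and send it to another thread for processing.

My approach: send and receive through an mpsc channel.

Pitfall: an mpsc channel never blocks the sender; its buffer is unbounded. Reading turned out to be far faster than writing, so a large amount of memory was consumed.

Solution: use sync_channel:

pub fn sync_channel<T>(bound: usize) -> (SyncSender<T>, Receiver<T>)

The bound parameter is the number of messages the buffer can hold. Once the buffer is full, send blocks until the receiver takes a message out.

Documentation: https://doc.rust-lang.org/stable/std/sync/mpsc/index.html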
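A minimal sketch of a bounded producer/consumer pair. The function produce_and_sum and the numbers are made up for illustration; the real workload would send data read from a file.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Sends 0..n through a channel that buffers at most `bound` messages
/// and returns the sum computed on the receiving side.
fn produce_and_sum(n: u64, bound: usize) -> u64 {
    let (tx, rx) = sync_channel::<u64>(bound);
    let producer = thread::spawn(move || {
        for i in 0..n {
            // Blocks while the buffer already holds `bound` messages.
            tx.send(i).unwrap();
        }
        // tx is dropped here, which ends the receiver's iteration.
    });
    let sum: u64 = rx.iter().sum();
    producer.join().unwrap();
    sum
}

fn main() {
    println!("{}", produce_and_sum(100, 2));
}
```

With a bound of 0 the channel becomes a rendezvous channel: every send waits until the receiver is ready to take the message.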

¶ Crates

¶ dyn_struct

https://www.reddit.com/r/rust/comments/qbj84o/dyn_struct_create_types_whose_size_is_determined/

https://github.com/nolanderc/dyn_struct

¶ enum_iterator

Can get the number of possible values of an enum.

¶ num-derive

Can convert an enum into a primitive type.

¶ serde

¶ Specifying a field name

use serde::Deserialize;
#[derive(Deserialize)]
struct Info {
    #[serde(rename = "num-run-op")]
    num_run_op: usize,
}

This way, when reading JSON, the num-run-op key in the JSON is mapped to num_run_op.

Documentation: https://serde.rs/field-attrs.html

¶ clap

Official documentation: https://docs.rs/clap/latest/clap/

Usage of derive: https://docs.rs/clap/latest/clap/_derive/index.html

¶ `#[arg(...)]`

¶ `short`

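By default, the first letter of the field name is used as the short flag.

You can also specify it explicitly with short = 'x'.

¶ `long`

By default, the field name with underscores replaced by - is used as the long flag. You can also specify it explicitly with long = "xxx".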

¶ `default_value_t`

default_value_t [= <expr>]

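The _t suffix presumably stands for "typed": the default is given as an expression of the field's type rather than as a string.

¶ Positional arguments

A field without short, long, and the like is a positional argument by default.

¶ Optional arguments

Simply declare the field as Option<type>.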

¶ API guidelines

¶ Generic reader/writer functions take R: Read and W: Write by value (C-RW-VALUE)

https://rust-lang.github.io/api-guidelines/interoperability.html#generic-readerwriter-functions-take-r-read-and-w-write-by-value-c-rw-value

What is the reason for C-RW-VALUE?
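A rough sketch of what the guideline means in practice. The function read_n is made up for this note. Since &mut R implements Read whenever R: Read, a by-value signature still lets the caller lend the reader and keep using it afterwards.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical helper that takes the reader by value, as C-RW-VALUE suggests.
fn read_n<R: Read>(mut reader: R, n: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0; n];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    let mut cursor = Cursor::new(b"hello world".to_vec());
    // Lend the reader: &mut Cursor<...> is itself a Read.
    let head = read_n(&mut cursor, 5)?;
    // Hand the reader over for good on the last call.
    let tail = read_n(cursor, 6)?;
    println!("{:?} {:?}", head, tail);
    Ok(())
}
```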

¶ Type conversion

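use std::ffi::CStr;
use std::os::raw::c_char;
// hello() is assumed to be an extern "C" function that returns a NUL-terminated C string.
let c_buf: *const c_char = unsafe { hello() };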
let c_str: &CStr = unsafe { CStr::from_ptr(c_buf) };
let str_slice: &str = c_str.to_str().unwrap();
let str_buf: String = str_slice.to_owned();  // if necessary

¶ Syntax

Rust for loops

_ is the wildcard pattern. In a pattern such as Err(_), it matches every Err, whatever it contains.
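A minimal sketch of the wildcard inside an Err pattern. The function parse_or is made up for illustration.

```rust
/// Returns the parsed number, or `default` for any parse error.
fn parse_or(input: &str, default: i32) -> i32 {
    match input.parse::<i32>() {
        Ok(n) => n,
        // `_` matches whatever error value is inside Err and discards it.
        Err(_) => default,
    }
}

fn main() {
    println!("{} {}", parse_or("12", 0), parse_or("oops", -1));
}
```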

https://users.rust-lang.org/t/calling-function-in-struct-field-requires-extra-parenthesis/14214/2

¶ I/O

¶ Reading command-line arguments

use std::io;
use std::env;
use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    let mut args = env::args();
    let arg0 = args.next().unwrap();
    // args.len(): Returns the exact remaining length of the iterator.
    if args.len() != 1 {
        eprintln!("{} dump-file", arg0);
        return Err(Box::new(io::Error::new(
            io::ErrorKind::Other,
            "Invalid arguments",
        )));
    }
    let file_path = args.next().unwrap();
    println!("{}", file_path);
    Ok(())
}

Reference: Rust programming tidbits: reading command-line arguments in Rust

¶ trait

A Rust trait defines which interfaces a type has. Once a trait is defined, it can be implemented for existing types:

trait A {
    fn a() -> i32;
}
impl A for f32 {
    fn a() -> i32 {
        return 2333;
    }
}
fn main() {
    // 2333
    println!("{}", f32::a());
}

¶ Associated type

trait A {
    type T;
}

If B: A, T can usually be accessed as B::T. Generic parameter lists are a special case, where it has to be written as <B as A>::T. Example:

trait A {
    type T;
}
struct C<B: A, C = <B as A>::T> { a: B, at: C }

¶ Universal function call syntax

Documentation: https://doc.rust-lang.org/reference/expressions/call-expr.html#disambiguating-function-calls

Mainly used to call a method of a specific trait:

<T as TraitA>::method_name(xxx)
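A small sketch of the disambiguation. The traits Pilot and Wizard and the type Human are made-up names; the point is that both traits define a method called name.

```rust
trait Pilot {
    fn name(&self) -> String;
}
trait Wizard {
    fn name(&self) -> String;
}

struct Human;

impl Pilot for Human {
    fn name(&self) -> String {
        "pilot".to_string()
    }
}
impl Wizard for Human {
    fn name(&self) -> String {
        "wizard".to_string()
    }
}

fn main() {
    let h = Human;
    // h.name() would be ambiguous here, so the trait is spelled out.
    println!("{}", <Human as Pilot>::name(&h));
    println!("{}", <Human as Wizard>::name(&h));
}
```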

¶ Constraining associated types of different types to be equal

This is an unimplemented feature: https://github.com/rust-lang/rust/issues/20041

But it can be worked around: https://stackoverflow.com/questions/66359551/alternative-to-equality-constraints-for-associated-types

¶ FnOnce, FnMut, Fn

https://stackoverflow.com/questions/30177395/when-does-a-closure-implement-fn-fnmut-and-fnonce

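However, to build an array of functions, it seems you have to use the fn type, i.e. plain functions (non-capturing closures also coerce to fn), or box the closures as trait objects: https://stackoverflow.com/questions/31736656/how-to-implement-a-vector-array-of-functions-in-rust-when-the-functions-co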
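A sketch of both options, an array of fn pointers and a Vec of boxed closures. All function names here are made up for illustration.

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}
fn double(x: i32) -> i32 {
    x * 2
}

/// Applies every function pointer in order.
fn apply_all(fs: &[fn(i32) -> i32], x: i32) -> i32 {
    fs.iter().fold(x, |acc, f| f(acc))
}

/// Capturing closures each have their own anonymous type, so they are
/// boxed as trait objects to fit into one Vec.
fn make_adders(offsets: &[i32]) -> Vec<Box<dyn Fn(i32) -> i32>> {
    offsets
        .iter()
        .map(|&o| Box::new(move |x: i32| x + o) as Box<dyn Fn(i32) -> i32>)
        .collect()
}

fn main() {
    // A non-capturing closure coerces to a fn pointer as well.
    let fs: [fn(i32) -> i32; 3] = [add_one, double, |x: i32| x - 3];
    println!("{}", apply_all(&fs, 3));

    let adders = make_adders(&[1, 10]);
    println!("{} {}", adders[0](5), adders[1](5));
}
```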

¶ Higher-Rank Trait Bounds (HRTBs)

Official documentation: https://doc.rust-lang.org/nomicon/hrtb.html

Basic syntax: T: for<'a> TraitName<'a>

This means that T must satisfy the trait bound for every lifetime. Example:

use std::ops::SubAssign;
fn func<T>(a: &mut T, b: &T)
where
    T: for<'a> SubAssign<&'a T>,
{
    *a -= b;
}
fn main() {
    let mut a = 2;
    let b = 1;
    func(&mut a, &b);
    println!("{}", a);
}

¶ Multithreading

Atomics: https://doc.rust-lang.org/std/sync/atomic/index.html
Read-write lock: https://doc.rust-lang.org/stable/std/sync/struct.RwLock.html
Rust scoped thread pool
Rust scoped thread

¶ channel

The select! that goes with mpsc in the standard library has been deprecated. Consider using crossbeam-channel instead: https://docs.rs/crossbeam-channel/latest/crossbeam_channel/

select: https://docs.rs/crossbeam-channel/latest/crossbeam_channel/macro.select.html

¶ Error handling

¶ Making main compatible with multiple error types

use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
    Ok(()) // any error type implementing Error can be propagated with `?`
}
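A slightly fuller sketch of why Box<dyn Error> helps: the ? operator converts each concrete error type into the boxed trait object. The function add is made up for illustration.

```rust
use std::error::Error;

/// Two different error types are propagated through the same `?` operator.
fn add(a: &str, b: &str) -> Result<f64, Box<dyn Error>> {
    let x: i32 = a.parse()?; // ParseIntError on failure
    let y: f64 = b.parse()?; // ParseFloatError on failure
    Ok(f64::from(x) + y)
}

fn main() -> Result<(), Box<dyn Error>> {
    println!("{}", add("1", "2.5")?);
    Ok(())
}
```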

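¶ Sending multiple error types through a channel

Box<dyn Error> cannot be sent through a channel, because it is not Send. You can enumerate the kinds of errors that may occur and hand-write an enum for them, which can then be sent: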

enum FlushError {
    Bincode(bincode::Error),
    Io(io::Error),
}
impl From<bincode::Error> for FlushError {
    fn from(e: bincode::Error) -> Self {
        Self::Bincode(e)
    }
}
impl From<io::Error> for FlushError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}
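The same pattern as a self-contained sketch that only uses the standard library. WorkerError, parse_positive and parse_in_thread are made-up names, and ParseIntError stands in for bincode::Error.

```rust
use std::io;
use std::num::ParseIntError;
use std::sync::mpsc;
use std::thread;

#[derive(Debug)]
enum WorkerError {
    Io(io::Error),
    Parse(ParseIntError),
}
impl From<io::Error> for WorkerError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}
impl From<ParseIntError> for WorkerError {
    fn from(e: ParseIntError) -> Self {
        Self::Parse(e)
    }
}

/// Parses a strictly positive number; zero is reported as an io::Error
/// just to exercise the second variant.
fn parse_positive(s: &str) -> Result<u32, WorkerError> {
    let n: u32 = s.parse()?; // ParseIntError -> WorkerError via From
    if n == 0 {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "zero").into());
    }
    Ok(n)
}

/// Runs parse_positive on a worker thread and sends the whole Result
/// back through a channel. This works because WorkerError is Send.
fn parse_in_thread(s: &'static str) -> Result<u32, WorkerError> {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || tx.send(parse_positive(s)).unwrap());
    let result = rx.recv().unwrap();
    worker.join().unwrap();
    result
}

fn main() {
    println!("{:?}", parse_in_thread("7"));
    println!("{:?}", parse_in_thread("x"));
}
```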

References:

https://fettblog.eu/rust-enums-wrapping-errors/

https://stackoverflow.com/questions/71977024/rust-cannot-send-unwrapped-result-data-across-await-point

¶ Getting mutable references to multiple elements of a `Vec`

For example, to get mutable references to a[1] and a[3], you can use an iterator:

fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let mut iter = a.iter_mut();
    let a1 = iter.nth(1).unwrap();
    let a3 = iter.nth(3 - 1 - 1).unwrap();
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}

You can also use the nightly feature get_many_mut (stabilized in newer Rust releases under the name get_disjoint_mut):

#![feature(get_many_mut)]
fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let [a1, a3] = a.get_many_mut([1, 3]).unwrap();
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}
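A third option on stable Rust is split_at_mut, which splits the slice into two non-overlapping halves. A minimal sketch, with a made-up helper two_mut that assumes i < j:

```rust
/// Returns mutable references to v[i] and v[j]. Panics unless i < j < v.len().
fn two_mut<T>(v: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i < j && j < v.len());
    // left covers v[..j], right covers v[j..], so they cannot overlap.
    let (left, right) = v.split_at_mut(j);
    (&mut left[i], &mut right[0])
}

fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let (a1, a3) = two_mut(&mut a, 1, 3);
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}
```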

¶ Default values for struct fields

https://stackoverflow.com/questions/19650265/is-there-a-faster-shorter-way-to-initialize-variables-in-a-rust-struct
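A short sketch of the usual approach, the Default trait combined with struct update syntax. The structs Config and Timeout are made up for illustration.

```rust
/// Derived Default: every field gets its type's default value.
#[derive(Debug, Default, PartialEq)]
struct Config {
    name: String,
    retries: u32,
    verbose: bool,
}

/// Hand-written Default for a non-trivial default value.
struct Timeout {
    seconds: u64,
}
impl Default for Timeout {
    fn default() -> Self {
        Timeout { seconds: 30 }
    }
}

fn main() {
    // Set only the fields of interest and take the rest from Default.
    let config = Config {
        retries: 3,
        ..Default::default()
    };
    println!("{:?} {}", config, Timeout::default().seconds);
}
```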

¶ 生成随机数

Use the rand crate. Documentation: https://docs.rs/rand/latest/rand/

Basic usage: https://docs.rs/rand/latest/rand/#quick-start

Built-in random number generators: https://docs.rs/rand/latest/rand/rngs/index.html

If you need to specify the random seed, rand::rngs::StdRng is usually enough. Documentation: https://docs.rs/rand/latest/rand/rngs/struct.StdRng.html

Some built-in distributions: https://docs.rs/rand/latest/rand/distributions/index.html

The commonly used uniform distribution: https://docs.rs/rand/latest/rand/distributions/struct.Uniform.html

¶ lower_bound / upper_bound

https://stackoverflow.com/questions/48575866/how-to-get-the-lower-bound-and-upper-bound-of-an-element-in-a-btreeset
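A minimal sketch of the range-based approach for a BTreeSet. The function names mirror the C++ ones and are not standard library APIs.

```rust
use std::collections::BTreeSet;
use std::ops::Bound::{Excluded, Unbounded};

/// First element >= x, like C++ lower_bound.
fn lower_bound(set: &BTreeSet<i32>, x: i32) -> Option<i32> {
    set.range(x..).next().copied()
}

/// First element > x, like C++ upper_bound.
fn upper_bound(set: &BTreeSet<i32>, x: i32) -> Option<i32> {
    set.range((Excluded(x), Unbounded)).next().copied()
}

fn main() {
    let set: BTreeSet<i32> = vec![1, 3, 5].into_iter().collect();
    println!("{:?} {:?}", lower_bound(&set, 3), upper_bound(&set, 3));
}
```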

¶ Module

https://doc.rust-lang.org/book/ch07-05-separating-modules-into-different-files.html

¶ Conditional compilation

Official documentation: https://doc.rust-lang.org/reference/conditional-compilation.html

https://stackoverflow.com/questions/29857002/how-to-define-test-only-dependencies

To derive only in tests: #[cfg_attr(test, derive(Deserialize))]. Source: https://www.reddit.com/r/rust/comments/nwywqx/conditionally_derive_for_integration_tests/

To impl only in tests:

#[cfg(test)]
impl Default for Status {
    // ...
}

¶ Miscellaneous

https://stackoverflow.com/questions/60253791/why-can-i-not-mutably-borrow-separate-fields-from-a-mutex-guard

https://stackoverflow.com/questions/28185854/how-do-i-test-crates-with-no-std

¶ Unstable features

¶ generic_const_exprs

Projects that need it:

https://github.com/seekstar/counter-timer-cpp

Combined with array-macro, the timers can be stored in an array instead of a Vec.

¶ RFC

Multiple Attributes in an Attribute Container (postponed)

Support for types that cannot be dropped: https://github.com/rust-lang/rfcs/pull/776 (postponed)

Improving Entry API to get the keys back when they are unused

¶ Known issues

¶ Non-lexical lifetimes (NLL)

Source: https://blog.rust-lang.org/2022/08/05/nll-by-default.html

fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
    if let Some(s) = vec.last() { // borrows vec
        // returning s here forces vec to be borrowed
        // for the rest of the function, even though it
        // shouldn't have to be
        return s;
    }

    // Because vec is borrowed, this call to vec.push gives
    // an error!
    vec.push("".to_string()); // ERROR
    vec.last().unwrap()
}

error[E0502]: cannot borrow `*vec` as mutable because it is also borrowed as immutable
  --> a.rs:11:5
   |
1  | fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
   |                 -- lifetime `'a` defined here
2  |     if let Some(s) = vec.last() { // borrows vec
   |                      ---------- immutable borrow occurs here
...
6  |         return s;
   |                - returning this value requires that `*vec` is borrowed for `'a`
...
11 |     vec.push("".to_string()); // ERROR
   |     ^^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here

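This happens because, after s borrows vec, s is only conditionally returned, yet the compiler still extends the borrow of vec to all branches. As a result, the other branch, which does not borrow vec, is also considered to hold the borrow, and compilation fails.

Reportedly, the next-generation borrow checker Polonius can solve this problem. For now, the only way around it is to defer the borrow of vec: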

fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
    if !vec.is_empty() {
        let s = vec.last().unwrap(); // borrows vec
        return s; // extends the borrow
    }

    // In this branch, the borrow has never happened, so even
    // though it is extended, it doesn't cover this call;
    // the code compiles.
    //
    // Note the subtle difference with the previous example:
    // in that code, the borrow *always* happened, but it was
    // only *conditionally* returned (but the compiler lost track
    // of the fact that it was a conditional return).
    //
    // In this example, the *borrow itself* is conditional.
    vec.push("".to_string());
    vec.last().unwrap()
}

fn main() { }

¶ Cannot execute complex code while debugging

https://stackoverflow.com/questions/68232945/execute-a-statement-while-debugging-in-rust

¶ Raw pointers are `!Sync` and `!Send`

https://doc.rust-lang.org/nomicon/send-and-sync.html

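The main purpose is to prevent a struct containing raw pointers from being automatically marked as thread-safe.

So if you need to share a raw pointer across threads and can guarantee that access to what the raw pointer refers to is already properly synchronized, you can write a wrapper: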

struct ThreadSafePtr<T>(*mut T);
unsafe impl<T> Send for ThreadSafePtr<T> {}
unsafe impl<T> Sync for ThreadSafePtr<T> {}

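But I think raw pointers themselves should be thread-safe, and the compiler should instead refuse to automatically mark structs containing raw pointers as thread-safe.

¶ drop takes a mutable reference rather than ownership

https://stackoverflow.com/questions/30905826/why-does-drop-take-mut-self-instead-of-self

This is to prevent the compiler from automatically calling drop again at the end of drop.

If you need to consume a field during drop, you can put the field in an Option. Alternatively, wrap the field in ManuallyDrop and take it out with unsafe code in drop: https://users.rust-lang.org/t/can-drop-handler-take-ownership-of-a-field/74301/7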
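A minimal sketch of the Option approach. The Worker type is made up for illustration; JoinHandle::join is a typical method that consumes its receiver and therefore cannot be called through &mut self directly.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

struct Worker {
    // Wrapped in Option so that drop can move the handle out of &mut self.
    handle: Option<JoinHandle<()>>,
}

impl Worker {
    fn spawn(done: Arc<AtomicBool>) -> Self {
        let handle = thread::spawn(move || done.store(true, Ordering::SeqCst));
        Worker {
            handle: Some(handle),
        }
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        // take() moves the handle out and leaves None behind.
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

fn main() {
    let done = Arc::new(AtomicBool::new(false));
    drop(Worker::spawn(Arc::clone(&done)));
    // The thread was joined in drop, so its store is visible here.
    println!("{}", done.load(Ordering::SeqCst));
}
```

The cost of this approach is that every other use of the field has to deal with the Option.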

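I think the best design would be to let drop take ownership and have the compiler special-case it so that drop is not called again at the end of drop. But the Rust core developers consider that this would require too many changes to the compiler: https://github.com/rust-lang/rust/issues/4330

¶ With a custom `Drop::drop`, you cannot take ownership of an individual field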

¶ Copying a mutable reference field of a struct mutably borrows the struct

For example:

struct S<'a> {
    m: &'a mut i32,
}
impl<'a> S<'a> {
    fn f1<'b>(&'b mut self) -> &'a mut i32 {
        let new_m: &'a mut i32 = self.m;
        new_m
    }
}
fn f2(m: &mut i32) -> &mut i32 {
    let mut s = S { m };
    s.f1()
}
fn main() {
    let mut m = 2;
    *f2(&mut m) = 3;
    println!("{}", m);
}

This fails to compile:

error: lifetime may not live long enough
 --> test.rs:6:20
  |
4 | impl<'a> S<'a> {
  |      -- lifetime `'a` defined here
5 |     fn f1<'b>(&'b mut self) -> &'a mut i32 {
  |           -- lifetime `'b` defined here
6 |         let new_m: &'a mut i32 = self.m;
  |                    ^^^^^^^^^^^ type annotation requires that `'b` must outlive `'a`
  |
  = help: consider adding the following bound: `'b: 'a`

error: aborting due to 1 previous error

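Obviously we cannot change it to 'b: 'a, because s is a local variable and its lifetime 'b is shorter than 'a.

The error occurs because let new_m = self.m is not a plain copy; it transfers the write permission on *self.m to new_m. The compiler has to guarantee that self is neither read nor written until the write permission is handed back to self. So the compiler makes new_m a mutable borrow of self, which lets the borrow mechanism enforce this. And for new_m to mutably borrow self, self must outlive new_m.

I think a new concept could be introduced: mutability transfer. For let new_m = self.m, we would say that the mutability of self is transferred to new_m. While an object is in the mutability-transferred state, it may not be read or written. This would avoid affecting the lifetime of new_m.

For now, the only option in this situation is to let f1 consume self: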

struct S<'a> {
    m: &'a mut i32,
}
impl<'a> S<'a> {
    fn f1(self) -> &'a mut i32 {
        let new_m: &'a mut i32 = self.m;
        new_m
    }
}
fn f2(m: &mut i32) -> &mut i32 {
    let s = S { m };
    s.f1()
}
fn main() {
    let mut m = 2;
    *f2(&mut m) = 3;
    println!("{}", m);
}

Note that copying an immutable reference field is a real copy and does not need to borrow the whole struct, so this problem does not arise. For example, the following code compiles:

struct S<'a> {
    m: &'a i32,
}
impl<'a> S<'a> {
    fn f1<'b>(&'b self) -> &'a i32 {
        let new_m: &'a i32 = self.m;
        new_m
    }
}
fn f2(m: &i32) -> &i32 {
    let s = S { m };
    s.f1()
}
fn main() {
    let m = 2;
    println!("{}", f2(&m));
}

However, if S::m is changed to a mutable reference:

struct S<'a> {
    m: &'a mut i32,
}
impl<'a> S<'a> {
    fn f1<'b>(&'b self) -> &'a i32 {
        let new_m: &'a i32 = self.m;
        new_m
    }
}
fn f2(m: &mut i32) -> &i32 {
    let s = S { m };
    s.f1()
}
fn main() {
    let mut m = 2;
    println!("{}", f2(&mut m));
}

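Even when it is copied as an immutable reference, the write permission still has to be transferred, so the whole self has to be borrowed, which leads to the same lifetime problem as above: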

error: lifetime may not live long enough
 --> test.rs:6:20
  |
4 | impl<'a> S<'a> {
  |      -- lifetime `'a` defined here
5 |     fn f1<'b>(&'b self) -> &'a i32 {
  |           -- lifetime `'b` defined here
6 |         let new_m: &'a i32 = self.m;
  |                    ^^^^^^^ type annotation requires that `'b` must outlive `'a`
  |
  = help: consider adding the following bound: `'b: 'a`

error: aborting due to 1 previous error

Here too, the only fix is to let f1 consume self:

fn f1<'b>(self) -> &'a i32 {

Rust: reading multiple integers from a single line
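One possible way to do this, as a sketch rather than the linked post's method. parse_ints and read_ints are made-up names, and a byte slice stands in for stdin so that the example runs unattended.

```rust
use std::io::{self, BufRead};
use std::num::ParseIntError;

/// Splits one line on whitespace and parses every token as i64.
fn parse_ints(line: &str) -> Result<Vec<i64>, ParseIntError> {
    line.split_whitespace()
        .map(|tok| tok.parse::<i64>())
        .collect()
}

/// Reads one line from any buffered reader, for example io::stdin().lock().
fn read_ints<R: BufRead>(mut reader: R) -> io::Result<Vec<i64>> {
    let mut line = String::new();
    reader.read_line(&mut line)?;
    parse_ints(&line).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

fn main() -> io::Result<()> {
    // &[u8] implements BufRead; swap in io::stdin().lock() for real input.
    let input: &[u8] = b"1 2 -3\n4 5\n";
    println!("{:?}", read_ints(input)?);
    Ok(())
}
```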

linux kernel get_user include