Rust study notes

searchstar 2020-08-15 10:41:03

Installation

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

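Alternatively, see: non-interactive Rust installation on Linux

After installing Rust, run rustup doc to open the documentation in a browser. Click The Rust Programming Language there to read the web version of the introductory book.

Upgrade: rustup update

Install the nightly toolchain: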

rustup toolchain install nightly

References:

https://rust-lang.github.io/rustup/basics.html#keeping-rust-up-to-date

https://rust-lang.github.io/rustup/concepts/channels.html

https://stackoverflow.com/questions/66681150/how-to-tell-cargo-to-use-nightly

cargo

Documentation

cargo doc --open

Generates the project's documentation and opens it in a browser.

Creating a new project

cargo new <project-name>

Cargo.toml

version

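Specifies the version of the crate. If the crate is hosted on GitHub and several consecutive commits share the same version, only the earliest of those commits is actually used as the crate for that version. So once a version has been pushed to GitHub, any later change needs a version bump before users pick it up.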

[dev-dependencies]

Defines dependencies that are used only in tests: https://stackoverflow.com/questions/29857002/how-to-define-test-only-dependencies

Blocking waiting for file lock on package cache

rm -rf ~/.cargo/registry/index/*
rm ~/.cargo/.package-cache

https://stackoverflow.com/questions/47565203/cargo-build-hangs-with-blocking-waiting-for-file-lock-on-the-registry-index-a#answer-53066206

publish

Publishing on crates.io

cargo login
cargo publish

Standard library

https://stackoverflow.com/questions/45384928/is-there-any-way-to-look-up-in-hashset-by-only-the-value-the-type-is-hashed-on

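String methods

Sorting

Sorts are either unstable or stable. A stable sort keeps equal elements in their original relative order; an unstable sort does not guarantee this.

For a stable sort, use sort_by: https://doc.rust-lang.org/std/primitive.slice.html#method.sort_by

For an unstable sort, use sort_unstable_by: https://doc.rust-lang.org/std/primitive.slice.html#method.sort_unstable_by

Both have a worst-case time complexity of O(n log n).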
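A minimal sketch of the difference between the two, assuming a list of (name, score) pairs sorted by score only. The helper names sort_stable and sort_unstable_scores are made up for this illustration.

```rust
/// Sorts (name, score) pairs by score only. Ties keep their original order.
fn sort_stable(mut v: Vec<(&'static str, u32)>) -> Vec<(&'static str, u32)> {
    v.sort_by(|a, b| a.1.cmp(&b.1));
    v
}

/// Same comparison, but ties may end up in any order, so only the scores
/// are returned.
fn sort_unstable_scores(mut v: Vec<(&'static str, u32)>) -> Vec<u32> {
    v.sort_unstable_by(|a, b| a.1.cmp(&b.1));
    v.into_iter().map(|(_, score)| score).collect()
}

fn main() {
    let data = vec![("b", 2), ("a", 1), ("c", 2), ("d", 1)];
    // Stable: "a" stays before "d", and "b" stays before "c".
    println!("{:?}", sort_stable(data.clone()));
    // Unstable: only the order of the scores themselves is guaranteed.
    println!("{:?}", sort_unstable_scores(data));
}
```

The unstable variant is usually the one to reach for when the order of equal elements does not matter.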

Entry API

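Take the Entry API of BTreeMap as an example. For basic usage, see the standard library documentation: https://doc.rust-lang.org/stable/std/collections/struct.BTreeMap.html#method.entry

However, the basic and_modify and or_insert_with interfaces have a problem: although the two are mutually exclusive, the ownership of a single object cannot be passed to both of them. One workaround, if the object has an empty state, is to call or_insert with the empty value first and then apply the modification. Source: https://users.rust-lang.org/t/hashmap-entry-api-and-ownership/81368

A more general approach is to match on whether the returned Entry is Occupied or Vacant, so that the compiler knows the two cases are mutually exclusive.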
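A minimal sketch of the match approach, assuming the map stores a Vec<String> per key. The function name merge is hypothetical. The owned items value is moved in exactly one of the two arms, which the and_modify / or_insert_with pair cannot express.

```rust
use std::collections::btree_map::{BTreeMap, Entry};

/// Appends `items` to the list stored under `key`, or stores `items` itself
/// when the key is absent. `items` is moved in exactly one of the two arms.
fn merge(m: &mut BTreeMap<u32, Vec<String>>, key: u32, items: Vec<String>) {
    match m.entry(key) {
        Entry::Occupied(mut entry) => entry.get_mut().extend(items),
        Entry::Vacant(entry) => {
            entry.insert(items);
        }
    }
}

fn main() {
    let mut m = BTreeMap::new();
    merge(&mut m, 1, vec!["a".to_string()]);
    merge(&mut m, 1, vec!["b".to_string()]);
    println!("{:?}", m);
}
```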

Another match example: modify and optionally remove

use std::collections::btree_map::{self, BTreeMap};

fn pop(m: &mut BTreeMap<u32, Vec<u32>>, key: u32) -> Option<u32> {
    match m.entry(key) {
        btree_map::Entry::Occupied(mut entry) => {
            let values = entry.get_mut();
            let ret = values.pop();
            if values.is_empty() {
                entry.remove();
            }
            ret
        }
        btree_map::Entry::Vacant(_) => {
            None
        }
    }
}
fn main() {
    let mut m: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    m.insert(1, vec![2, 3]);
    assert_eq!(pop(&mut m, 1), Some(3));
    assert_eq!(pop(&mut m, 1), Some(2));
    assert!(m.is_empty());
}

References:

https://doc.rust-lang.org/stable/std/collections/btree_map/enum.Entry.html

https://doc.rust-lang.org/stable/std/collections/btree_map/struct.VacantEntry.html

mpsc

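Requirement: read data in one thread and send it to another thread for processing.

My approach: send and receive through an mpsc channel.

Pitfall: an mpsc channel never blocks the sender because its buffer is unbounded. In my case reading was far faster than writing, so messages piled up and consumed a large amount of memory.

Solution: use sync_channel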

pub fn sync_channel<T>(bound: usize) -> (SyncSender<T>, Receiver<T>)

The bound parameter is the number of messages the channel can buffer, not a size in bytes.

Documentation: https://doc.rust-lang.org/stable/std/sync/mpsc/index.html
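A minimal sketch of a bounded producer/consumer pair built on sync_channel. The function name bounded_sum and the use of plain integers as messages are assumptions made for the illustration.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Sends 0..n through a channel that buffers at most `bound` messages and
/// returns their sum as computed by the receiving side.
fn bounded_sum(n: u64, bound: usize) -> u64 {
    let (tx, rx) = sync_channel::<u64>(bound);
    let producer = thread::spawn(move || {
        for i in 0..n {
            // Blocks while `bound` messages are already queued.
            tx.send(i).unwrap();
        }
        // `tx` is dropped here, which ends the receiver's iteration.
    });
    let sum: u64 = rx.iter().sum();
    producer.join().unwrap();
    sum
}

fn main() {
    println!("{}", bounded_sum(1000, 4));
}
```

With a bound of 0 the channel becomes a rendezvous channel: every send waits until the receiver takes the message.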

Crates

dyn_struct

https://www.reddit.com/r/rust/comments/qbj84o/dyn_struct_create_types_whose_size_is_determined/

https://github.com/nolanderc/dyn_struct

enum_iterator

Can get the number of possible values of an enum.

num-derive

Can convert an enum to a primitive type.

serde

Specifying a field name

#[derive(Deserialize)]
struct Info {
    #[serde(rename = "num-run-op")]
    num_run_op: usize,
}

When reading JSON, num-run-op in the JSON is then mapped to num_run_op.

Documentation: https://serde.rs/field-attrs.html

clap

Official documentation: https://docs.rs/clap/latest/clap/

Derive usage: https://docs.rs/clap/latest/clap/_derive/index.html

#[arg(...)]

short

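By default, uses the first letter of the field name as the flag.

short = 'x' sets the flag explicitly.

long

By default, uses the field name with underscores replaced by - as the flag. long = "xxx" sets the flag explicitly.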

default_value_t
default_value_t [= <expr>]

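The _t suffix presumably stands for "type": the default is given as a value of the field's type rather than as a string.

Positional arguments

A field without short, long, or similar attributes is a positional argument by default.

Optional arguments

Declare the field as Option<type>.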

API guidelines

Generic reader/writer functions take R: Read and W: Write by value (C-RW-VALUE)

https://rust-lang.github.io/api-guidelines/interoperability.html#generic-readerwriter-functions-take-r-read-and-w-write-by-value-c-rw-value

What is the reason for C-RW-VALUE?
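A minimal sketch of what the guideline allows, assuming a hypothetical helper read_prefix. Because &mut R implements Read whenever R does, a function that takes the reader by value can still be called with a mutable borrow, so the caller keeps the reader for later use.

```rust
use std::io::{self, Read};

/// Reads up to `n` bytes. The reader is taken by value, as C-RW-VALUE suggests.
fn read_prefix<R: Read>(reader: R, n: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    reader.take(n).read_to_end(&mut buf)?;
    Ok(buf)
}

/// Splits `data` into its first five bytes and the rest.
fn split_input(data: &[u8]) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut cursor = io::Cursor::new(data);
    // Passing `&mut cursor` works because `&mut R` is itself a reader,
    // and the cursor remains usable afterwards.
    let head = read_prefix(&mut cursor, 5)?;
    let mut tail = Vec::new();
    cursor.read_to_end(&mut tail)?;
    Ok((head, tail))
}

fn main() -> io::Result<()> {
    let (head, tail) = split_input(b"hello world")?;
    println!("{:?} {:?}", head, tail);
    Ok(())
}
```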

Type conversions

int <-> [u8]

Converting between byte arrays and integers in Rust
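A minimal sketch using the standard to_le_bytes / from_le_bytes methods; the linked post may take a different approach. The helper names and the choice of little-endian u32 are assumptions made for the illustration.

```rust
/// Integer to little-endian bytes.
fn u32_to_bytes(x: u32) -> [u8; 4] {
    x.to_le_bytes()
}

/// Little-endian bytes back to an integer.
fn bytes_to_u32(b: [u8; 4]) -> u32 {
    u32::from_le_bytes(b)
}

/// Decodes the first four bytes of a slice, or None if it is too short.
fn slice_to_u32(b: &[u8]) -> Option<u32> {
    let mut arr = [0u8; 4];
    arr.copy_from_slice(b.get(..4)?);
    Some(u32::from_le_bytes(arr))
}

fn main() {
    let bytes = u32_to_bytes(0x1234_5678);
    println!("{:x?} {:#x}", bytes, bytes_to_u32(bytes));
    println!("{:?}", slice_to_u32(&[1, 0, 0, 0, 9]));
}
```

The to_be_bytes / from_be_bytes and to_ne_bytes / from_ne_bytes pairs work the same way for big-endian and native byte order.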

Vec<u8> -> String

https://stackoverflow.com/questions/19076719/how-do-i-convert-a-vector-of-bytes-u8-to-a-string

https://doc.rust-lang.org/stable/std/string/struct.String.html#method.from_utf8
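A minimal sketch of the two usual conversions, with hypothetical helper names: String::from_utf8 rejects invalid UTF-8, while String::from_utf8_lossy replaces invalid sequences with U+FFFD.

```rust
/// Strict conversion: None if the bytes are not valid UTF-8.
fn decode_strict(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

/// Lossy conversion: invalid sequences become U+FFFD.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    println!("{:?}", decode_strict(vec![104, 105]));
    println!("{:?}", decode_strict(vec![0xff]));
    println!("{:?}", decode_lossy(&[104, 0xff]));
}
```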

Vec<T> -> [T; N]

try_into: https://stackoverflow.com/questions/29570607/is-there-a-good-way-to-convert-a-vect-to-an-array
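A minimal sketch of the try_into conversion, assuming a fixed target length of 3 and a hypothetical helper name. On a length mismatch the original Vec is handed back in the Err variant.

```rust
// Needed before the 2021 edition; from 2021 on, TryInto is in the prelude.
use std::convert::TryInto;

/// Converts a Vec into a fixed-size array, returning the Vec on mismatch.
fn to_array3(v: Vec<i32>) -> Result<[i32; 3], Vec<i32>> {
    v.try_into()
}

fn main() {
    println!("{:?}", to_array3(vec![1, 2, 3]));
    println!("{:?}", to_array3(vec![1, 2]));
}
```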

char -> u8

https://users.rust-lang.org/t/how-to-convert-char-to-u8/50195
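A minimal sketch contrasting a checked conversion with a plain as cast, which silently truncates the code point. The helper name char_to_u8 is hypothetical.

```rust
// Needed before the 2021 edition; from 2021 on, TryFrom is in the prelude.
use std::convert::TryFrom;

/// Checked conversion: None if the code point does not fit in one byte.
fn char_to_u8(c: char) -> Option<u8> {
    u8::try_from(u32::from(c)).ok()
}

fn main() {
    println!("{:?}", char_to_u8('A'));
    println!("{:?}", char_to_u8('\u{4e2d}'));
    // `as` keeps only the lowest 8 bits of the code point.
    println!("{:#x}", '\u{4e2d}' as u8);
}
```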

Converting a C string to String

Source: https://stackoverflow.com/questions/24145823/how-do-i-convert-a-c-string-into-a-rust-string-and-back-via-ffi

use std::ffi::CStr;
use std::os::raw::c_char;

// hello() is assumed to be an extern "C" function returning a NUL-terminated string.
let c_buf: *const c_char = unsafe { hello() };
let c_str: &CStr = unsafe { CStr::from_ptr(c_buf) };
let str_slice: &str = c_str.to_str().unwrap();
let str_buf: String = str_slice.to_owned();  // if necessary

Syntax

Rust for loops


The pattern Err(_) matches every Err, whatever it contains.
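A minimal sketch of a wildcard inside an Err pattern, assuming a hypothetical function describe. The _ ignores the payload, so every Err value takes the same arm.

```rust
/// Classifies a result without looking at what it carries.
fn describe(r: Result<i32, String>) -> &'static str {
    match r {
        Ok(_) => "ok",
        // `_` matches any payload, so this arm covers every Err.
        Err(_) => "error",
    }
}

fn main() {
    println!("{}", describe(Ok(1)));
    println!("{}", describe(Err("disk full".to_string())));
}
```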

https://users.rust-lang.org/t/calling-function-in-struct-field-requires-extra-parenthesis/14214/2
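The thread above is about calling a function stored in a struct field. A minimal sketch, with hypothetical names Op, double and apply: the field access has to be parenthesized, because op.f(x) would be parsed as a call to a method named f.

```rust
/// A struct whose field holds a function pointer.
struct Op {
    f: fn(i32) -> i32,
}

fn double(x: i32) -> i32 {
    x * 2
}

fn apply(op: &Op, x: i32) -> i32 {
    // `op.f(x)` would look for a method called `f`; the parentheses make
    // it a call through the field instead.
    (op.f)(x)
}

fn main() {
    println!("{}", apply(&Op { f: double }, 21));
}
```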

I/O

Reading command-line arguments

use std::io;
use std::env;
use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    let mut args = env::args();
    let arg0 = args.next().unwrap();
    // args.len(): Returns the exact remaining length of the iterator.
    if args.len() != 1 {
        eprintln!("{} dump-file", arg0);
        return Err(Box::new(io::Error::new(
            io::ErrorKind::Other,
            "Invalid arguments",
        )));
    }
    let file_path = args.next().unwrap();
    println!("{}", file_path);
    Ok(())
}

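Reference: "Rust programming tidbits: reading command-line arguments in Rust"

trait

A Rust trait defines which interfaces a type provides. Once a trait is defined, it can be implemented for an existing type: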

trait A {
    fn a() -> i32;
}
impl A for f32 {
    fn a() -> i32 {
        return 2333;
    }
}
fn main() {
    // 2333
    println!("{}", f32::a());
}

Related:

https://stackoverflow.com/questions/44445730/how-to-call-a-method-when-a-trait-and-struct-use-the-same-method-name

https://users.rust-lang.org/t/box-with-a-trait-object-requires-static-lifetime/35261

Associated type

trait A {
    type T;
}

If B: A, T can usually be accessed as B::T. A generic parameter list is a special case: there it is written <B as A>::T. Example:

trait A {
    type T;
}
struct C<B: A, U = <B as A>::T> { a: B, at: U }

Universal function call syntax (UFCS)

Documentation: https://doc.rust-lang.org/reference/expressions/call-expr.html#disambiguating-function-calls

Mainly used to call a method of a specific trait:

<T as TraitA>::method_name(xxx)
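A minimal sketch of the disambiguation, assuming two hypothetical traits Pilot and Wizard that both declare a method called name and are both implemented for the same type.

```rust
trait Pilot {
    fn name(&self) -> String;
}

trait Wizard {
    fn name(&self) -> String;
}

struct Human;

impl Pilot for Human {
    fn name(&self) -> String {
        "pilot".to_string()
    }
}

impl Wizard for Human {
    fn name(&self) -> String {
        "wizard".to_string()
    }
}

/// `h.name()` would be ambiguous here, so each call names its trait.
fn names(h: &Human) -> (String, String) {
    (<Human as Pilot>::name(h), <Human as Wizard>::name(h))
}

fn main() {
    println!("{:?}", names(&Human));
}
```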

Constraining associated types of different types to be equal

This feature is not implemented yet: https://github.com/rust-lang/rust/issues/20041

But there is a workaround: https://stackoverflow.com/questions/66359551/alternative-to-equality-constraints-for-associated-types

FnOnce, FnMut, Fn

https://stackoverflow.com/questions/30177395/when-does-a-closure-implement-fn-fnmut-and-fnonce

However, to build an array of functions, it seems that only the fn type, i.e. plain functions, can be used: https://stackoverflow.com/questions/31736656/how-to-implement-a-vector-array-of-functions-in-rust-when-the-functions-co
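A minimal sketch of two ways to store several functions in one collection, with hypothetical helper names. An array of fn pointers covers plain functions; boxing as Box<dyn Fn> is a common alternative when capturing closures are needed, at the cost of a heap allocation per element.

```rust
fn inc(x: i32) -> i32 {
    x + 1
}

fn double(x: i32) -> i32 {
    x * 2
}

/// Applies every plain function pointer in `fs` to `x`.
fn apply_all(fs: &[fn(i32) -> i32], x: i32) -> Vec<i32> {
    fs.iter().map(|f| f(x)).collect()
}

/// Mixes a plain function and a capturing closure by boxing both.
fn apply_boxed(offset: i32, x: i32) -> Vec<i32> {
    let mut fs: Vec<Box<dyn Fn(i32) -> i32>> = Vec::new();
    fs.push(Box::new(inc));
    // This closure captures `offset`, so it cannot be a plain `fn`.
    fs.push(Box::new(move |v: i32| v + offset));
    fs.iter().map(|f| f(x)).collect()
}

fn main() {
    let table: [fn(i32) -> i32; 2] = [inc, double];
    println!("{:?}", apply_all(&table, 3));
    println!("{:?}", apply_boxed(10, 3));
}
```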

Higher-Rank Trait Bounds (HRTBs)

Official documentation: https://doc.rust-lang.org/nomicon/hrtb.html

Basic syntax: T: for<'a> TraitName<'a>

This means T must satisfy the trait bound for every lifetime 'a. Example:

use std::ops::SubAssign;
fn func<T>(a: &mut T, b: &T)
where
    T: for<'a> SubAssign<&'a T>,
{
    *a -= b;
}
fn main() {
    let mut a = 2;
    let b = 1;
    func(&mut a, &b);
    println!("{}", a);
}

Multithreading

channel

The select! that went with the standard library's mpsc has been deprecated. Consider crossbeam-channel instead: https://docs.rs/crossbeam-channel/latest/crossbeam_channel/

select: https://docs.rs/crossbeam-channel/latest/crossbeam_channel/macro.select.html

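Error handling

Letting main return multiple kinds of Error

use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
    Ok(()) // with this signature, ? in main accepts any error type that implements Error
}

Sending multiple kinds of Error through a channel

Box<dyn Error> is not Send, so it cannot be sent to another thread through a channel. Instead, enumerate the kinds of Error that can occur and hand-write an enum for them; that enum can be sent: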

use std::io;

enum FlushError {
    Bincode(bincode::Error),
    Io(io::Error),
}
impl From<bincode::Error> for FlushError {
    fn from(e: bincode::Error) -> Self {
        Self::Bincode(e)
    }
}
impl From<io::Error> for FlushError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}
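A minimal, standard-library-only sketch of the same pattern in use. WorkerError, parse_positive and run_worker are hypothetical names, and ParseIntError stands in for bincode::Error so that the example needs no external crate. The From impls let ? convert each underlying error, and the enum is Send because both payloads are.

```rust
use std::io;
use std::num::ParseIntError;
use std::sync::mpsc::channel;
use std::thread;

#[derive(Debug)]
enum WorkerError {
    Parse(ParseIntError),
    Io(io::Error),
}

impl From<ParseIntError> for WorkerError {
    fn from(e: ParseIntError) -> Self {
        Self::Parse(e)
    }
}

impl From<io::Error> for WorkerError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}

/// Parses a positive number; zero is reported as an I/O-style error purely
/// to exercise the second variant.
fn parse_positive(s: &str) -> Result<u32, WorkerError> {
    let n: u32 = s.trim().parse()?; // ParseIntError -> WorkerError via From
    if n == 0 {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "zero").into());
    }
    Ok(n)
}

/// Runs the parse in another thread and ships the whole Result back.
fn run_worker(input: &'static str) -> Result<u32, WorkerError> {
    let (tx, rx) = channel();
    thread::spawn(move || tx.send(parse_positive(input)).unwrap());
    rx.recv().unwrap()
}

fn main() {
    println!("{:?}", run_worker("42"));
    println!("{:?}", run_worker("x"));
}
```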

References:

https://fettblog.eu/rust-enums-wrapping-errors/

https://stackoverflow.com/questions/71977024/rust-cannot-send-unwrapped-result-data-across-await-point

Getting mutable references to multiple elements of a Vec

For example, to get mutable references to a[1] and a[3], use an iterator:

fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let mut iter = a.iter_mut();
    let a1 = iter.nth(1).unwrap();
    let a3 = iter.nth(3 - 1 - 1).unwrap();
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}

Alternatively, use the nightly feature get_many_mut:

#![feature(get_many_mut)]
fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let [a1, a3] = a.get_many_mut([1, 3]).unwrap();
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}
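A third option that works on stable is split_at_mut, which divides the slice into two non-overlapping halves. A minimal sketch, assuming a hypothetical helper two_mut that requires i < j:

```rust
/// Returns mutable references to v[i] and v[j]. Panics unless i < j < v.len().
fn two_mut<T>(v: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i < j && j < v.len());
    // `left` covers indices 0..j and `right` covers j.., so the two
    // references can never alias.
    let (left, right) = v.split_at_mut(j);
    (&mut left[i], &mut right[0])
}

fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let (a1, a3) = two_mut(&mut a, 1, 3);
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}
```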

Default values for struct fields

https://stackoverflow.com/questions/19650265/is-there-a-faster-shorter-way-to-initialize-variables-in-a-rust-struct
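A minimal sketch of one common approach: derive Default and fill in the remaining fields with struct update syntax. The Config struct and its fields are hypothetical.

```rust
#[derive(Debug, Default, PartialEq)]
struct Config {
    threads: usize,
    verbose: bool,
    name: String,
}

/// Sets one field explicitly and takes the rest from Default.
fn custom() -> Config {
    Config {
        threads: 4,
        ..Default::default()
    }
}

fn main() {
    println!("{:?}", custom());
}
```

When the derived defaults (0, false, empty string) are not the right ones, Default can be implemented by hand instead of derived.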

Generating random numbers

The rand crate. Documentation: https://docs.rs/rand/latest/rand/

Basic usage: https://docs.rs/rand/latest/rand/#quick-start

Built-in random number generators: https://docs.rs/rand/latest/rand/rngs/index.html

If a fixed seed is needed, rand::rngs::StdRng is usually sufficient. Documentation: https://docs.rs/rand/latest/rand/rngs/struct.StdRng.html

Some built-in distributions: https://docs.rs/rand/latest/rand/distributions/index.html

The commonly used uniform distribution: https://docs.rs/rand/latest/rand/distributions/struct.Uniform.html

lower_bound / upper_bound

https://stackoverflow.com/questions/48575866/how-to-get-the-lower-bound-and-upper-bound-of-an-element-in-a-btreeset
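A minimal sketch of both lookups using BTreeSet::range, following the C++ meaning of the names: lower_bound is the first element >= x, and upper_bound is the first element > x. The helper functions are written here for illustration and are not standard-library methods.

```rust
use std::collections::BTreeSet;
use std::ops::Bound::{Excluded, Unbounded};

/// First element >= x.
fn lower_bound(s: &BTreeSet<i32>, x: i32) -> Option<i32> {
    s.range(x..).next().copied()
}

/// First element > x.
fn upper_bound(s: &BTreeSet<i32>, x: i32) -> Option<i32> {
    s.range((Excluded(x), Unbounded)).next().copied()
}

fn main() {
    let s: BTreeSet<i32> = [1, 3, 5].iter().copied().collect();
    println!("{:?} {:?}", lower_bound(&s, 3), upper_bound(&s, 3));
}
```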

Module

https://doc.rust-lang.org/book/ch07-05-separating-modules-into-different-files.html

Conditional compilation

Official documentation: https://doc.rust-lang.org/reference/conditional-compilation.html

https://stackoverflow.com/questions/29857002/how-to-define-test-only-dependencies

Derive only in tests: #[cfg_attr(test, derive(Deserialize))]. Source: https://www.reddit.com/r/rust/comments/nwywqx/conditionally_derive_for_integration_tests/

Impl only in tests:

#[cfg(test)]
impl Default for Status {

Miscellaneous

https://stackoverflow.com/questions/60253791/why-can-i-not-mutably-borrow-separate-fields-from-a-mutex-guard

https://stackoverflow.com/questions/28185854/how-do-i-test-crates-with-no-std

Unstable features

generic_const_exprs

Projects that need it:

https://github.com/seekstar/counter-timer-cpp

Combined with array-macro, timers can be stored in an array instead of a Vec

RFC

Multiple Attributes in an Attribute Container (postponed)

Support for types that must not be dropped: https://github.com/rust-lang/rfcs/pull/776 (postponed)

Improving Entry API to get the keys back when they are unused

Known issues

Non-lexical lifetimes (NLL)

Source: https://blog.rust-lang.org/2022/08/05/nll-by-default.html

fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
    if let Some(s) = vec.last() { // borrows vec
        // returning s here forces vec to be borrowed
        // for the rest of the function, even though it
        // shouldn't have to be
        return s;
    }

    // Because vec is borrowed, this call to vec.push gives
    // an error!
    vec.push("".to_string()); // ERROR
    vec.last().unwrap()
}

error[E0502]: cannot borrow `*vec` as mutable because it is also borrowed as immutable
  --> a.rs:11:5
   |
1  | fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
   |                 -- lifetime `'a` defined here
2  |     if let Some(s) = vec.last() { // borrows vec
   |                      ---------- immutable borrow occurs here
...
6  |         return s;
   |                - returning this value requires that `*vec` is borrowed for `'a`
...
11 |     vec.push("".to_string()); // ERROR
   |     ^^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here

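After s borrows vec, s is returned only conditionally, yet the compiler extends the borrow of vec to every branch. The branch that never borrows vec is therefore also treated as borrowing it, and compilation fails.

Reportedly the next-generation borrow checker, Polonius, will solve this. For now the only workaround is to delay the borrow of vec: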

fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
    if !vec.is_empty() {
        let s = vec.last().unwrap(); // borrows vec
        return s; // extends the borrow
    }

    // In this branch, the borrow has never happened, so even
    // though it is extended, it doesn't cover this call;
    // the code compiles.
    //
    // Note the subtle difference with the previous example:
    // in that code, the borrow *always* happened, but it was
    // only *conditionally* returned (but the compiler lost track
    // of the fact that it was a conditional return).
    //
    // In this example, the *borrow itself* is conditional.
    vec.push("".to_string());
    vec.last().unwrap()
}

fn main() { }

Complex code cannot be executed while debugging

https://stackoverflow.com/questions/68232945/execute-a-statement-while-debugging-in-rust

Raw pointers are !Sync and !Send

https://doc.rust-lang.org/nomicon/send-and-sync.html

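The main purpose is to prevent a struct that contains raw pointers from being automatically marked thread-safe.

So if raw pointers must be shared between threads, and access to what they point to is already protected by concurrency control, a wrapper can be written: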

struct ThreadSafePtr<T>(*mut T);
unsafe impl<T> Send for ThreadSafePtr<T> {}
unsafe impl<T> Sync for ThreadSafePtr<T> {}

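In my opinion, raw pointers themselves should be thread-safe, and the compiler should instead refuse to automatically mark a struct containing raw pointers as thread-safe.

Related discussion: https://internals.rust-lang.org/t/shouldnt-pointers-be-send-sync-or/8818

drop takes a mutable reference rather than ownership

https://stackoverflow.com/questions/30905826/why-does-drop-take-mut-self-instead-of-self

This prevents the compiler from automatically calling drop again at the end of drop.

If a field needs to be consumed in drop, put the field in an Option. Alternatively, wrap the field in ManuallyDrop and take it in drop, which requires unsafe: https://users.rust-lang.org/t/can-drop-handler-take-ownership-of-a-field/74301/7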
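A minimal sketch of the Option approach, assuming a hypothetical Worker type that owns a thread handle. JoinHandle::join takes self by value, so drop moves the handle out of &mut self with Option::take.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Owns a background thread and joins it when dropped.
struct Worker {
    // Option so that drop(&mut self) can move the handle out with take().
    handle: Option<JoinHandle<()>>,
}

impl Drop for Worker {
    fn drop(&mut self) {
        // take() leaves None behind and yields the owned handle.
        if let Some(handle) = self.handle.take() {
            handle.join().unwrap();
        }
    }
}

/// Returns true if the thread had finished by the time drop returned.
fn drop_joins_thread() -> bool {
    let done = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&done);
    let worker = Worker {
        handle: Some(thread::spawn(move || flag.store(true, Ordering::SeqCst))),
    };
    drop(worker); // runs Worker::drop, which joins the thread
    done.load(Ordering::SeqCst)
}

fn main() {
    println!("{}", drop_joins_thread());
}
```

The cost of this design is that every other use of the field has to handle the None case, even though it only occurs during drop.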

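I think the best design would be for drop to take ownership, with the compiler special-casing it so that drop is not called again at the end of drop. But the Rust core developers consider that this would require too many compiler changes: https://github.com/rust-lang/rust/issues/4330

With a custom Drop::drop, ownership of an individual field cannot be taken

Copying a mutable reference field of a struct mutably borrows the struct

For example: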

struct S<'a> {
    m: &'a mut i32,
}
impl<'a> S<'a> {
    fn f1<'b>(&'b mut self) -> &'a mut i32 {
        let new_m: &'a mut i32 = self.m;
        new_m
    }
}
fn f2(m: &mut i32) -> &mut i32 {
    let mut s = S { m };
    s.f1()
}
fn main() {
    let mut m = 2;
    *f2(&mut m) = 3;
    println!("{}", m);
}

This produces an error:

error: lifetime may not live long enough
 --> test.rs:6:20
  |
4 | impl<'a> S<'a> {
  |      -- lifetime `'a` defined here
5 |     fn f1<'b>(&'b mut self) -> &'a mut i32 {
  |           -- lifetime `'b` defined here
6 |         let new_m: &'a mut i32 = self.m;
  |                    ^^^^^^^^^^^ type annotation requires that `'b` must outlive `'a`
  |
  = help: consider adding the following bound: `'b: 'a`

error: aborting due to 1 previous error

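Clearly we cannot add 'b: 'a, because s is a local variable and its lifetime 'b is shorter than 'a.

The error arises because let new_m = self.m is not a plain copy: it is a reborrow that hands the write permission for *self.m over to new_m. The compiler must guarantee that self is neither read nor written until that permission is returned to self. So it makes new_m mutably borrow self and relies on the borrow mechanism to enforce this, and a mutable borrow of self by new_m requires self to outlive new_m.

I think a new concept could be introduced: mutability transfer. At let new_m = self.m, we would say that the mutability of self is transferred to new_m. While an object is in the mutability-transferred state, it may be neither read nor written. This would avoid affecting the lifetime of new_m.

For now, the only option in this situation is to have f1 consume self: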

struct S<'a> {
    m: &'a mut i32,
}
impl<'a> S<'a> {
    fn f1(self) -> &'a mut i32 {
        let new_m: &'a mut i32 = self.m;
        new_m
    }
}
fn f2(m: &mut i32) -> &mut i32 {
    let s = S { m };
    s.f1()
}
fn main() {
    let mut m = 2;
    *f2(&mut m) = 3;
    println!("{}", m);
}

Note that copying an immutable reference field is a real copy and does not need to borrow the whole struct, so this problem does not occur. For example, the following code compiles:

struct S<'a> {
    m: &'a i32,
}
impl<'a> S<'a> {
    fn f1<'b>(&'b self) -> &'a i32 {
        let new_m: &'a i32 = self.m;
        new_m
    }
}
fn f2(m: &i32) -> &i32 {
    let s = S { m };
    s.f1()
}
fn main() {
    let m = 2;
    println!("{}", f2(&m));
}

However, if S::m is changed to a mutable reference:

struct S<'a> {
    m: &'a mut i32,
}
impl<'a> S<'a> {
    fn f1<'b>(&'b self) -> &'a i32 {
        let new_m: &'a i32 = self.m;
        new_m
    }
}
fn f2(m: &mut i32) -> &i32 {
    let s = S { m };
    s.f1()
}
fn main() {
    let mut m = 2;
    println!("{}", f2(&mut m));
}

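Even when it is copied into an immutable reference, the write permission still has to be handed over, so the whole of self must be borrowed, leading to the same lifetime problem as above: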

error: lifetime may not live long enough
 --> test.rs:6:20
  |
4 | impl<'a> S<'a> {
  |      -- lifetime `'a` defined here
5 |     fn f1<'b>(&'b self) -> &'a i32 {
  |           -- lifetime `'b` defined here
6 |         let new_m: &'a i32 = self.m;
  |                    ^^^^^^^ type annotation requires that `'b` must outlive `'a`
  |
  = help: consider adding the following bound: `'b: 'a`

error: aborting due to 1 previous error

Again, the only fix is to have f1 consume self:

fn f1<'b>(self) -> &'a i32 {