¶ Installation
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
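After installing Rust, run rustup doc and the documentation will open in the browser. Click The Rust Programming Language there to read the web version of the introductory book.
Upgrade: rustup update
Install the nightly toolchain: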
rustup toolchain install nightly
References:
https://rust-lang.github.io/rustup/basics.html#keeping-rust-up-to-date
https://rust-lang.github.io/rustup/concepts/channels.html
https://stackoverflow.com/questions/66681150/how-to-tell-cargo-to-use-nightly
¶ cargo
¶ Documentation
cargo doc --open
This generates the project's documentation and opens it in the browser.
¶ Creating a new project
cargo new <project-name>
¶ Cargo.toml
¶ version
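Specifies the version of the crate. If the crate is hosted on GitHub and several consecutive commits carry the same version, only the earliest of those commits is actually used as that version of the crate. So if you push a version to GitHub and then make further changes, you need to bump the version number before users can pick up the new changes.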
¶ [dev-dependencies]
Defines dependencies that are used only in tests: https://stackoverflow.com/questions/29857002/how-to-define-test-only-dependencies
¶ Blocking waiting for file lock on package cache
rm -rf ~/.cargo/registry/index/*
rm ~/.cargo/.package-cache
¶ publish
cargo login
cargo publish
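¶ Standard library
¶ String methods
- trim
Removes leading and trailing whitespace.
- parse
Converts the string to a specific type. The target type is inferred from context, for example from the annotated type of the variable being assigned to, or is given explicitly with the turbofish, as in parse::<i32>().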
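A small sketch of trim and parse together. The helper name parse_trimmed is made up for illustration; it shows the target type of parse being picked from the function's return type, from a variable annotation, and from the turbofish.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: trim surrounding whitespace, then parse as i32.
fn parse_trimmed(s: &str) -> Result<i32, ParseIntError> {
    // The declared return type tells `parse` which type to produce.
    s.trim().parse()
}

fn main() {
    // Target type taken from the variable's annotation.
    let a: i32 = " 42\n".trim().parse().unwrap();
    // Target type given explicitly with the turbofish.
    let b = "7".parse::<u64>().unwrap();
    println!("{} {}", a, b);
    println!("{:?}", parse_trimmed("  -5 "));
}
```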
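¶ Sorting
Sorting is either stable or unstable. A stable sort keeps equal elements in their original relative order; an unstable sort does not guarantee this.
For a stable sort, use sort_by: https://doc.rust-lang.org/std/primitive.slice.html#method.sort_by
For an unstable sort, use sort_unstable_by: https://doc.rust-lang.org/std/primitive.slice.html#method.sort_unstable_by
Both have a worst-case time complexity of O(n log n).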
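To make the difference concrete, here is a short sketch that sorts (key, label) pairs by key only. The function names are illustrative. With sort_by, pairs with equal keys keep their input order; with sort_unstable_by only the key order is guaranteed.

```rust
/// Sort by key only; equal keys keep their input order (stable).
fn sort_stable(mut v: Vec<(u32, char)>) -> Vec<(u32, char)> {
    v.sort_by(|a, b| a.0.cmp(&b.0));
    v
}

/// Same comparison; equal keys may end up in any relative order.
fn sort_unstable(mut v: Vec<(u32, char)>) -> Vec<(u32, char)> {
    v.sort_unstable_by(|a, b| a.0.cmp(&b.0));
    v
}

fn main() {
    let input = vec![(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')];
    println!("{:?}", sort_stable(input.clone()));
    println!("{:?}", sort_unstable(input));
}
```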
¶ Entry API
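Take the Entry API of BTreeMap as an example. For basic usage, see the standard library documentation: https://doc.rust-lang.org/stable/std/collections/struct.BTreeMap.html#method.entry
The basic and_modify and or_insert_with interfaces have one problem: although the two are mutually exclusive, ownership of a single object cannot be passed to both of them. One workaround, if the object has an empty state, is to first call or_insert to insert an empty value and then apply the modification. Source: https://users.rust-lang.org/t/hashmap-entry-api-and-ownership/81368
A more general approach is to match on the returned Entry to tell whether it is Occupied or Vacant, so that the compiler knows the two cases are mutually exclusive.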
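A minimal sketch of the match approach, assuming a map from u32 to Vec<String>; the function name add is made up. The owned value is moved in exactly one of the two arms, which the closure-based and_modify / or_insert_with chain cannot express.

```rust
use std::collections::btree_map::{BTreeMap, Entry};

/// Append `value` to the list at `key`, creating the list if needed.
fn add(m: &mut BTreeMap<u32, Vec<String>>, key: u32, value: String) {
    match m.entry(key) {
        // `value` is moved here ...
        Entry::Occupied(mut e) => e.get_mut().push(value),
        // ... or here, never both, so the borrow checker accepts it.
        Entry::Vacant(e) => {
            e.insert(vec![value]);
        }
    }
}

fn main() {
    let mut m = BTreeMap::new();
    add(&mut m, 1, "a".to_string());
    add(&mut m, 1, "b".to_string());
    println!("{:?}", m);
}
```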
Another example of match: modify and optionally remove
use std::collections::btree_map::{self, BTreeMap};
fn pop(m: &mut BTreeMap<u32, Vec<u32>>, key: u32) -> Option<u32> {
    match m.entry(key) {
        btree_map::Entry::Occupied(mut entry) => {
            let values = entry.get_mut();
            let ret = values.pop();
            if values.is_empty() {
                entry.remove();
            }
            ret
        }
        btree_map::Entry::Vacant(_) => {
            None
        }
    }
}
fn main() {
    let mut m: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    m.insert(1, vec![2, 3]);
    assert_eq!(pop(&mut m, 1), Some(3));
    assert_eq!(pop(&mut m, 1), Some(2));
    assert!(m.is_empty());
}
References:
https://doc.rust-lang.org/stable/std/collections/btree_map/enum.Entry.html
https://doc.rust-lang.org/stable/std/collections/btree_map/struct.VacantEntry.html
¶ mpsc
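Requirement: read data in one thread and send it to another thread for processing.
My approach: send and receive through an mpsc channel.
Pitfall: a channel created by mpsc::channel never blocks the sender; its buffer is unbounded. Reading turned out to be far faster than writing, so a large amount of memory was consumed.
Solution: use sync_channel:
pub fn sync_channel<T>(bound: usize) -> (SyncSender<T>, Receiver<T>)
The bound parameter is the number of messages that can be buffered, not a size in bytes. Once the buffer is full, send blocks until the receiver takes a message.
Documentation: https://doc.rust-lang.org/stable/std/sync/mpsc/index.html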
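A minimal sketch of the bounded channel, with a made-up function name and a producer that just sends integers. With a bound of 4, the producer can be at most 4 messages ahead of the consumer, so memory use stays limited.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Send 0..n through a channel that buffers at most `bound` messages
/// and return the sum computed on the receiving side.
fn sum_through_bounded_channel(n: u64, bound: usize) -> u64 {
    let (tx, rx) = sync_channel::<u64>(bound);
    let producer = thread::spawn(move || {
        for i in 0..n {
            // Blocks while `bound` messages are already queued.
            tx.send(i).unwrap();
        }
        // `tx` is dropped here, which ends the receiver's iteration.
    });
    let total: u64 = rx.iter().sum();
    producer.join().unwrap();
    total
}

fn main() {
    println!("{}", sum_through_bounded_channel(1000, 4));
}
```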
¶ Crates
¶ dyn_struct
https://www.reddit.com/r/rust/comments/qbj84o/dyn_struct_create_types_whose_size_is_determined/
https://github.com/nolanderc/dyn_struct
¶ enum_iterator
Can get the number of possible values of an enum.
¶ num-derive
Can convert an enum to a primitive type.
¶ serde
- rust serde deserialize borrowed member
- Saving and loading a struct with a borrowed field in Rust
https://github.com/serde-rs/json#operating-on-untyped-json-values
¶ Specifying a field name
#[derive(Deserialize)]
struct Info {
    #[serde(rename = "num-run-op")]
    num_run_op: usize,
}
This way, when JSON is read, num-run-op in the JSON is mapped to num_run_op.
Documentation: https://serde.rs/field-attrs.html
¶ clap
Official documentation: https://docs.rs/clap/latest/clap/
Using derive: https://docs.rs/clap/latest/clap/_derive/index.html
¶ #[arg(...)]
¶ short
Automatically uses the first letter of the field name as the flag name.
You can also write short = 'x' to specify the flag name.
¶ long
Automatically uses the field name, with underscores replaced by -, as the flag name. You can also write long = "xxx" to specify the flag name.
¶ default_value_t
default_value_t [= <expr>]
The _t suffix probably stands for "type".
¶ Positional arguments
A field without short or similar attributes is a positional argument by default.
¶ Optional arguments
Declare the field as Option<type>.
¶ API guidelines
¶ Generic reader/writer functions take R: Read and W: Write by value (C-RW-VALUE)
What is the reason for C-RW-VALUE?
¶ Type conversions
¶ int <-> [u8]
Converting between byte arrays and integers in Rust
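As a quick illustration (not necessarily what the linked post does), the standard library's to_le_bytes / from_le_bytes convert between integers and fixed-size byte arrays; to_be_bytes and from_be_bytes are the big-endian counterparts. The helper read_u32_le is a made-up name.

```rust
use std::convert::TryInto;

/// Read a little-endian u32 from the first four bytes of `buf`.
/// Returns None if the slice is shorter than four bytes.
fn read_u32_le(buf: &[u8]) -> Option<u32> {
    let bytes: [u8; 4] = buf.get(..4)?.try_into().ok()?;
    Some(u32::from_le_bytes(bytes))
}

fn main() {
    let x: u32 = 0x1234_5678;
    let le = x.to_le_bytes(); // least significant byte first
    let be = x.to_be_bytes(); // most significant byte first
    println!("{:02x?} {:02x?}", le, be);
    println!("{:?}", read_u32_le(&le));
}
```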
¶ Vec<u8> -> String
https://stackoverflow.com/questions/19076719/how-do-i-convert-a-vector-of-bytes-u8-to-a-string
https://doc.rust-lang.org/stable/std/string/struct.String.html#method.from_utf8
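A short sketch of the two usual options; the wrapper names are illustrative. String::from_utf8 fails on invalid UTF-8, while String::from_utf8_lossy replaces invalid sequences with U+FFFD.

```rust
use std::string::FromUtf8Error;

/// Strict conversion: returns an error on invalid UTF-8.
fn bytes_to_string(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes)
}

/// Lossy conversion: invalid sequences become U+FFFD.
fn bytes_to_string_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    println!("{:?}", bytes_to_string(vec![104, 105]));
    println!("{:?}", bytes_to_string_lossy(&[104, 0xff]));
}
```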
¶ Vec<T> -> [T; N]
Use try_into: https://stackoverflow.com/questions/29570607/is-there-a-good-way-to-convert-a-vect-to-an-array
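A minimal sketch with a made-up helper name. The conversion succeeds only when the Vec has exactly N elements; otherwise the original Vec is handed back as the error value.

```rust
use std::convert::TryInto;

/// Convert a Vec into a fixed-size array of three elements.
fn to_array3(v: Vec<i32>) -> Result<[i32; 3], Vec<i32>> {
    v.try_into()
}

fn main() {
    println!("{:?}", to_array3(vec![1, 2, 3]));
    println!("{:?}", to_array3(vec![1, 2]));
}
```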
¶ char -> u8
https://users.rust-lang.org/t/how-to-convert-char-to-u8/50195
¶ C string to String
use std::ffi::CStr;
use std::os::raw::c_char;
// `hello` stands for an external C function that returns a NUL-terminated string.
let c_buf: *const c_char = unsafe { hello() };
let c_str: &CStr = unsafe { CStr::from_ptr(c_buf) };
let str_slice: &str = c_str.to_str().unwrap();
let str_buf: String = str_slice.to_owned(); // if necessary
¶ Syntax
_ is the wildcard pattern. For example, Err(_) matches every Err, regardless of what it contains.
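For instance, in the following sketch (the function name is illustrative) the Err(_) arm catches every parse error without binding or inspecting it:

```rust
/// Parse an integer, falling back to 0 on any error.
fn parse_or_zero(s: &str) -> i32 {
    match s.parse::<i32>() {
        Ok(n) => n,
        // `_` ignores the error value entirely.
        Err(_) => 0,
    }
}

fn main() {
    println!("{} {}", parse_or_zero("12"), parse_or_zero("abc"));
}
```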
https://users.rust-lang.org/t/calling-function-in-struct-field-requires-extra-parenthesis/14214/2
¶ I/O
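- File operations in Rust
- Formatted printing to a file in Rust
- A fairly robust way to read input in Rust for competitive programming
- Detecting EOF in Rust
- Reading character by character with BufReader in Rust
- Reading multiple integers from one line in Rust
- Reading an array from one line in Rust
- Speeding up stdin reads with BufReader in Rust
- Format control in Rust
- Printing with fixed width and leading zeros in Rust
¶ Reading command-line arguments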
use std::io;
use std::env;
use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
    let mut args = env::args();
    let arg0 = args.next().unwrap();
    // args.len(): Returns the exact remaining length of the iterator.
    if args.len() != 1 {
        eprintln!("{} dump-file", arg0);
        return Err(Box::new(io::Error::new(
            io::ErrorKind::Other,
            "Invalid arguments",
        )));
    }
    let file_path = args.next().unwrap();
    println!("{}", file_path);
    Ok(())
}
¶ trait
A Rust trait in effect defines which interfaces a type has. Once a trait is defined, you can implement it for existing types:
trait A {
    fn a() -> i32;
}
impl A for f32 {
    fn a() -> i32 {
        return 2333;
    }
}
fn main() {
    // 2333
    println!("{}", f32::a());
}
Related:
https://users.rust-lang.org/t/box-with-a-trait-object-requires-static-lifetime/35261
¶ Associated type
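trait A {
    type T;
}
If B: A, T can usually be accessed as B::T. Generic parameter lists are a special case and need the fully qualified form <B as A>::T. Example:
trait A {
    type T;
}
// The second parameter is named U here to avoid clashing with the struct name C.
struct C<B: A, U = <B as A>::T> { a: B, at: U }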
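A runnable sketch of the same idea. The names Source, Seven, Pair and make_pair are made up for illustration. Pair<X> relies on the default of its second parameter, which is written in the qualified form.

```rust
trait Source {
    type Item;
    fn get(&self) -> Self::Item;
}

struct Seven;

impl Source for Seven {
    type Item = u8;
    fn get(&self) -> u8 {
        7
    }
}

// The default of `U` uses the qualified form `<X as Source>::Item`.
struct Pair<X: Source, U = <X as Source>::Item> {
    source: X,
    value: U,
}

// In a function signature or body, the short form `X::Item` is enough.
fn make_pair<X: Source>(source: X) -> Pair<X> {
    let value: X::Item = source.get();
    Pair { source, value }
}

fn main() {
    let p = make_pair(Seven);
    println!("{} {}", p.value, p.source.get());
}
```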
¶ Universal call syntax
Documentation: https://doc.rust-lang.org/reference/expressions/call-expr.html#disambiguating-function-calls
Mainly used to call a method of a specific trait:
<T as TraitA>::method_name(xxx)
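A small sketch where the qualified form is needed: two hypothetical traits, Pilot and Wizard, both define name, so a plain h.name() would be ambiguous.

```rust
trait Pilot {
    fn name(&self) -> String;
}

trait Wizard {
    fn name(&self) -> String;
}

struct Human;

impl Pilot for Human {
    fn name(&self) -> String {
        "pilot".to_string()
    }
}

impl Wizard for Human {
    fn name(&self) -> String {
        "wizard".to_string()
    }
}

/// Call both `name` methods by naming the trait explicitly.
fn names(h: &Human) -> (String, String) {
    (<Human as Pilot>::name(h), <Human as Wizard>::name(h))
}

fn main() {
    println!("{:?}", names(&Human));
}
```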
¶ Constraining associated types of different types to be equal
This feature has not been implemented yet: https://github.com/rust-lang/rust/issues/20041
¶ FnOnce, FnMut, Fn
https://stackoverflow.com/questions/30177395/when-does-a-closure-implement-fn-fnmut-and-fnonce
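However, to build an array of functions, it seems that only the fn type, i.e. plain functions, can be used: https://stackoverflow.com/questions/31736656/how-to-implement-a-vector-array-of-functions-in-rust-when-the-functions-co
Closures that capture variables do not coerce to fn; storing them in one collection requires trait objects such as Box<dyn Fn(...)>.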
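A sketch of both options, with made-up function names. An array of fn pointers accepts plain functions and closures that capture nothing; once a closure captures a variable (offset below), the elements have to be boxed trait objects instead.

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}

fn double(x: i32) -> i32 {
    x * 2
}

/// Apply every entry of a table of plain `fn` pointers.
fn apply_all(x: i32) -> Vec<i32> {
    // A closure that captures nothing also coerces to a `fn` pointer.
    let table: [fn(i32) -> i32; 3] = [add_one, double, |v: i32| v - 1];
    table.iter().map(|f| f(x)).collect()
}

/// Capturing closures need trait objects instead of `fn`.
fn apply_boxed(x: i32, offset: i32) -> Vec<i32> {
    let table: Vec<Box<dyn Fn(i32) -> i32>> = vec![
        Box::new(add_one),
        Box::new(move |v: i32| v + offset),
    ];
    table.iter().map(|f| f(x)).collect()
}

fn main() {
    println!("{:?} {:?}", apply_all(3), apply_boxed(3, 10));
}
```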
¶ Higher-Rank Trait Bounds (HRTBs)
Official documentation: https://doc.rust-lang.org/nomicon/hrtb.html
Basic syntax: T: for<'a> TraitName<'a>
It means that T must satisfy the trait bound for every lifetime. Example:
use std::ops::SubAssign;
fn func<T>(a: &mut T, b: &T)
where
    T: for<'a> SubAssign<&'a T>,
{
    *a -= b;
}
fn main() {
    let mut a = 2;
    let b = 1;
    func(&mut a, &b);
    println!("{}", a);
}
¶ Multithreading
Read-write lock: https://doc.rust-lang.org/stable/std/sync/struct.RwLock.html
- rust scoped thread pool
- rust scoped thread
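As a rough sketch of how the two fit together (this uses the standard library's std::thread::scope, which may differ from what the linked posts describe; parallel_sum is a made-up name): scoped threads may borrow local data, and the RwLock guards the shared total.

```rust
use std::sync::RwLock;
use std::thread;

/// Sum each chunk on its own scoped thread and add it to a shared total.
fn parallel_sum(chunks: &[Vec<u64>]) -> u64 {
    let total = RwLock::new(0u64);
    thread::scope(|s| {
        for chunk in chunks {
            let total = &total;
            // Scoped threads can borrow `chunk` and `total` directly.
            s.spawn(move || {
                let partial: u64 = chunk.iter().sum();
                *total.write().unwrap() += partial;
            });
        }
        // All spawned threads are joined when the scope ends.
    });
    total.into_inner().unwrap()
}

fn main() {
    println!("{}", parallel_sum(&[vec![1, 2], vec![3], vec![4, 5, 6]]));
}
```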
¶ channel
The select! macro that accompanied mpsc in the standard library is deprecated. Consider using crossbeam-channel instead:
https://docs.rs/crossbeam-channel/latest/crossbeam_channel/
select: https://docs.rs/crossbeam-channel/latest/crossbeam_channel/macro.select.html
¶ Error handling
¶ Letting main handle multiple error types
use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
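¶ Sending multiple error types through a channel
Box<dyn Error> cannot be sent to another thread through a channel, because it does not implement Send. You can enumerate the kinds of error involved and hand-write an enum to represent them; the enum can then be sent, as long as every wrapped error type is itself Send:
enum FlushError {
    Bincode(bincode::Error),
    Io(io::Error),
}
impl From<bincode::Error> for FlushError {
    fn from(e: bincode::Error) -> Self {
        Self::Bincode(e)
    }
}
impl From<io::Error> for FlushError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}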
References:
https://fettblog.eu/rust-enums-wrapping-errors/
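A self-contained sketch of the same pattern that uses only standard-library error types, so it runs without bincode. WorkerError, parse_positive and run_worker are made-up names. The From impls let ? and .into() convert each concrete error into the enum, and the enum is Send because both wrapped types are.

```rust
use std::io;
use std::num::ParseIntError;
use std::sync::mpsc;
use std::thread;

#[derive(Debug)]
enum WorkerError {
    Parse(ParseIntError),
    Io(io::Error),
}

impl From<ParseIntError> for WorkerError {
    fn from(e: ParseIntError) -> Self {
        Self::Parse(e)
    }
}

impl From<io::Error> for WorkerError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}

/// Parse a positive integer; zero is reported as an io::Error for the demo.
fn parse_positive(s: &str) -> Result<u32, WorkerError> {
    let n: u32 = s.trim().parse()?; // ParseIntError -> WorkerError
    if n == 0 {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "zero").into());
    }
    Ok(n)
}

/// Run the parser on another thread and receive its Result over a channel.
fn run_worker(input: &'static str) -> Result<u32, WorkerError> {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        tx.send(parse_positive(input)).unwrap();
    });
    let result = rx.recv().unwrap();
    handle.join().unwrap();
    result
}

fn main() {
    println!("{:?}", run_worker("7"));
    println!("{:?}", run_worker("abc"));
}
```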
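¶ Getting mutable references to multiple elements of a Vec
For example, to get mutable references to a[1] and a[3], you can use an iterator:
fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let mut iter = a.iter_mut();
    let a1 = iter.nth(1).unwrap();
    // nth(1) consumed indices 0 and 1, so index 3 is now 3 - 1 - 1 steps ahead.
    let a3 = iter.nth(3 - 1 - 1).unwrap();
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}
You can also use the nightly feature get_many_mut (newer toolchains provide a stable equivalent named get_disjoint_mut):
#![feature(get_many_mut)]
fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let [a1, a3] = a.get_many_mut([1, 3]).unwrap();
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}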
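For exactly two elements, another option on stable Rust is split_at_mut, which divides the slice into two non-overlapping halves so that one mutable reference can be taken from each. A sketch, with a made-up helper name:

```rust
/// Return mutable references to `v[i]` and `v[j]`, requiring i < j.
fn two_mut<T>(v: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i < j && j < v.len());
    // `left` covers indices 0..j, `right` starts at index j.
    let (left, right) = v.split_at_mut(j);
    (&mut left[i], &mut right[0])
}

fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let (a1, a3) = two_mut(&mut a, 1, 3);
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}
```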
¶ Default values for struct fields
¶ Generating random numbers
Use the rand crate. Documentation: https://docs.rs/rand/latest/rand/
Basic usage: https://docs.rs/rand/latest/rand/#quick-start
Built-in random number generators: https://docs.rs/rand/latest/rand/rngs/index.html
If you need to specify a seed, rand::rngs::StdRng is usually enough. Documentation: https://docs.rs/rand/latest/rand/rngs/struct.StdRng.html
Some built-in distributions: https://docs.rs/rand/latest/rand/distributions/index.html
The commonly used uniform distribution: https://docs.rs/rand/latest/rand/distributions/struct.Uniform.html
¶ lower_bound / upper_bound
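One possible way to get the equivalents of C++ lower_bound / upper_bound on a sorted slice is partition_point from the standard library. This is a sketch with illustrative function names, not necessarily the approach this section originally linked to.

```rust
/// Index of the first element that is >= x (like C++ lower_bound).
fn lower_bound(v: &[i32], x: i32) -> usize {
    v.partition_point(|&e| e < x)
}

/// Index of the first element that is > x (like C++ upper_bound).
fn upper_bound(v: &[i32], x: i32) -> usize {
    v.partition_point(|&e| e <= x)
}

fn main() {
    let v = [1, 2, 2, 2, 5];
    println!("{} {}", lower_bound(&v, 2), upper_bound(&v, 2));
}
```

The slice must already be sorted; partition_point performs a binary search over the predicate.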
¶ Module
https://doc.rust-lang.org/book/ch07-05-separating-modules-into-different-files.html
¶ Conditional compilation
Official documentation: https://doc.rust-lang.org/reference/conditional-compilation.html
https://stackoverflow.com/questions/29857002/how-to-define-test-only-dependencies
Derive only when testing:
#[cfg_attr(test, derive(Deserialize))]
Source: https://www.reddit.com/r/rust/comments/nwywqx/conditionally_derive_for_integration_tests/
Implement only when testing:
#[cfg(test)]
impl Default for Status {
¶ Miscellaneous
https://stackoverflow.com/questions/28185854/how-do-i-test-crates-with-no-std
¶ Unstable features
¶ generic_const_exprs
Projects that need it:
https://github.com/seekstar/counter-timer-cpp
Combined with array-macro, the timers can be stored in an array instead of a Vec.
¶ RFC
Multiple Attributes in an Attribute Container (postponed)
Support for types that must not be dropped: [https://github.com/rust-lang/rfcs/pull/776] (postponed)
Improving Entry API to get the keys back when they are unused
¶ Known issues
¶ Non-lexical lifetimes (NLL)
Source: https://blog.rust-lang.org/2022/08/05/nll-by-default.html
fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
    if let Some(s) = vec.last() { // borrows vec
        // returning s here forces vec to be borrowed
        // for the rest of the function, even though it
        // shouldn't have to be
        return s;
    }
    // Because vec is borrowed, this call to vec.push gives
    // an error!
    vec.push("".to_string()); // ERROR
    vec.last().unwrap()
}
error[E0502]: cannot borrow `*vec` as mutable because it is also borrowed as immutable
--> a.rs:11:5
|
1 | fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
| -- lifetime `'a` defined here
2 | if let Some(s) = vec.last() { // borrows vec
| ---------- immutable borrow occurs here
...
6 | return s;
| - returning this value requires that `*vec` is borrowed for `'a`
...
11 | vec.push("".to_string()); // ERROR
| ^^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
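This happens because, after s borrows vec, s is only conditionally returned, yet the compiler still extends the borrow of vec to every branch. As a result the other branch, which never borrows vec, is also treated as borrowing vec, and compilation fails.
Reportedly Polonius, the next-generation borrow checker, can solve this problem. For now the only workaround is to delay the borrow of vec: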
fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
    if !vec.is_empty() {
        let s = vec.last().unwrap(); // borrows vec
        return s; // extends the borrow
    }
    // In this branch, the borrow has never happened, so even
    // though it is extended, it doesn't cover this call;
    // the code compiles.
    //
    // Note the subtle difference with the previous example:
    // in that code, the borrow *always* happened, but it was
    // only *conditionally* returned (but the compiler lost track
    // of the fact that it was a conditional return).
    //
    // In this example, the *borrow itself* is conditional.
    vec.push("".to_string());
    vec.last().unwrap()
}
fn main() { }
¶ Cannot execute complex code while debugging
https://stackoverflow.com/questions/68232945/execute-a-statement-while-debugging-in-rust
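¶ Raw pointers are !Sync and !Send
https://doc.rust-lang.org/nomicon/send-and-sync.html
The main purpose is to prevent a struct that contains raw pointers from being automatically marked as thread-safe.
So if you need to share a raw pointer between threads, and you can guarantee that access to what the raw pointer refers to is already synchronized, you can write a wrapper: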
struct ThreadSafePtr<T>(*mut T);
unsafe impl<T> Send for ThreadSafePtr<T> {}
unsafe impl<T> Sync for ThreadSafePtr<T> {}
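That said, I think raw pointers themselves should be thread-safe, and the compiler should instead refuse to automatically mark structs containing raw pointers as thread-safe.
Related discussion: https://internals.rust-lang.org/t/shouldnt-pointers-be-send-sync-or/8818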
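A sketch of how such a wrapper might be used, under the assumption that the threads write to disjoint index ranges, which is what makes the unsafe blocks sound here. fill_parallel and the get accessor are made-up names.

```rust
use std::thread;

struct ThreadSafePtr<T>(*mut T);
unsafe impl<T> Send for ThreadSafePtr<T> {}
unsafe impl<T> Sync for ThreadSafePtr<T> {}

impl<T> ThreadSafePtr<T> {
    fn get(&self) -> *mut T {
        self.0
    }
}

/// Fill data[i] = i using two threads that write disjoint halves.
fn fill_parallel(data: &mut [u32]) {
    let len = data.len();
    let mid = len / 2;
    let ptr = ThreadSafePtr(data.as_mut_ptr());
    let ptr = &ptr; // shared reference, allowed because the wrapper is Sync
    thread::scope(|s| {
        s.spawn(move || {
            for i in 0..mid {
                // Sound only because no other thread touches 0..mid.
                unsafe { *ptr.get().add(i) = i as u32 };
            }
        });
        s.spawn(move || {
            for i in mid..len {
                // Sound only because no other thread touches mid..len.
                unsafe { *ptr.get().add(i) = i as u32 };
            }
        });
    });
}

fn main() {
    let mut data = vec![0u32; 8];
    fill_parallel(&mut data);
    println!("{:?}", data);
}
```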
¶ drop takes a mutable reference rather than ownership
https://stackoverflow.com/questions/30905826/why-does-drop-take-mut-self-instead-of-self
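This prevents the compiler from automatically calling drop again at the end of drop.
If you need to consume a field in drop, you can put the field in an Option and take it. Alternatively, wrap the field in ManuallyDrop and call the unsafe ManuallyDrop::take in drop: https://users.rust-lang.org/t/can-drop-handler-take-ownership-of-a-field/74301/7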
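A minimal sketch of the Option approach, using a hypothetical Worker that joins its thread on drop. JoinHandle::join takes self by value, so the handle has to be moved out of &mut self with Option::take.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

struct Worker {
    // Wrapped in Option so that drop can move the handle out.
    handle: Option<JoinHandle<()>>,
}

impl Worker {
    fn spawn(done: Arc<AtomicBool>) -> Worker {
        let handle = thread::spawn(move || done.store(true, Ordering::SeqCst));
        Worker { handle: Some(handle) }
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        // `take` leaves None behind and gives us the owned handle.
        if let Some(handle) = self.handle.take() {
            handle.join().unwrap();
        }
    }
}

/// Returns true if the worker thread finished before drop returned.
fn run_and_drop() -> bool {
    let done = Arc::new(AtomicBool::new(false));
    let worker = Worker::spawn(Arc::clone(&done));
    drop(worker); // joins the thread
    done.load(Ordering::SeqCst)
}

fn main() {
    println!("{}", run_and_drop());
}
```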
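I think the best design would be for drop to take ownership, with the compiler special-casing this so that drop is not called again at the end of drop. However, the Rust core developers consider that this feature would require too many changes to the compiler: https://github.com/rust-lang/rust/issues/4330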
¶ With a custom Drop::drop, you cannot take ownership of an individual field
¶ Copying a struct's mutable reference field mutably borrows the struct
For example:
struct S<'a> {
    m: &'a mut i32,
}
impl<'a> S<'a> {
    fn f1<'b>(&'b mut self) -> &'a mut i32 {
        let new_m: &'a mut i32 = self.m;
        new_m
    }
}
fn f2(m: &mut i32) -> &mut i32 {
    let mut s = S { m };
    s.f1()
}
fn main() {
    let mut m = 2;
    *f2(&mut m) = 3;
    println!("{}", m);
}
This fails with:
error: lifetime may not live long enough
--> test.rs:6:20
|
4 | impl<'a> S<'a> {
| -- lifetime `'a` defined here
5 | fn f1<'b>(&'b mut self) -> &'a mut i32 {
| -- lifetime `'b` defined here
6 | let new_m: &'a mut i32 = self.m;
| ^^^^^^^^^^^ type annotation requires that `'b` must outlive `'a`
|
= help: consider adding the following bound: `'b: 'a`
error: aborting due to 1 previous error
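Obviously we cannot change it to 'b: 'a, because s is a local variable and its lifetime 'b is shorter than 'a.
The error arises because let new_m = self.m is not a plain copy (a &mut reference is not Copy). It is a reborrow that hands the write access to *self.m over to new_m. The compiler has to guarantee that self is neither read nor written until that write access is handed back to self. So the compiler makes new_m mutably borrow self, and the borrow mechanism then enforces the guarantee. But for new_m to mutably borrow self, self must outlive new_m.
I think a new concept could be introduced: mutability transfer. At let new_m = self.m, we would say that the mutability of self is transferred to new_m. While an object is in the mutable transferred state, reading or writing it is not allowed. This would avoid constraining the lifetime of new_m.
For now, the only option in this situation is to let f1 consume self: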
struct S<'a> {
    m: &'a mut i32,
}
impl<'a> S<'a> {
    fn f1(self) -> &'a mut i32 {
        let new_m: &'a mut i32 = self.m;
        new_m
    }
}
fn f2(m: &mut i32) -> &mut i32 {
    let s = S { m };
    s.f1()
}
fn main() {
    let mut m = 2;
    *f2(&mut m) = 3;
    println!("{}", m);
}
It is worth noting that copying an immutable reference field is a true copy and does not need to borrow the whole struct, so the problem does not arise. For example, the following code compiles:
struct S<'a> {
    m: &'a i32,
}
impl<'a> S<'a> {
    fn f1<'b>(&'b self) -> &'a i32 {
        let new_m: &'a i32 = self.m;
        new_m
    }
}
fn f2(m: &i32) -> &i32 {
    let s = S { m };
    s.f1()
}
fn main() {
    let m = 2;
    println!("{}", f2(&m));
}
However, if S::m is changed to a mutable reference:
struct S<'a> {
    m: &'a mut i32,
}
impl<'a> S<'a> {
    fn f1<'b>(&'b self) -> &'a i32 {
        let new_m: &'a i32 = self.m;
        new_m
    }
}
fn f2(m: &mut i32) -> &i32 {
    let s = S { m };
    s.f1()
}
fn main() {
    let mut m = 2;
    println!("{}", f2(&mut m));
}
then even copying it into an immutable reference requires handing over the write access, so the whole self has to be borrowed, which leads to the same lifetime problem as above:
error: lifetime may not live long enough
--> test.rs:6:20
|
4 | impl<'a> S<'a> {
| -- lifetime `'a` defined here
5 | fn f1<'b>(&'b self) -> &'a i32 {
| -- lifetime `'b` defined here
6 | let new_m: &'a i32 = self.m;
| ^^^^^^^ type annotation requires that `'b` must outlive `'a`
|
= help: consider adding the following bound: `'b: 'a`
error: aborting due to 1 previous error
Here too, the only fix is to let f1 consume self:
fn f1<'b>(self) -> &'a i32 {