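Rust by Example
Rust is a modern programming language focused on safety, speed, and concurrency. It achieves all of these goals by guaranteeing memory safety without garbage collection.
Rust by Example (RBE) is a collection of runnable examples that illustrate various Rust concepts and the standard library. To get more out of these examples, install Rust locally and look through the official docs. If you are curious, you can also check out the source code for this site.
Now let's begin!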
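- Hello World - Start with a traditional Hello World program.
- Primitives - Learn about signed integers, unsigned integers, and other primitives.
- Custom Types - `struct` and `enum`.
- Variable Bindings - Mutable bindings, scope, shadowing.
- Types - Learn about defining and converting types.
- Flow of Control - `if`/`else`, `for`, and others.
- Functions - Learn about methods, closures, and higher-order functions.
- Modules - Organize code using modules.
- Crates - A crate is a compilation unit in Rust. Learn to create a library.
- Cargo - Go through some basic features of the official Rust package management tool.
- Attributes - An attribute is metadata applied to some module, crate or item.
- Generics - Learn about writing a function or data type that works for multiple types of arguments.
- Scoping rules - Scopes play an important part in ownership, borrowing, and lifetimes.
- Traits - A trait is a collection of methods defined for an unknown type: `Self`.
- Error handling - Learn the Rust way of handling failures.
- Std library types - Learn about some custom types provided by the `std` library.
- Std misc - More custom types for file handling and threads.
- Testing - All sorts of testing in Rust.
- Meta - Documentation, benchmarking.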
Hello World
This is the source code of a Hello World program written in Rust.
// This is a comment. Comments have no effect on the program.
// Click the 'Run this code' button in the top right corner to run this code yourself.
// The keyboard shortcut is 'Ctrl + Enter'.
// This code is editable. Feel free to play with it!
// Click the 'Undo Changes' button in the top right corner to restore the original code.

// This is the main function.
fn main() {
    // Statements here are executed when the compiled binary is called.

    // Print text to the console.
    println!("Hello World!");
}
`println!` is a macro that prints text to the console.
A binary can be generated using the Rust compiler, `rustc`.
$ rustc hello.rs
`rustc` produces a `hello` binary that can be executed.
$ ./hello
Hello World!
Activity
Click the run button above and check the output. Next, add a new line with a second `println!` macro so that the output shows:
Hello World!
I'm a Rustacean!
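Comments
Every program needs comments, and Rust supports a few different varieties.
- Regular comments are written for readers of the source code. The compiler ignores them.
// Line comments go to the end of the line.
/* Block comments go to the closing delimiter. */
- Doc comments are parsed into HTML library documentation.
/// Generates library docs for the following item.
//! Generates library docs for the enclosing item.
fn main() {
    // This is an example of a line comment.
    // Each line starts with two slashes.
    // Nothing written here is processed by the compiler.

    // println!("Hello, world!");

    // Run this code. Is "Hello, world!" printed? If nothing is printed,
    // delete the two slashes on the line above and run it again!

    /*
     * This is a block comment.
     * Line comments are the generally recommended style, but block comments
     * are extremely useful for temporarily disabling several lines of code.
     * /* Block comments can be /* nested, */ */ so it takes only a few
     * keystrokes to comment out everything in this main() function.
     * /*/*/* Try it yourself! */*/*/
     */

    /*
    Note: The column of `*` in the previous comment is purely for style.
    There is no actual need for it.
    */

    // Block comments also make it easy to manipulate expressions.
    // Try deleting the comment delimiters in the following statement.
    let x = 5 + /* 90 + */ 5;
    println!("Is `x` 10 or 100? x = {}", x);
}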
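Formatted print
- `format!`: writes formatted text to a `String`.
- `print!`: same as `format!`, but the text is printed to the console (io::stdout).
- `println!`: same as `print!`, but a newline is appended.
- `eprint!`: same as `format!`, but the text is printed to the standard error (io::stderr).
- `eprintln!`: same as `eprint!`, but a newline is appended.
All of these macros parse the format string in the same way. As a bonus, Rust checks formatting correctness at compile time.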
fn main() {
    // `{}` is automatically replaced with the stringified argument.
    println!("{} days", 31);

    // Without a suffix, 31 becomes an i32. A suffix changes the type:
    // 31i64, for example, has the type i64.

    // Various patterns can be used if needed.
    // Positional arguments:
    println!("{0}, this is {1}. {1}, this is {0}", "Alice", "Bob");

    // Named arguments:
    println!("{subject} {verb} {object}",
             object="the lazy dog",
             subject="the quick brown fox",
             verb="jumps over");

    // Special formatting is specified after a `:`.
    println!("{} of {:b} people know binary, the other half doesn't", 1, 2);

    // Right-align text within a specified width.
    // This prints "     1": five spaces followed by a "1".
    println!("{number:>width$}", number=1, width=6);

    // Pad the remaining width with zeroes. This prints "000001".
    println!("{number:0>width$}", number=1, width=6);

    // Rust checks that the correct number of arguments is used.
    println!("My name is {0}, {1} {0}", "Bond");
    // FIXME ^ Add the missing argument: "James"

    // Create a structure named `Structure` which contains an `i32`.
    #[allow(dead_code)]
    struct Structure(i32);

    // Unfortunately, custom types such as this structure require more
    // complicated handling. The following line does not compile.
    println!("This struct `{}` won't print...", Structure(3));
    // FIXME ^ Comment out this line.
}
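`std::fmt` contains many traits that govern how text is displayed. The base forms of the two most important ones are:
- `fmt::Debug`: uses the `{:?}` marker. Formats text for debugging purposes.
- `fmt::Display`: uses the `{}` marker. Formats text in a more elegant, user-friendly fashion.
The standard library implements `fmt::Display` for the primitive types. Primitive types can therefore be formatted with `{}` right away, whereas custom types need extra work.
Implementing the `fmt::Display` trait automatically implements the `ToString` trait, which lets the type be converted to a `String`.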
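As a minimal sketch of that last point, the example below implements `fmt::Display` for a small type and then calls `to_string()` without ever implementing `ToString` by hand. The `Celsius` type is hypothetical and exists only for this illustration.

```rust
use std::fmt;

// Hypothetical type used only for this illustration.
struct Celsius(f64);

impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Show one decimal place followed by the unit.
        write!(f, "{:.1}°C", self.0)
    }
}

fn main() {
    let temp = Celsius(21.46);

    // `to_string` comes from the blanket `ToString` implementation
    // that the standard library provides for every `Display` type.
    let text: String = temp.to_string();
    println!("{}", text);

    // `format!("{}", ..)` goes through the same `Display` implementation.
    assert_eq!(text, format!("{}", temp));
}
```

Because both paths go through the same `fmt` method, there is only one place to change when the output format needs to change.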
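Activities
- Fix the issues in the code above (the lines flagged for fixing in its comments) so that it runs without error.
- Add a `println!` macro call that prints `Pi is roughly 3.142` by controlling the number of decimal places shown. For this exercise, use `let pi = 3.141592` as an estimate for pi. (Hint: the `std::fmt` documentation describes how to set the number of decimals to display.)
See also: `std::fmt`, macros, struct, and traits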
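Debug
A type cannot be used with the `std::fmt` formatting traits unless it has a printable implementation. Rust provides automatic implementations only for types such as those in the `std` library. All other types must be implemented manually in some way.
The `fmt::Debug` trait makes this straightforward. Every type can `derive` the `fmt::Debug` implementation, that is, have it created automatically. This is not true for `fmt::Display`, which must always be implemented manually.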
#![allow(unused)]
fn main() {
    // This structure cannot be printed with `fmt::Display`,
    // nor with `fmt::Debug`.
    struct UnPrintable(i32);

    // The `derive` attribute automatically creates the implementation
    // required to make this structure printable with `fmt::Debug`.
    #[derive(Debug)]
    struct DebugPrintable(i32);
}
All `std` library types can be printed with `{:?}` as well.
// Derive the `fmt::Debug` implementation for `Structure`.
// `Structure` contains a single `i32`.
#[derive(Debug)]
struct Structure(i32);

// Put a `Structure` inside of the structure `Deep`,
// and make `Deep` printable too.
#[derive(Debug)]
struct Deep(Structure);

fn main() {
    // Printing with `{:?}` is similar to printing with `{}`.
    println!("{:?} months in a year.", 12);
    println!("{1:?} {0:?} is the {actor:?} name.",
             "Slater",
             "Christian",
             actor="actor's");

    // `Structure` is printable.
    println!("Now {:?} will print!", Structure(3));

    // The problem with `derive` is that there is no control over how
    // the results look. What if we want this to show just a `7`?
    println!("Now {:?} will print!", Deep(Structure(7)));
}
So `fmt::Debug` makes a type printable that could not be printed before, at the cost of some elegance. For tidier output, Rust also provides pretty printing with `{:#?}`.
#[derive(Debug)]
struct Person<'a> {
    name: &'a str,
    age: u8
}

fn main() {
    let name = "Peter";
    let age = 27;
    let peter = Person { name, age };

    // Pretty print
    println!("{:#?}", peter);
}
To control how the output looks, implement `fmt::Display` manually.
See also: attributes, derive, `std::fmt`, and struct
Display
`fmt::Debug` hardly looks compact and clean. To customize the output appearance, implement `fmt::Display`, which uses the `{}` print marker, manually. Implementing it looks like this:
#![allow(unused)]
fn main() {
    // Import the `fmt` module via `use`.
    use std::fmt;

    // Define a structure for which `fmt::Display` will be implemented.
    // `Structure` is a tuple struct containing an `i32`.
    struct Structure(i32);

    // The `{}` marker can only be used with types that implement
    // the `fmt::Display` trait.
    impl fmt::Display for Structure {
        // The trait requires `fmt` with this exact signature.
        fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
            // Write the first element into the supplied output stream `f`.
            // The returned `fmt::Result` indicates whether the operation
            // succeeded or failed. Note that `write!` uses syntax which
            // is very similar to `println!`.
            write!(f, "{}", self.0)
        }
    }
}
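`fmt::Display` may be cleaner than `fmt::Debug`, but that very cleanliness is why the `std` library does not provide a default `fmt::Display` implementation: it is unclear how an ambiguous type should be displayed.
For example, if the `std` library implemented a single output style for all `Vec<T>`, what style should it be? Would it be either of these two?
- `Vec<path>`: `/:/etc:/home/username:/bin` (split on `:`)
- `Vec<number>`: `1,2,3` (split on `,`)
There is no ideal style for all types, so the `std` library does not presume to dictate one. `fmt::Display` is not implemented for `Vec<T>` or for any other generic container, and `fmt::Debug` must be used for these generic cases.
This is not a problem for a new container type that is not one of the standard generic containers: `fmt::Display` can be implemented for it without trouble.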
use std::fmt; // Import `fmt`

// A structure holding two numbers. `Debug` is derived so the results
// can be contrasted with `Display`.
#[derive(Debug)]
struct MinMax(i64, i64);

// Implement `Display` for `MinMax`.
impl fmt::Display for MinMax {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Use `self.number` to refer to each positional data point.
        write!(f, "({}, {})", self.0, self.1)
    }
}

// Define a structure where the fields are nameable for comparison.
#[derive(Debug)]
struct Point2D {
    x: f64,
    y: f64,
}

// Similarly, implement `Display` for `Point2D`.
impl fmt::Display for Point2D {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Customize so only `x` and `y` are denoted.
        write!(f, "x: {}, y: {}", self.x, self.y)
    }
}

fn main() {
    let minmax = MinMax(0, 14);

    println!("Compare structures:");
    println!("Display: {}", minmax);
    println!("Debug: {:?}", minmax);

    let big_range = MinMax(-300, 300);
    let small_range = MinMax(-3, 3);

    println!("The big range is {big} and the small is {small}",
             small = small_range,
             big = big_range);

    let point = Point2D { x: 3.3, y: 7.2 };

    println!("Compare points:");
    println!("Display: {}", point);
    println!("Debug: {:?}", point);

    // The following line does not work. Even with both `Debug` and
    // `Display` implemented, `{:b}` requires `fmt::Binary`.
    // println!("What does Point2D look like in binary: {:b}?", point);
}
Implementing `fmt::Display` does not solve everything. To use binary output, `fmt::Binary` must be implemented as well. `std::fmt` has many such traits, and each requires its own implementation. (See `std::fmt` for details.)
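To make the point about separate traits concrete, here is a small sketch that implements `fmt::Binary` for a newtype by delegating to the inner integer. The `Flags` type is hypothetical and is not part of the example above.

```rust
use std::fmt;

// Hypothetical newtype used only for this illustration.
struct Flags(u8);

impl fmt::Binary for Flags {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Delegate to the `Binary` implementation of the inner `u8`.
        // Passing the same formatter along keeps flags such as width,
        // zero padding and `#` working.
        fmt::Binary::fmt(&self.0, f)
    }
}

fn main() {
    let flags = Flags(0b101);

    println!("{:b}", flags);   // plain binary digits
    println!("{:08b}", flags); // zero padded to eight digits
    println!("{:#b}", flags);  // with the `0b` prefix
}
```

Delegating to the inner value is one design choice among several; a type could also write its own digits with `write!` if it needed a custom layout.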
Activity
After checking the output of the example above, use the `Point2D` struct as a guide to add a `Complex` struct to the example. When printed in the same way, the output should be:
Display: 3.3 + 7.2i
Debug: Complex { real: 3.3, imag: 7.2 }
See also: derive, `std::fmt`, macros, struct, trait, and use
Testcase: List
Implementing `fmt::Display` for a structure whose elements must each be handled sequentially is tricky. Each `write!` generates a `fmt::Result`, and all of these results must be handled properly. Rust provides the `?` operator for exactly this purpose.
Using `?` on `write!` looks like this:
// Try `write!` to see if it errors. If it errors, return the error.
// Otherwise continue.
write!(f, "{}", value)?;
With `?` available, implementing `fmt::Display` for a `Vec` is straightforward.
use std::fmt; // Import `fmt`

// Define a structure named `List` containing a `Vec`.
struct List(Vec<i32>);

impl fmt::Display for List {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Extract the value using tuple indexing,
        // and create a reference to `vec`.
        let vec = &self.0;

        write!(f, "[")?;

        // Iterate over `v` in `vec` while enumerating the iteration
        // count in `count`.
        for (count, v) in vec.iter().enumerate() {
            // For every element except the first, add a comma.
            // Use the `?` operator to return on errors.
            if count != 0 { write!(f, ", ")?; }
            write!(f, "{}", v)?;
        }

        // Close the opened bracket and return a `fmt::Result` value.
        write!(f, "]")
    }
}

fn main() {
    let v = List(vec![1, 2, 3]);
    println!("{}", v);
}
Activity
Try changing the program so that the index of each element is also printed. The new output should look like this:
[0: 1, 1: 2, 2: 3]
See also: for, ref, Result, struct, `?`, and `vec!`
Formatting
We've seen that formatting is specified via a format string:
- `format!("{}", foo)` -> `"3735928559"`
- `format!("0x{:X}", foo)` -> `"0xDEADBEEF"`
- `format!("0o{:o}", foo)` -> `"0o33653337357"`
The same variable (`foo`) can be formatted differently depending on which argument type is used: `X`, `o`, or unspecified.
This formatting functionality is implemented via traits, and there is one trait for each argument type. The most common formatting trait is `Display`, which handles cases where the argument type is left unspecified (`{}`).
use std::fmt::{self, Formatter, Display};

struct City {
    name: &'static str,
    // Latitude
    lat: f32,
    // Longitude
    lon: f32,
}

impl Display for City {
    // `f` is a buffer, and this method must write the formatted string into it.
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        let lat_c = if self.lat >= 0.0 { 'N' } else { 'S' };
        let lon_c = if self.lon >= 0.0 { 'E' } else { 'W' };

        // `write!` is like `format!`, but it writes the formatted string
        // into a buffer (the first argument).
        write!(f, "{}: {:.3}°{} {:.3}°{}",
               self.name, self.lat.abs(), lat_c, self.lon.abs(), lon_c)
    }
}

#[derive(Debug)]
struct Color {
    red: u8,
    green: u8,
    blue: u8,
}

fn main() {
    for city in [
        City { name: "Dublin", lat: 53.347778, lon: -6.259722 },
        City { name: "Oslo", lat: 59.95, lon: 10.75 },
        City { name: "Vancouver", lat: 49.25, lon: -123.1 },
    ].iter() {
        println!("{}", *city);
    }
    for color in [
        Color { red: 128, green: 255, blue: 90 },
        Color { red: 0, green: 3, blue: 254 },
        Color { red: 0, green: 0, blue: 0 },
    ].iter() {
        // Switch this to use {} once you've added an implementation
        // for fmt::Display.
        println!("{:?}", *color);
    }
}
For more detail, see the full list of formatting traits and their argument types in the `std::fmt` documentation.
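The mapping between argument types and formatting traits can be checked directly. The sketch below formats the same `u32` with several argument types; the `render` helper is a hypothetical name introduced only for this illustration.

```rust
// Hypothetical helper: format one number three different ways.
fn render(foo: u32) -> (String, String, String) {
    (
        format!("{}", foo),     // no argument type: `Display`
        format!("0x{:X}", foo), // `X`: `UpperHex`
        format!("0o{:o}", foo), // `o`: `Octal`
    )
}

fn main() {
    let (decimal, hex, octal) = render(3735928559);
    println!("{} {} {}", decimal, hex, octal);

    // A width and a fill character combine with any argument type.
    // Here a single byte is zero padded to two hexadecimal digits.
    println!("{:02X}", 90u8);
    println!("{:02X}", 3u8);
}
```

Width and fill are handled by the formatter rather than by each trait, which is why the same `02` prefix works for `X`, `o`, `b` and plain `{}` alike.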
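Activity
Add an implementation of the `fmt::Display` trait for the `Color` struct in the code above so that the output displays as:
RGB (128, 255, 90) 0x80FF5A
RGB (0, 3, 254) 0x0003FE
RGB (0, 0, 0) 0x000000
Two hints in case you get stuck: you may need to list each color more than once, and you can pad with zeros to a width of 2 with `:0>2`.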
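Primitives
Rust provides access to a wide variety of primitives. Let's look at what is available.
Scalar Types
- Signed integers: `i8`, `i16`, `i32`, `i64`, `i128` and `isize` (pointer size)
- Unsigned integers: `u8`, `u16`, `u32`, `u64`, `u128` and `usize` (pointer size)
- Floating point: `f32`, `f64`
- `char`: Unicode scalar values like `'a'`, `'α'` and `'∞'` (4 bytes each)
- `bool`: either `true` or `false`
- The unit type `()`, whose only possible value is the empty tuple `()`
Although the value of the unit type is a tuple, it is not considered a compound type because it does not contain multiple values.
Compound Types
- Arrays like `[1, 2, 3]`
- Tuples like `(1, true)`
Variables can always be type annotated. Numbers may additionally be annotated via a suffix, or left unannotated to use the default type. Integers default to `i32` and floats to `f64`. Rust can also infer types from context.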
fn main() {
    // Variables can be type annotated.
    let logical: bool = true;

    let a_float: f64 = 1.0;  // Regular annotation
    let an_integer   = 5i32; // Suffix annotation

    // Or a default will be used.
    let default_float   = 3.0; // `f64`
    let default_integer = 7;   // `i32`

    // A type can also be inferred from context.
    let mut inferred_type = 12; // Type i64 is inferred from another line.
    inferred_type = 4294967296i64;

    // A mutable variable's value can be changed.
    let mut mutable = 12; // Mutable `i32`
    mutable = 21;

    // Error! The type of a variable can't be changed.
    mutable = true;

    // Variables can be overwritten with shadowing.
    let mutable = true;
}
See also: the `std` library, `mut`, inference, and shadowing
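Literals and operators
Integers `1`, floats `1.2`, characters `'a'`, strings `"abc"`, booleans `true` and the unit type `()` can be expressed using literals.
Integers can alternatively be expressed in hexadecimal, octal or binary notation using the prefixes `0x`, `0o` or `0b` respectively.
Underscores can be inserted in numeric literals to improve readability: `1_000` is the same as `1000`, and `0.000_001` is the same as `0.000001`.
To use a literal of a particular type, we need to tell the compiler the type. Here we use the `u32` suffix to indicate an unsigned 32-bit integer literal, and the `i32` suffix to indicate a signed 32-bit integer literal.
The operators available in Rust and their precedence are similar to those of other C-like languages.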
fn main() {
    // Integer addition
    println!("1 + 2 = {}", 1u32 + 2);

    // Integer subtraction
    println!("1 - 2 = {}", 1i32 - 2);
    // TODO ^ Try changing `1i32` to `1u32` to see why the type is important

    // Short-circuiting boolean logic
    println!("true AND false is {}", true && false);
    println!("true OR false is {}", true || false);
    println!("NOT true is {}", !true);

    // Bitwise operations
    println!("0011 AND 0101 is {:04b}", 0b0011u32 & 0b0101);
    println!("0011 OR 0101 is {:04b}", 0b0011u32 | 0b0101);
    println!("0011 XOR 0101 is {:04b}", 0b0011u32 ^ 0b0101);
    println!("1 << 5 is {}", 1u32 << 5);
    println!("0x80 >> 2 is 0x{:x}", 0x80u32 >> 2);

    // Use underscores to improve readability!
    println!("One million is written as {}", 1_000_000u32);
}
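The literal notations and the `1 - 2` case above can be checked with a few assertions. This is a small sketch; `checked_sub` and `wrapping_sub` are standard integer methods, used here as one possible way to handle unsigned subtraction explicitly.

```rust
fn main() {
    // Prefixes select the base of an integer literal.
    assert_eq!(0x10, 16);    // hexadecimal
    assert_eq!(0o17, 15);    // octal
    assert_eq!(0b1010, 10);  // binary

    // Underscores are purely visual.
    assert_eq!(1_000, 1000);
    assert_eq!(0.000_001, 0.000001);

    // With a signed type, `1 - 2` is simply -1.
    assert_eq!(1i32 - 2, -1);

    // With an unsigned type the result cannot be represented.
    // `checked_sub` reports that as `None` instead of overflowing,
    // and `wrapping_sub` wraps around explicitly.
    assert_eq!(1u32.checked_sub(2), None);
    assert_eq!(1u32.wrapping_sub(2), u32::MAX);

    println!("all literal checks passed");
}
```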
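Tuples
A tuple is a collection of values of different types. Tuples are constructed using parentheses `()`, and each tuple is a value with the type signature `(T1, T2, ...)`, where `T1`, `T2` are the types of its members. Because tuples can hold any number of values, functions can use them to return multiple values.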
// Tuples can be used as function arguments and as return values.
fn reverse(pair: (i32, bool)) -> (bool, i32) {
    // `let` can be used to bind the members of a tuple to variables.
    let (integer, boolean) = pair;

    (boolean, integer)
}

// The following struct is for the activity.
#[derive(Debug)]
struct Matrix(f32, f32, f32, f32);

fn main() {
    // A tuple with a bunch of different types
    let long_tuple = (1u8, 2u16, 3u32, 4u64,
                      -1i8, -2i16, -3i32, -4i64,
                      0.1f32, 0.2f64,
                      'a', true);

    // Values can be extracted from the tuple using tuple indexing.
    println!("long tuple first value: {}", long_tuple.0);
    println!("long tuple second value: {}", long_tuple.1);

    // Tuples can be tuple members.
    let tuple_of_tuples = ((1u8, 2u16, 2u32), (4u64, -1i8), -2i16);

    // Tuples are printable.
    println!("tuple of tuples: {:?}", tuple_of_tuples);

    // But long tuples cannot be printed.
    // let too_long_tuple = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13);
    // println!("too long tuple: {:?}", too_long_tuple);
    // TODO ^ Uncomment the above 2 lines to see the compiler error

    let pair = (1, true);
    println!("pair is {:?}", pair);

    println!("the reversed pair is {:?}", reverse(pair));

    // To create one-element tuples, the comma is required to tell them
    // apart from a literal surrounded by parentheses.
    println!("one element tuple: {:?}", (5u32,));
    println!("just an integer: {:?}", (5u32));

    // Tuples can be destructured to create bindings.
    let tuple = (1, "hello", 4.5, true);

    let (a, b, c, d) = tuple;
    println!("{:?}, {:?}, {:?}, {:?}", a, b, c, d);

    let matrix = Matrix(1.1, 1.2, 2.1, 2.2);
    println!("{:?}", matrix);
}
Activity
- Recap: Add the `fmt::Display` trait to the `Matrix` struct in the example above, so that switching from the debug format `{:?}` to the display format `{}` produces the following output:
( 1.1 1.2 )
( 2.1 2.2 )
You may want to refer back to the earlier Display example.
- Using the `reverse` function as a template, add a `transpose` function that accepts a matrix as an argument and returns a matrix in which two elements have been swapped. For example:
println!("Matrix:\n{}", matrix);
println!("Transpose:\n{}", transpose(matrix));
This results in the output:
Matrix:
( 1.1 1.2 )
( 2.1 2.2 )
Transpose:
( 1.1 2.1 )
( 1.2 2.2 )
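Arrays and Slices
An array is a collection of elements of the same type `T`, stored in contiguous memory. Arrays are created using brackets `[]`, and their length, which is known at compile time, is part of their type signature `[T; length]`.
Slices are similar to arrays, but their length is not known at compile time. A slice is a two-word object: the first word is a pointer to the data, and the second word is the length of the slice. The word size is the same as `usize` and is determined by the processor architecture (64 bits on x86-64, for example). Slices can be used to borrow a section of an array and have the type signature `&[T]`.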
use std::mem;

// This function borrows a slice.
fn analyze_slice(slice: &[i32]) {
    println!("first element of the slice: {}", slice[0]);
    println!("the slice has {} elements", slice.len());
}

fn main() {
    // Fixed-size array (the type signature is superfluous)
    let xs: [i32; 5] = [1, 2, 3, 4, 5];

    // All elements can be initialized to the same value.
    let ys: [i32; 500] = [0; 500];

    // Indexing starts at 0.
    println!("first element of the array: {}", xs[0]);
    println!("second element of the array: {}", xs[1]);

    // `len` returns the count of elements in the array.
    println!("number of elements in array: {}", xs.len());

    // Arrays are stack allocated.
    println!("array occupies {} bytes", mem::size_of_val(&xs));

    // Arrays can be automatically borrowed as slices.
    println!("borrow the whole array as a slice");
    analyze_slice(&xs);

    // Slices can point to a section of an array.
    // They are of the form [starting_index..ending_index].
    // starting_index is the first position in the slice.
    // ending_index is one more than the last position in the slice.
    println!("borrow a section of the array as a slice");
    analyze_slice(&ys[1 .. 4]);

    // Indexing an array out of bounds with a constant index
    // is rejected at compile time.
    println!("{}", xs[5]);
}
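The "two words" description of a slice can be observed with `std::mem::size_of`. The sketch below also shows `get`, a standard slice method that returns an `Option` instead of failing on an out-of-range index; the `first_or_none` helper is a hypothetical name used only here.

```rust
use std::mem;

// Hypothetical helper: the first element, or `None` for an empty slice.
fn first_or_none(slice: &[i32]) -> Option<i32> {
    slice.first().copied()
}

fn main() {
    let xs = [1, 2, 3, 4, 5];

    // A slice reference is a (pointer, length) pair: two words.
    assert_eq!(mem::size_of::<&[i32]>(), 2 * mem::size_of::<usize>());

    // The array itself holds five 4-byte integers.
    assert_eq!(mem::size_of_val(&xs), 20);

    // Borrow elements 1, 2 and 3 (the end index is exclusive).
    let middle = &xs[1..4];
    assert_eq!(middle, &[2, 3, 4]);
    assert_eq!(middle.len(), 3);

    // `get` checks the index at run time and returns an `Option`.
    assert_eq!(xs.get(0), Some(&1));
    assert_eq!(xs.get(5), None);

    assert_eq!(first_or_none(middle), Some(2));
    assert_eq!(first_or_none(&xs[0..0]), None);

    println!("slice checks passed");
}
```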
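Custom Types
Rust custom data types are formed mainly through two keywords:
- `struct`: define a structure
- `enum`: define an enumeration
Constants are created via the `const` and `static` keywords.
Structures
There are three types of structures ("structs") that can be created using the `struct` keyword:
- Tuple structs, which are basically named tuples.
- The classic C structs.
- Unit structs, which have no fields and are useful for generics.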
#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
}

// A unit struct
struct Unit;

// A tuple struct
struct Pair(i32, f32);

// A struct with two fields
struct Point {
    x: f32,
    y: f32,
}

// Structs can be reused as fields of another struct.
#[allow(dead_code)]
struct Rectangle {
    // A rectangle can be specified by where its top left and
    // bottom right corners are in space.
    top_left: Point,
    bottom_right: Point,
}

fn main() {
    // Create a struct with the field init shorthand.
    let name = String::from("Peter");
    let age = 27;
    let peter = Person { name, age };

    // Print the struct with debug formatting.
    println!("{:?}", peter);

    // Instantiate a `Point`.
    let point: Point = Point { x: 10.3, y: 0.4 };

    // Access the fields of the point.
    println!("point coordinates: ({}, {})", point.x, point.y);

    // Make a new point by using struct update syntax to reuse
    // the fields of the existing one.
    let bottom_right = Point { x: 5.2, ..point };

    // `bottom_right.y` is the same as `point.y` because that field
    // was taken from `point`.
    println!("second point: ({}, {})", bottom_right.x, bottom_right.y);

    // Destructure the point using a `let` binding.
    // `x` is the horizontal position (left edge), `y` the vertical one (top edge).
    let Point { x: left_edge, y: top_edge } = point;

    let _rectangle = Rectangle {
        // Struct instantiation is an expression too.
        top_left: Point { x: left_edge, y: top_edge },
        bottom_right: bottom_right,
    };

    // Instantiate a unit struct.
    let _unit = Unit;

    // Instantiate a tuple struct.
    let pair = Pair(1, 0.1);

    // Access the fields of a tuple struct.
    println!("pair contains {:?} and {:?}", pair.0, pair.1);

    // Destructure a tuple struct.
    let Pair(integer, decimal) = pair;

    println!("pair contains {:?} and {:?}", integer, decimal);
}
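Activity
- Add a function `rect_area` which calculates the area of a `Rectangle` (try using nested destructuring).
- Add a function `square` which takes a `Point` and an `f32` as arguments, and returns a `Rectangle` with its lower left corner on the point, and a width and height corresponding to the `f32`.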
Enums
The `enum` keyword allows the creation of a type which may be one of several different variants. Any form that is valid as a `struct` is also valid as an `enum` variant.
// Create an `enum` to classify a web event. Note how both the names
// and the type information together specify the variant:
// `PageLoad` differs from `PageUnload`, and `KeyPress(char)` differs
// from `Paste(String)`. Each variant is different and independent.
enum WebEvent {
    // An `enum` variant may be unit-like,
    PageLoad,
    PageUnload,
    // like a tuple struct,
    KeyPress(char),
    Paste(String),
    // or like a C struct.
    Click { x: i64, y: i64 },
}

// A function which takes a `WebEvent` enum as an argument and
// returns nothing.
fn inspect(event: WebEvent) {
    match event {
        WebEvent::PageLoad => println!("page loaded"),
        WebEvent::PageUnload => println!("page unloaded"),
        // Destructure `c` from inside the `enum` variant.
        WebEvent::KeyPress(c) => println!("pressed '{}'.", c),
        WebEvent::Paste(s) => println!("pasted \"{}\".", s),
        // Destructure `Click` into `x` and `y`.
        WebEvent::Click { x, y } => {
            println!("clicked at x={}, y={}.", x, y);
        },
    }
}

fn main() {
    let pressed = WebEvent::KeyPress('x');
    // `to_owned()` creates an owned `String` from a string slice.
    let pasted  = WebEvent::Paste("my text".to_owned());
    let click   = WebEvent::Click { x: 20, y: 80 };
    let load    = WebEvent::PageLoad;
    let unload  = WebEvent::PageUnload;

    inspect(pressed);
    inspect(pasted);
    inspect(click);
    inspect(load);
    inspect(unload);
}
Type aliases
With a type alias, each enum variant can be referred to via the alias. This is useful when the enum's name is too long or too generic and you want to rename it.
enum VeryVerboseEnumOfThingsToDoWithNumbers {
    Add,
    Subtract,
}

// Creates a type alias
type Operations = VeryVerboseEnumOfThingsToDoWithNumbers;

fn main() {
    // We can refer to each variant via its alias, not its long and
    // inconvenient name.
    let x = Operations::Add;
}
The most common place you'll see this is in `impl` blocks using the `Self` alias.
enum VeryVerboseEnumOfThingsToDoWithNumbers {
    Add,
    Subtract,
}

impl VeryVerboseEnumOfThingsToDoWithNumbers {
    fn run(&self, x: i32, y: i32) -> i32 {
        match self {
            Self::Add => x + y,
            Self::Subtract => x - y,
        }
    }
}
See also: match, fn, String, and the "Type alias enum variants" RFC
use
The `use` declaration can be used so that manual scoping isn't needed every time.
// An attribute to hide warnings for unused code.
#![allow(dead_code)]

enum Status {
    Rich,
    Poor,
}

enum Work {
    Civilian,
    Soldier,
}

fn main() {
    // Explicitly `use` each name so they are available without
    // manual scoping.
    use crate::Status::{Poor, Rich};
    // Automatically `use` each name inside `Work`.
    use crate::Work::*;

    // Equivalent to `Status::Poor`.
    let status = Poor;
    // Equivalent to `Work::Civilian`.
    let work = Civilian;

    match status {
        // Note the lack of scoping because of the explicit `use` above.
        Rich => println!("The rich have lots of money!"),
        Poor => println!("The poor have no money..."),
    }

    match work {
        // Note again the lack of scoping.
        Civilian => println!("Civilians work!"),
        Soldier  => println!("Soldiers fight!"),
    }
}
C-like
`enum` can also be used as C-like enums.
// An attribute to hide warnings for unused code.
#![allow(dead_code)]

// enum with implicit discriminator (starts at 0)
enum Number {
    Zero,
    One,
    Two,
}

// enum with explicit discriminator
enum Color {
    Red = 0xff0000,
    Green = 0x00ff00,
    Blue = 0x0000ff,
}

fn main() {
    // `enums` can be cast as integers.
    println!("zero is {}", Number::Zero as i32);
    println!("one is {}", Number::One as i32);

    println!("roses are #{:06x}", Color::Red as i32);
    println!("violets are #{:06x}", Color::Blue as i32);
}
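Casting with `as` only goes from the enum to the integer. Going the other way needs an explicit check, because not every integer names a variant. One possible sketch, assuming a hand-written `TryFrom<i32>` implementation (an addition for illustration, not something the enum provides on its own):

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Number {
    Zero,
    One,
    Two,
}

// Hand-written conversion from an integer back to the enum.
// The rejected value is returned as the error.
impl TryFrom<i32> for Number {
    type Error = i32;

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Number::Zero),
            1 => Ok(Number::One),
            2 => Ok(Number::Two),
            other => Err(other),
        }
    }
}

fn main() {
    // Enum to integer: a plain cast.
    assert_eq!(Number::Two as i32, 2);

    // Integer to enum: a fallible conversion.
    assert_eq!(Number::try_from(1), Ok(Number::One));
    assert_eq!(Number::try_from(7), Err(7));

    println!("{:?}", Number::try_from(0));
}
```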
Testcase: linked-list
A linked list is a fitting example of how enums can be used.
use crate::List::*;

enum List {
    // Cons: a tuple struct that wraps an element and a pointer to the next node
    Cons(u32, Box<List>),
    // Nil: a node that signifies the end of the linked list
    Nil,
}

// Methods can be attached to an enum.
impl List {
    // Create an empty list.
    fn new() -> List {
        // `Nil` has type `List`.
        Nil
    }

    // Consume a list, and return the same list with a new element at its front.
    fn prepend(self, elem: u32) -> List {
        // `Cons` also has type `List`.
        Cons(elem, Box::new(self))
    }

    // Return the length of the list.
    fn len(&self) -> u32 {
        // `self` has to be matched, because the behavior of this method
        // depends on the variant of `self`.
        // `self` has type `&List`, and `*self` has type `List`; matching on
        // a concrete type `T` is preferred over matching on a reference `&T`.
        // Since the Rust 2018 edition you can also write `self` here and
        // `tail` (with no `ref`) below; Rust infers `&` and `ref tail`.
        // See https://doc.rust-lang.org/edition-guide/rust-2018/ownership-and-lifetimes/default-match-bindings.html
        match *self {
            // Can't take ownership of the tail, because `self` is borrowed;
            // take a reference to the tail instead.
            Cons(_, ref tail) => 1 + tail.len(),
            // Base case: an empty list has zero length.
            Nil => 0
        }
    }

    // Return a representation of the list as a (heap allocated) string.
    fn stringify(&self) -> String {
        match *self {
            Cons(head, ref tail) => {
                // `format!` is similar to `print!`, but returns a heap
                // allocated string instead of printing to the console.
                format!("{}, {}", head, tail.stringify())
            },
            Nil => {
                format!("Nil")
            },
        }
    }
}

fn main() {
    // Create an empty linked list.
    let mut list = List::new();

    // Prepend some elements.
    list = list.prepend(1);
    list = list.prepend(2);
    list = list.prepend(3);

    // Show the final state of the list.
    println!("linked list has length: {}", list.len());
    println!("{}", list.stringify());
}
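constants
Rust has two different types of constants. Both can be declared in any scope, including global, and both require explicit type annotation:
- `const`: an unchangeable value (the common case).
- `static`: a possibly mutable variable (with the `mut` keyword) that has the `'static` lifetime. The static lifetime is inferred and does not have to be specified. Accessing or modifying a mutable static variable is `unsafe`.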
// Globals are declared outside all other scopes.
static LANGUAGE: &str = "Rust";
const THRESHOLD: i32 = 10;

fn is_big(n: i32) -> bool {
    // Access the constant in some function.
    n > THRESHOLD
}

fn main() {
    let n = 16;

    // Access the constants in the main thread.
    println!("This is {}", LANGUAGE);
    println!("The threshold is {}", THRESHOLD);
    println!("{} is {}", n, if is_big(n) { "big" } else { "small" });

    // Error! Cannot modify a `const`.
    THRESHOLD = 5;
    // FIXME ^ Comment out this line
}
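Because touching a `static mut` requires `unsafe`, global counters are often written with an atomic type in a plain `static` instead. The sketch below shows that swapped-in technique rather than `static mut` itself; `STEP`, `COUNTER` and `bump` are hypothetical names chosen for this illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// A `const` is an unchangeable value.
const STEP: u32 = 2;

// A `static` without `mut`. The atomic type provides interior
// mutability, so no `unsafe` block is needed to update it.
static COUNTER: AtomicU32 = AtomicU32::new(0);

// Hypothetical helper: advance the global counter and return the new value.
fn bump() -> u32 {
    // `fetch_add` returns the previous value, so add `STEP` again
    // to get the value after the update.
    COUNTER.fetch_add(STEP, Ordering::SeqCst) + STEP
}

fn main() {
    let first = bump();
    let second = bump();
    println!("counter went from {} to {}", first, second);
    assert_eq!(second, first + STEP);
}
```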
See also: the `const`/`static` RFC and the `'static` lifetime
Variable Bindings
Rust provides type safety via static typing. Variable bindings can be type annotated when declared. In most cases, however, the compiler is able to infer the type of the variable from the context, which greatly reduces the annotation burden.
Values such as literals are bound to variables using the `let` binding.
fn main() {
    let an_integer = 1u32;
    let a_boolean = true;
    let unit = ();

    // Copy `an_integer` into `copied_integer`.
    let copied_integer = an_integer;

    println!("An integer: {:?}", copied_integer);
    println!("A boolean: {:?}", a_boolean);
    println!("Meet the unit value: {:?}", unit);

    // The compiler emits an `unused variable` warning for unused variable
    // bindings. Prefixing the variable name with an underscore (`_`)
    // silences the warning.
    let _unused_variable = 3u32;

    let noisy_unused_variable = 2u32;
    // FIXME ^ Prefix with an underscore to suppress the warning
}
Mutability
Variable bindings are immutable by default, but this can be overridden using the `mut` modifier.
fn main() {
    let _immutable_binding = 1;
    let mut mutable_binding = 1;

    println!("Before mutation: {}", mutable_binding);

    // Ok
    mutable_binding += 1;

    println!("After mutation: {}", mutable_binding);

    // Error!
    _immutable_binding += 1;
    // FIXME ^ Comment out this line
}
The compiler emits a detailed diagnostic about mutability errors.
Scope and Shadowing
Variable bindings have a scope and are constrained to live in a block. A block is a collection of statements enclosed by braces `{}`.
fn main() {
    // This binding lives in the main function.
    let long_lived_binding = 1;

    // This is a block, and has a smaller scope than the main function.
    {
        // This binding only exists in this block.
        let short_lived_binding = 2;

        println!("inner short: {}", short_lived_binding);
    }
    // End of the block

    // Error! `short_lived_binding` doesn't exist in this scope.
    println!("outer short: {}", short_lived_binding);
    // FIXME ^ Comment out this line

    println!("outer long: {}", long_lived_binding);
}
Variable shadowing is also allowed.
fn main() {
    let shadowed_binding = 1;

    {
        println!("before being shadowed: {}", shadowed_binding);

        // This binding *shadows* the outer one.
        let shadowed_binding = "abc";

        println!("shadowed in inner block: {}", shadowed_binding);
    }
    println!("outside inner block: {}", shadowed_binding);

    // This binding *shadows* the previous binding.
    let shadowed_binding = 2;
    println!("shadowed in outer block: {}", shadowed_binding);
}
Declare first
It is possible to declare variable bindings first and initialize them later. However, this form is seldom used, as it may lead to the use of uninitialized variables.
fn main() {
    // Declare a variable binding.
    let a_binding;

    {
        let x = 2;

        // Initialize the binding.
        a_binding = x * x;
    }

    println!("a binding: {}", a_binding);

    let another_binding;

    // Error! Use of uninitialized binding
    println!("another binding: {}", another_binding);
    // FIXME ^ Comment out this line

    another_binding = 1;

    println!("another binding: {}", another_binding);
}
The compiler forbids the use of uninitialized variables, as this would lead to undefined behavior.
Freezing
When a mutable variable is rebound immutably under the same name, the data freezes. Frozen data can't be modified until the immutable binding goes out of scope.
fn main() {
    let mut _mutable_integer = 7i32;

    {
        // Shadowing by immutable `_mutable_integer`
        let _mutable_integer = _mutable_integer;

        // Error! `_mutable_integer` is frozen in this scope.
        _mutable_integer = 50;
        // FIXME ^ Comment out this line

        // `_mutable_integer` goes out of scope.
    }

    // Ok! `_mutable_integer` is not frozen in this scope.
    _mutable_integer = 3;
}
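Types
Rust provides several mechanisms to change or define the type of primitive and user defined types. The following sections cover casting between primitive types, specifying the desired type of literals, type inference, and aliasing types.
Casting
Rust provides no implicit type conversion (coercion) between primitive types. Explicit type conversion (casting) must be performed using the `as` keyword.
Rules for converting between integral types generally follow C conventions, except in cases where C has undefined behavior. The behavior of all casts between integral types is well defined in Rust.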
// Suppress all warnings from casts which overflow.
#![allow(overflowing_literals)]

fn main() {
    let decimal = 65.4321_f32;

    // Error! No implicit conversion
    let integer: u8 = decimal;
    // FIXME ^ Comment out this line

    // Explicit conversion
    let integer = decimal as u8;
    let character = integer as char;

    // Error! There are limitations in conversion rules.
    // A float cannot be directly converted to a char.
    let character = decimal as char;
    // FIXME ^ Comment out this line

    println!("Casting: {} -> {} -> {}", decimal, integer, character);

    // When casting any value to an unsigned type T,
    // T::MAX + 1 is added or subtracted until the value
    // fits into the new type.

    // 1000 already fits in a u16
    println!("1000 as a u16 is: {}", 1000 as u16);

    // 1000 - 256 - 256 - 256 = 232
    // Under the hood, the 8 least significant bits (LSB) are kept,
    // while the rest towards the most significant bit (MSB) get truncated.
    println!("1000 as a u8 is : {}", 1000 as u8);
    // -1 + 256 = 255
    println!("  -1 as a u8 is : {}", (-1i8) as u8);

    // For positive numbers, this is the same as the modulus.
    println!("1000 mod 256 is : {}", 1000 % 256);

    // When casting to a signed type, the (bitwise) result is the same as
    // first casting to the corresponding unsigned type. If the most
    // significant bit of that value is 1, then the value is negative.

    // Unless it already fits, of course.
    println!(" 128 as a i16 is: {}", 128 as i16);

    // 128 as u8 -> 128, whose two's complement in 8 bits is:
    println!(" 128 as a i8 is : {}", 128 as i8);

    // Repeating the example above
    // 1000 as u8 -> 232
    println!("1000 as a u8 is : {}", 1000 as u8);
    // and the two's complement of 232 is -24
    println!(" 232 as a i8 is : {}", 232 as i8);

    // Since Rust 1.45, the `as` keyword performs a *saturating cast*
    // when casting from float to int. If the floating point value exceeds
    // the upper bound or is less than the lower bound, the returned value
    // is equal to the bound crossed.

    // 300.0 as u8 is 255
    println!(" 300.0 as u8 is : {}", 300.0_f32 as u8);
    // -100.0 as u8 is 0
    println!("-100.0 as u8 is : {}", -100.0_f32 as u8);
    // nan as u8 is 0
    println!("   nan as u8 is : {}", f32::NAN as u8);

    // This behavior incurs a small runtime cost and can be avoided
    // with unsafe methods. However, `to_int_unchecked` requires the value
    // to fit in the target type: for out-of-range input the behavior is
    // undefined, so the values noted below are not guaranteed.
    // Use these methods wisely.
    unsafe {
        // 300.0 as u8 may show 44
        println!(" 300.0 as u8 is : {}", 300.0_f32.to_int_unchecked::<u8>());
        // -100.0 as u8 may show 156
        println!("-100.0 as u8 is : {}", (-100.0_f32).to_int_unchecked::<u8>());
        // nan as u8 may show 0
        println!("   nan as u8 is : {}", f32::NAN.to_int_unchecked::<u8>());
    }
}
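The wrapping and saturating rules above can be pinned down with assertions. The sketch also contrasts `as` with `TryFrom`, a standard trait that reports an out-of-range value as an error instead of silently changing it.

```rust
use std::convert::TryFrom;

fn main() {
    // Integer to integer: the low bits are kept (1000 mod 256 = 232).
    assert_eq!(1000i32 as u8, 232);
    assert_eq!((-1i8) as u8, 255);
    assert_eq!(232u8 as i8, -24);

    // Float to integer: the cast saturates at the bounds, NaN becomes 0.
    assert_eq!(300.0_f32 as u8, 255);
    assert_eq!(-100.0_f32 as u8, 0);
    assert_eq!(f32::NAN as u8, 0);

    // `TryFrom` refuses values that do not fit.
    assert!(u8::try_from(1000i32).is_err());
    assert_eq!(u8::try_from(200i32), Ok(200u8));

    // Widening conversions that can never fail are available via `From`.
    assert_eq!(i64::from(7i32), 7i64);

    println!("cast checks passed");
}
```

`as` is convenient when truncation or saturation is the intended result; the checked conversions are the safer default when a value that does not fit indicates a bug.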
Literals
Numeric literals can be type annotated by adding the type as a suffix. For example, `42i32` specifies that the literal `42` has the type `i32`.
The type of unsuffixed numeric literals depends on how they are used. If no constraint exists, the compiler uses `i32` for integers and `f64` for floating-point numbers.
fn main() {
    // Suffixed literals, their types are known at initialization
    let x = 1u8;
    let y = 2u32;
    let z = 3f32;

    // Unsuffixed literals, their types depend on how they are used
    let i = 1;
    let f = 1.0;

    // `size_of_val` returns the size of a variable in bytes
    println!("size of `x` in bytes: {}", std::mem::size_of_val(&x));
    println!("size of `y` in bytes: {}", std::mem::size_of_val(&y));
    println!("size of `z` in bytes: {}", std::mem::size_of_val(&z));
    println!("size of `i` in bytes: {}", std::mem::size_of_val(&i));
    println!("size of `f` in bytes: {}", std::mem::size_of_val(&f));
}
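The previous code uses some concepts that haven't been explained yet. Here is a brief explanation for the curious:
- `std::mem::size_of_val` is a function, called here with its full path. Code can be split into logical units called modules. In this case, the `size_of_val` function is defined in the `mem` module, and the `mem` module is defined in the `std` crate. For more details, see modules and crates.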
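Inference
The type inference engine is pretty smart. It does more than look at the type of the value expression during initialization: it also looks at how the variable is used afterwards to infer its type. Here is an advanced example of type inference.
fn main() {
    // Because of the annotation, the compiler knows that `elem` has type u8.
    let elem = 5u8;

    // Create an empty vector (a growable array).
    let mut vec = Vec::new();
    // At this point the compiler doesn't know the exact type of `vec`,
    // it just knows that it's a vector of something (`Vec<_>`).

    // Insert `elem` in the vector.
    vec.push(elem);
    // Aha! Now the compiler knows that `vec` is a vector of `u8`s (`Vec<u8>`)
    // TODO ^ Try commenting out the `vec.push(elem)` line

    println!("{:?}", vec);
}
No type annotation of variables was needed, the compiler is happy, and so is the programmer!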
Aliasing
The `type` statement can be used to give a new name to an existing type. Type names must follow `UpperCamelCase`, or the compiler will raise a warning. The exception to this rule are the primitive types: `usize`, `f32`, etc.
// `NanoSecond` is a new name for `u64`.
type NanoSecond = u64;
type Inch = u64;

// Use an attribute to silence the warning.
#[allow(non_camel_case_types)]
type u64_t = u64;
// TODO ^ Try removing the attribute

fn main() {
    // `NanoSecond` = `Inch` = `u64_t` = `u64`.
    let nanoseconds: NanoSecond = 5 as u64_t;
    let inches: Inch = 2 as u64_t;

    // Note that type aliases *don't* provide any extra type safety,
    // because aliases are *not* new types.
    println!("{} nanoseconds + {} inches = {} unit?",
             nanoseconds,
             inches,
             nanoseconds + inches);
}
The main use of aliases is to reduce boilerplate. For example, the `io::Result<T>` type is an alias for the longer `Result<T, io::Error>` type.
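As a brief sketch of an alias reducing boilerplate, the function below returns `io::Result<String>` rather than spelling out `Result<String, io::Error>`. The `read_all` helper is a hypothetical name introduced for this illustration.

```rust
use std::io::{self, Read};

// Hypothetical helper: read everything from `source` into a `String`.
// `io::Result<String>` is shorthand for `Result<String, io::Error>`.
fn read_all(mut source: impl Read) -> io::Result<String> {
    let mut text = String::new();
    source.read_to_string(&mut text)?;
    Ok(text)
}

fn main() -> io::Result<()> {
    // A byte slice implements `Read`, so no file is needed here.
    let text = read_all("hello".as_bytes())?;
    println!("{}", text);
    Ok(())
}
```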
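Conversion
Primitive types can be converted to each other through casting.
Rust handles conversion between custom types (i.e., `struct` and `enum`) through traits. Generic conversions use the `From` and `Into` traits, while the more common cases, in particular converting to and from `String`, have more specific mechanisms.
`From` and `Into`
The `From` and `Into` traits are inherently linked, and this is part of their implementation. If a type A can be created from a type B, then a value of type B can be converted into type A.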
From
The `From` trait allows a type to define how to create itself from another type, which provides a very simple mechanism for converting between types. The standard library contains numerous implementations of this trait for converting primitive and common types.
For example, a `str` can easily be converted into a `String`.
#![allow(unused)]
fn main() {
    let my_str = "hello";
    let my_string = String::from(my_str);
}
We can define a conversion for our own type in a similar way.
use std::convert::From;

#[derive(Debug)]
struct Number {
    value: i32,
}

impl From<i32> for Number {
    fn from(item: i32) -> Self {
        Number { value: item }
    }
}

fn main() {
    let num = Number::from(30);
    println!("My number is {:?}", num);
}
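Into
The `Into` trait is simply the reciprocal of the `From` trait. That is, if you have implemented the `From` trait for your type, using `Into` will call `from`.
Using the `Into` trait typically requires specifying the type to convert into, as the compiler is unable to determine it most of the time. This is a small trade-off considering that the functionality comes for free.
use std::convert::From;

#[derive(Debug)]
struct Number {
    value: i32,
}

impl From<i32> for Number {
    fn from(item: i32) -> Self {
        Number { value: item }
    }
}

fn main() {
    let int = 5;
    // Try removing the type declaration
    let num: Number = int.into();
    println!("My number is {:?}", num);
}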
TryFrom and TryInto
Similar to `From` and `Into`, `TryFrom` and `TryInto` are generic traits for converting between types. Unlike `From`/`Into`, the `TryFrom`/`TryInto` traits are used for fallible conversions and therefore return a `Result`.
use std::convert::TryFrom;
use std::convert::TryInto;

// Wraps an `i32` that is known to be even.
#[derive(Debug, PartialEq)]
struct EvenNumber(i32);

impl TryFrom<i32> for EvenNumber {
    type Error = ();

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        if value % 2 == 0 {
            Ok(EvenNumber(value))
        } else {
            Err(())
        }
    }
}

fn main() {
    // TryFrom
    assert_eq!(EvenNumber::try_from(8), Ok(EvenNumber(8)));
    assert_eq!(EvenNumber::try_from(5), Err(()));

    // TryInto
    let result: Result<EvenNumber, ()> = 8i32.try_into();
    assert_eq!(result, Ok(EvenNumber(8)));
    let result: Result<EvenNumber, ()> = 5i32.try_into();
    assert_eq!(result, Err(()));
}
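The standard library also implements `TryFrom` between integer types, which gives a checked alternative to an `as` cast. The following is a small sketch; the helper name `to_byte` is hypothetical.

```rust
use std::convert::TryFrom;

// Hypothetical helper: narrows an `i32` to a `u8`,
// reporting an out-of-range value as `None`.
fn to_byte(value: i32) -> Option<u8> {
    u8::try_from(value).ok()
}

fn main() {
    println!("{:?}", to_byte(200));
    println!("{:?}", to_byte(300));
    println!("{:?}", to_byte(-1));
}
```

Unlike `300 as u8`, which silently wraps around, the fallible conversion makes the failure visible to the caller.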
To and from Strings
Converting to String
Any type that implements the `ToString` trait can be converted to a `String`. Rather than implementing `ToString` directly, you should implement the `fmt::Display` trait, which automatically provides `ToString` and also allows the type to be printed as discussed in the section on `print!`.
use std::fmt;

struct Circle {
    radius: i32
}

impl fmt::Display for Circle {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Circle of radius {}", self.radius)
    }
}

fn main() {
    let circle = Circle { radius: 6 };
    println!("{}", circle.to_string());
}
Parsing a String
Parsing a string into another type is usually done with the `parse` function, either by arranging for type inference or by specifying the type with the 'turbofish' syntax.
As long as the `FromStr` trait is implemented for a type, a string can be converted into that type. The standard library implements `FromStr` for numerous types, and implementing it for a user-defined type gives that type the same ability.
The following example converts a string into a number using both approaches.
fn main() {
    let parsed: i32 = "5".parse().unwrap();
    let turbo_parsed = "10".parse::<i32>().unwrap();

    let sum = parsed + turbo_parsed;
    println!("Sum: {:?}", sum);
}
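To illustrate the last point, here is a minimal sketch of `FromStr` implemented for a user-defined type. The `Meters` type and its input format (digits with an optional trailing `m`) are assumptions made up for this example.

```rust
use std::num::ParseIntError;
use std::str::FromStr;

// Hypothetical type: a length in whole meters.
#[derive(Debug, PartialEq)]
struct Meters(u32);

impl FromStr for Meters {
    type Err = ParseIntError;

    // Strips surrounding whitespace and any trailing 'm',
    // then parses the remaining digits.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let digits = s.trim().trim_end_matches('m');
        digits.parse::<u32>().map(Meters)
    }
}

fn main() {
    // Both the annotated form and the turbofish form now work for `Meters`.
    let a: Meters = "42m".parse().unwrap();
    let b = "7".parse::<Meters>().unwrap();
    println!("{:?} {:?}", a, b);
    println!("{}", "abc".parse::<Meters>().is_err());
}
```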
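Expressions
A Rust program is (mostly) made up of a series of statements:
fn main() {
    // statement
    // statement
    // statement
}
There are a few kinds of statements in Rust. The two most common are declaring a variable binding, and using a `;` with an expression:
fn main() {
    // variable binding
    let x = 5;

    // expression;
    x;
    x + 1;
    15;
}
Blocks are expressions too, so they can be used as values in assignments. The last expression in the block is the value of the block, and it is assigned to the place expression, such as a local variable. However, if the last expression of the block ends with a semicolon, the value of the block is `()`.
fn main() {
    let x = 5u32;

    let y = {
        let x_squared = x * x;
        let x_cube = x_squared * x;

        // This expression will be assigned to `y`
        x_cube + x_squared + x
    };

    let z = {
        // The semicolon suppresses this expression, so `()` is assigned to `z`
        2 * x;
    };

    println!("x = {:?}", x);
    println!("y = {:?}", y);
    println!("z = {:?}", z);
}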
Flow of Control
Flow control constructs such as `if`/`else` and `for` are an integral part of any programming language. Let's look at the ones Rust provides.
if/else
Branching with `if`-`else` is similar to other languages. Unlike many of them, the boolean condition doesn't need to be surrounded by parentheses, and each condition is followed by a block. `if`-`else` conditionals are expressions too, and all branches must return the same type.
fn main() {
    let n = 5;

    if n < 0 {
        print!("{} is negative", n);
    } else if n > 0 {
        print!("{} is positive", n);
    } else {
        print!("{} is zero", n);
    }

    let big_n =
        if n < 10 && n > -10 {
            println!(", and is a small number, increase ten-fold");

            // This expression returns an `i32`.
            10 * n
        } else {
            println!(", and is a big number, halve the number");

            // This expression must return an `i32` as well.
            n / 2
            // TODO ^ Try suppressing this expression with a semicolon.
        };
    //   ^ Don't forget to put a semicolon here! All `let` bindings need it.

    println!("{} -> {}", n, big_n);
}
loop
Rust provides a `loop` keyword to indicate an infinite loop. The `break` statement can be used to exit a loop at any time, whereas the `continue` statement skips the rest of the current iteration and starts a new one.
fn main() {
    let mut count = 0u32;

    println!("Let's count until infinity!");

    // Infinite loop
    loop {
        count += 1;

        if count == 3 {
            println!("three");

            // Skip the rest of this iteration
            continue;
        }

        println!("{}", count);

        if count == 5 {
            println!("OK, that's enough");

            // Exit this loop
            break;
        }
    }
}
Nesting and labels
It's possible to `break` or `continue` outer loops when dealing with nested loops. In these cases, the loops must be annotated with some `'label`, and the label must be passed to the `break`/`continue` statement.
#![allow(unreachable_code)]

fn main() {
    'outer: loop {
        println!("Entered the outer loop");

        'inner: loop {
            println!("Entered the inner loop");

            // This would break only the inner loop
            //break;

            // This breaks the outer loop
            break 'outer;
        }

        println!("This point will never be reached");
    }

    println!("Exited the outer loop");
}
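The example above only demonstrates `break` with a label. As a complementary sketch, the code below uses `continue 'outer` to abandon the inner loop and move on to the next outer iteration. The helper `visited_pairs` is hypothetical, and it uses `for` loops (covered later in this chapter) to keep the example short.

```rust
// Hypothetical helper: records the (i, j) pairs that are visited
// before `j == 1` sends control back to the outer loop.
fn visited_pairs() -> Vec<(u32, u32)> {
    let mut pairs = Vec::new();

    'outer: for i in 0..3 {
        for j in 0..3 {
            if j == 1 {
                // Skip the rest of the inner loop and advance `i`.
                continue 'outer;
            }
            pairs.push((i, j));
        }
    }

    pairs
}

fn main() {
    println!("{:?}", visited_pairs());
}
```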
Returning from loops
One of the uses of a `loop` is to retry an operation until it succeeds. If the operation produces a value that the rest of the code needs, put that value after the `break`, and the `loop` expression will return it.
fn main() {
    let mut counter = 0;

    let result = loop {
        counter += 1;

        if counter == 10 {
            break counter * 2;
        }
    };

    assert_eq!(result, 20);
}
while
The `while` keyword runs a loop for as long as a condition is true.
Let's write FizzBuzz using a `while` loop.
fn main() {
    // A counter variable
    let mut n = 1;

    // Loop while `n` is less than 101
    while n < 101 {
        if n % 15 == 0 {
            println!("fizzbuzz");
        } else if n % 3 == 0 {
            println!("fizz");
        } else if n % 5 == 0 {
            println!("buzz");
        } else {
            println!("{}", n);
        }

        // Increment counter
        n += 1;
    }
}
for loops
for and range
The `for in` construct iterates through an `Iterator`. One of the easiest ways to create an iterator is to use the range notation `a..b`, which yields values from `a` (inclusive) to `b` (exclusive) in steps of one.
Let's write FizzBuzz using `for` instead of `while`.
fn main() {
    // `n` will take the values: 1, 2, ..., 100 in each iteration
    for n in 1..101 {
        if n % 15 == 0 {
            println!("fizzbuzz");
        } else if n % 3 == 0 {
            println!("fizz");
        } else if n % 5 == 0 {
            println!("buzz");
        } else {
            println!("{}", n);
        }
    }
}
Alternatively, `a..=b` can be used for a range that is inclusive on both ends. The example above can be written as:
fn main() {
    // `n` will take the values: 1, 2, ..., 100 in each iteration
    for n in 1..=100 {
        if n % 15 == 0 {
            println!("fizzbuzz");
        } else if n % 3 == 0 {
            println!("fizz");
        } else if n % 5 == 0 {
            println!("buzz");
        } else {
            println!("{}", n);
        }
    }
}
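for and iterators
The `for in` construct is able to interact with an `Iterator` in several ways. As discussed in the section on the Iterator trait, by default the `for` loop applies the `into_iter` function to the collection. However, this is not the only way to convert a collection into an iterator.
`into_iter`, `iter` and `iter_mut` all convert a collection into an iterator, each providing a different view of the data within.
- `iter` - This borrows each element of the collection on each iteration, leaving the collection untouched and available for reuse after the loop.
fn main() {
    let names = vec!["Bob", "Frank", "Ferris"];

    for name in names.iter() {
        match name {
            &"Ferris" => println!("There is a rustacean among us!"),
            // TODO ^ Try deleting the & and matching just "Ferris"
            _ => println!("Hello {}", name),
        }
    }

    println!("names: {:?}", names);
}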
- `into_iter` - This consumes the collection so that on each iteration the exact data is provided. Because the data is moved into the loop, the collection is no longer available for reuse once it has been consumed.
fn main() {
    let names = vec!["Bob", "Frank", "Ferris"];

    for name in names.into_iter() {
        match name {
            "Ferris" => println!("There is a rustacean among us!"),
            _ => println!("Hello {}", name),
        }
    }

    println!("names: {:?}", names);
    // FIXME ^ Comment out this line
}
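- `iter_mut` - This mutably borrows each element of the collection, allowing the collection to be modified in place.
fn main() {
    let mut names = vec!["Bob", "Frank", "Ferris"];

    for name in names.iter_mut() {
        *name = match name {
            &mut "Ferris" => "There is a rustacean among us!",
            _ => "Hello",
        }
    }

    println!("names: {:?}", names);
}
In the snippets above, note the type of the `match` arms: it is the key difference between the kinds of iteration. The difference in type in turn determines which operations can be performed on each element.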
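To make the difference in element type concrete, the sketch below applies all three methods to a vector of integers. The helper `demo` is hypothetical and only bundles the results so they can be inspected.

```rust
// Hypothetical helper: returns the sum read through `iter` and the
// vector produced after mutating in place and then consuming `numbers`.
fn demo() -> (i32, Vec<i32>) {
    let mut numbers = vec![1, 2, 3];

    // `iter` yields `&i32`: read-only access, `numbers` stays usable.
    let sum: i32 = numbers.iter().sum();

    // `iter_mut` yields `&mut i32`: elements can be changed in place.
    for n in numbers.iter_mut() {
        *n *= 10;
    }

    // `into_iter` yields `i32`: the vector is moved and cannot be used afterwards.
    let owned: Vec<i32> = numbers.into_iter().map(|n| n + 1).collect();

    (sum, owned)
}

fn main() {
    let (sum, owned) = demo();
    println!("sum = {}, owned = {:?}", sum, owned);
}
```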
match
Rust provides pattern matching via the `match` keyword, which can be used like a C `switch`. The first matching arm is evaluated, and all possible values must be covered.
fn main() {
    let number = 13;
    // TODO ^ Try different values for `number`

    println!("Tell me about {}", number);
    match number {
        // Match a single value
        1 => println!("One!"),
        // Match several values
        2 | 3 | 5 | 7 | 11 => println!("This is a prime"),
        // TODO ^ Try adding 13 to the list of prime values
        // Match an inclusive range
        13..=19 => println!("A teen"),
        // Handle the rest of the cases
        _ => println!("Ain't special"),
        // TODO ^ Try commenting out this catch-all arm
    }

    let boolean = true;
    // Match is an expression too
    let binary = match boolean {
        // The arms of a match must cover all the possible values
        false => 0,
        true => 1,
        // TODO ^ Try commenting out one of these arms
    };

    println!("{} -> {}", boolean, binary);
}
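Destructuring
A `match` block can destructure items in a variety of ways.
tuples
Tuples can be destructured in a `match` as follows:
fn main() {
    let triple = (0, -2, 3);
    // TODO ^ Try different values for `triple`

    println!("Tell me about {:?}", triple);
    // Match can be used to destructure a tuple
    match triple {
        // Destructure the second and third elements
        (0, y, z) => println!("First is `0`, `y` is {:?}, and `z` is {:?}", y, z),
        (1, ..) => println!("First is `1` and the rest doesn't matter"),
        // `..` can be used to ignore the rest of the tuple
        _ => println!("It doesn't matter what they are"),
        // `_` means don't bind the value to a variable
    }
}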
enums
An `enum` is destructured similarly:
// `allow` required to silence warnings because only
// one variant is used.
#[allow(dead_code)]
enum Color {
    // These 3 are specified solely by their name.
    Red,
    Blue,
    Green,
    // These likewise tie `u32` tuples to different names: color models.
    RGB(u32, u32, u32),
    HSV(u32, u32, u32),
    HSL(u32, u32, u32),
    CMY(u32, u32, u32),
    CMYK(u32, u32, u32, u32),
}

fn main() {
    let color = Color::RGB(122, 17, 40);
    // TODO ^ Try different variants for `color`

    println!("What color is it?");
    // An `enum` can be destructured using a `match`.
    match color {
        Color::Red => println!("The color is Red!"),
        Color::Blue => println!("The color is Blue!"),
        Color::Green => println!("The color is Green!"),
        Color::RGB(r, g, b) =>
            println!("R: {}, G: {}, and B: {}!", r, g, b),
        Color::HSV(h, s, v) =>
            println!("H: {}, S: {}, and V: {}!", h, s, v),
        Color::HSL(h, s, l) =>
            println!("H: {}, S: {}, and L: {}!", h, s, l),
        Color::CMY(c, m, y) =>
            println!("C: {}, M: {}, and Y: {}!", c, m, y),
        Color::CMYK(c, m, y, k) =>
            println!("C: {}, M: {}, Y: {}, and K: {}!", c, m, y, k),
        // Don't need another arm because all variants have been examined
    }
}
See also: `#[allow(...)]`, color models and `enum`
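pointers/ref
For pointers, a distinction needs to be made between destructuring and dereferencing, as they are different concepts that are used differently from a language like C.
- Dereferencing uses `*`
- Destructuring uses `&`, `ref`, and `ref mut`
fn main() {
    // Assign a reference of type `i32`. The `&` signifies there
    // is a reference being assigned.
    let reference = &4;

    match reference {
        // If `reference` is pattern matched against `&val`, it results
        // in a comparison like:
        // `&i32`
        // `&val`
        // ^ We see that if the matching `&`s are dropped, then the `i32`
        // should be assigned to `val`.
        &val => println!("Got a value via destructuring: {:?}", val),
    }

    // To avoid the `&`, you dereference before matching.
    match *reference {
        val => println!("Got a value via dereferencing: {:?}", val),
    }

    // What if you don't start with a reference? `reference` was a `&`
    // because the right side was already a reference. This is not
    // a reference because the right side is not one.
    let _not_a_reference = 3;

    // Rust provides `ref` for exactly this purpose. It modifies the
    // assignment so that a reference is created for the element; this
    // reference is assigned.
    let ref _is_a_reference = 3;

    // Accordingly, by defining 2 values without references, references
    // can be retrieved via `ref` and `ref mut`.
    let value = 5;
    let mut mut_value = 6;

    // Use `ref` keyword to create a reference.
    match value {
        ref r => println!("Got a reference to a value: {:?}", r),
    }

    // Use `ref mut` similarly.
    match mut_value {
        ref mut m => {
            // Got a reference. It must be dereferenced before
            // anything can be added to it.
            *m += 10;
            println!("We added 10. `mut_value`: {:?}", m);
        },
    }
}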
structs
Similarly, a `struct` can be destructured as shown:
fn main() {
    struct Foo {
        x: (u32, u32),
        y: u32,
    }

    // Try changing the values in the struct to see what happens
    let foo = Foo { x: (1, 2), y: 3 };

    match foo {
        Foo { x: (1, b), y } => println!("First of x is 1, b = {}, y = {} ", b, y),

        // You can destructure structs and rename the variables,
        // the order is not important
        Foo { y: 2, x: i } => println!("y is 2, i = {:?}", i),

        // You can also ignore some variables:
        Foo { y, .. } => println!("y = {}, we don't care about x", y),
        // This would give an error: pattern does not mention field `x`
        //Foo { y } => println!("y = {}", y),
    }
}
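Guards
A `match` guard can be added to filter an arm.
fn main() {
    let pair = (2, -2);
    // TODO ^ Try different values for `pair`

    println!("Tell me about {:?}", pair);
    match pair {
        (x, y) if x == y => println!("These are twins"),
        // The ^ `if condition` part is a guard
        (x, y) if x + y == 0 => println!("These are opposites!"),
        (x, _) if x % 2 == 1 => println!("The first one is odd"),
        _ => println!("No correlation..."),
    }
}
Note that the compiler does not take guard conditions into account when checking whether all patterns are covered by the `match` expression. An unguarded catch-all arm such as `_` is therefore required at the end, even when the guards are logically exhaustive.
fn main() {
    let number: u8 = 4;

    match number {
        i if i == 0 => println!("Zero"),
        i if i > 0 => println!("Greater than zero"),
        // This arm should never be reached, but the compiler still requires it
        _ => println!("Fell through"),
    }
}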
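When the conditions can be written as plain patterns instead of guards, the compiler can verify exhaustiveness by itself. The sketch below rewrites the previous example with a range pattern; the function name `describe` is hypothetical.

```rust
// Hypothetical helper: classifies a `u8` without guards.
fn describe(number: u8) -> &'static str {
    match number {
        0 => "zero",
        // Integer range patterns are checked for exhaustiveness,
        // so no `_` arm is needed here.
        1..=u8::MAX => "greater than zero",
    }
}

fn main() {
    println!("{}", describe(4));
}
```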
Binding
Indirectly accessing a variable makes it impossible to branch on it and then use it without re-binding. `match` provides the `@` sigil for binding values to names:
// A function `age` which returns a `u32`.
fn age() -> u32 {
    15
}

fn main() {
    println!("Tell me what type of person you are");

    match age() {
        0 => println!("I haven't celebrated my first birthday yet"),
        // We could match 1 ..= 12 directly, but then the age itself
        // would be unknown. Binding the range to `n` makes the age
        // available in the arm.
        n @ 1 ..= 12 => println!("I'm a child of age {:?}", n),
        n @ 13 ..= 19 => println!("I'm a teen of age {:?}", n),
        // No range is bound here; `n` matches any remaining value.
        n => println!("I'm an adult of age {:?}", n),
    }
}
You can also use binding to destructure `enum` variants, such as `Option`:
fn some_number() -> Option<u32> {
    Some(42)
}

fn main() {
    match some_number() {
        // Got `Some` variant, match if its value, bound to `n`,
        // is equal to 42.
        Some(n @ 42) => println!("The Answer: {}!", n),
        // Match any other number.
        Some(n) => println!("Not interesting... {}", n),
        // Match anything else (`None` variant).
        _ => (),
    }
}
if let
For some use cases, when matching enums, `match` is awkward. For example:
#![allow(unused)]
fn main() {
    // Make `optional` of type `Option<i32>`
    let optional = Some(7);

    match optional {
        Some(i) => {
            println!("This is a really long string and `{:?}`", i);
            // ^ Needed 2 indentations just so we could destructure
            // `i` from the option.
        },
        _ => {},
        // ^ Required because `match` is exhaustive. Doesn't it seem
        // like wasted space?
    };
}
`if let` is cleaner for this use case and, in addition, allows various failure options to be specified:
fn main() {
    // All have type `Option<i32>`
    let number = Some(7);
    let letter: Option<i32> = None;
    let emoticon: Option<i32> = None;

    // The `if let` construct reads: "if `let` destructures `number` into
    // `Some(i)`, evaluate the block (`{}`)."
    if let Some(i) = number {
        println!("Matched {:?}!", i);
    }

    // If you need to specify a failure, use an `else`:
    if let Some(i) = letter {
        println!("Matched {:?}!", i);
    } else {
        // Destructure failed. Change to the failure case.
        println!("Didn't match a number. Let's go with a letter!");
    }

    // Provide an altered failing condition.
    let i_like_letters = false;

    if let Some(i) = emoticon {
        println!("Matched {:?}!", i);
    // Destructure failed. Evaluate an `else if` condition to see if the
    // alternate failure branch should be taken:
    } else if i_like_letters {
        println!("Didn't match a number. Let's go with a letter!");
    } else {
        // The condition evaluated false. This branch is the default:
        println!("I don't like letters. Let's go with an emoticon :)!");
    }
}
In the same way, `if let` can be used to match any enum value:
// Our example enum
enum Foo {
    Bar,
    Baz,
    Qux(u32)
}

fn main() {
    // Create example variables
    let a = Foo::Bar;
    let b = Foo::Baz;
    let c = Foo::Qux(100);

    // Variable a matches Foo::Bar
    if let Foo::Bar = a {
        println!("a is foobar");
    }

    // Variable b does not match Foo::Bar
    // So this will print nothing
    if let Foo::Bar = b {
        println!("b is foobar");
    }

    // Variable c matches Foo::Qux which has a value
    // Similar to Some() in the previous example
    if let Foo::Qux(value) = c {
        println!("c is {}", value);
    }

    // Binding also works with `if let`
    if let Foo::Qux(value @ 100) = c {
        println!("c is one hundred");
    }
}
Another benefit is that `if let` allows us to match non-parameterized enum variants. This is true even when the enum doesn't implement or derive `PartialEq`. In such cases `if Foo::Bar == a` would fail to compile, because instances of the enum cannot be compared for equality, whereas `if let` continues to work.
Would you like a challenge? Fix the following example to use `if let`:
// This enum purposely neither implements nor derives PartialEq.
// That is why comparing Foo::Bar == a fails below.
enum Foo {Bar}

fn main() {
    let a = Foo::Bar;

    // Variable a matches Foo::Bar
    if Foo::Bar == a {
    // ^-- this causes a compile-time error. Use `if let` instead.
        println!("a is foobar");
    }
}
while let
Similar to `if let`, `while let` can make awkward `match` sequences more tolerable. Consider the following sequence that increments `i`:
#![allow(unused)]
fn main() {
    // Make `optional` of type `Option<i32>`
    let mut optional = Some(0);

    // Repeatedly try this test.
    loop {
        match optional {
            // If `optional` destructures, evaluate the block.
            Some(i) => {
                if i > 9 {
                    println!("Greater than 9, quit!");
                    optional = None;
                } else {
                    println!("`i` is `{:?}`. Try again.", i);
                    optional = Some(i + 1);
                }
                // ^ Requires 3 indentations!
            },
            // Quit the loop when the destructure fails:
            _ => { break; }
            // ^ Why should this be required? There must be a better way!
        }
    }
}
Using `while let` makes this sequence much nicer:
fn main() {
    // Make `optional` of type `Option<i32>`
    let mut optional = Some(0);

    // This reads: "while `let` destructures `optional` into
    // `Some(i)`, evaluate the block (`{}`). Else `break`."
    while let Some(i) = optional {
        if i > 9 {
            println!("Greater than 9, quit!");
            optional = None;
        } else {
            println!("`i` is `{:?}`. Try again.", i);
            optional = Some(i + 1);
        }
        // ^ Less rightward drift, and the failing case doesn't
        // need to be handled explicitly.
    }
    // ^ `if let` had additional optional `else`/`else if`
    // clauses. `while let` does not have these.
}
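A common everyday use of `while let` is draining a collection whose accessor returns an `Option`. The sketch below uses `Vec::pop`; the helper name `drain_stack` is hypothetical.

```rust
// Hypothetical helper: pops every element and returns them in pop order.
fn drain_stack(mut stack: Vec<i32>) -> Vec<i32> {
    let mut popped = Vec::new();

    // `pop` returns `Some(top)` until the vector is empty;
    // the first `None` ends the loop.
    while let Some(top) = stack.pop() {
        popped.push(top);
    }

    popped
}

fn main() {
    println!("{:?}", drain_stack(vec![1, 2, 3]));
}
```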
Functions
Functions are declared using the `fn` keyword. Its arguments are type annotated, just like variables, and, if the function returns a value, the return type must be specified after an arrow `->`.
The final expression in the function will be used as return value. Alternatively, the `return` statement can be used to return a value earlier from within the function, even from inside loops or `if` statements.
Let's rewrite FizzBuzz using functions!
// Unlike C/C++, there's no restriction on the order of function definitions fn main() { // We can use this function here, and define it somewhere later fizzbuzz_to(100); } // Function that returns a boolean value fn is_divisible_by(lhs: u32, rhs: u32) -> bool { // Corner case, early return if rhs == 0 { return false; } // This is an expression, the `return` keyword is not necessary here lhs % rhs == 0 } // Functions that "don't" return a value, actually return the unit type `()` fn fizzbuzz(n: u32) -> () { if is_divisible_by(n, 15) { println!("fizzbuzz"); } else if is_divisible_by(n, 3) { println!("fizz"); } else if is_divisible_by(n, 5) { println!("buzz"); } else { println!("{}", n); } } // When a function returns `()`, the return type can be omitted from the // signature fn fizzbuzz_to(n: u32) { for n in 1..n + 1 { fizzbuzz(n); } }
Methods
Methods are functions attached to objects. These methods have access to the data of the object and its other methods via the `self` keyword. Methods are defined under an `impl` block.
struct Point { x: f64, y: f64, } // Implementation block, all `Point` methods go in here impl Point { // This is a static method // Static methods don't need to be called by an instance // These methods are generally used as constructors fn origin() -> Point { Point { x: 0.0, y: 0.0 } } // Another static method, taking two arguments: fn new(x: f64, y: f64) -> Point { Point { x: x, y: y } } } struct Rectangle { p1: Point, p2: Point, } impl Rectangle { // This is an instance method // `&self` is sugar for `self: &Self`, where `Self` is the type of the // caller object. In this case `Self` = `Rectangle` fn area(&self) -> f64 { // `self` gives access to the struct fields via the dot operator let Point { x: x1, y: y1 } = self.p1; let Point { x: x2, y: y2 } = self.p2; // `abs` is a `f64` method that returns the absolute value of the // caller ((x1 - x2) * (y1 - y2)).abs() } fn perimeter(&self) -> f64 { let Point { x: x1, y: y1 } = self.p1; let Point { x: x2, y: y2 } = self.p2; 2.0 * ((x1 - x2).abs() + (y1 - y2).abs()) } // This method requires the caller object to be mutable // `&mut self` desugars to `self: &mut Self` fn translate(&mut self, x: f64, y: f64) { self.p1.x += x; self.p2.x += x; self.p1.y += y; self.p2.y += y; } } // `Pair` owns resources: two heap allocated integers struct Pair(Box<i32>, Box<i32>); impl Pair { // This method "consumes" the resources of the caller object // `self` desugars to `self: Self` fn destroy(self) { // Destructure `self` let Pair(first, second) = self; println!("Destroying Pair({}, {})", first, second); // `first` and `second` go out of scope and get freed } } fn main() { let rectangle = Rectangle { // Static methods are called using double colons p1: Point::origin(), p2: Point::new(3.0, 4.0), }; // Instance methods are called using the dot operator // Note that the first argument `&self` is implicitly passed, i.e. 
// `rectangle.perimeter()` === `Rectangle::perimeter(&rectangle)` println!("Rectangle perimeter: {}", rectangle.perimeter()); println!("Rectangle area: {}", rectangle.area()); let mut square = Rectangle { p1: Point::origin(), p2: Point::new(1.0, 1.0), }; // Error! `rectangle` is immutable, but this method requires a mutable // object //rectangle.translate(1.0, 0.0); // TODO ^ Try uncommenting this line // Okay! Mutable objects can call mutable methods square.translate(1.0, 1.0); let pair = Pair(Box::new(1), Box::new(2)); pair.destroy(); // Error! Previous `destroy` call "consumed" `pair` //pair.destroy(); // TODO ^ Try uncommenting this line }
Closures
Closures are functions that can capture the enclosing environment. For example, a closure that captures the x variable:
|val| val + x
The syntax and capabilities of closures make them very convenient for on the fly usage. Calling a closure is exactly like calling a function. However, both input and return types can be inferred and input variable names must be specified.
Other characteristics of closures include:
- using `||` instead of `()` around input variables.
- optional body delimitation (`{}`) for a single expression (mandatory otherwise).
- the ability to capture the outer environment variables.
fn main() { // Increment via closures and functions. fn function(i: i32) -> i32 { i + 1 } // Closures are anonymous, here we are binding them to references // Annotation is identical to function annotation but is optional // as are the `{}` wrapping the body. These nameless functions // are assigned to appropriately named variables. let closure_annotated = |i: i32| -> i32 { i + 1 }; let closure_inferred = |i | i + 1 ; let i = 1; // Call the function and closures. println!("function: {}", function(i)); println!("closure_annotated: {}", closure_annotated(i)); println!("closure_inferred: {}", closure_inferred(i)); // A closure taking no arguments which returns an `i32`. // The return type is inferred. let one = || 1; println!("closure returning one: {}", one()); }
Capturing
Closures are inherently flexible and will do what the functionality requires to make the closure work without annotation. This allows capturing to flexibly adapt to the use case, sometimes moving and sometimes borrowing. Closures can capture variables:
- by reference: `&T`
- by mutable reference: `&mut T`
- by value: `T`
They preferentially capture variables by reference and only move down this list, to a mutable reference and then to a capture by value, when the closure body requires it.
fn main() { use std::mem; let color = String::from("green"); // A closure to print `color` which immediately borrows (`&`) `color` and // stores the borrow and closure in the `print` variable. It will remain // borrowed until `print` is used the last time. // // `println!` only requires arguments by immutable reference so it doesn't // impose anything more restrictive. let print = || println!("`color`: {}", color); // Call the closure using the borrow. print(); // `color` can be borrowed immutably again, because the closure only holds // an immutable reference to `color`. let _reborrow = &color; print(); // A move or reborrow is allowed after the final use of `print` let _color_moved = color; let mut count = 0; // A closure to increment `count` could take either `&mut count` or `count` // but `&mut count` is less restrictive so it takes that. Immediately // borrows `count`. // // A `mut` is required on `inc` because a `&mut` is stored inside. Thus, // calling the closure mutates the closure which requires a `mut`. let mut inc = || { count += 1; println!("`count`: {}", count); }; // Call the closure using a mutable borrow. inc(); // The closure still mutably borrows `count` because it is called later. // An attempt to reborrow will lead to an error. // let _reborrow = &count; // ^ TODO: try uncommenting this line. inc(); // The closure no longer needs to borrow `&mut count`. Therefore, it is // possible to reborrow without an error let _count_reborrowed = &mut count; // A non-copy type. let movable = Box::new(3); // `mem::drop` requires `T` so this must take by value. A copy type // would copy into the closure leaving the original untouched. // A non-copy must move and so `movable` immediately moves into // the closure. let consume = || { println!("`movable`: {:?}", movable); mem::drop(movable); }; // `consume` consumes the variable so this can only be called once. consume(); // consume(); // ^ TODO: Try uncommenting this line. }
Using `move` before the vertical pipes forces the closure to take ownership of captured variables:
fn main() { // `Vec` has non-copy semantics. let haystack = vec![1, 2, 3]; let contains = move |needle| haystack.contains(needle); println!("{}", contains(&1)); println!("{}", contains(&4)); // println!("There're {} elements in vec", haystack.len()); // ^ Uncommenting above line will result in compile-time error // because borrow checker doesn't allow re-using variable after it // has been moved. // Removing `move` from closure's signature will cause closure // to borrow _haystack_ variable immutably, hence _haystack_ is still // available and uncommenting above line will not cause an error. }
See also: `Box` and `std::mem::drop`
As input parameters
While Rust chooses how to capture variables on the fly mostly without type annotation, this ambiguity is not allowed when writing functions. When taking a closure as an input parameter, the closure's complete type must be annotated using one of a few traits. In order of decreasing restriction, they are:
- `Fn`: the closure captures by reference (`&T`)
- `FnMut`: the closure captures by mutable reference (`&mut T`)
- `FnOnce`: the closure captures by value (`T`)
On a variable-by-variable basis, the compiler will capture variables in the least restrictive manner possible.
For instance, consider a parameter annotated as `FnOnce`. This specifies that the closure may capture by `&T`, `&mut T`, or `T`, but the compiler will ultimately choose based on how the captured variables are used in the closure.
This is because if a move is possible, then any type of borrow should also be possible. Note that the reverse is not true. If the parameter is annotated as `Fn`, then capturing variables by `&mut T` or `T` is not allowed.
In the following example, try swapping the usage of `Fn`, `FnMut`, and `FnOnce` to see what happens:
// A function which takes a closure as an argument and calls it. // <F> denotes that F is a "Generic type parameter" fn apply<F>(f: F) where // The closure takes no input and returns nothing. F: FnOnce() { // ^ TODO: Try changing this to `Fn` or `FnMut`. f(); } // A function which takes a closure and returns an `i32`. fn apply_to_3<F>(f: F) -> i32 where // The closure takes an `i32` and returns an `i32`. F: Fn(i32) -> i32 { f(3) } fn main() { use std::mem; let greeting = "hello"; // A non-copy type. // `to_owned` creates owned data from borrowed one let mut farewell = "goodbye".to_owned(); // Capture 2 variables: `greeting` by reference and // `farewell` by value. let diary = || { // `greeting` is by reference: requires `Fn`. println!("I said {}.", greeting); // Mutation forces `farewell` to be captured by // mutable reference. Now requires `FnMut`. farewell.push_str("!!!"); println!("Then I screamed {}.", farewell); println!("Now I can sleep. zzzzz"); // Manually calling drop forces `farewell` to // be captured by value. Now requires `FnOnce`. mem::drop(farewell); }; // Call the function which applies the closure. apply(diary); // `double` satisfies `apply_to_3`'s trait bound let double = |x| 2 * x; println!("3 doubled: {}", apply_to_3(double)); }
See also: `std::mem::drop`, `Fn`, `FnMut`, Generics, where and `FnOnce`
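As a smaller sketch of the middle case, the function below takes an `FnMut` closure so that the closure may mutate a captured variable on each call. The names `call_twice` and `count_calls` are hypothetical.

```rust
// Hypothetical helper: calls the closure twice. `FnMut` lets the closure
// mutate what it captured; the parameter must be `mut` to be callable.
fn call_twice<F: FnMut()>(mut f: F) {
    f();
    f();
}

// Returns how many times the closure ran.
fn count_calls() -> u32 {
    let mut count = 0;

    // The closure captures `count` by mutable reference.
    call_twice(|| count += 1);

    // The mutable borrow ended with the call, so `count` is readable again.
    count
}

fn main() {
    println!("called {} times", count_calls());
}
```

Changing the bound to `Fn()` would reject this closure, because incrementing `count` requires mutable access to the captured variable.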
Type anonymity
Closures succinctly capture variables from enclosing scopes. Does this have any consequences? It surely does. Observe how using a closure as a function parameter requires generics, which is necessary because of how they are defined:
#![allow(unused)] fn main() { // `F` must be generic. fn apply<F>(f: F) where F: FnOnce() { f(); } }
When a closure is defined, the compiler implicitly creates a new anonymous structure to store the captured variables inside, meanwhile implementing the functionality via one of the traits: `Fn`, `FnMut`, or `FnOnce` for this unknown type. This type is assigned to the variable which is stored until calling.
Since this new type is of unknown type, any usage in a function will require generics. However, an unbounded type parameter `<T>` would still be ambiguous and not be allowed. Thus, bounding by one of the traits: `Fn`, `FnMut`, or `FnOnce` (which it implements) is sufficient to specify its type.
// `F` must implement `Fn` for a closure which takes no // inputs and returns nothing - exactly what is required // for `print`. fn apply<F>(f: F) where F: Fn() { f(); } fn main() { let x = 7; // Capture `x` into an anonymous type and implement // `Fn` for it. Store it in `print`. let print = || println!("{}", x); apply(print); }
See also: A thorough analysis, `Fn`, `FnMut`, and `FnOnce`
Input functions
Since closures may be used as arguments, you might wonder if the same can be said about functions. And indeed they can! If you declare a function that takes a closure as parameter, then any function that satisfies the trait bound of that closure can be passed as a parameter.
// Define a function which takes a generic `F` argument // bounded by `Fn`, and calls it fn call_me<F: Fn()>(f: F) { f(); } // Define a wrapper function satisfying the `Fn` bound fn function() { println!("I'm a function!"); } fn main() { // Define a closure satisfying the `Fn` bound let closure = || println!("I'm a closure!"); call_me(closure); call_me(function); }
As an additional note, the `Fn`, `FnMut`, and `FnOnce` traits dictate how a closure captures variables from the enclosing scope.
As output parameters
Closures as input parameters are possible, so returning closures as output parameters should also be possible. However, anonymous closure types are, by definition, unknown, so we have to use `impl Trait` to return them.
The valid traits for returning a closure are:
- `Fn`
- `FnMut`
- `FnOnce`
Beyond this, the `move` keyword must be used, which signals that all captures occur by value. This is required because any captures by reference would be dropped as soon as the function exited, leaving invalid references in the closure.
fn create_fn() -> impl Fn() { let text = "Fn".to_owned(); move || println!("This is a: {}", text) } fn create_fnmut() -> impl FnMut() { let text = "FnMut".to_owned(); move || println!("This is a: {}", text) } fn create_fnonce() -> impl FnOnce() { let text = "FnOnce".to_owned(); move || println!("This is a: {}", text) } fn main() { let fn_plain = create_fn(); let mut fn_mut = create_fnmut(); let fn_once = create_fnonce(); fn_plain(); fn_mut(); fn_once(); }
See also: `Fn`, `FnMut`, Generics and impl Trait.
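Returned closures become more useful when they capture an argument or keep state of their own. The following is a minimal sketch; `make_adder` and `make_counter` are hypothetical names.

```rust
// Hypothetical factory: the returned closure owns `n` thanks to `move`.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

// Hypothetical factory: the returned closure owns and mutates `count`,
// so it can only be described as `FnMut`.
fn make_counter() -> impl FnMut() -> u32 {
    let mut count = 0;
    move || {
        count += 1;
        count
    }
}

fn main() {
    let add_two = make_adder(2);
    println!("2 + 3 = {}", add_two(3));

    let mut next = make_counter();
    println!("{} {}", next(), next());
}
```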
Examples in std
This section contains a few examples of using closures from the `std` library.
Iterator::any
`Iterator::any` is a function which, when passed an iterator, will return `true` if any element satisfies the predicate. Otherwise `false`. Its signature:
pub trait Iterator {
    // The type being iterated over.
    type Item;

    // `any` takes `&mut self` meaning the caller may be borrowed
    // and modified, but not consumed.
    fn any<F>(&mut self, f: F) -> bool where
        // `FnMut` meaning any captured variable may at most be
        // modified, not consumed. `Self::Item` states it takes
        // arguments to the closure by value.
        F: FnMut(Self::Item) -> bool;
}
fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];

    // `iter()` for vecs yields `&i32`. Destructure to `i32`.
    println!("2 in vec1: {}", vec1.iter()     .any(|&x| x == 2));
    // `into_iter()` for vecs yields `i32`. No destructuring required.
    println!("2 in vec2: {}", vec2.into_iter().any(|x| x == 2));

    let array1 = [1, 2, 3];
    let array2 = [4, 5, 6];

    // `iter()` for arrays yields `&i32`.
    println!("2 in array1: {}", array1.iter()     .any(|&x| x == 2));
    // In the 2021 edition and later, `into_iter()` for arrays yields `i32`.
    // (Earlier editions yield `&i32` here and need `|&x|` instead.)
    println!("2 in array2: {}", array2.into_iter().any(|x| x == 2));
}
Searching through iterators
`Iterator::find` is a function which iterates over an iterator and searches for the first value which satisfies some condition. If none of the values satisfy the condition, it returns `None`. Its signature:
pub trait Iterator {
    // The type being iterated over.
    type Item;

    // `find` takes `&mut self` meaning the caller may be borrowed
    // and modified, but not consumed.
    fn find<P>(&mut self, predicate: P) -> Option<Self::Item> where
        // `FnMut` meaning any captured variable may at most be
        // modified, not consumed. `&Self::Item` states it takes
        // arguments to the closure by reference.
        P: FnMut(&Self::Item) -> bool;
}
fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];

    // `iter()` for vecs yields `&i32`.
    let mut iter = vec1.iter();
    // `into_iter()` for vecs yields `i32`.
    let mut into_iter = vec2.into_iter();

    // `iter()` for vecs yields `&i32`, and we want to reference one of its
    // items, so we have to destructure `&&i32` to `i32`
    println!("Find 2 in vec1: {:?}", iter     .find(|&&x| x == 2));
    // `into_iter()` for vecs yields `i32`, and we want to reference one of
    // its items, so we have to destructure `&i32` to `i32`
    println!("Find 2 in vec2: {:?}", into_iter.find(| &x| x == 2));

    let array1 = [1, 2, 3];
    let array2 = [4, 5, 6];

    // `iter()` for arrays yields `&i32`
    println!("Find 2 in array1: {:?}", array1.iter()     .find(|&&x| x == 2));
    // In the 2021 edition and later, `into_iter()` for arrays yields `i32`.
    // (Earlier editions yield `&i32` here and need `|&&x|` instead.)
    println!("Find 2 in array2: {:?}", array2.into_iter().find(|&x| x == 2));
}
`Iterator::find` gives you a reference to the item. But if you want the index of the item, use `Iterator::position`.
fn main() { let vec = vec![1, 9, 3, 3, 13, 2]; let index_of_first_even_number = vec.iter().position(|x| x % 2 == 0); assert_eq!(index_of_first_even_number, Some(5)); let index_of_first_negative_number = vec.iter().position(|x| x < &0); assert_eq!(index_of_first_negative_number, None); }
See also: `std::iter::Iterator::rposition`
Higher Order Functions
Rust provides Higher Order Functions (HOF). These are functions that take one or more functions and/or produce a more useful function. HOFs and lazy iterators give Rust its functional flavor.
fn is_odd(n: u32) -> bool {
    n % 2 == 1
}

fn main() {
    println!("Find the sum of all the squared odd numbers under 1000");
    let upper = 1000;

    // Imperative approach
    // Declare accumulator variable
    let mut acc = 0;
    // Iterate: 0, 1, 2, ... to infinity
    for n in 0.. {
        // Square the number
        let n_squared = n * n;

        if n_squared >= upper {
            // Break loop if exceeded the upper limit
            break;
        } else if is_odd(n_squared) {
            // Accumulate value, if it's odd
            acc += n_squared;
        }
    }
    println!("imperative style: {}", acc);

    // Functional approach
    let sum_of_squared_odd_numbers: u32 =
        (0..).map(|n| n * n)                             // All natural numbers squared
             .take_while(|&n_squared| n_squared < upper) // Below upper limit
             .filter(|&n_squared| is_odd(n_squared))     // That are odd
             .fold(0, |acc, n_squared| acc + n_squared); // Sum them
    println!("functional style: {}", sum_of_squared_odd_numbers);
}
Option and Iterator implement their fair share of HOFs.
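To make that concrete, here is a small sketch that chains a few of the HOFs on `Option` (`filter`, `map`) and on `Iterator` (`filter_map`). The function `half_if_even` is a hypothetical helper invented for this example.

```rust
// Hypothetical helper: parse a string and halve the number if it is even.
fn half_if_even(input: &str) -> Option<u32> {
    input
        .parse::<u32>()
        .ok()                    // `Result` -> `Option`
        .filter(|n| n % 2 == 0)  // keep the value only if it is even
        .map(|n| n / 2)          // transform the wrapped value
}

fn main() {
    // `filter_map` keeps only the `Some` results and unwraps them.
    let halves: Vec<u32> = ["4", "7", "x", "10"]
        .iter()
        .filter_map(|s| half_if_even(s))
        .collect();
    println!("{:?}", halves);
}
```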
Diverging functions
Diverging functions never return. They are marked using `!`, which is an empty type.
#![allow(unused)]
fn main() {
    fn foo() -> ! {
        panic!("This call never returns.");
    }
}
As opposed to all the other types, this one cannot be instantiated, because the set of all possible values this type can have is empty. Note that it is different from the `()` type, which has exactly one possible value.
For example, this function returns as usual, although there is no information in the return value.
fn some_fn() {
    ()
}

fn main() {
    let a: () = some_fn();
    println!("This function returns and you can see this line.")
}
By contrast, the following function never returns control to the caller.
#![feature(never_type)]
fn main() {
let x: ! = panic!("This call never returns.");
println!("You will never see this line!");
}
Although this might seem like an abstract concept, it is in fact very useful and often handy. The main advantage of this type is that it can be coerced to any other type and therefore used in places where an exact type is required, for instance in `match` branches. This allows us to write code like this:
fn main() {
    fn sum_odd_numbers(up_to: u32) -> u32 {
        let mut acc = 0;
        for i in 0..up_to {
            // Notice that the return type of this match expression must be u32
            // because of the type of the "addition" variable.
            let addition: u32 = match i%2 == 1 {
                // The "i" variable is of type u32, which is perfectly fine.
                true => i,
                // On the other hand, the "continue" expression does not return
                // u32, but it is still fine, because it never returns and therefore
                // does not violate the type requirements of the match expression.
                false => continue,
            };
            acc += addition;
        }
        acc
    }
    println!("Sum of odd numbers up to 9 (excluding): {}", sum_odd_numbers(9));
}
It is also the return type of functions that loop forever (e.g. `loop {}`) like network servers or functions that terminate the process (e.g. `exit()`).
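A minimal sketch of a user-defined diverging function used in a `match` arm follows. The names `bail` and `parse_port` are hypothetical; a real program might call `std::process::exit` instead of panicking.

```rust
// Hypothetical helper: reports an error and never returns.
fn bail(message: &str) -> ! {
    panic!("fatal: {}", message);
}

fn parse_port(input: &str) -> u16 {
    match input.parse::<u16>() {
        Ok(port) => port,
        // `bail` has type `!`, which coerces to `u16`, so both arms agree.
        Err(_) => bail("not a valid port number"),
    }
}

fn main() {
    println!("port = {}", parse_port("8080"));
}
```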
Modules
Rust provides a powerful module system that can be used to hierarchically split code into logical units (modules), and manage visibility (public/private) between them.
A module is a collection of items: functions, structs, traits, `impl` blocks, and even other modules.
Visibility
By default, the items in a module have private visibility, but this can be overridden with the `pub` modifier. Only the public items of a module can be accessed from outside the module scope.
// A module named `my_mod`
mod my_mod {
    // Items in modules default to private visibility.
    fn private_function() {
        println!("called `my_mod::private_function()`");
    }

    // Use the `pub` modifier to override default visibility.
    pub fn function() {
        println!("called `my_mod::function()`");
    }

    // Items can access other items in the same module,
    // even when private.
    pub fn indirect_access() {
        print!("called `my_mod::indirect_access()`, that\n> ");
        private_function();
    }

    // Modules can also be nested
    pub mod nested {
        pub fn function() {
            println!("called `my_mod::nested::function()`");
        }

        #[allow(dead_code)]
        fn private_function() {
            println!("called `my_mod::nested::private_function()`");
        }

        // Functions declared using `pub(in path)` syntax are only visible
        // within the given path. `path` must be a parent or ancestor module
        pub(in crate::my_mod) fn public_function_in_my_mod() {
            print!("called `my_mod::nested::public_function_in_my_mod()`, that\n> ");
            public_function_in_nested();
        }

        // Functions declared using `pub(self)` syntax are only visible within
        // the current module, which is the same as leaving them private
        pub(self) fn public_function_in_nested() {
            println!("called `my_mod::nested::public_function_in_nested()`");
        }

        // Functions declared using `pub(super)` syntax are only visible within
        // the parent module
        pub(super) fn public_function_in_super_mod() {
            println!("called `my_mod::nested::public_function_in_super_mod()`");
        }
    }

    pub fn call_public_function_in_my_mod() {
        print!("called `my_mod::call_public_function_in_my_mod()`, that\n> ");
        nested::public_function_in_my_mod();
        print!("> ");
        nested::public_function_in_super_mod();
    }

    // pub(crate) makes functions visible only within the current crate
    pub(crate) fn public_function_in_crate() {
        println!("called `my_mod::public_function_in_crate()`");
    }

    // Nested modules follow the same rules for visibility
    mod private_nested {
        #[allow(dead_code)]
        pub fn function() {
            println!("called `my_mod::private_nested::function()`");
        }

        // Private parent items will still restrict the visibility of a child item,
        // even if it is declared as visible within a bigger scope.
        #[allow(dead_code)]
        pub(crate) fn restricted_function() {
            println!("called `my_mod::private_nested::restricted_function()`");
        }
    }
}

fn function() {
    println!("called `function()`");
}

fn main() {
    // Modules allow disambiguation between items that have the same name.
    function();
    my_mod::function();

    // Public items, including those inside nested modules, can be
    // accessed from outside the parent module.
    my_mod::indirect_access();
    my_mod::nested::function();
    my_mod::call_public_function_in_my_mod();

    // pub(crate) items can be called from anywhere in the same crate
    my_mod::public_function_in_crate();

    // pub(in path) items can only be called from within the module specified
    // Error! function `public_function_in_my_mod` is private
    //my_mod::nested::public_function_in_my_mod();
    // TODO ^ Try uncommenting this line

    // Private items of a module cannot be directly accessed, even if
    // nested in a public module:

    // Error! `private_function` is private
    //my_mod::private_function();
    // TODO ^ Try uncommenting this line

    // Error! `private_function` is private
    //my_mod::nested::private_function();
    // TODO ^ Try uncommenting this line

    // Error! `private_nested` is a private module
    //my_mod::private_nested::function();
    // TODO ^ Try uncommenting this line

    // Error! `private_nested` is a private module
    //my_mod::private_nested::restricted_function();
    // TODO ^ Try uncommenting this line
}
Struct visibility
Structs have an extra level of visibility with their fields. The visibility defaults to private, and can be overridden with the `pub` modifier. This visibility only matters when a struct is accessed from outside the module where it is defined, and has the goal of hiding information (encapsulation).
mod my {
    // A public struct with a public field of generic type `T`
    pub struct OpenBox<T> {
        pub contents: T,
    }

    // A public struct with a private field of generic type `T`
    #[allow(dead_code)]
    pub struct ClosedBox<T> {
        contents: T,
    }

    impl<T> ClosedBox<T> {
        // A public constructor method
        pub fn new(contents: T) -> ClosedBox<T> {
            ClosedBox {
                contents: contents,
            }
        }
    }
}

fn main() {
    // Public structs with public fields can be constructed as usual
    let open_box = my::OpenBox { contents: "public information" };

    // and their fields can be normally accessed.
    println!("The open box contains: {}", open_box.contents);

    // Public structs with private fields cannot be constructed using field names.
    // Error! `ClosedBox` has private fields
    //let closed_box = my::ClosedBox { contents: "classified information" };
    // TODO ^ Try uncommenting this line

    // However, structs with private fields can be created using
    // public constructors
    let _closed_box = my::ClosedBox::new("classified information");

    // and the private fields of a public struct cannot be accessed.
    // Error! The `contents` field is private
    //println!("The closed box contains: {}", _closed_box.contents);
    // TODO ^ Try uncommenting this line
}
The `use` declaration
The `use` declaration can be used to bind a full path to a new name, for easier access. It is often used like this:
use crate::deeply::nested::{
my_first_function,
my_second_function,
AndATraitType
};
fn main() {
my_first_function();
}
You can use the `as` keyword to bind imports to a different name:
// Bind the `deeply::nested::function` path to `other_function`.
use deeply::nested::function as other_function;

fn function() {
    println!("called `function()`");
}

mod deeply {
    pub mod nested {
        pub fn function() {
            println!("called `deeply::nested::function()`");
        }
    }
}

fn main() {
    // Easier access to `deeply::nested::function`
    other_function();

    println!("Entering block");
    {
        // This is equivalent to `use deeply::nested::function as function`.
        // This `function()` will shadow the outer one.
        use crate::deeply::nested::function;

        // `use` bindings have a local scope. In this case, the
        // shadowing of `function()` is only in this block.
        function();

        println!("Leaving block");
    }

    function();
}
`super` and `self`
The `super` and `self` keywords can be used in the path to remove ambiguity when accessing items and to prevent unnecessary hardcoding of paths.
fn function() {
    println!("called `function()`");
}

mod cool {
    pub fn function() {
        println!("called `cool::function()`");
    }
}

mod my {
    fn function() {
        println!("called `my::function()`");
    }

    mod cool {
        pub fn function() {
            println!("called `my::cool::function()`");
        }
    }

    pub fn indirect_call() {
        // Let's access all the functions named `function` from this scope!
        print!("called `my::indirect_call()`, that\n> ");

        // The `self` keyword refers to the current module scope - in this case `my`.
        // Calling `self::function()` and calling `function()` directly both give
        // the same result, because they refer to the same function.
        self::function();
        function();

        // We can also use `self` to access another module inside `my`:
        self::cool::function();

        // The `super` keyword refers to the parent scope (outside the `my` module).
        super::function();

        // This will bind to the `cool::function` in the *crate* scope.
        // In this case the crate scope is the outermost scope.
        {
            use crate::cool::function as root_function;
            root_function();
        }
    }
}

fn main() {
    my::indirect_call();
}
File hierarchy
Modules can be mapped to a file/directory hierarchy. Let's break down the visibility example in files:
$ tree .
.
|-- my
|   |-- inaccessible.rs
|   |-- mod.rs
|   `-- nested.rs
`-- split.rs
In `split.rs`:
// This declaration will look for a file named `my.rs` or `my/mod.rs` and will
// insert its contents inside a module named `my` under this scope
mod my;
fn function() {
println!("called `function()`");
}
fn main() {
my::function();
function();
my::indirect_access();
my::nested::function();
}
In `my/mod.rs`:
// Similarly `mod inaccessible` and `mod nested` will locate the `nested.rs`
// and `inaccessible.rs` files and insert them here under their respective
// modules
mod inaccessible;
pub mod nested;
pub fn function() {
println!("called `my::function()`");
}
fn private_function() {
println!("called `my::private_function()`");
}
pub fn indirect_access() {
print!("called `my::indirect_access()`, that\n> ");
private_function();
}
In `my/nested.rs`:
pub fn function() {
println!("called `my::nested::function()`");
}
#[allow(dead_code)]
fn private_function() {
println!("called `my::nested::private_function()`");
}
In `my/inaccessible.rs`:
#[allow(dead_code)]
pub fn public_function() {
println!("called `my::inaccessible::public_function()`");
}
Let's check that things still work as before:
$ rustc split.rs && ./split
called `my::function()`
called `function()`
called `my::indirect_access()`, that
> called `my::private_function()`
called `my::nested::function()`
Crates
A crate is a compilation unit in Rust. Whenever `rustc some_file.rs` is called, `some_file.rs` is treated as the crate file. If `some_file.rs` has `mod` declarations in it, then the contents of the module files would be inserted in places where `mod` declarations in the crate file are found, before running the compiler over it. In other words, modules do not get compiled individually, only crates get compiled.
A crate can be compiled into a binary or into a library. By default, `rustc` will produce a binary from a crate. This behavior can be overridden by passing `--crate-type=lib` to `rustc`.
Creating a Library
Let's create a library, and then see how to link it to another crate.
pub fn public_function() {
println!("called rary's `public_function()`");
}
fn private_function() {
println!("called rary's `private_function()`");
}
pub fn indirect_access() {
print!("called rary's `indirect_access()`, that\n> ");
private_function();
}
$ rustc --crate-type=lib rary.rs
$ ls lib*
library.rlib
Libraries get prefixed with "lib", and by default they get named after their crate file, but this default name can be overridden by passing the `--crate-name` option to `rustc` or by using the `crate_name` attribute.
Using a Library
To link a crate to this new library you may use `rustc`'s `--extern` flag. All of its items will then be imported under a module named the same as the library. This module generally behaves the same way as any other module.
// extern crate rary; // May be required for Rust 2015 edition or earlier
fn main() {
rary::public_function();
// Error! `private_function` is private
//rary::private_function();
rary::indirect_access();
}
# Where library.rlib is the path to the compiled library, assuming that it's
# in the same directory here:
$ rustc executable.rs --extern rary=library.rlib --edition=2018 && ./executable
called rary's `public_function()`
called rary's `indirect_access()`, that
> called rary's `private_function()`
Cargo
`cargo` is the official Rust package management tool. It has lots of really useful features to improve code quality and developer velocity! These include:
- Dependency management and integration with crates.io (the official Rust package registry)
- Awareness of unit tests
- Awareness of benchmarks
This chapter will go through some quick basics, but you can find the comprehensive docs in The Cargo Book.
Dependencies
Most programs have dependencies on some libraries. If you have ever managed dependencies by hand, you know how much of a pain this can be. Luckily, the Rust ecosystem comes standard with `cargo`! `cargo` can manage dependencies for a project.
To create a new Rust project:
# A binary
cargo new foo
# OR A library
cargo new --lib foo
For the rest of this chapter, let's assume we are making a binary, rather than a library, but all of the concepts are the same.
After the above commands, you should see a file hierarchy like this:
foo
├── Cargo.toml
└── src
    └── main.rs
The `main.rs` is the root source file for your new project -- nothing new there. The `Cargo.toml` is the config file for `cargo` for this project (`foo`). If you look inside it, you should see something like this:
[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]
[dependencies]
The `name` field under `[package]` determines the name of the project. This is used by `crates.io` if you publish the crate (more later). It is also the name of the output binary when you compile.
The `version` field is a crate version number using Semantic Versioning.
The `authors` field is a list of authors used when publishing the crate.
The `[dependencies]` section lets you add dependencies for your project.
For example, suppose that we want our program to have a great CLI. You can find lots of great packages on crates.io (the official Rust package registry). One popular choice is clap. As of this writing, the most recent published version of `clap` is `2.27.1`. To add a dependency to our program, we can simply add the following to our `Cargo.toml` under `[dependencies]`: `clap = "2.27.1"`. And that's it! You can start using `clap` in your program.
`cargo` also supports other types of dependencies. Here is just a small sampling:
[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]
[dependencies]
clap = "2.27.1" # from crates.io
rand = { git = "https://github.com/rust-lang-nursery/rand" } # from online repo
bar = { path = "../bar" } # from a path in the local filesystem
`cargo` is more than a dependency manager. All of the available configuration options are listed in the format specification of `Cargo.toml`.
To build our project we can execute `cargo build` anywhere in the project directory (including subdirectories!). We can also do `cargo run` to build and run. Notice that these commands will resolve all dependencies, download crates if needed, and build everything, including your crate. (Note that it only rebuilds what it has not already built, similar to `make`).
Voila! That's all there is to it!
Conventions
In the previous chapter, we saw the following directory hierarchy:
foo
├── Cargo.toml
└── src
    └── main.rs
Suppose that we wanted to have two binaries in the same project, though. What then?
It turns out that `cargo` supports this. The default binary is built from `src/main.rs` and named after the package (`foo` here), as we saw before, but you can add additional binaries by placing them in a `bin/` directory:
foo
├── Cargo.toml
└── src
    ├── main.rs
    └── bin
        └── my_other_bin.rs
To tell `cargo` to compile or run this binary as opposed to the default or other binaries, we just pass `cargo` the `--bin my_other_bin` flag, where `my_other_bin` is the name of the binary we want to work with.
In addition to extra binaries, `cargo` supports more features such as benchmarks, tests, and examples.
In the next chapter, we will look more closely at tests.
Testing
As we know, testing is integral to any piece of software! Rust has first-class support for unit and integration testing (see the testing chapter in TRPL).
That chapter shows how to write unit tests and integration tests. Organizationally, we can place unit tests in the modules they test and integration tests in their own `tests/` directory:
foo
├── Cargo.toml
├── src
│   ├── main.rs
│   └── lib.rs
└── tests
    ├── my_test.rs
    └── my_other_test.rs
Each file in `tests` is a separate integration test, i.e. a test that is meant to test your library as if it were being called from a dependent crate.
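As a rough sketch of the unit-test half of that layout, a library file could keep its tests in a `#[cfg(test)]` module next to the code. The function `add` is a hypothetical example, and the `main` function is only there so the sketch runs on its own; a real `src/lib.rs` would not have one.

```rust
// Hypothetical contents of `src/lib.rs` for the `foo` package.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Unit tests live next to the code they test and are only compiled
// when the crate is built for testing (e.g. by `cargo test`).
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds_two_numbers() {
        assert_eq!(add(2, 3), 5);
    }
}

// Only present so that this sketch is runnable by itself.
fn main() {
    println!("2 + 3 = {}", add(2, 3));
}
```

An integration test such as `tests/my_test.rs` would instead reach the function through the crate's public API, for example with `use foo::add;`.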
The Testing chapter elaborates on the three different testing styles: Unit, Doc, and Integration.
`cargo` naturally provides an easy way to run all of your tests!
$ cargo test
You should see output like this:
$ cargo test
Compiling blah v0.1.0 (file:///nobackup/blah)
Finished dev [unoptimized + debuginfo] target(s) in 0.89 secs
Running target/debug/deps/blah-d3b32b97275ec472
running 4 tests
test test_bar ... ok
test test_baz ... ok
test test_foo_bar ... ok
test test_foo ... ok
test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
You can also run tests whose name matches a pattern:
$ cargo test test_foo
Compiling blah v0.1.0 (file:///nobackup/blah)
Finished dev [unoptimized + debuginfo] target(s) in 0.35 secs
Running target/debug/deps/blah-d3b32b97275ec472
running 2 tests
test test_foo ... ok
test test_foo_bar ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out
One word of caution: Cargo may run multiple tests concurrently, so make sure that they don't race with each other.
One example of this concurrency causing issues is if two tests output to a file, such as below:
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    // Import the necessary modules
    use std::fs::OpenOptions;
    use std::io::Write;

    // This test writes to a file
    #[test]
    fn test_file() {
        // Opens the file ferris.txt or creates one if it doesn't exist.
        let mut file = OpenOptions::new()
            .append(true)
            .create(true)
            .open("ferris.txt")
            .expect("Failed to open ferris.txt");

        // Print "Ferris" 5 times.
        for _ in 0..5 {
            file.write_all("Ferris\n".as_bytes())
                .expect("Could not write to ferris.txt");
        }
    }

    // This test tries to write to the same file
    #[test]
    fn test_file_also() {
        // Opens the file ferris.txt or creates one if it doesn't exist.
        let mut file = OpenOptions::new()
            .append(true)
            .create(true)
            .open("ferris.txt")
            .expect("Failed to open ferris.txt");

        // Print "Corro" 5 times.
        for _ in 0..5 {
            file.write_all("Corro\n".as_bytes())
                .expect("Could not write to ferris.txt");
        }
    }
}
}
Although the intent is to get the following:
$ cat ferris.txt
Ferris
Ferris
Ferris
Ferris
Ferris
Corro
Corro
Corro
Corro
Corro
What can actually end up in `ferris.txt` is this:
$ cargo test test_file && cat ferris.txt
Corro
Ferris
Corro
Ferris
Corro
Ferris
Corro
Ferris
Corro
Ferris
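One possible way to avoid this interleaving, sketched below, is to have both writers take a shared lock for their whole batch of lines. The helper `append_lines` and the lock `FILE_LOCK` are hypothetical names, and a `static` `Mutex` created with `Mutex::new` assumes a reasonably recent toolchain (Rust 1.63 or later). Running the tests on a single thread with `cargo test -- --test-threads=1` is another option.

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::sync::Mutex;
use std::thread;

// A process-wide lock; each writer holds it for its whole batch of lines.
static FILE_LOCK: Mutex<()> = Mutex::new(());

// Hypothetical helper that both tests could share.
fn append_lines(path: &str, line: &str, count: usize) {
    // The guard is held until the function returns, so batches cannot interleave.
    let _guard = FILE_LOCK.lock().unwrap();
    let mut file = OpenOptions::new()
        .append(true)
        .create(true)
        .open(path)
        .expect("Failed to open file");
    for _ in 0..count {
        file.write_all(line.as_bytes())
            .expect("Could not write to file");
    }
}

fn main() {
    let path = std::env::temp_dir().join("ferris_sketch.txt");
    let path = path.to_str().expect("temp path is not valid UTF-8").to_owned();
    let _ = std::fs::remove_file(&path);

    // Two threads stand in for the two concurrently running tests.
    let (p1, p2) = (path.clone(), path.clone());
    let a = thread::spawn(move || append_lines(&p1, "Ferris\n", 5));
    let b = thread::spawn(move || append_lines(&p2, "Corro\n", 5));
    a.join().unwrap();
    b.join().unwrap();

    print!("{}", std::fs::read_to_string(&path).unwrap());
}
```

Which batch comes first still depends on scheduling; the lock only guarantees that each batch stays contiguous.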
Build Scripts
Sometimes a normal build from `cargo` is not enough. Perhaps your crate needs some pre-requisites before `cargo` will successfully compile, things like code generation, or some native code that needs to be compiled. To solve this problem we have build scripts that Cargo can run.
A build script can be specified explicitly in `Cargo.toml` as follows:
[package]
...
build = "build.rs"
Otherwise Cargo will look for a `build.rs` file in the project directory by default.
How to use a build script
The build script is simply another Rust file that will be compiled and invoked prior to compiling anything else in the package. Hence it can be used to fulfill pre-requisites of your crate.
Cargo provides the script with inputs via environment variables, which are listed in the Cargo documentation.
The script provides output via stdout. All lines printed are written to `target/debug/build/<pkg>/output`. Further, lines prefixed with `cargo:` will be interpreted by Cargo directly and hence can be used to define parameters for the package's compilation.
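The following is a minimal sketch of what such a `build.rs` could look like, assuming the goal is to generate a small source file. The file name `generated.rs` and the function `generated_source` are made up for illustration; `OUT_DIR` is the environment variable Cargo sets for build script output, and `cargo:rerun-if-changed` is one of the `cargo:`-prefixed instructions.

```rust
// build.rs (a hypothetical example)
use std::env;
use std::fs;
use std::path::Path;

// The source text that this build script generates.
fn generated_source() -> String {
    "pub fn generated_greeting() -> &'static str { \"hello from build.rs\" }\n".to_string()
}

fn main() {
    // Cargo sets `OUT_DIR` when it runs a build script. When run outside
    // of Cargo the variable is absent, and this sketch then writes nothing.
    if let Some(out_dir) = env::var_os("OUT_DIR") {
        let dest = Path::new(&out_dir).join("generated.rs");
        fs::write(&dest, generated_source()).expect("could not write generated.rs");
    }

    // Lines prefixed with `cargo:` are instructions for Cargo itself;
    // this one asks Cargo to rerun the script only when `build.rs` changes.
    println!("cargo:rerun-if-changed=build.rs");
}
```

The crate could then pull the generated file in with `include!(concat!(env!("OUT_DIR"), "/generated.rs"));`.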
For further specification and examples, see the build scripts chapter of the Cargo documentation.
Attributes
An attribute is metadata applied to some module, crate or item. This metadata can be used for:
- conditional compilation of code
- setting crate name, version and type (binary or library)
- disabling lints (warnings)
- enabling compiler features (macros, glob imports, etc.)
- linking to a foreign library
- marking functions as unit tests
- marking functions that will be part of a benchmark
When attributes apply to a whole crate, their syntax is `#![crate_attribute]`, and when they apply to a module or item, the syntax is `#[item_attribute]` (notice the missing bang `!`).
Attributes can take arguments with different syntaxes:
#[attribute = "value"]
#[attribute(key = "value")]
#[attribute(value)]
Attributes can have multiple values and can be separated over multiple lines, too:
#[attribute(value, value2)]
#[attribute(value, value2, value3,
value4, value5)]
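To tie those placeholder forms to attributes that actually exist, here is a small sketch. The `Point` type and `copy_point` function are hypothetical; the attributes themselves (`doc`, `derive`, `deprecated`, `allow`) are built into the language.

```rust
// `#[attribute = "value"]`: `doc` is what `///` comments turn into.
#[doc = "A point in 2D space."]
// `#[attribute(value, value2, ...)]`: several values in one attribute.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// `#[attribute(key = "value")]`, followed by `#[attribute(value)]`.
#[deprecated(note = "use `Point::clone` instead")]
#[allow(dead_code)]
fn copy_point(p: &Point) -> Point {
    p.clone()
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{:?} has x = {} and y = {}", p, p.x, p.y);
}
```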
dead_code
The compiler provides a `dead_code` lint that will warn about unused functions. An attribute can be used to disable the lint.
fn used_function() {}

// `#[allow(dead_code)]` is an attribute that disables the `dead_code` lint
#[allow(dead_code)]
fn unused_function() {}

fn noisy_unused_function() {}
// FIXME ^ Add an attribute to suppress the warning

fn main() {
    used_function();
}
Note that in real programs, you should eliminate dead code. In these examples we'll allow dead code in some places because of the interactive nature of the examples.
Crates
The `crate_type` attribute can be used to tell the compiler whether a crate is a binary or a library (and even which type of library), and the `crate_name` attribute can be used to set the name of the crate.
However, it is important to note that both the `crate_type` and `crate_name` attributes have no effect whatsoever when using Cargo, the Rust package manager. Since Cargo is used for the majority of Rust projects, this means real-world uses of `crate_type` and `crate_name` are relatively limited.
// This crate is a library
#![crate_type = "lib"]
// The library is named "rary"
#![crate_name = "rary"]

pub fn public_function() {
    println!("called rary's `public_function()`");
}

fn private_function() {
    println!("called rary's `private_function()`");
}

pub fn indirect_access() {
    print!("called rary's `indirect_access()`, that\n> ");

    private_function();
}
When the `crate_type` attribute is used, we no longer need to pass the `--crate-type` flag to `rustc`.
$ rustc lib.rs
$ ls lib*
library.rlib
cfg
Configuration conditional checks are possible through two different operators:
- the `cfg` attribute: `#[cfg(...)]` in attribute position
- the `cfg!` macro: `cfg!(...)` in boolean expressions
While the former enables conditional compilation, the latter conditionally evaluates to `true` or `false` literals allowing for checks at run-time. Both utilize identical argument syntax.
// This function only gets compiled if the target OS is linux
#[cfg(target_os = "linux")]
fn are_you_on_linux() {
    println!("You are running linux!");
}

// And this function only gets compiled if the target OS is *not* linux
#[cfg(not(target_os = "linux"))]
fn are_you_on_linux() {
    println!("You are *not* running linux!");
}

fn main() {
    are_you_on_linux();

    println!("Are you sure?");
    if cfg!(target_os = "linux") {
        println!("Yes. It's definitely linux!");
    } else {
        println!("Yes. It's definitely *not* linux!");
    }
}
See also:
the reference, `cfg!`, and macros.
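Because both forms share the same argument syntax, predicates can be combined with `any`, `all` and `not` in either position. The sketch below assumes nothing beyond the built-in `unix`, `windows` and `debug_assertions` conditions; the function name `platform_family` is made up.

```rust
// Exactly one of the two definitions is compiled, depending on the target.
#[cfg(any(unix, windows))]
fn platform_family() -> &'static str {
    // `cfg!` evaluates to a `true`/`false` literal, so both branches
    // must still type-check on every target.
    if cfg!(unix) { "unix" } else { "windows" }
}

#[cfg(not(any(unix, windows)))]
fn platform_family() -> &'static str {
    "other"
}

fn main() {
    // `debug_assertions` is enabled by default in unoptimized builds.
    let checks = if cfg!(debug_assertions) { "on" } else { "off" };
    println!("family: {}, debug assertions: {}", platform_family(), checks);
}
```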
Custom
Some conditionals like `target_os` are implicitly provided by `rustc`, but custom conditionals must be passed to `rustc` using the `--cfg` flag.
#[cfg(some_condition)]
fn conditional_function() {
    println!("condition met!");
}

fn main() {
    conditional_function();
}
Try to run this to see what happens without the custom `cfg` flag.
With the custom `cfg` flag:
$ rustc --cfg some_condition custom.rs && ./custom
condition met!
Generics
Generics is the topic of generalizing types and functionalities to broader cases. This is extremely useful for reducing code duplication in many ways, but can call for rather involved syntax. Namely, being generic requires taking great care to specify over which types a generic type is actually considered valid. The simplest and most common use of generics is for type parameters.
A type parameter is specified as generic by the use of angle brackets and upper camel case: `<Aaa, Bbb, ...>`. "Generic type parameters" are typically represented as `<T>`. In Rust, "generic" also describes anything that accepts one or more generic type parameters `<T>`. Any type specified as a generic type parameter is generic, and everything else is concrete (non-generic).
For example, defining a generic function named `foo` that takes an argument `T` of any type:
fn foo<T>(arg: T) { ... }
Because `T` has been specified as a generic type parameter using `<T>`, it is considered generic when used here as `(arg: T)`. This is the case even if `T` has previously been defined as a `struct`.
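The shadowing described in the last sentence can be sketched as follows. The function names `concrete` and `generic` are made up for this illustration.

```rust
// A concrete struct that happens to be named `T`.
struct T;

// No `<T>` here, so `T` refers to the struct above.
fn concrete(_arg: T) -> &'static str {
    "concrete"
}

// `<T>` declares a type parameter, which shadows the struct
// inside this function's signature and body.
fn generic<T>(_arg: T) -> &'static str {
    "generic"
}

fn main() {
    println!("{}", concrete(T));
    println!("{}", generic(42));
    println!("{}", generic("any type works"));
}
```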
This example shows some of the syntax in action:
// A concrete type `A`.
struct A;

// In defining the type `Single`, the first use of `A` is not preceded by `<A>`.
// Therefore, `Single` is a concrete type, and `A` is defined as above.
struct Single(A);
//            ^ Here is `Single`'s first use of the type `A`.

// Here, `<T>` precedes the first use of `T`, so `SingleGen` is a generic type.
// Because the type parameter `T` is generic, it could be anything, including
// the concrete type `A` defined at the top.
struct SingleGen<T>(T);

fn main() {
    // `Single` is concrete and explicitly takes `A`.
    let _s = Single(A);

    // Create a variable `_char` of type `SingleGen<char>`
    // and give it the value `SingleGen('a')`.
    // Here, `SingleGen` has a type parameter explicitly specified.
    let _char: SingleGen<char> = SingleGen('a');

    // `SingleGen` can also have a type parameter implicitly specified:
    let _t    = SingleGen(A); // Uses `A` defined at the top.
    let _i32  = SingleGen(6); // Uses `i32`.
    let _char = SingleGen('a'); // Uses `char`.
}
Functions
The same set of rules can be applied to functions: a type `T` becomes generic when preceded by `<T>`.
Using generic functions sometimes requires explicitly specifying type parameters. This may be the case if the function is called where the return type is generic, or if the compiler doesn't have enough information to infer the necessary type parameters.
A function call with explicitly specified type parameters looks like: `fun::<A, B, ...>()`.
struct A;          // Concrete type `A`.
struct S(A);       // Concrete type `S`.
struct SGen<T>(T); // Generic type `SGen`.

// The following functions all take ownership of the variable passed into
// them and immediately go out of scope, freeing the variable.

// Define a function `reg_fn` that takes an argument `_s` of type `S`.
// This has no `<T>` so this is not a generic function.
fn reg_fn(_s: S) {}

// Define a function `gen_spec_t` that takes an argument `_s` of type `SGen<T>`.
// It has been explicitly given the type parameter `A`, but because `A` has not
// been specified as a generic type parameter for `gen_spec_t`, it is not generic.
fn gen_spec_t(_s: SGen<A>) {}

// Define a function `gen_spec_i32` that takes an argument `_s` of type `SGen<i32>`.
// It has been explicitly given the type parameter `i32`, which is a specific type.
// Because `i32` is not a generic type, this function is also not generic.
fn gen_spec_i32(_s: SGen<i32>) {}

// Define a function `generic` that takes an argument `_s` of type `SGen<T>`.
// Because `SGen<T>` is preceded by `<T>`, this function is generic over `T`.
fn generic<T>(_s: SGen<T>) {}

fn main() {
    // Using the non-generic functions
    reg_fn(S(A));          // Concrete type.
    gen_spec_t(SGen(A));   // Implicitly specified type parameter `A`.
    gen_spec_i32(SGen(6)); // Implicitly specified type parameter `i32`.

    // Explicitly specified type parameter `char` to `generic()`.
    generic::<char>(SGen('a'));

    // Implicitly specified type parameter `char` to `generic()`.
    generic(SGen('c'));
}
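As a further sketch, here are two standard-library calls where the return type is generic and the compiler would not have enough information without an explicit type parameter. The wrapper functions `parse_and_double` and `count_evens` are hypothetical.

```rust
fn parse_and_double(input: &str) -> Option<i32> {
    // `parse` is generic over its return type, so it is named explicitly.
    let n = input.parse::<i32>().ok()?;
    Some(n * 2)
}

fn count_evens(limit: u32) -> usize {
    // `collect` can build many different containers. Since `.len()` is
    // called on the result directly, the container has to be named here;
    // `_` leaves the element type to inference.
    (0..=limit).filter(|n| n % 2 == 0).collect::<Vec<_>>().len()
}

fn main() {
    println!("{:?}", parse_and_double("21"));
    println!("{}", count_evens(10));
}
```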
Implementation
Similar to functions, implementations require care to remain generic.
#![allow(unused)]
fn main() {
    struct S; // Concrete type `S`
    struct GenericVal<T>(T); // Generic type `GenericVal`

    // impl of GenericVal where we explicitly specify type parameters:
    impl GenericVal<f32> {} // Specify `f32`
    impl GenericVal<S> {} // Specify `S` as defined above

    // `<T>` Must precede the type to remain generic
    impl<T> GenericVal<T> {}
}

struct Val {
    val: f64,
}

struct GenVal<T> {
    gen_val: T,
}

// impl of Val
impl Val {
    fn value(&self) -> &f64 {
        &self.val
    }
}

// impl of GenVal for a generic type `T`
impl<T> GenVal<T> {
    fn value(&self) -> &T {
        &self.gen_val
    }
}

fn main() {
    let x = Val { val: 3.0 };
    let y = GenVal { gen_val: 3i32 };

    println!("{}, {}", x.value(), y.value());
}
See also:
functions returning references, `impl`, and `struct`
Traits
Of course `trait`s can also be generic. Here we define one which reimplements the `Drop` `trait` as a generic method to `drop` itself and an input.
// Non-copyable types.
struct Empty;
struct Null;

// A trait generic over `T`.
trait DoubleDrop<T> {
    // Define a method on the caller type which takes an
    // additional single parameter `T` and does nothing with it.
    fn double_drop(self, _: T);
}

// Implement `DoubleDrop<T>` for any generic parameter `T` and
// caller `U`.
impl<T, U> DoubleDrop<T> for U {
    // This method takes ownership of both passed arguments,
    // deallocating both.
    fn double_drop(self, _: T) {}
}

fn main() {
    let empty = Empty;
    let null  = Null;

    // Deallocate `empty` and `null`.
    empty.double_drop(null);

    //empty;
    //null;
    // ^ TODO: Try uncommenting these lines.
}
Bounds
When working with generics, the type parameters often must use traits as bounds to stipulate what functionality a type implements. For example, the following example uses the trait `Display` to print and so it requires `T` to be bound by `Display`; that is, `T` must implement `Display`.
use std::fmt::Display;

// Define a function `printer` that takes a generic type `T` which
// must implement trait `Display`.
fn printer<T: Display>(t: T) {
    println!("{}", t);
}
Bounding restricts the generic to types that conform to the bounds. That is:
struct S<T: Display>(T);
// Error! `Vec<T>` does not implement `Display`. This
// specialization will fail.
let s = S(vec![1]);
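As a minimal runnable sketch of the two fragments above (the names describe and Wrapper are hypothetical, chosen only for illustration), a Display bound accepts any type that implements the trait and rejects the rest at compile time:

```rust
use std::fmt::Display;

// Hypothetical helper: `T` must implement `Display`,
// so formatting it with `{}` is allowed.
fn describe<T: Display>(t: T) -> String {
    format!("value = {}", t)
}

// Hypothetical wrapper with the same bound on the struct itself.
struct Wrapper<T: Display>(T);

fn main() {
    println!("{}", describe(42));
    println!("{}", describe("text"));

    let w = Wrapper(1.5);
    println!("{}", describe(w.0));

    // Would not compile: `Vec<i32>` does not implement `Display`.
    // let bad = Wrapper(vec![1]);
}
```

The bound is checked where the generic is instantiated, so the commented-out line fails at compile time rather than at run time.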
Another effect of bounding is that generic instances are allowed to access the methods of traits specified in the bounds. For example:
// A trait which implements the print marker: `{:?}`. use std::fmt::Debug; trait HasArea { fn area(&self) -> f64; } impl HasArea for Rectangle { fn area(&self) -> f64 { self.length * self.height } } #[derive(Debug)] struct Rectangle { length: f64, height: f64 } #[allow(dead_code)] struct Triangle { length: f64, height: f64 } // The generic `T` must implement `Debug`. Regardless // of the type, this will work properly. fn print_debug<T: Debug>(t: &T) { println!("{:?}", t); } // `T` must implement `HasArea`. Any type which meets // the bound can access `HasArea`'s function `area`. fn area<T: HasArea>(t: &T) -> f64 { t.area() } fn main() { let rectangle = Rectangle { length: 3.0, height: 4.0 }; let _triangle = Triangle { length: 3.0, height: 4.0 }; print_debug(&rectangle); println!("Area: {}", area(&rectangle)); //print_debug(&_triangle); //println!("Area: {}", area(&_triangle)); // ^ TODO: Try uncommenting these. // | Error: Does not implement either `Debug` or `HasArea`. }
As an additional note, where clauses can also be used to apply bounds in some cases to be more expressive.
Testcase: empty bounds
A consequence of how bounds work is that even if a trait doesn't include any functionality, you can still use it as a bound. Eq and Copy are examples of such traits from the std library.
struct Cardinal; struct BlueJay; struct Turkey; trait Red {} trait Blue {} impl Red for Cardinal {} impl Blue for BlueJay {} // These functions are only valid for types which implement these // traits. The fact that the traits are empty is irrelevant. fn red<T: Red>(_: &T) -> &'static str { "red" } fn blue<T: Blue>(_: &T) -> &'static str { "blue" } fn main() { let cardinal = Cardinal; let blue_jay = BlueJay; let _turkey = Turkey; // `red()` won't work on a blue jay nor vice versa // because of the bounds. println!("A cardinal is {}", red(&cardinal)); println!("A blue jay is {}", blue(&blue_jay)); //println!("A turkey is {}", red(&_turkey)); // ^ TODO: Try uncommenting this line. }
See also: std::cmp::Eq, std::marker::Copy, and traits
Multiple bounds
Multiple bounds for a single type can be applied with +. As usual, different types are separated with a comma (,).
use std::fmt::{Debug, Display}; fn compare_prints<T: Debug + Display>(t: &T) { println!("Debug: `{:?}`", t); println!("Display: `{}`", t); } fn compare_types<T: Debug, U: Debug>(t: &T, u: &U) { println!("t: `{:?}`", t); println!("u: `{:?}`", u); } fn main() { let string = "words"; let array = [1, 2, 3]; let vec = vec![1, 2, 3]; compare_prints(&string); //compare_prints(&array); // TODO ^ Try uncommenting this. compare_types(&array, &vec); }
Where clauses
A bound can also be expressed using a where clause immediately before the opening {, rather than at the type's first mention. Additionally, where clauses can apply bounds to arbitrary types, rather than just to type parameters.
Some cases in which a where clause is useful:
- When specifying generic types and bounds separately is clearer:
impl <A: TraitB + TraitC, D: TraitE + TraitF> MyTrait<A, D> for YourType {}
// Expressing bounds with a `where` clause
impl <A, D> MyTrait<A, D> for YourType where
A: TraitB + TraitC,
D: TraitE + TraitF {}
- When using a where clause is more expressive than using normal syntax. The impl in this example cannot be directly expressed without a where clause:
use std::fmt::Debug; trait PrintInOption { fn print_in_option(self); } // Because we would otherwise have to express this as `T: Debug` or // use another method of indirect approach, this requires a `where` clause: impl<T> PrintInOption for T where Option<T>: Debug { // We want `Option<T>: Debug` as our bound because that is what's // being printed. Doing otherwise would be using the wrong bound. fn print_in_option(self) { println!("{:?}", Some(self)); } } fn main() { let vec = vec![1, 2, 3]; vec.print_in_option(); }
New Type Idiom
The newtype idiom gives compile-time guarantees that the right type of value is supplied to a program.
For example, an age verification function that checks age in years must be given a value of type Years.
struct Years(i64); struct Days(i64); impl Years { pub fn to_days(&self) -> Days { Days(self.0 * 365) } } impl Days { /// truncates partial years pub fn to_years(&self) -> Years { Years(self.0 / 365) } } fn old_enough(age: &Years) -> bool { age.0 >= 18 } fn main() { let age = Years(5); let age_days = age.to_days(); println!("Old enough {}", old_enough(&age)); println!("Old enough {}", old_enough(&age_days.to_years())); // println!("Old enough {}", old_enough(&age_days)); }
Uncomment the last print statement to observe that the type supplied must be Years.
To obtain the newtype's value as the base type, you may use the tuple or destructuring syntax like so:
struct Years(i64);

fn main() {
    let years = Years(42);
    let years_as_primitive_1: i64 = years.0; // Tuple
    let Years(years_as_primitive_2) = years; // Destructuring
}
Associated items
"Associated Items" refers to a set of rules pertaining to item
s
of various types. It is an extension to trait generics, and allows traits to internally define new items.
One such item is called an associated type, providing simpler usage patterns when the trait is generic over its container type.
The Problem
A trait that is generic over its container type has type specification requirements: users of the trait must specify all of its generic types.
In the example below, the Contains trait allows the use of the generic types A and B. The trait is then implemented for the Container type, specifying i32 for A and B so that it can be used with fn difference().
Because Contains is generic, we are forced to explicitly state all of the generic types for fn difference(). In practice, we want a way to express that A and B are determined by the input C. As you will see in the next section, associated types provide exactly that capability.
struct Container(i32, i32); // A trait which checks if 2 items are stored inside of container. // Also retrieves first or last value. trait Contains<A, B> { fn contains(&self, _: &A, _: &B) -> bool; // Explicitly requires `A` and `B`. fn first(&self) -> i32; // Doesn't explicitly require `A` or `B`. fn last(&self) -> i32; // Doesn't explicitly require `A` or `B`. } impl Contains<i32, i32> for Container { // True if the numbers stored are equal. fn contains(&self, number_1: &i32, number_2: &i32) -> bool { (&self.0 == number_1) && (&self.1 == number_2) } // Grab the first number. fn first(&self) -> i32 { self.0 } // Grab the last number. fn last(&self) -> i32 { self.1 } } // `C` contains `A` and `B`. In light of that, having to express `A` and // `B` again is a nuisance. fn difference<A, B, C>(container: &C) -> i32 where C: Contains<A, B> { container.last() - container.first() } fn main() { let number_1 = 3; let number_2 = 10; let container = Container(number_1, number_2); println!("Does container contain {} and {}: {}", &number_1, &number_2, container.contains(&number_1, &number_2)); println!("First number: {}", container.first()); println!("Last number: {}", container.last()); println!("The difference is: {}", difference(&container)); }
Associated types
The use of "Associated types" improves the overall readability of code by moving inner types locally into a trait as output types. Syntax for the trait definition is as follows:
#![allow(unused)]
fn main() {
    // `A` and `B` are defined in the trait via the `type` keyword.
    // (Note: `type` in this context is different from `type` when used for
    // aliases).
    trait Contains {
        type A;
        type B;

        // Updated syntax to refer to these new types generically.
        // Each parameter needs a pattern (here `_`): anonymous parameters
        // in trait methods are rejected since the 2018 edition.
        fn contains(&self, _: &Self::A, _: &Self::B) -> bool;
    }
}
Note that functions that use the trait Contains are no longer required to express A or B at all:
// Without using associated types
fn difference<A, B, C>(container: &C) -> i32 where
C: Contains<A, B> { ... }
// Using associated types
fn difference<C: Contains>(container: &C) -> i32 { ... }
Let's rewrite the example from the previous section using associated types:
struct Container(i32, i32); // A trait which checks if 2 items are stored inside of container. // Also retrieves first or last value. trait Contains { // Define generic types here which methods will be able to utilize. type A; type B; fn contains(&self, _: &Self::A, _: &Self::B) -> bool; fn first(&self) -> i32; fn last(&self) -> i32; } impl Contains for Container { // Specify what types `A` and `B` are. If the `input` type // is `Container(i32, i32)`, the `output` types are determined // as `i32` and `i32`. type A = i32; type B = i32; // `&Self::A` and `&Self::B` are also valid here. fn contains(&self, number_1: &i32, number_2: &i32) -> bool { (&self.0 == number_1) && (&self.1 == number_2) } // Grab the first number. fn first(&self) -> i32 { self.0 } // Grab the last number. fn last(&self) -> i32 { self.1 } } fn difference<C: Contains>(container: &C) -> i32 { container.last() - container.first() } fn main() { let number_1 = 3; let number_2 = 10; let container = Container(number_1, number_2); println!("Does container contain {} and {}: {}", &number_1, &number_2, container.contains(&number_1, &number_2)); println!("First number: {}", container.first()); println!("Last number: {}", container.last()); println!("The difference is: {}", difference(&container)); }
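When a function does need to mention the associated types, it can name them through the implementing type instead of adding generic parameters. The sketch below is a reduced version of the example above; holds_pair is a hypothetical helper added only for illustration.

```rust
struct Container(i32, i32);

trait Contains {
    type A;
    type B;

    fn contains(&self, _: &Self::A, _: &Self::B) -> bool;
}

impl Contains for Container {
    type A = i32;
    type B = i32;

    fn contains(&self, a: &i32, b: &i32) -> bool {
        (&self.0 == a) && (&self.1 == b)
    }
}

// Hypothetical helper: the argument types are written as `C::A` and
// `C::B`, so `C` stays the only generic parameter.
fn holds_pair<C: Contains>(container: &C, a: &C::A, b: &C::B) -> bool {
    container.contains(a, b)
}

fn main() {
    let container = Container(3, 10);

    println!("{}", holds_pair(&container, &3, &10));
    println!("{}", holds_pair(&container, &10, &3));
}
```

Because A and B are fixed by the impl for Container, the caller never has to spell them out.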
Phantom type parameters
A phantom type parameter is one that doesn't show up at runtime, but is checked statically (and only) at compile time.
Data types can use extra generic type parameters to act as markers or to perform type checking at compile time. These extra parameters hold no storage values, and have no runtime behavior.
In the following example, we combine std::marker::PhantomData with the phantom type parameter concept to create tuples containing different data types.
use std::marker::PhantomData; // A phantom tuple struct which is generic over `A` with hidden parameter `B`. #[derive(PartialEq)] // Allow equality test for this type. struct PhantomTuple<A, B>(A,PhantomData<B>); // A phantom type struct which is generic over `A` with hidden parameter `B`. #[derive(PartialEq)] // Allow equality test for this type. struct PhantomStruct<A, B> { first: A, phantom: PhantomData<B> } // Note: Storage is allocated for generic type `A`, but not for `B`. // Therefore, `B` cannot be used in computations. fn main() { // Here, `f32` and `f64` are the hidden parameters. // PhantomTuple type specified as `<char, f32>`. let _tuple1: PhantomTuple<char, f32> = PhantomTuple('Q', PhantomData); // PhantomTuple type specified as `<char, f64>`. let _tuple2: PhantomTuple<char, f64> = PhantomTuple('Q', PhantomData); // Type specified as `<char, f32>`. let _struct1: PhantomStruct<char, f32> = PhantomStruct { first: 'Q', phantom: PhantomData, }; // Type specified as `<char, f64>`. let _struct2: PhantomStruct<char, f64> = PhantomStruct { first: 'Q', phantom: PhantomData, }; // Compile-time Error! Type mismatch so these cannot be compared: //println!("_tuple1 == _tuple2 yields: {}", // _tuple1 == _tuple2); // Compile-time Error! Type mismatch so these cannot be compared: //println!("_struct1 == _struct2 yields: {}", // _struct1 == _struct2); }
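The claim that phantom parameters hold no storage can be checked with std::mem::size_of. This is a small sketch, assuming a hypothetical wrapper named Tagged that mirrors PhantomTuple above:

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical wrapper: stores an `A` and is tagged with a phantom `B`.
#[allow(dead_code)]
struct Tagged<A, B>(A, PhantomData<B>);

fn main() {
    // `PhantomData<B>` is zero-sized, whatever `B` is.
    println!("PhantomData<f64>: {} bytes", size_of::<PhantomData<f64>>());

    // The tag adds nothing to the size of the wrapper.
    println!("Tagged<char, f64>: {} bytes", size_of::<Tagged<char, f64>>());
    println!("char: {} bytes", size_of::<char>());
}
```

The last two lines print the same number, since only the char field occupies memory.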
See also: Derive, struct, and TupleStructs
Testcase: unit clarification
A useful method of unit conversions can be examined by implementing Add with a phantom type parameter. The Add trait is examined below:
// This construction would impose: `Self + RHS = Output`
// where RHS defaults to Self if not specified in the implementation.
pub trait Add<RHS = Self> {
type Output;
fn add(self, rhs: RHS) -> Self::Output;
}
// `Output` must be `T<U>` so that `T<U> + T<U> = T<U>`.
impl<U> Add for T<U> {
type Output = T<U>;
...
}
The whole implementation:
use std::ops::Add; use std::marker::PhantomData; /// Create void enumerations to define unit types. #[derive(Debug, Clone, Copy)] enum Inch {} #[derive(Debug, Clone, Copy)] enum Mm {} /// `Length` is a type with phantom type parameter `Unit`, /// and is not generic over the length type (that is `f64`). /// /// `f64` already implements the `Clone` and `Copy` traits. #[derive(Debug, Clone, Copy)] struct Length<Unit>(f64, PhantomData<Unit>); /// The `Add` trait defines the behavior of the `+` operator. impl<Unit> Add for Length<Unit> { type Output = Length<Unit>; // add() returns a new `Length` struct containing the sum. fn add(self, rhs: Length<Unit>) -> Length<Unit> { // `+` calls the `Add` implementation for `f64`. Length(self.0 + rhs.0, PhantomData) } } fn main() { // Specifies `one_foot` to have phantom type parameter `Inch`. let one_foot: Length<Inch> = Length(12.0, PhantomData); // `one_meter` has phantom type parameter `Mm`. let one_meter: Length<Mm> = Length(1000.0, PhantomData); // `+` calls the `add()` method we implemented for `Length<Unit>`. // // Since `Length` implements `Copy`, `add()` does not consume // `one_foot` and `one_meter` but copies them into `self` and `rhs`. let two_feet = one_foot + one_foot; let two_meters = one_meter + one_meter; // Addition works. println!("one foot + one_foot = {:?} in", two_feet.0); println!("one meter + one_meter = {:?} mm", two_meters.0); // Nonsensical operations fail as they should: // Compile-time Error: type mismatch. //let one_feter = one_foot + one_meter; }
See also: Borrowing (&), Bounds (X: Y), enum, impl & self, Overloading, ref, Traits (X for Y), and TupleStructs.
Scoping rules
Scopes play an important part in ownership, borrowing, and lifetimes. That is, they indicate to the compiler when borrows are valid, when resources can be freed, and when variables are created or destroyed.
RAII
Variables in Rust do more than just hold data in the stack: they also own resources, e.g. Box<T> owns memory in the heap. Rust enforces RAII (Resource Acquisition Is Initialization), so whenever an object goes out of scope, its destructor is called and its owned resources are freed.
This behavior shields against resource leak bugs, so in ordinary code you do not have to free memory manually. (Leaks are still possible in safe Rust, for example through std::mem::forget or reference cycles, but not by simply forgetting a cleanup call.) Here's a quick showcase:
// raii.rs fn create_box() { // Allocate an integer on the heap let _box1 = Box::new(3i32); // `_box1` is destroyed here, and memory gets freed } fn main() { // Allocate an integer on the heap let _box2 = Box::new(5i32); // A nested scope: { // Allocate an integer on the heap let _box3 = Box::new(4i32); // `_box3` is destroyed here, and memory gets freed } // Creating lots of boxes just for fun // There's no need to manually free memory! for _ in 0u32..1_000 { create_box(); } // `_box2` is destroyed here, and memory gets freed }
Of course, we can double check for memory errors using valgrind:
$ rustc raii.rs && valgrind ./raii
==26873== Memcheck, a memory error detector
==26873== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==26873== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==26873== Command: ./raii
==26873==
==26873==
==26873== HEAP SUMMARY:
==26873== in use at exit: 0 bytes in 0 blocks
==26873== total heap usage: 1,013 allocs, 1,013 frees, 8,696 bytes allocated
==26873==
==26873== All heap blocks were freed -- no leaks are possible
==26873==
==26873== For counts of detected and suppressed errors, rerun with: -v
==26873== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
No leaks here!
Destructor
The notion of a destructor in Rust is provided through the Drop trait. The destructor is called when the resource goes out of scope. This trait is not required to be implemented for every type; implement it only if your type needs its own destructor logic.
Run the example below to see how the Drop trait works. When the variable in the main function goes out of scope, the custom destructor will be invoked.
struct ToDrop;

impl Drop for ToDrop {
    fn drop(&mut self) {
        println!("ToDrop is being dropped");
    }
}

fn main() {
    let x = ToDrop;
    println!("Made a ToDrop!");
}
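A related detail is the order in which destructors run: local variables are dropped in reverse order of declaration, and an explicit call to drop runs the destructor immediately. The sketch below records that order in a log; the names Noisy and drop_order are hypothetical and exist only for this illustration.

```rust
use std::cell::RefCell;

// Hypothetical type that records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the names in the order in which the values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
        let third = Noisy { name: "third", log: &log };

        // An explicit `drop` runs the destructor right away.
        drop(third);

        // At the end of this scope `_second` is dropped, then `_first`.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```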
Ownership and moves
Because variables are in charge of freeing their own resources, resources can only have one owner. This also prevents resources from being freed more than once. Note that not all variables own resources (e.g. references).
When doing assignments (let x = y) or passing function arguments by value (foo(x)), the ownership of the resources is transferred. In Rust-speak, this is known as a move.
After moving resources, the previous owner can no longer be used. This avoids creating dangling pointers.
// This function takes ownership of the heap allocated memory fn destroy_box(c: Box<i32>) { println!("Destroying a box that contains {}", c); // `c` is destroyed and the memory freed } fn main() { // _Stack_ allocated integer let x = 5u32; // *Copy* `x` into `y` - no resources are moved let y = x; // Both values can be independently used println!("x is {}, and y is {}", x, y); // `a` is a pointer to a _heap_ allocated integer let a = Box::new(5i32); println!("a contains: {}", a); // *Move* `a` into `b` let b = a; // The pointer address of `a` is copied (not the data) into `b`. // Both are now pointers to the same heap allocated data, but // `b` now owns it. // Error! `a` can no longer access the data, because it no longer owns the // heap memory //println!("a contains: {}", a); // TODO ^ Try uncommenting this line // This function takes ownership of the heap allocated memory from `b` destroy_box(b); // Since the heap memory has been freed at this point, this action would // result in dereferencing freed memory, but it's forbidden by the compiler // Error! Same reason as the previous Error //println!("b contains: {}", b); // TODO ^ Try uncommenting this line }
Mutability
Mutability of data can be changed when ownership is transferred.
fn main() {
    let immutable_box = Box::new(5u32);

    println!("immutable_box contains {}", immutable_box);

    // Mutability error
    //*immutable_box = 4;

    // *Move* the box, changing the ownership (and mutability)
    let mut mutable_box = immutable_box;

    println!("mutable_box contains {}", mutable_box);

    // Modify the contents of the box
    *mutable_box = 4;

    println!("mutable_box now contains {}", mutable_box);
}
Partial moves
Within the destructuring of a single variable, both by-move and by-reference pattern bindings can be used at the same time. Doing this will result in a partial move of the variable, which means that parts of the variable will be moved while other parts stay. In such a case, the parent variable cannot be used afterwards as a whole; however, the parts that are only referenced (and not moved) can still be used.
fn main() { #[derive(Debug)] struct Person { name: String, age: u8, } let person = Person { name: String::from("Alice"), age: 20, }; // `name` is moved out of person, but `age` is referenced let Person { name, ref age } = person; println!("The person's age is {}", age); println!("The person's name is {}", name); // Error! borrow of partially moved value: `person` partial move occurs //println!("The person struct is {:?}", person); // `person` cannot be used but `person.age` can be used as it is not moved println!("The person's age from person struct is {}", person.age); }
Borrowing
Most of the time, we'd like to access data without taking ownership over it. To accomplish this, Rust uses a borrowing mechanism. Instead of passing objects by value (T), objects can be passed by reference (&T).
The compiler statically guarantees (via its borrow checker) that references always point to valid objects. That is, while references to an object exist, the object cannot be destroyed.
// This function takes ownership of a box and destroys it fn eat_box_i32(boxed_i32: Box<i32>) { println!("Destroying box that contains {}", boxed_i32); } // This function borrows an i32 fn borrow_i32(borrowed_i32: &i32) { println!("This int is: {}", borrowed_i32); } fn main() { // Create a boxed i32, and a stacked i32 let boxed_i32 = Box::new(5_i32); let stacked_i32 = 6_i32; // Borrow the contents of the box. Ownership is not taken, // so the contents can be borrowed again. borrow_i32(&boxed_i32); borrow_i32(&stacked_i32); { // Take a reference to the data contained inside the box let _ref_to_i32: &i32 = &boxed_i32; // Error! // Can't destroy `boxed_i32` while the inner value is borrowed later in scope. eat_box_i32(boxed_i32); // FIXME ^ Comment out this line // Attempt to borrow `_ref_to_i32` after inner value is destroyed borrow_i32(_ref_to_i32); // `_ref_to_i32` goes out of scope and is no longer borrowed. } // `boxed_i32` can now give up ownership to `eat_box` and be destroyed eat_box_i32(boxed_i32); }
Mutability
Mutable data can be mutably borrowed using &mut T. This is called a mutable reference and gives read/write access to the borrower. In contrast, &T borrows the data via an immutable reference, and the borrower can read the data but not modify it:
#[allow(dead_code)] #[derive(Clone, Copy)] struct Book { // `&'static str` is a reference to a string allocated in read only memory author: &'static str, title: &'static str, year: u32, } // This function takes a reference to a book fn borrow_book(book: &Book) { println!("I immutably borrowed {} - {} edition", book.title, book.year); } // This function takes a reference to a mutable book and changes `year` to 2014 fn new_edition(book: &mut Book) { book.year = 2014; println!("I mutably borrowed {} - {} edition", book.title, book.year); } fn main() { // Create an immutable Book named `immutabook` let immutabook = Book { // string literals have type `&'static str` author: "Douglas Hofstadter", title: "Gödel, Escher, Bach", year: 1979, }; // Create a mutable copy of `immutabook` and call it `mutabook` let mut mutabook = immutabook; // Immutably borrow an immutable object borrow_book(&immutabook); // Immutably borrow a mutable object borrow_book(&mutabook); // Borrow a mutable object as mutable new_edition(&mut mutabook); // Error! Cannot borrow an immutable object as mutable new_edition(&mut immutabook); // FIXME ^ Comment out this line }
Aliasing
Data can be immutably borrowed any number of times, but while immutably borrowed, the original data can't be mutably borrowed. On the other hand, only one mutable borrow is allowed at a time. The original data can be borrowed again only after the mutable reference has been used for the last time.
struct Point { x: i32, y: i32, z: i32 } fn main() { let mut point = Point { x: 0, y: 0, z: 0 }; let borrowed_point = &point; let another_borrow = &point; // Data can be accessed via the references and the original owner println!("Point has coordinates: ({}, {}, {})", borrowed_point.x, another_borrow.y, point.z); // Error! Can't borrow `point` as mutable because it's currently // borrowed as immutable. // let mutable_borrow = &mut point; // TODO ^ Try uncommenting this line // The borrowed values are used again here println!("Point has coordinates: ({}, {}, {})", borrowed_point.x, another_borrow.y, point.z); // The immutable references are no longer used for the rest of the code so // it is possible to reborrow with a mutable reference. let mutable_borrow = &mut point; // Change data via mutable reference mutable_borrow.x = 5; mutable_borrow.y = 2; mutable_borrow.z = 1; // Error! Can't borrow `point` as immutable because it's currently // borrowed as mutable. // let y = &point.y; // TODO ^ Try uncommenting this line // Error! Can't print because `println!` takes an immutable reference. // println!("Point Z coordinate is {}", point.z); // TODO ^ Try uncommenting this line // Ok! Mutable references can be passed as immutable to `println!` println!("Point has coordinates: ({}, {}, {})", mutable_borrow.x, mutable_borrow.y, mutable_borrow.z); // The mutable reference is no longer used for the rest of the code so it // is possible to reborrow let new_borrowed_point = &point; println!("Point now has coordinates: ({}, {}, {})", new_borrowed_point.x, new_borrowed_point.y, new_borrowed_point.z); }
The ref pattern
When doing pattern matching or destructuring via the let binding, the ref keyword can be used to take references to the fields of a struct/tuple. The example below shows a few instances where this can be useful:
#[derive(Clone, Copy)] struct Point { x: i32, y: i32 } fn main() { let c = 'Q'; // A `ref` borrow on the left side of an assignment is equivalent to // an `&` borrow on the right side. let ref ref_c1 = c; let ref_c2 = &c; println!("ref_c1 equals ref_c2: {}", *ref_c1 == *ref_c2); let point = Point { x: 0, y: 0 }; // `ref` is also valid when destructuring a struct. let _copy_of_x = { // `ref_to_x` is a reference to the `x` field of `point`. let Point { x: ref ref_to_x, y: _ } = point; // Return a copy of the `x` field of `point`. *ref_to_x }; // A mutable copy of `point` let mut mutable_point = point; { // `ref` can be paired with `mut` to take mutable references. let Point { x: _, y: ref mut mut_ref_to_y } = mutable_point; // Mutate the `y` field of `mutable_point` via a mutable reference. *mut_ref_to_y = 1; } println!("point is ({}, {})", point.x, point.y); println!("mutable_point is ({}, {})", mutable_point.x, mutable_point.y); // A mutable tuple that includes a pointer let mut mutable_tuple = (Box::new(5u32), 3u32); { // Destructure `mutable_tuple` to change the value of `last`. let (_, ref mut last) = mutable_tuple; *last = 2u32; } println!("tuple is {:?}", mutable_tuple); }
Lifetimes
A lifetime is a construct the compiler (or more specifically, its borrow checker) uses to ensure all borrows are valid. Specifically, a variable's lifetime begins when it is created and ends when it is destroyed. While lifetimes and scopes are often referred to together, they are not the same.
Take, for example, the case where we borrow a variable via &. The borrow has a lifetime that is determined by where it is declared. As a result, the borrow is valid as long as it ends before the lender is destroyed. However, the scope of the borrow is determined by where the reference is used.
In the following example and in the rest of this section, we will see how lifetimes relate to scopes, as well as how the two differ.
// Lifetimes are annotated below with lines denoting the creation
// and destruction of each variable.
// `i` has the longest lifetime because its scope entirely encloses
// both `borrow1` and `borrow2`. The duration of `borrow1` compared
// to `borrow2` is irrelevant since they are disjoint.
fn main() {
    let i = 3; // Lifetime for `i` starts. ────────────────┐
    //                                                     │
    { //                                                   │
        let borrow1 = &i; // `borrow1` lifetime starts. ──┐│
        //                                                ││
        println!("borrow1: {}", borrow1); //              ││
    } // `borrow1` ends. ─────────────────────────────────┘│
    //                                                     │
    //                                                     │
    { //                                                   │
        let borrow2 = &i; // `borrow2` lifetime starts. ──┐│
        //                                                ││
        println!("borrow2: {}", borrow2); //              ││
    } // `borrow2` ends. ─────────────────────────────────┘│
    //                                                     │
}   // Lifetime ends. ─────────────────────────────────────┘
Note that no names or types are assigned to label lifetimes. This restricts how lifetimes will be able to be used as we will see.
Explicit annotation
The borrow checker uses explicit lifetime annotations to determine how long references should be valid. In cases where lifetimes are not elided (see the note on elision at the end of this section), Rust requires explicit annotations to determine what the lifetime of a reference should be. The syntax for explicitly annotating a lifetime uses an apostrophe character as follows:
foo<'a>
// `foo` has a lifetime parameter `'a`
Similar to closures, using lifetimes requires generics.
Additionally, this lifetime syntax indicates that the lifetime of foo may not exceed that of 'a. Explicit annotation of a type has the form &'a T where 'a has already been introduced.
In cases with multiple lifetimes, the syntax is similar:
foo<'a, 'b>
// `foo` has lifetime parameters `'a` and `'b`
In this case, the lifetime of foo cannot exceed that of either 'a or 'b.
See the following example for explicit lifetime annotation in use:
// `print_refs` takes two references to `i32` which have different // lifetimes `'a` and `'b`. These two lifetimes must both be at // least as long as the function `print_refs`. fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) { println!("x is {} and y is {}", x, y); } // A function which takes no arguments, but has a lifetime parameter `'a`. fn failed_borrow<'a>() { let _x = 12; // ERROR: `_x` does not live long enough let y: &'a i32 = &_x; // Attempting to use the lifetime `'a` as an explicit type annotation // inside the function will fail because the lifetime of `&_x` is shorter // than that of `y`. A short lifetime cannot be coerced into a longer one. } fn main() { // Create variables to be borrowed below. let (four, nine) = (4, 9); // Borrows (`&`) of both variables are passed into the function. print_refs(&four, &nine); // Any input which is borrowed must outlive the borrower. // In other words, the lifetime of `four` and `nine` must // be longer than that of `print_refs`. failed_borrow(); // `failed_borrow` contains no references to force `'a` to be // longer than the lifetime of the function, but `'a` is longer. // Because the lifetime is never constrained, it defaults to `'static`. }
Note: elision implicitly annotates lifetimes and so is different from explicit annotation.
Functions
Ignoring elision, function signatures with lifetimes have a few constraints:
- any reference must have an annotated lifetime.
- any reference being returned must have the same lifetime as an input or be 'static.
Additionally, note that returning references without input is banned if it would result in returning references to invalid data. The following example shows off some valid forms of functions with lifetimes:
// One input reference with lifetime `'a` which must live // at least as long as the function. fn print_one<'a>(x: &'a i32) { println!("`print_one`: x is {}", x); } // Mutable references are possible with lifetimes as well. fn add_one<'a>(x: &'a mut i32) { *x += 1; } // Multiple elements with different lifetimes. In this case, it // would be fine for both to have the same lifetime `'a`, but // in more complex cases, different lifetimes may be required. fn print_multi<'a, 'b>(x: &'a i32, y: &'b i32) { println!("`print_multi`: x is {}, y is {}", x, y); } // Returning references that have been passed in is acceptable. // However, the correct lifetime must be returned. fn pass_x<'a, 'b>(x: &'a i32, _: &'b i32) -> &'a i32 { x } //fn invalid_output<'a>() -> &'a String { &String::from("foo") } // The above is invalid: `'a` must live longer than the function. // Here, `&String::from("foo")` would create a `String`, followed by a // reference. Then the data is dropped upon exiting the scope, leaving // a reference to invalid data to be returned. fn main() { let x = 7; let y = 9; print_one(&x); print_multi(&x, &y); let z = pass_x(&x, &y); print_one(z); let mut t = 3; add_one(&mut t); print_one(&t); }
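The second constraint, that a returned reference is either tied to an input or is 'static, can be sketched with two small functions. The names longest and greeting are hypothetical and are not part of the example above.

```rust
// The output may be either input, so both inputs and the output share `'a`.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// No reference inputs: the returned reference must be `'static`.
// A string literal qualifies because it lives for the whole program.
fn greeting() -> &'static str {
    "hello"
}

fn main() {
    let first = String::from("lifetime");
    let second = String::from("scope");

    println!("{}", longest(&first, &second));
    println!("{}", greeting());
}
```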
Methods
Methods are annotated similarly to functions:
struct Owner(i32);

impl Owner {
    // Annotate lifetimes as in a standalone function.
    fn add_one<'a>(&'a mut self) { self.0 += 1; }
    fn print<'a>(&'a self) {
        println!("`print`: {}", self.0);
    }
}

fn main() {
    let mut owner = Owner(18);

    owner.add_one();
    owner.print();
}
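In practice these annotations are usually left out, because lifetime elision fills them in for methods that take &self or &mut self. Below is a minimal sketch of the same Owner type with elided lifetimes; the value method is a hypothetical addition used only to observe the result.

```rust
struct Owner(i32);

impl Owner {
    // Same as `fn add_one<'a>(&'a mut self)`, with the lifetime elided.
    fn add_one(&mut self) { self.0 += 1; }

    // Hypothetical accessor added for illustration. With elision, the
    // returned reference borrows from `self`.
    fn value(&self) -> &i32 { &self.0 }
}

fn main() {
    let mut owner = Owner(18);

    owner.add_one();
    println!("{}", owner.value());
}
```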
Structs
Annotation of lifetimes in structures is also similar to functions:
// A type `Borrowed` which houses a reference to an // `i32`. The reference to `i32` must outlive `Borrowed`. #[derive(Debug)] struct Borrowed<'a>(&'a i32); // Similarly, both references here must outlive this structure. #[derive(Debug)] struct NamedBorrowed<'a> { x: &'a i32, y: &'a i32, } // An enum which is either an `i32` or a reference to one. #[derive(Debug)] enum Either<'a> { Num(i32), Ref(&'a i32), } fn main() { let x = 18; let y = 15; let single = Borrowed(&x); let double = NamedBorrowed { x: &x, y: &y }; let reference = Either::Ref(&x); let number = Either::Num(y); println!("x is borrowed in {:?}", single); println!("x and y are borrowed in {:?}", double); println!("x is borrowed in {:?}", reference); println!("y is *not* borrowed in {:?}", number); }
Traits
Lifetime annotations in trait methods are basically similar to those in functions. Note that `impl` may have lifetime annotations too.
// A struct with annotation of lifetimes. #[derive(Debug)] struct Borrowed<'a> { x: &'a i32, } // Annotate lifetimes to impl. impl<'a> Default for Borrowed<'a> { fn default() -> Self { Self { x: &10, } } } fn main() { let b: Borrowed = Default::default(); println!("b is {:?}", b); }
Bounds
Just like generic types can be bounded, lifetimes (themselves generic)
use bounds as well. The :
character has a slightly different meaning here,
but +
is the same. Note how the following read:
- `T: 'a`: All references in `T` must outlive lifetime `'a`.
- `T: Trait + 'a`: Type `T` must implement trait `Trait` and all references in `T` must outlive `'a`.
The example below shows the above syntax in action, used after the keyword `where`:
use std::fmt::Debug; // Trait to bound with. #[derive(Debug)] struct Ref<'a, T: 'a>(&'a T); // `Ref` contains a reference to a generic type `T` that has // an unknown lifetime `'a`. `T` is bounded such that any // *references* in `T` must outlive `'a`. Additionally, the lifetime // of `Ref` may not exceed `'a`. // A generic function which prints using the `Debug` trait. fn print<T>(t: T) where T: Debug { println!("`print`: t is {:?}", t); } // Here a reference to `T` is taken where `T` implements // `Debug` and all *references* in `T` outlive `'a`. In // addition, `'a` must outlive the function. fn print_ref<'a, T>(t: &'a T) where T: Debug + 'a { println!("`print_ref`: t is {:?}", t); } fn main() { let x = 7; let ref_x = Ref(&x); print_ref(&ref_x); print(ref_x); }
See also:
generics, bounds in generics, and multiple bounds in generics
Coercion
A longer lifetime can be coerced into a shorter one so that it works inside a scope it normally wouldn't work in. This comes in the form of inferred coercion by the Rust compiler, and also in the form of declaring a lifetime difference:
// Here, Rust infers a lifetime that is as short as possible. // The two references are then coerced to that lifetime. fn multiply<'a>(first: &'a i32, second: &'a i32) -> i32 { first * second } // `<'a: 'b, 'b>` reads as lifetime `'a` is at least as long as `'b`. // Here, we take in an `&'a i32` and return a `&'b i32` as a result of coercion. fn choose_first<'a: 'b, 'b>(first: &'a i32, _: &'b i32) -> &'b i32 { first } fn main() { let first = 2; // Longer lifetime { let second = 3; // Shorter lifetime println!("The product is {}", multiply(&first, &second)); println!("{} is the first", choose_first(&first, &second)); }; }
Static
Rust has a few reserved lifetime names. One of those is 'static
. You
might encounter it in two situations:
// A reference with 'static lifetime: let s: &'static str = "hello world"; // 'static as part of a trait bound: fn generic<T>(x: T) where T: 'static {}
Both are related but subtly different, and this is a common source of confusion when learning Rust. Here are some examples for each situation:
Reference lifetime
As a reference lifetime 'static
indicates that the data pointed to by
the reference lives for the entire lifetime of the running program.
It can still be coerced to a shorter lifetime.
There are two ways to make a variable with 'static
lifetime, and both
are stored in the read-only memory of the binary:
- Make a constant with the `static` declaration.
- Make a `string` literal which has type: `&'static str`.
See the following example for a display of each method:
// Make a constant with `'static` lifetime. static NUM: i32 = 18; // Returns a reference to `NUM` where its `'static` // lifetime is coerced to that of the input argument. fn coerce_static<'a>(_: &'a i32) -> &'a i32 { &NUM } fn main() { { // Make a `string` literal and print it: let static_string = "I'm in read-only memory"; println!("static_string: {}", static_string); // When `static_string` goes out of scope, the reference // can no longer be used, but the data remains in the binary. } { // Make an integer to use for `coerce_static`: let lifetime_num = 9; // Coerce `NUM` to lifetime of `lifetime_num`: let coerced_static = coerce_static(&lifetime_num); println!("coerced_static: {}", coerced_static); } println!("NUM: {} stays accessible!", NUM); }
Trait bound
As a trait bound, it means the type does not contain any non-static references. That is, the receiver can hold on to the type for as long as they want, and it will never become invalid until they drop it.
It's important to understand this means that any owned data always passes
a 'static
lifetime bound, but a reference to that owned data generally
does not:
use std::fmt::Debug; fn print_it( input: impl Debug + 'static ) { println!( "'static value passed in is: {:?}", input ); } fn main() { // i is owned and contains no references, thus it's 'static: let i = 5; print_it(i); // oops, &i only has the lifetime defined by the scope of // main(), so it's not 'static: print_it(&i); }
The compiler will tell you:
error[E0597]: `i` does not live long enough
--> src/lib.rs:15:15
|
15 | print_it(&i);
| ---------^^--
| | |
| | borrowed value does not live long enough
| argument requires that `i` is borrowed for `'static`
16 | }
| - `i` dropped here while still borrowed
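The other half of the rule can be sketched as follows: owned values pass a `'static` bound, and so do references that really are `'static`. This is a minimal sketch; `describe` is a hypothetical helper introduced here for illustration and is not part of the original example.

```rust
use std::fmt::Debug;

// Hypothetical helper: accepts anything that is `Debug` and contains
// no non-'static references, and returns its debug representation.
fn describe(input: impl Debug + 'static) -> String {
    format!("{:?}", input)
}

static NUM: i32 = 18;

fn main() {
    // Owned data: no borrowed references inside, so the bound is satisfied.
    let owned = String::from("owned");
    println!("{}", describe(owned));
    println!("{}", describe(vec![1, 2, 3]));

    // References are accepted too, as long as they are `&'static`.
    println!("{}", describe(&NUM));
    println!("{}", describe("literal"));
}
```

Passing the value itself (or a clone of it) instead of a short-lived borrow is usually the simplest way to satisfy such a bound.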
Elision
Some lifetime patterns are overwhelmingly common and so the borrow checker will allow you to omit them to save typing and to improve readability. This is known as elision. Elision exists in Rust solely because these patterns are common.
The following code shows a few examples of elision. For a more comprehensive description of elision, see lifetime elision in the book.
// `elided_input` and `annotated_input` essentially have identical signatures // because the lifetime of `elided_input` is inferred by the compiler: fn elided_input(x: &i32) { println!("`elided_input`: {}", x); } fn annotated_input<'a>(x: &'a i32) { println!("`annotated_input`: {}", x); } // Similarly, `elided_pass` and `annotated_pass` have identical signatures // because the lifetime is added implicitly to `elided_pass`: fn elided_pass(x: &i32) -> &i32 { x } fn annotated_pass<'a>(x: &'a i32) -> &'a i32 { x } fn main() { let x = 3; elided_input(&x); annotated_input(&x); println!("`elided_pass`: {}", elided_pass(&x)); println!("`annotated_pass`: {}", annotated_pass(&x)); }
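Elision does not cover every signature. The sketch below, which uses the hypothetical names `longest` and `Counter`, shows one case where the annotation has to be written out (two input references and no `self`) and one case where a method's output lifetime is taken from `&self`.

```rust
// With two input references and no `self`, elision cannot decide which
// input the output borrows from, so the lifetime must be written out.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

struct Counter {
    count: i32,
}

impl Counter {
    // In methods, the output lifetime is elided to that of `&self`,
    // even when there are other reference parameters.
    fn value(&self, _label: &str) -> &i32 {
        &self.count
    }
}

fn main() {
    println!("`longest`: {}", longest("elision", "rust"));

    let counter = Counter { count: 3 };
    println!("`value`: {}", counter.value("unused"));
}
```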
Traits
A `trait` is a collection of methods defined for an unknown type: `Self`. They can access other methods declared in the same trait.
Traits can be implemented for any data type. In the example below, we define `Animal`, a group of methods. The `Animal` `trait` is then implemented for the `Sheep` data type, allowing the use of methods from `Animal` with a `Sheep`.
struct Sheep { naked: bool, name: &'static str } trait Animal { // Static method signature; `Self` refers to the implementor type. fn new(name: &'static str) -> Self; // Instance method signatures; these will return a string. fn name(&self) -> &'static str; fn noise(&self) -> &'static str; // Traits can provide default method definitions. fn talk(&self) { println!("{} says {}", self.name(), self.noise()); } } impl Sheep { fn is_naked(&self) -> bool { self.naked } fn shear(&mut self) { if self.is_naked() { // Implementor methods can use the implementor's trait methods. println!("{} is already naked...", self.name()); } else { println!("{} gets a haircut!", self.name); self.naked = true; } } } // Implement the `Animal` trait for `Sheep`. impl Animal for Sheep { // `Self` is the implementor type: `Sheep`. fn new(name: &'static str) -> Sheep { Sheep { name: name, naked: false } } fn name(&self) -> &'static str { self.name } fn noise(&self) -> &'static str { if self.is_naked() { "baaaaah?" } else { "baaaaah!" } } // Default trait methods can be overridden. fn talk(&self) { // For example, we can add some quiet contemplation. println!("{} pauses briefly... {}", self.name, self.noise()); } } fn main() { // Type annotation is necessary in this case. let mut dolly: Sheep = Animal::new("Dolly"); // TODO ^ Try removing the type annotations. dolly.talk(); dolly.shear(); dolly.talk(); }
Derive
The compiler is capable of providing basic implementations for some traits via
the #[derive]
attribute. These traits can still be
manually implemented if a more complex behavior is required.
The following is a list of derivable traits:
- Comparison traits: `Eq`, `PartialEq`, `Ord`, `PartialOrd`.
- `Clone`, to create `T` from `&T` via a copy.
- `Copy`, to give a type 'copy semantics' instead of 'move semantics'.
- `Hash`, to compute a hash from `&T`.
- `Default`, to create an empty instance of a data type.
- `Debug`, to format a value using the `{:?}` formatter.
// `Centimeters`, a tuple struct that can be compared #[derive(PartialEq, PartialOrd)] struct Centimeters(f64); // `Inches`, a tuple struct that can be printed #[derive(Debug)] struct Inches(i32); impl Inches { fn to_centimeters(&self) -> Centimeters { let &Inches(inches) = self; Centimeters(inches as f64 * 2.54) } } // `Seconds`, a tuple struct with no additional attributes struct Seconds(i32); fn main() { let _one_second = Seconds(1); // Error: `Seconds` can't be printed; it doesn't implement the `Debug` trait //println!("One second looks like: {:?}", _one_second); // TODO ^ Try uncommenting this line // Error: `Seconds` can't be compared; it doesn't implement the `PartialEq` trait //let _this_is_true = (_one_second == _one_second); // TODO ^ Try uncommenting this line let foot = Inches(12); println!("One foot equals {:?}", foot); let meter = Centimeters(100.0); let cmp = if foot.to_centimeters() < meter { "smaller" } else { "bigger" }; println!("One foot is {} than one meter.", cmp); }
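The example above exercises only `PartialEq`, `PartialOrd` and `Debug`. As a rough sketch of the remaining derivable traits, the hypothetical `Version` struct below derives all of them at once; the type and its fields are assumptions made for illustration.

```rust
use std::collections::HashSet;

// Hypothetical type used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
struct Version {
    major: u32,
    minor: u32,
}

fn main() {
    // `Default` fills every field with its own default (0 for `u32`).
    let zero = Version::default();
    println!("default: {:?}", zero);

    // `Copy` means `a` is still usable after being assigned to `b`.
    let a = Version { major: 1, minor: 2 };
    let b = a;
    println!("a is {}.{}", a.major, a.minor);
    println!("a == b: {}", a == b);

    // Derived `Ord` compares fields in declaration order.
    let newer = Version { major: 1, minor: 3 };
    println!("a < newer: {}", a < newer);

    // `Hash` + `Eq` allow use in a `HashSet`; duplicates collapse.
    let set: HashSet<Version> = vec![a, b, newer].into_iter().collect();
    println!("unique versions: {}", set.len());
}
```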
Returning Traits with dyn
The Rust compiler needs to know how much space every function's return type requires. This means all your functions have to return a concrete type. Unlike other languages, if you have a trait like Animal
, you can't write a function that returns Animal
, because its different implementations will need different amounts of memory.
However, there's an easy workaround. Instead of returning a trait object directly, our functions return a Box
which contains some Animal
. A box
is just a reference to some memory in the heap. Because a reference has a statically-known size, and the compiler can guarantee it points to a heap-allocated Animal
, we can return a trait from our function!
Rust tries to be as explicit as possible whenever it allocates memory on the heap. So if your function returns a pointer-to-trait-on-heap in this way, you need to write the return type with the dyn
keyword, e.g. Box<dyn Animal>
.
struct Sheep {} struct Cow {} trait Animal { // Instance method signature fn noise(&self) -> &'static str; } // Implement the `Animal` trait for `Sheep`. impl Animal for Sheep { fn noise(&self) -> &'static str { "baaaaah!" } } // Implement the `Animal` trait for `Cow`. impl Animal for Cow { fn noise(&self) -> &'static str { "moooooo!" } } // Returns some struct that implements Animal, but we don't know which one at compile time. fn random_animal(random_number: f64) -> Box<dyn Animal> { if random_number < 0.5 { Box::new(Sheep {}) } else { Box::new(Cow {}) } } fn main() { let random_number = 0.234; let animal = random_animal(random_number); println!("You've randomly chosen an animal, and it says {}", animal.noise()); }
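Because every `Box<dyn Animal>` has the same size, boxed trait objects can also be stored side by side in one collection. The following is a minimal sketch of that idea; it repeats the `Animal` definitions so that it compiles on its own, and `noises` is a hypothetical helper added for illustration.

```rust
trait Animal {
    fn noise(&self) -> &'static str;
}

struct Sheep;
struct Cow;

impl Animal for Sheep {
    fn noise(&self) -> &'static str { "baaaaah!" }
}

impl Animal for Cow {
    fn noise(&self) -> &'static str { "moooooo!" }
}

// Hypothetical helper: collects the noises of a mixed herd.
fn noises(herd: &[Box<dyn Animal>]) -> Vec<&'static str> {
    herd.iter().map(|animal| animal.noise()).collect()
}

fn main() {
    // Each element is a `Box` of the same size, although `Sheep` and
    // `Cow` are different types.
    let herd: Vec<Box<dyn Animal>> = vec![Box::new(Sheep), Box::new(Cow)];
    for noise in noises(&herd) {
        println!("{}", noise);
    }
}
```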
Operator Overloading
In Rust, many of the operators can be overloaded via traits. That is, some operators can
be used to accomplish different tasks based on their input arguments. This is possible
because operators are syntactic sugar for method calls. For example, the +
operator in
a + b
calls the add
method (as in a.add(b)
). This add
method is part of the Add
trait. Hence, the +
operator can be used by any implementor of the Add
trait.
A list of the traits, such as Add
, that overload operators can be found in core::ops
.
use std::ops; struct Foo; struct Bar; #[derive(Debug)] struct FooBar; #[derive(Debug)] struct BarFoo; // The `std::ops::Add` trait is used to specify the functionality of `+`. // Here, we make `Add<Bar>` - the trait for addition with a RHS of type `Bar`. // The following block implements the operation: Foo + Bar = FooBar impl ops::Add<Bar> for Foo { type Output = FooBar; fn add(self, _rhs: Bar) -> FooBar { println!("> Foo.add(Bar) was called"); FooBar } } // By reversing the types, we end up implementing non-commutative addition. // Here, we make `Add<Foo>` - the trait for addition with a RHS of type `Foo`. // This block implements the operation: Bar + Foo = BarFoo impl ops::Add<Foo> for Bar { type Output = BarFoo; fn add(self, _rhs: Foo) -> BarFoo { println!("> Bar.add(Foo) was called"); BarFoo } } fn main() { println!("Foo + Bar = {:?}", Foo + Bar); println!("Bar + Foo = {:?}", Bar + Foo); }
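To make the "syntactic sugar" point concrete, here is a small sketch with a hypothetical `Point` type (an assumption for illustration). It shows that `a + b` and the explicit `a.add(b)` call produce the same value.

```rust
use std::ops::Add;

// Hypothetical 2D point type, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// `Add` defaults its right-hand side to `Self`, so this is `Point + Point`.
impl Add for Point {
    type Output = Point;

    fn add(self, rhs: Point) -> Point {
        Point { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = Point { x: 3, y: 4 };

    // The operator form and the explicit method call are equivalent.
    println!("a + b    = {:?}", a + b);
    println!("a.add(b) = {:?}", a.add(b));
}
```

Calling `a.add(b)` directly requires the `Add` trait to be in scope, which is why the sketch imports it.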
Drop
The Drop
trait only has one method: drop
, which is called automatically
when an object goes out of scope. The main use of the Drop
trait is to free the
resources that the implementor instance owns.
Box
, Vec
, String
, File
, and Process
are some examples of types that
implement the Drop
trait to free resources. The Drop
trait can also be
manually implemented for any custom data type.
The following example adds a print to console to the drop
function to announce
when it is called.
struct Droppable { name: &'static str, } // This trivial implementation of `drop` adds a print to console. impl Drop for Droppable { fn drop(&mut self) { println!("> Dropping {}", self.name); } } fn main() { let _a = Droppable { name: "a" }; // block A { let _b = Droppable { name: "b" }; // block B { let _c = Droppable { name: "c" }; let _d = Droppable { name: "d" }; println!("Exiting block B"); } println!("Just exited block B"); println!("Exiting block A"); } println!("Just exited block A"); // Variable can be manually dropped using the `drop` function drop(_a); // TODO ^ Try commenting this line println!("end of the main function"); // `_a` *won't* be `drop`ed again here, because it already has been // (manually) `drop`ed }
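The output of the example above also shows that variables in the same scope are dropped in reverse order of declaration. The sketch below records that order in a shared log instead of printing it; the `Logged` type and the `drop_order` function are hypothetical names introduced for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records its name in a shared log when dropped.
struct Logged {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Logged {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Creates two values in a scope and returns the order they were dropped in.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _first = Logged { name: "first", log: Rc::clone(&log) };
        let _second = Logged { name: "second", log: Rc::clone(&log) };
        // Both values go out of scope at the end of this block.
    }
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("drop order: {:?}", drop_order());
}
```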
Iterators
The Iterator
trait is used to implement iterators over collections such as arrays.
The trait requires only a method to be defined for the next
element,
which may be manually defined in an impl
block or automatically
defined (as in arrays and ranges).
As a point of convenience for common situations, the for
construct
turns some collections into iterators using the .into_iter()
method.
struct Fibonacci { curr: u32, next: u32, } // Implement `Iterator` for `Fibonacci`. // The `Iterator` trait only requires a method to be defined for the `next` element. impl Iterator for Fibonacci { // We can refer to this type using Self::Item type Item = u32; // Here, we define the sequence using `.curr` and `.next`. // The return type is `Option<T>`: // * When the `Iterator` is finished, `None` is returned. // * Otherwise, the next value is wrapped in `Some` and returned. // We use Self::Item in the return type, so we can change // the type without having to update the function signatures. fn next(&mut self) -> Option<Self::Item> { let new_next = self.curr + self.next; self.curr = self.next; self.next = new_next; // Since there's no endpoint to a Fibonacci sequence, the `Iterator` // will never return `None`, and `Some` is always returned. Some(self.curr) } } // Returns a Fibonacci sequence generator fn fibonacci() -> Fibonacci { Fibonacci { curr: 0, next: 1 } } fn main() { // `0..3` is an `Iterator` that generates: 0, 1, and 2. let mut sequence = 0..3; println!("Four consecutive `next` calls on 0..3"); println!("> {:?}", sequence.next()); println!("> {:?}", sequence.next()); println!("> {:?}", sequence.next()); println!("> {:?}", sequence.next()); // `for` works through an `Iterator` until it returns `None`. // Each `Some` value is unwrapped and bound to a variable (here, `i`). println!("Iterate through 0..3 using `for`"); for i in 0..3 { println!("> {}", i); } // The `take(n)` method reduces an `Iterator` to its first `n` terms. println!("The first four terms of the Fibonacci sequence are: "); for i in fibonacci().take(4) { println!("> {}", i); } // The `skip(n)` method shortens an `Iterator` by dropping its first `n` terms. println!("The next four terms of the Fibonacci sequence are: "); for i in fibonacci().skip(4).take(4) { println!("> {}", i); } let array = [1u32, 3, 3, 7]; // The `iter` method produces an `Iterator` over an array/slice. 
println!("Iterate the following array {:?}", &array); for i in array.iter() { println!("> {}", i); } }
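The `Fibonacci` iterator never ends, so it never returns `None`. For contrast, here is a minimal sketch of a finite iterator; `Countdown` is a hypothetical type introduced for illustration.

```rust
// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            // Returning `None` signals that the iterator is finished.
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

fn main() {
    // `for` stops on its own once `next` returns `None`.
    let countdown = Countdown { remaining: 3 };
    for i in countdown {
        println!("> {}", i);
    }

    // Adapter methods such as `sum` are provided once `next` is defined.
    let total: u32 = Countdown { remaining: 4 }.sum();
    println!("sum of 4, 3, 2, 1 is {}", total);
}
```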
impl Trait
If your function returns a type that implements MyTrait
, you can write its
return type as -> impl MyTrait
. This can help simplify your type signatures quite a lot!
use std::iter; use std::vec::IntoIter; // This function combines two `Vec<i32>` and returns an iterator over it. // Look how complicated its return type is! fn combine_vecs_explicit_return_type( v: Vec<i32>, u: Vec<i32>, ) -> iter::Cycle<iter::Chain<IntoIter<i32>, IntoIter<i32>>> { v.into_iter().chain(u.into_iter()).cycle() } // This is the exact same function, but its return type uses `impl Trait`. // Look how much simpler it is! fn combine_vecs( v: Vec<i32>, u: Vec<i32>, ) -> impl Iterator<Item=i32> { v.into_iter().chain(u.into_iter()).cycle() } fn main() { let v1 = vec![1, 2, 3]; let v2 = vec![4, 5]; let mut v3 = combine_vecs(v1, v2); assert_eq!(Some(1), v3.next()); assert_eq!(Some(2), v3.next()); assert_eq!(Some(3), v3.next()); assert_eq!(Some(4), v3.next()); assert_eq!(Some(5), v3.next()); println!("all done"); }
More importantly, some Rust types can't be written out. For example, every
closure has its own unnamed concrete type. Before impl Trait
syntax, you had
to allocate on the heap in order to return a closure. But now you can do it all
statically, like this:
// Returns a function that adds `y` to its input fn make_adder_function(y: i32) -> impl Fn(i32) -> i32 { let closure = move |x: i32| { x + y }; closure } fn main() { let plus_one = make_adder_function(1); assert_eq!(plus_one(2), 3); }
You can also use impl Trait
to return an iterator that uses map
or filter
closures! This makes using map
and filter
easier. Because closure types don't
have names, you can't write out an explicit return type if your function returns
iterators with closures. But with impl Trait
you can do this easily:
fn double_positives<'a>(numbers: &'a Vec<i32>) -> impl Iterator<Item = i32> + 'a { numbers .iter() .filter(|x| x > &&0) .map(|x| x * 2) }
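The `double_positives` snippet has no caller, so here is a short usage sketch. It repeats the function unchanged so that the block compiles on its own; the input values are arbitrary.

```rust
fn double_positives<'a>(numbers: &'a Vec<i32>) -> impl Iterator<Item = i32> + 'a {
    numbers
        .iter()
        .filter(|x| x > &&0)
        .map(|x| x * 2)
}

fn main() {
    let numbers = vec![-3, 1, 0, 4];
    // Negative numbers and zero are filtered out; the rest are doubled.
    let doubled: Vec<i32> = double_positives(&numbers).collect();
    println!("{:?}", doubled);
}
```

The `+ 'a` in the return type states that the returned iterator borrows from `numbers`, so it cannot outlive the vector it reads from.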
Clone
When dealing with resources, the default behavior is to transfer them during assignments or function calls. However, sometimes we need to make a copy of the resource as well.
The Clone
trait helps us do exactly this. Most commonly, we can
use the .clone()
method defined by the Clone
trait.
// A unit struct without resources #[derive(Debug, Clone, Copy)] struct Unit; // A tuple struct with resources that implements the `Clone` trait #[derive(Clone, Debug)] struct Pair(Box<i32>, Box<i32>); fn main() { // Instantiate `Unit` let unit = Unit; // Copy `Unit`, there are no resources to move let copied_unit = unit; // Both `Unit`s can be used independently println!("original: {:?}", unit); println!("copy: {:?}", copied_unit); // Instantiate `Pair` let pair = Pair(Box::new(1), Box::new(2)); println!("original: {:?}", pair); // Move `pair` into `moved_pair`, moves resources let moved_pair = pair; println!("moved: {:?}", moved_pair); // Error! `pair` has lost its resources //println!("original: {:?}", pair); // TODO ^ Try uncommenting this line // Clone `moved_pair` into `cloned_pair` (resources are included) let cloned_pair = moved_pair.clone(); // Drop the original pair using std::mem::drop drop(moved_pair); // Error! `moved_pair` has been dropped //println!("copy: {:?}", moved_pair); // TODO ^ Try uncommenting this line // The result from .clone() can still be used! println!("clone: {:?}", cloned_pair); }
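A derived `.clone()` duplicates owned heap data as well, so the clone and the original can change independently. The following sketch illustrates this with a hypothetical `Playlist` type (an assumption for illustration).

```rust
// Hypothetical type that owns heap data.
#[derive(Clone, Debug, PartialEq)]
struct Playlist {
    songs: Vec<String>,
}

fn main() {
    let original = Playlist { songs: vec![String::from("intro")] };

    // `.clone()` duplicates the heap data, so the two values are independent.
    let mut copy = original.clone();
    copy.songs.push(String::from("outro"));

    println!("original: {:?}", original);
    println!("copy:     {:?}", copy);
}
```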
Supertraits
Rust doesn't have "inheritance", but you can define a trait as being a superset of another trait. For example:
trait Person { fn name(&self) -> String; } // Person is a supertrait of Student. // Implementing Student requires you to also impl Person. trait Student: Person { fn university(&self) -> String; } trait Programmer { fn fav_language(&self) -> String; } // CompSciStudent (computer science student) is a subtrait of both Programmer // and Student. Implementing CompSciStudent requires you to impl both supertraits. trait CompSciStudent: Programmer + Student { fn git_username(&self) -> String; } fn comp_sci_student_greeting(student: &dyn CompSciStudent) -> String { format!( "My name is {} and I attend {}. My favorite language is {}. My Git username is {}", student.name(), student.university(), student.fav_language(), student.git_username() ) } fn main() {}
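The example above leaves `main` empty. As a rough sketch of what an implementation looks like, the block below implements a two-level hierarchy for a hypothetical `Undergrad` type; the `introduce` default method and the university name are assumptions added for illustration.

```rust
trait Person {
    fn name(&self) -> String;
}

// Implementing `Student` requires `Person` to be implemented as well.
trait Student: Person {
    fn university(&self) -> String;

    // A subtrait's methods can call supertrait methods on `self`.
    fn introduce(&self) -> String {
        format!("{} attends {}", self.name(), self.university())
    }
}

// Hypothetical implementor, used only for illustration.
struct Undergrad {
    name: String,
}

impl Person for Undergrad {
    fn name(&self) -> String {
        self.name.clone()
    }
}

impl Student for Undergrad {
    fn university(&self) -> String {
        String::from("Example University")
    }
}

fn main() {
    let student = Undergrad { name: String::from("Ferris") };
    println!("{}", student.introduce());
}
```

Removing the `impl Person for Undergrad` block makes the `impl Student` block fail to compile, which is the requirement the supertrait expresses.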
See also:
The Rust Programming Language chapter on supertraits
Disambiguating overlapping traits
A type can implement many different traits. What if two traits both require a method with the same name? For example, many traits might have a method named get()
. They might even have different return types!
Good news: because each trait implementation gets its own impl
block, it's
clear which trait's get
method you're implementing.
What about when it comes time to call those methods? To disambiguate between them, we have to use Fully Qualified Syntax.
trait UsernameWidget { // Get the selected username out of this widget fn get(&self) -> String; } trait AgeWidget { // Get the selected age out of this widget fn get(&self) -> u8; } // A form with both a UsernameWidget and an AgeWidget struct Form { username: String, age: u8, } impl UsernameWidget for Form { fn get(&self) -> String { self.username.clone() } } impl AgeWidget for Form { fn get(&self) -> u8 { self.age } } fn main() { let form = Form{ username: "rustacean".to_owned(), age: 28, }; // If you uncomment this line, you'll get an error saying // "multiple `get` found". Because, after all, there are multiple methods // named `get`. // println!("{}", form.get()); let username = <Form as UsernameWidget>::get(&form); assert_eq!("rustacean".to_owned(), username); let age = <Form as AgeWidget>::get(&form); assert_eq!(28, age); }
See also:
The Rust Programming Language chapter on Fully Qualified syntax
macro_rules!
Rust provides a powerful macro system that allows metaprogramming. As you've
seen in previous chapters, macros look like functions, except that their name
ends with a bang !
, but instead of generating a function call, macros are
expanded into source code that gets compiled with the rest of the program.
However, unlike macros in C and other languages, Rust macros are expanded into
abstract syntax trees, rather than string preprocessing, so you don't get
unexpected precedence bugs.
Macros are created using the macro_rules!
macro.
// This is a simple macro named `say_hello`.
macro_rules! say_hello {
    // `()` indicates that the macro takes no argument.
    () => {
        // The macro will expand into the contents of this block.
        println!("Hello!");
    };
}

fn main() {
    // This call will expand into `println!("Hello!");`
    say_hello!()
}
So why are macros useful?
-
Don't repeat yourself. There are many cases where you may need similar functionality in multiple places but with different types. Often, writing a macro is a useful way to avoid repeating code. (More on this later)
-
Domain-specific languages. Macros allow you to define special syntax for a specific purpose. (More on this later)
-
Variadic interfaces. Sometimes you want to define an interface that takes a variable number of arguments. An example is
println!
which can take any number of arguments, depending on the format string. (More on this later)
Syntax
In the following subsections, we will show how to define macros in Rust. There are three basic ideas: designators, overloading, and repetition.
Designators
The arguments of a macro are prefixed by a dollar sign $
and type annotated
with a designator:
macro_rules! create_function { // This macro takes an argument of designator `ident` and // creates a function named `$func_name`. // The `ident` designator is used for variable/function names. ($func_name:ident) => { fn $func_name() { // The `stringify!` macro converts an `ident` into a string. println!("You called {:?}()", stringify!($func_name)); } }; } // Create functions named `foo` and `bar` with the above macro. create_function!(foo); create_function!(bar); macro_rules! print_result { // This macro takes an expression of type `expr` and prints // it as a string along with its result. // The `expr` designator is used for expressions. ($expression:expr) => { // `stringify!` will convert the expression *as it is* into a string. println!("{:?} = {:?}", stringify!($expression), $expression); }; } fn main() { foo(); bar(); print_result!(1u32 + 1); // Recall that blocks are expressions too! print_result!({ let x = 1u32; x * x + 2 * x - 1 }); }
These are some of the available designators:
- `block`
- `expr` is used for expressions
- `ident` is used for variable/function names
- `item`
- `literal` is used for literal constants
- `pat` (pattern)
- `path`
- `stmt` (statement)
- `tt` (token tree)
- `ty` (type)
- `vis` (visibility qualifier)
For a complete list, see the Rust Reference.
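The earlier example uses only `ident` and `expr`. As a small sketch of two more designators from the list, the hypothetical `constant_fn!` macro below combines `ident`, `ty` and `literal` to generate functions that return a constant.

```rust
// Hypothetical macro: generates a function that returns a constant.
// `$name` is an identifier, `$t` a type, and `$value` a literal.
macro_rules! constant_fn {
    ($name:ident, $t:ty, $value:literal) => {
        fn $name() -> $t {
            $value
        }
    };
}

constant_fn!(answer, u32, 42);
constant_fn!(greeting, &'static str, "hello");

fn main() {
    println!("answer() = {}", answer());
    println!("greeting() = {}", greeting());
}
```

Using `literal` instead of `expr` for `$value` makes the macro reject arbitrary expressions such as `1 + 1`, which can be useful when only constants are intended.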
Overload
Macros can be overloaded to accept different combinations of arguments.
In that regard, macro_rules!
can work similarly to a match block:
// `test!` will compare `$left` and `$right` // in different ways depending on how you invoke it: macro_rules! test { // Arguments don't need to be separated by a comma. // Any template can be used! ($left:expr; and $right:expr) => { println!("{:?} and {:?} is {:?}", stringify!($left), stringify!($right), $left && $right) }; // ^ each arm must end with a semicolon. ($left:expr; or $right:expr) => { println!("{:?} or {:?} is {:?}", stringify!($left), stringify!($right), $left || $right) }; } fn main() { test!(1i32 + 1 == 2i32; and 2i32 * 2 == 4i32); test!(true; or false); }
Repeat
Macros can use +
in the argument list to indicate that an argument may
repeat at least once, or *
, to indicate that the argument may repeat zero or
more times.
In the following example, surrounding the matcher with `$(...),+` will match one or more expressions, separated by commas.
Also note that the semicolon is optional on the last case.
// `find_min!` will calculate the minimum of any number of arguments. macro_rules! find_min { // Base case: ($x:expr) => ($x); // `$x` followed by at least one `$y,` ($x:expr, $($y:expr),+) => ( // Call `find_min!` on the tail `$y` std::cmp::min($x, find_min!($($y),+)) ) } fn main() { println!("{}", find_min!(1u32)); println!("{}", find_min!(1u32 + 2, 2u32)); println!("{}", find_min!(5u32, 2u32 * 3, 4u32)); }
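The `find_min!` example uses `+`, so it rejects an empty argument list. For comparison, here is a minimal sketch of the `*` form with a hypothetical `sum!` macro, which also accepts zero arguments.

```rust
// Hypothetical macro: sums zero or more expressions.
macro_rules! sum {
    // `$(...),*` matches zero or more comma-separated expressions, and
    // `$( + $x )*` repeats `+ $x` once for every expression matched.
    ($($x:expr),*) => {
        0 $( + $x )*
    };
}

fn main() {
    println!("{}", sum!());
    println!("{}", sum!(1));
    println!("{}", sum!(1, 2, 3 * 4));
}
```

Note that the repetition appears twice: once in the matcher to capture the arguments, and once in the expansion to emit one piece of code per captured argument.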
DRY (Don't Repeat Yourself)
Macros allow writing DRY code by factoring out the common parts of functions
and/or test suites. Here is an example that implements and tests the +=
, *=
and -=
operators on Vec<T>
:
use std::ops::{Add, Mul, Sub}; macro_rules! assert_equal_len { // The `tt` (token tree) designator is used for // operators and tokens. ($a:expr, $b:expr, $func:ident, $op:tt) => { assert!($a.len() == $b.len(), "{:?}: dimension mismatch: {:?} {:?} {:?}", stringify!($func), ($a.len(),), stringify!($op), ($b.len(),)); }; } macro_rules! op { ($func:ident, $bound:ident, $op:tt, $method:ident) => { fn $func<T: $bound<T, Output=T> + Copy>(xs: &mut Vec<T>, ys: &Vec<T>) { assert_equal_len!(xs, ys, $func, $op); for (x, y) in xs.iter_mut().zip(ys.iter()) { *x = $bound::$method(*x, *y); // *x = x.$method(*y); } } }; } // Implement `add_assign`, `mul_assign`, and `sub_assign` functions. op!(add_assign, Add, +=, add); op!(mul_assign, Mul, *=, mul); op!(sub_assign, Sub, -=, sub); mod test { use std::iter; macro_rules! test { ($func:ident, $x:expr, $y:expr, $z:expr) => { #[test] fn $func() { for size in 0usize..10 { let mut x: Vec<_> = iter::repeat($x).take(size).collect(); let y: Vec<_> = iter::repeat($y).take(size).collect(); let z: Vec<_> = iter::repeat($z).take(size).collect(); super::$func(&mut x, &y); assert_eq!(x, z); } } }; } // Test `add_assign`, `mul_assign`, and `sub_assign`. test!(add_assign, 1u32, 2u32, 3u32); test!(mul_assign, 2u32, 3u32, 6u32); test!(sub_assign, 3u32, 2u32, 1u32); }
$ rustc --test dry.rs && ./dry
running 3 tests
test test::mul_assign ... ok
test test::add_assign ... ok
test test::sub_assign ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured
Domain Specific Languages (DSLs)
A DSL is a mini "language" embedded in a Rust macro. It is completely valid Rust because the macro system expands into normal Rust constructs, but it looks like a small language. This allows you to define concise or intuitive syntax for some special functionality (within bounds).
Suppose that I want to define a little calculator API. I would like to supply an expression and have the output printed to console.
macro_rules! calculate { (eval $e:expr) => {{ { let val: usize = $e; // Force types to be integers println!("{} = {}", stringify!{$e}, val); } }}; } fn main() { calculate! { eval 1 + 2 // hehehe `eval` is _not_ a Rust keyword! } calculate! { eval (1 + 2) * (3 / 4) } }
Output:
1 + 2 = 3
(1 + 2) * (3 / 4) = 0
This was a very simple example, but much more complex interfaces have been
developed, such as lazy_static
or
clap
.
Also, note the two pairs of braces in the macro. The outer ones are
part of the syntax of macro_rules!
, in addition to ()
or []
.
Variadic Interfaces
A variadic interface takes an arbitrary number of arguments. For example,
println!
can take an arbitrary number of arguments, as determined by the
format string.
We can extend our calculate!
macro from the previous section to be variadic:
macro_rules! calculate { // The pattern for a single `eval` (eval $e:expr) => {{ { let val: usize = $e; // Force types to be integers println!("{} = {}", stringify!{$e}, val); } }}; // Decompose multiple `eval`s recursively (eval $e:expr, $(eval $es:expr),+) => {{ calculate! { eval $e } calculate! { $(eval $es),+ } }}; } fn main() { calculate! { // Look ma! Variadic `calculate!`! eval 1 + 2, eval 3 + 4, eval (2 * 3) + 1 } }
Output:
1 + 2 = 3
3 + 4 = 7
(2 * 3) + 1 = 7
Error handling
Error handling is the process of handling the possibility of failure. For example, failing to read a file and then continuing to use that bad input would clearly be problematic. Noticing and explicitly managing those errors saves the rest of the program from various pitfalls.
There are various ways to deal with errors in Rust, which are described in the following subchapters. They all have more or less subtle differences and different use cases. As a rule of thumb:
An explicit panic
is mainly useful for tests and dealing with unrecoverable errors.
For prototyping it can be useful, for example when dealing with functions that
haven't been implemented yet, but in those cases the more descriptive unimplemented
is better. In tests panic
is a reasonable way to explicitly fail.
The Option
type is for when a value is optional or when the lack of a value is
not an error condition. For example the parent of a directory - /
and C:
don't
have one. When dealing with Option
s, unwrap
is fine for prototyping and cases
where it is absolutely certain that a value is present. However expect
is more useful since it lets you specify an error message in case something goes
wrong anyway.
When there is a chance that things do go wrong and the caller has to deal with the
problem, use Result
. You can unwrap
and expect
them as well (please don't
do that unless it's a test or quick prototype).
For a more rigorous discussion of error handling, refer to the error handling section in the official book.
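The rule of thumb above can be sketched roughly as follows. The functions `parent_of` and `parse_port` are hypothetical helpers introduced for illustration: the first returns an `Option` because a missing parent is not an error, and the second returns a `Result` because the caller has to deal with a failed parse.

```rust
use std::num::ParseIntError;
use std::path::Path;

// `Option`: a missing parent is not an error, `/` simply has none.
fn parent_of(path: &str) -> Option<&Path> {
    Path::new(path).parent()
}

// `Result`: parsing can fail, and the caller decides what to do about it.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.parse::<u16>()
}

fn main() {
    println!("parent of /usr/bin: {:?}", parent_of("/usr/bin"));
    println!("parent of /: {:?}", parent_of("/"));

    match parse_port("8080") {
        Ok(port) => println!("port: {}", port),
        Err(e) => println!("bad port: {}", e),
    }
    match parse_port("not a port") {
        Ok(port) => println!("port: {}", port),
        Err(e) => println!("bad port: {}", e),
    }

    // `expect` is like `unwrap`, with a message for the unexpected case.
    let port = parse_port("443").expect("hard-coded port should parse");
    println!("port: {}", port);
}
```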
panic
The simplest error handling mechanism we will see is panic
. It prints an
error message, starts unwinding the stack, and usually exits the program.
Here, we explicitly call panic
on our error condition:
fn drink(beverage: &str) {
    // You shouldn't drink too many sugary beverages.
    if beverage == "lemonade" { panic!("AAAaaaaa!!!!"); }

    println!("Some refreshing {} is all I need.", beverage);
}

fn main() {
    drink("water");
    drink("lemonade");
}
Option
& unwrap
In the last example, we showed that we can induce program failure at will.
We told our program to panic
if we drink a sugary lemonade.
But what if we expect some drink but don't receive one?
That case would be just as bad, so it needs to be handled!
We could test this against the empty string (`""`), as we do with lemonade.
Since we're using Rust, let's instead have the compiler point out cases
where there's no drink.
An enum called `Option<T>` in the `std` library is used when absence is a possibility. It manifests itself as one of two "options":
- `Some(T)`: An element of type `T` was found
- `None`: No element was found
These cases can either be explicitly handled via `match` or implicitly with `unwrap`. Implicit handling will either return the inner element or `panic`.
Note that it's possible to customize the `panic` message with `expect`, but `unwrap` otherwise leaves us with a less meaningful output than explicit handling. In the following example, explicit handling yields a more controlled result while retaining the option to `panic` if desired.
    // The adult has seen it all, and can handle any drink well.
    // All drinks are handled explicitly using `match`.
    fn give_adult(drink: Option<&str>) {
        // Specify a course of action for each case.
        match drink {
            Some("lemonade") => println!("Yuck! Too sugary."),
            Some(inner) => println!("{}? How nice.", inner),
            None => println!("No drink? Oh well."),
        }
    }

    // Others will `panic` before drinking sugary drinks.
    // All drinks are handled implicitly using `unwrap`.
    fn drink(drink: Option<&str>) {
        // `unwrap` panics when it receives a `None`.
        let inside = drink.unwrap();
        if inside == "lemonade" {
            panic!("AAAaaaaa!!!!");
        }

        println!("I love {}s!!!!!", inside);
    }

    fn main() {
        let water = Some("water");
        let lemonade = Some("lemonade");
        let void = None;

        give_adult(water);
        give_adult(lemonade);
        give_adult(void);

        let coffee = Some("coffee");
        let nothing = None;

        drink(coffee);
        drink(nothing);
    }
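As a small sketch of the `expect` variant mentioned above: it behaves like `unwrap`, but the panic message is one we choose. The helper name `drink_or_complain` and the message text are made up for illustration, and `catch_unwind` appears only so the demo can show the panic without stopping.

```rust
use std::panic;

// Hypothetical helper: like `drink` above, but uses `expect` so that a
// missing drink panics with a message we chose ourselves.
fn drink_or_complain(drink: Option<&str>) -> String {
    let inside = drink.expect("expected a drink, but the glass was empty");
    format!("I love {}s!!!!!", inside)
}

fn main() {
    println!("{}", drink_or_complain(Some("coffee")));

    // `catch_unwind` is used here only to observe the panic in a demo;
    // it is not meant as a general error handling tool.
    let outcome = panic::catch_unwind(|| drink_or_complain(None));
    println!("Did the `None` case panic? {}", outcome.is_err());
}
```

The chosen message shows up in the panic output, which usually makes the failure easier to trace than the generic message produced by `unwrap`.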
Unpacking options with ?
You can unpack Option
s by using match
statements, but it's often easier to
use the ?
operator. If x
is an Option
, then evaluating x?
will return
the underlying value if x
is Some
, otherwise it will terminate whatever
function is being executed and return None
.
    fn next_birthday(current_age: Option<u8>) -> Option<String> {
        // If `current_age` is `None`, this returns `None`.
        // If `current_age` is `Some`, the inner `u8` value + 1
        // gets assigned to `next_age`.
        let next_age: u8 = current_age? + 1;
        Some(format!("Next year I will be {}", next_age))
    }
You can chain many ?
s together to make your code much more readable.
struct Person { job: Option<Job>, } #[derive(Clone, Copy)] struct Job { phone_number: Option<PhoneNumber>, } #[derive(Clone, Copy)] struct PhoneNumber { area_code: Option<u8>, number: u32, } impl Person { // Gets the area code of the phone number of the person's job, if it exists. fn work_phone_area_code(&self) -> Option<u8> { // This would need many nested `match` statements without the `?` operator. // It would take a lot more code - try writing it yourself and see which // is easier. self.job?.phone_number?.area_code } } fn main() { let p = Person { job: Some(Job { phone_number: Some(PhoneNumber { area_code: Some(61), number: 439222222, }), }), }; assert_eq!(p.work_phone_area_code(), Some(61)); }
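For comparison, here is a sketch of the same lookup written with nested `match` expressions, as the comment in the example suggests trying. The method name `work_phone_area_code_verbose` is made up; the types are repeated so the block compiles on its own.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct PhoneNumber {
    area_code: Option<u8>,
    number: u32,
}

#[derive(Clone, Copy)]
struct Job {
    phone_number: Option<PhoneNumber>,
}

struct Person {
    job: Option<Job>,
}

impl Person {
    // The same lookup spelled out with nested `match` expressions.
    fn work_phone_area_code_verbose(&self) -> Option<u8> {
        match self.job {
            Some(job) => match job.phone_number {
                Some(phone_number) => phone_number.area_code,
                None => None,
            },
            None => None,
        }
    }

    // The `?` version from the example above, for comparison.
    fn work_phone_area_code(&self) -> Option<u8> {
        self.job?.phone_number?.area_code
    }
}

fn main() {
    let employed = Person {
        job: Some(Job {
            phone_number: Some(PhoneNumber {
                area_code: Some(61),
                number: 439222222,
            }),
        }),
    };
    let unemployed = Person { job: None };

    // Both versions agree on every input.
    assert_eq!(
        employed.work_phone_area_code_verbose(),
        employed.work_phone_area_code()
    );
    assert_eq!(unemployed.work_phone_area_code_verbose(), None);
    println!("{:?}", employed.work_phone_area_code_verbose());
}
```

Each additional optional layer adds another level of nesting to the `match` version, while the `?` version grows by one more `?`.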
Combinators: map
match
is a valid method for handling Option
s. However, you may
eventually find heavy usage tedious, especially with operations only valid
with an input. In these cases, combinators can be used to
manage control flow in a modular fashion.
Option
has a built in method called map()
, a combinator for the simple
mapping of Some -> Some
and None -> None
. Multiple map()
calls can be
chained together for even more flexibility.
In the following example, process()
replaces all functions previous
to it while staying compact.
#![allow(dead_code)] #[derive(Debug)] enum Food { Apple, Carrot, Potato } #[derive(Debug)] struct Peeled(Food); #[derive(Debug)] struct Chopped(Food); #[derive(Debug)] struct Cooked(Food); // Peeling food. If there isn't any, then return `None`. // Otherwise, return the peeled food. fn peel(food: Option<Food>) -> Option<Peeled> { match food { Some(food) => Some(Peeled(food)), None => None, } } // Chopping food. If there isn't any, then return `None`. // Otherwise, return the chopped food. fn chop(peeled: Option<Peeled>) -> Option<Chopped> { match peeled { Some(Peeled(food)) => Some(Chopped(food)), None => None, } } // Cooking food. Here, we showcase `map()` instead of `match` for case handling. fn cook(chopped: Option<Chopped>) -> Option<Cooked> { chopped.map(|Chopped(food)| Cooked(food)) } // A function to peel, chop, and cook food all in sequence. // We chain multiple uses of `map()` to simplify the code. fn process(food: Option<Food>) -> Option<Cooked> { food.map(|f| Peeled(f)) .map(|Peeled(f)| Chopped(f)) .map(|Chopped(f)| Cooked(f)) } // Check whether there's food or not before trying to eat it! fn eat(food: Option<Cooked>) { match food { Some(food) => println!("Mmm. I love {:?}", food), None => println!("Oh no! It wasn't edible."), } } fn main() { let apple = Some(Food::Apple); let carrot = Some(Food::Carrot); let potato = None; let cooked_apple = cook(chop(peel(apple))); let cooked_carrot = cook(chop(peel(carrot))); // Let's try the simpler looking `process()` now. let cooked_potato = process(potato); eat(cooked_apple); eat(cooked_carrot); eat(cooked_potato); }
See also:
closures, Option
, Option::map()
Combinators: and_then
map()
was described as a chainable way to simplify match
statements.
However, using map()
on a function that returns an Option<T>
results
in the nested Option<Option<T>>
. Chaining multiple calls together can
then become confusing. That's where another combinator called and_then()
,
known in some languages as flatmap, comes in.
and_then()
calls its function input with the wrapped value and returns the result. If the Option
is None
, then it returns None
instead.
In the following example, cookable_v2()
results in an Option<Food>
.
Using map()
instead of and_then()
would have given an
Option<Option<Food>>
, which is an invalid type for eat()
.
#![allow(dead_code)] #[derive(Debug)] enum Food { CordonBleu, Steak, Sushi } #[derive(Debug)] enum Day { Monday, Tuesday, Wednesday } // We don't have the ingredients to make Sushi. fn have_ingredients(food: Food) -> Option<Food> { match food { Food::Sushi => None, _ => Some(food), } } // We have the recipe for everything except Cordon Bleu. fn have_recipe(food: Food) -> Option<Food> { match food { Food::CordonBleu => None, _ => Some(food), } } // To make a dish, we need both the recipe and the ingredients. // We can represent the logic with a chain of `match`es: fn cookable_v1(food: Food) -> Option<Food> { match have_recipe(food) { None => None, Some(food) => match have_ingredients(food) { None => None, Some(food) => Some(food), }, } } // This can conveniently be rewritten more compactly with `and_then()`: fn cookable_v2(food: Food) -> Option<Food> { have_recipe(food).and_then(have_ingredients) } fn eat(food: Food, day: Day) { match cookable_v2(food) { Some(food) => println!("Yay! On {:?} we get to eat {:?}.", day, food), None => println!("Oh no. We don't get to eat on {:?}?", day), } } fn main() { let (cordon_bleu, steak, sushi) = (Food::CordonBleu, Food::Steak, Food::Sushi); eat(cordon_bleu, Day::Monday); eat(steak, Day::Tuesday); eat(sushi, Day::Wednesday); }
See also:
closures, Option
, and Option::and_then()
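The nesting problem described above can be seen in isolation with a smaller sketch. The helper `parse_number` and the function names `nested` and `flat` are made up for illustration.

```rust
// Hypothetical helper: parses a string into a number, or `None` on failure.
fn parse_number(text: &str) -> Option<i32> {
    text.parse::<i32>().ok()
}

// `map` with a function that itself returns an `Option` nests the result.
fn nested(input: Option<&str>) -> Option<Option<i32>> {
    input.map(parse_number)
}

// `and_then` flattens as it goes, so a single `Option` comes out.
fn flat(input: Option<&str>) -> Option<i32> {
    input.and_then(parse_number)
}

fn main() {
    println!("{:?}", nested(Some("42")));
    println!("{:?}", nested(Some("tofu")));
    println!("{:?}", flat(Some("42")));
    println!("{:?}", flat(Some("tofu")));
    println!("{:?}", flat(None));
}
```

With `and_then`, "no input" and "input that failed to parse" both collapse into `None`, which is what makes long chains manageable.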
Result
Result
is a richer version of the Option
type that
describes possible error instead of possible absence.
That is, `Result<T, E>` could have one of two outcomes:
- `Ok(T)`: An element `T` was found
- `Err(E)`: An error was found with element `E`
By convention, the expected outcome is Ok
while the unexpected outcome is Err
.
Like Option
, Result
has many methods associated with it. unwrap()
, for
example, either yields the element T
or panic
s. For case handling,
there are many combinators between Result
and Option
that overlap.
In working with Rust, you will likely encounter methods that return the
Result
type, such as the parse()
method. It might not always
be possible to parse a string into the other type, so parse()
returns a
Result
indicating possible failure.
Let's see what happens when we successfully and unsuccessfully parse()
a string:
fn multiply(first_number_str: &str, second_number_str: &str) -> i32 { // Let's try using `unwrap()` to get the number out. Will it bite us? let first_number = first_number_str.parse::<i32>().unwrap(); let second_number = second_number_str.parse::<i32>().unwrap(); first_number * second_number } fn main() { let twenty = multiply("10", "2"); println!("double is {}", twenty); let tt = multiply("t", "2"); println!("double is {}", tt); }
In the unsuccessful case, parse()
leaves us with an error for unwrap()
to panic
on. Additionally, the panic
exits our program and provides an
unpleasant error message.
To improve the quality of our error message, we should be more specific about the return type and consider explicitly handling the error.
Using Result
in main
The Result
type can also be the return type of the main
function if
specified explicitly. Typically the main
function will be of the form:
fn main() { println!("Hello World!"); }
However main
is also able to have a return type of Result
. If an error
occurs within the main
function it will return an error code and print a debug
representation of the error (using the Debug
trait). The following example
shows such a scenario and touches on aspects covered in the following section.
use std::num::ParseIntError; fn main() -> Result<(), ParseIntError> { let number_str = "10"; let number = match number_str.parse::<i32>() { Ok(number) => number, Err(e) => return Err(e), }; println!("{}", number); Ok(()) }
map
for Result
Panicking in the previous example's multiply
does not make for robust code.
Generally, we want to return the error to the caller so it can decide what is
the right way to respond to errors.
We first need to know what kind of error type we are dealing with. To determine
the Err
type, we look to parse()
, which is implemented with the
FromStr
trait for i32
. As a result, the Err
type is
specified as ParseIntError
.
In the example below, the straightforward match
statement leads to code
that is overall more cumbersome.
use std::num::ParseIntError; // With the return type rewritten, we use pattern matching without `unwrap()`. fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> { match first_number_str.parse::<i32>() { Ok(first_number) => { match second_number_str.parse::<i32>() { Ok(second_number) => { Ok(first_number * second_number) }, Err(e) => Err(e), } }, Err(e) => Err(e), } } fn print(result: Result<i32, ParseIntError>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { // This still presents a reasonable answer. let twenty = multiply("10", "2"); print(twenty); // The following now provides a much more helpful error message. let tt = multiply("t", "2"); print(tt); }
Luckily, Option
's map
, and_then
, and many other combinators are also
implemented for Result
. Result
contains a complete listing.
    use std::num::ParseIntError;

    // As with `Option`, we can use combinators such as `map()`.
    // This function is otherwise identical to the one above and reads:
    // Multiply if both values can be parsed, otherwise pass on the error.
    fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
        first_number_str.parse::<i32>().and_then(|first_number| {
            second_number_str.parse::<i32>().map(|second_number| first_number * second_number)
        })
    }

    fn print(result: Result<i32, ParseIntError>) {
        match result {
            Ok(n) => println!("n is {}", n),
            Err(e) => println!("Error: {}", e),
        }
    }

    fn main() {
        // This still presents a reasonable answer.
        let twenty = multiply("10", "2");
        print(twenty);

        // The following now provides a much more helpful error message.
        let tt = multiply("t", "2");
        print(tt);
    }
aliases for Result
How about when we want to reuse a specific Result
type many times?
Recall that Rust allows us to create aliases. Conveniently,
we can define one for the specific Result
in question.
At a module level, creating aliases can be particularly helpful. Errors
found in a specific module often have the same Err
type, so a single alias
can succinctly define all associated Results
. This is so useful that the
std
library even supplies one: io::Result
!
Here's a quick example to show off the syntax:
use std::num::ParseIntError; // Define a generic alias for a `Result` with the error type `ParseIntError`. type AliasedResult<T> = Result<T, ParseIntError>; // Use the above alias to refer to our specific `Result` type. fn multiply(first_number_str: &str, second_number_str: &str) -> AliasedResult<i32> { first_number_str.parse::<i32>().and_then(|first_number| { second_number_str.parse::<i32>().map(|second_number| first_number * second_number) }) } // Here, the alias again allows us to save some space. fn print(result: AliasedResult<i32>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { print(multiply("10", "2")); print(multiply("t", "2")); }
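As a sketch of the standard library alias mentioned above, `io::Result<T>` stands for `Result<T, io::Error>`. The helper `count_bytes` is made up for illustration, and a byte slice is used as the input so no real file is needed.

```rust
use std::io::{self, Read};

// Hypothetical helper: reads everything from `reader` and returns the
// number of bytes read. `io::Result<usize>` is the alias for
// `Result<usize, io::Error>`.
fn count_bytes<R: Read>(mut reader: R) -> io::Result<usize> {
    let mut buffer = Vec::new();
    reader.read_to_end(&mut buffer)
}

fn main() {
    // A byte slice implements `Read`, so it can stand in for a file here.
    let input: &[u8] = b"hello";
    match count_bytes(input) {
        Ok(n) => println!("read {} bytes", n),
        Err(e) => println!("Error: {}", e),
    }
}
```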
Early returns
In the previous example, we explicitly handled the errors using combinators.
Another way to deal with this case analysis is to use a combination of
match
statements and early returns.
That is, we can simply stop executing the function and return the error if one occurs. For some, this form of code can be easier to both read and write. Consider this version of the previous example, rewritten using early returns:
use std::num::ParseIntError; fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> { let first_number = match first_number_str.parse::<i32>() { Ok(first_number) => first_number, Err(e) => return Err(e), }; let second_number = match second_number_str.parse::<i32>() { Ok(second_number) => second_number, Err(e) => return Err(e), }; Ok(first_number * second_number) } fn print(result: Result<i32, ParseIntError>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { print(multiply("10", "2")); print(multiply("t", "2")); }
At this point, we've learned to explicitly handle errors using combinators and early returns. While we generally want to avoid panicking, explicitly handling all of our errors is cumbersome.
In the next section, we'll introduce ?
for the cases where we simply
need to unwrap
without possibly inducing panic
.
Introducing ?
Sometimes we just want the simplicity of unwrap
without the possibility of
a panic
. Until now, unwrap
has forced us to nest deeper and deeper when
what we really wanted was to get the variable out. This is exactly the purpose of ?
.
Upon finding an `Err`, there are two valid actions to take:
- `panic!`, which we already decided to try to avoid if possible
- `return`, because an `Err` means it cannot be handled
`?` is almost exactly equivalent to an `unwrap` which `return`s instead of `panic`king on `Err`s. Let's see how we can simplify the earlier example that used combinators:
use std::num::ParseIntError; fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> { let first_number = first_number_str.parse::<i32>()?; let second_number = second_number_str.parse::<i32>()?; Ok(first_number * second_number) } fn print(result: Result<i32, ParseIntError>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { print(multiply("10", "2")); print(multiply("t", "2")); }
The try!
macro
Before there was ?
, the same functionality was achieved with the try!
macro.
The ?
operator is now recommended, but you may still find try!
when looking
at older code. The same multiply
function from the previous example
would look like this using try!
:
// To compile and run this example without errors, while using Cargo, change the value // of the `edition` field, in the `[package]` section of the `Cargo.toml` file, to "2015". use std::num::ParseIntError; fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> { let first_number = try!(first_number_str.parse::<i32>()); let second_number = try!(second_number_str.parse::<i32>()); Ok(first_number * second_number) } fn print(result: Result<i32, ParseIntError>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { print(multiply("10", "2")); print(multiply("t", "2")); }
Note: `?` is only almost equivalent to an `unwrap` that returns on `Err`. See "Other uses of ?" for more details.
Multiple error types
The previous examples have always been very convenient; Result
s interact
with other Result
s and Option
s interact with other Option
s.
Sometimes an Option
needs to interact with a Result
, or a
Result<T, Error1>
needs to interact with a Result<T, Error2>
. In those
cases, we want to manage our different error types in a way that makes them
composable and easy to interact with.
In the following code, two instances of unwrap
generate different error
types. Vec::first
returns an Option
, while parse::<i32>
returns a
Result<i32, ParseIntError>
:
    fn double_first(vec: Vec<&str>) -> i32 {
        let first = vec.first().unwrap(); // Generate error 1
        2 * first.parse::<i32>().unwrap() // Generate error 2
    }

    fn main() {
        let numbers = vec!["42", "93", "18"];
        let empty = vec![];
        let strings = vec!["tofu", "93", "18"];

        println!("The first doubled is {}", double_first(numbers));

        println!("The first doubled is {}", double_first(empty));
        // Error 1: the input vector is empty

        println!("The first doubled is {}", double_first(strings));
        // Error 2: the element doesn't parse to a number
    }
Over the next sections, we'll see several strategies for handling these kinds of problems.
Pulling Result
s out of Option
s
The most basic way of handling mixed error types is to just embed them in each other.
use std::num::ParseIntError; fn double_first(vec: Vec<&str>) -> Option<Result<i32, ParseIntError>> { vec.first().map(|first| { first.parse::<i32>().map(|n| 2 * n) }) } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; println!("The first doubled is {:?}", double_first(numbers)); println!("The first doubled is {:?}", double_first(empty)); // Error 1: the input vector is empty println!("The first doubled is {:?}", double_first(strings)); // Error 2: the element doesn't parse to a number }
There are times when we'll want to stop processing on errors (like with
?
) but keep going when the Option
is None
. A
couple of combinators come in handy to swap the Result
and Option
.
use std::num::ParseIntError; fn double_first(vec: Vec<&str>) -> Result<Option<i32>, ParseIntError> { let opt = vec.first().map(|first| { first.parse::<i32>().map(|n| 2 * n) }); opt.map_or(Ok(None), |r| r.map(Some)) } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; println!("The first doubled is {:?}", double_first(numbers)); println!("The first doubled is {:?}", double_first(empty)); println!("The first doubled is {:?}", double_first(strings)); }
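The swap in the example above can also be written with `Option::transpose`, which the standard library provides for turning an `Option<Result<T, E>>` into a `Result<Option<T>, E>`. This is a sketch of that alternative, not a replacement for the version shown.

```rust
use std::num::ParseIntError;

fn double_first(vec: Vec<&str>) -> Result<Option<i32>, ParseIntError> {
    let opt = vec.first().map(|first| first.parse::<i32>().map(|n| 2 * n));

    // `Option<Result<i32, _>>` becomes `Result<Option<i32>, _>`.
    opt.transpose()
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    println!("The first doubled is {:?}", double_first(numbers));
    println!("The first doubled is {:?}", double_first(empty));
    println!("The first doubled is {:?}", double_first(strings));
}
```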
Defining an error type
Sometimes it simplifies the code to mask all of the different errors with a single type of error. We'll show this with a custom error.
Rust allows us to define our own error types. In general, a "good" error type:
- Represents different errors with the same type
- Presents nice error messages to the user
- Is easy to compare with other types
  - Good: `Err(EmptyVec)`
  - Bad: `Err("Please use a vector with at least one element".to_owned())`
- Can hold information about the error
  - Good: `Err(BadChar(c, position))`
  - Bad: `Err("+ cannot be used here".to_owned())`
- Composes well with other errors
use std::fmt; type Result<T> = std::result::Result<T, DoubleError>; // Define our error types. These may be customized for our error handling cases. // Now we will be able to write our own errors, defer to an underlying error // implementation, or do something in between. #[derive(Debug, Clone)] struct DoubleError; // Generation of an error is completely separate from how it is displayed. // There's no need to be concerned about cluttering complex logic with the display style. // // Note that we don't store any extra info about the errors. This means we can't state // which string failed to parse without modifying our types to carry that information. impl fmt::Display for DoubleError { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "invalid first item to double") } } fn double_first(vec: Vec<&str>) -> Result<i32> { vec.first() // Change the error to our new type. .ok_or(DoubleError) .and_then(|s| { s.parse::<i32>() // Update to the new error type here also. .map_err(|_| DoubleError) .map(|i| 2 * i) }) } fn print(result: Result<i32>) { match result { Ok(n) => println!("The first doubled is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; print(double_first(numbers)); print(double_first(empty)); print(double_first(strings)); }
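The example above deliberately stores no extra information. As a sketch of the "can hold information" and "easy to compare" points from the list, here is an error type in the spirit of `BadChar(c, position)`. The names `DigitsError` and `check_digits` are made up for illustration.

```rust
use std::fmt;

// Hypothetical error type for a tiny digit checker.
// Deriving `PartialEq` makes errors easy to compare in tests.
#[derive(Debug, PartialEq)]
enum DigitsError {
    Empty,
    // Carries the offending character and its character index.
    BadChar(char, usize),
}

impl fmt::Display for DigitsError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            DigitsError::Empty => write!(f, "the input is empty"),
            DigitsError::BadChar(c, position) => {
                write!(f, "'{}' cannot be used at position {}", c, position)
            }
        }
    }
}

// Accepts only ASCII digits and reports exactly where the check failed.
fn check_digits(input: &str) -> Result<(), DigitsError> {
    if input.is_empty() {
        return Err(DigitsError::Empty);
    }
    for (position, c) in input.chars().enumerate() {
        if !c.is_ascii_digit() {
            return Err(DigitsError::BadChar(c, position));
        }
    }
    Ok(())
}

fn main() {
    for input in vec!["123", "", "1+2"] {
        match check_digits(input) {
            Ok(()) => println!("{:?} is fine", input),
            Err(e) => println!("{:?} is rejected: {}", input, e),
        }
    }
}
```

Because the data lives in the variant rather than in a formatted string, callers can match on it or compare it directly, and the wording of the message can change without breaking them.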
Box
ing errors
A way to write simple code while preserving the original errors is to Box
them. The drawback is that the underlying error type is only known at runtime and not
statically determined.
The stdlib helps in boxing our errors by having `Box` implement conversion from any type that implements the `Error` trait into the trait object `Box<dyn Error>`, via `From`.
    use std::error;
    use std::fmt;

    // Change the alias to use `Box<dyn error::Error>`.
    type Result<T> = std::result::Result<T, Box<dyn error::Error>>;

    #[derive(Debug, Clone)]
    struct EmptyVec;

    impl fmt::Display for EmptyVec {
        fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
            write!(f, "invalid first item to double")
        }
    }

    impl error::Error for EmptyVec {}

    fn double_first(vec: Vec<&str>) -> Result<i32> {
        vec.first()
            .ok_or_else(|| EmptyVec.into()) // Converts to Box
            .and_then(|s| {
                s.parse::<i32>()
                    .map_err(|e| e.into()) // Converts to Box
                    .map(|i| 2 * i)
            })
    }

    fn print(result: Result<i32>) {
        match result {
            Ok(n) => println!("The first doubled is {}", n),
            Err(e) => println!("Error: {}", e),
        }
    }

    fn main() {
        let numbers = vec!["42", "93", "18"];
        let empty = vec![];
        let strings = vec!["tofu", "93", "18"];

        print(double_first(numbers));
        print(double_first(empty));
        print(double_first(strings));
    }
See also:
Dynamic dispatch and Error
trait
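To illustrate the drawback mentioned above, that the underlying error type is only known at runtime, here is a sketch that recovers the concrete type with `downcast_ref`. The helpers `parse_boxed` and `describe` are made up for illustration.

```rust
use std::error::Error;
use std::num::ParseIntError;

// Hypothetical helper: the `ParseIntError` is boxed, erasing its type.
fn parse_boxed(text: &str) -> Result<i32, Box<dyn Error>> {
    text.parse::<i32>().map_err(|e| e.into())
}

// Finding out what is inside the box takes a runtime check.
fn describe(result: Result<i32, Box<dyn Error>>) -> String {
    match result {
        Ok(n) => format!("parsed {}", n),
        Err(e) => match e.downcast_ref::<ParseIntError>() {
            Some(parse_error) => format!("a ParseIntError: {}", parse_error),
            None => format!("some other error: {}", e),
        },
    }
}

fn main() {
    println!("{}", describe(parse_boxed("7")));
    println!("{}", describe(parse_boxed("tofu")));
    // A string slice can also be converted into a boxed error.
    println!("{}", describe(Err("boom".into())));
}
```

The compiler cannot check that every possible error type has been considered here, which is the trade-off for the shorter signatures.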
Other uses of ?
Notice in the previous example that our immediate reaction to calling
parse
is to map
the error from a library error into a boxed
error:
.and_then(|s| s.parse::<i32>()
.map_err(|e| e.into())
Since this is a simple and common operation, it would be convenient if it
could be elided. Alas, because and_then
is not sufficiently flexible, it
cannot. However, we can instead use ?
.
?
was previously explained as either unwrap
or return Err(err)
.
This is only mostly true. It actually means unwrap
or
return Err(From::from(err))
. Since From::from
is a conversion utility
between different types, this means that if you ?
where the error is
convertible to the return type, it will convert automatically.
Here, we rewrite the previous example using ?
. As a result, the
map_err
will go away when From::from
is implemented for our error type:
use std::error; use std::fmt; // Change the alias to `Box<dyn error::Error>`. type Result<T> = std::result::Result<T, Box<dyn error::Error>>; #[derive(Debug)] struct EmptyVec; impl fmt::Display for EmptyVec { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "invalid first item to double") } } impl error::Error for EmptyVec {} // The same structure as before but rather than chain all `Results` // and `Options` along, we `?` to get the inner value out immediately. fn double_first(vec: Vec<&str>) -> Result<i32> { let first = vec.first().ok_or(EmptyVec)?; let parsed = first.parse::<i32>()?; Ok(2 * parsed) } fn print(result: Result<i32>) { match result { Ok(n) => println!("The first doubled is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; print(double_first(numbers)); print(double_first(empty)); print(double_first(strings)); }
This is actually fairly clean now. Compared with the original panic
, it
is very similar to replacing the unwrap
calls with ?
except that the
return types are Result
. As a result, they must be destructured at the
top level.
See also:
From::from
and ?
Wrapping errors
An alternative to boxing errors is to wrap them in your own error type.
    use std::error;
    use std::error::Error as _;
    use std::num::ParseIntError;
    use std::fmt;

    type Result<T> = std::result::Result<T, DoubleError>;

    #[derive(Debug)]
    enum DoubleError {
        EmptyVec,
        // We will defer to the parse error implementation for their error.
        // Supplying extra info requires adding more data to the type.
        Parse(ParseIntError),
    }

    impl fmt::Display for DoubleError {
        fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
            match *self {
                DoubleError::EmptyVec =>
                    write!(f, "please use a vector with at least one element"),
                // The wrapped error contains additional information and is available
                // via the source() method.
                DoubleError::Parse(..) =>
                    write!(f, "the provided string could not be parsed as int"),
            }
        }
    }

    impl error::Error for DoubleError {
        fn source(&self) -> Option<&(dyn error::Error + 'static)> {
            match *self {
                DoubleError::EmptyVec => None,
                // The cause is the underlying implementation error type. It is implicitly
                // cast to the trait object `&dyn error::Error`. This works because the
                // underlying type already implements the `Error` trait.
                DoubleError::Parse(ref e) => Some(e),
            }
        }
    }

    // Implement the conversion from `ParseIntError` to `DoubleError`.
    // This will be automatically called by `?` if a `ParseIntError`
    // needs to be converted into a `DoubleError`.
    impl From<ParseIntError> for DoubleError {
        fn from(err: ParseIntError) -> DoubleError {
            DoubleError::Parse(err)
        }
    }

    fn double_first(vec: Vec<&str>) -> Result<i32> {
        let first = vec.first().ok_or(DoubleError::EmptyVec)?;
        // Here we implicitly use the `ParseIntError` implementation of `From` (which
        // we defined above) in order to create a `DoubleError`.
        let parsed = first.parse::<i32>()?;

        Ok(2 * parsed)
    }

    fn print(result: Result<i32>) {
        match result {
            Ok(n) => println!("The first doubled is {}", n),
            Err(e) => {
                println!("Error: {}", e);
                if let Some(source) = e.source() {
                    println!("  Caused by: {}", source);
                }
            },
        }
    }

    fn main() {
        let numbers = vec!["42", "93", "18"];
        let empty = vec![];
        let strings = vec!["tofu", "93", "18"];

        print(double_first(numbers));
        print(double_first(empty));
        print(double_first(strings));
    }
This adds a bit more boilerplate for handling errors and might not be needed in all applications. There are some libraries that can take care of the boilerplate for you.
See also:
From::from
and Enums
Iterating over Result
s
An `Iterator::map` operation might fail, for example:
fn main() { let strings = vec!["tofu", "93", "18"]; let numbers: Vec<_> = strings .into_iter() .map(|s| s.parse::<i32>()) .collect(); println!("Results: {:?}", numbers); }
Let's step through strategies for handling this.
Ignore the failed items with filter_map()
filter_map
calls a function and filters out the results that are None
.
fn main() { let strings = vec!["tofu", "93", "18"]; let numbers: Vec<_> = strings .into_iter() .filter_map(|s| s.parse::<i32>().ok()) .collect(); println!("Results: {:?}", numbers); }
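If the failures should not simply vanish, one possible variation is to stash them on the side with `map_err` before `filter_map` drops them. This is a sketch; the helper name `parse_all` is made up for illustration.

```rust
use std::num::ParseIntError;

// Hypothetical helper: returns the parsed numbers and, separately,
// the errors for the items that failed to parse.
fn parse_all(strings: Vec<&str>) -> (Vec<i32>, Vec<ParseIntError>) {
    let mut errors = Vec::new();
    let numbers: Vec<i32> = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        // Push each error onto `errors`, then let `ok()` turn the
        // now-empty `Err(())` into `None` so `filter_map` skips it.
        .filter_map(|r| r.map_err(|e| errors.push(e)).ok())
        .collect();
    (numbers, errors)
}

fn main() {
    let (numbers, errors) = parse_all(vec!["42", "tofu", "93", "999999999999", "18"]);
    println!("Numbers: {:?}", numbers);
    println!("Errors: {:?}", errors);
}
```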
Fail the entire operation with collect()
`Result` implements `FromIterator` so that a vector of results (`Vec<Result<T, E>>`) can be turned into a result with a vector (`Result<Vec<T>, E>`). Once a `Result::Err` is found, the iteration will terminate.
fn main() { let strings = vec!["tofu", "93", "18"]; let numbers: Result<Vec<_>, _> = strings .into_iter() .map(|s| s.parse::<i32>()) .collect(); println!("Results: {:?}", numbers); }
This same technique can be used with Option
.
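A sketch of the `Option` counterpart: collecting an iterator of `Option<T>` into an `Option<Vec<T>>` yields `None` as soon as any item is `None`. The helper `increment_all` is made up for illustration.

```rust
// Hypothetical helper: adds one to every value, or gives up entirely
// if any addition would overflow a `u8`.
fn increment_all(values: &[u8]) -> Option<Vec<u8>> {
    values
        .iter()
        // `checked_add` returns `None` on overflow instead of panicking.
        .map(|v| v.checked_add(1))
        .collect()
}

fn main() {
    println!("{:?}", increment_all(&[1, 2, 3]));
    println!("{:?}", increment_all(&[1, 255, 3]));
}
```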
Collect all valid values and failures with partition()
fn main() { let strings = vec!["tofu", "93", "18"]; let (numbers, errors): (Vec<_>, Vec<_>) = strings .into_iter() .map(|s| s.parse::<i32>()) .partition(Result::is_ok); println!("Numbers: {:?}", numbers); println!("Errors: {:?}", errors); }
When you look at the results, you'll note that everything is still wrapped in
Result
. A little more boilerplate is needed for this.
fn main() { let strings = vec!["tofu", "93", "18"]; let (numbers, errors): (Vec<_>, Vec<_>) = strings .into_iter() .map(|s| s.parse::<i32>()) .partition(Result::is_ok); let numbers: Vec<_> = numbers.into_iter().map(Result::unwrap).collect(); let errors: Vec<_> = errors.into_iter().map(Result::unwrap_err).collect(); println!("Numbers: {:?}", numbers); println!("Errors: {:?}", errors); }
Std library types
The `std` library provides many custom types which expand drastically on the `primitives`. Some of these include:
- growable `String`s like: `"hello world"`
- growable vectors: `[1, 2, 3]`
- optional types: `Option<i32>`
- error handling types: `Result<i32, i32>`
- heap allocated pointers: `Box<i32>`
See also:
primitives and the std library
Box, stack and heap
All values in Rust are stack allocated by default. Values can be boxed
(allocated on the heap) by creating a Box<T>
. A box is a smart pointer to a
heap allocated value of type T
. When a box goes out of scope, its destructor
is called, the inner object is destroyed, and the memory on the heap is freed.
Boxed values can be dereferenced using the *
operator; this removes one layer
of indirection.
use std::mem; #[allow(dead_code)] #[derive(Debug, Clone, Copy)] struct Point { x: f64, y: f64, } // A Rectangle can be specified by where its top left and bottom right // corners are in space #[allow(dead_code)] struct Rectangle { top_left: Point, bottom_right: Point, } fn origin() -> Point { Point { x: 0.0, y: 0.0 } } fn boxed_origin() -> Box<Point> { // Allocate this point on the heap, and return a pointer to it Box::new(Point { x: 0.0, y: 0.0 }) } fn main() { // (all the type annotations are superfluous) // Stack allocated variables let point: Point = origin(); let rectangle: Rectangle = Rectangle { top_left: origin(), bottom_right: Point { x: 3.0, y: -4.0 } }; // Heap allocated rectangle let boxed_rectangle: Box<Rectangle> = Box::new(Rectangle { top_left: origin(), bottom_right: Point { x: 3.0, y: -4.0 }, }); // The output of functions can be boxed let boxed_point: Box<Point> = Box::new(origin()); // Double indirection let box_in_a_box: Box<Box<Point>> = Box::new(boxed_origin()); println!("Point occupies {} bytes on the stack", mem::size_of_val(&point)); println!("Rectangle occupies {} bytes on the stack", mem::size_of_val(&rectangle)); // box size == pointer size println!("Boxed point occupies {} bytes on the stack", mem::size_of_val(&boxed_point)); println!("Boxed rectangle occupies {} bytes on the stack", mem::size_of_val(&boxed_rectangle)); println!("Boxed box occupies {} bytes on the stack", mem::size_of_val(&box_in_a_box)); // Copy the data contained in `boxed_point` into `unboxed_point` let unboxed_point: Point = *boxed_point; println!("Unboxed point occupies {} bytes on the stack", mem::size_of_val(&unboxed_point)); }
Vectors
Vectors are re-sizable arrays. Like slices, their size is not known at compile time, but they can grow or shrink at any time. A vector is represented using 3 parameters:
- pointer to the data
- length
- capacity
The capacity indicates how much memory is reserved for the vector. The vector can grow as long as the length is smaller than the capacity. When this threshold needs to be surpassed, the vector is reallocated with a larger capacity.
fn main() { // Iterators can be collected into vectors let collected_iterator: Vec<i32> = (0..10).collect(); println!("Collected (0..10) into: {:?}", collected_iterator); // The `vec!` macro can be used to initialize a vector let mut xs = vec![1i32, 2, 3]; println!("Initial vector: {:?}", xs); // Insert new element at the end of the vector println!("Push 4 into the vector"); xs.push(4); println!("Vector: {:?}", xs); // Error! Immutable vectors can't grow collected_iterator.push(0); // FIXME ^ Comment out this line // The `len` method yields the number of elements currently stored in a vector println!("Vector length: {}", xs.len()); // Indexing is done using the square brackets (indexing starts at 0) println!("Second element: {}", xs[1]); // `pop` removes the last element from the vector and returns it println!("Pop last element: {:?}", xs.pop()); // Out of bounds indexing yields a panic println!("Fourth element: {}", xs[3]); // FIXME ^ Comment out this line // `Vector`s can be easily iterated over println!("Contents of xs:"); for x in xs.iter() { println!("> {}", x); } // A `Vector` can also be iterated over while the iteration // count is enumerated in a separate variable (`i`) for (i, x) in xs.iter().enumerate() { println!("In position {} we have value {}", i, x); } // Thanks to `iter_mut`, mutable `Vector`s can also be iterated // over in a way that allows modifying each value for x in xs.iter_mut() { *x *= 3; } println!("Updated vector: {:?}", xs); }
More Vec methods can be found under the std::vec module.
Strings
There are two types of strings in Rust: String and &str.
A String is stored as a vector of bytes (Vec<u8>), but guaranteed to always be a valid UTF-8 sequence. String is heap allocated, growable and not null terminated.
&str is a slice (&[u8]) that always points to a valid UTF-8 sequence, and can be used to view into a String, just like &[T] is a view into Vec<T>.
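As a small illustration of the "view" relationship, the sketch below borrows a &str from a String without copying it. The function first_word is a hypothetical helper written for this example only.

```rust
// Hypothetical helper: returns the first word of `text` as a slice that
// borrows from `text`; no new string is allocated.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    // `\u{e9}` is the letter e with an acute accent
    let owned: String = String::from("h\u{e9}llo world");

    // `&owned[..]` (or `owned.as_str()`) borrows the whole `String` as a `&str`
    let view: &str = &owned[..];

    // `len` counts bytes, not characters: the accented letter takes two bytes
    println!("bytes: {}, chars: {}", view.len(), view.chars().count());

    // A `&String` is accepted where a `&str` is expected
    println!("first word: {}", first_word(&owned));
}
```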
fn main() { // (all the type annotations are superfluous) // A reference to a string allocated in read only memory let pangram: &'static str = "the quick brown fox jumps over the lazy dog"; println!("Pangram: {}", pangram); // Iterate over words in reverse, no new string is allocated println!("Words in reverse"); for word in pangram.split_whitespace().rev() { println!("> {}", word); } // Copy chars into a vector, sort and remove duplicates let mut chars: Vec<char> = pangram.chars().collect(); chars.sort(); chars.dedup(); // Create an empty and growable `String` let mut string = String::new(); for c in chars { // Insert a char at the end of string string.push(c); // Insert a string at the end of string string.push_str(", "); } // The trimmed string is a slice to the original string, hence no new // allocation is performed let chars_to_trim: &[char] = &[' ', ',']; let trimmed_str: &str = string.trim_matches(chars_to_trim); println!("Used characters: {}", trimmed_str); // Heap allocate a string let alice = String::from("I like dogs"); // Allocate new memory and store the modified string there let bob: String = alice.replace("dog", "cat"); println!("Alice says: {}", alice); println!("Bob says: {}", bob); }
More str/String methods can be found under the std::str and std::string modules.
Literals and escapes
There are multiple ways to write string literals with special characters in them. All result in a similar &str, so it's best to use the form that is the most convenient to write. Similarly, there are multiple ways to write byte string literals, which all result in &[u8; N].
Generally special characters are escaped with a backslash character: \. This way you can add any character to your string, even unprintable ones and ones that you don't know how to type. If you want a literal backslash, escape it with another one: \\
String or character literal delimiters occurring within a literal must be escaped: "\"", '\''.
fn main() { // You can use escapes to write bytes by their hexadecimal values... let byte_escape = "I'm writing \x52\x75\x73\x74!"; println!("What are you doing\x3F (\\x3F means ?) {}", byte_escape); // ...or Unicode code points. let unicode_codepoint = "\u{211D}"; let character_name = "\"DOUBLE-STRUCK CAPITAL R\""; println!("Unicode character {} (U+211D) is called {}", unicode_codepoint, character_name ); let long_string = "String literals can span multiple lines. The linebreak and indentation here ->\ <- can be escaped too!"; println!("{}", long_string); }
Sometimes there are just too many characters that need to be escaped or it's just much more convenient to write a string out as-is. This is where raw string literals come into play.
fn main() { let raw_str = r"Escapes don't work here: \x3F \u{211D}"; println!("{}", raw_str); // If you need quotes in a raw string, add a pair of #s let quotes = r#"And then I said: "There is no escape!""#; println!("{}", quotes); // If you need "# in your string, just use more #s in the delimiter. // There is no limit for the number of #s you can use. let longer_delimiter = r###"A string with "# in it. And even "##!"###; println!("{}", longer_delimiter); }
Want a string that's not UTF-8? (Remember, str and String must be valid UTF-8.) Or maybe you want an array of bytes that's mostly text? Byte strings to the rescue!
use std::str; fn main() { // Note that this is not actually a `&str` let bytestring: &[u8; 21] = b"this is a byte string"; // Byte arrays don't have the `Display` trait, so printing them is a bit limited println!("A byte string: {:?}", bytestring); // Byte strings can have byte escapes... let escaped = b"\x52\x75\x73\x74 as bytes"; // ...but no unicode escapes // let escaped = b"\u{211D} is not allowed"; println!("Some escaped bytes: {:?}", escaped); // Raw byte strings work just like raw strings let raw_bytestring = br"\u{211D} is not escaped here"; println!("{:?}", raw_bytestring); // Converting a byte array to `str` can fail if let Ok(my_str) = str::from_utf8(raw_bytestring) { println!("And the same as text: '{}'", my_str); } let _quotes = br#"You can also use "fancier" formatting, \ like with normal raw strings"#; // Byte strings don't have to be UTF-8 let shift_jis = b"\x82\xe6\x82\xa8\x82\xb1\x82\xbb"; // "ようこそ" in SHIFT-JIS // But then they can't always be converted to `str` match str::from_utf8(shift_jis) { Ok(my_str) => println!("Conversion successful: '{}'", my_str), Err(e) => println!("Conversion failed: {:?}", e), }; }
For conversions between character encodings check out the encoding crate.
A more detailed listing of the ways to write string literals and escape characters is given in the 'Tokens' chapter of the Rust Reference.
Option
Sometimes it's desirable to catch the failure of some parts of a program instead of calling panic!; this can be accomplished using the Option enum.
The Option<T> enum has two variants:
- None, to indicate failure or lack of value, and
- Some(value), a tuple struct that wraps a value with type T.
// An integer division that doesn't `panic!` fn checked_division(dividend: i32, divisor: i32) -> Option<i32> { if divisor == 0 { // Failure is represented as the `None` variant None } else { // Result is wrapped in a `Some` variant Some(dividend / divisor) } } // This function handles a division that may not succeed fn try_division(dividend: i32, divisor: i32) { // `Option` values can be pattern matched, just like other enums match checked_division(dividend, divisor) { None => println!("{} / {} failed!", dividend, divisor), Some(quotient) => { println!("{} / {} = {}", dividend, divisor, quotient) }, } } fn main() { try_division(4, 2); try_division(1, 0); // Binding `None` to a variable needs to be type annotated let none: Option<i32> = None; let _equivalent_none = None::<i32>; let optional_float = Some(0f32); // Unwrapping a `Some` variant will extract the value wrapped. println!("{:?} unwraps to {:?}", optional_float, optional_float.unwrap()); // Unwrapping a `None` variant will `panic!` println!("{:?} unwraps to {:?}", none, none.unwrap()); }
Result
We've seen that the Option enum can be used as a return value from functions that may fail, where None can be returned to indicate failure. However, sometimes it is important to express why an operation failed. To do this we have the Result enum.
The Result<T, E> enum has two variants:
- Ok(value), which indicates that the operation succeeded, and wraps the value returned by the operation (value has type T).
- Err(why), which indicates that the operation failed, and wraps why, which (hopefully) explains the cause of the failure (why has type E).
mod checked { // Mathematical "errors" we want to catch #[derive(Debug)] pub enum MathError { DivisionByZero, NonPositiveLogarithm, NegativeSquareRoot, } pub type MathResult = Result<f64, MathError>; pub fn div(x: f64, y: f64) -> MathResult { if y == 0.0 { // This operation would `fail`, instead let's return the reason of // the failure wrapped in `Err` Err(MathError::DivisionByZero) } else { // This operation is valid, return the result wrapped in `Ok` Ok(x / y) } } pub fn sqrt(x: f64) -> MathResult { if x < 0.0 { Err(MathError::NegativeSquareRoot) } else { Ok(x.sqrt()) } } pub fn ln(x: f64) -> MathResult { if x <= 0.0 { Err(MathError::NonPositiveLogarithm) } else { Ok(x.ln()) } } } // `op(x, y)` === `sqrt(ln(x / y))` fn op(x: f64, y: f64) -> f64 { // This is a three level match pyramid! match checked::div(x, y) { Err(why) => panic!("{:?}", why), Ok(ratio) => match checked::ln(ratio) { Err(why) => panic!("{:?}", why), Ok(ln) => match checked::sqrt(ln) { Err(why) => panic!("{:?}", why), Ok(sqrt) => sqrt, }, }, } } fn main() { // Will this fail? println!("{}", op(1.0, 10.0)); }
?
Chaining results using match can get pretty untidy; luckily, the ? operator can be used to make things pretty again. ? is used at the end of an expression returning a Result, and is equivalent to a match expression, where the Err(err) branch expands to an early return Err(From::from(err)), and the Ok(ok) branch expands to an ok expression.
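To make the expansion concrete, here is a minimal sketch that writes the same function twice: once with ? and once with the match it roughly stands for. AppError, double and double_explicit are hypothetical names introduced for this illustration; the From implementation is what allows ? to convert a ParseIntError into AppError.

```rust
use std::num::ParseIntError;

// Hypothetical error type for this sketch
#[derive(Debug, PartialEq)]
enum AppError {
    Parse(String),
}

// `?` passes the error through `From::from`, so this impl lets a
// `ParseIntError` be returned from a function whose error type is `AppError`
impl From<ParseIntError> for AppError {
    fn from(err: ParseIntError) -> Self {
        AppError::Parse(err.to_string())
    }
}

// Version using `?`
fn double(input: &str) -> Result<i32, AppError> {
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}

// Roughly what the `?` above stands for
fn double_explicit(input: &str) -> Result<i32, AppError> {
    let n: i32 = match input.trim().parse::<i32>() {
        Ok(ok) => ok,
        Err(err) => return Err(From::from(err)),
    };
    Ok(n * 2)
}

fn main() {
    println!("{:?}", double("21"));
    println!("{:?}", double_explicit("21"));
    println!("{:?}", double("abc"));
}
```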
mod checked { #[derive(Debug)] enum MathError { DivisionByZero, NonPositiveLogarithm, NegativeSquareRoot, } type MathResult = Result<f64, MathError>; fn div(x: f64, y: f64) -> MathResult { if y == 0.0 { Err(MathError::DivisionByZero) } else { Ok(x / y) } } fn sqrt(x: f64) -> MathResult { if x < 0.0 { Err(MathError::NegativeSquareRoot) } else { Ok(x.sqrt()) } } fn ln(x: f64) -> MathResult { if x <= 0.0 { Err(MathError::NonPositiveLogarithm) } else { Ok(x.ln()) } } // Intermediate function fn op_(x: f64, y: f64) -> MathResult { // if `div` "fails", then `DivisionByZero` will be `return`ed let ratio = div(x, y)?; // if `ln` "fails", then `NonPositiveLogarithm` will be `return`ed let ln = ln(ratio)?; sqrt(ln) } pub fn op(x: f64, y: f64) { match op_(x, y) { Err(why) => panic!("{}", match why { MathError::NonPositiveLogarithm => "logarithm of non-positive number", MathError::DivisionByZero => "division by zero", MathError::NegativeSquareRoot => "square root of negative number", }), Ok(value) => println!("{}", value), } } } fn main() { checked::op(1.0, 10.0); }
Be sure to check the documentation, as there are many methods to map/compose Result.
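As a small taste of those methods, the sketch below composes fallible steps with map_err and and_then instead of nested match expressions. The functions parse, safe_div and parse_and_divide are hypothetical helpers, and using String as the error type is a simplification chosen for brevity.

```rust
// Hypothetical helper: parse a number, turning the parse error into a message
fn parse(s: &str) -> Result<f64, String> {
    s.trim()
        .parse::<f64>()
        // `map_err` transforms the error and leaves an `Ok` value untouched
        .map_err(|e| format!("bad number {:?}: {}", s, e))
}

// Hypothetical helper: division that reports a zero divisor as an error
fn safe_div(x: f64, y: f64) -> Result<f64, String> {
    if y == 0.0 {
        Err("division by zero".to_string())
    } else {
        Ok(x / y)
    }
}

// `and_then` runs the next step only if the previous one succeeded
fn parse_and_divide(a: &str, b: &str) -> Result<f64, String> {
    parse(a).and_then(|x| parse(b).and_then(|y| safe_div(x, y)))
}

fn main() {
    println!("{:?}", parse_and_divide("10", "4"));
    println!("{:?}", parse_and_divide("10", "0"));
    // `unwrap_or` supplies a fallback value instead of panicking
    println!("{}", parse_and_divide("ten", "4").unwrap_or(f64::NAN));
}
```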
panic!
The panic! macro can be used to generate a panic and start unwinding the stack of the current thread. While unwinding, the runtime will take care of freeing all the resources owned by the thread by calling the destructors of all its objects.
Since we are dealing with programs with only one thread, panic! will cause the program to report the panic message and exit.
// Re-implementation of integer division (/)
fn division(dividend: i32, divisor: i32) -> i32 {
    if divisor == 0 {
        // Division by zero triggers a panic
        panic!("division by zero");
    } else {
        dividend / divisor
    }
}

// The `main` task
fn main() {
    // Heap allocated integer
    let _x = Box::new(0i32);

    // This operation will trigger a task failure
    division(3, 0);

    println!("This point won't be reached!");

    // `_x` should get destroyed at this point
}
Let's check that panic! doesn't leak memory.
$ rustc panic.rs && valgrind ./panic
==4401== Memcheck, a memory error detector
==4401== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4401== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==4401== Command: ./panic
==4401==
thread '<main>' panicked at 'division by zero', panic.rs:5
==4401==
==4401== HEAP SUMMARY:
==4401== in use at exit: 0 bytes in 0 blocks
==4401== total heap usage: 18 allocs, 18 frees, 1,648 bytes allocated
==4401==
==4401== All heap blocks were freed -- no leaks are possible
==4401==
==4401== For counts of detected and suppressed errors, rerun with: -v
==4401== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
HashMap
Where vectors store values by an integer index, HashMaps store values by key. HashMap keys can be booleans, integers, strings, or any other type that implements the Eq and Hash traits. More on this in the next section.
Like vectors, HashMaps are growable, but HashMaps can also shrink themselves when they have excess space. You can create a HashMap with a certain starting capacity using HashMap::with_capacity(usize), or use HashMap::new() to get a HashMap with a default initial capacity (recommended).
use std::collections::HashMap; fn call(number: &str) -> &str { match number { "798-1364" => "We're sorry, the call cannot be completed as dialed. Please hang up and try again.", "645-7689" => "Hello, this is Mr. Awesome's Pizza. My name is Fred. What can I get for you today?", _ => "Hi! Who is this again?" } } fn main() { let mut contacts = HashMap::new(); contacts.insert("Daniel", "798-1364"); contacts.insert("Ashley", "645-7689"); contacts.insert("Katie", "435-8291"); contacts.insert("Robert", "956-1745"); // Takes a reference and returns Option<&V> match contacts.get(&"Daniel") { Some(&number) => println!("Calling Daniel: {}", call(number)), _ => println!("Don't have Daniel's number."), } // `HashMap::insert()` returns `None` // if the inserted value is new, `Some(value)` otherwise contacts.insert("Daniel", "164-6743"); match contacts.get(&"Ashley") { Some(&number) => println!("Calling Ashley: {}", call(number)), _ => println!("Don't have Ashley's number."), } contacts.remove(&"Ashley"); // `HashMap::iter()` returns an iterator that yields // (&'a key, &'a value) pairs in arbitrary order. for (contact, &number) in contacts.iter() { println!("Calling {}: {}", contact, call(number)); } }
For more information on how hashing and hash maps (sometimes called hash tables) work, have a look at the Wikipedia article on hash tables.
Alternate/custom key types
Any type that implements the Eq and Hash traits can be a key in HashMap. This includes:
- bool (though not very useful since there are only two possible keys)
- the integer types, such as i32 and u32, and all variations thereof
- String and &str (protip: you can have a HashMap keyed by String and call .get() with an &str)
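The protip about String keys can be sketched as follows. The helper names build_ages and age_of, and the sample data, are made up for this illustration; the lookup works because String implements Borrow<str>, which is what HashMap::get requires of the key type.

```rust
use std::collections::HashMap;

// Hypothetical helper: builds a small map with owned `String` keys
fn build_ages() -> HashMap<String, u32> {
    let mut ages = HashMap::new();
    ages.insert(String::from("alice"), 30);
    ages.insert(String::from("bob"), 25);
    ages
}

// Looks up by `&str`; no `String` has to be allocated for the query
fn age_of(ages: &HashMap<String, u32>, name: &str) -> Option<u32> {
    // `get` returns `Option<&u32>`; `copied` turns it into `Option<u32>`
    ages.get(name).copied()
}

fn main() {
    let ages = build_ages();
    println!("{:?}", age_of(&ages, "alice"));
    println!("{:?}", age_of(&ages, "carol"));
}
```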
Note that f32 and f64 do not implement Hash, likely because floating-point precision errors would make using them as hashmap keys horribly error-prone.
All collection classes implement Eq and Hash if their contained type also respectively implements Eq and Hash. For example, Vec<T> will implement Hash if T implements Hash.
You can easily implement Eq and Hash for a custom type with just one line:
#[derive(PartialEq, Eq, Hash)]
The compiler will do the rest. If you want more control over the details, you can implement Eq and/or Hash yourself. This guide will not cover the specifics of implementing Hash.
To play around with using a struct in HashMap, let's try making a very simple user logon system:
use std::collections::HashMap; // Eq requires that you derive PartialEq on the type. #[derive(PartialEq, Eq, Hash)] struct Account<'a>{ username: &'a str, password: &'a str, } struct AccountInfo<'a>{ name: &'a str, email: &'a str, } type Accounts<'a> = HashMap<Account<'a>, AccountInfo<'a>>; fn try_logon<'a>(accounts: &Accounts<'a>, username: &'a str, password: &'a str){ println!("Username: {}", username); println!("Password: {}", password); println!("Attempting logon..."); let logon = Account { username, password, }; match accounts.get(&logon) { Some(account_info) => { println!("Successful logon!"); println!("Name: {}", account_info.name); println!("Email: {}", account_info.email); }, _ => println!("Login failed!"), } } fn main(){ let mut accounts: Accounts = HashMap::new(); let account = Account { username: "j.everyman", password: "password123", }; let account_info = AccountInfo { name: "John Everyman", email: "j.everyman@email.com", }; accounts.insert(account, account_info); try_logon(&accounts, "j.everyman", "psasword123"); try_logon(&accounts, "j.everyman", "password123"); }
HashSet
Consider a HashSet as a HashMap where we just care about the keys (HashSet<T> is, in actuality, just a wrapper around HashMap<T, ()>).
"What's the point of that?" you ask. "I could just store the keys in a Vec."
A HashSet's unique feature is that it is guaranteed to not have duplicate elements. That's the contract that any set collection fulfills. HashSet is just one implementation (see also: BTreeSet).
If you insert a value that is already present in the HashSet (i.e. the new value is equal to the existing one and they both have the same hash), then insert returns false and the set is left unchanged: the existing value is kept and the new one is dropped.
This is great for when you never want more than one of something, or when you want to know if you've already got something.
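The standard library documentation for HashSet::insert states that inserting a value that is already present returns false and leaves the set unmodified. The sketch below checks which of two "equal" values survives. Tagged and surviving_label are hypothetical names, and the hand-written Eq and Hash implementations, which deliberately look only at the id field, exist purely to make the two values distinguishable.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hypothetical type: equality and hashing look only at `id`, so two values
// with different `label`s still count as the same set element.
#[derive(Debug)]
struct Tagged {
    id: u32,
    label: &'static str,
}

impl PartialEq for Tagged {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}

impl Eq for Tagged {}

impl Hash for Tagged {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.hash(state);
    }
}

// Inserts two values with the same `id` and reports which label is stored
fn surviving_label() -> &'static str {
    let mut set = HashSet::new();
    // First insertion: the value is new, so `insert` returns true
    assert!(set.insert(Tagged { id: 1, label: "first" }));
    // Same `id` again: `insert` returns false and the set is not modified
    assert!(!set.insert(Tagged { id: 1, label: "second" }));
    set.iter().next().map(|t| t.label).unwrap_or("")
}

fn main() {
    println!("label stored in the set: {}", surviving_label());
}
```

When the new value should take the place of the old one, HashSet::replace can be used instead of insert.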
But sets can do more than that.
Sets have 4 primary operations (all of the following calls return an iterator):
- union: get all the unique elements that are in either set.
- difference: get all the elements that are in the first set but not the second.
- intersection: get all the elements that are in both sets.
- symmetric_difference: get all the elements that are in one set or the other, but not both.
Try all of these in the following example:
use std::collections::HashSet; fn main() { let mut a: HashSet<i32> = vec![1i32, 2, 3].into_iter().collect(); let mut b: HashSet<i32> = vec![2i32, 3, 4].into_iter().collect(); assert!(a.insert(4)); assert!(a.contains(&4)); // `HashSet::insert()` returns false if // there was a value already present. assert!(b.insert(4), "Value 4 is already in set B!"); // FIXME ^ Comment out this line b.insert(5); // If a collection's element type implements `Debug`, // then the collection implements `Debug`. // It usually prints its elements in the format `[elem1, elem2, ...]` println!("A: {:?}", a); println!("B: {:?}", b); // Print [1, 2, 3, 4, 5] in arbitrary order println!("Union: {:?}", a.union(&b).collect::<Vec<&i32>>()); // This should print [1] println!("Difference: {:?}", a.difference(&b).collect::<Vec<&i32>>()); // Print [2, 3, 4] in arbitrary order. println!("Intersection: {:?}", a.intersection(&b).collect::<Vec<&i32>>()); // Print [1, 5] println!("Symmetric Difference: {:?}", a.symmetric_difference(&b).collect::<Vec<&i32>>()); }
(Examples are adapted from the documentation.)
Rc
When multiple ownership is needed, Rc (Reference Counting) can be used. Rc keeps track of the number of references, that is, the number of owners of the value wrapped inside an Rc.
The reference count of an Rc increases by 1 whenever an Rc is cloned, and decreases by 1 whenever a cloned Rc is dropped out of scope. When an Rc's reference count becomes zero, which means there are no owners remaining, both the Rc and the value are dropped.
Cloning an Rc never performs a deep copy. Cloning creates just another pointer to the wrapped value, and increments the count.
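That cloning only copies the pointer can be checked with Rc::ptr_eq, which compares the addresses of the shared allocations. This is a minimal sketch; clone_shares_allocation is a hypothetical helper written for this illustration.

```rust
use std::rc::Rc;

// Hypothetical helper: clones an `Rc` and reports
// (do both handles point to the same allocation?, strong count)
fn clone_shares_allocation() -> (bool, usize) {
    let a: Rc<Vec<i32>> = Rc::new(vec![1, 2, 3]);
    let b = Rc::clone(&a);
    // `Rc::ptr_eq` compares the addresses of the shared allocation
    (Rc::ptr_eq(&a, &b), Rc::strong_count(&a))
}

fn main() {
    let (same, count) = clone_shares_allocation();
    println!("same allocation: {}, strong count: {}", same, count);

    // By contrast, cloning the inner value makes an independent copy
    let a = Rc::new(vec![1, 2, 3]);
    let deep: Vec<i32> = (*a).clone();
    println!("deep copy equals original: {}", deep == *a);
    println!("strong count is still: {}", Rc::strong_count(&a));
}
```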
use std::rc::Rc; fn main() { let rc_examples = "Rc examples".to_string(); { println!("--- rc_a is created ---"); let rc_a: Rc<String> = Rc::new(rc_examples); println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a)); { println!("--- rc_a is cloned to rc_b ---"); let rc_b: Rc<String> = Rc::clone(&rc_a); println!("Reference Count of rc_b: {}", Rc::strong_count(&rc_b)); println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a)); // Two `Rc`s are equal if their inner values are equal println!("rc_a and rc_b are equal: {}", rc_a.eq(&rc_b)); // We can use methods of a value directly println!("Length of the value inside rc_a: {}", rc_a.len()); println!("Value of rc_b: {}", rc_b); println!("--- rc_b is dropped out of scope ---"); } println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a)); println!("--- rc_a is dropped out of scope ---"); } // Error! `rc_examples` already moved into `rc_a` // And when `rc_a` is dropped, `rc_examples` is dropped together // println!("rc_examples: {}", rc_examples); // TODO ^ Try uncommenting this line }
See also:
std::rc and std::sync::arc.
Arc
When shared ownership between threads is needed, Arc (Atomically Reference Counted) can be used. Through its Clone implementation, this struct creates another reference pointer to the location of a value on the heap while incrementing the reference counter. Because ownership is shared between threads, the value is dropped when the last reference pointer to it goes out of scope.
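fn main() {
    use std::sync::Arc;
    use std::thread;

    // This variable declaration is where its value is specified.
    let apple = Arc::new("the same apple");
    // Keep the join handles so that `main` can wait for every thread.
    let mut handles = Vec::new();

    for _ in 0..10 {
        // Here there is no value specification as it is a pointer to a
        // reference in the memory heap.
        let apple = Arc::clone(&apple);

        handles.push(thread::spawn(move || {
            // As Arc was used, threads can be spawned using the value allocated
            // in the Arc variable pointer's location.
            println!("{:?}", apple);
        }));
    }

    // Without this loop, `main` could return before the threads have printed.
    for handle in handles {
        handle.join().unwrap();
    }
}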
Std misc
Many other types are provided by the std library to support things such as:
- Threads
- Channels
- File I/O
These expand beyond what the primitives provide.
See also:
primitives and the std library
Threads
Rust provides a mechanism for spawning native OS threads via the spawn function. The argument of this function is a moving closure.
use std::thread;

const NTHREADS: u32 = 10;

// This is the `main` thread
fn main() {
    // Make a vector to hold the children which are spawned.
    let mut children = vec![];

    for i in 0..NTHREADS {
        // Spin up another thread
        children.push(thread::spawn(move || {
            println!("this is thread number {}", i);
        }));
    }

    for child in children {
        // Wait for the thread to finish. Returns a result.
        let _ = child.join();
    }
}
These threads will be scheduled by the OS.
Testcase: map-reduce
Rust makes it very easy to parallelise data processing, without many of the headaches traditionally associated with such an attempt.
The standard library provides great threading primitives out of the box. These, combined with Rust's concept of Ownership and aliasing rules, automatically prevent data races.
The aliasing rules (one writable reference XOR many readable references) automatically prevent you from manipulating state that is visible to other threads. (Where synchronisation is needed, there are synchronisation primitives like Mutexes or Channels.)
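For the case where threads do need to modify shared state, a Mutex wrapped in an Arc is one common arrangement. The following is a minimal sketch under that assumption; count_in_parallel is a hypothetical helper and is not part of the map-reduce example in this section.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawns `n_threads` threads that each add 1 to a
// shared counter `per_thread` times, then returns the final count.
fn count_in_parallel(n_threads: usize, per_thread: usize) -> usize {
    // `Arc` shares ownership across threads; `Mutex` serialises access
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::with_capacity(n_threads);

    for _ in 0..n_threads {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                // `lock` blocks until this thread holds the mutex; the guard
                // releases it again at the end of the statement
                *counter.lock().unwrap() += 1;
            }
        }));
    }

    // Wait for every thread before reading the result
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("final count: {}", count_in_parallel(4, 1000));
}
```

Trying to share a plain mutable integer between the threads instead would be rejected at compile time, which is the guarantee the paragraph above describes.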
In this example, we will calculate the sum of all digits in a block of numbers. We will do this by parcelling out chunks of the block into different threads. Each thread will sum its tiny block of digits, and subsequently we will sum the intermediate sums produced by each thread.
Note that, although we're passing references across thread boundaries, Rust understands that we're only passing read-only references, and that thus no unsafety or data races can occur. Because we're move-ing the data segments into the thread, Rust will also ensure the data is kept alive until the threads exit, so no dangling pointers occur.
use std::thread; // This is the `main` thread fn main() { // This is our data to process. // We will calculate the sum of all digits via a threaded map-reduce algorithm. // Each whitespace separated chunk will be handled in a different thread. // // TODO: see what happens to the output if you insert spaces! let data = "86967897737416471853297327050364959 11861322575564723963297542624962850 70856234701860851907960690014725639 38397966707106094172783238747669219 52380795257888236525459303330302837 58495327135744041048897885734297812 69920216438980873548808413720956532 16278424637452589860345374828574668"; // Make a vector to hold the child-threads which we will spawn. let mut children = vec![]; /************************************************************************* * "Map" phase * * Divide our data into segments, and apply initial processing ************************************************************************/ // split our data into segments for individual calculation // each chunk will be a reference (&str) into the actual data let chunked_data = data.split_whitespace(); // Iterate over the data segments. // .enumerate() adds the current loop index to whatever is iterated // the resulting tuple "(index, element)" is then immediately // "destructured" into two variables, "i" and "data_segment" with a // "destructuring assignment" for (i, data_segment) in chunked_data.enumerate() { println!("data segment {} is \"{}\"", i, data_segment); // Process each data segment in a separate thread // // spawn() returns a handle to the new thread, // which we MUST keep to access the returned value // // 'move || -> u32' is syntax for a closure that: // * takes no arguments ('||') // * takes ownership of its captured variables ('move') and // * returns an unsigned 32-bit integer ('-> u32') // // Rust is smart enough to infer the '-> u32' from // the closure itself so we could have left that out. 
// // TODO: try removing the 'move' and see what happens children.push(thread::spawn(move || -> u32 { // Calculate the intermediate sum of this segment: let result = data_segment // iterate over the characters of our segment.. .chars() // .. convert text-characters to their number value.. .map(|c| c.to_digit(10).expect("should be a digit")) // .. and sum the resulting iterator of numbers .sum(); // println! locks stdout, so no text-interleaving occurs println!("processed segment {}, result={}", i, result); // "return" not needed, because Rust is an "expression language", the // last evaluated expression in each block is automatically its value. result })); } /************************************************************************* * "Reduce" phase * * Collect our intermediate results, and combine them into a final result ************************************************************************/ // combine each thread's intermediate results into a single final sum. // // we use the "turbofish" ::<> to provide sum() with a type hint. // // TODO: try without the turbofish, by instead explicitly // specifying the type of final_result let final_result = children.into_iter().map(|c| c.join().unwrap()).sum::<u32>(); println!("Final sum result: {}", final_result); }
Assignments
It is not wise to let our number of threads depend on user inputted data. What if the user decides to insert a lot of spaces? Do we really want to spawn 2,000 threads? Modify the program so that the data is always chunked into a limited number of chunks, defined by a static constant at the beginning of the program.
See also:
- Threads
- vectors and iterators
- closures, move semantics and move closures
- destructuring assignments
- turbofish notation to help type inference
- unwrap vs. expect
- enumerate
Channels
Rust provides asynchronous channels for communication between threads. Channels allow a unidirectional flow of information between two end-points: the Sender and the Receiver.
use std::sync::mpsc::{Sender, Receiver}; use std::sync::mpsc; use std::thread; static NTHREADS: i32 = 3; fn main() { // Channels have two endpoints: the `Sender<T>` and the `Receiver<T>`, // where `T` is the type of the message to be transferred // (type annotation is superfluous) let (tx, rx): (Sender<i32>, Receiver<i32>) = mpsc::channel(); let mut children = Vec::new(); for id in 0..NTHREADS { // The sender endpoint can be copied let thread_tx = tx.clone(); // Each thread will send its id via the channel let child = thread::spawn(move || { // The thread takes ownership over `thread_tx` // Each thread queues a message in the channel thread_tx.send(id).unwrap(); // Sending is a non-blocking operation, the thread will continue // immediately after sending its message println!("thread {} finished", id); }); children.push(child); } // Here, all the messages are collected let mut ids = Vec::with_capacity(NTHREADS as usize); for _ in 0..NTHREADS { // The `recv` method picks a message from the channel // `recv` will block the current thread if there are no messages available ids.push(rx.recv()); } // Wait for the threads to complete any remaining work for child in children { child.join().expect("oops! the child thread panicked"); } // Show the order in which the messages were sent println!("{:?}", ids); }
Path
The Path struct represents file paths in the underlying filesystem. There is a single std::path::Path type on every platform; it abstracts over platform-specific path syntax and separators, and is brought into scope with use std::path::Path.
A Path can be created from an OsStr, and provides several methods to get information from the file/directory the path points to.
Note that a Path is not internally represented as a UTF-8 string, but instead is stored as an OsStr. Therefore, converting a Path to a &str is not free and may fail (an Option is returned).
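Several of the inspection methods return OsStr values, so getting at a plain string involves the same fallible conversion. The sketch below shows this for the file extension; extension_of is a hypothetical helper and the path used is an arbitrary example that does not need to exist on disk.

```rust
use std::path::Path;

// Hypothetical helper: returns the extension of `path` as a `String`,
// if there is one and it is valid UTF-8.
fn extension_of(path: &Path) -> Option<String> {
    path.extension()
        // `extension` yields an `&OsStr`; `to_str` may fail on non-UTF-8 data
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.to_string())
}

fn main() {
    // Only the textual form of the path is examined; nothing is opened
    let path = Path::new("docs/notes.tar.gz");
    println!("file name: {:?}", path.file_name());
    println!("parent:    {:?}", path.parent());
    println!("extension: {:?}", extension_of(path));
    // `to_string_lossy` never fails: invalid sequences become U+FFFD
    println!("lossy:     {}", path.to_string_lossy());
}
```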
use std::path::Path;

fn main() {
    // Create a `Path` from an `&'static str`
    let path = Path::new(".");

    // The `display` method returns a `Display`able structure
    let _display = path.display();

    // `join` merges a path with a byte container using the OS specific
    // separator, and returns the new path
    let new_path = path.join("a").join("b");

    // Convert the path into a string slice
    match new_path.to_str() {
        None => panic!("new path is not a valid UTF-8 sequence"),
        Some(s) => println!("new path is {}", s),
    }
}
Be sure to check out the other Path methods and the Metadata struct.
File I/O
The File struct represents a file that has been opened (it wraps a file descriptor), and gives read and/or write access to the underlying file.
Since many things can go wrong when doing file I/O, all the File methods return the io::Result<T> type, which is an alias for Result<T, io::Error>.
This makes the failure of all I/O operations explicit. Thanks to this, the programmer can see all the failure paths, and is encouraged to handle them in a proactive manner.
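Because io::Result is an ordinary Result, failures can be passed up to the caller with ? and then inspected. This is a minimal sketch: read_text is a hypothetical helper, and the file name is chosen on the assumption that no such file exists in the current directory.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

// Hypothetical helper: reads a whole file into a `String`, passing any
// I/O error on to the caller instead of panicking.
fn read_text(path: &Path) -> io::Result<String> {
    let mut file = File::open(path)?;
    let mut contents = String::new();
    file.read_to_string(&mut contents)?;
    Ok(contents)
}

fn main() {
    // Assumed not to exist, so that the error path is exercised
    match read_text(Path::new("no_such_file.txt")) {
        Ok(text) => println!("read {} bytes", text.len()),
        // `kind()` lets the caller tell failure causes apart
        Err(e) if e.kind() == io::ErrorKind::NotFound => println!("file not found"),
        Err(e) => println!("other I/O error: {}", e),
    }
}
```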
open
The open function can be used to open a file in read-only mode.
A File owns a resource, the file descriptor, and takes care of closing the file when it is dropped.
use std::fs::File;
use std::io::prelude::*;
use std::path::Path;
fn main() {
    // Create a path to the desired file
    let path = Path::new("hello.txt");
    let display = path.display();
    // Open the path in read-only mode, returns `io::Result<File>`
    let mut file = match File::open(&path) {
        Err(why) => panic!("couldn't open {}: {}", display, why),
        Ok(file) => file,
    };
    // Read the file contents into a string, returns `io::Result<usize>`
    let mut s = String::new();
    match file.read_to_string(&mut s) {
        Err(why) => panic!("couldn't read {}: {}", display, why),
        Ok(_) => print!("{} contains:\n{}", display, s),
    }
    // `file` goes out of scope, and the "hello.txt" file gets closed
}
Here's the expected successful output:
$ echo "Hello World!" > hello.txt
$ rustc open.rs && ./open
hello.txt contains:
Hello World!
(You are encouraged to test the previous example under different failure conditions: hello.txt doesn't exist, or hello.txt is not readable, etc.)
create
The create function opens a file in write-only mode. If the file already existed, the old content is destroyed. Otherwise, a new file is created.
static LOREM_IPSUM: &str =
"Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
";
use std::fs::File;
use std::io::prelude::*;
use std::path::Path;
fn main() {
    let path = Path::new("lorem_ipsum.txt");
    let display = path.display();
    // Open a file in write-only mode, returns `io::Result<File>`
    let mut file = match File::create(&path) {
        Err(why) => panic!("couldn't create {}: {}", display, why),
        Ok(file) => file,
    };
    // Write the `LOREM_IPSUM` string to `file`, returns `io::Result<()>`
    match file.write_all(LOREM_IPSUM.as_bytes()) {
        Err(why) => panic!("couldn't write to {}: {}", display, why),
        Ok(_) => println!("successfully wrote to {}", display),
    }
}
Here's the expected successful output:
$ rustc create.rs && ./create
successfully wrote to lorem_ipsum.txt
$ cat lorem_ipsum.txt
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
(As in the previous example, you are encouraged to test this example under failure conditions.)
There is also an OpenOptions struct that can be used to configure how a file is opened.
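As one illustration of OpenOptions, the sketch below opens a file in append mode, creating it first if necessary. The helper append_line and the scratch file name in the temporary directory are assumptions made for this example.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

// Hypothetical helper: appends one line to the file at `path`,
// creating the file if it does not exist yet.
fn append_line(path: &Path, line: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true) // create the file if it is missing
        .append(true) // write at the end instead of truncating
        .open(path)?;
    writeln!(file, "{}", line)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch file in the system temporary directory
    let path = std::env::temp_dir().join("open_options_example.log");
    // Start from a clean slate; a "not found" error is fine to ignore here
    let _ = fs::remove_file(&path);

    append_line(&path, "first")?;
    append_line(&path, "second")?;
    print!("{}", fs::read_to_string(&path)?);

    // Clean up the scratch file again
    fs::remove_file(&path)
}
```

Compare this with File::create from the previous example, which would have discarded the first line when the file was opened a second time.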
read_lines
The method lines() returns an iterator over the lines of a file.
File::open is generic over any type that implements AsRef<Path>, so read_lines() accepts the same kind of argument as its input.
use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;
fn main() {
    // File hosts must exist in current path before this produces output
    if let Ok(lines) = read_lines("./hosts") {
        // Consumes the iterator, returns an (Optional) String
        for line in lines {
            if let Ok(ip) = line {
                println!("{}", ip);
            }
        }
    }
}
// The output is wrapped in a Result to allow matching on errors
// Returns an Iterator to the Reader of the lines of the file.
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}
Running this program simply prints the lines individually.
$ echo -e "127.0.0.1\n192.168.0.1\n" > hosts
$ rustc read_lines.rs && ./read_lines
127.0.0.1
192.168.0.1
This approach is more efficient than reading the whole file into a `String` in memory, especially when working with larger files.
Child processes
The `process::Output` struct represents the output of a finished child process, and the `process::Command` struct is a process builder.
use std::process::Command;
fn main() {
let output = Command::new("rustc")
.arg("--version")
.output().unwrap_or_else(|e| {
panic!("failed to execute process: {}", e)
});
if output.status.success() {
let s = String::from_utf8_lossy(&output.stdout);
print!("rustc succeeded and stdout was:\n{}", s);
} else {
let s = String::from_utf8_lossy(&output.stderr);
print!("rustc failed and stderr was:\n{}", s);
}
}
(You are encouraged to try the previous example with an incorrect flag passed to `rustc`.)
Pipes
The `std::process::Child` struct represents a running child process, and exposes the `stdin`, `stdout` and `stderr` handles for interaction with the underlying process via pipes.
use std::io::prelude::*;
use std::process::{Command, Stdio};
static PANGRAM: &'static str =
"the quick brown fox jumped over the lazy dog\n";
fn main() {
// Spawn the `wc` command
let process = match Command::new("wc")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.spawn() {
Err(why) => panic!("couldn't spawn wc: {}", why),
Ok(process) => process,
};
// Write a string to the `stdin` of `wc`.
//
// `stdin` has type `Option<ChildStdin>`, but since we know this instance
// must have one, we can directly `unwrap` it.
match process.stdin.unwrap().write_all(PANGRAM.as_bytes()) {
Err(why) => panic!("couldn't write to wc stdin: {}", why),
Ok(_) => println!("sent pangram to wc"),
}
// Because `stdin` does not live after the above calls, it is `drop`ed,
// and the pipe is closed.
//
// This is very important, otherwise `wc` wouldn't start processing the
// input we just sent.
// The `stdout` field also has type `Option<ChildStdout>` so must be unwrapped.
let mut s = String::new();
match process.stdout.unwrap().read_to_string(&mut s) {
Err(why) => panic!("couldn't read wc stdout: {}", why),
Ok(_) => print!("wc responded with:\n{}", s),
}
}
Wait
If you'd like to wait for a `process::Child` to finish, you must call `Child::wait`, which will return a `process::ExitStatus`.
use std::process::Command;
fn main() {
let mut child = Command::new("sleep").arg("5").spawn().unwrap();
let _result = child.wait().unwrap();
println!("reached end of main");
}
$ rustc wait.rs && ./wait
# `wait` keeps running for 5 seconds until the `sleep 5` command finishes
reached end of main
Filesystem Operations
The `std::fs` module contains several functions that deal with the filesystem.
use std::fs;
use std::fs::{File, OpenOptions};
use std::io;
use std::io::prelude::*;
use std::os::unix;
use std::path::Path;
// A simple implementation of `% cat path`
fn cat(path: &Path) -> io::Result<String> {
let mut f = File::open(path)?;
let mut s = String::new();
match f.read_to_string(&mut s) {
Ok(_) => Ok(s),
Err(e) => Err(e),
}
}
// A simple implementation of `% echo s > path`
fn echo(s: &str, path: &Path) -> io::Result<()> {
let mut f = File::create(path)?;
f.write_all(s.as_bytes())
}
// A simple implementation of `% touch path` (ignores existing files)
fn touch(path: &Path) -> io::Result<()> {
match OpenOptions::new().create(true).write(true).open(path) {
Ok(_) => Ok(()),
Err(e) => Err(e),
}
}
fn main() {
println!("`mkdir a`");
// Create a directory, returns `io::Result<()>`
match fs::create_dir("a") {
Err(why) => println!("! {:?}", why.kind()),
Ok(_) => {},
}
println!("`echo hello > a/b.txt`");
// The previous match can be simplified using the `unwrap_or_else` method
echo("hello", &Path::new("a/b.txt")).unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`mkdir -p a/c/d`");
// Recursively create a directory, returns `io::Result<()>`
fs::create_dir_all("a/c/d").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`touch a/c/e.txt`");
touch(&Path::new("a/c/e.txt")).unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`ln -s ../b.txt a/c/b.txt`");
// Create a symbolic link, returns `io::Result<()>`
if cfg!(target_family = "unix") {
unix::fs::symlink("../b.txt", "a/c/b.txt").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
}
println!("`cat a/c/b.txt`");
match cat(&Path::new("a/c/b.txt")) {
Err(why) => println!("! {:?}", why.kind()),
Ok(s) => println!("> {}", s),
}
println!("`ls a`");
// Read the contents of a directory, returns `io::Result<fs::ReadDir>`
match fs::read_dir("a") {
Err(why) => println!("! {:?}", why.kind()),
Ok(paths) => for path in paths {
println!("> {:?}", path.unwrap().path());
},
}
println!("`rm a/c/e.txt`");
// Remove a file, returns `io::Result<()>`
fs::remove_file("a/c/e.txt").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`rmdir a/c/d`");
// Remove an empty directory, returns `io::Result<()>`
fs::remove_dir("a/c/d").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
}
Here's the expected successful output:
$ rustc fs.rs && ./fs
`mkdir a`
`echo hello > a/b.txt`
`mkdir -p a/c/d`
`touch a/c/e.txt`
`ln -s ../b.txt a/c/b.txt`
`cat a/c/b.txt`
> hello
`ls a`
> "a/b.txt"
> "a/c"
`rm a/c/e.txt`
`rmdir a/c/d`
And the final state of the `a` directory is:
$ tree a
a
|-- b.txt
`-- c
`-- b.txt -> ../b.txt
1 directory, 2 files
An alternative way to define the function `cat` is with `?` notation:
fn cat(path: &Path) -> io::Result<String> {
let mut f = File::open(path)?;
let mut s = String::new();
f.read_to_string(&mut s)?;
Ok(s)
}
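For this particular task the standard library also offers `std::fs::read_to_string`, which opens the file and reads it into a `String` in one call. The sketch below shows `cat` delegated to it; the file name `cat_demo.txt` is only illustrative.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Same behaviour as the hand-written `cat`, delegated to the standard library.
fn cat(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

fn main() -> io::Result<()> {
    // Illustrative file inside the system temp directory.
    let path = std::env::temp_dir().join("cat_demo.txt");
    fs::write(&path, "hello")?;
    println!("> {}", cat(&path)?);
    fs::remove_file(&path)
}
```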
See also:
Program arguments
Standard Library
The command line arguments can be accessed using `std::env::args`, which returns an iterator that yields a `String` for each argument:
use std::env;
fn main() {
    let args: Vec<String> = env::args().collect();
    // The first argument is the path that was used to call the program.
    println!("My path is {}.", args[0]);
    // The rest of the arguments are the passed command line parameters.
    // Call the program like this:
    //   $ ./args arg1 arg2
    println!("I got {:?} arguments: {:?}.", args.len() - 1, &args[1..]);
}
$ ./args 1 2 3
My path is ./args.
I got 3 arguments: ["1", "2", "3"].
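Because `env::args` returns an iterator, the arguments can also be processed without collecting them into a `Vec` first. A small sketch, in which `sum_numeric` is a hypothetical helper written for this illustration:

```rust
use std::env;

// Hypothetical helper: add up every argument that parses as an integer,
// silently skipping the ones that do not.
fn sum_numeric<I>(args: I) -> i64
where
    I: Iterator<Item = String>,
{
    args.filter_map(|arg| arg.parse::<i64>().ok()).sum()
}

fn main() {
    // `skip(1)` drops the program path, leaving only the user's arguments.
    let total = sum_numeric(env::args().skip(1));
    println!("sum of numeric arguments: {}", total);
}
```

Taking a generic iterator rather than calling `env::args` inside the helper keeps it easy to exercise with hand-made input.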
Crates
Alternatively, there are numerous crates that can provide extra functionality when creating command-line applications. The Rust Cookbook exhibits best practices on how to use one of the more popular command line argument crates, `clap`.
Argument parsing
Matching can be used to parse simple arguments:
use std::env;
fn increase(number: i32) {
    println!("{}", number + 1);
}
fn decrease(number: i32) {
    println!("{}", number - 1);
}
fn help() {
    println!("usage:
match_args <string>
    Check whether given string is the answer.
match_args {{increase|decrease}} <integer>
    Increase or decrease given integer by one.");
}
fn main() {
    let args: Vec<String> = env::args().collect();
    match args.len() {
        // no arguments passed
        1 => {
            println!("My name is 'match_args'. Try passing some arguments!");
        },
        // one argument passed
        2 => {
            match args[1].parse() {
                Ok(42) => println!("This is the answer!"),
                _ => println!("This is not the answer."),
            }
        },
        // one command and one argument passed
        3 => {
            let cmd = &args[1];
            let num = &args[2];
            // parse the number
            let number: i32 = match num.parse() {
                Ok(n) => {
                    n
                },
                Err(_) => {
                    eprintln!("error: second argument not an integer");
                    help();
                    return;
                },
            };
            // parse the command
            match &cmd[..] {
                "increase" => increase(number),
                "decrease" => decrease(number),
                _ => {
                    eprintln!("error: invalid command");
                    help();
                },
            }
        },
        // all the other cases
        _ => {
            // show a help message
            help();
        }
    }
}
$ ./match_args Rust
This is not the answer.
$ ./match_args 42
This is the answer!
$ ./match_args do something
error: second argument not an integer
usage:
match_args <string>
Check whether given string is the answer.
match_args {increase|decrease} <integer>
Increase or decrease given integer by one.
$ ./match_args do 42
error: invalid command
usage:
match_args <string>
Check whether given string is the answer.
match_args {increase|decrease} <integer>
Increase or decrease given integer by one.
$ ./match_args increase 42
43
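One possible variation, sketched here under the assumption that only the two-argument form is needed, is to match on the argument slice itself and return a value instead of printing inside the match. The `Action` enum and `parse_action` function are names invented for this sketch.

```rust
use std::env;

// Hypothetical type describing the two supported commands.
#[derive(Debug, PartialEq)]
enum Action {
    Increase(i32),
    Decrease(i32),
}

// Parse the arguments that follow the program name.
fn parse_action(args: &[String]) -> Result<Action, String> {
    match args {
        // exactly one command and one argument
        [cmd, num] => {
            let n = num
                .parse::<i32>()
                .map_err(|_| format!("not an integer: {}", num))?;
            match cmd.as_str() {
                "increase" => Ok(Action::Increase(n)),
                "decrease" => Ok(Action::Decrease(n)),
                other => Err(format!("invalid command: {}", other)),
            }
        }
        // any other number of arguments
        _ => Err(String::from("expected a command and an integer")),
    }
}

fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    match parse_action(&args) {
        Ok(Action::Increase(n)) => println!("{}", n + 1),
        Ok(Action::Decrease(n)) => println!("{}", n - 1),
        Err(msg) => eprintln!("error: {}", msg),
    }
}
```

Separating parsing from acting means the parsing rules can be checked without running the program from a shell.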
Foreign Function Interface
Rust provides a Foreign Function Interface (FFI) to C libraries. Foreign functions must be declared inside an `extern` block annotated with a `#[link]` attribute containing the name of the foreign library.
use std::fmt;
// this extern block links to the libm library
#[link(name = "m")]
extern {
// this is a foreign function
// that computes the square root of a single precision complex number
fn csqrtf(z: Complex) -> Complex;
fn ccosf(z: Complex) -> Complex;
}
// Since calling foreign functions is considered unsafe,
// it's common to write safe wrappers around them.
fn cos(z: Complex) -> Complex {
unsafe { ccosf(z) }
}
fn main() {
// z = -1 + 0i
let z = Complex { re: -1., im: 0. };
// calling a foreign function is an unsafe operation
let z_sqrt = unsafe { csqrtf(z) };
println!("the square root of {:?} is {:?}", z, z_sqrt);
// calling safe API wrapped around unsafe operation
println!("cos({:?}) = {:?}", z, cos(z));
}
// Minimal implementation of single precision complex numbers
#[repr(C)]
#[derive(Clone, Copy)]
struct Complex {
re: f32,
im: f32,
}
impl fmt::Debug for Complex {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
if self.im < 0. {
write!(f, "{}-{}i", self.re, -self.im)
} else {
write!(f, "{}+{}i", self.re, self.im)
}
}
}
Testing
Rust is a programming language that cares a lot about correctness and it includes support for writing software tests within the language itself.
Testing comes in three styles:
- Unit testing.
- Doc testing.
- Integration testing.
Rust also supports specifying additional dependencies that are needed only for tests; see the Development dependencies section.
See Also
- The Book chapter on testing
- API Guidelines on doc-testing
Unit testing
Tests are Rust functions that verify that the non-test code is functioning in the expected manner. The bodies of test functions typically perform some setup, run the code we want to test, then assert whether the results are what we expect.
Most unit tests go into a `tests` mod with the `#[cfg(test)]` attribute. Test functions are marked with the `#[test]` attribute.
Tests fail when something in the test function panics. There are some helper macros:
- `assert!(expression)` - panics if expression evaluates to `false`.
- `assert_eq!(left, right)` and `assert_ne!(left, right)` - test the left and right expressions for equality and inequality respectively.
pub fn add(a: i32, b: i32) -> i32 {
a + b
}
// This is a really bad adding function, its purpose is to fail in this
// example.
#[allow(dead_code)]
fn bad_add(a: i32, b: i32) -> i32 {
a - b
}
#[cfg(test)]
mod tests {
// Note this useful idiom: importing names from outer (for mod tests) scope.
use super::*;
#[test]
fn test_add() {
assert_eq!(add(1, 2), 3);
}
#[test]
fn test_bad_add() {
// This assert would fire and test will fail.
// Please note, that private functions can be tested too!
assert_eq!(bad_add(1, 2), 3);
}
}
Tests can be run with `cargo test`.
$ cargo test
running 2 tests
test tests::test_bad_add ... FAILED
test tests::test_add ... ok
failures:
---- tests::test_bad_add stdout ----
thread 'tests::test_bad_add' panicked at 'assertion failed: `(left == right)`
left: `-1`,
right: `3`', src/lib.rs:21:8
note: Run with `RUST_BACKTRACE=1` for a backtrace.
failures:
tests::test_bad_add
test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
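The assertion macros also accept an optional format string and arguments after the required ones; the message is printed only when the assertion fails, which can make a failure report easier to read. A minimal sketch (the test name and messages are illustrative, and `main` is included only so the snippet also runs as a plain program):

```rust
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_add_with_messages() {
        let (a, b) = (1, 2);
        // Everything after the required arguments is a format string
        // that is printed only if the assertion fails.
        assert!(add(a, b) > 0, "expected a positive sum for {} and {}", a, b);
        assert_eq!(add(a, b), 3, "add({}, {}) returned the wrong sum", a, b);
        assert_ne!(add(a, b), a - b, "addition and subtraction should differ here");
    }
}

fn main() {
    println!("1 + 2 = {}", add(1, 2));
}
```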
Tests and ?
None of the previous unit test examples had a return type. But in Rust 2018, your unit tests can return `Result<()>`, which lets you use `?` in them! This can make them much more concise.
fn sqrt(number: f64) -> Result<f64, String> {
    if number >= 0.0 {
        Ok(number.powf(0.5))
    } else {
        Err("negative floats don't have square roots".to_owned())
    }
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn test_sqrt() -> Result<(), String> {
        let x = 4.0;
        assert_eq!(sqrt(x)?.powf(2.0), x);
        Ok(())
    }
}
See "The Edition Guide" for more details.
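When the code under test produces several different error types, one common choice is to let the test return `Result<(), Box<dyn Error>>`, so that `?` works for any of them. The sketch below assumes a made-up function, `parse_and_double`, purely for illustration.

```rust
use std::num::ParseIntError;

// Hypothetical function under test: parse a decimal string and double it.
fn parse_and_double(input: &str) -> Result<i32, ParseIntError> {
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::error::Error;

    // `Box<dyn Error>` accepts any error type that `?` can convert into it.
    #[test]
    fn doubles_valid_input() -> Result<(), Box<dyn Error>> {
        assert_eq!(parse_and_double(" 21 ")?, 42);
        Ok(())
    }

    // A test that returns `Err` is reported as failed, so the error case
    // is checked explicitly here instead of being propagated.
    #[test]
    fn rejects_invalid_input() {
        assert!(parse_and_double("twenty-one").is_err());
    }
}

fn main() {
    println!("{:?}", parse_and_double("21"));
}
```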
Testing panics
To check functions that should panic under certain circumstances, use the `#[should_panic]` attribute. This attribute accepts an optional parameter `expected = ` with the text of the panic message. If your function can panic in multiple ways, it helps make sure your test is testing the correct panic.
pub fn divide_non_zero_result(a: u32, b: u32) -> u32 {
if b == 0 {
panic!("Divide-by-zero error");
} else if a < b {
panic!("Divide result is zero");
}
a / b
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_divide() {
assert_eq!(divide_non_zero_result(10, 2), 5);
}
#[test]
#[should_panic]
fn test_any_panic() {
divide_non_zero_result(1, 0);
}
#[test]
#[should_panic(expected = "Divide result is zero")]
fn test_specific_panic() {
divide_non_zero_result(1, 10);
}
}
Running these tests gives us:
$ cargo test
running 3 tests
test tests::test_any_panic ... ok
test tests::test_divide ... ok
test tests::test_specific_panic ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests tmp-test-should-panic
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Running specific tests
To run specific tests, pass the test name to the `cargo test` command.
$ cargo test test_any_panic
running 1 test
test tests::test_any_panic ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out
Doc-tests tmp-test-should-panic
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
To run multiple tests one may specify part of a test name that matches all the tests that should be run.
$ cargo test panic
running 2 tests
test tests::test_any_panic ... ok
test tests::test_specific_panic ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out
Doc-tests tmp-test-should-panic
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Ignoring tests
Tests can be marked with the `#[ignore]` attribute to exclude them from a normal run. Ignored tests can still be run with the command `cargo test -- --ignored`.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn test_add() {
        assert_eq!(add(2, 2), 4);
    }
    #[test]
    fn test_add_hundred() {
        assert_eq!(add(100, 2), 102);
        assert_eq!(add(2, 100), 102);
    }
    #[test]
    #[ignore]
    fn ignored_test() {
        assert_eq!(add(0, 0), 0);
    }
}
$ cargo test
running 3 tests
test tests::ignored_test ... ignored
test tests::test_add ... ok
test tests::test_add_hundred ... ok
test result: ok. 2 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out
Doc-tests tmp-ignore
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
$ cargo test -- --ignored
running 1 test
test tests::ignored_test ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests tmp-ignore
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Documentation testing
The primary way of documenting a Rust project is through annotating the source code. Documentation comments are written in Markdown and support code blocks in them. Rust cares about correctness, so these code blocks are compiled and used as tests.
/// First line is a short summary describing function.
///
/// The next lines present detailed documentation. Code blocks start with
/// triple backquotes and have implicit `fn main()` inside
/// and `extern crate <cratename>`. Assume we're testing `doccomments` crate:
///
/// ```
/// let result = doccomments::add(2, 3);
/// assert_eq!(result, 5);
/// ```
pub fn add(a: i32, b: i32) -> i32 {
a + b
}
/// Usually doc comments may include sections "Examples", "Panics" and "Failures".
///
/// The next function divides two numbers.
///
/// # Examples
///
/// ```
/// let result = doccomments::div(10, 2);
/// assert_eq!(result, 5);
/// ```
///
/// # Panics
///
/// The function panics if the second argument is zero.
///
/// ```rust,should_panic
/// // panics on division by zero
/// doccomments::div(10, 0);
/// ```
pub fn div(a: i32, b: i32) -> i32 {
if b == 0 {
panic!("Divide-by-zero error");
}
a / b
}
Tests can be run with `cargo test`:
$ cargo test
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests doccomments
running 3 tests
test src/lib.rs - add (line 7) ... ok
test src/lib.rs - div (line 21) ... ok
test src/lib.rs - div (line 31) ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Motivation behind documentation tests
The main purpose of documentation tests is to serve as examples that exercise the functionality, which is one of the most important guidelines. It allows using examples from docs as complete code snippets. But using `?` makes compilation fail since `main` returns `unit`. The ability to hide some source lines from documentation comes to the rescue: one may write `fn try_main() -> Result<(), ErrorType>`, hide it and `unwrap` it in hidden `main`. Sounds complicated? Here's an example:
/// Using hidden `try_main` in doc tests.
///
/// ```
/// # // hidden lines start with `#` symbol, but they're still compileable!
/// # fn try_main() -> Result<(), String> { // line that wraps the body shown in doc
/// let res = doccomments::try_div(10, 2)?;
/// # Ok(()) // returning from try_main
/// # }
/// # fn main() { // starting main that'll unwrap()
/// # try_main().unwrap(); // calling try_main and unwrapping
/// # // so that test will panic in case of error
/// # }
/// ```
pub fn try_div(a: i32, b: i32) -> Result<i32, String> {
if b == 0 {
Err(String::from("Divide-by-zero"))
} else {
Ok(a / b)
}
}
See Also
- RFC505 on documentation style
- API Guidelines on documentation guidelines
Integration testing
Unit tests are testing one module in isolation at a time: they're small and can test private code. Integration tests are external to your crate and use only its public interface in the same way any other code would. Their purpose is to test that many parts of your library work correctly together.
Cargo looks for integration tests in the `tests` directory next to `src`.
File `src/lib.rs`:
// Define this in a crate called `adder`.
pub fn add(a: i32, b: i32) -> i32 {
a + b
}
File with test: `tests/integration_test.rs`:
#[test]
fn test_add() {
assert_eq!(adder::add(3, 2), 5);
}
Running tests with the `cargo test` command:
$ cargo test
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Running target/debug/deps/integration_test-bcd60824f5fbfe19
running 1 test
test test_add ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests adder
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Each Rust source file in the `tests` directory is compiled as a separate crate. One way of sharing some code between integration tests is to create a module with public functions, then import and use it within tests.
File `tests/common.rs`:
pub fn setup() {
// some setup code, like creating required files/directories, starting
// servers, etc.
}
File with test: `tests/integration_test.rs`:
// importing common module.
mod common;
#[test]
fn test_add() {
// using common code.
common::setup();
assert_eq!(adder::add(3, 2), 5);
}
Modules with common code follow the ordinary module rules, so it's ok to create the common module as `tests/common/mod.rs`.
Development dependencies
Sometimes there is a need to have dependencies for tests (or examples, or benchmarks) only. Such dependencies are added to `Cargo.toml` in the `[dev-dependencies]` section. These dependencies are not propagated to other packages which depend on this package.
One such example is using a crate that extends the standard `assert!` macros.
File `Cargo.toml`:
# standard crate data is left out
[dev-dependencies]
pretty_assertions = "0.4.0"
File `src/lib.rs`:
// externing crate for test-only use
#[cfg(test)]
#[macro_use]
extern crate pretty_assertions;
pub fn add(a: i32, b: i32) -> i32 {
a + b
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_add() {
assert_eq!(add(2, 3), 5);
}
}
See Also
Cargo docs on specifying dependencies.
Unsafe Operations
As an introduction to this section, to borrow from the official docs, "one should try to minimize the amount of unsafe code in a code base." With that in mind, let's get started! Unsafe annotations in Rust are used to bypass protections put in place by the compiler; specifically, there are four primary things that unsafe is used for:
- dereferencing raw pointers
- calling functions or methods which are `unsafe` (including calling a function over FFI, see a previous chapter of the book)
- accessing or modifying static mutable variables
- implementing unsafe traits
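Of these four, implementing unsafe traits is probably the least familiar. A minimal sketch follows; `Zeroable` and `zeroed_value` are names invented for this illustration and are not standard library items.

```rust
use std::mem;

// Hypothetical unsafe trait: implementors promise that the all-zero
// bit pattern is a valid value of the type. The compiler cannot check this.
unsafe trait Zeroable {}

// `unsafe impl` is where the programmer takes on that promise.
unsafe impl Zeroable for u32 {}
unsafe impl Zeroable for [u8; 4] {}

// Safe wrapper that relies on the promise made by the implementor.
fn zeroed_value<T: Zeroable>() -> T {
    // SAFETY: `T: Zeroable` guarantees that all-zero bytes form a valid `T`.
    unsafe { mem::zeroed() }
}

fn main() {
    let n: u32 = zeroed_value();
    let bytes: [u8; 4] = zeroed_value();
    println!("{} {:?}", n, bytes);
}
```

The `unsafe` keyword appears on both sides: on the trait, to say that implementing it carries obligations, and on each `impl`, to say that the obligations have been checked by hand.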
Raw Pointers
Raw pointers `*` and references `&T` function similarly, but references are always safe because they are guaranteed to point to valid data due to the borrow checker. Dereferencing a raw pointer can only be done through an unsafe block.
fn main() {
    let raw_p: *const u32 = &10;
    unsafe {
        assert!(*raw_p == 10);
    }
}
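The same rule applies to mutable raw pointers: creating one is safe, and only reading or writing through it needs an `unsafe` block. A short sketch, where `increment_via_raw` is a hypothetical helper:

```rust
// Hypothetical helper: add one to `*target` by going through a raw pointer.
fn increment_via_raw(target: &mut u32) {
    // Coercing a reference to a raw pointer is safe.
    let raw: *mut u32 = target;
    unsafe {
        // SAFETY: `raw` comes from a live `&mut u32`, so it is valid,
        // aligned and not aliased for the duration of this block.
        *raw += 1;
    }
}

fn main() {
    let mut value: u32 = 10;
    increment_via_raw(&mut value);
    // The write is visible through the original variable.
    assert_eq!(value, 11);
    println!("value = {}", value);
}
```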
Calling Unsafe Functions
Some functions can be declared as `unsafe`, meaning it is the programmer's responsibility to ensure correctness instead of the compiler's. One example of this is `std::slice::from_raw_parts` which will create a slice given a pointer to the first element and a length.
use std::slice;
fn main() {
    let some_vector = vec![1, 2, 3, 4];
    let pointer = some_vector.as_ptr();
    let length = some_vector.len();
    unsafe {
        let my_slice: &[u32] = slice::from_raw_parts(pointer, length);
        assert_eq!(some_vector.as_slice(), my_slice);
    }
}
For `slice::from_raw_parts`, one of the assumptions which must be upheld is that the pointer passed in points to valid memory and that the memory pointed to is of the correct type. If these invariants aren't upheld then the program's behaviour is undefined and there is no knowing what will happen.
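You can declare your own `unsafe` functions as well, and a common pattern is to hide them behind a safe wrapper that establishes the required invariants. The following is a sketch under that assumption; `sum_raw` and `sum` are illustrative names.

```rust
use std::slice;

// Hypothetical unsafe function: the caller must guarantee that `ptr` points
// to at least `len` initialized `u32` values that stay valid for the call.
unsafe fn sum_raw(ptr: *const u32, len: usize) -> u32 {
    // Explicit inner block: newer editions do not treat the body of an
    // `unsafe fn` as one big unsafe block.
    let values = unsafe { slice::from_raw_parts(ptr, len) };
    values.iter().sum()
}

// Safe wrapper: a `&[u32]` already guarantees a valid pointer and length.
fn sum(values: &[u32]) -> u32 {
    // SAFETY: pointer and length come from the same live slice.
    unsafe { sum_raw(values.as_ptr(), values.len()) }
}

fn main() {
    let some_vector = vec![1, 2, 3, 4];
    println!("sum = {}", sum(&some_vector));
}
```

Callers of `sum` never write `unsafe` themselves; the invariant is discharged once, inside the wrapper.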
Compatibility
The Rust language is evolving rapidly, and because of this certain compatibility issues can arise, despite efforts to ensure forwards-compatibility wherever possible.
Raw identifiers
Rust, like many programming languages, has the concept of "keywords". These identifiers mean something to the language, and so you cannot use them in places like variable names, function names, and other places. Raw identifiers let you use keywords where they would not normally be allowed. This is particularly useful when Rust introduces new keywords, and a library using an older edition of Rust has a variable or function with the same name as a keyword introduced in a newer edition.
For example, consider a crate `foo` compiled with the 2015 edition of Rust that exports a function named `try`. This keyword is reserved for a new feature in the 2018 edition, so without raw identifiers, we would have no way to name the function.
extern crate foo;
fn main() {
foo::try();
}
You'll get this error:
error: expected identifier, found keyword `try`
--> src/main.rs:4:4
|
4 | foo::try();
| ^^^ expected identifier, found keyword
You can write this with a raw identifier:
extern crate foo;
fn main() {
foo::r#try();
}
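Raw identifiers can also be tried without a second crate, because they work for keywords that are reserved in every edition. In this sketch a function is deliberately named `match`; the name is chosen only for illustration.

```rust
// `match` is a keyword in every edition, so the `r#` prefix is required
// both at the definition and at the call site.
fn r#match(needle: &str, haystack: &str) -> bool {
    haystack.contains(needle)
}

fn main() {
    assert!(r#match("foo", "foobar"));
    assert!(!r#match("baz", "foobar"));
    println!("raw identifiers work");
}
```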
Meta
Some topics aren't exactly relevant to how you program but provide you tooling or infrastructure support which just makes things better for everyone. These topics include:
- Documentation: Generate library documentation for users via the included `rustdoc`.
- Playpen: Integrate the Rust Playpen (also known as the Rust Playground) in your documentation.
Documentation
Use `cargo doc` to build documentation in `target/doc`.
Use `cargo test` to run all tests (including documentation tests), and `cargo test --doc` to only run documentation tests.
These commands will appropriately invoke `rustdoc` (and `rustc`) as required.
Doc comments
Doc comments are very useful for big projects that require documentation. When running `rustdoc`, these are the comments that get compiled into documentation. They are denoted by a `///`, and support Markdown.
#![crate_name = "doc"]
/// A human being is represented here
pub struct Person {
/// A person must have a name, no matter how much Juliet may hate it
name: String,
}
impl Person {
/// Returns a person with the name given them
///
/// # Arguments
///
/// * `name` - A string slice that holds the name of the person
///
/// # Examples
///
/// ```
/// // You can have rust code between fences inside the comments
/// // If you pass --test to `rustdoc`, it will even test it for you!
/// use doc::Person;
/// let person = Person::new("name");
/// ```
pub fn new(name: &str) -> Person {
Person {
name: name.to_string(),
}
}
/// Gives a friendly hello!
///
/// Says "Hello, [name]" to the `Person` it is called on.
pub fn hello(&self) {
println!("Hello, {}!", self.name);
}
}
fn main() {
let john = Person::new("John");
john.hello();
}
To run the tests, first build the code as a library, then tell `rustdoc` where to find the library so it can link it into each doctest program:
$ rustc doc.rs --crate-type lib
$ rustdoc --test --extern doc="libdoc.rlib" doc.rs
Doc attributes
Below are a few examples of the most common `#[doc]` attributes used with `rustdoc`.
inline
Used to inline docs, instead of linking out to separate page.
#[doc(inline)]
pub use bar::Bar;
/// bar docs
mod bar {
/// the docs for Bar
pub struct Bar;
}
no_inline
Used to prevent linking out to separate page or anywhere.
// Example from libcore/prelude
#[doc(no_inline)]
pub use crate::mem::drop;
hidden
Using this tells `rustdoc` not to include this in documentation:
// Example from the futures-rs library
#[doc(hidden)]
pub use self::async_await::*;
For documentation, `rustdoc` is widely used by the community. It's what is used to generate the std library docs.
See also:
- The Rust Book: Making Useful Documentation Comments
- The rustdoc Book
- The Reference: Doc comments
- RFC 1574: API Documentation Conventions
- RFC 1946: Relative links to other items from doc comments (intra-rustdoc links)
- Is there any documentation style guide for comments? (reddit)
Playpen
The Rust Playpen is a way to experiment with Rust code through a web interface. This project is now commonly referred to as Rust Playground.
Using it with mdbook
In `mdbook`, you can make code examples playable and editable.
fn main() {
    println!("Hello World!");
}
This allows the reader to both run your code sample and modify and tweak it. The key here is adding the word `editable` to your code fence, separated by a comma.
```rust,editable
//...place your code here
```
Additionally, you can add `ignore` if you want `mdbook` to skip your code when it builds and tests.
```rust,editable,ignore
//...place your code here
```
Using it with docs
You may have noticed in some of the official Rust docs a button that says "Run", which opens the code sample up in a new tab in Rust Playground. This feature is enabled if you use the `#[doc]` attribute called `html_playground_url`.