crossbeam-epoch(二):Atomic

pointer align and tag

Atomic 内部包装了一个 AtomicPtr,为了支持 tag 功能,这个指针必须是对齐的。例如,我们有一个 *const i32 指针 ptr。在 Rust 中,i32 类型按照 4 个字节进行对齐:

1
2
3
fn main() {
println!("{}", std::mem::align_of::<i32>()); // 4
}

如果 ptr 是对齐的,那么ptr 必然是 4 的倍数,那么 ptr 的最低的两个比特位必然等于0:

所以我们可以使用对齐指针的最低的 mem::align_of::<T>().trailing_zeros() 个比特位存储 tag。

为了方便起见,把 ptr 指针用来表示对齐的比特位称为 “对齐位”。

crossbeam-epoch 中判断指针是否对齐和实现 tag 的方法:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
/// Returns a bitmask containing the unused least significant bits of an aligned pointer to `T`.
#[inline]
fn low_bits<T: ?Sized + Pointable>() -> usize {
(1 << T::ALIGN.trailing_zeros()) - 1
}

/// Panics if the pointer is not properly unaligned.
#[inline]
fn ensure_aligned<T: ?Sized + Pointable>(raw: *mut ()) {
assert_eq!(raw as usize & low_bits::<T>(), 0, "unaligned pointer");
}

/// Given a tagged pointer `data`, returns the same pointer, but tagged with `tag`.
///
/// `tag` is truncated to fit into the unused bits of the pointer to `T`.
#[inline]
fn compose_tag<T: ?Sized + Pointable>(ptr: *mut (), tag: usize) -> *mut () {
(ptr as usize & !low_bits::<T>()) | (tag & low_bits::<T>())
}

/// Decomposes a tagged pointer `data` into the pointer and the tag.
#[inline]
fn decompose_tag<T: ?Sized + Pointable>(ptr: *mut ()) -> (*mut (), usize) {
(ptr as usize & !low_bits::<T>(), ptr as usize & low_bits::<T>())
}

low_bits 函数返回一个位掩码,用来表示指向类型 T 的对齐指针的对齐位。例如 low_bits::<i32>() 会返回 b011,表示有两个比特位没有使用,这两个比特位就可以存储 tag。

ensure_aligned 函数判断 raw(不带 tag) 指针是否对齐,它将 low_bits::<T>() 的结果与 raw 做与运算,如果等于0,则说明 raw 指针是对齐的,反之说明没有对齐。

compose_tag 函数将 tag 保存到 ptr 指针中,通过 ptr as usize & !low_bits::<T>()ptr 的对齐位清零。通过tag & low_bits::<T>()tag 写到对齐位中,如果 tag 过大,则会被截断,因为 low_bits::<T>() 返回的结果除了对齐位其他的比特位都为0。将上述的两个位运算的结果再做一个或运算,就可以把 tag 写到 ptr 的对齐位中了。

decompose_tag 函数将 ptr 指针拆分成没有 tag 的 raw 指针和 tag 值。内部的操作与 compose_tag 类似。

Pointable trait

crossbeam-epoch 将由 single-word 指针指向的类型抽象成了一个 Pointable trait,其定义如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
pub trait Pointable {
/// The alignment of pointer.
const ALIGN: usize;

/// The type for initializers.
type Init;

/// Initializes a with the given initializer.
///
/// # Safety
///
/// The result should be a multiple of `ALIGN`.
unsafe fn init(init: Self::Init) -> *mut ();

/// Dereferences the given pointer.
///
/// # Safety
///
/// - The given `ptr` should have been initialized with [`Pointable::init`].
/// - `ptr` should not have yet been dropped by [`Pointable::drop`].
/// - `ptr` should not be mutably dereferenced by [`Pointable::deref_mut`] concurrently.
unsafe fn deref<'a>(ptr: *mut ()) -> &'a Self;

/// Mutably dereferences the given pointer.
///
/// # Safety
///
/// - The given `ptr` should have been initialized with [`Pointable::init`].
/// - `ptr` should not have yet been dropped by [`Pointable::drop`].
/// - `ptr` should not be dereferenced by [`Pointable::deref`] or [`Pointable::deref_mut`]
/// concurrently.
unsafe fn deref_mut<'a>(ptr: *mut ()) -> &'a mut Self;

/// Drops the object pointed to by the given pointer.
///
/// # Safety
///
/// - The given `ptr` should have been initialized with [`Pointable::init`].
/// - `ptr` should not have yet been dropped by [`Pointable::drop`].
/// - `ptr` should not be dereferenced by [`Pointable::deref`] or [`Pointable::deref_mut`]
/// concurrently.
unsafe fn drop(ptr: *mut ());
}

在并发编程中,需要使用 single-word 指针来表示一个对象,因为原子操作(例如 read, write, read-modify-write)只支持 single-word。Pointable trait 就表示被 single-word 指针指向的类型,提供了初始化/解引用/析构操作的定义。

crossbeam-epoch 为 sized 类型 T 实现了 Pointable trait:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
impl<T> Pointable for T {
const ALIGN: usize = mem::align_of::<T>();

type Init = T;

unsafe fn init(init: Self::Init) -> *mut () {
Box::into_raw(Box::new(init)).cast::<()>()
}

unsafe fn deref<'a>(ptr: *mut ()) -> &'a Self {
&*(ptr as *const T)
}

unsafe fn deref_mut<'a>(ptr: *mut ()) -> &'a mut Self {
&mut *ptr.cast::<T>()
}

unsafe fn drop(ptr: *mut ()) {
drop(Box::from_raw(ptr.cast::<T>()));
}
}

Pointable trait 为 sized 类型 T 生成了一个 Box<T>Box::new 在堆上为类型 T 分配内存,并返回一个 single-word 指针。对于 sized TBox<T> 的大小都等于 single-word,例如 Box<i32>

1
2
3
fn main() {
println!("{}", std::mem::size_of::<Box<i32>>()); // 8
}

但是对于动态大小类型 unsized TBox<T> 就是一个胖指针了,因为需要保存长度信息。例如 Box<[MaybeUninit<u32>]>

1
2
3
4
5
use core::mem::MaybeUninit;

fn main() {
println!("{}", std::mem::size_of::<Box<[MaybeUninit<u32>]>>()); // 16
}

为了支持动态大小类型 [MaybeUninit<T>],就不能使用 Box,必须单独为 [MaybeUninit<T>] 实现 Pointable trait:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
/// Array with size.
///
/// # Memory layout
///
/// An array consisting of size and elements:
///
/// ```text
/// elements
/// |
/// |
/// ------------------------------------
/// | size | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
/// ------------------------------------
/// ```
///
/// Its memory layout is different from that of `Box<[T]>` in that size is in the allocation (not
/// along with pointer as in `Box<[T]>`).
///
/// Elements are not present in the type, but they will be in the allocation.
/// ```
#[repr(C)]
struct Array<T> {
/// The number of elements (not the number of bytes).
len: usize,
elements: [MaybeUninit<T>; 0],
}

impl<T> Pointable for [MaybeUninit<T>] {
const ALIGN: usize = mem::align_of::<Array<T>>();

type Init = usize;

unsafe fn init(len: Self::Init) -> *mut () {
let size = mem::size_of::<Array<T>>() + mem::size_of::<MaybeUninit<T>>() * len;
let align = mem::align_of::<Array<T>>();
let layout = alloc::Layout::from_size_align(size, align).unwrap();
let ptr = alloc::alloc(layout).cast::<Array<T>>();
if ptr.is_null() {
alloc::handle_alloc_error(layout);
}
(*ptr).len = len;
ptr.cast::<()>()
}

unsafe fn deref<'a>(ptr: *mut ()) -> &'a Self {
let array = &*(ptr as *const Array<T>);
slice::from_raw_parts(array.elements.as_ptr(), array.len)
}

unsafe fn deref_mut<'a>(ptr: *mut ()) -> &'a mut Self {
let array = &mut *ptr.cast::<Array<T>>();
slice::from_raw_parts_mut(array.elements.as_mut_ptr(), array.len)
}

unsafe fn drop(ptr: *mut ()) {
let array = &*ptr.cast::<Array<T>>();
let size = mem::size_of::<Array<T>>() + mem::size_of::<MaybeUninit<T>>() * array.len;
let align = mem::align_of::<Array<T>>();
let layout = alloc::Layout::from_size_align(size, align).unwrap();
alloc::dealloc(ptr.cast::<u8>(), layout);
}
}

Box<[T]> 不同的是,Array<T> 的内存布局把切片的长度信息和切片元素保存在一起。在 Pointable 的实现中,使用更为底层的 alloc 在堆上分配内存,并返回指向 Array<T> 的指针,因为不需要附加长度信息,所以指针的大小等于 single-word,这样就满足了 Pointable 的要求。

总的来说,我认为定义 Pointable trait 的目的是统一 sized/unsized 类型的接口。这样的话,我们为类型 T 进行初始化/解引用/析构操作时,就不需要区分 T 是否是动态大小类型,而只需要使用 Pointable 中定义好的接口即可。

Atomic

在上篇文章中,我们讲到当想要访问无锁数据结构中的节点时,需要先调用 pin 方法创建一个 guard。在 guard 的生命周期内,访问就是安全的。

但是,crossbeam-epoch 的 api 设计更加严格,它把 AtomicPtr 包装成了 Atomic 类型,如果要在 Atomic 上执行原子操作就必须传入 &guard,这样就能强制用户事先创建好 guard 再执行操作,避免出现 use-after-free 的问题。

AtomicOwnedShared 的定义如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
/// An atomic pointer that can be safely shared between threads.
///
/// The pointer must be properly aligned. Since it is aligned, a tag can be stored into the unused
/// least significant bits of the address. For example, the tag for a pointer to a sized type `T`
/// should be less than `(1 << mem::align_of::<T>().trailing_zeros())`.
///
/// Any method that loads the pointer must be passed a reference to a [`Guard`].
///
/// Crossbeam supports dynamically sized types. See [`Pointable`] for details.
pub struct Atomic<T: ?Sized + Pointable> {
data: AtomicPtr<()>,
_marker: PhantomData<*mut T>,
}

/// An owned heap-allocated object.
///
/// This type is very similar to `Box<T>`.
///
/// The pointer must be properly aligned. Since it is aligned, a tag can be stored into the unused
/// least significant bits of the address.
pub struct Owned<T: ?Sized + Pointable> {
data: *mut (),
_marker: PhantomData<Box<T>>,
}

/// A pointer to an object protected by the epoch GC.
///
/// The pointer is valid for use only during the lifetime `'g`.
///
/// The pointer must be properly aligned. Since it is aligned, a tag can be stored into the unused
/// least significant bits of the address.
pub struct Shared<'g, T: 'g + ?Sized + Pointable> {
data: *mut (),
_marker: PhantomData<(&'g (), *const T)>,
}

Atomic<T> 和标准库提供的 AtomicPtr<T> 类似,都是可以在多线程之间安全共享的指针类型。Atomic<T> 中实现了很多原子操作方法,例如 loadstorecompare_exchangefetch_add 等,底层基本都调用了 AtomicPtr<T> 中的同名方法。

Owned<T> 表示拥有所有权的指针类型,类似于 Box<T>,都是在堆上为 T 分配内存。当 Owned<T> 离开作用域时,就会释放之前分配的内存。

Shared<T> 类似于 &T,表示没有所有权的指针类型。它是被 epoch gc 保护的指针,例如调用 Atomic<T>load 方法时就会返回 Shared<T>,并附加生命周期 'g'g 应该小于等于 guard 的生命周期,因此在 guard 的生命周期内,线程都是被 “pin” 住的,Shared<T> 被 epoch gc 保护,它指向的节点不会被其他线程回收,因此解引用 Shared<T> 指针就是安全的。

除此之外,crossbeam-epoch 中还实现了一些 Atomic<T>Owned<T>Shared<T> 之间的互相转换操作。


crossbeam-epoch(二):Atomic
https://night-cruise.github.io/2023/01/27/epoch(二):Atomic/
作者
Night Cruise
发布于
2023年1月27日
许可协议