Crate cls

source ·
Expand description

A library for defining CPU-local variables.

See cpu_local for more details on how to define a CPU-local variable; the information below is not required.

Implementation

There are two ways in which a crate can be linked:

  • Statically, meaning it is linked at compile-time by an external linker.
  • Dynamically, meaning it is linked at runtime by mod_mgmt.

Since we have little control over the external linker, we are forced to write CLS code that works with the external linker, despite the linker having no understanding of CLS. This lack of understanding results in significant complexity in calculating the offset of a CLS symbol.

Also there is no way to tell whether code was statically or dynamically linked, and so the same CLS code must work with both static and dynamic linking. This results in additional complexity in mod_mgmt because it must interface with this generic CLS code.

x86-64

Static linking

On x86_64, a TLS data image looks like

                       fs
                        V
+-----------------------+--------------+------------------------+
| statically linked TLS | TLS self ptr | dynamically linked TLS |
+-----------------------+--------------+------------------------+

where statically linked TLS is accessed using a negative offset from the fs register.

Now with the way CLS works on Theseus, when we statically link CLS, the linker believes we are prepending the cls section to the tls section like

                       gs                            fs
                        V                             V
+-----------------------+-----+-----------------------+
| statically linked cls | 000 | statically linked TLS |
+-----------------------+-----+-----------------------+

where 000 are padding bytes to align the start of the statically linked TLS to a page boundary. So the linker will write negative offsets to CLS relocations based on their distance from the end of the statically linked TLS.

However, in reality we have a completely separate data image for CLS, and so we need to figure out the negative offset from the gs register based on the negative offset from the fs register, the CLS size, the TLS size, and the fact that the start of the TLS section is on the next page boundary after the end of the CLS section.

from_cls_start
   +-----+
   |     |
   |     |        -{cls}@TPOFF
   |     +------------------------------+
   |     |                              |
   |     |  -offset                     |
   |     +----------+                   |
   |     |          |                   |
   V     V          V                   V
   +----------------+-----+-------------+
   |      .cls      | 000 | .tls/.tdata |
   +----------------+-----+-------------+
   ^                ^     ^             ^
   |                |     |             |
   |                gs    |            fs
   |                |     |             |
   +----------------+     +-------------+
   |    cls_size          |   tls_size
   |                      |
  a*4kb                  b*4kb
   |                      |
   +----------------------+
    cls_start_to_tls_start

where a*4kb means that the address is a multiple of 4kb i.e. page-aligned.

Dynamic linking

When a crate is dynamically linked on x86_64, mod_mgmt will set __THESEUS_CLS_SIZE and __THESEUS_TLS_SIZE to usize::MAX. Prior to calculating the offset, the CLS access code will check if the variables are set to their sentinel values and if so, it will simply use the provided offset since mod_mgmt would have computed it correctly, unlike the external linker.

aarch64

Unlike x86_64, aarch64 doesn’t have different branches for statically linked, and dynamically linked variables. This is because on aarch64, there are no negative offset shenanigans. A TLS data image looks like

fs
V
+-----------------------+------------------------+
| statically linked TLS | dynamically linked TLS |
+-----------------------+------------------------+

Unlike x86_64, the .cls section is located after .tls in a binary and so the linker thinks the data image looks like

fs
V
+-----------------------+-----+-----------------------+------------------------+
| statically linked TLS | 000 | statically linked CLS | dynamically linked TLS |
+-----------------------+-----+-----------------------+------------------------+

where 000 are padding bytes to align the start of the statically linked CLS to a page boundary.

Hence, CLS symbols have an offset that is incorrect by tls_size.next_multiple_of(0x1000), or 0 if there is not TLS section. So, when loading CLS symbols on aarch64, mod_mgmt simply sets __THESEUS_TLS_SIZE to 0.

Traits

  • A trait abstracting over guards that ensure atomicity with respect to the current CPU.

Attribute Macros

  • A macro for declaring CPU-local variables.