LMAX Exchange - unique vision for global FX

Within a project I’ve been working on, I’ve needed to simulate
the capabilities of Linux’s /proc/cpuinfo on Mac OS. Specifically, I
needed to build a topology of the CPUs on a given system, i.e. to map
the operating system’s processors to hardware threads, then build
a picture of which cores and sockets those threads reside on. For example,
my Mac looks something like:


CPU0 (Thread 0) ---+
                   |---> Core 0 ---+
CPU1 (Thread 1) ---+               |
                                   | ----> Socket 0
CPU2 (Thread 2) ---+               |
                   |---> Core 1 ---+
CPU3 (Thread 3) ---+

While this sounds very simple, it’s actually fraught with a number of little
niggles. Not only did it require getting down and dirty with a bit of
C++ and x86 assembly, it also required writing a MacOS kernel extension.

The first step was to understand what information was available from the CPU. Intel exposes an instruction called CPUID.
This is the primary mechanism for getting information about the CPU.
There is a raft of information available, from a listing of the CPU
features (e.g. hyperthreading) to the sizes of the various levels
of cache and the associated cache lines. To access the CPUID
instruction we need a little bit of inline assembler.
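A minimal sketch of such a wrapper, assuming GCC or Clang extended inline assembly on an x86 machine. The helper names cpuid and vendor_string are illustrative rather than taken from any library, and on non-x86 targets the wrapper simply reports zeros so that the file still compiles.

```cpp
#include <cstdint>
#include <string>

// Executes CPUID with EAX = leaf and ECX = subleaf.
// On return data[0..3] hold EAX, EBX, ECX and EDX.
inline void cpuid(uint32_t leaf, uint32_t subleaf, uint32_t data[4]) {
#if defined(__x86_64__) || defined(__i386__)
    asm volatile("cpuid"
                 : "=a"(data[0]), "=b"(data[1]), "=c"(data[2]), "=d"(data[3])
                 : "a"(leaf), "c"(subleaf));
#else
    // Assumption for portability of the sketch: no CPUID here, report zeros.
    (void)leaf;
    (void)subleaf;
    data[0] = data[1] = data[2] = data[3] = 0;
#endif
}

// Leaf 0 returns the 12-character vendor string in EBX, EDX, ECX (in that
// order), four ASCII bytes per register, first character in the low byte.
inline std::string vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx) {
    const uint32_t regs[3] = {ebx, edx, ecx};
    std::string vendor;
    for (uint32_t reg : regs) {
        for (int byte = 0; byte < 4; ++byte) {
            vendor.push_back(static_cast<char>((reg >> (8 * byte)) & 0xFF));
        }
    }
    return vendor;
}

// Convenience overload: query leaf 0 on whichever CPU the caller is running on.
inline std::string vendor_string() {
    uint32_t data[4];
    cpuid(0, 0, data);
    return vendor_string(data[1], data[3], data[2]);
}
```

Printing the result of vendor_string() with a "Vendor String: " prefix would give a line in the same shape as the output quoted in this post.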

The code shows how to get the vendor string from the CPU. On my Mac I get the following:

// Output:
Vendor String: GenuineIntel

For those unfamiliar with inline assembly on Intel, the CPU defines a
number of registers. The ones used for the CPUID instruction are EAX,
EBX, ECX, and EDX (referenced as RAX, RBX, etc. if using 64-bit
instructions via the REX prefix). These are used for both input and
output. An inline asm segment consists of 3 parts. The first part is
the instruction to be executed, in this case the "cpuid" instruction. The second part defines the output parameters. The snippet "=a" (data[0]) means store the value of the EAX register in the variable data[0]; the "a"
refers to the 2nd letter of the register designation. The 3rd and
final part defines the input parameters. The CPUID instruction takes 2
parameters, one in EAX and one in ECX.

The particular CPUID leaf that provides the information needed to build the topology is 0xB (11), the extended topology enumeration leaf. The data returned from this instruction is:


     0                   1                   2                   3   
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
EAX |  Shift  |                      Reserved                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
EBX |   Processors at this level    |            Reserved           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
ECX |   Level No.   |  Level type   |            Reserved           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
EDX |                           x2APIC ID                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The extended topology enumeration leaf is one of the CPUID indexes that
also makes use of ECX as an input parameter. This indicates the level
of the CPU that you wish to work at. I started at level 0 and worked my
way up. The ‘Level type’ describes whether the current level is a CPU
thread (1), a physical core (2), or invalid (0). All values greater
than 2 are reserved, presumably for later use. The 2 other useful
values are the x2APIC ID and Shift. The x2APIC ID is a unique identifier for the smallest logical unit in the CPU. The Shift value is used to group x2APIC ID values into units at the next logical level. This is done by shifting the x2APIC ID right by the value specified by Shift. For example, using the following code on my workstation (2 sockets, 8 cores, 16 threads):
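A sketch of how that enumeration could look, under a few assumptions. The CPUID query is passed in as a callback (the CpuidFn type is hypothetical) so that the decoding logic stays separate from the inline assembly; the struct and function names are illustrative. The field positions follow the register layout above: Shift in the low 5 bits of EAX, the processor count in the low 16 bits of EBX, and the level number and level type in the low two bytes of ECX.

```cpp
#include <cstdint>
#include <cstdio>
#include <functional>
#include <vector>

// One level as reported by CPUID leaf 0xB.
struct TopologyLevel {
    uint32_t shift;       // EAX[4:0]
    uint32_t count;       // EBX[15:0]
    uint32_t level_id;    // ECX[7:0]
    uint32_t level_type;  // ECX[15:8]: 0 invalid, 1 thread, 2 core
    uint32_t x2apic;      // EDX
    uint32_t group;       // x2apic >> shift: id of the unit at the next level up
};

// Hypothetical callback: runs CPUID with EAX = leaf, ECX = subleaf and
// stores EAX, EBX, ECX, EDX into regs[0..3].
using CpuidFn = std::function<void(uint32_t leaf, uint32_t subleaf, uint32_t* regs)>;

inline TopologyLevel decode_level(const uint32_t regs[4]) {
    TopologyLevel level;
    level.shift = regs[0] & 0x1F;
    level.count = regs[1] & 0xFFFF;
    level.level_id = regs[2] & 0xFF;
    level.level_type = (regs[2] >> 8) & 0xFF;
    level.x2apic = regs[3];
    level.group = level.x2apic >> level.shift;
    return level;
}

// Starts at level 0 and works upwards until an invalid level type is reported.
inline std::vector<TopologyLevel> enumerate_levels(const CpuidFn& cpuid) {
    std::vector<TopologyLevel> levels;
    for (uint32_t subleaf = 0; subleaf < 256; ++subleaf) {  // level number is 8 bits
        uint32_t regs[4] = {0, 0, 0, 0};
        cpuid(0xB, subleaf, regs);
        const TopologyLevel level = decode_level(regs);
        if (level.level_type == 0) {
            break;
        }
        levels.push_back(level);
    }
    return levels;
}

inline void print_level(const TopologyLevel& level) {
    std::printf("Shift: %u, Count: %u, Level Id: %u, Level Type: %u, x2APIC: %u, Group: %u\n",
                static_cast<unsigned>(level.shift), static_cast<unsigned>(level.count),
                static_cast<unsigned>(level.level_id), static_cast<unsigned>(level.level_type),
                static_cast<unsigned>(level.x2apic), static_cast<unsigned>(level.group));
}
```

Passing a thin wrapper around the cpuid instruction as the callback and calling print_level on each returned entry gives one line per level.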

Outputs the following:


Shift: 1, Count: 2, Level Id: 0, Level Type: 1, x2APIC: 33, Group: 16
Shift: 5, Count: 8, Level Id: 1, Level Type: 2, x2APIC: 33, Group: 1

This indicates that the hardware thread identified by APIC 33 has a core id
of 16 and a socket id of 1. The socket and core ids aren’t really
references, but values indicating that threads with the same id share
that unit. This gets me most of the way there: I can group all of my
threads into a hierarchy of cores and sockets. However, there is one
small niggle remaining. I need this code to run on all of the CPUs on my
system. On Linux this is reasonably simple: you count the number of
CPUs, then iterate through all of them and use pthread_setaffinity_np(...) to specify which CPU to run the CPUID instructions on.
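The Linux approach might be sketched as follows, assuming glibc (pthread_setaffinity_np is a non-portable GNU extension). The helper name for_each_cpu is hypothetical. CPUs that the thread cannot be pinned to, for instance because they are offline or excluded by a cpuset, are skipped, and the original affinity mask is restored afterwards.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes pthread_setaffinity_np and the CPU_* macros on glibc
#endif
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

#include <functional>

// Runs work(cpu) once on each CPU the calling thread can be pinned to.
// Returns the number of CPUs visited. Linux/glibc only.
inline int for_each_cpu(const std::function<void(int)>& work) {
    const pthread_t self = pthread_self();

    // Remember the current mask so it can be restored at the end.
    cpu_set_t original;
    if (pthread_getaffinity_np(self, sizeof(original), &original) != 0) {
        return 0;
    }

    const long configured = sysconf(_SC_NPROCESSORS_CONF);
    int visited = 0;
    for (int cpu = 0; cpu < configured && cpu < CPU_SETSIZE; ++cpu) {
        cpu_set_t target;
        CPU_ZERO(&target);
        CPU_SET(cpu, &target);
        if (pthread_setaffinity_np(self, sizeof(target), &target) != 0) {
            continue;  // CPU offline or not permitted for this thread
        }
        work(cpu);  // the CPUID queries would go here
        ++visited;
    }

    pthread_setaffinity_np(self, sizeof(original), &original);
    return visited;
}
```

The callback is where the CPUID instructions for the topology leaf would be issued, with the results stored per CPU.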

However, on Mac OS X actual thread affinity is not supported;
there is only a mechanism for logically grouping or separating threads, powerful in
its own right, but not what I’m after here. This is where the kernel
module comes in. The XNU kernel defines a funky little method called mp_rendezvous(...).
This method takes in a number of function pointers. The key one
(action_func) is run once on each of the CPUs. So to get the topology
information for all of the CPUs we can use it like so:
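A rough sketch of that idea, with several assumptions labelled. The forward declarations reflect the mp_rendezvous and cpu_number signatures as they appear in the XNU sources and should be checked against the kernel version being targeted. The table layout, kMaxCpus, cpuid_count, record_topology and collect_topology are hypothetical names. The user-space stand-ins exist only so the sketch can be compiled and exercised outside a kernel extension; they run the action once on the current CPU.

```cpp
#include <stdint.h>

#ifdef KERNEL
// Not declared in an easily included header, but resolvable at kext link time.
extern "C" void mp_rendezvous(void (*setup_func)(void *),
                              void (*action_func)(void *),
                              void (*teardown_func)(void *),
                              void *arg);
extern "C" int cpu_number(void);
#else
// Assumption: user-space stand-ins that run the action once, as "CPU 0".
static int cpu_number(void) { return 0; }
static void mp_rendezvous(void (*setup_func)(void *),
                          void (*action_func)(void *),
                          void (*teardown_func)(void *),
                          void *arg) {
    if (setup_func) setup_func(arg);
    if (action_func) action_func(arg);
    if (teardown_func) teardown_func(arg);
}
#endif

static const int kMaxCpus = 64;  // hypothetical bound on logical CPUs

struct CpuEntry {
    bool valid;
    uint32_t x2apic;     // EDX of leaf 0xB
    uint32_t core_id;    // x2apic >> Shift reported at level 0
    uint32_t socket_id;  // x2apic >> Shift reported at level 1
};

struct TopologyTable {
    CpuEntry cpus[kMaxCpus];
};

static inline void cpuid_count(uint32_t leaf, uint32_t subleaf, uint32_t data[4]) {
#if defined(__x86_64__) || defined(__i386__)
    asm volatile("cpuid"
                 : "=a"(data[0]), "=b"(data[1]), "=c"(data[2]), "=d"(data[3])
                 : "a"(leaf), "c"(subleaf));
#else
    (void)leaf;
    (void)subleaf;
    data[0] = data[1] = data[2] = data[3] = 0;  // no CPUID on this target
#endif
}

// Runs once on every CPU. Assumption: treated as an interrupt-like context,
// so it only writes into the preallocated slot for the CPU it is running on.
static void record_topology(void *arg) {
    TopologyTable *table = static_cast<TopologyTable *>(arg);
    const int cpu = cpu_number();
    if (cpu < 0 || cpu >= kMaxCpus) {
        return;
    }

    uint32_t thread_level[4];
    uint32_t core_level[4];
    cpuid_count(0xB, 0, thread_level);
    cpuid_count(0xB, 1, core_level);

    CpuEntry &entry = table->cpus[cpu];
    entry.x2apic = thread_level[3];
    entry.core_id = entry.x2apic >> (thread_level[0] & 0x1F);
    entry.socket_id = entry.x2apic >> (core_level[0] & 0x1F);
    entry.valid = true;
}

// Fills the caller-provided table; no setup or teardown function is needed.
void collect_topology(TopologyTable *table) {
    mp_rendezvous(nullptr, record_topology, nullptr, table);
}
```

Indexing the table by cpu_number() lets each CPU write its own slot without any locking between them.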

Because the method mp_rendezvous() is defined in the kernel, the code above needs to be packaged as a kernel extension.
Even then, getting access to the method is interesting. It’s not
defined in a header that can be easily included; however, it is available
at link time when compiling a kernel module. Therefore, in order to
compile correctly, it’s necessary to provide your own forward
declaration of the method. The same is true of cpu_number(). Calling into the kernel method from user space requires use of the IOKit framework, but I’ll leave the details of that as an exercise for the reader.

LMAX Group blog - FX industry thought leadership

Building a CPU Topology on MacOS X
