Hacking the Linux Kernel in Ada – Part 3

April 7, 2022

5685

For this three-part series, we implemented a ‘pedal to the metal’ GPIO driven, flashing of a LED, in the context of a Linux kernel module for the NVIDIA Jetson Nano development board (kernel-based v4.9.294, arm64) in my favorite programming language … Ada!

Part 1. Review of a kernel module, build strategy, and Ada integration.
Part 2. Modeling of the problem in Ada.
Part 3. Practical Ada binding to the C kernel APIs.

You can find the whole project published at https://github.com/ohenley/adacore_jetson. It is known to build and run properly. All instructions to be up and running in 5 minutes are included in the accompanying front-facing README.md. Do not hesitate to fill a GitHub issue if you find any problem.

Disclaimer: This text is meant to appeal to both Ada and non-Ada coders. Therefore I try to strike a balance between code story simplicity, didactic tractability, and features density. As I said to a colleague, this is the text I would have liked to cross before starting this experiment.

Binding 101

The binding thickness

Our code boundary to the Linux kernel C methods lies in kernel.ads. For an optional “adaptation” opportunity, kernel.adb exists before breaking into the concrete C binding. Take printk (printf equivalent in kernel space) for example. In C, you would call printk(“hello\n”). Ada strings are not null-terminated, they are an array of characters. To make sure the passed Ada string stays valid on the C side, you expose specification signatures .ads that make sense when programming from an Ada point of view and “adapt” in body implementation .adb before calling directly into the binding. Strictly speaking, our exposed Ada Printk would qualify as a “thick” binding even though the adaptation layer is minimal. This is in opposition to a “thin” binding which is really a one-to-one mapping on the C signature as implemented by Printk_C.

-- kernel.ads
procedure Printk (S : String); -- only this is visible for clients of kernel

-- kernel.adb
procedure Printk_C (S : String) with -- considered a thin binding
    Import        => true,
    Convention    => C,
    External_Name => "printk";

procedure Printk (S : String) is -- considered a thick binding
begin
   Printk_C (S & Ascii.Lf & Ascii.Nul); -- because we ‘mangle’ for Ada comfort
end;

The wrapper function

Binding to a wrapped C macro or static inline is often convenient, potentially makes you inherit fixes, upgrades happening inside/under the macro implementation and are, depending on the context, potentially more portable. create_singlethread_workqueue used in printk_wq.c as found in Part 1 makes a perfect example. Our driver has a C home in main.c. You create a C wrapping function calling the macro.

/* main.c */
extern struct workqueue_struct * wrap_create_singlethread_wq (const char* name)
{
   return create_singlethread_workqueue(name); /* calling the macro */
}

You then bind to this wrapper on the Ada side and use it. Done.

-- kernel.ads
function Create_Singlethread_Wq (Name : String) return Workqueue_Struct_Access with
   Import        => True,
   Convention    => C,
   External_Name => "wrap_create_singlethread_wq";

-- flash_led.adb
...
Wq := K.Create_Singlethread_Wq ("flash_led_work");

The reconstruction

Sometimes a macro called on the C side creates stuff, in place, which you end up needing on the Ada side. You can probably always bind to this resource but I find it often impedes code story. Take DECLARE_DELAYED_WORK(dw, delayed_work_cb) for example. From an outside point of view, it implicitly creates struct delayed_work dw in place.

/* https://elixir.bootlin.com/linux/v4.9.294/source/include/linux/workqueue.h */
#define DECLARE_DELAYED_WORK(n, f)					\
	struct delayed_work n = __DELAYED_WORK_INITIALIZER(n, f, 0)

Using this macro, the only way I found to get a hold of dw from Ada without crashing (returning dw from a wrapper never worked) was to globally call DECLARE_DELAYED_WORK(n, f) in main.c and then bind only to dw. Having to maintain this from C, making it magically appear in Ada felt “breadboard wiring” to me. In the code repository, you will find that we fully reconstructed this macro under the procedure of the same name Declare_Delayed_Work.

The pointer shortcut

Most published Ada to C bindings implement full definition parity. This is an ideal situation in most cases but it also comes with complexity, may generate many 3rd party files, sometimes buried deep, out-of-sync definitions, etc. What can you do when complete bindings are missing or you just want to move lean and fast? If you are making a prototype, you want minimal dependencies, the binding part is peripheral eg. you may only need a quick native window API. You get the point.

Depending on the context you do not always need the full type definitions to get going. Anytime you are strictly dealing with a handle pointer (not owning the memory), you can take a shortcut. Let’s bind to gpio_get_value to illustrate. Again, I follow and layout all C signatures found in the kernel sources leading to concrete stuff, where we can bind.




/* https://elixir.bootlin.com/linux/v4.9.294/source(-) */
/* (+)include/linux/gpio.h */
static inline int gpio_get_value(unsigned int gpio)
{
	return __gpio_get_value(gpio);
}

/* (+)include/asm-generic/gpio.h */
static inline int __gpio_get_value(unsigned gpio)
{
	return gpiod_get_raw_value(gpio_to_desc(gpio));
}
/* (+)include/linux/gpio/consumer.h */
struct gpio_desc *gpio_to_desc(unsigned gpio);            /* bindable */

int gpiod_get_raw_value(const struct gpio_desc *desc);    /* bindable */

/* (+)drivers/gpio/gpiolib.h */
struct gpio_desc {
	struct gpio_device	*gdev;
	unsigned long		flags;
...
	const char		*name;
};

Inspecting the C definitions we find that gpiod_get_raw_value and gpio_to_desc are our available functions for binding. We note gpio_to_desc uses a transient pointer of type gpio_desc *. Because we do not touch or own a full gpio_desc instance we can happily skip defining it in full (and any dependent leads eg. gpio_device).

By declaring type Gpio_Desc_Acc is new System.Address; we create an equivalent to gpio_desc *. After all, a C pointer is a named system address. We now have everything we need to build our Ada version of gpio_get_value.

-- kernel.ads
package Ic renames Interfaces.C;

function Gpio_Get_Value (Gpio : Ic.Unsigned) return Ic.Int; -- only this is visible for clients of kernel

-- kernel.adb
type Gpio_Desc_Acc is new System.Address; -- shortcut

function Gpio_To_Desc_C (Gpio : Ic.Unsigned) return Gpio_Desc_Acc with
   Import        => True,
   Convention    => C,
   External_Name => "gpio_to_desc";
 
function Gpiod_Get_Raw_Value_C (Desc : Gpio_Desc_Acc) return Ic.Int with
   Import        => True,
   Convention    => C,
   External_Name => "gpiod_get_raw_value";

function Gpio_Get_Value (Gpio : Ic.Unsigned) return Ic.Int is
   Desc : Gpio_Desc_Acc := Gpio_To_Desc_C (Gpio);
begin
   return Gpiod_Get_Raw_Value_C (Desc);
end;

The Raw bindings, “100% Ada”

In most production contexts we cannot recommend reconstructing unbindable kernel API calls in Ada. Wrapping the C macro or static inline is definitely easier, safer, portable and maintainable. The following goes full blown Ada for the sake of illustrating some interesting nuts and bolts and to show that it is always possible.

Flags, first take

Given the will power you can always reconstruct the targeted macro or static inline in Ada. Let’s come back to create_singlethread_workqueue. If you take the time to expand its macro using GCC this is what you get.

$ gcc -E [~ 80_switches_for_valid_ko] printk_wq.c 
...
wq = __alloc_workqueue_key(("%s"),
                          (WQ_UNBOUND |
                           __WQ_ORDERED |
                           __WQ_ORDERED_EXPLICIT |
                          (__WQ_LEGACY | WQ_MEM_RECLAIM)),
                          (1),
                          ((void *)0),
                          ((void *)0),
                          "my_wq");

All arguments are straightforward to map except the OR‘ed flags. Let’s search the kernel sources for those flags.

/* https://elixir.bootlin.com/linux/v4.9.294/source/include/linux/workqueue.h */
enum {
   WQ_UNBOUND             = 1 << 1,
   ...
   WQ_POWER_EFFICIENT     = 1 << 7,

   __WQ_DRAINING          = 1 << 16,
   ...
   __WQ_ORDERED_EXPLICIT  = 1 << 19,

   WQ_MAX_ACTIVE          = 512,     
   WQ_MAX_UNBOUND_PER_CPU = 4,      
   WQ_DFL_ACTIVE          = WQ_MAX_ACTIVE / 2,
};

Here are our design decisions for reconstruction

WQ_MAX_ACTIVE, WQ_MAX_UNBOUND_PER_CPU, WQ_DFL_ACTIVE are constants, not flags, so we keep them out.
The enum is anonymous, let’s give it a proper named type.
__WQ pattern is probably a convention but at the same times usage is mixed, eg. WQ_UNBOUND | __WQ_ORDERED, so let’s flatten all this.

Because we do not use these flags elsewhere in our code base, the occasion is perfect to show that in Ada we can keep all this modeling local to our unique function using it.

-- kernel.ads
package Ic renames Interfaces.C;

type Wq_Struct_Access is new System.Address;      -- shortcut
type Lock_Class_Key_Access is new System.Address; -- shortcut
Null_Lock : Lock_Class_Key_Access := 
Lock_Class_Key_Access (System.Null_Address); -- typed ((void *)0) equiv.

-- kernel.adb
type Bool is (NO, YES) with Size => 1;       -- enum holding on 1 bit
for Bool use (NO => 0, YES => 1);            -- "represented" by 0, 1 too

function Alloc_Workqueue_Key_C ...
   External_Name => "__alloc_workqueue_key";      -- thin binding

function Create_Singlethread_Wq (Name : String) return Wq_Struct_Access is
   type Workqueue_Flags is record
      ...
      WQ_POWER_EFFICIENT  : Bool;
      WQ_DRAINING         : Bool;
      ...
   end record with Size => Ic.Unsigned'Size;
   for Workqueue_Flags use record
      ...
      WQ_POWER_EFFICIENT  at 0 range  7 ..  7;
      WQ_DRAINING         at 0 range 16 .. 16;
      ...
   end record;
   Flags : Workqueue_Flags := (WQ_UNBOUND          => YES,
                               WQ_ORDERED          => YES,
                               WQ_ORDERED_EXPLICIT => YES,
                               WQ_LEGACY           => YES,
                               WQ_MEM_RECLAIM      => YES,
                               Others              => NO);
   Wq_Flags : Ic.Unsigned with Address => Flags'Address;
begin
   return Alloc_Workqueue_Key_C ("%s", Wq_Flags, 1, Null_Lock, "", Name);
end;

In C, each flag is implicitly encoded as an integer literal, bit swapped by an amount. Because __alloc_workqueue_key signature expects flags encoded as an unsigned int It should be reasonable to use Ic.Unsigned’Size, to hold a Workqueue_Flags.
We build the representation of Workqueue_Flags type similar to what we learned in Part 2 to model registers. Compared to the C version we now have NO => 0, YES => 1 semantic and no need for bitwise operations.
Remember, in Ada we roll with strong user-defined types for the greater goods. Therefore something like Workqueue_Flags does not match the expected Flags : Ic.Unsigned parameter of our __alloc_workqueue_key thin binding. What should we do? You create a variable Wq_Flags : Ic.Unsigned and overlay it the address of Flags : Workqueue_Flags which you can now pass in to __alloc_workqueue_key.

Wq_Flags : Ic.Unsigned with Address => Flags'Address; -- voila!

Ioremap and iowrite32

The core work of the raw_io version happens in Set_Gpio. Using Ioremap, we retrieve the kernel mapped IO memory location for the GPIO_OUT register physical address. We then write the content of our Gpio_Control to this IO memory location through Io_Write_32.

-- kernel.ads
type Iomem_Access is new System.Address;

-- led.adb
package K renames Kernel;
package C renames Controllers;

procedure Set_Gpio (Pin : C.Pin; S : Led.State) is

   function Bit (S : Led.State) return C.Bit renames Led.State'Enum_Rep;

   Base_Addr : K.Iomem_Access;
   Control   : C.Gpio_Control := (Bits  => (others => 0), 
                                  Locks => (others => 0));
   Control_C : K.U32 with Address => Control'Address;
begin
   ...
   Control.Bits (Pin.Reg_Bit) := Bit (S); -- set the GPIO flags
   ...
   Base_Addr := Ioremap (C.Get_Register_Phys_Address (Pin.Port, C.GPIO_OUT),
                         Control_C'Size); -- get kernel mapped register addr.
   K.Io_Write_32 (Control_C, Base_Addr);  -- write our GPIO flags to this addr.
   ...
end;

Let’s take the hard paths of full reconstruction to illustrate interesting stuff. We first implement ioremap. On the C side we find

/* https://elixir.bootlin.com/linux/v4.9.294/source(-) */
/* (+)arch/arm64/include/asm/io.h */
#define ioremap(addr, size) \
   __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))

extern void __iomem *__ioremap(phys_addr_t phys_addr, size_t size, pgprot_t prot);

Flags, second take

Here we are both lucky and unlucky. __ioremap is low hanging while __pgprot(PROT_DEVICE_nGnRE) turns out to be a rabbit hole. I skip the intermediate expansion by reporting the final result

$ gcc -E [~ 80_switches_for_valid_ko] test_using_ioremap.c
…
void* membase = __ioremap(  
   (phys_addr + offset),
   (4),
   ((pgprot_t) {
      (((((((pteval_t)(3)) << 0) |
      (((pteval_t)(1)) << 10) |
      (((pteval_t)(3)) << 8)) |
      (arm64_kernel_unmapped_at_el0() ? (((pteval_t)(1)) << 11) : 0)) |
      (((pteval_t)(1)) << 53) |
      (((pteval_t)(1)) << 54) |
      (((pteval_t)(1)) << 55) |
      ((((pteval_t)(1)) << 51)) |
      (((pteval_t)((1))) << 2)))
   }))

Searching for definitions in the kernel sources: (meaningful sampling only)

/* https://elixir.bootlin.com/linux/v4.9.294/source(-) */
/* (+)arch/arm64/include/asm/pgtable-hwdef.h */
#define PTE_TYPE_MASK       (_AT(pteval_t, 3) << 0)
...
#define PTE_NG		    (_AT(pteval_t, 1) << 11) 
...
#define PTE_ATTRINDX(t)     (_AT(pteval_t, (t)) << 2)    

/* (+)arch/arm64/include/asm/mmu.h */
static inline bool arm64_kernel_unmapped_at_el0(void)   
{
   return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0) &&
   cpus_have_const_cap(ARM64_UNMAP_KERNEL_AT_EL0);
}

/* (+)arch/arm64/include/asm/pgtable-prot.h */
#define PTE_DIRTY           (_AT(pteval_t, 1) << 55)    

/* (+)arch/arm64/include/asm/memory.h */
#define MT_DEVICE_nGnRE     1

The macro pattern _AT(pteval_t, x) can be cleared right away. IIUC, it serves to handle calling both from assembly and C. When you are concerned by the C case, like we do, it boils down to x, eg. ((pteval_t)(1)) << 10) becomes 1 << 10.

arm64_kernel_unmapped_at_el0 is in part ‘kernel configuration dependant’, defaulting to ‘yes’, so let’s simplify our job and bring it in, PTE_NG which is the choice ? (((pteval_t)(1)) << 11), for all cases.

(((pteval_t)((1))) << 2))) turns out to be PTE_ATTRINDX(t) with MT_DEVICE_nGnRE as input. Inspecting the kernel sources, there are four other values intended as input to PTE_ATTRINDX(t). PTE_ATTRINDX behaves like a function so let implement it as such.

type Pgprot_T is mod 2**64; -- type will hold on 64 bits 

type Memory_T is range 0 .. 5;
MT_DEVICE_NGnRnE : constant Memory_T := 0;
MT_DEVICE_NGnRE  : constant Memory_T := 1;
...
MT_NORMAL_WT     : constant Memory_T := 5;

function PTE_ATTRINDX (Mt : Memory_T) return Pgprot_T is
   (Pgprot_T(Mt * 2#1#e+2)); -- base # based_integer # exponent

Here I want to show another way to replicate C behavior, this time using bitwise operations. Something like PTE_TYPE_MASK value ((pteval_t)(3)) << 0 cannot be approached like we did before. 3 takes two bits and is somewhat a magic number. What we can do is improve on the representation. We are doing bit masks so why not express using binary numbers directly. It even makes sense graphically.

PTE_VALID      : Pgprot_T := 2#1#e+0;
...
PTE_TYPE_MASK  : Pgprot_T := 2#1#e+0 + 2#1#e+1; -- our famous 3
...
PTE_HYP_XN     : Pgprot_T := 2#1#e+54;

-- kernel.ads
type Phys_Addr_T is new System.Address;
type Iomem_Access is new System.Address;

-- kernel.adb
function Ioremap (Phys_Addr : Phys_Addr_T; 
                  Size      : Ic.Size_T) return Iomem_Access is
...         
   Pgprot : Pgprot_T := (PTE_TYPE_MASK or
                         PTE_AF        or
                         PTE_SHARED    or
                         PTE_NG        or
                         PTE_PXN       or
                         PTE_UXN       or
                         PTE_DIRTY     or
                         PTE_DBM       or
                         PTE_ATTRINDX (MT_DEVICE_NGnRE));
begin
   return Ioremap_C (Phys_Addr, Size, Pgprot);
end;

So what is interesting here?

Ada is flexible. The original Pgprot_T values arrangement did not allow record mapping like we previously did for type Workqueue_Flags. We adapted by replicating the C implementation, OR‘ing all values to create a final mask.
Everything has been tidied up by strong typing. We are now stuck with disciplined stuff.
Representation is explicit, expressed in the intended base.
Once again this typing machinery lives at the most restrictive scope, inside the Ioremap function. Because Ada “scoping” has few special rules, refactoring up/out of scopes usually boils down to a simple blocks swapping game.

Emitting assembly

Now we give a look at ioread32 and iowrite32. Turns out those are, again, a cascade of static inline and macros ending up directly emitting GCC assembly directives (detailing only iowrite32).

/* https://elixir.bootlin.com/linux/v4.9.294/source(-) */
/* (+)include/asm-generic/io.h */
static inline void iowrite32(u32 value, volatile void __iomem *addr)
{
   writel(value, addr);
}
/* (+)include/asm/io.h */
#define writel(v,c)     ({ __iowmb(); writel_relaxed((v),(c)); })
#define __iowmb()       wmb()    

/* (+)include/asm/barrier.h */
#define wmb()           dsb(st) 
#define dsb(opt)        asm volatile("dsb " #opt : : : "memory")

/* (+)arch/arm64/include/asm/io.h */
#define writel_relaxed(v,c) \
   ((void)__raw_writel((__force u32)cpu_to_le32(v),(c)))
   
static inline void __raw_writel(u32 val, volatile void __iomem *addr)   
{
   asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
}

In Ada it becomes

with System.Machine_Code
...
procedure Io_Write_32 (Val : U32; Addr : Iomem_Access) is
   use System.Machine_Code;
begin
   Asm (Template => "dsb st",
        Clobber  => "memory",
        Volatile => True);

   Asm (Template => "str %w0, [%1]",
        Inputs   => (U32'Asm_Input ("rZ", Val), 
                     Iomem_Access'Asm_Input ("r", Addr)),
        Volatile => True);
end;

This Io_Write_32 implementation is not portable as we rebuilt the macro following the expansion tailored for arm64. A C wrapper would be less trouble while ensuring portability. Nevertheless, we felt this experiment was a good opportunity to show assembly directives in Ada.

That’s it!

I hope you appreciated this moderately dense overview of Ada in the context of Linux kernel module developpement. I think we can agree that Ada is a really disciplined and powerful contender when it comes to system, pedal to the metal, programming. I thank you for your time and concern. Do not hesitate to reach out and, happy Ada coding!

I want to thank Quentin Ochem, Nicolas Setton, Fabien Chouteau, Jerome Lambourg, Michael Frank, Derek Schacht, Arnaud Charlet, Pat Bernardi, Leo Germond, and Artium Nihamkin for their different insights and feedback to nail this experiment.

The author, Olivier Henley, is a UX Engineer at AdaCore. His role is exploring new markets through technical stories. Prior to joining AdaCore, Olivier was a consultant software engineer for Autodesk. Prior to that, Olivier worked on AAA game titles such as For Honor and Rainbow Six Siege in addition to many R&D gaming endeavors at Ubisoft Montreal. Olivier graduated from the Electrical Engineering program in Polytechnique Montreal. He is a co-author of patent US8884949B1, describing the invention of a novel temporal filter implicating NI technology. An Ada advocate, Olivier actively curates GitHub’s Awesome-Ada list.

Binding 101

The binding thickness

The wrapper function

The reconstruction

The pointer shortcut

The Raw bindings, “100% Ada”

Flags, first take

Ioremap and iowrite32

Flags, second take

Emitting assembly

That’s it!

RELATED ARTICLESMORE FROM AUTHOR

Webinar: Harden Your Security Mindset: Break Down the Critical Security Risks for Web Apps

Kickstart Your Career & Save Up To 40%!

Further Your Education with Courses & Certifications

Save BIG on Earth Day Deals with Sitewide Savings!

Bridging Design and Runtime Gaps: AsyncAPI in Event-Driven Architecture

RELATED ARTICLES MORE FROM AUTHOR