My talk at SSTIC 2024 in Rennes

A few months ago, I had the wonderful experience of presenting as an invited speaker at the Symposium sur la Sécurité des Technologies de l’Information et des Communications (SSTIC) in Rennes, France. 🇨🇵

From what my French friends have told me, this is one of the largest and most relevant information security conferences in France, and this year marked its 22nd edition.

The conference is typically held in French, with mine being the only talk in English this year. So, I’m really excited to share the video of the presentation with you all. 🙂 🙌🏽🐧

Enhancing spatial safety: Better array-bounds checking in C (and Linux) — Gustavo A. R. Silva

The C language has historically suffered from a lack of proper bounds-checking on all kinds of arrays. The Kernel Self-Protection Project has been addressing this issue for several years. In this presentation, we will learn about the most recent hardening efforts to resolve the problem of bounds-checking, particularly for fixed-size and flexible arrays.

We will explore the different mechanisms being used to harden key APIs like memcpy() against buffer overflows, which includes the use of some interesting built-in compiler functions. We will also talk about a couple of recent compiler options like -fstrict-flex-arrays and -Wflex-array-member-not-at-end, as well as the new __counted_by__ attribute released in Clang-18 a few weeks ago, which helps us gain run-time bounds-checking coverage on flexible arrays.

Overall, we will discuss how various challenges have been overcome and highlight the innovations developed to solve the problem of array bounds-checking in both C and the Linux kernel once and for all.


Here is a link to the full presentation and slides: https://www.sstic.org/2024/presentation/invite_2024_2/

Thank you!

Here are some photos I took while I was in beautiful Rennes for the conference. 🙂

My talk at Kernel Recipes 2024

I gave a presentation at Kernel Recipes in September this year. 🇨🇵🗣️🐧

My slides are already up, and as always, I hope people find them useful. 🙂

I had some insightful discussions with the audience about the problem I’m currently working on. They asked questions and shared their thoughts. It turned out to be a very productive and interesting session. I really enjoyed it!

This conference is truly one of a kind, and for a number of reasons, it has a special place in my heart. ❤️

Slides: https://embeddedor.com/slides/2024/kr/kr2024.pdf (pdf)

Here are some photos from the closing day of Kernel Recipes last week. 🇨🇵❤️

Now, for me and everyone who’s been attending and speaking at conferences these past weeks: it’s definitely time to catch up on some much-needed sleep.

Thanks to everyone who attended this year! See you all soon!

Gracias totales.

ONE simple and rewarding way to contribute to the Linux Kernel: Fix Coverity issues

Introduction

Motivated by the series of events I describe below, I decided to write this short blog post about how fixing Coverity issues can open the door to your first meaningful contributions to the Linux kernel. I hope people find it both inspiring and useful. 🙂

Kernel Newbies and Kernel Janitors

In October last year, I sent the following reply to an email on the kernel-janitors mailing list.

> Yesterday someone on my lists just sent an email looking for kernel
> tasks. This was a university student in a kernel programming class.
> We also have kernel-janitors and outreachy and those people are always
> asking for small tasks.

We have tons of issues waiting to be audited and fixed here:

https://scan.coverity.com/projects/linux-next-weekly-scan

You will never run out of fun. :) People just need to sign up.

That's really a great way to learn and gain experience across the whole
kernel tree.

--
Gustavo

At the time, my response didn’t gain much traction. However, early this month, I came across a familiar message on the kernel newbies mailing list asking for guidance on how to contribute to the Linux kernel.

Hi all,

I am an embedded software engineer. I use Linux every day, and I appreciate its neatness and simplicity.

One day, I watched a video from Greg: https://youtu.be/LLBrBBImJt4, and I started wondering if maybe I could contribute to the Linux kernel. So, I sent a very simple (and maybe stupid) patch to the community:

[...]

It turns out that the patch was rejected.

So, my question is: how can I start contributing to the Linux kernel? Maybe I could start by fixing some small bugs?

Thanks,
Qianqiang Liu

To which I replied:

Hi!

> One day, I watched a video from Greg: https://youtu.be/LLBrBBImJt4, and I started wondering if maybe I could contribute to the Linux kernel.


If you are interested in security, fixing Coverity issues is a great way to
contribute to the kernel. Here are some presentations that you might find
useful:

https://embeddedor.com/slides/2017/kr/kr2017.pdf
https://embeddedor.com/slides/2018/kr/kr2018.pdf
https://embeddedor.com/slides/2019/kr/kr2019.pdf

You can also watch these presentations on YouTube for additional context.

You can sign up here for linux-next scans:
https://scan.coverity.com/projects/linux-next-weekly-scan

and here for -rc scans:
https://scan.coverity.com/projects/linux

I hope this helps.
--
Gustavo

Later that day, I received a couple of notifications informing me that someone was requesting access to the Linux kernel Coverity scans. I granted the access and forgot about it.

Fixing Coverity issues

Then, early last week, an email from that same thread landed in my inbox:

Hi,

Thank you all for the good advice.
I have now successfully submitted some small changes to the kernel:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9b4af913465cc5f903227237d833b4911430fd97
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=590efcd3c75f0e1f7208cf1c8dff5452818b70f2
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7fd551a87ba427fee2df8af4d83f4b7c220cc9dd
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=93497752dfed196b41d2804503e80b9a04318adb

Contributing to the Linux kernel is not that hard, all we need is
patience and persistence.

I definitely will do more work on the Linux kernel!

--
Best,
Qianqiang Liu

I was so happy to discover that those patches looked quite similar to the ones I used to submit back in the day when I was trying to land my first Coverity fixes. Then in a subsequent email, this was confirmed:

Hi Malatesh,

> Can you help me to contribute, what I needs to do ?

You can refer to this mail thread. The advice from Gustavo is pretty
useful.

Also, there is a document for submitting your first kernel patch:
https://kernelnewbies.org/FirstKernelPatch

--
Best,
Qianqiang Liu

Send patches, gain experience

Even though I already had solid experience in embedded systems and C programming, and had taken a Linux kernel development training course, when I started working on the Linux kernel I was totally new to the kernel community and, in particular, to all the nuances of upstream contributions. As I would learn over the years, mastering those nuances is what is actually crucial for becoming a successful kernel contributor. However, gaining this experience takes time, and the only way to become an experienced contributor, as you might have guessed, is to send tons of patches.

Bug fixing presentations

So, for those interested in landing their first kernel patches, one simple way to start gaining experience contributing to the Linux kernel is by fixing as many Coverity issues as possible. I promise you’ll learn a lot in the process. 😉

Check out the following presentations, where I dive deep into fixing Coverity issues and other problems in the Linux kernel:

And of course, sign up for linux-next (despite being named “linux-next weekly scan”, these are actually daily scans) and -rc Coverity scans. It’d be helpful if you could briefly mention this blog post in your request message when signing up, though it’s not required. Also, I encourage you to make sure you know how to submit a kernel patch. 🙂

Learn from existing contributions

Back in the day, when I started fixing issues in the kernel, it was not uncommon for me to feel a bit lost from time to time. One of the best things you can do in those situations is to look at what others are currently working on in your areas of interest.

In this case, one simple approach is to check the commits addressing Coverity issues that have recently landed in linux-next. The link below will take you to a list of all kernel patches in linux-next that contain the keyword Coverity in their changelog text.

I recommend studying them. If you don’t understand how people concluded that that was the right fix for the issue, take it a step further and look up the email thread that initiated the discussion and read it thoroughly to understand what is going on. This can be a great learning experience. 😉

Below is a link to all the Linux kernel mailing lists, including of course The Linux Kernel Mailing List or LKML:

You can start by checking LKML. However, it’s not uncommon for developers to omit that list and send their patches only to the relevant subsystem mailing lists. So, if you don’t find the thread on LKML, look it up on the other lists. Depending on the driver the patch affects, it should be obvious which lists to check.

Lastly, when you send a patch addressing Coverity issues, please briefly mention that in the changelog text. A simple Reported by Coverity is enough. This way, others can easily find your commits in the future and learn from your contributions as well. 🙂

Try in staging first

As a final piece of advice, I recommend starting by fixing issues in drivers/staging/. After landing several patches and gaining some experience from the feedback provided by the staging maintainers, you will feel more comfortable moving on to other areas of the kernel.

Each subsystem and driver in the kernel is usually maintained by different groups of people, each with their own way of doing things and their own idiosyncrasies. Adapting to these different methods is one of the most important pieces of experience you will gain as you continue submitting patches and paying attention to the feedback you receive along the way. Always remember: upstream Linux kernel development is highly social. 😉

Enjoy!

Back to Paris to present at Kernel Recipes 2024

I’m really happy to share that I will be traveling to Paris to speak at Kernel Recipes in the week after the Open Source Summit Europe. ✈️🇨🇵🗣️🎙️ This will be my 6th consecutive edition speaking at one of the most unique Linux kernel conferences. I’m really excited about this opportunity, and as always, feel free to say hi if you see me around. 🙂👋🏽

My talk will cover the work I’ve been doing in the Kernel Self-Protection Project over the last few months to fix thousands of -Wflex-array-member-not-at-end warnings. It can also be considered a sequel to my presentation last year, where I introduced this GCC compiler option to the audience:

You can see the description of my upcoming presentation below.

Enhancing spatial safety: Fixing thousands of -Wflex-array-member-not-at-end warnings

The introduction of the new -Wflex-array-member-not-at-end compiler option, released in GCC-14, has revealed approximately 60,000 warnings in the Linux kernel. Among them, some legitimate bugs have been uncovered.

In this presentation, we will explore in detail the different strategies we are employing to resolve all these warnings. These methods have already helped us resolve about 30% of them. Our ultimate goal in the Kernel Self-Protection Project is to globally enable this option in mainline, further enhancing the security of the kernel in the spatial safety domain.

https://kernel-recipes.org/en/2024/enhancing-spatial-safety-fixing-thousands-of-wflex-array-member-not-at-end-warnings/

By the way, I’m currently writing a detailed blog post about this work. Stay tuned! 📝

Kernel Self-Protection Project ⚔️🛡️🐧

See the entire schedule here: https://kernel-recipes.org/en/2024/schedule/

How to use the new counted_by attribute in C (and Linux)

The counted_by attribute

The counted_by attribute was introduced in Clang-18 and will soon be available in GCC-15. Its purpose is to associate a flexible-array member with a struct member that will hold the number of elements in this array at some point at run-time. This association is critical for enabling runtime bounds checking via the array bounds sanitizer and the __builtin_dynamic_object_size() built-in function. In user-space, this extra level of security is enabled by -D_FORTIFY_SOURCE=3. Therefore, using this attribute correctly enhances C codebases with runtime bounds-checking coverage on flexible-array members.

Here is an example of a flexible array annotated with this attribute:

struct bounded_flex_struct {
        ...
        size_t count;
        struct foo flex_array[] __attribute__((__counted_by__(count)));
};

In the above example, count is the struct member that will hold the number of elements of the flexible array at run-time. We will call this struct member the counter.

In the Linux kernel, this attribute facilitates bounds-checking coverage through fortified APIs such as the memcpy() family of functions, which internally use __builtin_dynamic_object_size() (CONFIG_FORTIFY_SOURCE), as well as through the array-bounds sanitizer (CONFIG_UBSAN_BOUNDS).
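To make the role of __builtin_dynamic_object_size() a bit more concrete, here is a minimal user-space sketch. It is not the kernel's fortified memcpy(): the COUNTED_BY() and OBJ_SIZE() macros, struct samples, and both helper functions are hypothetical names made up for this illustration. On compilers without counted_by support, the annotation expands to nothing and the size query may simply report "unknown".

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: expands to nothing if the attribute is unsupported. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(__counted_by__)
# define COUNTED_BY(member)	__attribute__((__counted_by__(member)))
#else
# define COUNTED_BY(member)
#endif

/* Prefer the dynamic built-in; fall back to the static one on older compilers. */
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif
#if __has_builtin(__builtin_dynamic_object_size)
# define OBJ_SIZE(p)	__builtin_dynamic_object_size(p, 1)
#else
# define OBJ_SIZE(p)	__builtin_object_size(p, 1)
#endif

/* Hypothetical structure: count is the counter for values[]. */
struct samples {
	size_t count;
	int values[] COUNTED_BY(count);
};

struct samples *samples_alloc(size_t count)
{
	struct samples *s;

	/* Reject sizes that would overflow the allocation arithmetic. */
	if (count > (SIZE_MAX - sizeof(*s)) / sizeof(s->values[0]))
		return NULL;

	s = calloc(1, sizeof(*s) + count * sizeof(s->values[0]));
	if (s)
		s->count = count;	/* set the counter before any array access */
	return s;
}

/* Copy n ints into s->values; return -1 instead of overflowing the array. */
int samples_store(struct samples *s, const int *src, size_t n)
{
	/* (size_t)-1 means the compiler could not determine the size. */
	size_t avail = OBJ_SIZE(s->values);

	if (n > s->count)
		return -1;
	if (avail != (size_t)-1 && n * sizeof(src[0]) > avail)
		return -1;

	memcpy(s->values, src, n * sizeof(src[0]));
	return 0;
}
```

With this sketch, samples_store() accepts a copy of up to count elements and rejects anything larger, whether or not the compiler can report the array size.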

The __counted_by() macro

In the kernel we wrap the counted_by attribute in the __counted_by() macro, as shown below.

#if __has_attribute(__counted_by__)
# define __counted_by(member)           __attribute__((__counted_by__(member)))
#else
# define __counted_by(member)
#endif
  • c8248faf3ca27 (“Compiler Attributes: counted_by: Adjust name…”)

And with this we have been annotating flexible-array members across the whole kernel tree over the last year.

diff --git a/drivers/net/ethernet/chelsio/cxgb4/sched.h b/drivers/net/ethernet/chelsio/cxgb4/sched.h
index 5f8b871d79afac..6b3c778815f09e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sched.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/sched.h
@@ -82,7 +82,7 @@ struct sched_class {
 
 struct sched_table {      /* per port scheduling table */
 	u8 sched_size;
-	struct sched_class tab[];
+	struct sched_class tab[] __counted_by(sched_size);
 };
  • ceba9725fb45 (“cxgb4: Annotate struct sched_table with …”)

However, as we are about to see, not all __counted_by() annotations are as straightforward as the one above.

__counted_by() annotations in the kernel

There are a number of requirements to properly use the counted_by attribute. One crucial requirement is that the counter must be initialized before the first reference to the flexible-array member. Another requirement is that the array must always contain at least as many elements as indicated by the counter. Below you can see an example of a kernel patch addressing these requirements.

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
index dac7eb77799bd1..68960ae9898713 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
@@ -33,7 +33,7 @@ struct brcmf_fweh_queue_item {
 	u8 ifaddr[ETH_ALEN];
 	struct brcmf_event_msg_be emsg;
 	u32 datalen;
-	u8 data[];
+	u8 data[] __counted_by(datalen);
 };
 
 /*
@@ -418,17 +418,17 @@ void brcmf_fweh_process_event(struct brcmf_pub *drvr,
 	    datalen + sizeof(*event_packet) > packet_len)
 		return;
 
-	event = kzalloc(sizeof(*event) + datalen, gfp);
+	event = kzalloc(struct_size(event, data, datalen), gfp);
 	if (!event)
 		return;
 
+	event->datalen = datalen;
 	event->code = code;
 	event->ifidx = event_packet->msg.ifidx;
 
 	/* use memcpy to get aligned event message */
 	memcpy(&event->emsg, &event_packet->msg, sizeof(event->emsg));
 	memcpy(event->data, data, datalen);
-	event->datalen = datalen;
 	memcpy(event->ifaddr, event_packet->eth.h_dest, ETH_ALEN);
 
 	brcmf_fweh_queue_event(fweh, event);
  • 62d19b358088 (“wifi: brcmfmac: fweh: Add __counted_by…”)

In the patch above, datalen is the counter for the flexible-array member data. Notice how the assignment to the counter, event->datalen = datalen, had to be moved before the call to memcpy(event->data, data, datalen). This ensures the counter is initialized before the first reference to the flexible array. Otherwise, the compiler would complain about trying to write into a flexible array of size zero, because datalen was zeroed out by the earlier call to kzalloc(). This assignment-after-memcpy pattern has been quite common in the Linux kernel. However, when dealing with counted_by annotations, the pattern must be changed. Therefore, we have to be careful when adding these annotations: we should audit all code that references both the counter and the flexible array and ensure it meets these requirements.
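The same ordering can be tried outside the kernel. Below is a hedged user-space sketch loosely modeled on the patch above. struct event_item, event_size(), event_create() and the COUNTED_BY() macro are hypothetical names, and event_size() only imitates the idea behind the kernel's struct_size() helper (saturating on overflow) using the __builtin_add_overflow() built-in.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: expands to nothing if the attribute is unsupported. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(__counted_by__)
# define COUNTED_BY(member)	__attribute__((__counted_by__(member)))
#else
# define COUNTED_BY(member)
#endif

/* Hypothetical event structure: datalen is the counter for data[]. */
struct event_item {
	uint32_t code;
	uint32_t datalen;
	uint8_t data[] COUNTED_BY(datalen);
};

/* Header plus n payload bytes, saturating at SIZE_MAX on overflow. */
static size_t event_size(size_t n)
{
	size_t total;

	if (__builtin_add_overflow(sizeof(struct event_item), n, &total))
		return SIZE_MAX;
	return total;
}

struct event_item *event_create(uint32_t code, const void *data,
				uint32_t datalen)
{
	/* A saturated size makes calloc() fail instead of under-allocating. */
	struct event_item *ev = calloc(1, event_size(datalen));

	if (!ev)
		return NULL;

	ev->datalen = datalen;			/* counter first ... */
	ev->code = code;
	memcpy(ev->data, data, datalen);	/* ... then the flexible array */
	return ev;
}
```

Swapping the two marked lines would reproduce the problem described above: at the time of the memcpy() the counter would still be zero.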

In the kernel, we’ve been learning from our mistakes and have fixed some buggy annotations we made in the beginning. Here are a couple of bugfixes to make you aware of these issues:

  • 6dc445c19050 (“clk: bcm: rpi: Assign ->num before accessing…”)
  • 9368cdf90f52 (“clk: bcm: dvp: Assign ->num before accessing…”)

Another common issue is when the counter is updated inside a loop. See the patch below.

diff --git a/drivers/net/wireless/ath/wil6210/cfg80211.c b/drivers/net/wireless/ath/wil6210/cfg80211.c
index 8993028709ecfb..e8f1d30a8d73c5 100644
--- a/drivers/net/wireless/ath/wil6210/cfg80211.c
+++ b/drivers/net/wireless/ath/wil6210/cfg80211.c
@@ -892,10 +892,8 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 	struct wil6210_priv *wil = wiphy_to_wil(wiphy);
 	struct wireless_dev *wdev = request->wdev;
 	struct wil6210_vif *vif = wdev_to_vif(wil, wdev);
-	struct {
-		struct wmi_start_scan_cmd cmd;
-		u16 chnl[4];
-	} __packed cmd;
+	DEFINE_FLEX(struct wmi_start_scan_cmd, cmd,
+		    channel_list, num_channels, 4);
 	uint i, n;
 	int rc;
 
@@ -977,9 +975,8 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 	vif->scan_request = request;
 	mod_timer(&vif->scan_timer, jiffies + WIL6210_SCAN_TO);
 
-	memset(&cmd, 0, sizeof(cmd));
-	cmd.cmd.scan_type = WMI_ACTIVE_SCAN;
-	cmd.cmd.num_channels = 0;
+	cmd->scan_type = WMI_ACTIVE_SCAN;
+	cmd->num_channels = 0;
 	n = min(request->n_channels, 4U);
 	for (i = 0; i < n; i++) {
 		int ch = request->channels[i]->hw_value;
@@ -991,7 +988,8 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 			continue;
 		}
 		/* 0-based channel indexes */
-		cmd.cmd.channel_list[cmd.cmd.num_channels++].channel = ch - 1;
+		cmd->num_channels++;
+		cmd->channel_list[cmd->num_channels - 1].channel = ch - 1;
 		wil_dbg_misc(wil, "Scan for ch %d  : %d MHz\n", ch,
 			     request->channels[i]->center_freq);
 	}
@@ -1007,16 +1005,15 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 	if (rc)
 		goto out_restore;
 
-	if (wil->discovery_mode && cmd.cmd.scan_type == WMI_ACTIVE_SCAN) {
-		cmd.cmd.discovery_mode = 1;
+	if (wil->discovery_mode && cmd->scan_type == WMI_ACTIVE_SCAN) {
+		cmd->discovery_mode = 1;
 		wil_dbg_misc(wil, "active scan with discovery_mode=1\n");
 	}
 
 	if (vif->mid == 0)
 		wil->radio_wdev = wdev;
 	rc = wmi_send(wil, WMI_START_SCAN_CMDID, vif->mid,
-		      &cmd, sizeof(cmd.cmd) +
-		      cmd.cmd.num_channels * sizeof(cmd.cmd.channel_list[0]));
+		      cmd, struct_size(cmd, channel_list, cmd->num_channels));
 
 out_restore:
 	if (rc) {
diff --git a/drivers/net/wireless/ath/wil6210/wmi.h b/drivers/net/wireless/ath/wil6210/wmi.h
index 71bf2ae27a984f..b47606d9068c8b 100644
--- a/drivers/net/wireless/ath/wil6210/wmi.h
+++ b/drivers/net/wireless/ath/wil6210/wmi.h
@@ -474,7 +474,7 @@ struct wmi_start_scan_cmd {
 	struct {
 		u8 channel;
 		u8 reserved;
-	} channel_list[];
+	} channel_list[] __counted_by(num_channels);
 } __packed;
 
 #define WMI_MAX_PNO_SSID_NUM	(16)
  • 34c34c242a1b (“wifi: wil6210: cfg80211: Use __counted_by…”)

The patch above does a bit more than merely annotating the flexible array with the __counted_by() macro, but that’s material for a future post. For now, let’s focus on the following excerpt.

-	cmd.cmd.scan_type = WMI_ACTIVE_SCAN;
-	cmd.cmd.num_channels = 0;
+	cmd->scan_type = WMI_ACTIVE_SCAN;
+	cmd->num_channels = 0;
 	n = min(request->n_channels, 4U);
 	for (i = 0; i < n; i++) {
 		int ch = request->channels[i]->hw_value;
@@ -991,7 +988,8 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 			continue;
 		}
 		/* 0-based channel indexes */
-		cmd.cmd.channel_list[cmd.cmd.num_channels++].channel = ch - 1;
+		cmd->num_channels++;
+		cmd->channel_list[cmd->num_channels - 1].channel = ch - 1;
 		wil_dbg_misc(wil, "Scan for ch %d  : %d MHz\n", ch,
 			     request->channels[i]->center_freq);
 	}
 ...
--- a/drivers/net/wireless/ath/wil6210/wmi.h
+++ b/drivers/net/wireless/ath/wil6210/wmi.h
@@ -474,7 +474,7 @@ struct wmi_start_scan_cmd {
 	struct {
 		u8 channel;
 		u8 reserved;
-	} channel_list[];
+	} channel_list[] __counted_by(num_channels);
 } __packed;

Notice that in this case, num_channels is our counter, and it’s set to zero before the for loop. Inside the for loop, the original code used this variable as an index to access the flexible array, then updated it via a post-increment, all in one line: cmd.cmd.channel_list[cmd.cmd.num_channels++]. The issue is that once channel_list was annotated with the __counted_by() macro, the compiler enforces dynamic array indexing of channel_list to stay below num_channels. Since num_channels holds a value of zero at the moment of the array access, this leads to undefined behavior and may trigger a compiler warning.

As shown in the patch, the solution is to increment num_channels before accessing the array, and then access the array with an index adjusted to stay below the counter: num_channels - 1.

Another option is to avoid using the counter as an index for the flexible array altogether. This can be done by using an auxiliary variable instead. See an excerpt of a patch below.

diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index 38eb7ec86a1a65..21ebd70f3dcc97 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -2143,7 +2143,7 @@ struct hci_cp_le_set_cig_params {
 	__le16  c_latency;
 	__le16  p_latency;
 	__u8    num_cis;
-	struct hci_cis_params cis[];
+	struct hci_cis_params cis[] __counted_by(num_cis);
 } __packed;

@@ -1722,34 +1717,33 @@ static int hci_le_create_big(struct hci_conn *conn, struct bt_iso_qos *qos)
 
 static int set_cig_params_sync(struct hci_dev *hdev, void *data)
 {
 ...

+	u8 aux_num_cis = 0;
 	u8 cis_id;
 ...

 	for (cis_id = 0x00; cis_id < 0xf0 &&
-	     pdu.cp.num_cis < ARRAY_SIZE(pdu.cis); cis_id++) {
+	     aux_num_cis < pdu->num_cis; cis_id++) {
 		struct hci_cis_params *cis;
 
 		conn = hci_conn_hash_lookup_cis(hdev, NULL, 0, cig_id, cis_id);
@@ -1758,7 +1752,7 @@ static int set_cig_params_sync(struct hci_dev *hdev, void *data)
 
 		qos = &conn->iso_qos;
 
-		cis = &pdu.cis[pdu.cp.num_cis++];
+		cis = &pdu->cis[aux_num_cis++];
 		cis->cis_id = cis_id;
 		cis->c_sdu  = cpu_to_le16(conn->iso_qos.ucast.out.sdu);
 		cis->p_sdu  = cpu_to_le16(conn->iso_qos.ucast.in.sdu);
@@ -1769,14 +1763,14 @@ static int set_cig_params_sync(struct hci_dev *hdev, void *data)
 		cis->c_rtn  = qos->ucast.out.rtn;
 		cis->p_rtn  = qos->ucast.in.rtn;
 	}
+	pdu->num_cis = aux_num_cis;
 
 ...
  • ea9e148c803b (“Bluetooth: hci_conn: Use __counted_by() and…”)

Again, the entire patch does more than merely annotate the flexible-array member, but let’s just focus on how aux_num_cis is used to access the flexible array pdu->cis[].

In this case, the counter is num_cis. As in our previous example, the counter was originally used to directly index the flexible array: &pdu.cis[pdu.cp.num_cis++]. However, the patch above introduces a new variable, aux_num_cis, to be used instead of the counter: &pdu->cis[aux_num_cis++]. The counter is then updated after the loop: pdu->num_cis = aux_num_cis.

Both solutions are acceptable, so use whichever is convenient for you. 🙂
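For comparison, here are both loop patterns side by side in a small user-space sketch. Everything in it (struct chan_list, the COUNTED_BY() macro, the helper names, and the rule that a zero entry is skipped) is hypothetical and only meant to mirror the shape of the two patches above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: expands to nothing if the attribute is unsupported. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(__counted_by__)
# define COUNTED_BY(member)	__attribute__((__counted_by__(member)))
#else
# define COUNTED_BY(member)
#endif

/* Hypothetical structure: num_channels is the counter for channel[]. */
struct chan_list {
	uint8_t num_channels;
	uint8_t channel[] COUNTED_BY(num_channels);
};

/* Allocate room for cap channels; the counter starts out as the capacity. */
struct chan_list *chan_list_alloc(uint8_t cap)
{
	struct chan_list *cl = calloc(1, sizeof(*cl) + cap);

	if (cl)
		cl->num_channels = cap;
	return cl;
}

/* Pattern 1: increment the counter first, then index just below it. */
void fill_increment_first(struct chan_list *cl, uint8_t cap,
			  const uint8_t *hw, size_t n)
{
	cl->num_channels = 0;
	for (size_t i = 0; i < n && cl->num_channels < cap; i++) {
		if (hw[i] == 0)		/* skip invalid entries */
			continue;
		cl->num_channels++;
		/* 0-based channel indexes */
		cl->channel[cl->num_channels - 1] = (uint8_t)(hw[i] - 1);
	}
}

/* Pattern 2: index with an auxiliary variable, update the counter afterwards. */
void fill_aux_index(struct chan_list *cl, const uint8_t *hw, size_t n)
{
	uint8_t used = 0;

	/* While filling, cl->num_channels still holds the capacity. */
	for (size_t i = 0; i < n && used < cl->num_channels; i++) {
		if (hw[i] == 0)
			continue;
		cl->channel[used++] = (uint8_t)(hw[i] - 1);
	}
	cl->num_channels = used;
}
```

In both functions every array access uses an index that is below the counter at the time of the access, which is the property the annotation relies on.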

Here, you can see a recent bugfix for some buggy annotations that missed the details discussed above:

  • [PATCH] wifi: iwlwifi: mvm: Fix __counted_by usage in cfg80211_wowlan_nd*

In a future post, I’ll address the issue of annotating flexible arrays of flexible structures. Spoiler alert: don’t do it!

Back to Europe to present at Open Source Summit

Happy to share that I will be traveling back to Europe in September to speak at the Open Source Summit Europe 2024 in Vienna. ✈️🇦🇹🗣️🎙️ I will also attend both Linux Security Summit and Linux Plumbers. 🧑🏽‍💻🐧 I hope to meet with a lot of friends that I haven’t seen in a while. Feel free to say hi if you see me around. 🙂

My talk will be about the work we’ve been doing in the Kernel Self-Protection Project over the last 5 years to harden the upstream Linux kernel, particularly focusing on spatial safety related to array-bounds checking. ⚔ 🛡 🐧 You can see the description below.

Challenges and Innovations Towards Spatial Safety in the Linux Kernel

The first flexible-array transformation we implemented in the kernel, as part of the Kernel Self-Protection Project, took place back in March 2019. At the time, our work on preventing integer overflows during memory allocations led us to discover an 8-year-old bug. Addressing this bug not only resolved a longstanding issue but also initiated the work of flexible-array transformations across the whole kernel tree.

This marked the beginning of a challenging yet rewarding journey to add bounds-checking on trailing arrays in the Linux kernel. Five years have passed since then, and we’ve come a long way. We now have new Clang and GCC hardening compiler options and attributes that significantly improve the security of the Linux kernel, particularly in the spatial-safety area. We have new hardening helpers that make traditional methods less prone to error.

In general, we have new and safer ways of doing things, which usually require a learning curve, even for seasoned kernel developers. In this talk, we will walk through the most recent challenges and history of our quest to improve spatial safety in the Linux kernel, and with that, get rid of out-of-bounds bugs once and for all.

https://osseu2024.sched.com/event/1ej2k/challenges-and-innovations-towards-spatial-safety-in-the-linux-kernel-gustavo-a-r-silva-the-linux-foundation

I will start by explaining basic technical concepts and then move up to bleeding-edge kernel hardening. Whether you’re an advanced kernel developer or just starting to delve into the world of Linux kernel development, I’m sure you’ll find this presentation interesting and educational. 📖 I really hope to see many of you there. 🙂

You can see the entire schedule here: https://osseu2024.sched.com/

Kernel Self-Protection Project ⚔ 🛡 🐧

Google Open Source Peer Bonus Award

In other news from November, I want to share that I’m thrilled to be the recipient of this award from Google for the first time. I feel really grateful and honored! 🙂🙏🏽

This comes as a result of my contributions to the Linux kernel over the years.

Honestly, I didn’t even know about the existence of this award until I received an email from someone at Google informing me about it. However, learning about it made me feel really great!

My appreciation goes out to my teammates in the Kernel Self-Protection Project, especially to Kees Cook, who has been an invaluable mentor to me over the years. Special thanks to Greg Kroah-Hartman as well, who was instrumental in setting me on my journey as a Linux kernel developer. 👨🏽‍💻🐧

Influencing Software Security: The Impact of the Kernel Self-Protection Project ⚔️🛡️🐧

Compiler Options Hardening Guide

On November 29th, the Open Source Security Foundation (OpenSSF) released a comprehensive and thorough hardening guide aimed at mitigating potential vulnerabilities in C and C++ code through the use of various hardening compiler options.

This guide references some of the work we’ve accomplished over the years in the Kernel Self-Protection Project (KSPP), particularly our efforts to globally enable -Wimplicit-fallthrough and -fstrict-flex-arrays=3 in the upstream Linux kernel. 🐧

-Wimplicit-fallthrough

This warning flag warns when a fallthrough occurs unless it is specially marked as being intended. The Linux kernel project uses this flag; it led to the discovery and fixing of many bugs.
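As a quick illustration of what "specially marked" means, here is a small sketch. The FALLTHROUGH macro, the PERM_* constants and perms_for() are hypothetical names; the macro follows the same idea as the kernel's fallthrough pseudo-keyword, expanding to the fallthrough attribute where the compiler supports it.

```c
/* Hypothetical marker: uses the attribute when the compiler supports it. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(__fallthrough__)
# define FALLTHROUGH	__attribute__((__fallthrough__))
#else
# define FALLTHROUGH	do {} while (0)	/* fallthrough */
#endif

enum {
	PERM_READ  = 1 << 0,
	PERM_WRITE = 1 << 1,
};

/* Hypothetical helper: level 2 may read and write, level 1 may only read. */
unsigned int perms_for(int level)
{
	unsigned int perms = 0;

	switch (level) {
	case 2:
		perms |= PERM_WRITE;
		FALLTHROUGH;	/* intended: writers can also read */
	case 1:
		perms |= PERM_READ;
		break;
	default:
		break;
	}
	return perms;
}
```

Removing the FALLTHROUGH line keeps the behavior the same, but a build with -Wimplicit-fallthrough would then flag case 2 as a possibly forgotten break.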

-fstrict-flex-arrays=3

In this guide we recommend using the standard C99 flexible array notation [] instead of non-standard [0] or misleading [1], and then using -fstrict-flex-arrays=3 to improve bounds checking in such cases. In this case, code that uses [0] for a flexible array will need to be modified to use [] instead. Code that uses [1] for a flexible array needs to be modified to use [] and also extensively modified to eliminate off-by-one errors. Using [1] is not just misleading, it’s error-prone; beware that existing code using [1] to indicate a flexible array may currently have off-by-one errors.
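The off-by-one risk mentioned above is easy to see in the size arithmetic. The sketch below uses two hypothetical structures, one with a legacy [1] array and one with a C99 flexible-array member, and computes the allocation size the straightforward way for each. Overflow checks are left out to keep it short.

```c
#include <stddef.h>

/* Hypothetical legacy layout: a one-element "fake" flexible array. */
struct legacy_msg {
	size_t count;
	int items[1];
};

/* Same layout expressed with a C99 flexible-array member. */
struct flex_msg {
	size_t count;
	int items[];
};

/*
 * Straightforward size calculation for n elements. With the [1] layout,
 * sizeof() already contains one element (plus any padding), so this
 * silently allocates room for at least n + 1 elements.
 */
size_t legacy_alloc_size(size_t n)
{
	return sizeof(struct legacy_msg) + n * sizeof(int);
}

/* With [] the same formula is exact: header plus n elements. */
size_t flex_alloc_size(size_t n)
{
	return sizeof(struct flex_msg) + n * sizeof(int);
}
```

Code written against the [1] layout often compensates with n - 1 somewhere, and that compensation has to be found and removed when converting to [].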

GCC hardening features

The work of Qing Zhao is also referenced in the guide. Qing is making significant contributions to the KSPP by implementing hardening features in GCC, which we want to adopt in the Linux kernel.

Beyond the Linux kernel

In conclusion, it’s quite fulfilling to see the hardening work we undertake in the Kernel Self-Protection Project having a significant influence in the world of software security, beyond the Linux kernel. 🙂

November 2023 – Linux Kernel work

-Wstringop-overflow

Late in October I sent a patch to globally enable the -Wstringop-overflow compiler option, which finally landed in linux-next on November 28th. It’s expected to be merged into mainline during the next merge window, likely in the last couple of weeks of December, but “We’ll see”. I plan to send a pull request for this to Linus when the time is right. 🙂

I’ll write more about the challenges of enabling this compiler option once it’s included in 6.8-rc1, early next year. In the meantime, it’s worth mentioning that several people, including Kees Cook, Arnd Bergmann, and myself, have sent patches to fix -Wstringop-overflow warnings over the past few years.

Below are the patches that address the last warnings, together with the couple of patches that enable the option in the kernel. The first of them enables the option globally for all versions of GCC. However, -Wstringop-overflow is buggy in GCC-11. Therefore, I wrote a second patch adding this option under a new configuration option, CC_STRINGOP_OVERFLOW, in init/Kconfig, which is enabled by default for all versions of GCC except GCC-11. To handle the GCC-11 case I added another configuration option, GCC11_NO_STRINGOP_OVERFLOW, which disables -Wstringop-overflow by default for GCC-11 only.

Boot crash on ARM64

Another relevant task I worked on recently was debugging and fixing a boot crash on ARM64, reported by Joey Gouly. This issue was interesting as it related to some long-term work in the Kernel Self-Protection Project (KSPP), particularly our efforts to transform “fake” flexible arrays into C99 flexible-array members. In short, there was a zero-length fake flexible array at the end of a structure annotated with the __randomize_layout attribute, which needed to be transformed into a C99 flexible-array member.

This becomes problematic due to how compilers treated such arrays before the introduction of -fstrict-flex-arrays=3. The randstruct GCC plugin treated these arrays as actual flexible arrays, thus leaving their memory layout untouched when the kernel is built with CONFIG_RANDSTRUCT. However, after commit 1ee60356c2dc (‘gcc-plugins: randstruct: Only warn about true flexible arrays’), this behavior changed. Fake flexible arrays were no longer treated the same as proper C99 flexible-array members, leading to randomized memory layout for these arrays in structures annotated with __randomize_layout, which was the root cause of the boot crash.

To address this, I sent two patches. The first patch is the actual bugfix, which includes the flexible-array transformation. The second patch is complementary to commit 1ee60356c2dc, updating a code comment to clarify that “we don’t randomize the layout of the last element of a struct if it’s a proper flexible array.”

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 07022bb0d44d..0d28172193fa 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -162,7 +162,7 @@ struct neighbour {
 	struct rcu_head		rcu;
 	struct net_device	*dev;
 	netdevice_tracker	dev_tracker;
-	u8			primary_key[0];
+	u8			primary_key[];
 } __randomize_layout;
 
 struct neigh_ops {
diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c b/scripts/gcc-plugins/randomize_layout_plugin.c
index 910bd21d08f4..746ff2d272f2 100644
--- a/scripts/gcc-plugins/randomize_layout_plugin.c
+++ b/scripts/gcc-plugins/randomize_layout_plugin.c
@@ -339,8 +339,7 @@ static int relayout_struct(tree type)
 
 	/*
 	 * enforce that we don't randomize the layout of the last
-	 * element of a struct if it's a 0 or 1-length array
-	 * or a proper flexible array
+	 * element of a struct if it's a proper flexible array
 	 */
 	if (is_flexible_array(newtree[num_fields - 1])) {
 		has_flexarray = true;

These two patches will soon be backported to a couple of -stable trees.

-Wflex-array-member-not-at-end

During my last presentation at Kernel Recipes in September this year, I talked a bit about -Wflex-array-member-not-at-end, a compiler option currently under development for GCC-14.
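As a rough illustration of the pattern this option is meant to flag, here is a minimal sketch. The structures demo_hdr and demo_composite are hypothetical and made up for this example. A structure that ends in a flexible-array member is embedded in the middle of another structure, so the flexible array starts at the same offset as the member that follows it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structures for illustration only (not kernel code). */
struct demo_hdr {
	int count;
	int data[];		/* flexible-array member */
};

struct demo_composite {
	struct demo_hdr hdr;	/* flexible array is no longer at the end (GNU extension) */
	int tail;
};

/* Returns 1 when hdr.data[0] would occupy the same bytes as tail. */
int demo_flex_overlaps_tail(void)
{
	return offsetof(struct demo_composite, hdr) +
	       offsetof(struct demo_hdr, data) ==
	       offsetof(struct demo_composite, tail);
}

/* Aborts if the overlap does not hold on the current compiler/target. */
void demo_overlap_selfcheck(void)
{
	assert(demo_flex_overlaps_tail());
}
```

The sketch only compares offsets and never writes through hdr.data. In real code, any write to hdr.data[0] would silently clobber tail, which is the kind of problem that is hard to spot by reading the code alone.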

One of the highlights of the talk was a 6-year-old bug that I initially uncovered through grepping. Later, while reviewing some build logs from previous months, I realized that -Wflex-array-member-not-at-end had also detected this problem:

This bugfix was backported to 6.5.7, 6.1.57, 5.15.135, 5.10.198, 5.4.258 and 4.19.296 stable kernels.

Encouraged by this discovery, I started hunting for more similar bugs. My efforts led to fixing a couple more:

On November 28th, these two bugfixes were successfully backported to multiple stable kernel trees. The first fix was applied to the 6.6.3, 6.5.13, 6.1.64 stable kernels. The second fix was also applied to these, along with the 5.15.140 stable kernel.

I will have a lot of fun with -Wflex-array-member-not-at-end next year. 😄

-Warray-bounds

In addition to these tasks, I continued addressing -Warray-bounds issues. Below are some of the patches I sent for this.

Patch review and ACKs

I’ve also been involved in patch review and providing ACKs. Kees Cook, for instance, has been actively annotating flexible-array members with the __counted_by attribute, and I’ve been reviewing those patches.
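For readers who haven’t seen such an annotation, here is a minimal userspace sketch. The names DEMO_COUNTED_BY, demo_buf, demo_buf_alloc and demo_buf_get are hypothetical and not kernel APIs. The macro expands to the compiler attribute only when the compiler reports support for it, and to nothing otherwise, so the code still builds on older toolchains.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper macro: use the attribute only if the compiler has it. */
#if defined(__has_attribute)
# if __has_attribute(__counted_by__)
#  define DEMO_COUNTED_BY(member) __attribute__((__counted_by__(member)))
# endif
#endif
#ifndef DEMO_COUNTED_BY
# define DEMO_COUNTED_BY(member)
#endif

/* Hypothetical structure: items[] holds exactly count elements. */
struct demo_buf {
	size_t count;
	int items[] DEMO_COUNTED_BY(count);
};

/* Allocates a zero-filled buffer with room for count items, or returns NULL. */
struct demo_buf *demo_buf_alloc(size_t count)
{
	struct demo_buf *buf = malloc(sizeof(*buf) + count * sizeof(buf->items[0]));

	if (!buf)
		return NULL;
	buf->count = count;	/* set the counter before touching items[] */
	for (size_t i = 0; i < count; i++)
		buf->items[i] = 0;
	return buf;
}

/* Manual bounds check, mirroring what the annotation lets tooling verify. */
int demo_buf_get(const struct demo_buf *buf, size_t i)
{
	assert(i < buf->count);
	return buf->items[i];
}
```

The important habit this shows is assigning the counter before the first access to the array, since the annotation ties the array bounds to that member.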

Google Open Source Peer Bonus Award

In other news from November, I want to share that I’m thrilled to be the recipient of this award from Google for the first time. I feel really grateful and honored! 🙂🙏🏽

This comes as a result of my contributions to the Linux kernel over the years.

Honestly, I didn’t even know about the existence of this award until I received an email from someone at Google informing me about it. However, learning about it made me feel really great!

My appreciation goes out to my teammates in the Kernel Self-Protection Project, especially to Kees Cook, who has been an invaluable mentor to me over the years. Special thanks to Greg Kroah-Hartman as well, who was instrumental in setting me on my journey as a Linux kernel developer.👨🏽‍💻🐧

Acknowledgements

Special thanks to The Linux Foundation and Google for supporting my Linux kernel work. 🙂

Dusting off this blog

This weekend I learned that Jerry Cooperstein has retired, and while dusting off my personal blog (I will be posting regularly next year, once a month, actually), I ran into my “LFD420 Linux Kernel Internals and Development” certificate of completion, which was signed by Jerry back in May 2016.

In December of that year, I quit my job at a consulting company to pursue my dream of becoming a professional Open Source Developer. Then, exactly one year after completing my Linux Foundation training (at The Linux Foundation Training and Certification), in May 2017, I began my career in Open Source. 😀

2024 will mark my 8th consecutive year working as a professional Upstream Linux Kernel Engineer, and I feel so grateful for all the things that have happened over the years. 🐧

Thanks to all the people who have dedicated their careers to making the dream of Open Source possible. It’s really a great community, full of people who truly make a difference in the lives of billions of people around the world.

I’m so honored to be part of this family. 🙂