ONE simple and rewarding way to contribute to the Linux Kernel: Fix Coverity issues

Introduction

Motivated by the series of events I describe below, I decided to write this short blog post about how fixing Coverity issues can open the door to your first meaningful contributions to the Linux kernel. I hope people find it both inspiring and useful. 🙂

Kernel Newbies and Kernel Janitors

In October last year, I sent the following reply to an email on the kernel-janitors mailing list.

> Yesterday someone on my lists just sent an email looking for kernel
> tasks. This was a university student in a kernel programming class.
> We also have kernel-janitors and outreachy and those people are always
> asking for small tasks.

We have tons of issues waiting to be audited and fixed here:

https://scan.coverity.com/projects/linux-next-weekly-scan

You will never run out of fun. :) People just need to sign up.

That's really a great way to learn and gain experience across the whole
kernel tree.

--
Gustavo

At the time, my response didn’t gain much traction. However, early this month, I came across a familiar message on the kernel newbies mailing list asking for guidance on how to contribute to the Linux kernel.

Hi all,

I am an embedded software engineer. I use Linux every day, and I appreciate its neatness and simplicity.

One day, I watched a video from Greg: https://youtu.be/LLBrBBImJt4, and I started wondering if maybe I could contribute to the Linux kernel. So, I sent a very simple (and maybe stupid) patch to the community:

[...]

It turns out that the patch was rejected.

So, my question is: how can I start contributing to the Linux kernel? Maybe I could start by fixing some small bugs?

Thanks,
Qianqiang Liu

To which I replied:

Hi!

> One day, I watched a video from Greg: https://youtu.be/LLBrBBImJt4, and I started wondering if maybe I could contribute to the Linux kernel.


If you are interested in security, fixing Coverity issues is a great way to
contribute to the kernel. Here are some presentations that you might find
useful:

https://embeddedor.com/slides/2017/kr/kr2017.pdf
https://embeddedor.com/slides/2018/kr/kr2018.pdf
https://embeddedor.com/slides/2019/kr/kr2019.pdf

You can also watch these presentations on YouTube for additional context.

You can sign up here for linux-next scans:
https://scan.coverity.com/projects/linux-next-weekly-scan

and here for -rc scans:
https://scan.coverity.com/projects/linux

I hope this helps.
--
Gustavo

Later that day, I received a couple of notifications informing me that someone was requesting access to the Linux kernel Coverity scans. I granted the access and forgot about it.

Fixing Coverity issues

Then, early last week, an email from that same thread landed in my inbox:

Hi,

Thank you all for the good advice.
I have now successfully submitted some small changes to the kernel:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9b4af913465cc5f903227237d833b4911430fd97
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=590efcd3c75f0e1f7208cf1c8dff5452818b70f2
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7fd551a87ba427fee2df8af4d83f4b7c220cc9dd
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=93497752dfed196b41d2804503e80b9a04318adb

Contributing to the Linux kernel is not that hard, all we need is
patience and persistence.

I definitely will do more work on the Linux kernel!

--
Best,
Qianqiang Liu

I was so happy to discover that those patches looked quite similar to the ones I used to submit back in the day when I was trying to land my first Coverity fixes. Then in a subsequent email, this was confirmed:

Hi Malatesh,

> Can you help me to contribute, what I needs to do ?

You can refer to this mail thread. The advice from Gustavo is pretty
useful.

Also, there is a document for submitting your first kernel patch:
https://kernelnewbies.org/FirstKernelPatch

--
Best,
Qianqiang Liu

Send patches, gain experience

When I started working on the Linux kernel, I already had solid experience in embedded systems and C programming, and I had taken a Linux kernel development training course. Even so, I was totally new to the Linux kernel community and, in particular, to the nuances of upstream contributions. As I would learn over the years, that is what really matters for becoming a successful kernel contributor. Gaining this experience takes time, and the only way to become an experienced contributor, as you might have guessed, is to send tons of patches.

Bug fixing presentations

So, for those interested in landing their first kernel patches, one simple way to start gaining experience contributing to the Linux kernel is by fixing as many Coverity issues as possible. I promise you’ll learn a lot in the process. 😉

Check out the following presentations, where I dive deep into fixing Coverity issues and other problems in the Linux kernel:

And of course, sign up for linux-next (despite being named “linux-next weekly scan”, these are actually daily scans) and -rc Coverity scans. It’d be helpful if you could briefly mention this blog post in your request message when signing up, though it’s not required. Also, I encourage you to make sure you know how to submit a kernel patch. 🙂

Learn from existing contributions

Back in the day, when I started fixing issues in the kernel, it was not uncommon for me to feel a bit lost from time to time. One of the best things you can do in those situations is to look at what others are currently working on in your areas of interest.

In this case, one simple approach is to check the commits addressing Coverity issues that have recently landed in linux-next. The link below will take you to a list of all kernel patches in linux-next that contain the keyword Coverity in their changelog text.

I recommend studying them. If you don’t understand how people concluded that that was the right fix for the issue, take it a step further and look up the email thread that initiated the discussion and read it thoroughly to understand what is going on. This can be a great learning experience. 😉

Below is a link to all the Linux kernel mailing lists, including of course The Linux Kernel Mailing List or LKML:

You can start by checking LKML. However, it’s not uncommon for developers to omit that list and send their patches only to the relevant subsystem mailing lists. So, if you don’t find the thread on LKML, look it up on the other lists. Depending on the driver the patch affects, it should be obvious which lists to check.

Lastly, when you send a patch addressing Coverity issues, please briefly mention that in the changelog text. A simple Reported by Coverity is enough. This way, others can easily find your commits in the future and learn from your contributions as well. 🙂

Try in staging first

As a final piece of advice, I recommend starting by fixing issues in drivers/staging/. After landing several patches and gaining some experience from the feedback provided by the staging maintainers, you will feel more comfortable moving on to other areas of the kernel.

Each subsystem and driver in the kernel is usually maintained by different groups of people, each with their own way of doing things and their own idiosyncrasies. Adapting to these different methods is one of the most important pieces of experience you will gain as you continue submitting patches and paying attention to the feedback you receive along the way. Always remember: upstream Linux kernel development is highly social. 😉

Enjoy!

Back to Paris to present at Kernel Recipes 2024

I’m really happy to share that I will be traveling to Paris to speak at Kernel Recipes in the week after the Open Source Summit Europe. ✈️🇨🇵🗣️🎙️ This will be my 6th consecutive edition speaking at one of the most unique Linux kernel conferences. I’m really excited about this opportunity, and as always, feel free to say hi if you see me around. 🙂👋🏽

My talk will cover the work I’ve been doing in the Kernel Self-Protection Project over the last few months to fix thousands of -Wflex-array-member-not-at-end warnings. It can also be considered a sequel to my presentation last year, where I introduced this GCC compiler option to the audience:

You can see the description of my upcoming presentation below.

Enhancing spatial safety: Fixing thousands of -Wflex-array-member-not-at-end warnings

The introduction of the new -Wflex-array-member-not-at-end compiler option, released in GCC-14, has revealed approximately 60,000 warnings in the Linux kernel. Among them, some legitimate bugs have been uncovered.

In this presentation, we will explore in detail the different strategies we are employing to resolve all these warnings. These methods have already helped us resolve about 30% of them. Our ultimate goal in the Kernel Self-Protection Project is to globally enable this option in mainline, further enhancing the security of the kernel in the spatial safety domain.

https://kernel-recipes.org/en/2024/enhancing-spatial-safety-fixing-thousands-of-wflex-array-member-not-at-end-warnings/

By the way, I’m currently writing a detailed blog post about this work. Stay tuned! 📝

Kernel Self-Protection Project ⚔️🛡️🐧

See the entire schedule here: https://kernel-recipes.org/en/2024/schedule/

How to use the new counted_by attribute in C (and Linux)

The counted_by attribute

The counted_by attribute was introduced in Clang-18 and will soon be available in GCC-15. Its purpose is to associate a flexible-array member with a struct member that will hold the number of elements in this array at some point at run-time. This association is critical for enabling runtime bounds checking via the array bounds sanitizer and the __builtin_dynamic_object_size() built-in function. In user-space, this extra level of security is enabled by -D_FORTIFY_SOURCE=3. Therefore, using this attribute correctly enhances C codebases with runtime bounds-checking coverage on flexible-array members.

Here is an example of a flexible array annotated with this attribute:

struct bounded_flex_struct {
        ...
        size_t count;
        struct foo flex_array[] __attribute__((__counted_by__(count)));
};

In the above example, count is the struct member that will hold the number of elements of the flexible array at run-time. We will call this struct member the counter.

In the Linux kernel, this attribute facilitates bounds-checking coverage through fortified APIs such as the memcpy() family of functions, which internally use __builtin_dynamic_object_size() (CONFIG_FORTIFY_SOURCE), as well as through the array-bounds sanitizer (CONFIG_UBSAN_BOUNDS).
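To make the idea concrete outside the kernel, here is a minimal user-space sketch. COUNTED_BY(), struct sample, sample_alloc() and sample_set() are names I made up for illustration; they are not kernel or libc APIs. The macro falls back to nothing on compilers without the attribute, and sample_set() spells out by hand the kind of index-versus-counter check that the annotation lets the tooling perform for you.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space macro, modeled on the kernel's __counted_by(). */
#if defined(__has_attribute)
# if __has_attribute(__counted_by__)
#  define COUNTED_BY(member) __attribute__((__counted_by__(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)	/* attribute not supported: expands to nothing */
#endif

struct sample {
	size_t count;			/* the counter */
	int values[] COUNTED_BY(count);	/* the flexible array it describes */
};

/* Allocate room for n elements and set the counter right away. */
static struct sample *sample_alloc(size_t n)
{
	struct sample *s;

	/* Refuse sizes that would overflow the allocation computation. */
	if (n > (SIZE_MAX - sizeof(*s)) / sizeof(s->values[0]))
		return NULL;

	s = calloc(1, sizeof(*s) + n * sizeof(s->values[0]));
	if (s)
		s->count = n;
	return s;
}

/* Hand-written bounds check: reject any index at or beyond the counter. */
static bool sample_set(struct sample *s, size_t idx, int value)
{
	assert(s != NULL);
	if (idx >= s->count)
		return false;
	s->values[idx] = value;
	return true;
}
```

With a struct allocated via sample_alloc(3), sample_set(s, 2, 42) succeeds while sample_set(s, 3, 7) is rejected, because index 3 is not below the counter.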

The __counted_by() macro

In the kernel we wrap the counted_by attribute in the __counted_by() macro, as shown below.

#if __has_attribute(__counted_by__)
# define __counted_by(member)           __attribute__((__counted_by__(member)))
#else
# define __counted_by(member)
#endif
  • c8248faf3ca27 (“Compiler Attributes: counted_by: Adjust name…”)

And with this we have been annotating flexible-array members across the whole kernel tree over the last year.

diff --git a/drivers/net/ethernet/chelsio/cxgb4/sched.h b/drivers/net/ethernet/chelsio/cxgb4/sched.h
index 5f8b871d79afac..6b3c778815f09e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sched.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/sched.h
@@ -82,7 +82,7 @@ struct sched_class {
 
 struct sched_table {      /* per port scheduling table */
 	u8 sched_size;
-	struct sched_class tab[];
+	struct sched_class tab[] __counted_by(sched_size);
 };
  • ceba9725fb45 (“cxgb4: Annotate struct sched_table with …”)

However, as we are about to see, not all __counted_by() annotations are always as straightforward as the one above.

__counted_by() annotations in the kernel

There are a number of requirements to properly use the counted_by attribute. One crucial requirement is that the counter must be initialized before the first reference to the flexible-array member. Another requirement is that the array must always contain at least as many elements as indicated by the counter. Below you can see an example of a kernel patch addressing these requirements.

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
index dac7eb77799bd1..68960ae9898713 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
@@ -33,7 +33,7 @@ struct brcmf_fweh_queue_item {
 	u8 ifaddr[ETH_ALEN];
 	struct brcmf_event_msg_be emsg;
 	u32 datalen;
-	u8 data[];
+	u8 data[] __counted_by(datalen);
 };
 
 /*
@@ -418,17 +418,17 @@ void brcmf_fweh_process_event(struct brcmf_pub *drvr,
 	    datalen + sizeof(*event_packet) > packet_len)
 		return;
 
-	event = kzalloc(sizeof(*event) + datalen, gfp);
+	event = kzalloc(struct_size(event, data, datalen), gfp);
 	if (!event)
 		return;
 
+	event->datalen = datalen;
 	event->code = code;
 	event->ifidx = event_packet->msg.ifidx;
 
 	/* use memcpy to get aligned event message */
 	memcpy(&event->emsg, &event_packet->msg, sizeof(event->emsg));
 	memcpy(event->data, data, datalen);
-	event->datalen = datalen;
 	memcpy(event->ifaddr, event_packet->eth.h_dest, ETH_ALEN);
 
 	brcmf_fweh_queue_event(fweh, event);
  • 62d19b358088 (“wifi: brcmfmac: fweh: Add __counted_by…”)

In the patch above, datalen is the counter for the flexible-array member data. Notice how the assignment to the counter, event->datalen = datalen, had to be moved before the call to memcpy(event->data, data, datalen). This ensures the counter is initialized before the first reference to the flexible array. Otherwise, the compiler would complain about trying to write into a flexible array of size zero, because datalen was zeroed out by the earlier call to kzalloc(). This assignment-after-memcpy pattern has been quite common in the Linux kernel. However, when dealing with counted_by annotations, it should be changed. Therefore, we have to be careful when doing these annotations: we should audit all instances of code that reference both the counter and the flexible array and ensure they meet these requirements.
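The same ordering can be sketched in user space. Everything below is an assumption for illustration: COUNTED_BY(), struct event, event_size() and event_new() are hypothetical names, and event_size() only imitates the idea behind the kernel's struct_size() helper (compute header plus payload, saturating instead of wrapping around on overflow).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space macro, modeled on the kernel's __counted_by(). */
#if defined(__has_attribute)
# if __has_attribute(__counted_by__)
#  define COUNTED_BY(member) __attribute__((__counted_by__(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct event {
	uint32_t code;
	uint32_t datalen;			/* the counter */
	uint8_t data[] COUNTED_BY(datalen);	/* payload bytes */
};

/* Header plus n payload bytes, saturating at SIZE_MAX on overflow. */
static size_t event_size(size_t n)
{
	if (n > SIZE_MAX - sizeof(struct event))
		return SIZE_MAX;
	return sizeof(struct event) + n;
}

static struct event *event_new(uint32_t code, const void *data, uint32_t datalen)
{
	struct event *ev;

	assert(data != NULL || datalen == 0);

	/* A saturated size makes calloc() fail instead of under-allocating. */
	ev = calloc(1, event_size(datalen));
	if (!ev)
		return NULL;

	ev->datalen = datalen;	/* set the counter first ... */
	ev->code = code;
	if (datalen)
		memcpy(ev->data, data, datalen);	/* ... then touch the array */
	return ev;
}
```

Swapping the counter assignment and the memcpy() would still compile, but it would reproduce the very pattern the patch above had to fix.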

In the kernel, we’ve been learning from our mistakes and have fixed some buggy annotations we made in the beginning. Here are a couple of bugfixes to make you aware of these issues:

  • 6dc445c19050 (“clk: bcm: rpi: Assign ->num before accessing…”)
  • 9368cdf90f52 (“clk: bcm: dvp: Assign ->num before accessing…”)

Another common issue is when the counter is updated inside a loop. See the patch below.

diff --git a/drivers/net/wireless/ath/wil6210/cfg80211.c b/drivers/net/wireless/ath/wil6210/cfg80211.c
index 8993028709ecfb..e8f1d30a8d73c5 100644
--- a/drivers/net/wireless/ath/wil6210/cfg80211.c
+++ b/drivers/net/wireless/ath/wil6210/cfg80211.c
@@ -892,10 +892,8 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 	struct wil6210_priv *wil = wiphy_to_wil(wiphy);
 	struct wireless_dev *wdev = request->wdev;
 	struct wil6210_vif *vif = wdev_to_vif(wil, wdev);
-	struct {
-		struct wmi_start_scan_cmd cmd;
-		u16 chnl[4];
-	} __packed cmd;
+	DEFINE_FLEX(struct wmi_start_scan_cmd, cmd,
+		    channel_list, num_channels, 4);
 	uint i, n;
 	int rc;
 
@@ -977,9 +975,8 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 	vif->scan_request = request;
 	mod_timer(&vif->scan_timer, jiffies + WIL6210_SCAN_TO);
 
-	memset(&cmd, 0, sizeof(cmd));
-	cmd.cmd.scan_type = WMI_ACTIVE_SCAN;
-	cmd.cmd.num_channels = 0;
+	cmd->scan_type = WMI_ACTIVE_SCAN;
+	cmd->num_channels = 0;
 	n = min(request->n_channels, 4U);
 	for (i = 0; i < n; i++) {
 		int ch = request->channels[i]->hw_value;
@@ -991,7 +988,8 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 			continue;
 		}
 		/* 0-based channel indexes */
-		cmd.cmd.channel_list[cmd.cmd.num_channels++].channel = ch - 1;
+		cmd->num_channels++;
+		cmd->channel_list[cmd->num_channels - 1].channel = ch - 1;
 		wil_dbg_misc(wil, "Scan for ch %d  : %d MHz\n", ch,
 			     request->channels[i]->center_freq);
 	}
@@ -1007,16 +1005,15 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 	if (rc)
 		goto out_restore;
 
-	if (wil->discovery_mode && cmd.cmd.scan_type == WMI_ACTIVE_SCAN) {
-		cmd.cmd.discovery_mode = 1;
+	if (wil->discovery_mode && cmd->scan_type == WMI_ACTIVE_SCAN) {
+		cmd->discovery_mode = 1;
 		wil_dbg_misc(wil, "active scan with discovery_mode=1\n");
 	}
 
 	if (vif->mid == 0)
 		wil->radio_wdev = wdev;
 	rc = wmi_send(wil, WMI_START_SCAN_CMDID, vif->mid,
-		      &cmd, sizeof(cmd.cmd) +
-		      cmd.cmd.num_channels * sizeof(cmd.cmd.channel_list[0]));
+		      cmd, struct_size(cmd, channel_list, cmd->num_channels));
 
 out_restore:
 	if (rc) {
diff --git a/drivers/net/wireless/ath/wil6210/wmi.h b/drivers/net/wireless/ath/wil6210/wmi.h
index 71bf2ae27a984f..b47606d9068c8b 100644
--- a/drivers/net/wireless/ath/wil6210/wmi.h
+++ b/drivers/net/wireless/ath/wil6210/wmi.h
@@ -474,7 +474,7 @@ struct wmi_start_scan_cmd {
 	struct {
 		u8 channel;
 		u8 reserved;
-	} channel_list[];
+	} channel_list[] __counted_by(num_channels);
 } __packed;
 
 #define WMI_MAX_PNO_SSID_NUM	(16)
  • 34c34c242a1b (“wifi: wil6210: cfg80211: Use __counted_by…”)

The patch above does a bit more than merely annotating the flexible array with the __counted_by() macro, but that’s material for a future post. For now, let’s focus on the following excerpt.

-	cmd.cmd.scan_type = WMI_ACTIVE_SCAN;
-	cmd.cmd.num_channels = 0;
+	cmd->scan_type = WMI_ACTIVE_SCAN;
+	cmd->num_channels = 0;
 	n = min(request->n_channels, 4U);
 	for (i = 0; i < n; i++) {
 		int ch = request->channels[i]->hw_value;
@@ -991,7 +988,8 @@ static int wil_cfg80211_scan(struct wiphy *wiphy,
 			continue;
 		}
 		/* 0-based channel indexes */
-		cmd.cmd.channel_list[cmd.cmd.num_channels++].channel = ch - 1;
+		cmd->num_channels++;
+		cmd->channel_list[cmd->num_channels - 1].channel = ch - 1;
 		wil_dbg_misc(wil, "Scan for ch %d  : %d MHz\n", ch,
 			     request->channels[i]->center_freq);
 	}
 ...
--- a/drivers/net/wireless/ath/wil6210/wmi.h
+++ b/drivers/net/wireless/ath/wil6210/wmi.h
@@ -474,7 +474,7 @@ struct wmi_start_scan_cmd {
 	struct {
 		u8 channel;
 		u8 reserved;
-	} channel_list[];
+	} channel_list[] __counted_by(num_channels);
 } __packed;

Notice that in this case, num_channels is our counter, and it’s set to zero before the for loop. Inside the for loop, the original code used this variable as an index to access the flexible array, then updated it via a post-increment, all in one line: cmd.cmd.channel_list[cmd.cmd.num_channels++]. The issue is that once channel_list was annotated with the __counted_by() macro, the compiler enforces dynamic array indexing of channel_list to stay below num_channels. Since num_channels holds a value of zero at the moment of the array access, this leads to undefined behavior and may trigger a compiler warning.

As shown in the patch, the solution is to increment num_channels before accessing the array, and then access the array using num_channels - 1 as the index, which is always below the counter.
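Here is a small user-space sketch of that increment-then-index pattern. The types and the scan_fill() helper are hypothetical and only loosely follow the shape of the wil6210 code; the caller is assumed to have allocated room for max entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space macro, modeled on the kernel's __counted_by(). */
#if defined(__has_attribute)
# if __has_attribute(__counted_by__)
#  define COUNTED_BY(member) __attribute__((__counted_by__(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct channel {
	uint8_t channel;
	uint8_t reserved;
};

struct scan_cmd {
	uint8_t num_channels;					/* the counter */
	struct channel channel_list[] COUNTED_BY(num_channels);
};

/*
 * Store up to max non-zero hardware values as 0-based channel indexes.
 * The counter is bumped first, so the index used on the next line is
 * always strictly below it.
 */
static void scan_fill(struct scan_cmd *cmd, const int *hw_values,
		      size_t n, size_t max)
{
	size_t i;

	assert(max <= UINT8_MAX);
	cmd->num_channels = 0;
	for (i = 0; i < n && cmd->num_channels < max; i++) {
		if (hw_values[i] == 0)
			continue;	/* skip invalid entries */
		cmd->num_channels++;
		cmd->channel_list[cmd->num_channels - 1].channel =
			(uint8_t)(hw_values[i] - 1);
	}
}
```

Given the input values { 1, 0, 6, 11 }, the zero entry is skipped, the counter ends up at 3, and the stored channel indexes are 0, 5 and 10.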

Another option is to avoid using the counter as an index for the flexible array altogether. This can be done by using an auxiliary variable instead. See an excerpt of a patch below.

diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index 38eb7ec86a1a65..21ebd70f3dcc97 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -2143,7 +2143,7 @@ struct hci_cp_le_set_cig_params {
 	__le16  c_latency;
 	__le16  p_latency;
 	__u8    num_cis;
-	struct hci_cis_params cis[];
+	struct hci_cis_params cis[] __counted_by(num_cis);
 } __packed;

@@ -1722,34 +1717,33 @@ static int hci_le_create_big(struct hci_conn *conn, struct bt_iso_qos *qos)
 
 static int set_cig_params_sync(struct hci_dev *hdev, void *data)
 {
 ...

+	u8 aux_num_cis = 0;
 	u8 cis_id;
 ...

 	for (cis_id = 0x00; cis_id < 0xf0 &&
-	     pdu.cp.num_cis < ARRAY_SIZE(pdu.cis); cis_id++) {
+	     aux_num_cis < pdu->num_cis; cis_id++) {
 		struct hci_cis_params *cis;
 
 		conn = hci_conn_hash_lookup_cis(hdev, NULL, 0, cig_id, cis_id);
@@ -1758,7 +1752,7 @@ static int set_cig_params_sync(struct hci_dev *hdev, void *data)
 
 		qos = &conn->iso_qos;
 
-		cis = &pdu.cis[pdu.cp.num_cis++];
+		cis = &pdu->cis[aux_num_cis++];
 		cis->cis_id = cis_id;
 		cis->c_sdu  = cpu_to_le16(conn->iso_qos.ucast.out.sdu);
 		cis->p_sdu  = cpu_to_le16(conn->iso_qos.ucast.in.sdu);
@@ -1769,14 +1763,14 @@ static int set_cig_params_sync(struct hci_dev *hdev, void *data)
 		cis->c_rtn  = qos->ucast.out.rtn;
 		cis->p_rtn  = qos->ucast.in.rtn;
 	}
+	pdu->num_cis = aux_num_cis;
 
 ...
  • ea9e148c803b (“Bluetooth: hci_conn: Use __counted_by() and…”)

Again, the entire patch does more than merely annotate the flexible-array member, but let’s just focus on how aux_num_cis is used to access flexible array pdu->cis[].

In this case, the counter is num_cis. As in our previous example, originally, the counter is used to directly access the flexible array: &pdu.cis[pdu.cp.num_cis++]. However, the patch above introduces a new variable aux_num_cis to be used instead of the counter: &pdu->cis[aux_num_cis++]. The counter is then updated after the loop: pdu->num_cis = aux_num_cis.
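The auxiliary-variable approach can be sketched in user space as follows. All names here are hypothetical. As the loop condition in the patch suggests (aux_num_cis < pdu->num_cis), the counter holds the capacity of the array while it is being filled, and is only lowered to the number of entries actually used once the loop is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space macro, modeled on the kernel's __counted_by(). */
#if defined(__has_attribute)
# if __has_attribute(__counted_by__)
#  define COUNTED_BY(member) __attribute__((__counted_by__(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct cis_params {
	uint8_t cis_id;
};

struct cig_params {
	uint8_t num_cis;				/* the counter */
	struct cis_params cis[] COUNTED_BY(num_cis);
};

/* Allocate the struct with the counter set to the full capacity. */
static struct cig_params *cig_alloc(uint8_t capacity)
{
	struct cig_params *p;

	p = calloc(1, sizeof(*p) + capacity * sizeof(p->cis[0]));
	if (p)
		p->num_cis = capacity;
	return p;
}

/* Record every even id below limit, indexing with an auxiliary variable. */
static void cig_fill(struct cig_params *p, uint8_t limit)
{
	uint8_t aux_num_cis = 0;
	uint8_t id;

	assert(p != NULL);
	for (id = 0; id < limit && aux_num_cis < p->num_cis; id++) {
		if (id % 2)
			continue;	/* stand-in for "no matching connection" */
		p->cis[aux_num_cis++].cis_id = id;
	}
	/* Only now shrink the counter to the number of entries in use. */
	p->num_cis = aux_num_cis;
}
```

With a capacity of 4 and a limit of 5, the ids 0, 2 and 4 are stored and the counter ends up at 3.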

Both solutions are acceptable, so use whichever is convenient for you. 🙂

Here, you can see a recent bugfix for some buggy annotations that missed the details discussed above:

  • [PATCH] wifi: iwlwifi: mvm: Fix __counted_by usage in cfg80211_wowlan_nd_*

In a future post, I’ll address the issue of annotating flexible arrays of flexible structures. Spoiler alert: don’t do it!

Back to Europe to present at Open Source Summit

Happy to share that I will be traveling back to Europe in September to speak at the Open Source Summit Europe 2024 in Vienna. ✈️🇦🇹🗣️🎙️ I will also attend both Linux Security Summit and Linux Plumbers. 🧑🏽‍💻🐧 I hope to meet with a lot of friends that I haven’t seen in a while. Feel free to say hi if you see me around. 🙂

My talk will be about the work we’ve been doing in the Kernel Self-Protection Project over the last 5 years to harden the upstream Linux kernel, particularly focusing on spatial safety related to array-bounds checking. ⚔ 🛡 🐧 You can see the description below.

Challenges and Innovations Towards Spatial Safety in the Linux Kernel

The first flexible-array transformation we implemented in the kernel, as part of the Kernel Self-Protection Project, took place back in March 2019. At the time, our work on preventing integer overflows during memory allocations led us to discover an 8-year-old bug. Addressing this bug not only resolved a longstanding issue but also initiated the work of flexible-array transformations across the whole kernel tree.

This marked the beginning of a challenging yet rewarding journey to add bounds-checking on trailing arrays in the Linux kernel. Five years have passed since then, and we’ve come a long way. We now have new Clang and GCC hardening compiler options and attributes that significantly improve the security of the Linux kernel, particularly in the spatial-safety area. We have new hardening helpers that make traditional methods less prone to error.

In general, we have new and safer ways of doing things, which usually require a learning curve, even for seasoned kernel developers. In this talk, we will walk through the most recent challenges and history of our quest to improve spatial safety in the Linux kernel, and with that, get rid of out-of-bounds bugs once and for all.

https://osseu2024.sched.com/event/1ej2k/challenges-and-innovations-towards-spatial-safety-in-the-linux-kernel-gustavo-a-r-silva-the-linux-foundation

I will start by explaining basic technical concepts and then move up to bleeding-edge kernel hardening. Whether you’re an advanced kernel developer or just starting to delve into the world of Linux kernel development, I’m sure you’ll find this presentation interesting and educational. 📖 I really hope to see many of you there. 🙂

You can see the entire schedule here: https://osseu2024.sched.com/

Kernel Self-Protection Project ⚔ 🛡 🐧

Google Open Source Peer Bonus Award

In other news from November, I want to share that I’m thrilled to be the recipient of this award from Google for the first time. I feel really grateful and honored! 🙂🙏🏽

This comes as a result of my contributions to the Linux kernel over the years.

Honestly, I didn’t even know about the existence of this award until I received an email from someone at Google informing me about it. However, learning about it made me feel really great!

My appreciation goes out to my teammates in the Kernel Self-Protection Project, especially to Kees Cook, who has been an invaluable mentor to me over the years. Special thanks to Greg Kroah-Hartman as well, who was instrumental in setting me on my journey as a Linux kernel developer. 👨🏽‍💻🐧

Influencing Software Security: The Impact of the Kernel Self-Protection Project ⚔️🛡️🐧

Compiler Options Hardening Guide

On November 29th, the Open Source Security Foundation (OpenSSF) released a comprehensive and thorough hardening guide aimed at mitigating potential vulnerabilities in C and C++ code through the use of various hardening compiler options.

This guide references some of the work we’ve accomplished over the years in the Kernel Self-Protection Project (KSPP), particularly our efforts to globally enable -Wimplicit-fallthrough and -fstrict-flex-arrays=3 in the upstream Linux kernel. 🐧

-Wimplicit-fallthrough

This warning flag warns when a fallthrough occurs unless it is specially marked as being intended. The Linux kernel project uses this flag; it led to the discovery and fixing of many bugs.
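As a rough illustration of what "specially marked" means, the sketch below uses a hypothetical FALLTHROUGH macro that expands to the compiler's fallthrough attribute when it is available. The roles and permission bits are invented for the example. Removing either FALLTHROUGH line would make a build with -Wimplicit-fallthrough warn about that case.

```c
#include <assert.h>

/* Hypothetical macro: expands to the fallthrough attribute if supported. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define FALLTHROUGH __attribute__((__fallthrough__))
# endif
#endif
#ifndef FALLTHROUGH
# define FALLTHROUGH do {} while (0)	/* fallthrough */
#endif

enum role { ROLE_VIEWER, ROLE_EDITOR, ROLE_ADMIN };
enum { PERM_READ = 1, PERM_WRITE = 2, PERM_DELETE = 4 };

/* Each role gets its own permission plus those of the roles below it. */
static unsigned int perms_for(enum role role)
{
	unsigned int perms = 0;

	switch (role) {
	case ROLE_ADMIN:
		perms |= PERM_DELETE;
		FALLTHROUGH;	/* intended: admins can also write */
	case ROLE_EDITOR:
		perms |= PERM_WRITE;
		FALLTHROUGH;	/* intended: editors can also read */
	case ROLE_VIEWER:
		perms |= PERM_READ;
		break;
	default:
		assert(0);	/* unknown role */
		break;
	}
	return perms;
}
```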

-fstrict-flex-arrays=3

In this guide we recommend using the standard C99 flexible array notation [] instead of non-standard [0] or misleading [1], and then using -fstrict-flex-arrays=3 to improve bounds checking in such cases. In this case, code that uses [0] for a flexible array will need to be modified to use [] instead. Code that uses [1] for a flexible array needs to be modified to use [] and also extensively modified to eliminate off-by-one errors. Using [1] is not just misleading, it’s error-prone; beware that existing code using [1] to indicate a flexible array may currently have off-by-one errors.
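The off-by-one risk mentioned in the guide is easy to see in a size computation. The following sketch uses made-up struct and function names: with a [1] array, one element is already counted by sizeof, so code has to remember to subtract it, whereas a C99 flexible-array member needs no adjustment.

```c
#include <assert.h>
#include <stddef.h>

/* Legacy "fake" flexible array: one element is already part of the struct. */
struct fake_flex {
	size_t n;
	int items[1];
};

/* C99 flexible-array member: contributes no elements to sizeof. */
struct real_flex {
	size_t n;
	int items[];
};

/* Common mistake: forgets that items[1] already holds one element. */
static size_t fake_size_naive(size_t n)
{
	return sizeof(struct fake_flex) + n * sizeof(int);
}

/* What legacy code has to write instead (requires n >= 1). */
static size_t fake_size_adjusted(size_t n)
{
	assert(n >= 1);
	return sizeof(struct fake_flex) + (n - 1) * sizeof(int);
}

/* With a real flexible array, the obvious formula is the right one. */
static size_t real_size(size_t n)
{
	return sizeof(struct real_flex) + n * sizeof(int);
}
```

The naive and adjusted computations for the [1] variant always differ by exactly one element, which is the kind of discrepancy that has to be audited when converting such code to [].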

GCC hardening features

The work of Qing Zhao is also referenced in the guide. Qing is making significant contributions to the KSPP by implementing hardening features in GCC, which we want to adopt in the Linux kernel.

Beyond the Linux kernel

In conclusion, it’s quite fulfilling to see the hardening work we undertake in the Kernel Self-Protection Project having a significant influence in the world of software security, beyond the Linux kernel. 🙂

November 2023 – Linux Kernel work

-Wstringop-overflow

Late in October I sent a patch to globally enable the -Wstringop-overflow compiler option, which finally landed in linux-next on November 28th. It’s expected to be merged into mainline during the next merge window, likely in the last couple of weeks of December, but “We’ll see”. I plan to send a pull request for this to Linus when the time is right. 🙂

I’ll write more about the challenges of enabling this compiler option once it’s included in 6.8-rc1, early next year. In the meantime, it’s worth mentioning that several people, including Kees Cook, Arnd Bergmann, and myself, have sent patches to fix -Wstringop-overflow warnings over the past few years.

Below are the patches that address the last warnings, together with the couple of patches that enable the option in the kernel. The first of them enables the option globally for all versions of GCC. However, -Wstringop-overflow is buggy in GCC-11. Therefore, I wrote a second patch adding this option under a new configuration option, CC_STRINGOP_OVERFLOW, in init/Kconfig, which is enabled by default for all versions of GCC except GCC-11. To handle the GCC-11 case I added another configuration option, GCC11_NO_STRINGOP_OVERFLOW, which disables -Wstringop-overflow by default for GCC-11 only.

Boot crash on ARM64

Another relevant task I worked on recently was debugging and fixing a boot crash on ARM64, reported by Joey Gouly. This issue was interesting as it related to some long-term work in the Kernel Self-Protection Project (KSPP), particularly our efforts to transform “fake” flexible arrays into C99 flexible-array members. In short, there was a zero-length fake flexible array at the end of a structure annotated with the __randomize_layout attribute, which needed to be transformed into a C99 flexible-array member.

This becomes problematic due to how compilers previously treated such arrays before the introduction of -fstrict-flex-arrays=3. The randstruct GCC plugin treated these arrays as actual flexible arrays, thus leaving their memory layout untouched when the kernel is built with CONFIG_RANDSTRUCT. However, after commit 1ee60356c2dc (‘gcc-plugins: randstruct: Only warn about true flexible arrays’), this behavior changed. Fake flexible arrays were no longer treated the same as proper C99 flexible-array members, leading to randomized memory layout for these arrays in structures annotated with __randomize_layout, which was the root cause of the boot crash.

To address this, I sent two patches. The first patch is the actual bugfix, which includes the flexible-array transformation. The second patch is complementary to commit 1ee60356c2dc, updating a code comment to clarify that “we don’t randomize the layout of the last element of a struct if it’s a proper flexible array.”

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 07022bb0d44d..0d28172193fa 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -162,7 +162,7 @@ struct neighbour {
 	struct rcu_head		rcu;
 	struct net_device	*dev;
 	netdevice_tracker	dev_tracker;
-	u8			primary_key[0];
+	u8			primary_key[];
 } __randomize_layout;
 
 struct neigh_ops {
diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c b/scripts/gcc-plugins/randomize_layout_plugin.c
index 910bd21d08f4..746ff2d272f2 100644
--- a/scripts/gcc-plugins/randomize_layout_plugin.c
+++ b/scripts/gcc-plugins/randomize_layout_plugin.c
@@ -339,8 +339,7 @@ static int relayout_struct(tree type)
 
 	/*
 	 * enforce that we don't randomize the layout of the last
-	 * element of a struct if it's a 0 or 1-length array
-	 * or a proper flexible array
+	 * element of a struct if it's a proper flexible array
 	 */
 	if (is_flexible_array(newtree[num_fields - 1])) {
 		has_flexarray = true;

These two patches will soon be backported to a couple of -stable trees.

-Wflex-array-member-not-at-end

During my last presentation at Kernel Recipes in September this year, I talked a bit about -Wflex-array-member-not-at-end, which is a compiler option currently under development for GCC-14.
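For readers who have not seen this kind of layout before, here is a minimal sketch of a flexible-array member that is not at the end of its enclosing struct. The struct names are invented, and the layout claim in the comments holds on common ABIs with GCC and Clang but is not guaranteed by the C standard. Embedding a flexible structure in the middle of another struct is a GNU extension, so some compilers will warn about this code even without the new option.

```c
#include <assert.h>
#include <stddef.h>

struct msg_hdr {
	int len;
	int payload[];		/* flexible-array member */
};

struct msg {
	struct msg_hdr hdr;	/* flexible structure NOT at the end */
	int status;		/* lives where payload[0] would be */
};

/* Offset, within struct msg, at which hdr.payload[0] would be stored. */
static size_t payload_offset_in_msg(void)
{
	return offsetof(struct msg, hdr) + offsetof(struct msg_hdr, payload);
}
```

On typical targets payload_offset_in_msg() equals offsetof(struct msg, status), which means that any write to hdr.payload[0] would silently overwrite status.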

One of the highlights of the talk was a 6-year-old bug that I initially uncovered through grepping, and later, while reviewing some build logs from previous months, I realized that -Wflex-array-member-not-at-end had also detected this problem:

This bugfix was backported to 6.5.7, 6.1.57, 5.15.135, 5.10.198, 5.4.258 and 4.19.296 stable kernels.

Encouraged by this discovery, I started hunting for more similar bugs. My efforts led to fixing a couple more:

On November 28th, these two bugfixes were successfully backported to multiple stable kernel trees. The first fix was applied to the 6.6.3, 6.5.13, 6.1.64 stable kernels. The second fix was also applied to these, along with the 5.15.140 stable kernel.

I will have a lot of fun with -Wflex-array-member-not-at-end next year. 😄

-Warray-bounds

In addition to these tasks, I continued addressing -Warray-bounds issues. Below are some of the patches I sent for this.

Patch review and ACKs

I’ve also been involved in patch review and providing ACKs. Kees Cook, for instance, has been actively annotating flexible-array members with the __counted_by attribute, and I’ve been reviewing those patches.

Google Open Source Peer Bonus Award

In other news from November, I want to share that I’m thrilled to be the recipient of this award from Google for the first time. I feel really grateful and honored! 🙂🙏🏽

This comes as a result of my contributions to the Linux kernel over the years.

Honestly, I didn’t even know about the existence of this award until I received an email from someone at Google informing me about it. However, learning about it made me feel really great!

My appreciation goes out to my teammates in the Kernel Self-Protection Project, especially to Kees Cook, who has been an invaluable mentor to me over the years. Special thanks to Greg Kroah-Hartman as well, who was instrumental in setting me on my journey as a Linux kernel developer.👨🏽‍💻🐧

Acknowledgements

Special thanks to The Linux Foundation and Google for supporting my Linux kernel work. 🙂

Dusting off this blog

This weekend I learned that Jerry Cooperstein has retired, and while dusting off my personal blog (I will be posting regularly next year – once a month, actually), I ran into my “LFD420 Linux Kernel Internals and Development” certificate of completion, which was signed by Jerry back in May 2016.

In December of that year, I quit my job at a consulting company to pursue my dream of becoming a professional Open Source Developer. Then, exactly one year after completing my Linux Foundation training (at The Linux Foundation Training and Certification), in May 2017, I began my career in Open Source. 😀

2024 will mark my 8th consecutive year working as a professional Upstream Linux Kernel Engineer, and I feel so grateful for all the things that have happened over the years. 🐧

Thanks to all the people who have dedicated their careers to making the dream of Open Source possible. It’s really a great community, full of people who truly make a difference in the lives of billions of people around the world.

I’m so honored to be part of this family. 🙂

Two new KoC labs in Mexico

It took a while, but here is a report of the last KoC trip to Oaxaca, Mexico. Recently, some of us have been very busy due to exciting professional changes, but it is important to let people know that we continue working towards bringing Free Open Source Software, Computers and Education to underprivileged communities in diverse areas.

After having a blast at the last Southern California Linux Expo, the Kids on Computers crew traveled to Oaxaca, Mexico with the mission of setting up two new labs, which had been previously approved by the board of directors.

Before going into details about the trip, I have to say that many people are involved in the planning process of every KoC trip and, for a number of reasons, many of them are not able to get on a plane, visit the schools and meet the communities that benefit from their hard work. I want to express my gratitude and appreciation to all of them.

So this time was very special. We planned to set up two new labs near Oaxaca city and to visit most of the labs we have in both the Huajuapan and Monte Albán areas (the latter is 20 minutes from downtown Oaxaca). All this within a week and with fewer than 10 volunteers; actually, we were only 8 this time. To accomplish this mission we had to come up with a different plan compared to other years. We split into two groups: the Laptop (LPT) team, which was in charge of setting up a lab at the Antequera school, and the Raspberry Pi (RPI) team, which was in charge of setting up the lab at the Constitución de 1917 school. Yeah, we set up our second Raspberry Pi lab in Oaxaca! (You can learn more about the first one here 🙂 )

Below you can see the people in each group and our schedule:

Group LPT (OAX / Huajuapan)

  • George
  • Avni
  • Hermes
  • Thomas
  • Adam (Thurs-Friday)

Group RPI (OAX)

  • Gustavo
  • Tim
  • Peter (Mon-Tues)
  • Adam (Mon-Wed)

Monday (Apr 24) – Day 1

  • Group LPT – OAX

    • Morning / Early afternoon:

      • New Lab: Antequera

        • 2 HP laptops installation + 1 server  (HP Laptop) + 1 projector + 1 router.

        • Upgrade 13 Windows systems. (Install Ubermix)

        • Installation on old machines – minimum 2GB RAM, minimum 100GB HD

    • Late afternoon:

  • Group RPI – OAX

    • New Lab: Constitución de 1917

      • 15 Raspberry Pi clients + 1 server (RPI) + 1 projector

Tuesday (Apr 25) –

  • Group LPT and RPI – OAX (we may split up in the afternoon)

    • Morning:

      • Meet 7:45am @ Aurora

      • José Vasconcelos – Raspberry Pi Lab

      • Antequera

        • 1 hour class for students

        • 1 hour class for teachers

        • Triage the remaining 6 broken computers

        • Install client on teacher’s computer

        • Test projector

    • Afternoon:

      • Benito Juárez

        • Upgraded 6 laptops Monday

        • Other systems

          • 4 of them 512 MB RAM

          • 1 Acer Laptop has 1GB of RAM

          • 1 Dell laptop has 1GB of RAM

          • 1 356 MB Laptop

          • 2 microSD cards

Wednesday (Apr 26) –

  • Group LPT – Huajuapan – leave on 6AM van

  • Group RPI – OAX

    • New lab: Constitución de 1917

Thursday (Apr 27) –

  • Group LPT – Huajuapan

    • Acatlima brief visit to view progress of building

    • UTM

    • Saucitlán de Morelos

  • Group RPI – OAX

    • New lab: Antequera

Friday (Apr 28) –

  • Group LPT – OAX

  • Group RPI – OAX

    • Constitución de 1917

César, one of the volunteers who could not join us on this trip, prepared Ubermix 3 for the laptops and Raspbian for the Pi’s prior to the trip. It worked very well. Ubermix 3 was much more responsive than Ubermix 2 on older laptops.

Some microSD card issues

We ran into some problems with one type of microSD card we wanted to use for the Pi’s. It turned out that some 16GB microSD cards didn’t have enough space for the Raspbian image that César prepared.

Tim and I ended up walking all around Oaxaca city, trying to find the right one to use.

Below you can see some photos we took in Tim’s room at hotel Aurora in downtown Oaxaca, while we were trying out different microSD cards.

Until we found it!

The RPI servers Adam, Tim and George prepared prior to the trip worked out of the box!

The Antequera School

The LPT team did a remarkable job getting rid of Windows and installing Ubermix 3 on 13 desktops at the Antequera school. This lab was already operational, but they wanted some extra computers. So we gave them some laptops, cast a spell and turned the whole lab over to Open Source. 😀

The kids learned the importance of testing. 😛

Internet in a Box working just fine:

Constitución de 1917 school

This school is located in San Javier, Xoxocotlán, Oaxaca. The map below gives you an idea of the distance between our first Raspberry Pi lab and this new one. They are actually pretty close, and we consider this a great advantage both for the communities, which can interact and support each other, and for us, since we don’t have to travel long distances to visit the labs when we are in town. You can also see two stars on the right side of the map. Those are two more KoC labs. The one on the top is Emiliano Zapata elementary school (our first lab in the Monte Albán area, near Oaxaca city) and the one on the bottom is Benito Juárez.

san_javier

Follow this link to interact with the map and get familiar with the neighborhood.

We found out this school has a very nice library:

Juan Villoro is my favorite Mexican writer and, in my opinion, one of the best ones. El libro salvaje (The wild book) is a story about a book that doesn’t want to be read, which is a pretty clever way of tricking kids into reading. I was very happy and amazed to find this book in the school library.

This lab was a little bit challenging to set up. First, as I wrote above, we had to find the right microSD cards to use, then we had some problems with the resolution of the monitors, so we had to reconfigure the Pi’s manually.

We bought the monitors locally, and it turned out that some of the mounting screws were not the proper ones. The school personnel were very supportive, and one of them went all the way to downtown Oaxaca and got us the right ones. Below you can see Peter working on the monitors.

Then we had to figure out how to hook the Pi’s to the monitors.

As you can see below, in the end everything worked out fine thanks to a very determined group of volunteers. We made our way out of the school around 10PM, walking in heavy rain, but with the priceless satisfaction of having set up a very nice and completely functional Raspberry Pi lab (with Pi 3s), with Internet in a Box running on a Raspberry Pi 3, plus a projector.

In a future post I will write about our visits to some of the other labs we have in Oaxaca. It’s been very exciting to write this post and recall the things we experienced and went through during that week.

The prequel – Two days before Day 1

Special kudos go to Adam, who on his arrival in Mexico City spent the whole day at the Technology Plaza purchasing projectors, keyboards, mice, headphones and regulators. He compared prices looking for the best deals, trying to make the most out of our funds, and searched the whole plaza for Raspberry Pi cases. Adam ended up carrying 11 bags to the bus station, between the purchased equipment and the peripherals he was already carrying with him. He left Mexico City and arrived in Oaxaca city at 2:30AM the next day. Thomas, Peter and Avni picked him up in 2 taxis.

It would have been very complicated, if not impossible, to set up our 2 new labs without Adam’s great spirit and remarkable work. Thank you, Adam!

Some words on the trip by KoC President Avni Khatri:

“To say the least, our KOC trips are never without an adventure – there are so many lessons learned on this trip. Many thanks to everyone here and at home for your hard work and dedication.  It is very much appreciated and is allowing us to achieve our goal to provide access to technology to kids in underserved areas.”

Thank you to everyone who contributed to making this trip a success. We want to go back to Mexico this November, so stay tuned if you want to join us.

We are currently raising funds for two new labs. Please help us make it a reality by donating here. Every little bit helps! 🙂

My Linux Kernel activities in May-July 2017

Hello everybody,

During this time I’ve managed to fix 151 issues in the Linux kernel. I got 16 patches upstream in May, 29 in June and 106 more in July.

The following is a list of the top ten Linux kernel developers over the last four months. I’ve managed to make it to the top three thanks to my recent contributions 🙂 :
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/stats

The following file contains detailed information regarding the categories and types of bugs I’ve resolved, as well as the software components in which they were detected:
https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/2017/reports/mayjunjul_detailed

As a result, I’ve managed to contribute to the following subsystems and architectures during this time:

Below are more links to my contributions upstream during this time:

https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/2017/reports/mayjunjul_commits.log
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/?qt=author&q=Gustavo+A.+R.+Silva

Also, during the last week I’ve been working with Julia Lawall on a Coccinelle script that is pending upstream at the moment. This script will help kernel developers reduce code size and improve maintainability in cases where the lifetime of a variable doesn’t need to be extended beyond its scope. See the example below:
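A minimal sketch of the pattern in question, assuming a hypothetical foo() with a local variable var and a made-up helper compute(). The second function shows what the code could look like once the qualifier is dropped:

```c
/* Hypothetical helper standing in for any call that yields a value. */
static int compute(int x)
{
	return x * 2;
}

/* Before: 'static' keeps var alive between calls for no benefit. */
int foo(int x)
{
	static int var;

	var = compute(x);	/* always assigned before it is read */
	return var + 1;
}

/* After: a plain automatic variable behaves identically here. */
int foo_fixed(int x)
{
	int var;

	var = compute(x);
	return var + 1;
}
```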

In the code above, the static on local variable var is unnecessary because the variable is always initialized before it can be used. So there is no need to extend the lifetime of the variable beyond its scope, which in this case is the foo() function.

I’ve managed to identify and fix more than 10 such cases during the last month, and there are currently around 60 more in the latest linux-next tree.

Similar cases are expected to emerge in the Linux kernel in the future, as they can be easily introduced during code refactoring or maintenance.

The Coccinelle script I’ve been working on is intended to detect and fix those cases. Follow the link for more details: https://lkml.org/lkml/2017/8/1/34

Special thanks to The Linux Foundation’s Core Infrastructure Initiative for supporting my work. 🙂

Gustavo A. R. Silva