One major difference that will be explored is the safety guarantees of each method. If you don't know what you're doing when writing a kernel module, you can completely crash your system. With eBPF it is the other way around: you need to know what you're doing to get an eBPF program to do any real damage[^1].
eBPF programs and kernel modules are all around us. Here are some real-world examples that have components that are either eBPF or kernel modules:
eBPF is primarily intended for Security, Observability, and Networking. Some examples of what an eBPF program can do include, but are not limited to: overriding parts of the Linux networking stack, making decisions about a network packet before it even reaches the kernel network stack, and keeping track of system calls made by other programs.
Kernel modules are much broader in scope than eBPF. They can do almost anything an eBPF program can do and more. Two domains that Linux kernel modules can cover and eBPF cannot are device drivers and filesystems. One reason is that these use cases require direct hardware access.
There is one situation I have found where a kernel module cannot do what an eBPF program can do. Some modern network interface cards (NICs), such as SmartNIC-based ones, support running an eBPF program type called XDP that can process packets as soon as the NIC receives them, before they reach the kernel network stack. If the NIC does not support running XDP directly, XDP programs can still run on the CPU and still bypass the Linux network stack for high-performance processing. To write non-eBPF code that runs on a SmartNIC, a vendor SDK and toolchain would be required, for example the Netronome Agilio Dev Kit. This is not done in a kernel module; the code is compiled to run directly on the NIC. In the other situation, where programs bypass the Linux kernel network stack but do not run on the NIC itself, something like the Data Plane Development Kit (DPDK) would be useful. This is not a kernel module either, as it runs in user space.
With all this being said, the scope of eBPF might expand in the future. In Liz Rice's book Learning eBPF, she says that work is currently being done to add hardware device support to eBPF.
The following is a simplified explanation.
When a kernel module is loaded into the kernel, the kernel allocates memory for the module to execute, gives it all the privileges kernel code would get, and keeps a list tracking all kernel module names and locations so that the kernel can initialize and clean up modules on user request. Kernel module code runs directly on the CPU, like the kernel does.
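The list of loaded modules described above is visible from user space through `/proc/modules`. The following is a minimal sketch, assuming the usual six-column layout of that file (name, size, reference count, dependent modules, state, load address); the `LoadedModule` record and the function names are made up for illustration.

```python
from typing import NamedTuple, Optional


class LoadedModule(NamedTuple):
    name: str
    size: int                 # memory used by the module, in bytes
    refcount: Optional[int]   # current users; None if the kernel prints "-"
    used_by: list             # modules that depend on this one
    state: str                # e.g. "Live", "Loading", "Unloading"
    address: str              # load address (shown as zeros to non-root users)


def parse_proc_modules(text):
    """Parse the contents of /proc/modules into LoadedModule records."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or unexpected lines
        # Anything after the sixth column (e.g. taint flags) is ignored.
        name, size, refcount, deps, state, address = fields[:6]
        used_by = [] if deps == "-" else [d for d in deps.split(",") if d]
        modules.append(LoadedModule(
            name,
            int(size),
            int(refcount) if refcount.isdigit() else None,
            used_by,
            state,
            address,
        ))
    return modules


def loaded_modules(path="/proc/modules"):
    """Read and parse the live module list from a Linux system."""
    with open(path) as f:
        return parse_proc_modules(f.read())
```

On a Linux machine, `loaded_modules()` returns one record per module currently tracked by the kernel, which is the same information `lsmod` prints.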
eBPF programs, on the other hand, run as bytecode inside the eBPF VM, which is a sandboxed subsystem of the Linux kernel. On most systems the bytecode is Just-in-Time (JIT) compiled into native machine code when the program is loaded, after it has passed verification; otherwise it is interpreted. This is similar to how Java bytecode runs on the JVM. The eBPF Linux subsystem will also keep track of running programs.
eBPF programs can be written in any language that has a compiler that can output eBPF bytecode. At the time of writing, those languages are C, Rust, bpftrace and P4.
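To make "bytecode" a little more concrete, here is a small sketch that hand-assembles a two-instruction eBPF program in Python. It assumes the standard 8-byte instruction layout (opcode, two 4-bit register fields, 16-bit offset, 32-bit immediate) on a little-endian host; the `encode` helper is hypothetical and only packs bytes, it does not load anything into the kernel.

```python
import struct

# Opcode building blocks from the eBPF instruction set
BPF_ALU64 = 0x07   # instruction class: 64-bit arithmetic
BPF_JMP = 0x05     # instruction class: jumps, calls, and exit
BPF_K = 0x00       # source operand is the 32-bit immediate
BPF_MOV = 0xB0     # operation: move
BPF_EXIT = 0x90    # operation: return from the program


def encode(opcode, dst=0, src=0, offset=0, imm=0):
    """Pack one eBPF instruction into its 8-byte little-endian form."""
    regs = (src << 4) | dst  # two 4-bit register numbers share one byte
    return struct.pack("<BBhi", opcode, regs, offset, imm)


# r0 = 2; exit   (r0 holds the program's return value)
program = (
    encode(BPF_ALU64 | BPF_MOV | BPF_K, dst=0, imm=2)
    + encode(BPF_JMP | BPF_EXIT)
)

print(program.hex())  # b7000000020000009500000000000000
```

A compiler such as clang emits sequences of exactly this kind of instruction, which the kernel then verifies and translates to machine code.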
Kernel modules can be written in C or Rust.
As mentioned, eBPF programs are usually launched with user-space applications. These applications can be written in C, Rust, Go, Python, or any language that has libraries that can load eBPF binaries into the eBPF Linux subsystem.
Kernel Modules are very general purpose and can be used to write any type of program. There is no hand-holding or guardrails either.
eBPF on the other hand is very rigid and structured. There are even different specified eBPF program types. Some examples include:
BPF_PROG_TYPE_KPROBE
: determine whether a kprobe should fire or not

BPF_PROG_TYPE_SCHED_CLS
: a network traffic-control classifier

BPF_PROG_TYPE_LWT_*
: a network packet filter for lightweight tunnels

BPF_PROG_TYPE_XDP
: a network packet filter run from the device-driver receive path

BPF_PROG_TYPE_CGROUP_DEVICE
: determine if a device operation should be permitted or not

Each program type is then restricted in the data and functions it is allowed to access, since each type is geared towards a narrow domain. Before an eBPF program can be loaded into the kernel, it must also pass a verifier check for program correctness and complexity.
`#if` and `#define`s for different kernel versions necessary for portability.

A takeaway from this is that eBPF programs and the Linux kernel can be developed independently of each other, whereas a kernel module is very dependent on Linux kernel development updates.
To learn about eBPF development, I would recommend starting with the BCC Python library. This is a user-space library that compiles and runs eBPF C programs. It can be installed by following these official instructions.
To write a simple hello world program, this article should get you started. You can ignore its installation instructions, as following the official BCC install instructions mentioned above will suffice. If you run into macro redefinition errors while going through this article, you can reference this GitHub issue.
If you want to go deeper into advanced topics, I recommend the book Learning eBPF by Liz Rice.
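For a sense of what such a hello world looks like, here is a minimal sketch using the BCC Python library. It assumes BCC is installed and that the script is run as root; the names `PROGRAM`, `hello`, and `run_hello` are just illustrative choices, and the linked article may structure its example differently.

```python
# eBPF program, written in C. BCC compiles this at runtime.
PROGRAM = r"""
int hello(void *ctx) {
    bpf_trace_printk("Hello World!\n");
    return 0;
}
"""


def run_hello():
    """Attach the eBPF program to the execve syscall and stream its output."""
    # Imported here so the file can be read or imported without BCC installed.
    from bcc import BPF

    b = BPF(text=PROGRAM)                             # compile the C source to eBPF bytecode
    syscall = b.get_syscall_fnname("execve")          # resolve the arch-specific kernel symbol
    b.attach_kprobe(event=syscall, fn_name="hello")   # load the program and attach it as a kprobe
    b.trace_print()                                   # block, printing lines from the kernel trace pipe
```

Calling `run_hello()` from a root shell should print a "Hello World!" line each time any process on the machine starts a new program, until you interrupt it with Ctrl-C.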
To learn about kernel module development, this guide is very useful. It provides a simple hello world program at the beginning and then delves into more advanced topics.
Although eBPF is more rigid and covers fewer use cases than kernel modules, it is generally easier to work with, as it has higher-level libraries for interacting with kernel functionality than kernel modules have. eBPF may be a good option if the kernel you're working with supports the needed eBPF features (eBPF features are constantly being updated and may not be supported by older kernel versions) and if your use case involves Networking, Security, or Observability. If your use case is outside of that, you will probably have to stick with kernel modules for now. Time will tell if eBPF starts including more features to overtake use cases that were traditionally reserved for kernel modules.
In a cloud-native world, having eBPF applications running on your infrastructure can have large benefits, such as better support for application-first networking and firewall rules, and better performance when rewriting underlying network rules as workloads and services come and go. eBPF allows developers to create more complex applications faster and with more independence from the kernel itself.
If you would like help installing an eBPF application like those mentioned above or are interested in migrating to eBPF drivers for existing networking solutions like Calico, get in touch with us at contact@ippon.tech