Kernel hacking mini-howto

Where to begin?

I spent over ten years working for Red Hat on the Fedora kernel for a living. People hoping to one day get into kernel hacking themselves frequently asked me how I managed to swing that gig. One of the most common questions I get is: the kernel is so big, how could anyone possibly understand it all? Truth is, very few people really understand the whole kernel. The majority of the 'big name' kernel hackers got where they are today by specialising in one thing and then branching out. There are exceptions to this of course: people like Andrew Morton, Alan Cox, and Linus are 'all-rounders' who have hacked on close to everything in the tree at some point. While the kernel could always use more superheroes like these, there is nothing wrong with becoming a specialist in one area.

One thing that the all-rounders and the specialists have in common, however, is an understanding of the common kernel APIs: things like 'how to allocate/free memory' and 'how to create proc/sysfs files'. Most 'how do I ..' questions can be answered by taking a look at how other parts of the kernel are already doing something similar. With enough experience using the common APIs, you pick up higher-level concepts such as 'how to create userspace interfaces that don't suck'.

There is no fast path to learning kernel hacking. It comes down to a big time investment on your part to read (and understand) code, to learn from the mistakes you make (and you will make them!), and above all, to realise that in the end, it's just code. The kernel does forbid some things that you could get away with in userspace, but once you've grasped the basics, a lot of it just follows.

The basics:

  • Make sure you understand how to compile and install a kernel before going any further.
  • A sound knowledge of C is essential. If you're still struggling with pointer arithmetic and such concepts, stick with hacking stuff in userspace until you understand more. A crash in userspace caused by your misunderstanding is a lot easier to debug, understand and learn from than one in the kernel which just causes your machine to lock up or reboot itself.
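
As a quick self-check on the pointer arithmetic mentioned above, here is a minimal userspace sketch. The function names (sum_ints, element_index, byte_offset) are made up for illustration and do not come from the kernel or any library. It shows the two facts that most often trip people up: adding to a pointer moves in units of the pointed-to type, and subtracting two pointers into the same array gives an element count rather than a byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints by walking a pointer from base up to one past the last element. */
long sum_ints(const int *base, size_t n)
{
    assert(base != NULL);

    long total = 0;
    for (const int *p = base; p != base + n; p++)
        total += *p;            /* p++ advances by sizeof(int) bytes */
    return total;
}

/* Pointer subtraction within one array yields a distance in elements. */
ptrdiff_t element_index(const int *base, const int *elem)
{
    assert(base != NULL && elem != NULL);
    return elem - base;
}

/* The same distance in bytes, obtained by going through char pointers. */
ptrdiff_t byte_offset(const int *base, const int *elem)
{
    assert(base != NULL && elem != NULL);
    return (const char *)elem - (const char *)base;
}
```

If you can predict what each of these returns for a small array before compiling it, you are probably past the stage this bullet warns about. If not, this is exactly the kind of thing that is cheaper to get wrong in userspace.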

Must-read material:
Prerequisites:

Kernel specific books:
There have been a number of good books on kernel hacking written. Due to the rapid pace of Linux development however, they are out of date by the time they hit the printing press. They do however remain worth reading, as they explain fundamental concepts well, and knowing some of the historical developments can be useful knowledge.

Other resources:
  • kernelnewbies is a great resource for those starting out. It contains a lot of examples, and pointers.
  • Each week at lwn.net, Jonathan Corbet writes a really good, concise summary of the past week of kernel development. It's well worth reading even if you follow linux-kernel, as the rephrasing and explanations in the summaries are sometimes a lot clearer than reading through a 200 email thread.
  • The Linux kernel comes with a Documentation/ directory which contains a number of really useful documents worth spending some time reading.
  • Finally the code itself. Find something that interests you, find out where in the kernel that is handled, and just start reading.

Useful tools:

  • git grep is invaluable. (A solid grounding in at least the basic git commands should be considered a prerequisite)
  • You're going to be building and rebuilding a lot. So consider installing 'ccache'. Most distros have it packaged so that it automatically sets itself up after installing. A useful trick is to put your ~/.ccache directory on the fastest drive you have, especially an SSD. If you're building on an especially fast system, you may want to benchmark both with/without ccache. You may also want to look at 'distcc' if you've a lot of potential build-cluster candidate machines local to you.
  • Regardless of whether you're an emacs or vi person, 'ctags' are invaluable for navigating your way around the source tree. 'make tags' in the toplevel of the kernel tree will generate an index. You can learn how to navigate with them in the documentation of your favorite editor. (In vi, ctrl-] over a symbol jumps to that function, and ctrl-t takes you back. :ts will bring up a list of alternatives if there is more than one hit for that function name. You can also run

    vim -t functionname

    from the command line.)
  • Some other people find cscope really useful for the same purpose. 'make cscope' generates the index, running 'cscope' gets you an interface to jump to where functions are used/defined etc.

"I don't know what to hack on!"
A great way of putting your newly learned skills to good use is to take a look at the open bugs in the kernel bug tracker, find something, and try to help fix it. While many driver bugs need the hardware to really debug/test a solution, a lot of problems can still be found purely by code inspection. There is no shortage of new bugs being filed all the time, and bug-fixing is a great way to learn about many different areas of the kernel and how they interact.

"I really just don't get it"
Not everyone gets to be a spaceman, rockstar, or kernel hacker when they grow up. It's fine. Really. There are still a lot of things you can do to help out Linux.

  • Testing. Even if hacking code isn't your thing, building and testing the latest snapshots of Linus' tree, or Andrew's -mm tree if you're feeling really adventurous, is always useful. If it breaks, great! You get to contribute something: a bug report to the linux-kernel list.
  • Related to testing: write test tools. A new syscall got added? Great, write an application to use it in every way imaginable. Complain loudly when it breaks. Some of the simplest test tools have been the most useful to us. Filesystem stress tools like fsx have been so useful that they have become 'must-use' tools for filesystem developers. My own Trinity project finds bugs every release cycle. More tools like this would be awesome.
  • Hacking userspace isn't 'uncool'. There are a *lot* of things still in need of a lot of love in userspace. Find something that bugs you, and fix it. Can't fix it? Get involved with the people who wrote it, maybe they'll give you pointers. A lot of userspace projects have summer of code/mentorship programs that may be worth checking out.
  • Triage work. Bugzilla is swamped with bugs (both upstream kernel.org, and Fedora). A lot of them are really old ones that may even be fixed by now. No-one has the time to regularly go through them all, looking for patches that never got applied upstream, closing duplicates, pinging reporters etc. Get involved!
  • Documentation. If along your journey you find something particularly hard to understand, and you found no documentation on it, here's your chance to be a documentation-writing-superhero! Kernel hackers hate writing documentation for some strange reason.
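
To make the "write test tools" suggestion above a little more concrete, here is a minimal sketch of what poking at a single syscall can look like. It is only an illustration, not how fsx or Trinity work: pipe2 stands in for "a new syscall", try_pipe2 is a hypothetical helper name, and the code assumes Linux with a libc that provides syscall() and SYS_pipe2. The idea is to go through the raw syscall interface, feed it both sensible and nonsensical arguments, and check that the kernel answers the way its documentation says it should.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() on strict compiler settings */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: invoke pipe2 via the raw syscall interface.
 *
 * Returns 0 on success, or the errno value the kernel reported.
 * On success, *cloexec_seen is set to 1 if both new descriptors have
 * FD_CLOEXEC set and to 0 otherwise; the descriptors are then closed.
 */
int try_pipe2(int flags, int *cloexec_seen)
{
    assert(cloexec_seen != NULL);
    *cloexec_seen = 0;

    int fds[2] = { -1, -1 };
    if (syscall(SYS_pipe2, fds, flags) == -1)
        return errno;

    int rd = fcntl(fds[0], F_GETFD);
    int wr = fcntl(fds[1], F_GETFD);
    if (rd != -1 && wr != -1 && (rd & FD_CLOEXEC) && (wr & FD_CLOEXEC))
        *cloexec_seen = 1;

    close(fds[0]);
    close(fds[1]);
    return 0;
}
```

A real tool would loop over many more flag combinations, bad pointers, and resource limits, and would log anything that does not match the man page. Even a check as small as "garbage flags must give EINVAL, and O_CLOEXEC must really set close-on-exec" is the sort of thing worth running against every new kernel you build.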

"What about janitor tasks?"
While the janitor project has some useful information, patches that do nothing but clean up code to comply with style guidelines and other such trivial patches aren't really a great way to learn. No-one ever learned any skills by changing indentation of a function. Learn some of the 'rules' proposed there, but instead of focusing on them as 'something to do', use those rules while doing something more useful.



