Don’t try this at home.
For the record: It takes an eeepc 901 just under nine hours to build a Fedora kernel.
Yes, dumb idea, but I just had to know.
I thought I was done with this. Then, today I saw this. To the best of my knowledge, Fedora 8 didn’t suffer from the bug I originally described several posts ago. I think this one happening at nearly midnight UTC is coincidence.
There’s a “me too” in the comments, but it seems odd that two people on Slashdot saw it while we never heard a peep on the Fedora mailing lists, in bugzilla, or even upstream at kernel.org. It could just be coincidence; the story is unsurprisingly short on details. I guess Slashdot stories are easier to write than bug reports, but without additional debugging info we won’t ever know. Bear in mind that the last time we saw a crash of this nature, it didn’t affect everyone either.
It was only by chance I managed to catch the backtrace in the ’06 crash. I actually had two locked up machines, but one had its screen blanked, and wouldn’t unblank. The other machine had blanking disabled (setterm -blank 0) and thankfully, had also been set up to use a VGA screen resolution so had plenty of lines to display the whole backtrace.
The boot tracing post I wrote up led me to scrutinise the dmesg a little further. There’s a ton of data in there, and not all of it makes sense.
To call out one example..
[ 7.209578] calling snap_init+0x0/0x2a @ 1
[ 7.215488] initcall snap_init+0x0/0x2a returned 0 after 72 usecs
What is this stuff doing built into the vmlinuz? This stuff is a prime candidate for being modular, given that not everyone needs it. (I’ll bet a majority of users don’t even know what it is, let alone ‘need’ it).
This is defined in net/802/psnap.c. Looking at net/802/Makefile, we see this gets built if one or more of the following CONFIG options are set:
obj-$(CONFIG_LLC) += p8022.o psnap.o
obj-$(CONFIG_TR) += p8022.o psnap.o tr.o
obj-$(CONFIG_IPX) += p8022.o psnap.o p8023.o
obj-$(CONFIG_ATALK) += p8022.o psnap.o
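How those four lines combine can be sketched with a toy model in Python. This is only an illustration of the kbuild rules as I understand them: each symbol's value picks the obj-y or obj-m list, and an object that lands in both lists is only built in. The `resolve` helper and the config values fed to it are hypothetical examples, not a real .config.

```python
# Toy model of how kbuild resolves the net/802/Makefile lines above.
# The config values passed in are hypothetical examples, not a real .config.

RULES = [
    ("CONFIG_LLC",   ["p8022.o", "psnap.o"]),
    ("CONFIG_TR",    ["p8022.o", "psnap.o", "tr.o"]),
    ("CONFIG_IPX",   ["p8022.o", "psnap.o", "p8023.o"]),
    ("CONFIG_ATALK", ["p8022.o", "psnap.o"]),
]


def resolve(config):
    """Return (built_in, modular) object sets for a dict of CONFIG values."""
    obj = {"y": set(), "m": set()}
    for symbol, objects in RULES:
        value = config.get(symbol, "n")
        if value in obj:  # an unset symbol expands to "obj-", which is ignored
            obj[value].update(objects)
    # An object listed as both built-in and modular is only built in.
    return obj["y"], obj["m"] - obj["y"]


# Hypothetical input: one symbol modular, one built-in.
built_in, modular = resolve({"CONFIG_LLC": "m", "CONFIG_TR": "y"})
print("built-in:", sorted(built_in))
print("modular: ", sorted(modular))
```

A single =y among the four symbols is enough to pull psnap.o into vmlinuz, whatever the others are set to.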
Let’s take these one by one.
Here’s where the fail begins. In the Fedora kernel, we had CONFIG_LLC set to =m. But something ends up overriding that decision, and making it a built-in. [note to self: make oldconfig shout when this happens]. Something must be ‘select’ing it somewhere. There are actually quite a few things that do. But it turns out that the culprit in this case is CONFIG_TR. Wait, tokenring support is being built-in for every user?
Afraid so. And why this happens is a bit tragic.
Looking at the definition of CONFIG_TR in drivers/net/tokenring/Kconfig is enlightening.
menuconfig TR
	bool "Token Ring driver support"
	depends on NETDEVICES && !UML
	depends on (PCI || ISA || MCA || CCW)
	select LLC
The ‘bool’ is the key problem here. A bool can only be off or built-in, so once TR is enabled, everything it selects gets forced to built-in too. Changing this to a tristate solves this, and LLC remains modular.
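The way ‘select’ drags LLC along can be sketched with a small Python model. It assumes the documented kconfig behaviour that a select forces the selected symbol to be at least the selector's value (n < m < y), and it deliberately ignores ‘depends on’ and everything else kconfig does. The `effective` helper is made up for illustration.

```python
# Minimal model of kconfig 'select': the selected symbol can never end up
# lower than the symbol selecting it. Dependencies are ignored here.

ORDER = {"n": 0, "m": 1, "y": 2}


def effective(requested, selects):
    """requested: symbol -> value asked for; selects: selector -> [targets]."""
    result = dict(requested)
    changed = True
    while changed:  # repeat until no select raises anything further
        changed = False
        for selector, targets in selects.items():
            for target in targets:
                have = result.get(target, "n")
                want = result.get(selector, "n")
                if ORDER[want] > ORDER[have]:
                    result[target] = want
                    changed = True
    return result


selects = {"TR": ["LLC"]}

# TR as a bool: enabling it means =y, so LLC is dragged up from m to y.
as_bool = effective({"TR": "y", "LLC": "m"}, selects)

# TR as a tristate: it can be =m, so LLC is allowed to stay modular.
as_tristate = effective({"TR": "m", "LLC": "m"}, selects)

print("bool TR:    ", as_bool)
print("tristate TR:", as_tristate)
```

With TR as a bool the request for LLC=m is silently overridden, which is exactly the kind of thing I'd like make oldconfig to shout about.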
Problems like this are why I really loathe the ‘select’ statement in kconfig.
The initcalls mentioned above were _tiny_ in comparison to some of the more obvious bloat, but there’s a bunch of low-hanging fruit in there like this which on first sight just leaves you wondering ‘wtf?’.
I’m on vacation, but I can’t resist playing with new toys, seeing as Santa didn’t bring me anything fun this year. In my previous post, I mentioned that 2.6.28 was for the most part, dull. Reading the excellent changelog summary at kernelnewbies, I noticed a new feature I had until now overlooked.
1.6. Boot tracer
The purpose of this tracer is to helps developers to optimize boot times: it records the timings of the initcalls. Its aim is to be parsed by the scripts/bootgraph.pl tool to produce graphics about boot inefficiencies, giving a visual representation of the delays during initcalls. Users need to enable CONFIG_BOOT_TRACER, boot with the “initcall_debug” and “printk.time=1” parameters, and run “dmesg | perl scripts/bootgraph.pl > output.svg” to generate the final data.
Very interesting.
Here’s what it looks like when I ran it on my eeepc ..
Looks pretty. Though something isn’t quite right.
If you look at the dmesg output, there are over 400 initcalls. Even if we ignore all the uninteresting ones that return in 0 usecs, there’s still over 300 in the log. What gives?
The script stops parsing once the kernel hands off to the early userspace scripts in initramfs. So everything from the ‘Write protecting the kernel’ message at 8 seconds into the bootup is ignored. (Sidenote: The fact that we’re taking 8 seconds just to get to this stage _sucks_, more on that another time). So all the later modules that get loaded aren’t part of this picture.
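To see how much of the log falls on either side of that handoff, something like the following Python sketch would do. It is not what bootgraph.pl does internally; it just counts ‘initcall ... returned’ lines before and after the first line containing ‘Write protecting the kernel’. The `split_counts` helper is made up, and in the sample only the snap_init lines come from my dmesg; the other two are invented for illustration.

```python
import re

# Matches the "initcall foo+0x0/0x2a returned 0 after 72 usecs" lines.
INITCALL = re.compile(r"initcall \S+ returned -?\d+ after (\d+) usecs")
HANDOFF = "Write protecting the kernel"


def split_counts(dmesg_lines):
    """Count completed initcalls before and after the handoff message."""
    before = after = 0
    seen_handoff = False
    for line in dmesg_lines:
        if HANDOFF in line:
            seen_handoff = True
        elif INITCALL.search(line):
            if seen_handoff:
                after += 1
            else:
                before += 1
    return before, after


sample = [
    "[ 7.209578] calling snap_init+0x0/0x2a @ 1",
    "[ 7.215488] initcall snap_init+0x0/0x2a returned 0 after 72 usecs",
    # The next two lines are invented for illustration.
    "[ 8.000000] Write protecting the kernel",
    "[ 9.000000] initcall example_mod_init+0x0/0x10 returned 0 after 500 usecs",
]

print(split_counts(sample))
```

Feeding it the lines of a real dmesg instead of the sample gives the two totals for an actual boot.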
My Perl is a little rusty so I didn’t spot how it does it, but it seems there’s a threshold below which it ignores the initcalls that return quickly. Of those reported in the graph, the ‘fastest’ was ehci_hcd_init at 126837 usecs. acpi_init was almost in the same ballpark at 106445 usecs, but didn’t get picked up.
Whilst these big hitters are no doubt damaging to the boot time, it’s important to note the cumulative effect of all those five-figure initcalls.
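One rough way to put a number on that cumulative effect is to group the initcalls by how many digits their usecs value has and total each group. Here is a minimal Python sketch of that idea; the `bucket_by_digits` helper is made up, and apart from the snap_init line the sample entries are invented purely to show the output shape.

```python
import re
from collections import defaultdict

# Captures the initcall name and its duration in usecs.
INITCALL = re.compile(r"initcall (\S+?)\+\S+ returned -?\d+ after (\d+) usecs")


def bucket_by_digits(dmesg_lines):
    """Map digit count of the usecs value -> (number of initcalls, total usecs)."""
    buckets = defaultdict(lambda: [0, 0])
    for line in dmesg_lines:
        match = INITCALL.search(line)
        if not match:
            continue
        usecs = int(match.group(2))
        if usecs == 0:  # skip the uninteresting ones that return immediately
            continue
        digits = len(str(usecs))
        buckets[digits][0] += 1
        buckets[digits][1] += usecs
    return {d: tuple(v) for d, v in sorted(buckets.items())}


sample = [
    "[ 7.215488] initcall snap_init+0x0/0x2a returned 0 after 72 usecs",
    # Invented entries, only here to show a five-figure bucket.
    "initcall example_a_init+0x0/0x10 returned 0 after 20000 usecs",
    "initcall example_b_init+0x0/0x10 returned 0 after 30000 usecs",
    "initcall example_c_init+0x0/0x10 returned 0 after 0 usecs",
]

for digits, (count, total) in bucket_by_digits(sample).items():
    print(f"{digits}-figure initcalls: {count}, {total} usecs in total")
```

If the five-figure bucket adds up to more than any single six-figure initcall, that is where the low-hanging fruit is.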
Linus just released the 2.6.28 kernel. It’s already compiling for tomorrow’s rawhide. Fedora 9 & 10 will probably move to it in a few weeks. Typically, we wait until the dust settles and the first -stable release comes out. I was asked recently what bits we’re excited about in .28 for Fedora. To be honest, I didn’t give a great answer. It’s just not an “OMG, THIS RELEASE IS AWESOME” kind of release. There’s nothing in there that I was disappointed not to get into .27 for F10’s release. In fact, lots of the bits in there we were already carrying in the Fedora kernel (the DRM bits for example). Aside from that, it’s the usual churn of bug fixes, new drivers, and probably some interesting new bugs.
What about F11? Looking at the current schedule, we’ll get at least .29 in. I’m not sure we’ll have enough time to pull in .30 at this stage. It all depends on how quickly .29 stabilises. Version numbers are so hand-wavy anyway. I wish that when people asked me ‘what version is fX going to be’, they’d really ask ‘is feature xyz going to be merged by fX’. But people sure are hung up on numbers.
People tend not to notice kernel features these days for the most part. Which in a way is a good thing (it means it’s working). Unless it’s something that gets a lot of press like “unified x86 architecture”, “tickless kernel”, or “modesetting”. There are dozens of features every release, but people don’t really get excited about a lot of them, and for good reason. They’re mostly dull from a userspace programmer/end-user perspective.