  • userconf support for the boot loader

    I have a machine at work, a Dell Optiplex 745, that cannot boot GENERIC NetBSD kernels. There is a problem in one of the uhci/ehci, bge or azalia drivers that causes a lockup at boot time because of a shared interrupt problem. Disabling ehci or azalia from the kernel lets the machine boot. In order to do that, there are two options: either you rebuild your kernel without the offending driver, or you boot into the userconf prompt with -c and, from there, manually disable the driver at each boot. Neither option is quite convincing.

    Of course, disabling a faulty driver is not the correct solution, but the workaround is useful on its own. I've just added a userconf command to the boot loader and its configuration file -- /boot and /boot.cfg respectively -- so that the end user can pass arbitrary userconf commands to the kernel in an automated way. userconf is a kernel feature that lets you change the parameters of builtin drivers and enable/disable them before the hardware detection routines are run.

    With this new feature in the boot loader, you can customize a GENERIC kernel without having to rebuild it! Yes, modules could help here too, but we are not there yet for hardware drivers. Note that OpenBSD has had a similar feature for a while with config -e, but they actually modify the kernel binary.

    You can check the patch out and comment about it in my post to tech-kern. [Continue reading]
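
For illustration, this is roughly what the workaround could look like in /boot.cfg. It is only a sketch: it assumes the `userconf=` key as documented in later versions of boot.cfg(5), which may differ from the patch posted to tech-kern, and `ehci*` simply stands for whichever driver has to be disabled on your machine.

```
# /boot.cfg (sketch; key name and syntax assumed from boot.cfg(5))
menu=Boot normally:boot
# Handed to the kernel's userconf before hardware detection runs.
userconf=disable ehci*
```

With a line like this in place, the driver would be disabled on every boot without typing anything at the userconf prompt.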

  • ATF 0.6 released

    I am very happy to announce the availability of the 0.6 release of ATF. I have to apologize for this taking so long because the code has been mostly ready for a while. However, doing the actual release procedure is painful. Testing the code in many different configurations to make sure it works, preparing the release files, uploading them, announcing the new release on multiple sites... not something I like doing often.

    After some late reviews, I have to admit that the code has some rough edges, but these could not delay 0.6 any longer. The reason is that this release will unblock the NetBSD-SoC atfify project, making it possible to finally integrate all the work done in it into the main NetBSD source tree.

    Special thanks go to Lukasz Strzygowski. He was not supposed to contribute to ATF during his Summer of Code 2008 project, but he did, and he actually provided very valuable code.

    The next step is to update the NetBSD source tree to ATF 0.6. I have extensive local changes for this in my working copy, but I'm very tired at the moment. I think I'll postpone their commit until tomorrow so that I don't screw something up badly.

    Enjoy it! I'm looking forward to your feedback on the new stuff. [Continue reading]

  • pwd_mkdb and the new time_t

    NetBSD-current has recently switched time_t to be a 64-bit type on all platforms to cope with the year-2038 problem. This is causing all sorts of trouble, and a problem I found yesterday was that, after a clean install of NetBSD/amd64, it was impossible to change the data of any user through chfn. The command failed with:

        chfn: /etc/master.passwd: entry root inconsistent expire
        chfn: /etc/master.passwd: unchanged

    Suspiciously, the data presented by chfn showed an expiration date for root set on a seemingly random day (October 14th, 2021). That suggested that some part of the system was not parsing the user database correctly and was generating random values. A sample test program that walked through the passwords database with getpwent(3) showed an invalid expiration date for root, even though /etc/master.passwd had a 0 in that field.

    After some debugging, I found out that libc tries to be compatible with old-format binary password databases (those generated by pwd_mkdb). In order to deal with compatibility, libc checks to see if the database has a VERSION field set in it. If not, it assumes it is an old version and thus time_t may not be 64-bit. If the VERSION field is set, it uses the new time_t.

    So what was the problem? pwd_mkdb did not set the VERSION field even though it wrote a new-format database. libc assumed the database was laid out according to the old format but it was not, so it got garbage data during parsing. After some hacking, I fixed it as described in a post to current-users.

    Soon after, Christos Zoulas (the one who did all the time_t work) told me that he had already made these changes in his branch but forgot to merge them. He has merged them now, and the code in the main branch should work fine. I still think there is a minor problem in it, but the major issue after installation should be gone. [Continue reading]
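
The failure mode described above can be reproduced in miniature with the following sketch. The record layouts are hypothetical and heavily simplified (the real pwd.db records look different), and the `has_version` flag only stands in for the VERSION check that libc performs; the point is just to show how decoding a 64-bit record with a 32-bit layout turns a zero expiration date into garbage.

```python
import struct

# Hypothetical, simplified record layouts; the real pwd.db format differs.
OLD_LAYOUT = "<i8si"  # 32-bit time_t: change, name, expire
NEW_LAYOUT = "<q8sq"  # 64-bit time_t: change, name, expire


def encode(name, change, expire):
    """Pack one record using the new (64-bit time_t) layout."""
    return struct.pack(NEW_LAYOUT, change, name.encode("ascii"), expire)


def decode(record, has_version):
    """Unpack a record, picking the layout from the presence of a version tag."""
    layout = NEW_LAYOUT if has_version else OLD_LAYOUT
    change, name, expire = struct.unpack_from(layout, record)
    return name.rstrip(b"\0"), change, expire


record = encode("operator", 0, 0)

# Version tag present: the reader picks the right layout and sees expire == 0.
good = decode(record, has_version=True)

# Version tag missing: the reader falls back to the old layout, so the bytes
# that belong to the name end up being interpreted as the expiration date.
bad = decode(record, has_version=False)

print("with version tag:   ", good)
print("without version tag:", bad)
```

In this toy format the fix is the same as in the post: the writer has to tag the data so that the reader selects the matching layout.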

  • Windows 3.1 startup speed

    Out of boredom, I installed MS-DOS and Windows 3.1 on my machine a few days ago (yeah, I was inspired by the Hot Dog Stand comments in this post). Check it out here. Don't be scared, it was just a virtual machine!

    Anyway, this was fun because it reminded me of something. Back in 1994, my father bought a 60 MHz Pentium. After ordering it, we imagined how fast it could be compared to our older machine, a 40 MHz 386DX. Based on magazine reviews of those days, we supposed that Windows 3.1 could start in 1 or 2 seconds. But what a disappointment when we got the machine. It certainly was faster than the 386, but it still took many more seconds than that to start Windows.

    Now, trying this same thing on the MacBook Pro, Windows 3.1 actually starts in less than 1 second. After almost 15 years, our expectations have finally come true! [Continue reading]

  • Silencing the output of Python's subprocess.Popen

    I'm learning Python these days while writing a script to automate the testing of ATF under multiple virtual machines. I had this code in a shell script, but it is so ugly and clumsy that I don't even dare to add it to the repository. Hopefully, the new version in Python will be robust and versatile enough to be published.

    One of the things I've been impressed by is the subprocess module and, in particular, its Popen class. By using this class, it is trivial to spawn subprocesses and perform some IPC with them. Unfortunately, Popen does not provide any way to silence the output of the children. As I see it, it'd be nice if you could pass an IGNORE flag as the stdout/stderr behavior, much like you can currently set those to PIPE or set stderr to STDOUT.

    The following trivial module implements this idea. It extends Popen so that the callers can pass the IGNORE value to the stdout/stderr arguments. (Yes, it is trivial but it is also one of the first pieces of Python code I have written so... it may contain obviously non-Pythonic, ugly things.) The idea is that this exposes the same interface so that it can be used as a drop-in replacement. OK, OK, it lacks some methods and the constructor does not match the original signature, but this is enough for my current use cases!

        import subprocess

        IGNORE = -3
        STDOUT = subprocess.STDOUT
        assert IGNORE != STDOUT, "IGNORE constant is invalid"


        class Popen(subprocess.Popen):
            """Extension of subprocess.Popen with built-in support for
            silencing the output channels of a child process"""

            # Lazily-opened handle to /dev/null, shared by stdout and stderr.
            __null = None

            def __init__(self, args, stdout=None, stderr=None):
                subprocess.Popen.__init__(self, args=args,
                                          stdout=self._channel(stdout),
                                          stderr=self._channel(stderr))

            def __del__(self):
                self._close_null()

            def wait(self):
                r = subprocess.Popen.wait(self)
                self._close_null()
                return r

            def _null_instance(self):
                if self.__null is None:
                    self.__null = open("/dev/null", "w")
                return self.__null

            def _close_null(self):
                if self.__null is not None:
                    self.__null.close()
                    self.__null = None
                assert self.__null is None, "Inconsistent internal state"

            def _channel(self, behavior):
                # Map IGNORE to /dev/null; pass any other value through.
                if behavior == IGNORE:
                    return self._null_instance()
                else:
                    return behavior

    By the way, somebody else suggested this same thing a while ago. I don't know why it hasn't been implemented in the mainstream subprocess module. [Continue reading]
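
For readers on current Python versions: Python 3.3 and later provide `subprocess.DEVNULL`, which covers the use case this post asks for. The sketch below shows one way to use it; `run_quietly` is a hypothetical helper name chosen for this example and is not part of the standard library.

```python
import subprocess
import sys


def run_quietly(args):
    """Run a command, discard its stdout and stderr, and return its exit status."""
    return subprocess.call(args,
                           stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL)


# The child's output goes nowhere; only the exit status comes back.
ok_status = run_quietly([sys.executable, "-c", "print('this is discarded')"])
fail_status = run_quietly([sys.executable, "-c", "import sys; sys.exit(3)"])
print(ok_status, fail_status)
```

Using the running interpreter (`sys.executable`) as the child process keeps the example independent of whatever other commands happen to be installed.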

  • Hardware (and other stuff) on sale

    I need to get rid of lots of stuff that I haven't used for a long time and that is only taking up space here. Most of it will go to the trash but, if you are interested, let me know and make an offer! Yes, "for free" is valid on some items ;) The items that have a price are the ones that I think are most valuable, and I will keep them unless someone wants to buy them.

    Keep in mind that all this stuff is in Barcelona, Spain, and that shipping is not too practical for me. Still, if you are very interested in some item, we can probably sort it out.

    With all that said, check the list out. I may keep updating it during the following days.

    PS: How can I have accumulated so much crap?... [Continue reading]

  • Kernel modules in NetBSD/shark (or other ARMs)

    I reinstalled NetBSD-current recently on my shark (Digital DNARD) and, out of curiosity, I wanted to see if the new-style kernel modules worked fine on this platform. To test that, I attempted to load the puffs module and failed with an error message saying something like "kobj_reloc: unexpected relocation type 1". The same error appeared when running the simpler regression tests in /usr/tests/modules.

    After seeing that error message, I tracked it down in the source code and ended up in src/sys/arch/arm/arm32/kobj_machdep.c. A quick look at it and at src/sys/arch/arm/include/elf_machdep.h revealed that the kernel was lacking support for the R_ARM_PC24 relocation type. "It can't be too difficult to implement", I thought. Hah!

    Based on the documentation, I understood that R_ARM_PC24 is used in "short" jumps. This relocation signals to the runtime system that the offset to the target address of a branch instruction has to be relocated. This offset is a 24-bit number and, when loaded, it has to be shifted two bits to the left to account for the fact that instructions are 32-bit aligned. Before the relocation, there is some addend encoded in the instruction that has to be loaded, sign-extended and shifted two bits to the left and, after all that, added to the calculated address.

    I spent hours trying to implement support for the R_ARM_PC24 relocation type because it didn't want to work as expected. I even ended up looking at the Linux code to see how they dealt with it, and I found out that I was doing exactly the same as them. So what was the problem? A while later I realized that this whole thing wasn't working because the relocated address to be stored in the branch instruction didn't fit in the 24 bits! That makes things harder to solve.

    At that point, I looked at the port-arm mailing list and found that several other people were looking at this same issue. Great, some time "wasted" but a lot of new stuff learnt. Anyway, it turns out there are basically two solutions to the problem described above. The first involves generating jump trampolines for the addresses that fall too far away. The second one is simpler: just change the kernel to load the modules closer to the kernel text, and thus make the jump offsets fit into the 24 bits of the instructions. In fact, someone already has almost everything working.

    Let's see if they can get it working soon! [Continue reading]
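
The relocation arithmetic described in this post can be sketched as follows. This is an illustration written for this page, not the NetBSD or Linux implementation: it follows the generic "symbol + addend - place" rule for R_ARM_PC24 (ignoring Thumb interworking), and the function names and example addresses are made up.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def relocate_pc24(insn, place, symbol):
    """Patch the branch instruction located at `place` so it jumps to `symbol`."""
    # The addend lives in the low 24 bits of the instruction, counted in words.
    addend = sign_extend(insn & 0x00FFFFFF, 24) << 2
    offset = symbol + addend - place
    if offset % 4 != 0:
        raise ValueError("branch target is not word-aligned")
    # 24 bits of word offset cover a signed 26-bit byte range (about +/- 32 MB).
    if not -(1 << 25) <= offset < (1 << 25):
        raise OverflowError("offset does not fit in the 24-bit field")
    return (insn & 0xFF000000) | ((offset >> 2) & 0x00FFFFFF)


# A "bl" instruction whose encoded addend is -8 (0xFFFFFE words, shifted left).
BL_TEMPLATE = 0xEBFFFFFE

# Nearby target: the offset fits and the instruction can be patched.
print(hex(relocate_pc24(BL_TEMPLATE, place=0x1000, symbol=0x2000)))  # -> 0xeb0003fe

# Far-away target, such as a module loaded far from the kernel text.
try:
    relocate_pc24(BL_TEMPLATE, place=0xC0000000, symbol=0xF0000000)
except OverflowError as error:
    print("cannot relocate:", error)
```

The second call fails for the same reason the module load did: the target is too far away for the offset to fit, which is why the two fixes mentioned above are either trampolines or loading modules closer to the kernel text.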

  • Happy new 2009!

    Happy new year to everyone!

    I know that the blog has been quite dead recently. Let's see if it can revive during 2009 (but not today). I already have a few ideas for future posts so stay tuned! [Continue reading]