• Processing Makefile.am with M4

    ATF's Makefile.am, which is a single Makefile for the whole tree, was already at the 1300-line mark and growing. At this size it is unmanageable, and a quick look at its contents reveals tons of repeated, delicate code.

    Why so much repeated code, you ask, if the whole point of Automake is to simplify Makefiles? Automake does in fact simplify Makefile code when you define targets known to Automake, such as binaries and/or libraries. However, as soon as you start doing fancy things with documentation, building tons of small programs or messing with shell scripts, things get out of control because you are left on your own to define their targets and the necessary build logic.

    Up until now, I had just kept up with the boilerplate code... but now that I'm starting to add pretty complex rules to generate HTML and plain-text documentation out of XML files, the complexity must go. And here comes my solution: I've just committed an experiment to process Makefile.am with M4. I've been trying to look for prior art behind this idea and couldn't find any, so I'm not sure how well this will work. But, so far, this has cut 350 lines of Makefile.am code.

    How does this work? First of all, I've written a script to generate the Makefile.am from the Makefile.am.m4 and put it in admin/generate-makefile.sh. All this script does is call M4, but I want to keep this logic in a single place because it has to be used from two call sites, as described below.

    Then, I've added an autogen.sh script to the top-level directory that generates Makefile.am (using the previous script) and calls autoreconf -is. I'm against autogen.sh scripts that pretend to be smart instead of just calling autoreconf, but in this case I see no other way around it.

    At last, I've modified Makefile.am to add an extra rule to generate itself from the M4 version. This, of course, also uses generate-makefile.sh.

    We'll see how this scales, but so far I'm happy with the results. [Continue reading]
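    A minimal sketch of how this could be wired up (the script contents and the rule body are my assumptions, not the project's actual code):

        #! /bin/sh
        # admin/generate-makefile.sh: the single place that knows how to
        # turn Makefile.am.m4 into Makefile.am.
        m4 "${1:?source}" > "${2:?destination}"

        # Fragment of Makefile.am.m4: let the generated Makefile.am
        # rebuild itself whenever the M4 source changes.
        $(srcdir)/Makefile.am: $(srcdir)/Makefile.am.m4
                $(SHELL) $(srcdir)/admin/generate-makefile.sh \
                    $(srcdir)/Makefile.am.m4 $(srcdir)/Makefile.am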

  • Extending sudo credentials

    If you use sudo for, e.g., pkgsrc's just-in-time su, you may often have been bitten by the following problem: some compilations are slow and the build process stops right in the middle to ask you for a root password. If you go away while the system compiles, you'll be frustrated when you come back, as the process may well still be at the very beginning.

    This happens because, unless disabled by the system administrator, your sudo credentials last for 5 minutes. If you haven't used sudo in the last 5 minutes, it will ask you for your password again. A simple workaround for the problem is to automatically renew your credentials, say, every 2 minutes. You can do this by running the following command (from the same console you are using later on!) right before starting a pkgsrc build:

        $ ( while :; do sudo -v; sleep 120; done ) &

    [Continue reading]

  • Best config setting ever

        echo 'set editing-mode vi' >>~/.inputrc

    This will enable vi editing mode for all commands that use the GNU readline library (e.g. bash, python, bc, etc.), not only the shell. For the shell only (including non-bash shells), add 'set -o vi' to your shrc file.

    I don't know why I didn't do this before, given that I'm a pretty die-hard vi user. Still, for some reason, I kept using emacs-like key bindings for command-line editing. Not any more! However, be careful: if you are used to vim's visual editing mode, you'll keep hitting 'v' in the command line and getting super annoyed.

    Enjoy! [Continue reading]

  • 1000 revisions for ATF

    Mmm! Revision 7ca234b9aceabcfe9a8a1340baa07d6fdc9e3d33, committed about an hour ago, marks the 1000th revision in the ATF repository. Thanks for staying with me if you are following the project :) [Continue reading]

  • Books by Joel Spolsky

    I just finished reading the third book in a row from Joel Spolsky, titled Joel on Software. Before this one, I read More Joel on Software and The Best Software Writing 1, all in a bit over a month. Note: I hadn't read any book cover-to-cover for a loooong while. Very interesting and entertaining books; highly recommended.

    Oh, and his writing style is really enjoyable. We, crappy blog writers, can learn a lot from him! [Continue reading]

  • Rearchitecting ATF

    During the last few weeks, I've been doing some ATF coding and, well... I'm not happy. At all. I keep implementing features but I feel, more and more, that ATF is growing out of control and that it is way too sluggish. It oughtn't be so slow. About 6 minutes to run the whole test suite on a Mac G3 I just got? HA! I bet I can do much, much, much better than that. Come on, we are targeting NetBSD, so we should support all those niche platforms rather well, and speed matters.

    The thing is, the current code base grew out of a prototype that didn't have much of a design. Well, it had a design but, in my opinion, it has turned out to be a bad design. I couldn't imagine that we would hit the bottlenecks (speed) and user-interface issues (for example, the huge difficulties involved in debugging a failing test case) that we are hitting. So... IT IS TIME FOR A CHANGE!!!

    I'm currently working on a written specification of what ATF will look like, hopefully, in the not-so-distant future. It will take a while to get there, but with enough effort, we soon will. And life will be better. And no, I'm not talking about a from-scratch rewrite; that'd only hurt the project. I plan to take incremental and safe steps, keeping the code base running all the time, but I will do a major face-lift of everything. (I wish I could say "we" instead of "I" here. But we're not there yet.)

    Why am I writing a specification, you ask? Well, because that forces me (or ANY developer) to think about how I want the thing to look and to decide, exactly, on what the design will be, which technologies will be used, which languages will be involved and in what components, etc. And no, I'm not talking about a class-model design; I'm just talking about the main design of the whole picture, which is quite hard by itself. Plus, having a spec will allow me to show it to you before I start coding, and you will say "oh, wonderful, this new design sucks so much that I'm not going to bother with the new version". Or maybe hell will freeze over and you will think, "mmm, this looks interesting, maybe it will solve these issues I'm having as regards speed, ease of debugging and ease of use".

    Anyway, I hope to have a draft "soon" and to hear either of the two possible comments as a result!

    Edit (July 29th): Alright, I have uploaded an extremely preliminary copy of the specification just so that you can see where my current ideas are headed. Expect many more changes to this document, so don't pay too much attention to the tiny details (most of which aren't there yet anyway). [Continue reading]

  • The mess of ATF's code

    Yes. ATF's code is a "bit" messy, to put it bluntly. I'm quite happy with some of the newest bits, but there are some huge parts in it that stink. The main reason for this is that the "ugly" parts were the ones that were written first, and they were basically a prototype; we didn't know all the requirements for the code at that point... and we still don't know them, but we know we can do much better. Even though I'm writing in plural... I'm afraid we = I at the moment :-P

    So, is it time for the big rewrite from scratch? NO! Joel Spolsky wrote about why this is a bad idea and I have to agree with him. Yeah, I'm basically the only developer of the code so everything is in my head, and I'd do a rewrite with a fresh mind, but... I'd lose tons of work and, especially, I'd lose tons of code that deals with tricky corner cases that are hard to remember.

    Sure, I want to clean things up, but that will happen incrementally. And preferably concurrently with feature additions. These two things could definitely happen at the same time if only I had infinite spare time...

    Anyway, the major point of this post is to describe what I don't like about the current code base and how I'd like to see it changing:

    • A completely revamped C++ API for test cases. The current one sucks. It is not consistent with the C API. It lacks important functionality. It uses exceptions for test-case status reporting (yuck!). And it's ugly.

    • Clear separation of "internal/helper" APIs from the test APIs. You'll agree that the "fs" module, which provides path abstraction and other file system management routines, is something that cannot be part of ATF's API. ATF is about testing. Period. Either that fs module should be in a separate library or it should be completely hidden from the public. Otherwise, it'll suffer from abuse and, what scares me, will have to become part of ATF's API. And likewise, most — really, most — of the modules in the current code are internal.

    • Fewer dependencies from the C++ API on the C API. Most of the current C++ modules are wrappers around their corresponding C counterparts. This is nice for code reuse but makes the code extremely fragile. In C++, things like RAII can provide really robust code with minimum effort, but intermixing such C++ code with C makes things ugly really quickly. I'd like to find a way to keep the two libraries separate from each other (and thus keep the C++ binding "pure"), but at the same time I don't want to duplicate code... an interesting problem.

    • Splitting the tarball into smaller pieces. People writing test cases for C applications don't want to pull in a huge package that depends on C++ and whatnot. And ATF is huge. It takes forever to compile. And this is a serious issue for broad adoption. Note: whether the tools are written in C++ or not is a separate issue, because these are not a dependency for anything!

    • The shell binding is slow. Really slow compared to the other ones. Optimizations would be nice, but those do not address the root of the problem: it's costly to query information from shell-based tests at run time. I.e. it takes a long time to get the full list of test cases available in a test suite because you have to run every single test program with the -l flag. Keeping a separate file with test-case metadata alongside the binary could resolve this and allow more flexibility at run time (see the sketch after this list).

    • And some other things.

    Those are the major things I'd like to see addressed soon, but they involve tons of work. Of course, I'd also like to be able to work on some features expected by other developers: easier debugging, DOCUMENTATION!... So, helpers welcome :-) [Continue reading]
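    For that last shell-binding point, the kind of metadata sidecar I have in mind could be as simple as a plain-text file generated at build time; the format below is purely hypothetical, just to illustrate the idea:

        # foo_test.meta: test-case list and metadata for the 'foo_test'
        # program, generated when the test program is built so that the
        # runtime never needs to execute 'foo_test -l'.
        tc: head_works
        tc.head_works.descr: Verify reading the first lines of a file
        tc: tail_works
        tc.tail_works.descr: Verify reading the last lines of a file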

  • Technicians and schedules

    Here I am, on the afternoon of a work day, sitting at home waiting for an eircom technician to come and set up my phone line. How nice. The story goes like this:

    Two weeks ago, I placed an online order to request a phone line, explicitly specifying that the physical installation is already done (even though I don't know whether it works or not, but that should be fairly easy for them to check). A few days later, the technician called me saying that he'd come today (two weeks after), anytime from 12.00 to 15.00, but that I should call the company that same day to get a more accurate schedule.

    Fine, I'll wait until the 23rd to make that call. But you know what happened, right? I called them this morning and they said that, indeed, the technician was coming today, from 12.30 onwards, but they were unable to provide me any more specific information because the technicians have multiple appointments. What? Again, WHAT? In this age of technology, can't we implement a system to track technicians and their schedules? Can't we make some approximation of how long each visit will take? I bet it's trivial if you put in just some common sense.

    People have jobs, and they can't leave anytime for unknown periods of time; granted, I have some more freedom at Google, but that is absolutely not the case for most other companies. If you have to be at home at 12.30 sharp, and the appointment will last approximately 30 minutes, that is one thing; but having to be at home from 12.30 for an unknown period of time, that is a very different thing.

    Just wondering... couldn't they just make the technician call you about 20-30 minutes before arrival so that you could make the same arrangements as him and be there at the same time? It doesn't seem such an insane request. [Continue reading]

  • Child-process management in C for ATF

    Let's face it: spawning child processes in Unix is a "mess". Yes, the interfaces involved (fork, wait, pipe) are really elegant and easy to understand, but every single time you need to spawn a new child process to, later on, execute a random command, you have to write quite a bunch of error-prone code to cope with it. If you have ever used any other programming language with higher-level abstraction layers — just check Python's subprocess.Popen — you surely understand what I mean.

    The current code in ATF has many places where child processes have to be spawned. I recently had to add yet another case of this, and... enough was enough. Since then, I've been working on a C API to spawn child processes from within ATF's internals and just pushed it to the repository. It's still fairly incomplete, but with minor tweaks, it'll keep all the dirty details of process management contained in a single, one-day-to-be-portable module.

    The interface tries to mimic the one that was designed in my Boost.Process Summer of Code project, but in C, which is quite painful. The main idea is to have a fork function to which you pass the subroutine you want to run on the child, the behavior you want for the stdout stream and the behavior you want for the stderr stream. These behaviors can be any of capture (aka create pipes for IPC communications), silence (aka redirect to /dev/null), redirect to file descriptor and redirect to file. For simplicity, I've omitted stdin. With all this information, the fork function returns you an opaque structure representing the child, from which you can obtain the IPC channels if you requested them and on which you can wait for finalization.

    Here is a little example, with tons of details such as error handling or resource finalization removed for simplicity. The code below would spawn "/bin/ls" and store its output in two files named ls.out and ls.err:

        static atf_error_t
        run_ls(const void *v)
        {
            system("/bin/ls");
            return atf_no_error();
        }

        static void
        some_function(...)
        {
            atf_process_stream_t outsb, errsb;
            atf_process_child_t child;
            atf_process_status_t status;

            atf_process_stream_init_redirect_path(&outsb, "ls.out");
            atf_process_stream_init_redirect_path(&errsb, "ls.err");

            atf_process_fork(&child, run_ls, &outsb, &errsb, NULL);
            ... yeah, here comes the concurrency! ...
            atf_process_child_wait(&child, &status);

            if (atf_process_status_exited(&status))
                printf("Exit: %d\n", atf_process_status_exitstatus(&status));
            else
                printf("Error!\n");
        }

    Yeah, quite verbose, huh? Well, it's the price to pay to simulate namespaces and similar other things in C. I'm not too happy with the interface yet, though, because I've already encountered a few gotchas when trying to convert some of the existing old fork calls to the new module. But, should you want to check out the whole mess, see the corresponding revision. [Continue reading]

  • How to find an apartment in Dublin

    It has been three weeks since I moved to Dublin, Ireland, and I have finally settled into my new apartment. It has taken me two weeks (I was pretty busy during the first one) to go through ads, visits and offers to finally get a place that is cozy, nicely decorated and decently located, all at a quite reasonable price. I could have gotten nicer places for a bit more money, but I'm happy with this one so far.

    If you are looking forward to finding a place to stay in Dublin, this post contains some suggestions based on my experience.

    First of all, keep in mind that Dublin is outrageously expensive. The prices for housing here are insane at the moment (OK, not as expensive as NYC or SF, but really expensive anyway). Be prepared to spend around 1K EUR for a nice 1-bedroom apartment, and 1.5K EUR for a nice 2-bedroom apartment. Things may improve in the coming months, as they just did during the first quarter of the year.

    With that said, your first point of reference should be daft. This is the place where all landlords and agencies put their ads, and the place where everyone is looking for apartments. To get started, you need to know where you want to live. Get a rough idea and then locate that place in one of the Dublin postal districts and the ones surrounding it. Given that public transportation is... well... suboptimal, you don't want to live too far from your workplace. Then, hunt for places within your budget... and a budget a bit higher: you can always try to negotiate the rent down and get a nicer apartment than you would otherwise, still staying within your initial budget.

    Once you have selected some of the apartments you want to check, call the landlords or agents and ask for an appointment as soon as possible. And, during the visit, check a few basic things:

    • Whether the house is old or new: if it's new, it'll probably be in nicer condition overall.

    • Water pressure: old houses have poor water pressure.

    • Electric shower: this is really scary to me, but it is how most old houses deal with poor water pressure.

    • Carpet: nice, but a horrible mess to clean up.

    • Garbage collection service: if the building does not do this for you, you'll have to pay for garbage collection separately. I just bought 3 bin tags and those were almost 9 EUR. Yes: 9 EUR to pay for the collection of THREE garbage bags.

    • Location of supermarkets: Dublin is basically a big town, so most roads don't have shops. Make sure that you have a supermarket nearby where you can walk to get basic stuff.

    • Availability of cable/phone: you'll need this for Internet.

    • Furniture: most apartments in Dublin are provided fully furnished, so make sure to pick one with furniture that you like. Ask if you are allowed to replace some. Pay special attention to the mattress and couches!!

    • Cutlery: OK, this is part of the furniture, but check what you have. Your landlord may provide you additional stuff for free upon request.

    • Washer and dryer: you want to have a dryer, as most lease contracts state you cannot hang clothes in public places.

    • Heating and double-glazed windows: you'll need these during the winter.

    And, at last, don't hurry! The housing market has improved during the last months, so if you see a place that you like, you'll most likely have a few days to decide whether you want it or not (in the past, you had to decide during the viewing, or otherwise the place would be gone afterwards). Think well about your decision and negotiate; don't show yourself as impatient or you'll get worse deals!

    I think that's all for now. If there is anything else, the post will be updated :) [Continue reading]

  • Trying AdSense

    I've just decided to enable AdSense on this blog and see what the results are. If they are not worth it (which is what I expect), I'll disable the ads after a while. But who knows, maybe I'll get a nice surprise! [Continue reading]

  • Paella in NYC

    These days, I'm starting to cook by myself (aka learning) and yesterday I made paella for 6 people while staying in NYC (leaving on Sunday...). This is the third time in two weeks that I have cooked this Spanish dish, but I think the results were pretty good despite the lack of ingredients. After all, cooking is not as hard as I originally thought! And it's pretty fun too! Just blogging this because the results look nice.

    P.S. I'm now eating the leftovers from yesterday. Yummm! :-) [Continue reading]

  • Mailing lists for commit notifications

    The project I'm currently working on at university uses Subversion as its version control system. Unfortunately, the project itself has no mailing list to receive notifications on every commit, and the managers refuse to set this up. They do not see the value of such a list and they are scared of it, probably because they assume that everyone ought to be subscribed to it.

    Having worked on projects that have a commit notification mailing list available, I strongly advise having such a list anytime you have more than one developer working on a project[1]. Bonus points if every commit message comes with a bundled copy of the change's diff (in unified form!). This list must be independent from the regular development mailing list and it must be opt-in: i.e. never subscribe anyone by default; let people subscribe themselves if they want to! Not everyone will need to receive this information, but it comes in very useful... and it's extremely valuable for the project managers themselves!

    Why is this useful? Being subscribed to the commit notification mailing list, it is extremely easy to know what is going on in the project[2]. It is also really easy to review code submissions as soon as they are made which, with proper reviews by other developers, trains the authors and improves their skills. And if the revision diff is inlined, it is trivial to pinpoint mistakes in it (be they style errors, subtle bugs, or serious design problems) by replying to the email.

    So, to my current project managers: if you read me, here is a wish-list item. And, for everyone else: if you need to set up a new project, consider creating this mailing list as soon as possible. Maybe few developers will subscribe to it, but those that do will pay attention and will provide very valuable feedback in the form of replies.

    1: Shame on me for not having such a mailing list for ATF. I haven't investigated how to do so with Monotone.

    2: Of course, the developers must be conscious of the need to commit early and often, and to provide well-formed changesets: i.e. self-contained and with descriptive logs. [Continue reading]
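    For Subversion in particular, the hook needed to implement this is tiny. Here is a minimal, untested sketch of a post-commit hook that mails the log message plus the unified diff of each revision; the list address is obviously a placeholder:

        #! /bin/sh
        # hooks/post-commit: Subversion runs this after every commit,
        # passing the repository path and the new revision number.
        REPOS="$1"
        REV="$2"

        {
            svnlook log -r "$REV" "$REPOS"      # the commit message...
            echo
            svnlook diff -r "$REV" "$REPOS"     # ...and the unified diff
        } | mail -s "commit r$REV" source-changes@example.org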

  • DEBUG.EXE dropped in Windows 7

    Wow. DEBUG.EXE is finally being phased out in Windows 7. I can't believe it was still there.

    This brings me back two different memories. I had used this program in the past (a long while ago!) and it caused me both pain and joy.

    Regarding pain: I had an MS-DOS 5.x book that spent a whole section on DEBUG.EXE, and one of the examples in it contained a command that caused the program in memory to be written to some specific sectors of the floppy disk. Guess what I tried? I executed that same command but told it to use my hard disk instead of the floppy drive. Result: a corrupted file system. I had to run scandisk (remember it?), which marked some sectors as faulty, and I thought I had ruined my precious 125MB WD Caviar hard disk. It wasn't until much, much, much later that I learnt that such a thing was not possible, and that reformatting the disk with a tool that had no memory of "bad" sectors (aka Linux's newfs) could revert the disk to a clean state. (Actually, I kept that hard disk until very recently.)

    Regarding joy: On a boring weekend away from home, I used DEBUG.EXE on an old portable machine without an internet connection to hack a version of PacMan. I disassembled the code until I found where it kept track of the player's lives and tweaked the counter to be infinite (or extra large, I can't remember). That was fun. I could get to levels that my father (who used to be an avid player) and I had never seen before!

    It's a pity this tool is going, but it must go. It is way too outdated compared to current debuggers. I wonder if anyone is still using it.

    Edit (Apr 1st, 2011): This is not a support forum for Windows issues. I've had to disable posting on this particular article because it was receiving lots of traffic and I don't want to moderate posts any more. [Continue reading]

  • Using C++ templates to optimize code

    As part of the project I'm currently involved in at university, I started (re)writing a Pin tool to gather run-time traces of applications parallelized with OpenMP. This tool has to support two modes: one to generate a single trace for the whole application and one to generate one trace per parallel region of the application.

    In the initial versions of my rewrite, I followed the idea of the previous version of the tool: have a -split flag in the frontend that enables or disables the behavior described above. This flag was backed by an abstract class, Tracer, and two implementations: PlainTracer and SplittedTracer. The thread-initialization callback of the tool then allocated one of these objects for every new thread, and the per-instruction injected code used a pointer to the interface to call the appropriate specialized instrumentation routine. This pretty much looked like:

        void
        thread_start_callback(int tid, ...)
        {
            if (splitting)
                tracers[tid] = new SplittedTracer();
            else
                tracers[tid] = new PlainTracer();
        }

        void
        per_instruction_callback(...)
        {
            Tracer* t = tracers[PIN_ThreadId()];
            t->instruction_callback(...);
        }

    I knew from the very beginning that such an implementation was going to be inefficient due to the pointer dereference at each instruction and the vtable lookup for the correct virtual method implementation. However, it was a very quick way to move forward because I could reuse some small parts of the old implementation.

    There were two ways to optimize this: the first one involved writing different versions of per_instruction_callback, one for plain tracing and the other for splitted tracing, and then deciding which one to insert depending on the flag. The other way was to use template metaprogramming.

    As you can imagine, this being C++, I opted to use template metaprogramming to heavily abstract the code in the Pin tool. Now, I have an abstract core parametrized on the Tracer type. When instantiated, I provide the correct Tracer class and the compiler does all the magic for me. With this design, there is no need to have a parent Tracer class — though I'd welcome having C++0x concepts available — and the callbacks can be easily inlined because there is no run-time vtable lookup. It looks something like this:

        template< class Tracer >
        class BasicTool {
            Tracer* tracers[MAX_THREADS];

            virtual Tracer* allocate_tracer(void) const = 0;

        public:
            Tracer* get_tracer(int tid) { return tracers[tid]; }
        };

        class PlainTool : public BasicTool< PlainTracer > {
            PlainTracer* allocate_tracer(void) const { return new PlainTracer(); }

        public:
            ...
        } the_plain_tool;

        // This is tool-specific, non-templated yet.
        void
        per_instruction_callback(...)
        {
            the_plain_tool.get_tracer(PIN_ThreadId())->instruction_callback(...);
        }

    What this design also does is force me to have two different Pin tools: one for plain tracing and another one for splitted tracing. Of course, I chose it to be this way because I'm not a fan of run-time options (the -split flag). Having two separate tools with well-defined, non-optional features makes testing much, much easier and... follows the Unix philosophy of having each tool do exactly one thing, and do it right!

    Result: around a 15% speedup. And C++ was supposed to be slow? ;-) You just need to know what the language provides you and choose wisely. (Read: my initial, naive prototype had a run-time of 10 minutes to trace part of a small benchmark; after several rounds of optimizations, it's down to 1 minute and 50 seconds to trace the whole benchmark!)

    Disclaimer: The code above is an oversimplification of what the tool contains. It is completely fictitious and omits many details. I will admit, though, that the real code is too complex at the moment. I'm looking for ways to simplify it. [Continue reading]

  • Numeric limits in C++

    By pure chance, when trying to understand a build error in some C++ code I'm working on, I came across the correct C++ way of checking for numeric limits. Here is how.

    In C, when you need to check the limits of native numeric types, such as int or unsigned long, you include the limits.h header file and then use the INT_MIN/INT_MAX and ULONG_MAX macros respectively. In the C++ world, there is a corresponding climits header file to get the definition of these macros, so I always thought this was the way to follow.

    However, it turns out that the C++ standard defines a limits header file too, which provides the numeric_limits<T> template. This template class is specialized in T for every numeric type and provides a set of static methods to query properties about the corresponding type. The simplest ones are min() and max(), which are what we need to replace the old-style *_MIN and *_MAX macros.

    As an example, this C code:

        #include <limits.h>
        #include <stdio.h>
        #include <stdlib.h>

        int
        main(void)
        {
            printf("Integer range: %d to %d\n", INT_MIN, INT_MAX);
            return EXIT_SUCCESS;
        }

    becomes the following in C++:

        #include <cstdlib>
        #include <iostream>
        #include <limits>

        int
        main(void)
        {
            std::cout << "Integer range: " << std::numeric_limits< int >::min()
                      << " to " << std::numeric_limits< int >::max() << "\n";
            return EXIT_SUCCESS;
        }

    Check out the documentation for more details on additional methods! [Continue reading]

  • The NetBSD Blog

    The NetBSD Project recently launched a new official blog for NetBSD. From here, I'd like to invite you to visit it and subscribe to it. It's only with your support (through reading and, especially, commenting) that developers will post more entries! Enjoy :-) [Continue reading]

  • NetBSD-SoC needs your application!

    The Google Summer of Code 2009 application deadline for students is tomorrow and NetBSD has got very few applications so far. If you are interested in working on a cool operating system, where almost any project idea can fit, take the time to read our proposals and apply! New, original ideas not listed there will also be considered.

    It'd be a pity if the number of slots assigned to NetBSD was small due to the low number of applications! We did much better last year.

    Note that there are a couple of ATF-related proposals in there. Help will certainly be welcome (by me ;-) in those areas! [Continue reading]

  • Returning to Google

    I've been holding back this announcement until all affected parties knew in advance. They do know now, so I'm happy to announce that I'll be joining Google Dublin on May 25th as a Google.com Software Engineer! Thanks to everyone who made that possible. [Continue reading]

  • Comments for old posts now moderated

    After waking up today and finding 80+ spam comments all around old posts in this blog, I have decided to make all new comments on posts older than 14 days moderated. It took half an hour to clean them all up. Thank you, spammers. [Continue reading]

  • What are unnamed namespaces for in C++?

    In the past, I had come across some C++ code that used unnamed namespaces everywhere, as the following snippet shows, and I didn't really know what the meaning of it was:

        namespace {

        class something {
            ...
        };

        } // namespace

    Until now. Not using unnamed namespaces in my own code bit me with name clash errors. How? Take ATF. Some of its files declare classes in .cpp files (not headers). I just copy/pasted some ATF code into another project and linked the libraries produced by each project together. Boom! Link error because of duplicate symbols. And the linker is quite right in saying so!

    For some reason, I always assumed that classes declared in .cpp files would be private to the module. But if you just think a little bit about it, just a little, this cannot ever be the case: how could the compiler tell the difference between a class definition in a header file and a class definition in a source file? The compiler sees preprocessed sources, not what the programmer wrote, so all class definitions look the same!

    So how do you resolve this problem? Can you have a static class, pretty much like you can have a static variable or function? No, you cannot. Then, how do you declare implementation-specific classes private to a module? Put them in an unnamed namespace as the code above shows and you are all set. Every translation unit has its own unnamed namespace, and everything you put in it will not conflict with any other translation unit. [Continue reading]
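    To make the fix concrete, here is a contrived two-file sketch (file and function names are mine, not ATF's): both translation units define a class with the same name, yet they link together fine because each definition lives in its own unnamed namespace.

        // a.cpp
        namespace {
            class helper {              // private to a.cpp
            public:
                int value(void) const { return 1; }
            };
        }
        int a_value(void) { return helper().value(); }

        // b.cpp
        namespace {
            class helper {              // a different, unrelated class
            public:
                int value(void) const { return 2; }
            };
        }
        int b_value(void) { return helper().value(); }

    Remove the two namespace blocks and the program becomes ill-formed (a violation of the one-definition rule): both files would provide conflicting definitions of the same helper class.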

  • Making ATF 'compiler-aware'

    For a long time, ATF has shipped with build-time tests for its own header files to ensure that these files are self-contained and can be included from other sources without having to manually pull in obscure dependencies. However, the way I wrote these tests was a hack since the first day: I use automake to generate a temporary library that builds small source files, each one including one of the public header files. This approach works but has two drawbacks. First, if you do not have the source tree, you cannot reproduce these tests -- and one of ATF's major features is the ability to install tests and reproduce them even if you install from binaries, remember? And second, it's not reusable: I now find myself needing to do this exact same thing in another project... what if I could just use ATF for it?

    Even if the above were not an issue, build-time checks are a nice thing to have in virtually every project that installs libraries. You need to make sure that the installed library is linkable to new source code and, currently, there is no easy way to do this. As a matter of fact, the NetBSD tree has such tests and they haven't been migrated to ATF for a reason.

    I'm trying to implement this in ATF at the moment. However, running the compiler in a transparent way is a tricky thing. Which compiler do you execute? Which flags do you need to pass? How do you provide a portable-enough interface for the callers?

    The approach I have in mind involves caching the compiler and flags used to build ATF itself and using those as defaults anywhere ATF needs to run the compiler. Then, make ATF provide some helper check functions that call the compiler for specific purposes and hide all the required logic inside them. That should work, I expect. Any better ideas? [Continue reading]
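    A minimal sketch of what one of those helpers could look like, assuming the configure-time compiler and flags are cached in environment variables (the variable and function names below are my assumptions, not necessarily ATF's final interface):

        # Hypothetical helper: succeed if a header is self-contained, i.e.
        # if a trivial program that includes only this header compiles with
        # the same compiler and flags that were used to build ATF.
        check_header_is_self_contained() {
            hdr="$1"

            cat > conftest.cpp <<EOF
        #include <${hdr}>
        int main(void) { return 0; }
        EOF
            ${ATF_BUILD_CXX:-c++} ${ATF_BUILD_CXXFLAGS} -c conftest.cpp -o conftest.o
            ret=$?
            rm -f conftest.cpp conftest.o
            return ${ret}
        }

    A test program could then call this once per public header and report each failure individually.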

  • Debug messages without using the C++ preprocessor

    If you are a frequent C/C++ programmer, you know how annoying code plagued with preprocessor conditionals can be: it hides build problems quite often because of, for example, trivial syntax errors or unused/undefined variables.

    I was recently given some C++ code to rewrite^Wclean up, and one of the things I did not like was a macro called DPRINT alongside its use of fprintf. Why? First because this is C++, so you should be using iostreams. Second because by using iostreams you do not have to think about the correct printf formatter for every type you need to print. And third because it obviously relied on the preprocessor and, of course, debug builds were already broken.

    I wanted to come up with an approach to print debug messages that involved the preprocessor as little as possible. This application (a simulator) needs to be extremely efficient in non-debug builds, so leaving calls to printf all around that internally translated to noops at runtime wasn't a nice option, because some serious overhead would still be left. So, if you don't use the preprocessor, how can you achieve this? Simple: current compilers have very good optimizers, so you can rely on them to do the right thing for release builds.

    The approach I use is as follows: I define a custom debug_stream class that contains a reference to a std::ostream object. Then, I provide a custom operator<< that delegates the insertion to the output stream. Here is the only place where the preprocessor is involved: a small conditional is used to omit the delegation in release builds:

        template< typename T >
        inline debug_stream&
        operator<<(debug_stream& d, const T& t)
        {
        #if !defined(NDEBUG)
            d.get() << t;
        #endif // !defined(NDEBUG)
            return d;
        }

    There is also a global instance of a debug_stream called debug. With this in mind, I can later print debugging messages anywhere in the code, as in:

        debug << "some message" << "\n";

    So how does this not introduce any overhead in release builds? In release builds, operator<< is effectively a noop. It does nothing. As long as the compiler can determine this, it will strip out the calls to the insertion operator.

    But there is an important caveat. This approach requires you to be extremely careful in what you insert into the stream. Any object you construct as part of the insertion, or any function you call, may have side effects. Therefore, the compiler must generate the call to the code anyway because it cannot predict what its effects will be. How do you avoid that? There are two approaches. The first one involves defining everything involved in the debug call as inline or static; the trick is to make the compiler see all the code involved and thus be able to strip it out after seeing it has no side effects. The second approach is simply to avoid such object constructions or function calls completely. Debug-specific code should not have side effects; otherwise you risk having different application behavior in debug and release builds! Not nice at all.

    A last note: the above is just a proof of concept. The code we have right now is more complex than what I showed above, as it supports debug classes, the selection of which classes to print at runtime, and prefixing every line with the class name. All of this requires some inline magic to get things right, but it seems to be working just fine now :-)

    So, the conclusion: in most situations, you do not need to use the preprocessor. Find a way around it and your developers will be happier. Really. [Continue reading]

  • userconf support for the boot loader

    I have a machine at work, a Dell Optiplex 745, that cannot boot GENERIC NetBSD kernels. There is a problem in one of the uhci/ehci, bge or azalia drivers that causes a lockup at boot time because of a shared-interrupt problem. Disabling ehci or azalia from the kernel lets the machine boot. In order to do that, there are two options: either you rebuild your kernel without the offending driver, or you boot into the userconf prompt with -c and, from there, manually disable the driver at each boot. Neither option is quite convincing.

    Of course, disabling a faulty driver is not the correct solution, but the workaround is useful on its own. I've just added a userconf command to the boot loader and its configuration file -- /boot and /boot.cfg respectively -- so that the end user can pass arbitrary userconf commands to the kernel in an automated way. userconf is a kernel feature that lets you change the parameters of builtin drivers and enable/disable them before the hardware detection routines are run.

    With this new feature in the boot loader, you can customize a GENERIC kernel without having to rebuild it! Yes, modules could help here too, but we are not there yet for hardware drivers. Note that OpenBSD has had a similar feature for a while with config -e, but they actually modify the kernel binary.

    You can check the patch out and comment about it in my post to tech-kern. [Continue reading]
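    With the patch applied, working around my machine's problem should boil down to a couple of lines in /boot.cfg along these lines (the exact userconf= syntax is part of the proposal, so treat it as tentative):

        # /boot.cfg: run these userconf commands before device autoconfiguration.
        userconf=disable ehci*
        userconf=disable azalia*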

  • ATF 0.6 released

    I am very happy to announce the availability of the 0.6 release of ATF. I have to apologize for this taking so long, because the code has been mostly ready for a while. However, doing the actual release procedure is painful. Testing the code in many different configurations to make sure it works, preparing the release files, uploading them, announcing the new release on multiple sites... not something I like doing often.

    Doing some late reviews, I have to admit that the code has some rough edges, but these could not delay 0.6 any more. The reason is that this release will unblock the NetBSD-SoC atfify project, making it possible to finally integrate all the work done in it into the main NetBSD source tree.

    Explicit thanks go to Lukasz Strzygowski. He was not supposed to contribute to ATF during his Summer of Code 2008 project, but he did, and he actually provided very valuable code.

    The next step is to update the NetBSD source tree to ATF 0.6. I have extensive local changes for this in my working copy, but I'm very tired at the moment. I think I'll postpone their commit until tomorrow so that I don't screw something up badly.

    Enjoy it, and I'm looking forward to your feedback on the new stuff! [Continue reading]

  • pwd_mkdb and the new time_t

    NetBSD-current has recently switched time_t to be a 64-bit type on all platforms to cope with the year-2038 problem. This is causing all sorts of trouble, and a problem I found yesterday was that, after a clean install of NetBSD/amd64, it was impossible to change the data of any user through chfn. The command failed with:

        chfn: /etc/master.passwd: entry root inconsistent expire
        chfn: /etc/master.passwd: unchanged

    Suspiciously, the data presented by chfn showed an expiration date for root set on a seemingly random day (October 14th, 2021). That seemed like some part of the system not parsing the user database correctly and generating random values. A sample test program that walked through the password database with getpwent(3) showed an invalid expiration date for root, even when /etc/master.passwd had a 0 in that field.

    After some debugging, I found out that libc tries to be compatible with old-format binary password databases (those generated by pwd_mkdb). In order to deal with compatibility, libc checks whether the database has a VERSION field set in it. If not, it assumes it is an old version and thus that time_t may not be 64-bit. If the VERSION field is set, it uses the new time_t.

    So what was the problem? pwd_mkdb did not set the VERSION field even though it wrote a new-format database. libc assumed the database was laid out according to the old format but it was not, so it got garbage data during parsing. After some hacking, I fixed it as described in a post to current-users.

    Soon after, Christos Zoulas (the one who did all the time_t work) told me that he had already done these changes in his branch but forgot to merge them. He has merged them now, and the code in the main branch should work fine. I still think there is a minor problem in it, but the major issue after installation should be gone. [Continue reading]

  • Windows 3.1 startup speed

    Out of boredom, I installed MS-DOS and Windows 3.1 on my machine a few days ago — yeah, I was inspired by the Hot Dog Stand comments in this post. Check it out here. Don't be scared, it was just a virtual machine!

    Anyway, this was fun because it reminded me of something. Back in 1994, my father bought a Pentium at 60MHz. After ordering it, we imagined how fast it could be compared to our older machine, a 386DX at 40MHz. Based on magazine reviews of those days, we supposed that Windows 3.1 could start in 1 or 2 seconds. But what a disappointment when we got the machine. It certainly was faster than the 386, but it took many more seconds than that to start Windows.

    Now, trying this same thing on the MacBook Pro, Windows 3.1 actually starts in less than 1 second. Finally, after almost 15 years, our expectations have come true! [Continue reading]

  • Silencing the output of Python's subprocess.Popen

    I'm learning Python these days while writing a script to automate the testing of ATF under multiple virtual machines. I had this code in a shell script, but it is so ugly and clumsy that I don't even dare to add it to the repository. Hopefully, the new version in Python will be more robust and versatile enough to be published.

    One of the things I've been impressed by is the subprocess module and, in particular, its Popen class. By using this class, it is trivial to spawn subprocesses and perform some IPC with them. Unfortunately, Popen does not provide any way to silence the output of the children. As I see it, it'd be nice if you could pass an IGNORE flag as the stdout/stderr behavior, much like you can currently set those to PIPE or set stderr to STDOUT.

    The following trivial module implements this idea. It extends Popen so that callers can pass the IGNORE value in the stdout/stderr arguments. (Yes, it is trivial, but it is also some of the first Python code I write so... it may contain obviously non-Pythonic, ugly things.) The idea is that it exposes the same interface so that it can be used as a drop-in replacement. OK, OK, it lacks some methods and the constructor does not match the original signature, but this is enough for my current use cases!

        import subprocess

        IGNORE = -3
        STDOUT = subprocess.STDOUT
        assert IGNORE != STDOUT, "IGNORE constant is invalid"

        class Popen(subprocess.Popen):
            """Extension of subprocess.Popen with built-in support for
            silencing the output channels of a child process"""

            __null = None

            def __init__(self, args, stdout = None, stderr = None):
                subprocess.Popen.__init__(self, args = args,
                                          stdout = self._channel(stdout),
                                          stderr = self._channel(stderr))

            def __del__(self):
                self._close_null()

            def wait(self):
                r = subprocess.Popen.wait(self)
                self._close_null()
                return r

            def _null_instance(self):
                if self.__null == None:
                    self.__null = open("/dev/null", "w")
                return self.__null

            def _close_null(self):
                if self.__null != None:
                    self.__null.close()
                    self.__null = None
                assert self.__null == None, "Inconsistent internal state"

            def _channel(self, behavior):
                if behavior == IGNORE:
                    return self._null_instance()
                else:
                    return behavior

    By the way, somebody else suggested this same thing a while ago. I don't know why it hasn't been implemented in the mainstream subprocess module. [Continue reading]
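    Assuming the module above is saved as, say, silent_subprocess.py (the file name is mine), using it looks exactly like using the standard class, with one extra option available:

        import subprocess
        import silent_subprocess

        # Capture stdout for processing, but throw stderr away.
        p = silent_subprocess.Popen(["ls", "-l"],
                                    stdout = subprocess.PIPE,
                                    stderr = silent_subprocess.IGNORE)
        output = p.communicate()[0]
        print output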

  • Hardware (and other stuff) on sale

    I need to get rid of lots of stuff that I haven't used for a long time and that is only taking up space here. Most of it will go to the trash but, if you are interested, let me know and make an offer! Yes, "for free" is valid on some items ;) The items that have a price are the ones that I think are most valuable, and I will keep them unless someone wants to buy them.

    Keep in mind that all this stuff is in Barcelona, Spain, and that shipping is not too practical for me. Still, if you are very interested in some item, we can probably sort it out.

    With all that said, check the list out. I may keep updating it during the following days.

    PS: How can I have accumulated so much crap?... [Continue reading]

  • Kernel modules in NetBSD/shark (or other ARMs)

    I reinstalled NetBSD-current recently on my shark (Digital DNARD) and, out of curiosity, I wanted to see if the new-style kernel modules worked fine on this platform. To test that, I attempted to load the puffs module and failed with an error message saying something like "kobj_reloc: unexpected relocation type 1". Similarly, the same error appeared when running the simpler regression tests in /usr/tests/modules.

    After seeing that error message, I tracked it down in the source code and ended up in src/sys/arch/arm/arm32/kobj_machdep.c. A quick look at it and at src/sys/arch/arm/include/elf_machdep.h revealed that the kernel was lacking support for the R_ARM_PC24 relocation type. "It can't be too difficult to implement", I thought. Hah!

    Based on documentation, I understood that R_ARM_PC24 is used in "short" jumps. This relocation is used to signal the runtime system that the offset to the target address of a branch instruction has to be relocated. This offset is a 24-bit number and, when loaded, it has to be shifted two bits to the left to accommodate the fact that instructions are 32-bit aligned. Before the relocation, there is an addend encoded in the instruction that has to be loaded, sign-extended, shifted two bits to the left and, after all that, added to the calculated address.

    I spent hours trying to implement support for the R_ARM_PC24 relocation type because it didn't want to work as expected. I even ended up looking at the Linux code to see how they dealt with it, and I found out that I was doing exactly the same as them. So what was the problem? A while later I realized that this whole thing wasn't working because the relocated address to be stored in the branch instruction didn't fit in the 24 bits! That makes things harder to solve.

    At that point, I looked at the port-arm mailing list and found that several other people were looking at this same issue. Great: some time "wasted", but a lot of new stuff learnt. Anyway, it turns out there are basically two solutions to the problem described above. The first involves generating jump trampolines for the addresses that fall too far away. The second one is simpler: just change the kernel to load the modules closer to the kernel text, and thus make the jump offsets fit into the 24 bits of the instructions. Effectively, there is someone who has got almost everything working already.

    Let's see if they can get it working soon! [Continue reading]
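    For the curious, here is a sketch of the arithmetic as I understood it. This is not NetBSD's actual code, and the names 'where' and 'target' are mine:

        #include <stdbool.h>
        #include <stdint.h>

        /* Sketch (not NetBSD's actual code) of applying an R_ARM_PC24
         * relocation: patch the 24-bit branch offset at 'where' so that
         * it jumps to 'target'.  Returns false when the displacement does
         * not fit in 24 bits -- the exact failure described above. */
        static bool
        reloc_arm_pc24(uint32_t *where, uintptr_t target)
        {
            uint32_t insn = *where;

            /* The addend is encoded in the low 24 bits; load it and shift
             * it two bits to the left, as instructions are 32-bit aligned. */
            int32_t addend = (insn & 0x00ffffff) << 2;
            if (addend & 0x02000000)
                addend -= 0x04000000;            /* sign-extend from 26 bits */

            /* Branches are PC-relative; ARM reads the PC 8 bytes ahead. */
            int32_t disp = (int32_t)(target + addend - (uintptr_t)where - 8);

            if (disp < -(1 << 25) || disp >= (1 << 25))
                return false;                    /* out of range: trampoline needed */

            *where = (insn & 0xff000000) | (((uint32_t)disp >> 2) & 0x00ffffff);
            return true;
        }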

  • Happy new 2009!

    Happy new year to everyone! I know that the blog has been quite dead recently. Let's see if it can revive during 2009 (but not today). I already have a few ideas for future posts, so stay tuned! [Continue reading]