  • Using RAII to clean up temporary values from a stack

    For the last couple of days, I have been playing around with the Lua C API and have been writing a thin wrapper library for C++. The main purpose of this auxiliary library is to ensure that global interpreter resources, such as the global state or the execution stack, are kept consistent in the presence of exceptions and, in particular, that none of them are leaked due to programming mistakes when handling error codes.

    To illustrate this point, let's forget about Lua and consider a simpler case. Suppose we lost the ability to pass arguments and return values from functions in C++ and all we have is a stack that we pass around. With this in mind, we could implement a multiply function as follows:

        void multiply(std::stack<int>& context) {
            const int arg1 = context.top(); context.pop();
            const int arg2 = context.top(); context.pop();
            context.push(arg1 * arg2);
        }

    And we could call our function like this:

        std::stack<int> context;
        context.push(5);
        context.push(6);
        multiply(context);
        const int result = context.top();
        context.pop();

    In fact, my friends, this is more or less what your C/C++ compiler does internally when converting code to assembly language. The way the stack is organized to perform calls is known as the calling convention of an ABI (a language/platform combination).

    Anyway, back to our point. One important property of such a stack-based system is that any function that deals with the stack must leave it in a consistent state: if the function pushes temporary values (read: local variables) onto the stack, those temporary values must be gone upon return, no matter how the function terminates. Otherwise, the caller will not find the stack as it expects it, which will surely cause trouble at a later stage. The example above works just fine because our function is extremely simple and does not put anything on the stack.

    But things get messier when our functions can fail halfway through and, in particular, when such failures are signaled by exceptions. In these cases, the function will abort abruptly, and it must take care to clean up any values that may still be left on the stack. Let's consider another example:

        void magic(std::stack<int>& context) {
            const int arg1 = context.top(); context.pop();
            const int arg2 = context.top(); context.pop();
            context.push(arg1 * arg2);
            context.push(arg1 / arg2);
            try {
                ... do something with the two values on top ...
                context.push(arg1 - arg2);
                try {
                    ... do something with the three values on top ...
                } catch (...) {
                    context.pop();  // arg1 - arg2
                    throw;
                }
                context.pop();
            } catch (...) {
                context.pop();  // arg1 / arg2
                context.pop();  // arg1 * arg2
                throw;
            }
            context.pop();
            context.pop();
        }

    The above is a completely fictitious and useless function, but it serves to illustrate the point. magic() starts by pushing two values onto the stack and then performs some computation that reads these two values. It later pushes an additional value and does some more computations on the three temporary values that sit on top of the stack.

    The "problem" is that the computation code can throw an exception. If it does, we must sanitize the stack to remove the two or three values we have already pushed. Otherwise, the caller will receive the exception, will assume nothing has happened, and will leak values on the stack (a bad thing). To prevent this, we have added a couple of try/catch clauses to capture these possible exceptions and to clean up the already-pushed values before exiting the function.
    Unfortunately, this gets old very quickly: having to add try/catch statements around every call is boring, ugly, and hard to read (remember that, potentially, any statement can throw an exception). You can see this in the example above with the two nested try/catch blocks.

    To mitigate this situation, we can apply a RAII-like technique to make popping elements on errors completely transparent and automated. If we can make it transparent, writing the code is easier and reading it is trivial; if we can make it automated, we can be certain that our error paths (rarely tested!) correctly clean up any global state. In C++, destructors are deterministically executed whenever a variable goes out of scope, so we can use this to our advantage to clean up temporary values. Let's consider this class:

        class temp_stack {
            std::stack<int>& _stack;
            int _pop_count;

        public:
            temp_stack(std::stack<int>& stack_) :
                _stack(stack_), _pop_count(0) {}

            ~temp_stack(void)
            {
                while (_pop_count-- > 0)
                    _stack.pop();
            }

            void push(int i)
            {
                _stack.push(i);
                _pop_count++;
            }
        };

    With this, we can rewrite our function as:

        void magic(std::stack<int>& context) {
            const int arg1 = context.top(); context.pop();
            const int arg2 = context.top(); context.pop();

            temp_stack temp(context);
            temp.push(arg1 * arg2);
            temp.push(arg1 / arg2);
            ... do something with the two values on top ...
            temp.push(arg1 - arg2);
            ... do something with the three values on top ...
            // Yes, we can return now.  No need to do manual pop()s!
        }

    Simple, huh? Our temp_stack class keeps track of how many elements have been pushed onto the stack. Whenever the function terminates, be it due to reaching the end of the body or due to an exception thrown anywhere, the temp_stack destructor will remove all previously registered elements from the stack. This ensures that the function leaves the global state (the stack) as it was on entry, modulo the function parameters consumed as part of the calling conventions.

    So how does all this play together with Lua? Well, Lua maintains a stack to communicate parameters and return values between C and Lua. Such a stack can be managed in a similar way with a RAII class, which makes it very easy to write native functions that deal with the stack and clean it up correctly in all cases. I would like to show you some non-fictitious code right now, but it's not ready yet ;-) When it is, it will be part of Kyua. Stay tuned!

    And, to conclude: to make C++ code robust, wrap objects that need manual clean-up (pointers, file descriptors, etc.) in small wrapper classes that perform such clean-up on destruction. These classes are typically fully inlined and contain a single member field, so they do not impose any performance penalty. On the contrary, your code can avoid the need for many try/catch blocks, which are tricky to get right and hard to validate. (Unfortunately, this technique cannot be applied in, e.g., Java or Python, because the execution of class destructors there is completely non-deterministic and not guaranteed to happen at all!)
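    To close with a concrete instance of that advice, here is a minimal sketch, of my own and not taken from the wrapper library mentioned above, of the same RAII idea applied to a POSIX file descriptor:

        // Hypothetical example: a tiny RAII wrapper around a file descriptor.
        // No matter how the scope exits, the descriptor gets closed.
        #include <fcntl.h>
        #include <unistd.h>

        #include <stdexcept>

        class auto_fd {
            int _fd;

            // Forbid copies: two copies would close the same descriptor twice.
            auto_fd(const auto_fd&);
            auto_fd& operator=(const auto_fd&);

        public:
            explicit auto_fd(int fd) : _fd(fd)
            {
                if (_fd == -1)
                    throw std::runtime_error("Invalid file descriptor");
            }

            ~auto_fd(void)
            {
                ::close(_fd);  // Destructors must not throw; ignore errors.
            }

            int get(void) const { return _fd; }
        };

        void process_file(void)
        {
            auto_fd fd(::open("/etc/hosts", O_RDONLY));
            // ... use fd.get() with read(2) and friends; even if this code
            // throws, the descriptor is closed when 'fd' goes out of scope ...
        }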

  • Dependency injection: simple class constructors

    Following my previous post on dependency injection (DI for short), I wanted to show you today another example of code in which DI helps in making the code clearer and easier to validate. In this case, the person to blame for the original piece of code being criticized is me.

    The atffile module in ATF provides a class to represent the contents of Atffiles. An Atffile is, basically, a file containing a list of test programs to run and a list of properties associated to these test programs. Let's consider the original implementation of this module:

        class atffile {
            strings_vector _test_programs;
            strings_vector _properties;

        public:
            atffile(const path& file)
            {
                std::ifstream input(file.c_str());
                _test_programs = ... parse list from input ...;
                _properties = ... parse list from input ...;
            }

            ... getters and other read-only methods ...
        };

    According to the object-oriented programming (OOP) principles we are taught over and over again, this seems like a reasonable design. An atffile object is entirely self-contained: if the constructor finishes successfully, we know that the new object matches exactly the representation of an Atffile on disk. The other methods in the class provide read-only access to the internal attributes, which ensures that the in-memory representation remains consistent.

    However, this design couples the initialization of an object with external dependencies, and that is bad for two main reasons: first, because it makes testing (very) difficult; and, second, because it makes an apparently simple action (constructing an object) a potentially expensive task (reading from an external resource).

    To illustrate the first point, let's consider a helper free function that deals with an atffile object:

        std::string
        get_property(const atffile& file, const std::string& name,
                     const std::string& defvalue)
        {
            const strings_vector& props = file.properties();
            const strings_vector::const_iterator iter = props.find(name);
            if (iter == props.end())
                return defvalue;
            else
                return *iter;
        }

    Now, how do we write unit tests for this function? Note that, to execute this function, we need to pass in an atffile object. And to instantiate an atffile, we need to be able to read an Atffile from disk, because the only constructor of the atffile class has this dependency on an external subsystem. So, summarizing: to test this innocent function, we need to create a file on disk with valid contents, we need to instantiate an atffile object pointing to this file, and only then can we pass it to the get_property function. At this point, our unit test smells like an integration test (and it actually is one) for no real reason. This will make our test suite more fragile (the test for this auxiliary function depends on the parsing of a file) and slow.

    How can we improve the situation? Easy: by decoupling the dependencies on external systems from the object initialization. Take a look at this rewritten atffile class:

        class atffile {
            strings_vector _test_programs;
            strings_vector _properties;

        public:
            atffile(const strings_vector& test_programs_,
                    const strings_vector& properties_) :
                _test_programs(test_programs_),
                _properties(properties_)
            {
                assert(!_test_programs.empty());
            }

            static atffile
            parse(const path& file)
            {
                std::ifstream input(file.c_str());
                strings_vector test_programs_ = ... parse list from input ...;
                strings_vector properties_ = ... parse list from input ...;
                return atffile(test_programs_, properties_);
            }

            ... getters and other read-only methods ...
        };

    Note that this new design does not necessarily violate OOP principles: yes, we can now construct an object with fake values in it by passing them to the constructor, but that does not mean that such values can be inconsistent once the object is created. In this particular example, I have added an assertion in the constructor to reinforce a check performed by parse (that an Atffile must list at least one test program).

    With this new design in mind, it is now trivial to test the get_property function shown above: constructing an auxiliary atffile object is easy, because we can inject values into the object before passing it to get_property. There is no need to create a temporary file that has to be valid and later parsed by the atffile code. Our test now follows the true sense of a unit test, which is much faster, less fragile and "to the point". We can later write integration tests if we so desire. Additionally, we can also write tests for atffile member functions, and we can very easily reproduce corner cases for them by, for example, injecting bad data. The only place where we need to create temporary Atffiles is when we need to test the parse class method.

    So, to conclude: make your class constructors as simple as possible and, in particular, do not make them depend on external systems. If you find yourself opening resources or constructing other objects from within your constructor, you are doing it wrong (with very few exceptions).

    I have been using the above principle for the last ~2 years and the results are neat: I am much, much more confident in my code because I write lots of more accurate test cases, and I can confine the dependencies on external resources to a small subset of functions. (Yes, Kyua uses this design pattern intensively!)
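    To make the testing benefit tangible, here is a minimal sketch of what a test for get_property could look like with the new constructor. This is an illustration only: the CHECK_EQUAL macro stands in for whatever assertion facility your test framework provides.

        // Hypothetical unit test: note that no files on disk are involved.
        static void
        test_get_property_returns_default_when_missing(void)
        {
            strings_vector test_programs_;
            test_programs_.push_back("some_test_program");

            const strings_vector properties_;  // Deliberately left empty.

            // Inject the fake values straight into the object; no parsing.
            const atffile file(test_programs_, properties_);

            // The property is not defined, so we expect the default value.
            CHECK_EQUAL("a-default", get_property(file, "foo", "a-default"));
        }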

  • Dependency injection and testing: an example

    A coworker just sent me some Python code for review and, among such code, there was the addition of a function similar to:

        def PathWithCurrentDate(prefix, now=None):
            """Extend a path with a year/month/day subdirectory layout.

            Args:
                prefix: string, The path to extend with the date subcomponents.
                now: datetime.date, The date to use for the path; if None, use
                    the current date.

            Returns:
                string, The new computed path with the date appended.
            """
            path = os.path.join(prefix, '%Y', '%m', '%d')
            if now:
                return now.strftime(path)
            else:
                return datetime.datetime.now().strftime(path)

    The purpose of this function, as the docstring says, is to simplify the construction of a path that lays out files on disk depending on a given date.

    This function works just fine... but it has a serious design problem (in my opinion) that you only see when you try to write unit tests for it (and guess what: the code to review did not include any unit tests for this function). If I ask you to write tests for PathWithCurrentDate, how would you do that? You would need to consider these cases (at the very, very least):

    - Passing now=None correctly fetches the current date. To write such a test, we must stub out the call to datetime.datetime.now() so that our test is deterministic. This is easy to do with helper libraries but does not count as trivial to me.
    - Could datetime.datetime.now() raise an exception? If so, test that the exception is correctly propagated to match the function contract.
    - Passing an actual date to now works. We know this is a different code path that does not call datetime.datetime.now(), but we must still stub it out to ensure that the test does not go through that path in case the current date actually matches the date hardcoded in the test as an argument to now.

    My point is: why is such a trivial function so complex to validate? Why does such a trivial function need to depend on external state? Things become more obvious if we take a look at a caller of this function:

        def BackupTree(source, destination):
            path = PathWithCurrentDate(destination)
            CreateArchive(source, os.path.join(path, 'archive.tar.gz'))

    Now, question again: how do we test this? Our tests would look like:

        def testOk(self):
            # Why do we even have to do this?
            ... create stub for datetime.datetime.now() to return a fake date ...
            BackupTree('/foo', '/backups/prefix')
            ... validate that the archive was generated in the fake date directory ...

    Having to stub out the call to datetime.datetime.now() before calling BackupTree is a really, really weird thing at first glance. To be able to write this test, you must have deep insight into how the auxiliary functions called within the function work, so as to know what dependencies on external state they have. Lots of black magic involved.

    All this said, the above may not seem like a big issue because, well, a call to datetime.datetime.now() is cheap. But imagine that the call being performed deep inside the dependency tree was more expensive and dealt with some external state that is hard to mock out.

    The trick to make this simpler and clearer is to apply a form of dependency injection (or, rather, "value injection"). We want the PathWithCurrentDate function to be a simple data manipulation routine that has no dependencies on external state (i.e. to make it purely functional). The easiest way to do so is to remove the now=None code path and pass the date in right from the most external caller (aka, the main() program).
    For example (skipping docstrings for brevity):

        def PathWithCurrentDate(prefix, now):
            path = os.path.join(prefix, '%Y', '%m', '%d')
            return now.strftime(path)

        def BackupTree(source, destination, backup_date):
            path = PathWithCurrentDate(destination, backup_date)
            CreateArchive(source, os.path.join(path, 'archive.tar.gz'))

    With this approach, the dependency on datetime.datetime.now() (aka, a dependency on global state) completely vanishes from the code. There are fewer code paths to validate, and they are much simpler to test. There is no need to stub out a function call seemingly unused by BackupTree.

    Another advantage of this approach can be seen if we were to have multiple functions accessing the same path. In this case, we would need to ensure that all calls receive the exact same date... what if the program kept running past 12AM and the "now" value changed? It is trivial to reason about this feature if the code does not have hidden queries to "now" (aka global state) within it... but it becomes tricky to ensure our code is right if we cannot easily audit where the "now" value is queried from!

    The "drawback", as some will think, is that the caller of any of these functions must do more work on its own to provide the correct arguments to the called functions. "And if I always want the backup to be created for the current date, why can't the backup function decide by itself?", they may argue. But, to me, the former is definitely not a drawback, and the latter... is troublesome, as explained in this post.

    Another "drawback", as some others would say, is that testing is not a goal. Indeed it is not: testing is only a means to "correct" code, but it is also true that having testable code often improves (internal) APIs and the overall design.

    To conclude: the above is an over-simplistic case study, and my explanations will surely not convince anyone to stop doing evil "clever" black magic from within functions (and, worse, from within constructors). You will only realize that the above makes any sense when you start unit-testing your code. Start today! :-)
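    As a parting illustration, the unit test for the refactored function can now be as small as this (a sketch using the standard unittest module, and assuming the PathWithCurrentDate definition from above is in scope):

        import datetime
        import unittest

        class TestPathWithCurrentDate(unittest.TestCase):
            """Tests for the purely-functional variant: no stubbing needed."""

            def testFixedDate(self):
                # The caller injects the date, so the test is deterministic.
                now = datetime.date(2010, 12, 31)
                self.assertEqual('/backups/2010/12/31',
                                 PathWithCurrentDate('/backups', now))

        if __name__ == '__main__':
            unittest.main()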

  • Kyua: Design of the configuration system

    Over a week ago, I mostly finished the implementation of the runtime engine for test cases of Kyua and, along the way, realized that it is imperative to write a configuration system right now, before the configuration code becomes messier than it already is.

    To that end, I spent the last week working on a design document for the configuration system. Summarizing, the document describes what the requirements for the configuration files of Kyua are and what the possible alternatives to implement them are, and advocates the use of Lua (a tiny embedded programming language) to bring these configuration files to life.

    It is your chance to get involved in the early stages of the development of Kyua! :-) Take a look at the email to kyua-discuss asking for comments and feel free to join the mailing list ;-)

  • Sticky bit trivia

    Did you ever wonder where the "sticky" part of the "sticky bit" name comes from? I actually didn't, but I just came across the Sticky bit page on Wikipedia through a tweet from @AaronToponce and discovered why.

    If you have used any "recent" (the quotes are important) Unix-like system, you probably know what the sticky bit is used for: to restrict the deletion of files in a directory to, basically, the owner of such files. The sticky bit is used on /tmp (among other directories) so that anyone can create temporary files in it, but only those who created the temporary files can delete them.

    But that's not all there is to know.

    The original purpose of the sticky bit was to mark frequently-used executable files so that the operating system kept their text segment on swap space. This sped up subsequent executions of such programs because the system would not need to access the file system to load the binary: it could just use the image already kept in the swap space. This behavior became obsolete with the advent of memory-mapped executables.

    Now it's clear why the sticky bit has the name it has, isn't it? But still, that's not all there is to know.

    SunOS 4 introduced a new behavior for the sticky bit on regular, non-executable files: reads and writes to such files would bypass the buffer cache, thus basically telling the system to perform raw I/O on those files. This was particularly useful on swap files when NFS was involved.

    With the above, I have just tried to summarize for you the information that is in NetBSD's chmod(2) and sticky(7) manual pages; they contain much more detailed information. (And yep, that's right: contrary to what the Wikipedia article says, NetBSD does not support this behavior any more.) Hope you found this insightful if you did not know this little piece of history!
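    In case you want to play with the directory behavior yourself, here is a quick sketch (the leading 1 in the octal mode is the sticky bit, which shows up as a trailing t in listings):

        # Create a world-writable directory, as /tmp typically is, in which
        # only the owner of a file can delete that file:
        mkdir /var/scratch
        chmod 1777 /var/scratch    # chmod +t would set just the sticky bit.
        ls -ld /var/scratch        # The mode now shows up as drwxrwxrwt.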

  • Getting the hang of Twitter searches

    I have had a Twitter account (@jmmv) for several years already, but I have never leveraged its power. Why? Well... basically because I have never known how.

    The Twitterville book by Shel Israel (@shelisrael), which I have been reading lately, has opened my mind quite a bit. Twitter is not so much about posting status updates, but more about sharing content and starting/joining conversations with other people. Today I like to think of Twitter as a world-wide chat-room.

    Twitterville mentions many times that the key to Twitter usage is to search for content. But finding content among all the junk that floats around Twitter is hard. I had seen that it is possible to actually do searches in the Twitter web page, but that is not really usable. Yes, you can search once for something you are interested in... but you know what, you can do the same in any search engine and get more relevant results.

    To me, what has made a difference is switching to a client that actually supports live searches (TweetDeck in my case). With such a client, all you have to do is create a search for any given topic you may be remotely interested in, and status updates will just pop up in your client as soon as someone posts about that particular topic. Easy, huh? See, it's like joining your favorite #topic chat-room.

    With this in mind, you can, for example, create a search such as "#netbsd OR #freebsd OR #openbsd" to get real-time tweets about these BSD operating systems. It is a fact that you will see loads of junk (disabling popup notifications is recommended), but you will catch some interesting content. And the best of it: you can reply to that content. This is particularly useful because you can (try to) fix misconceptions that people have before they spread too much.

    That said, the value you see in Twitter, and how you use it, is fully up to you. Different people will find different use cases, all of them interesting on their own.

    And to conclude, I have to confess that while the above may seem obvious to many, it is something that had escaped my mind until last week.

  • Introducing Kyua

    Wow. I have just realized that I have not blogged at all about the project that has kept me busy for the past two months! Not good, not good. "What is this project?", I hear. Well, this project is Kyua.

    A bit of background first: the Automated Testing Framework, or ATF for short, is a project that I started during the Summer of Code of 2007. The major goal of ATF was, and still is, to provide a testing framework for the NetBSD operating system. The ATF framework is composed of a set of libraries to aid in the implementation of test cases in C, C++ and shell, and a set of tools to ease the execution of such test cases (atf-run) and to generate reports of the execution (atf-report).

    At that point in time, I would say that the original design of ATF was nice. It made test programs intelligent enough to execute their test cases in a sandboxed environment. Such test programs could be executed on their own (without atf-run) and they exposed the same behavior as when they were run within the runtime monitor, atf-run. On paper this was nice, but in practice it has become a hassle. Additionally, some of these design decisions mean that particular features (in particular, parallel execution of tests) cannot be implemented at all. At the end of 2009 and beginning of 2010, I did some major refactorings of the code to make the test programs dumber and to move much of the common logic into atf-run, which helped a lot in fixing the major shortcomings encountered by the users... but the result is that, today, we have a huge mess.

    Additionally, while ATF is composed of different modules conceptually separate from each other, there are some hard implementation couplings among them that impose severe restrictions during development. Tangentially, during the past 2 years of working at Google (and coding mainly in Python), I have been learning new neat programming techniques to make code more testable... and these are not followed at all by ATF. In fact, while the test suite of ATF seems very extensive, it definitely is not: there are many corner cases that are not tested and for which implementing tests would be very hard (which means that nasty bugs have easily sneaked into releases).

    Lastly, a very important point that directly affects the success of the project. Outsiders who want to contribute to ATF face a huge entry barrier: the source repository is managed by Monotone, the bug tracker is provided by Gnats (a truly user-unfriendly system), and the mailing lists are offered by majordomo. None of these tools is "standard" by today's common practices, and some of them are tied to NetBSD's hosting, which puts some outsiders off.

    For all the reasons above, and as this year has been moving along, I have gotten fed up with the ATF code base. (OK, things are not that bad... but in my mind they are ;-) And here is where Kyua comes into the game.

    Kyua is a project to address all the shortcomings listed above. First of all, the project uses off-the-shelf development tools that should make it much, much easier for external people to contribute. Secondly, the project intends to be much more modular, providing a clear separation between the different components and providing code that is easily testable. Lastly, Kyua intends to remain compatible with ATF so that there are no major disruptions for users. You can (and should) think of Kyua as ATF 2.0, not as a vastly different framework.

    As of today, Kyua implements a runtime engine that is on par, feature-wise, with the one provided by atf-run. It is able to run test cases implemented with the ATF libraries and it is able to test itself. It currently contains 355 test cases that run in less than 20 seconds. (Compare that to the 536 test cases of ATF, which take over a minute to run; and Kyua is still really far from catching up with all the functionality of ATF.) Next actions involve implementing report generation and configuration files.

    Anyway. For more details on the project, I recommend you read the original posting to atf-devel or the project's main page and wiki. And of course, you can also download the preliminary source code to take a look!

    Enjoy :-)

  • Creating atf-based tests for NetBSD src

    Thanks to Antti Kantee's efforts, atf is seeing increasing visibility in the NetBSD community during the past few months. But one of the major concerns that we keep hearing from our developers is "Where is the documentation?". Certainly I have been doing a pretty bad job at that, and the current in-tree documents are a bit disorganized.

    To fix the short-term problem, I have written a little tutorial that covers pretty much every aspect that you need to know to write atf tests and, in particular, how to write such tests for the NetBSD source tree. Please refer to the official announcement for more details.

    Comments are, of course, welcome! And if you can use this tutorial to write your first tests for NetBSD, let me know :-)

  • ATF 0.10 released

    Ladies and gentlemen: I have just released ATF 0.10! This release with such a magic number includes lots of new exciting features and provides a much simplified source tree.

    Dive into the 0.10 release page for details!

    I'm now working on getting this release into the NetBSD tree to remove some of the custom patches that have been superseded by the official release. It will be there soon.

    And all this while I am at MeetBSD in Kraków :-)

  • Testing NetBSD: Easy Does It

    Antti Kantee has been, for a while, writing unit/integration tests for the puffs and rump systems (of which he is the author) shipped with NetBSD. Recently, he has been working on fixing the NetBSD test suite to report 0 failures on the i386 platform, so as to encourage developers to keep it that way while making changes to the tree. The goal is to require developers to run the tests themselves before submitting code.

    Antti has just published an introductory article, titled Testing NetBSD: Easy Does It, that describes what ATF and Anita are, how to use them, and how they can help in NetBSD development and deployment. Nice work!

  • ATF 0.9 released (late announcement)

    Oops! Looks like I forgot to announce the release of ATF 0.9 here a couple of weeks ago. Just a short notice that the formal release has been available since June 3rd and that 0.9 has been in NetBSD since June 4th!

    You can also enjoy a shiny-new web site! It even includes a FAQ!

    And, as a side note: I have added a test target to the NetBSD Makefiles, so now it's possible to just do make test within any subdirectory of src/tests/ and get what you expect.

  • Trac installation for ATF

    During the past few months, I've gotten into the habit of using a bug tracker to organize my tasks at the workplace. People assign tickets to me to get things done, and I also create and self-assign tickets to keep them as a reminder of the mini-projects to be accomplished. Sincerely, this approach works very well for me and keeps me focused.

    Since then, I've been wishing to have a similar system set up for ATF. Yeah, we could use the Gnats installation provided by NetBSD... but I hate this issue tracking system. It's ancient, ugly, and I really do want a web interface to manage my tickets through.

    So, this weekend, I finally took some time and set up a Trac installation for ATF to provide a decent bug/task tracking system. The whole Apache plus Trac setup was more complex than I imagined, but I do hope that the results will pay off :-)

    Take a look at the official announcement for more details!

  • Ads gone

    Almost a year ago, I decided to give AdSense a try. So far, the "earnings" have been ~30 EUR, which I cannot even cash. Given this, and given how ugly and disturbing the ads look on the front page, I have disabled them. (I think the ads have gotten much worse over time... but as I do not pay attention to the front page, I didn't see them.) Thanks to Roman Valls for pointing this out!

  • ATF 0.8 imported into NetBSD

    Finished importing ATF 0.8 into the NetBSD source tree. Wow, the CVS import plus merge was much easier than I expected.

    Note that, while the NetBSD test suite should continue to work as usual, there are some backwards-incompatible changes in the command line interface of the test programs. If you are used to running them by hand, expect different results. Please read the release news for details.

    Now let's wait for complaints about broken builds! And enjoy this new release in your NetBSD-current system!

  • Announcing ATF 0.8

    Looks like today is a release day. I've just pushed ATF 0.8 out in the wild and will proceed to import it into pkgsrc and NetBSD later. Refer to the release announcement for details. This is an exciting release! You have been warned ;-)

  • Announcing etcutils 0.1

    During the past week, I worked on a new package called etcutils. It provides a (reduced) tool-set to programmatically manage files in /etc and is specially designed to allow pkgsrc to update /etc/shells and /etc/services in a more consistent way.

    I'm happy to say that the 0.1 release is now ready! Go to the etcutils web page for details. (I know that if you are a Linux user, you probably don't care about this because your distribution most likely already provides something similar... albeit more complex.)

    I'll now proceed to import this new package into pkgsrc as sysutils/etcutils. Later on (most likely not today), the following should happen: rework pkginstall to use the new shells(8) utility to update /etc/shells, add a new feature to pkgsrc to abstract the updates to /etc/services, and sweep through pkgsrc to make all packages touching this file use the new frameworky option. (Hey FAM, that includes you!)

  • Forget about test(1)'s == operator

    Some implementations of test(1), in an attempt to be smart, provide non-standard operators such as ==. Please forget about those: they make your scripts non-portable and a pain to use on other systems. Why? Because, due to the way the shell works, failures in calls to test(1) will often just result in an error message (which may not be seen due to other output) and the script will happily continue running even if it failed to perform some important operation.

    So... just use the standard equality operators:

    - = for string equality comparison.
    - -eq for numeric equality comparison.

    Note that whenever I refer to test(1), I'm also talking about the [ ... ] construction in conditionals, as illustrated below. Also, please note that this also affects configure scripts, and the problem in these appears much more commonly than in other scripts!
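    As a quick illustration, here is the same check written in both the non-portable and the portable ways (a sketch; plain /bin/sh syntax):

        #! /bin/sh

        foo=bar count=5

        # Non-portable: many test(1) implementations reject '=='.
        if [ "${foo}" == "bar" ]; then echo "matched"; fi

        # Portable string comparison.
        if [ "${foo}" = "bar" ]; then echo "matched"; fi

        # Portable numeric comparison.
        if [ "${count}" -eq 5 ]; then echo "five"; fi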

  • Always define an else clause for portability #ifdefs

    If you use #ifdef conditionals in your code to check for portability features, be sure to always define a catch-all else clause that actually does something, even if that something is to error out.

    Consider the following code snippet, quoted from gamin's tests/testing.c file:

        if (arg != NULL) {
        #ifdef HAVE_SETENV
            setenv("GAM_CLIENT_ID", arg, 1);
        #elif HAVE_PUTENV
            char *client_id = malloc (strlen (arg) + sizeof "GAM_CLIENT_ID=");
            if (client_id) {
                strcpy (client_id, "GAM_CLIENT_ID=");
                strcat (client_id, arg);
                putenv (client_id);
            }
        #endif /* HAVE_SETENV */
        }
        ret = FAMOpen(&(testState.fc));

    The FAMOpen method queries the GAM_CLIENT_ID environment variable to set up the connection parameters to the FAM server. If the variable is not defined, the connection will still work, even though it will use some default internal value. In the test code above, the variable is explicitly set to let the tests use a separate server instance.

    Now, did you notice that we have two conditional branches? One for setenv and one for putenv? It seems reasonable to assume that one or the other must be present on any Unix system. Unfortunately, this is flawed:

    - What happens if the code forgets to include config.h?
    - What happens if the configure script fails to detect both setenv and putenv? This is not that uncommon, given how some configure scripts are written.
    - What happens if neither setenv nor putenv is available?

    The answer to the three questions is the same: the code above builds just fine but misbehaves at run time. Neither HAVE_SETENV nor HAVE_PUTENV is defined, so the code will not be able to define the required environment variable. However, FAMOpen will later be called and it will not behave as expected, because the variable has not been set.

    Note that this code snippet is just an example. I have seen many more instances of this exact same problem with worse consequences than the above. Read: they were not part of the test code, but part of the regular code path.

    So how do you implement the above in a saner way? You have two alternatives:

    - Add an #else clause that contains a fallback implementation. In the case above, we could, for example, prefer to use setenv if present because it has a nicer interface, and fall back to putenv if not found. This has a disadvantage though: if you forget to include config.h, or the configure script cannot correctly detect one of the possible implementations (even when present), you will always use the fallback implementation.
    - Keep each possible implementation correctly protected by a conditional, but add an #else clause that raises an error at build time. This will make sure that you never forget to define at least one of the portability macros, for any reason. This is the preferred approach.

    Following the second suggestion above, the code would get the following structure:

        #if defined(HAVE_SETENV)
            setenv(...);
        #elif defined(HAVE_PUTENV)
            putenv(...);
        #else
        #   error "Don't know how to set environment variables."
        #endif

    With this code, we can be sure that the build will fail if none of the possible implementations is selected. We can later proceed to investigate why that happened.

  • Where does Gnome use file monitoring?

    One of the readers of my post yesterday wonders where Gnome uses the file monitoring APIs. Well, the answer is: everywhere. Here are some examples:

    - Nautilus monitors all open folders so that it can update their contents whenever the underlying file store changes. Say you are viewing the Documents folder and you save a new document from within OpenOffice into that folder. You definitely want Nautilus to show it immediately, without having to manually hit Refresh from the menu.
    - The trash applet monitors the trash folders and updates its icon from empty to full whenever one of these folders ceases to be empty.
    - The panel monitors the applications directory to notice when new applications get installed. This allows it to update the Applications menu immediately as soon as a new program gets installed into the system.
    - The GTK file open/save dialogs monitor the directory they are viewing, for the same reason as Nautilus. (Actually, I'm unsure about this point. My Linux installation is an old Ubuntu 8.04 LTS that does not have GIO, so I can't verify it in recent ones. However, this makes perfect sense and, if not implemented, it should be!)
    - The background switcher control panel monitors the folders containing images to be able to show newly installed backgrounds. (I'm not sure about this either. It doesn't happen in my Linux installation, but it also makes sense.)
    - Media players such as Rhythmbox and Banshee allow you to point them at a folder containing music and have an option to automatically add music to the library as soon as it pops up in that folder.
    - Potentially, any document editor, picture viewer, etc. monitors the documents opened in it so that the application can notice external modifications to those files. This is useful to prevent overwriting a file with an out-dated in-memory copy. For example: you are taking some notes with GEdit. On a separate terminal window, you quickly edit the notes file with Vim to add a new note. When you go back to GEdit, you want the editor to tell you that the file has changed out of its control and offer you a choice: e.g. reload or ignore?
    - And many, many more situations that I'm surely missing...

    As you can see, it is fairly important to get the file monitoring subsystem working flawlessly. Otherwise, all these tiny details don't work and the end user experience is undermined.

  • New gio-fam package

    As briefly outlined in the previous post, new versions of Glib provide GIO, a library that intends to be a low-level file system API on top of the POSIX interface. This library provides an interface to asynchronously wait for file system change notifications, including the creation, deletion and modification of files.

    The monitoring functionality in GIO is modular: it is backed by different loadable plugins that implement OS-specific functionality. In particular, GIO uses an inotify module in Linux and a FAM module everywhere else.

    Up until now, the devel/glib2 package in pkgsrc provided a build-time option to specify whether to build the GIO FAM plugin or not. Given that this plugin is built as a shared object that is loaded dynamically at run-time, having a build-time option for this is clearly wrong: it gives no choice to those relying on binary packages (e.g. end/new users). Furthermore, it adds a dependency on the ugly FAM at the very bottom of the huge Gnome dependency chain. (As already stated, FAM is outdated and hard to set up.)

    So, based on this, I've just removed all FAM support from devel/glib2 altogether and packaged its loadable module as sysutils/gio-fam.

    Now waiting for a clean rebuild of the Gnome packages to see if the desktop now works on my machine by avoiding FAM/Gamin.

  • File system monitoring, Gnome and NetBSD

    A few days ago, I decided to start using NetBSD, as well as Gnome on NetBSD, once again, mostly because not using them makes my skills feel rusty in many different areas. While NetBSD has surprised me in a good way (I am running it on a MacBook Pro and things like wireless and DRI work), Gnome has not. There are tons of broken things that prevent a smooth user experience.

    One of these broken things is the monitoring of changes in the file system. Actually, this has never worked 100%. But what is this and why does it matter, you ask? Well, file system monitoring is an internal component of the Gnome infrastructure that allows the desktop to receive notifications when files or directories change. This way, if, say, you are viewing the Downloads folder in Nautilus and you start downloading a file from Epiphany into that folder, Nautilus will notice the new file and show it immediately, without requiring a manual refresh.

    How to monitor the file system depends on the operating system. There are basically two approaches: polling and asynchronous notifications. Polling is suboptimal because the notifications are usually delayed. Asynchronous notifications are tied to the operating system: Linux provides inotify, NetBSD provides kqueue, and other systems provide their own APIs.

    In the past, Gnome monitored the file system through a combination of FAM, a system-level service that provides an API for file system monitoring, and GNOME VFS, a high-level layer that hides the interaction with FAM. This approach was good in spirit (client/server separation) but didn't work well:

    - FAM is abandoned.
    - FAM does not support kqueue out of the box.
    - FAM runs as root.
    - FAM is too hard to set up: it requires rpcbind, an addition to /etc/services, a sysctl tweak, and the configuration of a system-level daemon.

    To solve some of these problems, a drop-in replacement for FAM was started. Gamin, as it is known, still does not fix everything:

    - Gamin is abandoned.
    - Gamin supports kqueue out of the box, but it does not work very well.
    - Actually, Gamin itself does not work. Running the tests provided by the distfile on a modern Linux system results in several test failures.

    Anyway. Did you notice the abandoned pattern above? This is important: in the new world order, Gnome does not use FAM any more.

    The new structure to monitor files is this: the low-level glib library provides the gio module, which has some file system monitoring APIs. The GVFS module provides higher-level abstractions for file system management and relies on gio for file system monitoring. There is no GNOME VFS any more.

    The key point is: gio uses inotify directly; no abstraction layers in between. FAM support is still there for platforms without inotify but, as it is not used in Linux any more, it rots. Linux developers will never experience what it is to have a system that needs to use FAM to get this functionality to work.

    At last, let's look at the status of all this in NetBSD:

    - The FAM package was patched to support kqueue. Although this kinda works, it is not perfect. Also, as mentioned above, FAM is, I'd say, the package with the hardest installation procedure of the whole Gnome platform.
    - The Gamin packages are nicer than the FAM package regarding their configuration. However, when using Gamin instead of FAM, all sorts of bugs appear in Gnome (it actually gets stuck during startup for me). The breakage of the unit tests does not provide any confidence and, given that Gamin is abandoned, the idea of fixing it does not thrill me.
    - The glib2 package depends on FAM. This is ugly; really ugly. I had to shout WTF when I saw this, seriously.
    - Seeing the direction gio/gvfs are taking, it is obvious that things can only get worse in the future.

    If time permits, I'm planning to work on improving this whole situation. Ideas include:

    - Splitting the FAM gio module out of the glib2 package. Ideally, this would happen upstream.
    - Implementing a gio backend for kqueue (see the sketch below for the kind of API such a backend would build upon).
    - Checking if the core packages still using gnome-vfs have a more recent version that uses gvfs instead and, if so, updating them.

    Can't promise you anything other than, if I get to work on it, I will keep you posted!
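    For the curious, this is roughly what a kqueue-based file monitor looks like at the C level; a minimal sketch of my own (not code from glib or from any of the packages above) that watches a single file for changes:

        /* Minimal kqueue(2) file-monitoring sketch for NetBSD and other BSDs. */
        #include <sys/types.h>
        #include <sys/event.h>
        #include <sys/time.h>

        #include <err.h>
        #include <fcntl.h>
        #include <stdio.h>

        int
        main(void)
        {
            struct kevent change, event;
            int fd, kq;

            if ((kq = kqueue()) == -1)
                err(1, "kqueue");

            /* Unlike inotify, kqueue monitors files through open descriptors,
             * not through paths. */
            if ((fd = open("/tmp/watched-file", O_RDONLY)) == -1)
                err(1, "open");

            EV_SET(&change, fd, EVFILT_VNODE, EV_ADD | EV_ENABLE | EV_CLEAR,
                   NOTE_WRITE | NOTE_EXTEND | NOTE_DELETE | NOTE_RENAME, 0, 0);

            for (;;) {
                /* Blocks until the file is written, extended, deleted or
                 * renamed. */
                if (kevent(kq, &change, 1, &event, 1, NULL) == -1)
                    err(1, "kevent");
                printf("File changed; fflags=0x%x\n",
                       (unsigned int)event.fflags);
            }
        }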

  • NetBSD in Google Summer of Code 2010

    For the 6th year in a row, NetBSD is a mentoring organization for Google Summer of Code 2010!

    If you are a bright student willing to develop full-time for an open source project during this coming summer, consider applying with us! You will have a chance to work with very smart people and, most likely, in the area that you are most passionate about. NetBSD, being an operating system project, has offers for project ideas at all levels: from the kernel to the packaging system, passing through drivers, networking tools, user-space utilities, the system installer, automation tools and more!

    I would like to point you at the 3 project proposals I'm willing to directly mentor:

    - Optimize and speed up ATF: Make the testing framework blazing fast so that running the NetBSD automated tests does not take ages on slow platforms.
    - Reorganize ATF to improve modularity: Refactor pieces of the testing framework so that it is easier to redistribute, has cleaner interfaces and is easier to depend on from third-party projects.
    - Rewrite pkg_comp with portability as a major goal: Use Python to create a tool to automatically build binary packages from within a sandbox.

    If you find any of the above projects interesting, or if you have any other project proposal that you think I could mentor, do not hesitate to contact me. Feel free to send me a draft of your application, together with a bit of information about you, so that we can discuss your proposal and make sure it gets selected!

    Or, if none of the projects above interests you, please do check out the full list of NetBSD project proposals. I'm sure you will find something that suits your interests :-)

  • New version of the monotone-server package in pkgsrc

    Wow, it has been a long time... 5 years ago, I created the monotone-server package in pkgsrc, a package that provided an interactive script to set up a monotone server from scratch with, what I thought, minimal hassle.

    My package did the job just fine, but last year I was blown away by the simplicity of the same package in Fedora: their init.d script provides a set of extra commands to initialize the server before starting it up, and that is it. No need to mess with a separate interactive script; no need to create and memorize passphrases that you will never use; and, what's more, it is all integrated in the single place that makes sense: the init.d "service management" script.

    It has been a while since I became jealous of their approach, but I've finally gotten to it: I've spent the last few days rewriting the monotone-server package in pkgsrc and came up with a similar scheme. And this new package just made its way to pkgsrc-HEAD!

    The new package comes with what I think is a detailed manual page that explains how to configure the server from scratch. Take a look and, if you find any mistakes, inconsistencies or improvements to be made, let me know!

    In the meantime, I will log into my home server, rebuild the updated package and put it in production :-)

  • Introducing the ATF nofork branch

    Despite my time for free software being virtually zero these days, I have managed to implement a prototype of what ATF would look like if it didn't implement forking and isolation in test programs. This feature has often been requested by users to simplify their lives when debugging test cases.

    I shouldn't repeat everything I posted on the atf-devel mailing list regarding this announcement, so please refer to that email for details. But I must say that the results look promising: the overall code of ATF is much simpler and also faster. (An execution I just tried cuts the run time of the ATF test suite from 1m 41s to 1m 16s.) Expect more simplifications and speed-ups!

  • set -e and set -x

    If you write shell scripts, you definitely need to know about two nice features that can be enabled through the set builtin:

    - set -e: Enables checking of all commands. If a command exits with an error and the caller does not check that error, the script aborts immediately. Enabling this will make your scripts more robust. But don't wait until your script is "complete" to set the flag as an afterthought, because it will be a nightmare to fix the script to work with this feature enabled. Just write set -e as the very first line of your code; well... right after the shebang.
    - set -x: If you are writing simple scripts that are meant to, well, script the execution of a few tasks (as opposed to being full-blown programs written in shell), set this flag to trace the execution of all commands. This will make the interpreter print each command right before it is executed, so it will aid you in knowing what is happening at any point in time.
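    A tiny sketch showing both flags in action:

        #! /bin/sh
        # Abort on the first unchecked error and trace every command.
        set -e
        set -x    # Equivalent shortcut for both flags: set -ex

        mkdir /tmp/demo.$$              # Traced as: + mkdir /tmp/demo.NNNN
        cp /nonexistent /tmp/demo.$$    # Fails, so the script aborts here.
        echo "never reached"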

  • Installing NetBSD/macppc on a Mac Mini G4

    Yesterday, I spent a while installing NetBSD/macppc 5.0.1 on a Mac Mini G4. The process wasn't easy, as it involved the following steps. I'm omitting many details, as they are "common knowledge" to Mac users (or otherwise can be easily found on the net):

    1. After booting the installer from the CD image, drop into the shell.
    2. Use pdisk to create an Apple_HFS partition for the boot loader and two Apple_UNIX_SVR2 partitions, one for the root file system and another for swap.
    3. Run sysinst and install the system. When asked to repartition the disk, just say "Use existing partition sizes".
    4. Once the system is installed, drop into the shell again before rebooting.
    5. Mount your hard disk on /mnt and chroot into it.
    6. Fetch a copy of pkgsrc.
    7. Install the sysutils/hfsutils package.
    8. Use hformat to create a new HFS file system in the Apple_HFS partition we created.
    9. Mount the installation CD.
    10. Copy, using hcopy, the ofwboot.xcf file from the CD to the boot partition.
    11. Reboot.
    12. Drop into the OpenFirmware setup (Command+Option+O+F).
    13. Set boot-device to hd:,ofwboot.xcf.
    14. Set boot-file to netbsd.
    15. And here is the tricky thing to get the machine to auto-boot: set boot-command to ." hello" cr " screen" output boot, not mac-boot.

    I found the last command somewhere on the Internet (dunno where now), but, supposedly, a regular mac-boot should have worked. In fact, it works if you call this command from the prompt, but not during automatic boot. (It turns out to be a problem with the version of OpenFirmware I have.)

    Just writing down the steps in case I need them later on. Installing Debian stable was much, much easier, but the installer for testing crashed every day with a different error, so I gave up.

    (Oh, by the way, I did the same installation on an old PowerMac G3 and that was really painful. The machine refused to boot from any of the CDs I tried, and the prebuilt kernels hung during initialization due to a bogus driver. In the end: netbooting and using custom kernels.)