• SoC: The atf-run tool

    One of the key goals of atf is to let the end user — not only the developer — easily run the tests after the corresponding application is installed. (In our case, application = NetBSD, but remember that atf also aims to be independent from NetBSD.) This also means, among other things, that the user must not need any development tools installed (the comp.tgz set) to run the tests, which rules out using make(1)... how glad I am of that! :-)

    Based on this idea, each application using atf will install its tests alongside its binaries, the current location being /usr/tests/<application>. These tests will be accompanied by a control file — an Atffile — that lists which tests have to be run and in which order. (In the future this may also include configuration or some other settings.) Later on, the user will be able to launch the atf-run tool inside any of these directories to automatically run all the provided tests, and the tool will generate a pretty report while the tests are run.

    Given that atf is an application, it has to be tested. After some work today, it is finally possible for atf to test itself! :-) Of course, it also comes with several bootstrap tests, written using GNU Autotest, to ensure that atf's core functionality works before one can run the tests written using atf itself. Otherwise one could get unexpected passes due to bugs in the atf code.

    This is what atf installs:

        $ find /tmp/local/tests
        /tmp/local/tests
        /tmp/local/tests/atf
        /tmp/local/tests/atf/Atffile
        /tmp/local/tests/atf/units
        /tmp/local/tests/atf/units/Atffile
        /tmp/local/tests/atf/units/t_file_handle
        /tmp/local/tests/atf/units/t_filesystem
        /tmp/local/tests/atf/units/t_pipe
        /tmp/local/tests/atf/units/t_pistream
        /tmp/local/tests/atf/units/t_postream
        /tmp/local/tests/atf/units/t_systembuf
        $

    All the t_* files are test programs written using the features provided by libatf. As you can see, each directory provides an Atffile, which lists the tests to run and the directories to descend into.

    The atf-run tool already works (*cough* its code is ugly, really ugly) and returns an appropriate error code depending on the outcomes of the tests. However, the report it generates is completely unreadable. This will be the next thing to attack: I want to be able to generate plain-text reports to be seen as the tests progress, but also to generate pretty HTML files. To do the latter, the plan is to use some intermediate format such as XML and have another tool do the formatting.
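    To make the overall idea a bit more concrete, here is a rough sketch — in no way atf's actual code — of what such a runner has to do: walk the tests tree, consult each Atffile, execute the test programs and aggregate their exit codes. The one-name-per-line Atffile format used here is purely hypothetical, since the real syntax is still taking shape.

        #include <cstdlib>
        #include <filesystem>
        #include <fstream>
        #include <iostream>
        #include <string>

        // Hypothetical: treat the Atffile as a plain list of entries, one
        // per line; directories are descended into, programs are executed.
        static int run_tests_in(const std::filesystem::path& dir) {
            int failures = 0;
            std::ifstream atffile(dir / "Atffile");
            std::string name;
            while (std::getline(atffile, name)) {
                if (name.empty())
                    continue;
                const std::filesystem::path entry = dir / name;
                if (std::filesystem::is_directory(entry)) {
                    failures += run_tests_in(entry);  // Recurse as listed.
                } else if (std::system(entry.string().c_str()) != 0) {
                    failures++;  // Count every non-zero exit as a failure.
                }
            }
            return failures;
        }

        int main(int argc, char* argv[]) {
            const int failures = run_tests_in(argc > 1 ? argv[1] : ".");
            std::cout << failures << " test program(s) failed" << std::endl;
            return failures == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
        }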

  • SoC: Prototypes for basename and dirname

    Today, I've attempted to build atf on a NetBSD 4.0_BETA2 system I've been setting up on a spare box I had around, as opposed to the Mac OS X system I'm using for daily development. The build failed due to some well-understood problems, but there was an annoying one with respect to some calls to the standard XPG basename(3) and dirname(3) functions.

    According to the Mac OS X manual pages for those functions, they are supposed to take a const char * argument. However, the NetBSD versions of these functions take a plain char * parameter instead — i.e., not a constant pointer.

    After Googling for some references and with advice from joerg@, I got the answer: it turns out that the XPG versions1 of basename and dirname can modify the input string by trimming trailing directory separators (even though the current implementation in NetBSD does not do that). This makes no sense to me, but it's what the XPG4.2 and POSIX.1 standards specify.

    I've resolved this issue by simply re-implementing basename and dirname (which I wanted to do anyway), making my own versions take and return constant strings. And to make things safer, I've added a check to the configure script that verifies whether the basename and dirname implementations take a constant pointer and, in that (incorrect) case, uses the standard functions to sanity-check the results of my own by means of an assertion.

    1 Of course, the GNU libc library provides its own variations of basename and dirname. However, including libgen.h forces the usage of the XPG versions.
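    Going back to the reimplementation mentioned above, this is more or less what const-friendly replacements look like — a minimal sketch, not the actual atf code, and it deliberately skips the trailing-separator trimming mandated by XPG:

        #include <string>

        // Return everything before the last separator; the input is never
        // modified, unlike with the XPG basename(3)/dirname(3).
        std::string my_dirname(const std::string& path) {
            const std::string::size_type pos = path.find_last_of('/');
            if (pos == std::string::npos)
                return ".";  // No separator at all: the current directory.
            if (pos == 0)
                return "/";  // Only the root separator remains.
            return path.substr(0, pos);
        }

        // Return the component after the last separator, or the whole
        // string if there is none.
        std::string my_basename(const std::string& path) {
            const std::string::size_type pos = path.find_last_of('/');
            return pos == std::string::npos ? path : path.substr(pos + 1);
        }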

  • SoC: Start of the atf tools

    Aside from the libraries I already mentioned in a past post, atf1 will also provide several tools to run the tests. An interesting part of the problem, though, is that many tests will be written in the POSIX shell language, as that will be much easier than writing them in C or C++: the ability to rapidly prototype tests is a fundamental design goal; otherwise nobody would write them!

    However, providing two interfaces to the same framework (one in POSIX shell and one in C++) means that there could be a lot of code duplication between the two if not done properly. Not to mention that sanely and safely implementing some of these features in shell scripting could be painful.

    In order to resolve the above problem, atf will also provide several binary tools that will be helpers for the shell scripts. Most of these tools will be installed in the libexec directory, as they should not be exposed to the user, yet the shell scripts will need to be able to reach them. The key idea will be to later build the shell interface on top of the binary one, reusing as much code as possible.

    So far I have the following tools:

    atf-config: Used to dynamically query information about atf's installation. This is needed to let the shell scripts locate where the tools in libexec can be found (because they are not in the path!).

    atf-format: Pretty-prints a message (single- or multi-paragraph), wrapping it on terminal boundaries.

    atf-identify: Calculates a test program's identifier based on where it is placed in the file system. Test programs will be organized in a directory hierarchy, and each of them has to have a unique identifier. (A sketch of this idea follows at the end of this post.)

    The next one to write, hopefully, will be atf-run: the tool to effectively execute multiple test programs in a row and collect their results.

    Oh, and in case you are wondering: yes, I have decided to provide each tool as an independent binary instead of a big one that wraps them all (such as cvs(1)). This is to keep them as small as possible — so that shell scripts can load them quickly — and because this seems to be more in line with the traditional Unix philosophy of having tools that do very specific things :-)

    1 Should I spell it atf, ATF or Atf?
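    As for the atf-identify sketch promised above — and this is purely hypothetical, since the real naming scheme is not settled — one could derive the identifier by stripping the tests root from the program's path and flattening the remaining directories into the name:

        #include <iostream>
        #include <string>

        // Hypothetical: drop the "<root>/" prefix and turn the remaining
        // path separators into underscores to build a unique identifier.
        std::string identify(const std::string& root, const std::string& path) {
            std::string id = path;
            if (id.size() > root.size() && id.compare(0, root.size(), root) == 0)
                id.erase(0, root.size() + 1);
            for (std::string::size_type i = 0; i < id.size(); i++)
                if (id[i] == '/')
                    id[i] = '_';
            return id;
        }

        int main() {
            // Prints "atf_units_t_pipe".
            std::cout << identify("/usr/tests", "/usr/tests/atf/units/t_pipe")
                      << std::endl;
        }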

  • SoC: A quote

    I've already spent a bunch of time working on the packaging (as in what will end up in the .tar.gz distribution files) of atf, even though it is still in very preliminary stages of development. This involved:

    Preparing a clean and nice build framework, which due to the tools I'm using meant writing the configure.ac and Makefile.am files. This involved adding some useful comments and, even though I'm familiar with these tools, re-reading parts of the GNU Automake and GNU Autoconf manuals; this last step is something that many, many developers bypass, and they therefore end up with really messy scripts, as if those weren't important. (Read: if the package used some other tools, there'd be no reason not to write pretty and clean build files.)

    Preparing the package's documentation, or at least placeholders for it: I'm referring to the typical AUTHORS, COPYING, NEWS, README and TODO documents that many developers seem to treat as garbage and fill up at the last minute before rushing out a release, ending up with badly formatted texts full of typos. Come on, that's the very first thing a user will see after unpacking a distribution file, so these ought to be pretty!

    Having spent a lot of time packaging software for pkgsrc and dealing with source code from other projects, I have to say that I've found dozens of packages that do not have the minimum quality one can expect in the above points. I don't like to pinpoint, but I have to: this includes several GNOME packages and libspe. This last one is fairly "interesting" because the user documentation for it is high-quality, but all the credibility slips away when you look at the source code packaging...

    To all those authors:

    “Programs should be written and polished until they acquire publication quality.” — Niklaus Wirth

  • SoC: Project name

    The automated testing framework I'm working on is a project idea that has been around for a very long time. Back in SoC 2005, this project was selected but, unfortunately, it was not developed. At that time, the project was named regress, a name derived from the name currently used in NetBSD's source tree to group all available tests: the src/regress directory.

    In my opinion, the "regress" name was not very adequate because regression tests are just one kind among all possible tests: those that detect whether a feature that was supposed to be working has started to malfunction. There are other kinds of tests, such as unit tests, integration tests and stress tests, all of which seemed to be excluded from the project just because of its name.

    When I wrote my project proposal this year, I tried to avoid the "regression testing" name wherever possible and, instead, simply used the word "testing" to emphasize that the project was not focusing on any specific test type. Based on that, the NetBSD-SoC administrators chose the atf name for my project, which stands for Automated Testing Framework. This is a very simple name, and, even though it cannot be easily pronounced, I don't dislike it: it is short, feels serious and clearly represents what the project is about.

    And for the sake of completeness, let me mention another idea I had for the project's name. Back when I proposed it, I thought it could be named "NetBSD Automated Testing Framework", which could then be shortened to nbatf or natf (very similar to the current name, eh?). Based on the latter name, I thought: the "f" makes it hard to pronounce, so it'd be reduced to "nat", and then it could be translated to the obvious (to me) person name that derives from it: Natalie. That name stuck in my head for a short while, but it doesn't look too serious for a project name, I guess ;-) But now, as atf won't be tied to NetBSD, that doesn't make much sense anyway.

  • SoC: Getting started

    This weekend I have finally been able to start coding for my SoC project: the Automated Testing Framework for NetBSD. To my disliking, this has been delayed too much... but I was so busy with my PFC that I couldn't find any other chance to get my hands on it.

    I've started by working on two core components:

    libatf: The C/C++ library that will provide the interface to easily write test cases and test suites. Among its features will be the ability to report the results of each test, a way to define meta-data for each test, etc.

    libatfmain: A library that provides a default entry point for test programs that takes care of running the test cases in a controlled environment — e.g. it captures all signals and deals with them gracefully — and provides a standard command-line interface for them. (See the sketch at the end of this post.)

    Soon after I started coding, I realized that I might need to write my own C code to deal with data collections and safe strings. I hate that, because it's a very boring task — it is not related to the problem at hand at all — and because it involves reinventing the wheel: virtually all other languages provide these two features for free. But wait! NetBSD has a C++ compiler, and atf won't be a host tool1. So... I can take advantage of C++, and I'll try to. Calm down! I'll try to avoid some of the "complex" C++ features as much as possible to keep the executables' size small enough. You know how binaries' sizes blow up when using templates... Oh, and by the way: keep in mind that test cases will typically be written in POSIX shell script, so in general you won't need to deal with the C++ interface.

    Furthermore, I see no reason for atf to be tied to NetBSD. The test cases will surely be, but the framework needn't be. Thus I'm thinking of creating a standalone package for atf itself and distributing it as a completely independent project (under the TNF2 umbrella), which will later be imported into the NetBSD source tree as we currently do for other third-party projects such as Postfix. In fact, I've already started work in this direction by creating the typical infrastructure to use the GNU auto-tools. Of course this separation could always be done at a later step in the development, but doing it from the very beginning ensures the code is free of NetBSD-isms, emphasizes the portability desire and keeps the framework self-contained.

    I'd like to hear your comments about these "decisions" :-)

    1 A host tool is a utility that is built with the operating system's native tools instead of with NetBSD's tool-chain: i.e. host tools are what build.sh tools builds. Such tools need to be highly portable because they have to be accepted by old compilers and bizarre build environments.

    2 TNF = The NetBSD Foundation.
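    As promised above, here is a rough illustration — emphatically not libatfmain's real interface, which doesn't exist yet — of what a default entry point that runs test cases in a controlled environment might look like:

        #include <csignal>
        #include <cstdlib>
        #include <iostream>
        #include <vector>

        // Hypothetical registry of test cases; the real library will have
        // a much richer interface (meta-data, result reporting, etc.).
        struct test_case {
            const char* name;
            void (*run)();
        };
        static std::vector<test_case> tests;

        static void on_signal(int) {
            // A real implementation would report which test failed and
            // clean up; here we just bail out with an error code so that
            // a crashing test case remains "controlled".
            std::_Exit(EXIT_FAILURE);
        }

        int main() {
            std::signal(SIGSEGV, on_signal);  // Capture crashes...
            std::signal(SIGTERM, on_signal);  // ...and termination requests.
            for (const test_case& tc : tests) {
                std::cout << "running: " << tc.name << std::endl;
                tc.run();
            }
            return EXIT_SUCCESS;
        }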

  • Ohloh, an open source directory

    A friend has just told me about Ohloh, a web site that analyzes the activity of open source projects by scanning their source repositories. It is quite nice! It generates statistics about the recent activity of each registered project, the languages they use, the people working on them... And, for each developer, it accumulates statistics about their work on the different projects they have contributed to, automatically building a developer profile.

    You can add your own projects to the site, which is a very easy procedure, and create an account to have your own profile, which is useful to merge all your contributions to various projects into a single person. That is, if you have contributed to one of the registered projects and search for yourself, the site will return some hits; if you have an account, you can claim that you are that person and link your contributions to multiple projects into a single page.

    Check out my account for an example :-)

    Edit (19:20): Added the detailed widget.

  • tmpfs added to FreeBSD

    A bit more than a year ago, I reported that tmpfs was being ported to FreeBSD from NetBSD (remember that tmpfs was my Google SoC 2005 project and was integrated into NetBSD soon after the program ended). And Juan Romero Pardines has just brought to my attention that tmpfs is already part of FreeBSD-current! This is really cool :-)

    The code was imported into FreeBSD-current on the 16th, as seen in the commit mail, so I suppose it will be part of the next major version (7.0). I have to thank Rohit Jalan, Howard Su and Glen Leeder for their efforts in this area.

    Some more details are given in their TMPFS wiki page.

    Edit (June 23): Mentioned where tmpfs is being ported from!

  • Six months with the MacBook Pro

    If memory serves well, today marks six months since I got my MacBook Pro and, during this period, I have been using it as my sole computer. I feel it is a good time for another mini-review.

    Well... to get started: this machine is great; I probably haven't been happier with any other computer before. I have been able to work on real stuff — instead of maintaining the machine — during these months without a hitch. Strictly speaking, I've had a couple of problems... but those were "my fault" for installing experimental kernel drivers.

    As regards the machine's speed, which I think is the main reason why I wanted to write this post: it is pretty impressive considering it is a laptop. With a good amount of RAM, programs behave correctly and games can be played at high quality settings with a decent FPS rate. But, and everything has a "but": I really, really, really hate its hard disk (a 160 GB, 5400 RPM drive). I cannot stress that enough. It's slow. Painfully slow under medium load. Seek times are horrible. That alone makes me feel I'm using a 10-year-old machine. I'm waiting for the shiny new big 7200 RPM drives to become a bit easier to purchase and will make the switch, even if that means my battery life will be a bit shorter.

    About Mac OS X... what can I say that you don't already know? It is very comfortable for daily use — although that's very subjective, of course; it's quite "funny" to read some reviews that blame OS X for not behaving exactly like Windows — and, being based on Unix, allows me to do serious development with a sane command-line environment and related tools. Parallels Desktop for Mac is my preferred tool so far, as I can seamlessly work with Windows-only programs and do Linux/NetBSD development, but other free applications are equally great; some worthy of mention: Adium X, Camino or QuickSilver.

    Lastly, sometimes I miss having a desktop computer at home, because watching TV series/movies on the laptop is rather annoying — I have to keep adjusting the screen's position so it's properly visible when lying in bed. I can imagine that an iMac with the included remote control and Front Row could be awesome for this specific use.

    All in all, don't hesitate to buy this machine if you are considering it as a laptop or desktop replacement. But be sure to pick the new 7200 RPM drive if you will be doing any slightly disk-intensive operation.

  • PFC report almost ready

    The deadline for my PFC (the project that will conclude my computer science degree) is approaching. I have to hand in the final report next week and present the project on July 6th. Its title is "Efficient resource management in heterogeneous multiprocessor systems" and its basic goal is to inspect the poor management of such machines in current operating systems and how this situation could be improved in the future.

    Our specific case study has been the Cell processor, the PlayStation 3 and Linux, as these form a clear example of a heterogeneous multiprocessor system that may become widespread due to its relatively cheap price and the attractive features (gaming, multimedia playback, etc.) it provides to a "home user".

    Most of the project has been an analysis of the current state of the art and the proposal of ideas at an abstract level. Due to timing constraints and the complexity of the subject (should I also mention bad planning?), I have been unable to implement most of them, even though I wanted to do so at the very beginning. The code I've written is so crappy that I won't be sharing it anytime soon, but if there is interest I might clean it up (I mean, rewrite it from the ground up) and publish it to a wider audience.

    Anyway, to the real point of this post. I've published an almost definitive copy of the final report so that you can take a look at it if you want to. I will certainly welcome any comments you have, be it mentioning bugs, typos, wrong explanations or anything else! Feel free to post them as comments here or to send me a mail, but do so before next Monday, as that's the deadline for printing. Many thanks in advance if you take the time to do a quick review!

    (And yes... this means I'll be completely free from now on to work on my SoC project, which has been delayed too much already...)

    Edit (Oct 17th): Moved the report on the server; fixed the link here.

  • NetBSD's website redesign

    Even though I don't usually repost general NetBSD news, I would like to mention this one: the NetBSD web site has received a severe facelift aimed at improving its usability and increasing the consistency among its pages.

    Many thanks to Daniel Sieger for his perseverance and precious work. This is something that had been attempted many times in the past but raised so many bikesheds that it was never accomplished.

    In case you would like to contribute to the project by doing something relatively easy, you can do so now. It could be interesting to revamp many of the existing pages to be more user friendly by reorganizing their contents (simplification is good sometimes!) and their explanations, and by making better use of our XML-based infrastructure. Keep in mind that the web site is the main "entry point" to a project, and newcomers should feel very comfortable with it; otherwise they will go away in less than a minute!

    Furthermore, it'd be nice to see if there are any plain HTML pages left and convert them to XML. This could make all those pages automatically use the new look of the site and integrate better with it. (If you don't know what I mean, just click, for example, on "Report or query a bug" at the top of the front page. It looks ugly; very ugly. But unfortunately, this is not as simple as converting the page to XML, because it is automatically generated by some script.)

    Send your feedback to www AT NetBSD.org or to the netbsd-docs AT NetBSD.org public list.

  • Compiler-level parallelization and languages

    Some days ago, Intel announced a new version of their C++ and Fortran compilers. According to their announcement:

        Application performance is also accelerated by multi-core processors through the use of multiple threads.

    So... as far as I understand, and as some other news sites mention, this means that the compiler tries to automatically parallelize a program by creating multiple threads; the code executed on each thread is decided at build time through some algorithm that deduces which blocks of code can be executed at the same time.

    If this is true — I mean, if I understood it correctly — it sounds great, but I don't know to what level the compiler is able to extract useful information from code blocks in either C++ or Fortran. These two languages follow the imperative programming paradigm: a program written in them describes, step by step, how the machine must operate. In other words: the program specifies how a specific type of machine (a load/store one) must behave in order to compute a result, not how the result is computed.

    Extracting parallelization information from this type of language seems hard, if not impossible, except for very simple cases. Even more, most imperative languages are not protected against side effects: there is a global state that is accessible from any part of the program, which means that you cannot predict how a specific call will change this global state. In terms of functions: a function with a specific set of parameters can return different values on each call, because it can store auxiliary information in global variables.

    It seems to me that functional languages are much more suitable to this kind of compiler-level parallelization than imperative ones. In a functional language, the program describes how to compute a result at an abstract level, not how to reach that result on a specific type of machine. The way the compiler arrives at that result is generally irrelevant. (If you know SQL, it has the same properties: you describe what you want to know through a query, but you don't know how the database engine will handle it.) Furthermore, and this is important, purely functional languages such as Haskell do not have side effects as long as you don't use monads. So what does this mean? A function, when called with a specific set of parameters, will always return the same result. (So yes, the compiler could, and possibly does, trivially apply memoization.)

    With these characteristics, a compiler for a functional language could do much more to implement compiler-level parallelization. Each call to a function could be analyzed to see which other functions it calls, thus generating a call graph; later on, the compiler could decide which subset of this graph is sent to each computing core (i.e. placed on an independent thread) and merge the results between threads when it gets to the top-level function it split. So if you had an expression such as:

        foo = (bar 3 4) + (baz 5 6)

    the compiler could prepare two threads, one to compute the result of bar 3 4 and one to calculate baz 5 6. At the end, after the two threads finished, it could do the sum.
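    To picture what that transformation amounts to, here is what a programmer has to write by hand today in a language with explicit threading — a sketch in modern C++, with bar and baz as stand-ins for the hypothetical functions above:

        #include <future>
        #include <iostream>

        // Placeholder "pure" functions standing in for the post's bar and baz.
        static int bar(int a, int b) { return a * b; }
        static int baz(int a, int b) { return a + b; }

        int main() {
            // Evaluate the two independent subexpressions on separate threads...
            std::future<int> b = std::async(std::launch::async, bar, 3, 4);
            std::future<int> z = std::async(std::launch::async, baz, 5, 6);
            // ...and merge the results once both are available.
            const int foo = b.get() + z.get();
            std::cout << foo << std::endl;
        }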
    Of course, bar and baz would have to be "large" enough to compensate for the time spent creating and managing the threads.

    Anyway, what I wanted to emphasize is that, depending on the language you choose, doing specific types of code analysis and optimization can be much easier and, of course, much better.

    To conclude, and as I'm talking about Haskell, I'd like to suggest you read the article "An introduction to Haskell, part 1", recently published at ONLamp.com. It talks a bit about this idea at the end.

  • Flattening an array of arrays

    This evening a friend asked me if I knew how to easily flatten an array of arrays (AoA from now on) in Perl 5. What that means is, basically, to construct a single array that contains the concatenation of all the arrays inside the AoA.

    My first answer was: "foldr", but I knew beforehand that he wouldn't like it because... this is Haskell. After some time we got to the conclusion that there is no trivial way to flatten an AoA in Perl 5, even though Perl 6 includes a built-in function to do so. He ended up using this code to resolve the problem:

        my @ordered = map { @$_ } values %arches;

    Ew, how ugly. Anyway, as part of the discussion, I then continued on my first answer just to show him how versatile functional programming is. And I said, hey, look at this nice example:

        Hugs> foldr (++) [] [[1,2], [3,4]]
        [1,2,3,4]

    His answer: oh well, but it is easier in Ruby: just use the built-in ary.flatten function. Hmm... but why would I need a built-in function in Haskell when I can redefine it in a trivial single line?

        flatten = foldr (++) []

    There you go; you can now flatten as many AoAs as you want! (Huh, no parameters? Well, you don't need to name them.) Isn't functional programming great? ;-)

    PS: I know nothing about Ruby, but I bet you can write a very similar definition using this or other non-functional languages. I remember someone explaining somewhere (yeah, that's very specific) that Ruby has some patterns that resemble functional programming. So yes, you can probably define a flatten function by using something that looks like foldr, but that might look out of place in an imperative language. (It would be great to know for sure!)

    Edit (June 9th): Added a link to my friend's blog.
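    For what it's worth, the same fold-based definition can indeed be written in a non-functional language. Here is a sketch in C++ (assuming the AoA is a vector of vectors of int), and, as predicted, it looks rather out of place next to the one-liner:

        #include <numeric>
        #include <vector>

        // Fold over the outer vector, appending each inner vector to an
        // accumulator that starts out empty — the same shape as
        // "foldr (++) []", modulo the fold direction.
        std::vector<int> flatten(const std::vector<std::vector<int>>& aoa) {
            return std::accumulate(aoa.begin(), aoa.end(), std::vector<int>(),
                [](std::vector<int> acc, const std::vector<int>& inner) {
                    acc.insert(acc.end(), inner.begin(), inner.end());
                    return acc;
                });
        }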

  • Is assembly code faster than C?

    I was reading an article the other day and found an assertion that bugged me. It reads:

        System 6.0.8 is not only a lot more compact since it has far fewer (mostly useless) features and therefore less code to process, but also because it was written in assembly code instead of the higher level language C. The lower the level of the code language, the less processing cycles are required to get something done.

    It is not the first time I see someone claiming that writing programs in assembly by hand makes them faster, and I'm sure it is not the last time I'll see this. This assertion is, simply put, wrong.

    Back in the (good?) old days, processors were very simple: they fetched an instruction from main memory, executed it and, once finished (and only then), fetched the next instruction and repeated the process. On the other hand, compilers were very primitive and their optimization engines were, I dare to say, non-existent. In such a scenario, a good programmer could really optimize any program by writing it in assembly instead of in a high-level language: he was able to understand very well how the processor behaved internally and what the outcomes of each machine-level instruction were. Furthermore, he could get rid of all the "bloat" introduced by a compiler.

    Things have changed a lot since then. Nowadays' processors are very complex devices: they have a very deep execution pipeline that, at a given time, can be executing dozens of instructions at once. They have powerful branch prediction units. They reorder instructions at run time and execute them in an out-of-order way (provided they respect the data dependencies among them). There are memory caches everywhere. So it is, simply put, almost impossible for a programmer's brain to keep track of all these details and produce efficient code. (And even if he could, the efficiency could be so tied to a specific microprocessor version that it'd be useless in all other cases.)

    Furthermore, compilers now have much better optimization stages than before and are able to keep track of all these processor-specific details. For example, they can reorder instructions on their own or insert prefetching operations at key points to avoid cache misses. They can really do a much better job at converting code to assembly than a programmer would in most cases.

    But hey! Of course it is still possible and useful to manually write optimized routines in assembly language — to make use of SIMD extensions, for example — but these routines tend to be as short and as simple as possible.

    So, summarizing: it no longer makes sense to write big programs (such as complete operating systems) in assembly language. Doing that means you lose all the portability gains of a not-so-high-level language such as C, and you will probably do a worse optimization job than a compiler would. Plus, well-written and optimized C code can be extremely efficient, as this language is just a very thin layer over assembly.

    Oh, and back to the original quote. It would have made sense to mention the fact that the Mac Plus was written in assembly if it had been compared with another system of its epoch written in C. In that case, the argument would have been valid, because the compilers were much worse than they are today and the processors were simpler. Just remember that such an assertion is, in general, not true any more.

  • Mac tutorials at ScreenCasts Online

    I've recently subscribed to (the free version of) ScreenCasts Online based on some comments I read somewhere. This is a video podcast that explains tips and tricks for the Mac and presents third-party software — either commercial or free — in great detail, which is ideal if you are planning to purchase some specific commercial program.

    The typical show starts by presenting a problem to be resolved or by directly talking about the specific program to be presented. It is then followed by a detailed inspection of the user interface and some sections that exemplify common tasks. At the very end, it gives pointers to either fetch or buy the program. I have to confess that I find some of these shows to be excessively detailed, to the point of becoming boring. But they are still a good way to see all the possibilities a given program can offer.

    "Thanks" to them, I've fallen in love with OmniGraffle and OmniPlan ;-) Pity they are so expensive, because I won't be paying that amount of money for my extremely modest needs.