  • Projects migrated to Git

    I finally took the plunge. Yesterday night, I migrated the Kyua and Lutok repositories from Subversion to Git. And this morning I migrated ATF from Monotone and custom hosting to Git and Google Code; oh, and this took way longer than expected.

    Migration of Kyua and Lutok

    Migrating these two projects was straightforward. After preparing a fresh local Git repository following the instructions posted yesterday, pushing to Google Code is a simple matter:

        $ git remote add googlecode https://code.google.com/p/your-project
        $ git push googlecode --all
        $ git push googlecode --tags

    One of the nice things I discovered while doing this is that a Google Code project supports multiple repositories when the VCS is set to Git or Mercurial. By default, the system creates the default and wiki repositories, but you can add more at will. This is understandable given that, in Subversion, you can check out individual directories of a project, whereas you cannot do that in the other supported VCSs: you actually need different repositories to group different parts of the project.

    I performed the full migration on a Linux box so that I could use the most recent Git version from a proven binary package. The migration went fine, but I hit a little problem when attempting a fresh checkout from NetBSD: Git under NetBSD will not work correctly against SSL servers because it lacks the necessary CA certificates. The solution is to install the security/mozilla-rootcerts package and follow the instructions printed during installation; why this does not happen automatically escapes my mind.

    Migration of ATF

    I had been having doubts about migrating ATF itself but, if Kyua moved to Git, moving ATF was a prerequisite. Certainly I could convert the repository to Git, but where would I host it afterwards? Creating a new Google Code project just for this seemed too much of a hassle. My illumination came when I found out, as above, that Google Code supports an arbitrary number of repositories in a particular project once it is converted to Git.

    So, for ATF, I just ran mtn git_export with appropriate flags, created a new atf repository on the Kyua site, and pushed the contents there. Along the way, I also decided to kill the home-grown ATF web site and replace it with a single page containing all the relevant information. At this point, ATF and Kyua are supposed to work together quite tightly (in the sense that ATF is just a "subcomponent" of Kyua), so coupling the two projects on the same site makes sense.

    Now, let's drink the kool aid. [Continue reading]
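
    For reference, the Monotone-to-Git conversion mentioned above boils down to exporting a fast-import stream and feeding it to Git. The following is a minimal sketch, not the exact commands I ran: it assumes the Monotone database lives in atf.mtn, that mtn git_export writes the stream to stdout, and that the extra Google Code repository has its own clone URL (the one below is a placeholder).

        # Export the Monotone history as a git-fast-import stream and load it
        # into a fresh Git repository.
        $ mtn --db=atf.mtn git_export >atf.fi
        $ git init atf && cd atf
        $ git fast-import <atf.fi

        # Push to the extra repository created on the Kyua project site; use
        # the clone URL that Google Code displays for the new repository.
        $ git remote add googlecode https://code.google.com/p/your-project
        $ git push googlecode --all
        $ git push googlecode --tags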

  • Converting a Subversion repository to Git

    As discussed over a week ago, I have been pondering the idea of migrating my projects from Subversion to Git. One of the prerequisites of such a migration is the preparation of a process to cleanly migrate the revision history from the old system to the new one. Of course, such a process should attempt to preserve the revision history as faithfully as possible (regardless of what some other big projects have done by just throwing away their history; shrug).

    The projects I am interested in migrating from Subversion to Git all live in Google Project Hosting. This is irrelevant for all but one detail I will mention later. However, the project hosting site is the obvious place to look for help and, as expected, there is a little document titled Convert your project from Subversion to Git that provides a good start.

    Summarizing, there are two well-suited tools to convert a Subversion repository to Git:

    - The built-in git svn command: works pretty well and has a good amount of customization features. The only little problem is that git svn does not recognize Subversion tags as such and therefore converts them to Git branches instead of tags.

    - The third-party svn2git: while this is actually built on top of git svn, it properly handles Subversion tags and converts them to native Git tags.

    If you do not have tags in your Subversion repository, or do not care about them being properly converted to Git tags, you can go with git svn. However, I am interested in properly recording tags, so I chose svn2git for my needs.

    Now, using svn2git by itself as instructed in the referenced document will happily convert your Subversion repository to Git. You should not have to care about anything else... unless you want to polish a few details mentioned below.

    Set up

    In the rest of this article, I will place all the scripts for the conversion in a ~/convert/ directory. When the script is run, it will create a ./git/ subdirectory that will end up containing the new Git repository, and a ./control/ directory that will hold temporary control files.

    Disclaimer: Be aware that all the code below is quite ugly and probably bug-ridden. It is not meant to be pretty, as I only have two use cases for it. Also please note that I am by no means a Git expert, so the process below may be way more convoluted than necessary.

    Oh, and this does not deal at all with the wiki changes. Google Code stores such information in the same Subversion repository as your code; however, when migrating to Git, you get two different repositories. Adapting the code below to deal with the wiki component of your project should be easy enough, although I'm not sure I care enough about the few contents I have in the wikis to go through the hassle.

    Warning: The instructions below completely mess up tags if you happen to have any in your repository. I only discovered this after following the procedure and pushing HEAD and the tags, which resulted in a different set of revisions pushed for every tag. When I actually did this, I ended up running the steps below, then discarding the tags, and recreating them by hand pointing at the appropriate revisions (a sketch of this appears at the end of this post).

    Setting up authors

    The first step in using svn2git is defining a mapping between Subversion authors and Git authors. This is easy enough, but there is one caveat that affects projects hosted in Google Code: the root revision of your Subversion repository is not actually authored by you; its author is (no author), so you should account for that. Something like this in your ~/.svn2git/authors file will take care of this teeny tiny detail:

        your-address@example.net = Your Name <your-address@example.net>
        (no author) = Your Name <your-address@example.net>

    However, as we will see below, we will be skipping the first revision of the repository altogether, so this is not really necessary. I just felt like mentioning it for completeness, given that I really didn't expect (no author) to be properly recognized in this context.

    References to old Subversion revision ids

    Unless you have been extremely terse in your commit history and in your bug tracker, you will have plenty of cross-references pointing to Subversion revision identifiers. For example, if you are fixing a bug introduced a month ago in r123, you may as well point that out in the commit message of r321 for reference purposes. I guess you can see the problem now: if we migrate the history "as is", all these references become meaningless because the new Git history has no trace of the old revision identifiers.

    The -m option to svn2git will annotate every revision with a git-svn-id line that contains the relevant information. However, that line is quite ugly because it is not really meant for human consumption: the information in it is used by git-svn to allow pulls and pushes from a master Subversion repository.

    What I want to do is reword the git-svn-id line to turn it into a real sentence rather than some internal control code. We can achieve this with git rebase in interactive mode: mark all revisions as "reword" and then go revision by revision fixing its message. A pain that can be automated: if an editor can do the necessary changes, we can create a fake "editor" script that performs the same modifications.

    How? Store this as ~/convert/editor1.sh:

        #! /bin/sh
        if grep git-svn-id "${1}"; then
            # We are editing a commit message.
            new="This revision was r\1 in Subversion."
            sed -r -i -e "s,git-svn-id[^@]+@([0-9]+).*$,${new}," "${1}"
        else
            # We are editing the git-rebase interactive control file.
            sed -i -e 's,^pick,reword,' "${1}"
        fi

    With this script, we can simply do EDITOR=~/convert/editor1.sh git rebase -i base-revision and get every revision tagged... but this will blow up if your repository contains empty revisions, which takes us to the next concern.

    Drop empty revisions

    As you probably know, Subversion supports attaching metadata to directories and files in the form of properties. These properties cannot be represented in Git, so if you have any Subversion commits that touched properties alone, svn2git will happily convert those into empty Git commits.

    There is nothing wrong with this, but things like git rebase will choke on these empty commits over and over again... and it gets quite annoying. Furthermore, these empty commits serve no purpose in Git because the changes they performed in Subversion make no sense in Git land. It is easier to just kill them all from the history.

    The git rebase command above will abort on every empty revision it encounters. We can, at that point, record their identifiers for later deletion. However, recording the revision identifier will not work because, as we are doing a rebase, the identifier will have changed by the time we are done. Instead, and because I have been careful to write detailed commit messages, we can rely on the first line of the message (aka the subject) to identify every commit.

    Rerun the rebase as follows, storing the list of empty commits in ../control/empty:

        first=$(git log | grep '^commit' | tail -n 1 | cut -d ' ' -f 2)
        EDITOR="${convert}/editor1.sh" git rebase --interactive "${first}" || true
        touch ../control/empty
        while [ -f .git/MERGE_MSG ]; do
            head -n 1 .git/COMMIT_EDITMSG >>../control/empty
            EDITOR="${convert}/editor1.sh" git commit --allow-empty
            EDITOR="${convert}/editor1.sh" git rebase --continue || true
        done

    With this list in hand, we create another ~/convert/editor2.sh script to remove the empty revisions once we know them:

        #! /bin/sh
        echo "Empty revisions to be deleted:"
        cat ../control/empty | while read line; do
            grep "${line}" "${1}"
            sed -i -e "/^pick ${line}$/s,^pick,fixup," "${1}"
        done

    Amend the root revision

    The root revision in a Google Code Subversion repository is empty: the system creates it to initialize the "directory layout" of your repository, but it serves no purpose. We can skip this root revision by passing the --revision=2 flag to svn2git.

    However, no matter what we do, the git rebase above that tags Git revisions with their corresponding Subversion identifiers will happily skip our first real revision and leave it untagged. We have to go and fix this manually, which is actually quite tricky. Luckily, this reply on Stack Overflow provides the solution.

    Putting it all together

    Alright then. If all the above was dense and cryptic code-wise, it is time to put it all together in a script that performs all the steps for us. Assuming you already have ~/convert/editor1.sh and ~/convert/editor2.sh in place, create ~/convert/driver.sh as follows:

        #! /bin/sh

        set -e -x

        [ ${#} -eq 1 ] || exit 1
        convert=$(dirname ${0})
        project=${1}

        rm -rf git control
        mkdir git control
        cd git

        # Start at revision 2 to skip the initial empty revision.
        svn2git -v --revision=2 -m "http://${project}.googlecode.com/svn"

        # Tag git revisions with the original Subversion revision id.
        first=$(git log | grep '^commit' | tail -n 1 | cut -d ' ' -f 2)
        EDITOR="${convert}/editor1.sh" git rebase --interactive "${first}" || true
        touch ../control/empty
        while [ -f .git/MERGE_MSG ]; do
            head -n 1 .git/COMMIT_EDITMSG >>../control/empty
            EDITOR="${convert}/editor1.sh" git commit --allow-empty
            EDITOR="${convert}/editor1.sh" git rebase --continue || true
        done

        # Drop empty revisions recorded in the previous step.
        # The list is in the 'empty' file and is read by editor2.sh.
        EDITOR="${convert}/editor2.sh" git rebase --interactive "${first}" || true

        # Tag the root revision with the original Subversion revision id.
        git tag root $(git rev-list HEAD | tail -1)
        git checkout -b new-root root
        EDITOR="${convert}/editor1.sh" git commit --amend
        git checkout @{-1}
        git rebase --onto new-root root
        git branch -d new-root
        git tag -d root

        cd -
        rm -rf control

    Yuck. Well, it works, and it works nicely. Converting a Subversion repository to Git with all the constraints above is now as easy as running ~/convert/driver.sh your-project-name! [Continue reading]
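
    Because the warning above notes that tags end up pointing at the wrong revisions, here is a minimal sketch of how they can be recreated by hand after the conversion. This is illustrative only: the tag name and the Subversion revision number are placeholders, and the right commit has to be located manually, for example by searching for the "This revision was rNNN in Subversion" annotations added by editor1.sh.

        # Find the Git commit that corresponds to the old Subversion tag.
        $ git log --oneline --grep='This revision was r123 in Subversion'

        # Recreate the tag by hand, pointing at the commit found above.
        $ git tag -a your-project-0.1 -m "Tag release 0.1" <commit-hash>

        # Finally, push the branches and the recreated tags, as shown in the
        # "Projects migrated to Git" post above.
        $ git push googlecode --all
        $ git push googlecode --tags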

  • Kyua 0.3 released!

    Dear readers,

    I am pleased to announce that Kyua 0.3 is available!

    The major feature in this release is the introduction of the "test results store": a SQLite3-based database that records all test execution activity so that results can be gathered on a central machine and reports generated out of them. This is still very experimental and the generated reports are quite rudimentary, but it is a first step in this direction.

    The Kyua 0.3 release page provides links to the download as well as the list of major changes in this new release.

    It has been a really long time since the 0.2 release. I wanted to postpone 0.3 until HTML reports were ready, but I have not had the time to implement them for a while. Instead of postponing the release even further, I have decided to cut it now and take the time to create nice reports separately. Additionally, I am planning on doing some repository changes and wanted to do them without an imminent release along the way.

    The package in pkgsrc has been updated and now I'm working on bringing binary packages to Fedora.

    Enjoy! [Continue reading]
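
    As a usage sketch of the new results store, the commands below show the intended workflow. Take this as an assumption of the interface rather than a transcript: the subcommand names match what later Kyua releases expose, but the exact 0.3 syntax may differ slightly.

        # Run the test suite defined by the Kyuafile in the current directory;
        # the results of the run are recorded in the results store.
        $ kyua test

        # Generate a textual report out of the recorded results, possibly on a
        # different machine after copying the store database over.
        $ kyua report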

  • Encrypted disk images in NetBSD

    When I joined the NetBSD Board of Directors, I was trusted with access to private information, and one of the prerequisites for downloading such data was being able to store it safely: i.e. inside an encrypted volume. I initially went the easy route by putting the data on my laptop, which already uses FileVault 2 to encrypt the whole disk. But soon after, I decided that it would be nicer to put this data on my NetBSD home server, which is the machine that is perpetually connected to IRC and thus the one I use to log into the weekly meetings.

    At that point, I was faced with the "challenge" of setting up an encrypted volume under NetBSD. I knew of the existence of cgd(4), the cryptographic disk driver, but had never used it. It really is not that hard, but there are a bunch of manual steps involved, so I've been meaning to document them for almost a year already.

    Because my home server is a NetBSD/macppc machine and reinstalling the system is quite a nightmare, I did not want to go through the hassle of repartitioning the disk just to get an encrypted partition of 1GB. The choice was to create a disk image instead, store it in /secure.img and mount it on /secure. Let's see how to do this.

    The first step is to create the disk image itself and "loop-mount" it (in Linux terms). We do this the traditional way, by creating an empty file and setting up a vnd(4) (vnode disk driver) device to access it:

        # dd if=/dev/zero of=/secure.img bs=1m count=1024
        # vnconfig -c /dev/vnd2 /secure.img

    The next step is to configure a new cgd(4) device. This block-level device is a layer placed on top of a regular storage device, and it is in charge of transparently encrypting the data written to the lower-level device. cgdconfig(8), the tool used to configure cgd(4) devices, stores control information on a per-device basis under /etc/cgd/. The control information can be generated as follows:

        # cgdconfig -g -o /etc/cgd/vnd2c aes-cbc 192

    This creates the /etc/cgd/vnd2c configuration file, which you can choose to inspect now. With the file in place, configuring the cryptographic device for the first time is easy:

        # cgdconfig cgd0 /dev/vnd2c

    This command will inspect the underlying /dev/vnd2c device looking for a signature indicating that the volume is valid and already encrypted. Because the volume is still empty, the tool will proceed to ask us for an encryption key to initialize the volume. We enter our key, wait a couple of seconds, and /dev/cgd0 will be ready to be used as a regular disk device. With that in mind, we can proceed to create a new file system and mount it on the desired location:

        # newfs -O 2 /dev/rcgd0c
        # mkdir /secure
        # mount /dev/cgd0c /secure

    To simplify access to the device for your users, I did something like this:

        # user=your-user-name
        # mkdir /secure/${user}
        # chown ${user}:users /secure/${user}
        # chmod 700 /secure/${user}
        # ln -s /secure/${user} /home/${user}/secure

    And we are done. Now, depending on the situation, we could choose to get /secure automatically mounted during boot, but that would involve having to type the decryption key. This is OK for a laptop, but not for a headless home server. Because I don't really need to have this secure device mounted all the time, I keep the following script in /root/secure.sh so that I can mount and unmount the device at will:

        #! /bin/sh

        set -e -x

        case "${1:-mount}" in
        mount)
            vnconfig -c /dev/vnd2 /secure.img
            cgdconfig cgd0 /dev/vnd2c
            fsck -y /dev/cgd0c
            mount /dev/cgd0c /secure
            ;;
        unmount)
            umount /secure
            cgdconfig -u cgd0
            vnconfig -u /dev/vnd2
            ;;
        esac

    Be aware that cgdconfig(8) has many more options! Take a look at the manual page and choose the ones that best suit your use case.

    Random idea: instead of using a vnd(4) device, plug in a USB stick and use that for your secure data! (I actually do this too, to back up sensitive information like private keys; see the sketch below.) [Continue reading]
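
    Following the random idea above, here is what the USB stick variant could look like. This is a sketch, not taken from my actual setup: the device and partition names (sd0e) are assumptions and will differ depending on how the kernel attaches the stick, so check dmesg and disklabel first.

        # Generate the cgd parameters file for the stick's partition.
        # cgdconfig -g -o /etc/cgd/sd0e aes-cbc 192

        # Configure the cryptographic device on top of the stick, then create
        # a file system on it and mount it, just as with the vnd-backed image.
        # cgdconfig cgd0 /dev/sd0e
        # newfs -O 2 /dev/rcgd0c
        # mount /dev/cgd0c /secure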

  • NetBSD 6.0 BETA tagged

    Dear users of NetBSD,

    I am pleased to announce that we (well, the release engineering team!) have just tagged the netbsd-6 branch in the CVS repository and thus opened the gate for testing of NetBSD 6.0_BETA. New binary snapshots should start appearing in the daily FTP archive soon. You can, of course, perform a cvs update -r netbsd-6 on your existing source tree and roll your own binaries (as I'm already doing on my home server).

    Please help us make NetBSD 6.0 the best release ever! As you may already know, it will provide tons of new features compared to the ancient 5.x release series that will make your user experience much better. The branch just needs a bit of love to shake out any critical bugs that may be left and to ensure that all those annoying cosmetic issues are corrected. So, if you find any of those, do not hesitate to file a problem report (PR).

    Thank you! [Continue reading]
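
    For readers rolling their own binaries as suggested above, a minimal sketch of the update-and-build cycle follows. It assumes the sources live in /usr/src and uses macppc as an example target; pick the -m value that matches your machine and adjust the object directory to taste.

        # Update an existing source tree to the new release branch.
        $ cd /usr/src
        $ cvs update -dP -r netbsd-6

        # Build a full release as an unprivileged user (-U); -m selects the
        # target architecture and -O keeps objects out of the source tree.
        $ ./build.sh -U -O ../obj -m macppc release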

  • Kyua: Weekly status report

    A couple of things happened this week:

    - Spent quite a bit of time researching the idea of moving away from Monotone and Subversion to Git. I haven't made a decision yet, but I'm pretty convinced this is the right way to go. It will simplify development significantly, it will allow me to code offline (I have a bunch of really long flights coming), and it will lower the entry barrier to Kyua by making all components use the same, mainstream VCS.

    - Implemented filtering by result type of the test case results in the textual reports (a usage sketch follows at the end of this post).

    I think it is time to prepare a 0.3 release. I wanted to wait until we had HTML reports in place, but this will require significant effort and I have been postponing the implementation for too long already. As it is now, the current codebase provides major changes since the ancient 0.2 release and is worth a release. Then, we can create packages for NetBSD and Fedora instead of continuing to add new features, which should be a good step in giving further visibility to the project. Finally, we can reserve HTML reporting as the major feature for 0.4 :-) [Continue reading]
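
    As a usage sketch for the result-type filtering mentioned above: the flag name below is an assumption based on how the option is spelled in later Kyua releases, so treat it as illustrative rather than the exact syntax of the feature just implemented.

        # Produce a textual report that only lists broken and failed test
        # cases, hiding passed and skipped results.
        $ kyua report --results-filter=broken,failed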

  • Switching projects to Git

    The purpose of this post is to tell you the story of the Version Control System (VCS) choices I have made while maintaining my open source projects ATF, Kyua and Lutok. It also details where my thoughts are headed these days.

    This is not a description of centralized vs. distributed VCSs, and it does not intend to be one. It does not intend to compare Monotone to Git either, although you'll probably feel like it while reading the text. Note that I have been fully aware of the advantages of DVCSs over centralized systems for many years, but for some reason or another I have been "forced" to use centralized systems on and off. The Subversion hiccup explained below is... well... regrettable, but it's all part of the story!

    Hope you enjoy the read.

    Looking back at Monotone (and ATF)

    I still remember the moment I discovered Monotone in 2004: simply put, it blew my mind. It was clear to me that Distributed Version Control Systems (DVCSs) were going to be the future, and I eagerly adopted Monotone for my own projects. A year later, Git appeared and it took all the praise for DVCSs: developers all around started migrating en masse to Git, leaving behind other (D)VCSs. Many of these developers then went on to make Git usable (it certainly wasn't at first) and well-documented. (Note: I really dislike Git's origins... but I won't get into details; it has been many years since that happened.)

    One of the projects in which I chose to use Monotone was ATF. That might have been a good choice at the time despite being very biased, but it has caused problems over time. These have been:

    - Difficulty getting Monotone installed: While most Linux distributions come with a Monotone binary package these days, that was not the case years ago. But even if all Linux distributions have binary packages nowadays, the main consumers of ATF are NetBSD users, and their only choice is to build their own binaries. This generates discomfort because there is a lot of FUD surrounding C++ and Boost.

    - High entry barrier for potential contributors: It is a fact that Monotone is not popular, which means that nobody is familiar with it. Monotone's CLI is very similar to CVS, and I'd say the knowledge transition for basic usage is trivial, but the process of cloning a remote project was really convoluted until "recently". The lack of binary packages, combined with complex instructions on just how to fetch the sources of a project, only helps in scaring people away.

    - Missing features: Even though years have passed, Monotone still lacks some important features that impact its usability. For example, to my knowledge, it's still not possible to do work-directory merges and, while the interactive merges offered by the tool seem like a cool idea, they are not really practical because you get no chance to validate the merge. It is also not possible, for example, to reference the parent commit of any given commit without looking at the parent's ID. (Yeah, yeah, in a DAG there may be more than one parent, but that's not the common case.) Or to know what a push/pull operation is going to change on both sides of the connection. And key management and trust has been broken since day one and is still not fixed. Etc, etc, etc.

    - No hosting: None of the major project hosting sites support Monotone. While there are some playground hosting sites, they are toys. I have also maintained my own servers at times, but it's certainly inconvenient and annoying.

    - No tools support: Pretty much no major development tools support Monotone as a VCS backend. Consider Ohloh, your favorite bug tracking system or your editor/IDE. (I attempted to install Trac with some alpha plugin to add Monotone support and it was a huge mess.)

    - No more active development: This is the drop that spills the cup. The developers of Monotone that created the foundations of the project left years ago. While the rest of the developers did a good job in coming up with a 1.0 release by March 2011, nothing else has happened since then. To me, it looks like a dead project at this point :-(

    Despite all this, I have kept maintaining ATF in its Monotone repository, but I have felt the pain points above for years. Furthermore, the few times some end user has approached ATF to offer a contribution, he has had tons of trouble getting a fresh checkout of the repository and given up. So staying with Monotone hurts the project more than it helps.

    The adoption of Subversion (in Kyua)

    To fix this mess, when I created the Kyua project two years ago, I decided to use Subversion instead of a DVCS. I knew upfront that it was a clear regression from a functionality point of view, but I was going to live with it. The rationale for this decision was to make the entry barrier to Kyua much lower by using off-the-shelf project hosting. And, because NetBSD developers use CVS (shrug), choosing Subversion was a reasonable choice because of the workflow similarities to CVS and thus, supposedly, the low entry barrier.

    Sincerely, the choice of Subversion has not fixed anything, and it has introduced its own trouble. Let's see why:

    - ATF continues to be hosted in a Monotone repository, and Kyua depends on ATF. You can spot the problem, can't you? It's a nightmare to check out all the dependencies of Kyua, using different tools, just to get the thing working.

    - As of today, Git is as popular as Subversion, if not more. All the major operating systems have binary packages for Git and/or bundle Git in their base installation (hello, OS X!). Installing Git on NetBSD is arguably easier (at least faster!) than Subversion. Developers are used to Git. Or let me fix that: developers love Git.

    - Subversion gets in the way more than it helps; it really does once you have experienced what other VCSs have to offer. I currently maintain independent checkouts of the repository (appropriately named 1, 2 and 3) so that I can develop different patches in each before committing the changes. This gets old really quickly. Not to mention when I have to fly for hours: being stuck without an internet connection and plain old Subversion... is suboptimal. Disconnected operation is key.

    The fact that Subversion is slowing down development, and the fact that it really does not help in getting new contributors any more than Git would, make me feel it is time to say goodbye to Subversion.

    The migration to Git

    At this point, I am seriously considering switching all of ATF, Lutok and Kyua to Git. No Mercurial, no Bazaar, no Fossil, no anything else. Git.

    I am still not decided, and at this point all I am doing is toying around with the migration process of the existing Monotone and Subversion repositories to Git while preserving as much of the history as possible. (It's not that hard, but there are a couple of details I want to sort out first.)

    But why Git?

    - First and foremost, because it is the most popular DVCS. I really want to have the advantages of disconnected development back. (I have tried git-svn and svk and they don't make the cut.)

    - At work, I have been using Git for a while to cope with the "deficiencies" of the centralized VCS of choice. We use the squashing functionality intensively, and I find it invaluable to be able to constantly and shamelessly commit incomplete/broken pieces of code that no one will ever see. Not everything deserves to be in the recorded history! (A sketch of this workflow follows at the end of this post.)

    - Related to the above, I've grown accustomed to keeping unnamed, private branches in my local copy of the repository. These branches need not match the public repository. In Monotone, you had this functionality in the form of "multiple heads for a given branch", but that approach is not as flexible as named private branches.

    - Monotone is able to export a repository to Git, so the transition is easy for ATF. I have actually been doing this periodically so that Ohloh can gather stats for ATF.

    - Lutok and Kyua are hosted in Google Code, and this hosting platform now supports Git out of the box.

    - No Mercurial? Mercurial looks a lot like Monotone, and it is indeed very tempting. However, the dependency on Python is not that appropriate in the NetBSD context. Git, without its documentation, builds very quickly and is lightweight enough. Plus, if I have to change my habits, I would rather go with Git given that the other open source projects I am interested in use Git.

    - No Bazaar? No: not that popular. And the fact that it is based on GNU arch makes me cringe.

    - No Fossil? This tool looks awesome and provides much more than DVCS functionality: think distributed wiki and bug tracking; cool, huh? It also appears to be a strong contender in the current discussions of what system NetBSD should choose to replace CVS. However, it is a one-man effort, much like Monotone was. And few people are familiar with it, so Fossil wouldn't solve the issue of lowering the entry barrier. Choosing Fossil would mean repeating the same mistake as choosing Monotone.

    So, while Git has its own deficiencies (e.g. I still don't like the fact that it is unable to record file moves; heuristics are not the same), it seems like a very good choice. The truth is, it will ease development by a factor of a million (OK, maybe not that much) and, because the only person (right?) who currently cares about the upstream sources for any of these projects is me, nobody should be affected by the change.

    The decision may seem a bit arbitrary given that the points above don't provide much rationale to compare Git against the other alternatives. But if I want to migrate, I have to make a choice, and this is the one that seems most reasonable.

    Comments? Encouragements? Criticisms? [Continue reading]
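
    To illustrate the squash-style workflow mentioned above (committing incomplete work privately and only publishing a clean result), here is a minimal sketch using plain Git commands; the branch and commit names are made up for the example.

        # Do messy, incremental work on a private branch; nobody sees these commits.
        $ git checkout -b wip-parser
        $ git commit -a -m "WIP: half-working parser"
        $ git commit -a -m "WIP: fix build"

        # Fold the whole branch into a single, presentable commit on master.
        $ git checkout master
        $ git merge --squash wip-parser
        $ git commit -m "Add the new parser"
        $ git branch -D wip-parser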

  • A brief look into Fedora's packaging infrastructure

    As you probably know, I have been a long-time "evangelist" of pkgsrc. I started contributing to this packaging system when I first tried NetBSD in 2001 by sending in new packages for small tools, and I later became a very active contributor while maintaining the GNOME packages. My active involvement came to an end a few years ago when I switched to OS X, but I still maintain a few packages and use pkgsrc on my multiple machines: a couple of NetBSD systems and 3 OS X machines.

    Anyway. pkgsrc is obviously not everything in this world, and if I realistically want other people to use my software, there have to be binary packages for more mainstream systems. Let's face it: nobody in their sane mind is going to come over to my project pages, download the source package, mess around with dependencies that do not have binary packages either, and install the results. Supposedly, I would need just one such person, who by coincidence would also be a packager of a mainstream distribution, to go through all these hoops and create the corresponding binary packages. Ah yes, what I said: not gonna happen anytime soon.

    Sooooo... I spent part of past week learning (again) how to create binary packages for Fedora and, to bring this into practice, I prepared an RPM of Lutok and pushed it to Fedora rawhide and Fedora 16. All in all, it has been a very pleasant experience, and the whole point of this post is to provide a small summary of the things I have noticed. Because I know pretty well how pkgsrc behaves and what its major selling points are, I am going to provide some pkgsrc-related comments in the text below.

    Please note that the text below lacks many details and that it may claim some facts that are not completely accurate. I'm still a novice in Fedora development land.

    First, let's start by describing the basic components of a package definition:

    - Spec file: The spec file of a package is RPM's "source description" of how to build the package (pkgsrc's Makefile) and also includes all the package's metadata (pkgsrc's PLIST, DESCR, etc.). It does not include patches nor the original sources. I must confess that having all the details in a single file is very convenient.

    - File lists: Contrary to (common?) misconception, spec files can and do have an explicit list of the files to be included in the package (same as pkgsrc's PLIST). This list can include wildcards to make package maintenance easier (e.g. you can avoid having to list all files generated by Doxygen and just include the directory name, which will just do the right thing). No matter what you do, the build system will ensure that all files generated by the package are part of the file list, to guarantee that the package is complete.

    - SRPMs: Think of these as tarballs of all the files you need to build a particular package, including the spec file, the source tarball and any additional patches. These files are very convenient for moving the source package around (e.g. to publish the package for review or to copy it to a different machine for rebuilding) and also for uploading the package to Koji's build system (see below).

    - Subpackages: Oh my... what a relief compared to pkgsrc's approach to this. Creating multiple independent packages from a single spec file is trivial, to the point where providing subpackages is encouraged rather than being a hassle (see the sketch at the end of this post). For what it's worth, I have always liked the idea of splitting development files from main packages (in the case of libraries), which in many cases helps in trimming down dependencies. pkgsrc fails miserably here: if you have ever attempted to split a package into subpackages to control the dependencies, you know what a pain the process is... and the results are a collection of unreadable Makefiles.

    Now let's talk a bit about guidelines and access control:

    - Policies: I was going to write about documentation in this point, but what I really wanted to talk about are policies. There are several policies governing packaging rules, and the important thing is that they are actually documented (rather than being tribal knowledge). The other nice thing is that their documentation is excellent; just take a moment to skim through the Packaging Guidelines page and you will see what I mean. The packaging committee is in charge of editing these policies whenever necessary.

    - Review process: Any new package must go through a peer review process. Having grown accustomed to Google's policy of having every single change to the source tree peer-reviewed, I can't stress how valuable this is. It may seem like a burden to newcomers but, really, it is definitely worth it. The review process is quite exhaustive and, from what I have seen so far, the reviewers tend to be nice and reasonable. As an example, take a look at Lutok's review.

    - Repository and ACLs: The source files that describe a package (mainly a spec file and a sources file) are stored in a Git repository (I believe there is a different repository for every package, but I may be wrong). This is nothing unusual, but the nice thing is that each package has its own read/write ACLs. New maintainers have access to their own packages only, which means that the barrier of entry can be lowered while resting assured that such contributors cannot harm the rest of the packages until they have gained enough trust. Of course, there is a set of trusted developers who can submit changes to any and every package.

    "But you said packaging infrastructure in the title!", you say. I know, I know, and this is what I wanted to talk most about, so here it goes:

    - Common tools: Other than the well-known rpm and yum utilities, developers have access to rpmbuild and fedpkg. rpmbuild sits at the lowest level of automation and exposes many details to the developer. fedpkg, on the other hand, is a nice wrapper around the whole packaging process (involving git, mock builds, etc.).

    - Koji: Koji is Fedora's build system, ready to build packages for you on demand from a simple command-line or web interface. Koji can be used to test the build of packages during the development process on architectures that the developer does not have (the so-called "scratch builds"). However, Koji is mainly used to generate the final binary packages that are pushed into the distribution. Once the packager imports a new source package into the repository, he triggers the build of the binary packages so that they can later be included in the distribution.

    - Bodhi: Bodhi is Fedora's update publishing system. When a packager creates a new version of a particular package and wishes to push that update to a formal release (say, Fedora 16), the update is first posted in Bodhi. Then, a set of scripts, rules and peer reviews determine whether the update is approved for publication on the branch or not.

    Let's now talk a bit about pkgsrc's touted strengths and how they compare to Fedora's approach:

    - Mass fixes: In pkgsrc, whenever a developer wants to change the infrastructure, he can make the change himself and later adjust all existing packages to conform to the modification. In Fedora, because some developers have write access to all packages, it certainly seems possible to apply a major fix and/or rototill to all packages in the same manner as is done in pkgsrc. Such a developer could also trigger a rebuild of all affected packages using a specific branch for testing purposes and later ensure that the modified packages still work.

    - Isolated builds: buildlink3 is an awesome pkgsrc technology that isolates the build of a particular package from the rest of the system by means of symlinks and wrapper scripts. However, pkgsrc is not alone here. Mock is Fedora's alternative: it provides a mechanism to build packages in a chroot environment to generate deterministic packages. The tools used to generate the "blessed" binary packages for a distribution (aka Koji) use this system to ensure the packages are sane.

    - Bulk builds: This is a term very familiar to pkgsrc developers, so I'm just mentioning it in passing because this is also doable in RPM land. While package maintainers are responsible for building the binary packages of the software they maintain (through Koji), authorized users (e.g. release engineering) can trigger rebuilds of any or all packages.

    And, lastly, let's raise the few criticisms I have up to this point:

    - Lack of abstractions: Spec files seem rather arcane compared to pkgsrc Makefiles when it comes to generalizing packaging concepts. What I mean by this is that spec files seem to duplicate lots of logic that would be better abstracted in the infrastructure itself. For example: if a package installs libraries, it is its responsibility to call ldconfig during installation and deinstallation. I have seen that some things that used to be needed in spec files a few years ago are now optional because they have moved into the infrastructure, but I believe there is much more that could be done. (RHEL backwards compatibility hurts here.) pkgsrc deals with these situations automatically depending on the platform, and extending the pkgsrc infrastructure to support more "corner cases" is easier.

    - No multi-OS support: One of the major selling points of pkgsrc is that it is cross-platform: it runs under multiple variants of BSD, Linux, OS X, and other obscure systems. It is true that RPM also works on all these systems, but Fedora's packaging system (auxiliary build tools, policies, etc.) does not. There is not much more to say about this given that it is an obvious design choice of the developers.

    To conclude: please keep in mind that the above is not intended to describe Fedora's system as a better packaging system than pkgsrc. There are good and bad things in each, and what you use will depend on your use case or operating system. What motivated me to write this post were just a few small things like Koji, Bodhi and subpackages, but I ended up writing much more to provide context and a more detailed comparison with pkgsrc. Now draw your own conclusions! ;-) [Continue reading]
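
    To make the spec file and subpackage comments above concrete, here is a rough sketch of what a library spec with a -devel subpackage looks like. This is not Lutok's actual spec: the name, version and file lists are made up, and a real Fedora spec needs more metadata and polish to pass review.

        Name:           libexample
        Version:        0.1
        Release:        1%{?dist}
        Summary:        Example C++ library
        License:        BSD
        URL:            http://example.org/
        Source0:        http://example.org/%{name}-%{version}.tar.gz
        BuildRequires:  gcc-c++

        %description
        Main package: ships the runtime library only.

        %package devel
        Summary:        Development files for %{name}
        Requires:       %{name}%{?_isa} = %{version}-%{release}

        %description devel
        Headers and unversioned shared objects, split out so that dependent
        packages do not drag development files in at runtime.

        %prep
        %setup -q

        %build
        %configure
        make %{?_smp_mflags}

        %install
        make install DESTDIR=%{buildroot}

        %files
        %{_libdir}/*.so.*

        %files devel
        %{_includedir}/*
        %{_libdir}/*.so

        %changelog
        * Mon Feb 06 2012 Your Name <you@example.net> - 0.1-1
        - Initial package.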

  • Kyua: Weekly status report

    - Created an RPM package for Lutok for inclusion in Fedora.

    - Created a preliminary RPM spec for ATF for Fedora. I am now in discussions with the FPC (Fedora Packaging Committee) to figure out how to install the tests on a Fedora system, as /usr/tests may not be appropriate.

    No activity on Kyua itself though, unfortunately. [Continue reading]

  • Kyua: Weekly status report

    This comes two days late... but anyway, I felt like posting it now instead of waiting until next Sunday/Monday.

    The activity of the past week focused mostly on implementing support for the require.memory test-case metadata property recently introduced into ATF. This was non-trivial due to the need to write some tricky Autoconf code to make it "slightly portable". Seriously: it's scary to see how hard it is to perform, in a portable manner, an operation as simple as "query the amount of physical memory"... but oh well, such are the native Unix APIs...

    I later spent the weekend on Lutok, preparing and publishing the project's first release and writing my first RPM library spec for it. This was a prerequisite for the upcoming 0.3 release of Kyua and thus deserved special attention! [Continue reading]
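
    As an illustration of the require.memory property mentioned above, here is a minimal sketch of an atf-sh test case that declares it. The test case and the helper program it runs are made up, and the exact value syntax accepted by the property should be checked against the ATF documentation for your installed version.

        # Hypothetical test case that is skipped unless the machine has at
        # least 2GB of physical memory available.
        atf_test_case huge_dataset
        huge_dataset_head() {
            atf_set "descr" "Processes a data set that needs plenty of RAM"
            atf_set "require.memory" "2g"
        }
        huge_dataset_body() {
            atf_check -s exit:0 ./process-huge-dataset
        }

        atf_init_test_cases() {
            atf_add_test_case huge_dataset
        }

    The portability pain mentioned above comes from the lack of a single Unix API for this query: BSD systems expose the value through sysctl (for example hw.physmem64 on NetBSD), whereas other platforms need entirely different interfaces, hence the Autoconf gymnastics.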