• Type-safe, dynamic tree data type

    The core component of the new configuration library in Kyua is the utils::config::tree class: a type-safe, dynamic tree data type. This class provides a mapping of string keys to arbitrary types: every node of the tree has a textual name, and it is either an inner node (no value attached to it) or a leaf node (a value of an arbitrary type attached to it). The keys represent traversals through the tree, and they do this by separating the node names with dots (things of the form root.inner1.innerN.leaf).

    The tree class is the in-memory representation of a configuration file, and is the data structure passed around methods and algorithms to tune their behavior. It replaces the previous static config structure.

    The following highlights describe the tree class:

    Keys are (and thus the tree layout is) pre-registered. One side effect of moving away from a static C++ structure as the representation of the configuration to a dynamic structure such as a tree is that the compiler can no longer validate the names of the configuration settings when they are queried. In the past, something like config.architecture would only compile if architecture was a field defined in the structure, but now code like config["architecture"] cannot be validated during the build.

    In order to overcome this limitation, trees must have their keys pre-defined. Pre-defining a key declares its type within the tree. Accesses to unknown keys result in an error right away, and accesses to pre-defined keys must always happen with their pre-recorded types.

    Note that a pre-defined node may or may not hold a value. The concept of "being set" is different from "being defined".

    Some nodes can be dynamic. Sometimes we do not know what particular keys are valid within a context.
    For example, the test_suites subtree of the configuration can contain arbitrary test suite names and properties within it, and there is no way for Kyua (at the moment) to know which keys are valid.

    As a result, the tree class allows defining a particular node as "dynamic", at which point accesses to any undefined keys below that node result in the creation of the node.

    Type safety. Every node has a type attached to it. The base configuration library provides common types such as bool_node, int_node and string_node, but the consumer can define its own node types to hold any other kind of data type. (It'd be possible, for example, to define a map_node to hold a full map as a tree leaf.)

    The "tricky" (and cool) part of type safety in this context is to avoid exposing type casts to the caller: the caller always knows what type corresponds to every key (because, remember, the caller had to predefine them!), so it knows what type to expect from every node. The tree class achieves this by using template methods, which just query the generic internal nodes and cast them (after validation) to the requested type.

    Plain string representations. The end user has to be able to provide overrides to configuration properties through the command line, and the command line is untyped: everything is a string. The tree library, therefore, needs a mechanism to internalize strings (after validation) and convert them to the particular node types. Similarly, it is useful to have a way to export the contents of a tree to strings so that they can be shown to the user.

    With that said, let's see a couple of examples. First, a simple one.
    Let's create a tree with a couple of fictitious nodes (one a string, one an integer), set some values and then query those values:

        config::tree tree;

        // Predefine the valid keys.
        tree.define< config::string_node >("kyua.architecture");
        tree.define< config::int_node >("kyua.timeout");

        // Populate the tree with some sample values.
        tree.set< config::string_node >("kyua.architecture", "powerpc");
        tree.set< config::int_node >("kyua.timeout", 300);

        // Query the sample values.
        const std::string architecture =
            tree.lookup< config::string_node >("kyua.architecture");
        const int timeout = tree.lookup< config::int_node >("kyua.timeout");

    Yep, that's it. Note how the code just knows about keys and their types, but does not have to mess around with type casts or tree nodes. And, if there is any typo in the property names or a type mismatch between the property and its requested node type, the code will fail early. This, coupled with extensive unit tests, ensures that configuration keys are always queried consistently.

    Note that we could also have set the keys above as follows:

        tree.set_string("kyua.architecture", "powerpc");
        tree.set_string("kyua.timeout", "300");

    ... which would result in the validation of "300" as a proper integer, its conversion to a native integer, and the storage of the resulting number in the integer node it corresponds to.
    This is useful, again, when reading configuration overrides from the command line: types are not known in that context, yet we want to store their values in the same data structure as the values read from the configuration file.

    Let's now see another very simple example showcasing dynamic nodes (which is a real-life example from the current Kyua configuration file):

        config::tree tree;

        // Predefine a subtree as dynamic.
        tree.define_dynamic("test_suites");

        // Populate the subtree with fictitious values.
        tree.set< config::string_node >("test_suites.NetBSD.ffs", "ext2fs");
        tree.set< config::int_node >("test_suites.NetBSD.iterations", 5);

        // And the querying would happen exactly as above with lookup().

    Indeed, it'd be very cool if this tree type followed more standard STL conventions (iterators, for example). But I didn't really think about this when I started writing this class and, to be honest, I don't need this functionality.

    Now, if you paid close attention to the above, you may start to see the relation of this structure to the syntax of configuration files. I'll tell you how this ties together with Lua in a later post. (Which may also explain why I chose this particular representation.)
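To make the ideas in this post more concrete, here is a much-simplified sketch of how pre-registered keys, typed lookups and string overrides could fit together. It is not Kyua's utils::config code: the sketch namespace, the typed_node helper, the flat std::map keyed by the full dotted path (instead of a real tree of inner nodes) and the choice of exceptions are all assumptions made for illustration. In this sketch, dynamic subtrees only create nodes on set().

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical; not Kyua's utils::config

// Generic node: the tree stores these and knows nothing about value types.
class node {
public:
    virtual ~node() = default;
    virtual void set_string(const std::string& raw) = 0;
};

// Shared storage for leaf nodes; tracks "set" separately from "defined".
template <typename T>
class typed_node : public node {
public:
    using value_type = T;
    void set(const T& new_value) { value_ = new_value; is_set_ = true; }
    const T& value() const {
        if (!is_set_) throw std::runtime_error("key is defined but not set");
        return value_;
    }
private:
    T value_{};
    bool is_set_ = false;
};

class string_node : public typed_node<std::string> {
public:
    void set_string(const std::string& raw) override { set(raw); }
};

class int_node : public typed_node<int> {
public:
    // Validates the whole string as an integer before storing it.
    void set_string(const std::string& raw) override {
        std::size_t used = 0;
        int parsed = 0;
        try { parsed = std::stoi(raw, &used); } catch (const std::exception&) { used = 0; }
        if (raw.empty() || used != raw.size())
            throw std::invalid_argument("not an integer: " + raw);
        set(parsed);
    }
};

class tree {
public:
    // Registers a key together with its node type.
    template <typename NodeType>
    void define(const std::string& key) {
        if (!nodes_.emplace(key, std::make_unique<NodeType>()).second)
            throw std::logic_error("key already defined: " + key);
    }

    // Marks everything below "prefix." as creatable on first set().
    void define_dynamic(const std::string& prefix) { dynamic_.insert(prefix + "."); }

    template <typename NodeType>
    void set(const std::string& key, const typename NodeType::value_type& value) {
        if (nodes_.count(key) == 0 && is_dynamic(key)) define<NodeType>(key);
        find<NodeType>(key).set(value);
    }

    // Typed lookup: the cast happens here, never in the caller.
    template <typename NodeType>
    const typename NodeType::value_type& lookup(const std::string& key) const {
        return find<NodeType>(key).value();
    }

    // Untyped entry point for values that arrive as strings.
    void set_string(const std::string& key, const std::string& raw) {
        find<node>(key).set_string(raw);
    }

private:
    bool is_dynamic(const std::string& key) const {
        for (const std::string& prefix : dynamic_)
            if (key.compare(0, prefix.size(), prefix) == 0) return true;
        return false;
    }

    // Finds a pre-defined node and validates its type before casting.
    template <typename NodeType>
    NodeType& find(const std::string& key) const {
        const auto it = nodes_.find(key);
        if (it == nodes_.end()) throw std::out_of_range("unknown key: " + key);
        assert(it->second != nullptr);  // define() never stores a null node
        NodeType* typed = dynamic_cast<NodeType*>(it->second.get());
        if (typed == nullptr) throw std::invalid_argument("type mismatch for key: " + key);
        return *typed;
    }

    std::map<std::string, std::unique_ptr<node>> nodes_;
    std::set<std::string> dynamic_;
};

}  // namespace sketch
```

With these definitions, a typo in a key name or a lookup with the wrong node type throws immediately instead of returning a bogus value, which is the "fail early" behavior the post describes. A real implementation would presumably keep inner nodes as actual tree nodes rather than flattening the keys as done here.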

  • Rethinking Kyua's configuration system

    In the previous blog post, I described the problems posed by the implementation of the Kyua configuration file parsing and in-memory representation. I also hinted that some new code was coming and, after weeks of work, I'm happy to say that it has just landed in the tree!

    I really want to get to explaining the nitty-gritty details of the implementation, but I'll keep these for later. Let's focus first on what the goals for the new configuration module were, as these drove a lot of the implementation details:

    Key/value pairs representation: The previous configuration system did this already, and it is a pretty good form for a configuration file because it is a simple, understandable and widespread format. Note that I have not said anything yet about the types of the values.

    Tree-like representation: The previous configuration schema grouped test-suite-specific properties under a "test_suites" map while it left internal run-time properties in the global namespace. The former is perfect and the latter was done just for simplicity. I want to move towards a tree of properties to give context to each of them so that they can be grouped semantically (e.g. kyua.report.*, kyua.runtime.*, etc.). The new code has not changed the structure of the properties yet (to remain compatible with previous files), but it makes this very simple to change in the near future.

    Single-place parsing and validation: A configuration file is an external representation of a set of properties. This data is read (parsed) once and converted into an in-memory representation. All validation of the values of the properties must happen at this stage, and not when the properties are queried.
    The reason is that validation of external values must be consistent and has to happen in a controlled location (so that errors can all be reported at the same time).

    I have seen code in other projects where the configuration file is stored in memory as a set of key/value string pairs and parsing to other types (such as integers, etc.) is delayed until the values are used. The result is that, if a property is queried in more than one place, the validation ends up implemented in different forms, each with its own bugs, which results in dangerous inconsistencies.

    Type safety: This is probably the trickiest bit. Every configuration node must be stored in the type that makes most sense for its value. For example: a timeout in seconds is an integer, so the in-memory representation must be an integer. Or another example: the type describing the "unprivileged user" is a data structure that maps to a system user, yet the configuration file just specifies either a username or a UID.

    Keeping strict type validation in the code is useful because it helps to ensure that parsing and validation happen in just a single place: whenever the configuration file is read, every property has to be converted to its in-memory type, and this means that the validation can only happen at that particular time. Once the data is in memory, we can and have to assume that it is valid. Additionally, strict types ensure that the code querying such properties uses the values as intended, without having to do additional magic to map them to other types.

    Extensibility: Parsing a configuration file is a very generic concept, yet the previous code made the mistake of tying this logic to the specific details of Kyua configuration files. A goal of the new code has been to write a library that parses configuration files, and allows the Kyua-specific code to define the schema of the configuration file separately.
    (No, the library is not shipped separately at this point; it's placed in its own utils::config module.)

    With all this code in place, there are a bunch of things that can now be easily implemented. Consider the following:

      • Properties to define the timeout of test cases depending on their size (long-standing issue 5).
      • Properties to tune the UI behavior: width of the screen, whether to use color or not (no, there is no color support yet), etc.
      • Properties to configure what reports look like "by default": if you generate reports of any form frequently, it is very likely that you will want them to look the same every time and hence you will want to define the report settings once in the configuration file.
      • Hooks: one of the reasons for using Lua-based configuration files was to allow providing extra customization abilities to the user. Kyua could theoretically call back into Lua code to perform particular actions, and such actions could be explicitly stated by the user in the form of Lua functions. Neither the current configuration code nor Kyua has support for hooks, but the new implementation makes it rather easy to add them.

    And that's all for today. Now that you know what the current code is trying to achieve and why, we will be able to look at how the implementation does all this in the next posts.
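The "single-place parsing and validation" goal from this post can be sketched as follows. This is only an illustration under assumptions: the settings structure, the property names, the positive-integer rule for the timeout and both function names are hypothetical, and the raw key/value map stands in for whatever the real parser produces. The point is that every raw string is converted and validated exactly once, and all problems are collected so that they can be reported together.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {  // hypothetical names throughout

// In-memory form: every field already has its final, validated type.
struct settings {
    std::string architecture;
    int timeout_seconds = 0;
};

// Converts a raw value into a positive integer, or throws std::invalid_argument.
inline int parse_positive_int(const std::string& raw) {
    std::size_t used = 0;
    int value = 0;
    try { value = std::stoi(raw, &used); } catch (const std::exception&) { used = 0; }
    if (raw.empty() || used != raw.size() || value <= 0)
        throw std::invalid_argument("expected a positive integer, got '" + raw + "'");
    return value;
}

// Parses and validates every property exactly once.  Problems are appended
// to `errors` so that the caller can report all of them at the same time.
inline settings load(const std::map<std::string, std::string>& raw,
                     std::vector<std::string>& errors) {
    settings result;
    for (const auto& entry : raw) {
        try {
            if (entry.first == "architecture")
                result.architecture = entry.second;
            else if (entry.first == "timeout")
                result.timeout_seconds = parse_positive_int(entry.second);
            else
                errors.push_back("unknown property: " + entry.first);
        } catch (const std::invalid_argument& e) {
            errors.push_back(entry.first + ": " + e.what());
        }
    }
    return result;
}

}  // namespace sketch
```

Code that later reads settings::timeout_seconds never parses or validates anything again; if load() left the error list empty, the value can be assumed valid.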

  • Kyua's configuration system showing its age

    A couple of years ago, when Kyua was still a newborn, I wrote a very ad-hoc solution for the parsing and representation of its configuration files. The requirements for the configuration were minimal, as there were very few parameters to be exposed to the user. The implementation was quick and simple to allow further progress on other more-important parts of the project. (Yep, quick is a euphemism for dirty: the implementation of the "configuration class" has to special-case properties everywhere to deal with their particular types... just as the Lua script has to do too.)

    As I just mentioned in the previous paragraph, the set of parameters exposed through the configuration file was minimal. Let's recap what these are:

      • Run-time variables: architecture and platform, which are two strings identifying the system; and unprivileged_user, which (if defined) is the name of the user under which to run unprivileged tests. It is important to mention that the unprivileged_user is internally represented by a data type that includes several properties about a system user, and that it ensures that the data it contains is valid at all times. The fact that every property holds a specific type is an important design requirement.
      • Test suite variables: every test suite can accept arbitrary configuration variables. Actually, these are defined by the test programs themselves. All of these properties are strings (and cannot be anything else because ATF test programs have no way of indicating the type of the configuration variables they accept/expect).

    Because of the reduced set of configurable properties, I opted to implement the configuration of the program as a simple data structure with one field per property, and a map of properties to represent the arbitrary test suite variables. The "parser" to populate this structure consists of a Lua module that loads these properties from a Lua script.
    The module hooks into the Lua metatables to permit things like "test_suites.NetBSD.timeout=20" to work without having to predeclare the intermediate tables.

    Unfortunately, as I keep adding more and more functionality to Kyua, I encounter additional places where a tunable would be appreciated by the end user (e.g. "disallow automatic line wrapping"). Exposing such a tunable through a command-line flag would be a possibility, but some of these need to be permanent in order to be useful. It is clear that these properties have to be placed in the configuration file, and attempting to add them to the current codebase shows that the current abstractions in Kyua are not flexible enough.

    So, why am I saying all this? Well: during the last few weeks, I have been working on a new configuration module for Kyua. The goals have been simple:

      • Have a generic configuration module that parses configuration files only, without any semantics about Kyua (e.g. what variables are valid or not). This ensures that the implementation is extensible and at the right level of abstraction.
      • Be able to get rid of the ad-hoc parsing of configuration files.
      • Allow defining properties in a strictly-typed tree structure. Think about being able to group properties by function, e.g. "kyua.host.architecture"; this is more or less what we have today for test-suite properties, but the implementation is a special case again and cannot be applied to other tunables.

    And... I am pleased to say that this code is about to get merged into the tree just in time for Kyua 0.4. In the next few posts, I will explain what the particular design constraints of this new configuration system were and outline its implementation a little. I think it's a pretty cool hack that mixes C++ data structures and Lua scripts in a "transparent" manner, although you may think it's too complex. The key part is that, as this new configuration module is not specific to Kyua, you might want to borrow the code/ideas for your own use!
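For readers who want to picture the original design described earlier in this post (one field per property, plus a map for the arbitrary test suite variables), a sketch could look like the following. The names are hypothetical, and unprivileged_user is reduced to a plain string here, whereas the post says Kyua represents it with a richer, always-valid user type.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {  // hypothetical; field types simplified to strings

using properties_map = std::map<std::string, std::string>;

struct config {
    // Run-time variables: one statically typed field per property.
    std::string architecture;
    std::string platform;
    std::string unprivileged_user;  // empty means "not defined" in this sketch

    // Test suite variables: suite name -> (variable name -> string value).
    std::map<std::string, properties_map> test_suites;
};

}  // namespace sketch
```

With a layout like this, the compiler checks field names such as architecture, but every new tunable needs a new field plus matching changes in the code that populates the structure, which is the inflexibility the post complains about.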