  • A teeny tiny review of Twitterville

    After about two months, I finally finished reading Twitterville by Shel Israel (@shelisrael). One of my followers (@drio) asked for a review of the book, so here is my attempt at one.

    But first, a quick summary: Twitterville is a book that focuses on the dynamics of Twitter. It starts by explaining how Twitter works, but that is only a tiny introductory part of the book. Most of the book explains how people and businesses interact with each other by means of Twitter, and it does so by providing lots of real-life stories. The stories cover topics as diverse as businesses offering deals and individuals raising funds for specific causes.

    The book is easy to read and well structured, and as you read through it you will realize that the author had to put major research effort into collecting all the stories that he presents. I personally enjoyed the first half of the book a lot, but at some point I ran out of time for reading and my interest dropped. It was hard to pick up reading again because the book becomes quite repetitive after a few chapters; just keep in mind that it is a collection of personal experiences organized by major topics and you won't be disappointed.

    Twitterville has changed my view of Twitter. As many people do, I used to disregard Twitter as a useless "status updates" system, but I have discovered many "use cases" for it that I could not have imagined. Today, I have set up several Twitter searches to monitor topics of interest to me and I engage with people that I did not know beforehand. It kinda feels like a world-wide unorganized chat room to me... but, as the author mentions many times, the way you see and use Twitter is up to you and you alone!

  • Injecting C++ functions into Lua

    The C++ interface to Lua implemented in Kyua exposes a lua::state class that wraps the lower-level lua_State* type. This class completely hides the internal C type of Lua to ensure that all calls that affect the state go through the lua::state class.

    Things get a bit messy when we want to inject native functions into the Lua environment. These functions follow the prototype represented by the lua_CFunction type:

        typedef int (*lua_CFunction)(lua_State*);

    Now, let's consider this code:

        int
        awesome_native_function(lua_State* state)
        {
            // Uh, we have access to state, so we bypass the lua::state!
            ... do something nasty ...
            // Oh, and we can throw an exception here...
            // with bad consequences.
        }

        void
        setup(...)
        {
            lua::state state;
            state.push_c_function(awesome_native_function);
            state.set_global("myfunc");
            ... run some script ...
        }

    The fact that we must pass a function with the lua_CFunction prototype to lua_pushcfunction means that such a function must have access to the raw lua_State* pointer... which we want to avoid.

    What we really want is for the caller code to define a function such as:

        typedef int (*cxx_function)(lua::state&);

    In an ideal world, the lua::state class would implement a push_cxx_function that took a cxx_function, generated a thin C wrapper and injected the generated wrapper into Lua. Unfortunately, we are not in an ideal world: C++ does not have higher-order functions and thus the "generate a wrapper function" part of the previous proposal does not really work.

    What we can do instead, though, is to make the creation of C wrappers for these C++ functions trivial. And this is what r42 did. The approach I took is similar to this overly-simplified (and broken) example:

        template< cxx_function Function >
        int
        wrap_cxx_function(lua_State* state)
        {
            try {
                lua::state state_wrapper(state);
                return Function(state_wrapper);
            } catch (...) {
                return luaL_error(state, "Geez, don't go into C's land!");
            }
        }

    This template wrapper takes a cxx_function as its template argument and generates a corresponding C function at compile time. The wrapper function ensures that C++ state (exceptions, in particular) does not propagate into the C world, as that often has catastrophic consequences. (Due to language limitations, the input function must have external linkage. So no, it cannot be static.)

    As a result, we can rewrite our original snippet as:

        int
        awesome_native_function(lua::state& state)
        {
            // See, we cannot access lua_State* now.
            ... do something ...
            throw std::runtime_error("And we can even do this!");
        }

        void
        setup(...)
        {
            lua::state state;
            state.push_c_function(
                wrap_cxx_function< awesome_native_function >);
            state.set_global("myfunc");
            ... run some script ...
        }

    Neat? I think so, but maybe not so much. I'm pretty sure there are cleaner and cooler ways of achieving the same thing, but this one works nicely and has little overhead.
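
    To play with the wrap_cxx_function idea without linking against Lua, here is a minimal, self-contained sketch. Everything in it is made up for illustration: raw_state stands in for lua_State, safe_state for lua::state, and call_from_c for the C library calling back into us. Unlike the snippet above, this version reports failures through a return code and an error buffer instead of luaL_error, which is my own simplification and not what Kyua does.

```cpp
#include <cassert>
#include <cstdio>
#include <exception>
#include <stdexcept>

// Hypothetical stand-in for the C-level lua_State; not the real Lua type.
struct raw_state {
    int value;
    char error[64];
};

// Hypothetical stand-in for lua::state: the only view C++ callbacks get.
class safe_state {
    raw_state* _raw;

public:
    explicit safe_state(raw_state* raw) : _raw(raw) {}
    int value() const { return _raw->value; }
    void set_value(const int v) { _raw->value = v; }
};

// The C-style prototype the mock C library expects (like lua_CFunction).
typedef int (*c_function)(raw_state*);

// The C++-friendly prototype we want callers to implement.
typedef int (*cxx_function)(safe_state&);

// One C-compatible function is instantiated per wrapped function at compile
// time.  Exceptions become an error code (-1) plus a message, so they never
// cross into C code.
template< cxx_function Function >
int
wrap_cxx_function(raw_state* raw)
{
    try {
        safe_state wrapper(raw);
        return Function(wrapper);
    } catch (const std::exception& e) {
        std::snprintf(raw->error, sizeof(raw->error), "%s", e.what());
        return -1;
    } catch (...) {
        std::snprintf(raw->error, sizeof(raw->error), "%s", "unknown error");
        return -1;
    }
}

// Sample callbacks.  They are deliberately not static (see the linkage note).
int
double_value(safe_state& state)
{
    state.set_value(state.value() * 2);
    return 0;
}

int
always_fails(safe_state&)
{
    throw std::runtime_error("boom");
}

// Mock of the C library calling back into a registered function.
int
call_from_c(c_function function, raw_state* raw)
{
    return function(raw);
}

// Usage example: returns the value left in the mock state.
int
demo()
{
    raw_state raw = {21, ""};
    assert(call_from_c(wrap_cxx_function< double_value >, &raw) == 0);
    assert(call_from_c(wrap_cxx_function< always_fails >, &raw) == -1);
    assert(raw.error[0] == 'b');  // The exception text was captured.
    return raw.value;
}
```

    The point to notice is that wrap_cxx_function< double_value > and wrap_cxx_function< always_fails > are two distinct plain functions, so each can be handed to code that only understands C function pointers.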

  • Error handling in Lua: the Kyua approach

    About a week ago, I detailed the different approaches I encountered to deal with errors raised by the Lua C API. Later, I announced the new C++ interface for Lua implemented within Kyua. And today, I would like to talk about the specific mechanism I implemented in this library to deal with Lua errors.

    The first thing to keep in mind is that the whole purpose of Lua in the context of Kyua is to parse configuration files. This is an infrequent operation, so high performance does not matter: it is more valuable to me to be able to write robust algorithms fast than to have them run at optimal speed. The other key point to consider is that I want Kyua to be able to use prebuilt Lua libraries, which are built as C binaries.

    The approach I took is to wrap every single unsafe Lua C API call in a "thin" (FSVO thin depending on the case) wrapper that gets called by lua_pcall. Anything that runs inside the wrapper is safe from Lua errors, as they are caught and safely reported to the caller.

    Let's examine how this works by taking a look at an example: the wrapping of lua_getglobal. We have the following code (copy-pasted from the utils/lua/wrap.cpp file but hand-edited for publishing here):

        static int
        protected_getglobal(lua_State* state)
        {
            lua_getglobal(state, lua_tostring(state, -1));
            return 1;
        }

        void
        lua::state::get_global(const std::string& name)
        {
            lua_pushcfunction(_pimpl->lua_state, protected_getglobal);
            lua_pushstring(_pimpl->lua_state, name.c_str());
            if (lua_pcall(_pimpl->lua_state, 1, 1, 0) != 0)
                throw lua::api_error::from_stack(_pimpl->lua_state,
                                                 "lua_getglobal");
        }

    The state::get_global method is my public wrapper for the lua_getglobal Lua C API call. This wrapper first prepares the Lua stack by pushing the address of the C function to call and its parameters, and then issues a lua_pcall call that executes the C function in a Lua protected environment.

    In this case, the argument preparation for protected_getglobal is trivial because the lua_getglobal call does not require access to any preexisting values on the Lua stack. Things get much trickier when it does, as in the case of wrappers for calls such as lua_gettable, which take their operands from the stack. I'll leave understanding how to do this as an exercise to the reader (but you can cheat by looking at line 154).

    Anyway. The above all looks very nice and safe, and the tests for the state::get_global function, even the ones that intentionally cause a failure, all work fine. So we are good, right? Nope! Unfortunately, the code above is not fully safe from Lua errors.

    In order to prepare the lua_pcall execution, the code must push values on the stack. As it turns out, both lua_pushcfunction and lua_pushstring can fail if they run out of memory (OOM). Such a failure would of course be captured inside a protected environment... but we have a little chicken'n'egg problem here. That said, OOM failures are rare so I'm going to leverage this fact and not worry about it. (Note to self: install a lua_atpanic handler to complain loudly if that ever happens.)

    Addendum: Bundling Lua within my program and building it as a C++ binary with exception reporting enabled in luaconf.h would magically solve all my issues. I know. But I don't fancy the idea of bundling the library into my source tree for a variety of reasons.

  • C++ interface to Lua for Kyua

    Finally! After two weeks of holiday work, I have finally been able to submit Kyua's r39: a generic library that implements a C++ interface to Lua. The code is hosted in the utils/lua/ subdirectory.

    From the revision description:

        The utils::lua library provides thin C++ wrappers around the Lua C API to ease the interaction between C++ and Lua. These wrappers make intensive use of RAII to prevent resource leakage, expose C++-friendly data types, report errors by means of exceptions and ensure that the Lua stack is always left untouched in the face of errors. The library also provides a place (the operations module) to add miscellaneous utility functions built on top of the wrappers.

    In other words: this code aims to decouple all details of the interaction with the Lua C API from the main code of Kyua so that the high-level algorithms do not have to worry about Lua C API idiosyncrasies.

    Further changes to Kyua to implement the new configuration system will follow soon, now that all the basic code to talk to Lua has been ironed out. Also expect some extra posts regarding the design decisions that went into this helper code and, in particular, about error reporting as mentioned in the previous post.

    (Yep, Lua and Kyua sound similar. But that was never intended; promise!)
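
    The "Lua stack is always left untouched in the face of errors" guarantee can be illustrated with a small RAII guard. This is only a sketch under my own assumptions: mock_stack is a std::vector standing in for the real Lua stack, and the stack_guard class and its commit method are names invented for this example rather than the ones used in utils/lua/.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the Lua stack; the real one lives in lua_State.
typedef std::vector< int > mock_stack;

// RAII guard: remembers the stack depth at construction and, unless told to
// keep the new values, restores that depth on destruction.
class stack_guard {
    mock_stack& _stack;
    std::size_t _original_size;
    bool _armed;

    stack_guard(const stack_guard&);             // Not copyable.
    stack_guard& operator=(const stack_guard&);  // Not assignable.

public:
    explicit stack_guard(mock_stack& stack) :
        _stack(stack), _original_size(stack.size()), _armed(true) {}

    ~stack_guard()
    {
        if (_armed && _stack.size() > _original_size)
            _stack.resize(_original_size);
    }

    // Call once the operation has succeeded to keep the pushed values.
    void commit() { _armed = false; }
};

// Pushes two values and, if asked to, fails before committing them.
void
push_pair(mock_stack& stack, const bool fail)
{
    stack_guard guard(stack);
    stack.push_back(1);
    stack.push_back(2);
    if (fail)
        throw std::runtime_error("simulated Lua API error");
    guard.commit();
}

// Usage example: returns the stack depth after one failure and one success.
std::size_t
demo()
{
    mock_stack stack;
    stack.push_back(0);
    try {
        push_pair(stack, true);
    } catch (const std::runtime_error&) {
        assert(stack.size() == 1);  // The failed call left no residue.
    }
    push_pair(stack, false);
    return stack.size();
}
```

    Because the cleanup lives in a destructor, it runs on every exit path that unwinds through C++ code, which is exactly why reporting errors as exceptions and RAII go well together.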

  • Error handling in Lua

    Some of the functions of the Lua C API can raise errors. To get an initial idea of what these are, take a look at the Functions and Types section of the Lua reference manual and pay attention to the third field of a function description (the one denoted by 'x' in the introduction).

    Dealing with the errors raised by these functions is tricky, not to say a nightmare. Also, the ridiculously short documentation on this topic does not help. This post is dedicated to explaining how these errors may be handled along with the advantages and disadvantages of each approach.

    The Lua C API provides two modes of execution: protected and unprotected. When in protected mode, all errors caused by Lua are caught and reported to the caller in a controlled manner. When in unprotected mode, the errors just abort the execution of the calling process by default. So, one would think: just run the code in protected mode, right? Yeah, well... entering protected mode is nontrivial and it has its own particularities that make interaction with C++ problematic.

    Let's analyze error reporting by considering a simple example: the lua_gettable function. The following Lua code would error out when executed:

        my_array = nil
        return my_array["test"]

    ... which is obvious because indexing a non-table object is a mistake. Now let's consider what this code would look like in C (modulo the my_array assignment):

        lua_getglobal(state, "my_array");
        lua_pushstring(state, "test");
        lua_gettable(state, -2);

    Simple, huh? Sure, but as it turns out, any of the API calls (not just lua_gettable) in this code can raise errors (I'll call them unsafe functions). What this means is that, unless you run the code with a lua_pcall wrapper, your program will simply exit in the face of a Lua error. Uh, your scripting language can "crash" your host program out of your control? Not nice.

    What would be nice is if each of the Lua C API unsafe functions reported an error (as a return value or whatever) and allowed the caller to decide what to do. Ideally, no state would change in the face of an error. Unfortunately, that is not the case but it is exactly what I would like to do. I am writing a C++ wrapper for Lua in the context of Kyua, and fine granularity in error reporting means that automatic cleanup of resources managed by RAII is trivial.

    Let's analyze the options that we have to control errors caused within the Lua C API. I will explain in a later post the one I have chosen for the wrapper in Kyua (it has to be later because I'm not settled yet!).

    Install a panic handler

    Whenever Lua code runs in an unprotected environment, one can use lua_atpanic to install a handler for errors. The function provided by the user is executed when the error occurs and, if the panic function returns, the program exits. To prevent exiting prematurely, one could opt for two mechanisms:

      • Make the panic handler raise a C++ exception. Sounds nice, right? Well, it does not work. The Lua library is generally built as a C binary, which means that our panic handler will be called from within a C environment. As a result, we cannot throw an exception from our C++ handler and expect things to work: the exception won't propagate correctly from a C++ context to a C context and then back to C++. Most likely, the program will abort as soon as we leave the C++ world and enter C to unwind the stack.

      • Use setjmp before the call to the unsafe Lua function and recover with longjmp from within the panic handler. It turns out that this does work but with one important caveat: the Lua stack is completely cleared before the call to the panic handler. As a result, this makes it impossible to fulfill the requirement of "leave the stack unmodified on failure", as is desired of any function (report errors early before changing state).

    Run every single call in a protected environment

    This is doable but complex and not completely right: to do this, we need to write a C wrapper function for every unsafe API function and run it with lua_pcall. The overhead of this approach is significant: something as simple as a call to lua_gettable turns into several stack manipulation operations, a call to lua_pcall and then further stack modifications to adjust the results.

    Additionally, in order to prepare the call to lua_pcall, one has to use the multiple lua_push* functions to prepare the stack for the call. And, guess what, most of these functions that push values onto the stack can themselves fail. So... in order to prepare the environment for a safe call, we are already executing unsafe calls. (Granted, the errors in this case are only due to memory exhaustion... but still, the solution is not fully robust.)

    Lastly, note that we cannot use lua_cpcall because it discards all return values of the executed function. Which means that we can't really wrap single Lua operations. (We could wrap a whole algorithm though.)

    Run the whole algorithm in a protected environment

    This defeats the whole purpose of the per-function wrapping. We would need to provide a separate C/C++ function that runs all unsafe code and then call it by means of lua_pcall (or lua_cpcall) so that errors are captured and reported in a controlled manner. This seems very efficient... albeit not transparent, and it will surely cause issues.

    Why is this problematic? Errors that happen inside the protected environment are managed by means of a longjmp. If the code wrapped by lua_pcall is a C++ function, it can instantiate objects. These objects have destructors. A longjmp out of the function means that no destructors will run... so objects will leak memory, file descriptors, and anything you can imagine. Doomsday.

    Yes, I know Lua can be rebuilt to report internal errors by means of exceptions, which would make this particular problem a non-issue... but this rules out any pre-packaged Lua binaries (the default is to use longjmp and hence that is what packaged binaries use). I do not want to embed Lua into my source tree. I want to use Lua binary packages shipped with pretty much any OS (hey, including NetBSD!), which means that my code needs to be able to cope with Lua binaries that use setjmp/longjmp internally.

    Closing remarks

    I hope the above description makes sense, because I had to omit many, many details in order to make the post reasonably short. It could also be that there are other alternatives I have not considered, in which case I'd love to know about them. Trying to find a solution to the above problem has already sucked up several days of my free time, which translates into Kyua not seeing any further development until a solution is found!
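
    To make the setjmp/longjmp recovery idea discussed in this post a bit more concrete, here is a rough sketch that does not depend on Lua at all. The "C library" below (recovery_point, raise_error and unsafe_divide) is entirely made up and merely imitates a library that reports errors by jumping away; checked_divide is an equally hypothetical C++ wrapper that turns such a jump into an exception. Note how the wrapper keeps every object with a non-trivial destructor out of the region that the jump skips over, which is the constraint that makes wrapping whole C++ algorithms so dangerous.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstddef>
#include <stdexcept>

// --- Mock "C library"; a stand-in for a longjmp-based Lua build. ---

// Where errors jump to; set by whoever enters "protected mode".
static std::jmp_buf* recovery_point = NULL;

// Reports an error the C way: never returns to its caller.
static void
raise_error()
{
    std::longjmp(*recovery_point, 1);
}

// An "unsafe" API function: jumps away instead of returning on error.
static int
unsafe_divide(const int a, const int b)
{
    if (b == 0)
        raise_error();
    return a / b;
}

// --- C++ side. ---

// Runs unsafe_divide with a recovery point installed and converts the
// longjmp-based error into a C++ exception.  No object with a non-trivial
// destructor is alive between the setjmp and the longjmp.
int
checked_divide(const int a, const int b)
{
    std::jmp_buf env;
    std::jmp_buf* const previous = recovery_point;
    recovery_point = &env;

    if (setjmp(env) != 0) {
        // We land here after raise_error() jumped back.
        recovery_point = previous;
        throw std::runtime_error("division failed inside the C library");
    }

    const int result = unsafe_divide(a, b);
    recovery_point = previous;
    return result;
}

// Usage example: checks the error path and returns a successful result.
int
demo()
{
    bool caught = false;
    try {
        checked_divide(1, 0);
    } catch (const std::runtime_error&) {
        caught = true;
    }
    assert(caught);
    return checked_divide(10, 2);
}
```

    The exception is only thrown once control is back in an ordinary C++ frame, so callers of checked_divide can rely on RAII as usual.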

  • Understanding setjmp/longjmp

    For a long time, I have been aware of the existence of the standard C functions setjmp and longjmp and that they can be used to simulate exceptions in C code. However, it wasn't until yesterday that I had to use them... and it was not trivial. The documentation for these functions tends to be confusing, and understanding them required looking for additional documents and a bit of experimentation. Let's see if this post helps in clarifying how these functions work.

    The first call to setjmp causes the execution context (stack pointer, CPU registers, etc.) to be saved in the provided jmp_buf structure and, then, a value of 0 to be returned. A subsequent call to longjmp with the same jmp_buf structure causes the process to go "back in time" to the state stored in said structure. The way this is useful is that, when going back in time, we tweak the return value of the setjmp call so we can actually run a second (or third or more) path as if nothing had happened.

    Let's see an example:

        #include <setjmp.h>
        #include <stdio.h>
        #include <stdlib.h>

        static jmp_buf buf;

        static void
        myfunc(void)
        {
            printf("In the function.\n");
            /* ... do some complex stuff ... */
            /* Go back in time: restore the execution context of setjmp
             * but make the call return 1 instead of 0. */
            longjmp(buf, 1);
            printf("Not reached.\n");
        }

        int
        main(void)
        {
            if (setjmp(buf) == 0) {
                /* Try block. */
                printf("Trying some function that may throw.\n");
                myfunc();
                printf("Not reached.\n");
            } else {
                /* Catch block. */
                printf("Exception caught.\n");
            }
            return EXIT_SUCCESS;
        }

    The example above shows the following when executed:

        Trying some function that may throw.
        In the function.
        Exception caught.

    So, what happened above? The code starts by calling setjmp to record the execution state and the call returns 0, which causes the first part of the conditional to run. You can think of this clause as the "try" part of exception-based code. At some point during the execution of myfunc, an error is detected and is "thrown" by a call to longjmp with a value of 1. This causes the process to go back to the execution of setjmp but this time the call returns 1, which causes the second part of the conditional to run. You can think of this second clause as the "catch" part of exception-based code.

    It is still unclear to me what the "execution context" stored in jmp_buf is: the documentation does not explain what kind of resources are correctly unwound when the call to longjmp is made... which makes me wary of using this technique for exception-like handling purposes. Oh, and this is even less clear in the context of C++ code and, e.g., calls to destructors. It would be nice to expand the description of these APIs in the manual pages.
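
    A couple of details that the C standard does pin down can be checked with a small experiment: the value given to longjmp is what setjmp returns the second time (except that 0 is turned into 1), and a local variable that is modified after setjmp only keeps a well-defined value across the jump if it is declared volatile. The sketch below uses C++ for consistency with the Kyua posts; the context, fail_with and observed_code names are mine.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf context;

// "Throws" the given code by jumping back to the last setjmp(context).
static void
fail_with(const int code)
{
    std::longjmp(context, code);
}

// Jumps with the given code and reports which branch "caught" it.
int
observed_code(const int thrown)
{
    // Modified after setjmp and read after the jump, so it must be volatile
    // to hold a well-defined value once control comes back here.
    volatile int steps = 0;

    switch (setjmp(context)) {
    case 0:  // Direct return from setjmp: the "try" path.
        steps = 1;
        fail_with(thrown);
        return -1;  // Not reached.
    case 1:  // Reached for longjmp(context, 1) and longjmp(context, 0).
        assert(steps == 1);
        return 1;
    case 2:
        return 2;
    default:  // Any other code acts like a catch-all clause.
        return 99;
    }
}
```

    With this, observed_code(2) yields 2, observed_code(7) falls into the catch-all branch and observed_code(0) yields 1 because longjmp never lets setjmp return 0 a second time. Nothing else is unwound: longjmp just restores the saved registers, so any cleanup (and, in C++, any destructor) that the skipped frames were supposed to run simply does not happen.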

  • Happy new year!

    Dear readers,

    2011 is here so... Happy new year!

    I hope you all are having a nice holiday season and enjoyed the New Year's Eve celebration, should it be something special for you.

    My tentative resolutions for this year related to non-work and non-personal areas would be:

      • First, to revive this blog. Lately I have been posting more frequently than usual and it has been a pleasant task. I would like to recover the habit of blogging several times per week and, for that, I need topics! Keep 'em coming! I already have some topics in the queue but they need a bit of research on my side first.

      • Second, to bring Kyua to reality (i.e. to deprecate ATF). Work is continuing intensively and a preliminary release should be ready during Q1. This release will bring the much-needed replacement for atf-run.

      • And third... well, I haven't thought that much about resolutions ;-) We will see.