Achtung Ubuntu 10.4 users: data loss
April 30, 2010 on 1:38 am | In General | 1 Comment
i’m just so frigging sick to my stomach at the moment because Ubuntu 10.4 ate my home directory, and i wanted to take a moment to warn other Ubuntu upgraders about a data-loss scenario.
Full details are in the bug report:
https://bugs.launchpad.net/ubuntu/+bug/571958
If you’re going to use Ubuntu 10.4, make damned sure your shell scripts do not depend on case-sensitive shell expansion!
(Yes, i have backups, but i’ve got to pull them from 20 different source repos and a 15GB Dropbox, and the whole point of the exercise which revealed the bug was to avoid having to pull down 30GB of data from the net.)
>:-(
A live previewer/editor for Google Code Wiki
April 22, 2010 on 1:54 pm | In General | 1 Comment
And now for something very different…
A couple of days ago i was looking for JavaScript code which can parse the Google Code wiki syntax reasonably well. i stumbled upon (via a link in a bug report) some code by Fabien Ménager which did most of the work. After coming across a few too many corner cases, i sat down to re-implement the parser using a char-by-char scan, rather than regexes, for most of the work.
It’s now about 20 hours after that work started, i haven’t slept in 26 hours, and Fabien and i now have:
- A reasonable parser (only missing a few markup features): http://code.google.com/p/wikiwym/
- An application to demo it: http://wikiwym.googlecode.com/svn/trunk/index.html
- And another application which allows the user to load arbitrary wiki pages from arbitrary Google Code-hosted projects and preview them in raw and HTML modes. That code can then be pasted back into the Google Code wiki editor when editing is finished: http://fossil.wanderinghorse.net/demos/wikiwym/
Basically, that last app provides an editor environment for people who edit wiki pages of arbitrary Google Code projects. The app cannot save the data back to Google Code for you, but it provides a relatively effective interface for editing wiki markup. If you make a mistake while typing in the editor, for example forgetting to close an inlined markup tag (bold, italics, etc.), the preview mode will prominently mark the error so you know where to fix it (and what to fix).
If you’re looking for a wiki parser implementation in JavaScript, this one might just suit you. If you’re a parser guru and would like to help us improve the parser, then please get in touch!
Happy hacking!
—– stephan beal
CLI-like Web Apps with JavaScript
April 2, 2010 on 8:53 pm | In General | 1 Comment
On April 1st xkcd.com added a new web interface which allows one to view the xkcd comics using a Unix-style command-line interface: http://uni.xkcd.com/
Interested in how it was being done, i followed the source code links, downloaded it, and set it up locally. After spending the whole day hacking on it, and sharing several mails with the author (who goes by the moniker Chromakode), i’m really interested in exploring this idea for web GUIs a bit more…
For those interested in truly geeky things, the source code is here:
http://github.com/chromakode/xkcdfools
i’ve tied that into an experimental JavaScript app framework i’ve been tinkering with for the past year:
http://wanderinghorse.net/computing/javascript/jquery/jqapp/console.php
That page demonstrates tying in the CLI interface with JSON-centric RPC. For example, the “ping” command sends off a JSON-structured RPC request to the server, the server dispatches it to the appropriate event handler, and the handler answers by sending back another JSON object describing the result. Such an interface could be used to write some exceedingly geeky web applications.
In any case… not something of general interest, but in case there are a few other hackers who find the Web/CLI mixture interesting… see the above links.
Happy Hacking!
—– stephan
Stupid C Tricks: Portably Passing Function Pointers via (void*)
March 2, 2010 on 5:15 pm | In General | 1 Comment
Hello, fellow C programmers!
(If you don’t program in C (or even know what that means) then the following won’t interest you in the slightest. If you are a C programmer, it might interest you in the slightest.)
It is common for C code to use the venerable void pointer, and (despite what our OO teachings might tell us), the void pointer isn’t the bad guy. Mis-use of void pointers is the bad guy. Because there are lots of ways to abuse a void pointer, void pointers are, at least in higher-level APIs, generally to be avoided. Lower-level APIs often cannot get by without them.
Today’s talk is not about void pointers in general (there’s an awful lot to say about void!), but specifically about using a void pointer to hold a function pointer.
Did you know that passing a function pointer via a void pointer, casting that void pointer back to the original function pointer type, and calling the result is undefined behaviour as far as the C standard is concerned? The standard defines conversions between void pointers and object pointers, but not between void pointers and function pointers.
What does that mean? It means the following code invokes undefined behaviour in any country or province which respects the C Standard:
typedef int (*my_callback)( void * state );
my_callback cb = some_callback_function;
void * ptr = cb;
((my_callback)ptr)( ... ); // <--- undefined behaviour!
The above code is rather contrived, but it demonstrates a feature which is seen from time to time in ancient C code. It is also used in some OS-level functions which open DLLs (e.g. based on dlopen() on POSIX systems). DLL-loading systems often look for a symbol with a specific name within the DLL right after they open it. If they find it, they treat it like a function and call it. Some DLL-loading systems provide a similar feature for closing a DLL. Higher-level “plugin” systems, common in today’s applications, often use this approach to fetch a factory function from the DLL, which is then used to create an instance of the plugin.
If such systems expect their magic symbol(s) to be a function (and in my experience they do!), then they are relying on undefined behaviour. And we abhor undefined behaviour, don’t we?
There is a portable, and only slightly inconvenient, workaround. In short, we add a level of indirection between the function we want to fetch from the DLL and the symbol name the DLL-opener expects to find. That level of indirection is a tiny struct:
typedef int (*my_callback)( void * state );
typedef struct my_callback_struct {
my_callback callback;
} my_callback_struct;
And in our DLLs we use that indirection like so:
/* implements the my_callback() interface. */
static int private_callback( void * state );
/* The symbol our DLL/plugin system will look for: */
const my_callback_struct TheExpectedSymbolName = {private_callback};
In the DLL opening code, we look for “TheExpectedSymbolName”, but instead of casting it to a function pointer, we cast it to a my_callback_struct pointer. From that object, we can invoke its member function without invoking undefined behaviour. Obviously, if the DLL contains a non-my_callback_struct with the expected symbol name, behaviour is undefined, but the function-pointer approach has the same problem.
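A minimal sketch of the loader side, under stated assumptions: fake_symbol_lookup() is a hypothetical stand-in for the platform's symbol lookup (dlsym() on POSIX systems) so that the example fits in one file, and call_plugin() is an illustrative name, not part of any real plugin API. The only point is that the looked-up void pointer is converted to an object pointer (the struct), never directly to a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*my_callback)( void * state );
typedef struct my_callback_struct {
    my_callback callback;
} my_callback_struct;

/* Plugin side: implements the my_callback() interface.
   For demonstration it increments an int counter passed as state. */
static int private_callback( void * state )
{
    int * counter = state;
    return counter ? ++(*counter) : -1;
}

/* The symbol our DLL/plugin system will look for: */
const my_callback_struct TheExpectedSymbolName = { private_callback };

/* Hypothetical stand-in for dlsym(): maps a name to the address of an
   OBJECT (the struct), which may legally travel through a void pointer. */
static void * fake_symbol_lookup( const char * name )
{
    if( 0 == strcmp( name, "TheExpectedSymbolName" ) ) {
        return (void *)&TheExpectedSymbolName;
    }
    return NULL;
}

/* Loader side: void* -> struct pointer -> member function pointer.
   Returns -2 if the symbol is missing or has no callback. */
int call_plugin( const char * symbolName, void * state )
{
    const my_callback_struct * cbs;
    assert( symbolName != NULL );
    cbs = fake_symbol_lookup( symbolName );
    if( !cbs || !cbs->callback ) return -2;
    return cbs->callback( state );
}
```

Calling call_plugin("TheExpectedSymbolName", &someInt) runs the plugin's callback; an unknown symbol name yields -2 without calling anything.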
This approach can easily be expanded to provide open() and close() routines for plugins, or any other application-specific plugin functionality. It can be combined with macros to allow clients to easily implement a plugin by calling the macro and passing pointers to their callback implementations. Since the DLL lookup symbols must be unique, however, it means we have a limitation of one logical plugin per DLL file. (That is the norm in most plugin systems, but this limitation makes it an explicit property of such systems.)
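One way such a macro might look, as a sketch only: my_plugin_struct, MY_PLUGIN_IMPLEMENT, demo_open() and demo_close() are all hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*plugin_open_f)( void * state );
typedef int (*plugin_close_f)( void * state );

/* Hypothetical expanded indirection struct with open/close routines. */
typedef struct my_plugin_struct {
    plugin_open_f open;
    plugin_close_f close;
} my_plugin_struct;

/* Hypothetical helper macro: a plugin author calls it exactly once per
   DLL, passing the two callback implementations. */
#define MY_PLUGIN_IMPLEMENT(OPEN_FUNC, CLOSE_FUNC) \
    const my_plugin_struct TheExpectedSymbolName = { OPEN_FUNC, CLOSE_FUNC }

/* Demo callbacks: state points to an int flag which records whether
   the plugin is currently "open". */
static int demo_open( void * state )
{
    assert( state != NULL );
    *(int *)state = 1;
    return 0;
}

static int demo_close( void * state )
{
    assert( state != NULL );
    *(int *)state = 0;
    return 0;
}

MY_PLUGIN_IMPLEMENT( demo_open, demo_close );
```

Because the macro always defines the same symbol name, using it twice in one DLL is a compile or link error, which is the one-plugin-per-DLL limitation made visible.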
In C++ (as opposed to C), the ability to construct objects and call functions during the DLL’s static initialization phase (i.e. while it’s being opened, before the DLL opener gets it back) allows for an elegant solution to all of the above-mentioned problems and limitations. We won’t go into that here, but for interested readers there is a detailed article about it, called Classloading in C++, over at http://wanderinghorse.net/computing/papers/.
Happy hacking!
—– stephan beal, 2 March 2010
Implementing not-quite-namespaces in C
February 28, 2010 on 11:24 am | In General | 1 Comment
Hi, all!
A few days ago i came across a trick in C. It isn’t new, but i hadn’t seen it used quite this way before and i wanted to share it…
Background: i have several mini-libraries which i tend to copy directly into other projects. It has happened that i then use two other libraries together, and both of them internally use the same underlying mini-library. In some cases i want each to continue to use its own copy (maybe a different/incompatible version), without colliding with the other copy. Here’s one way to do it…
In the lib’s main header, before we declare any of its API:
#if !defined(MY_NS)
#  define MY_NS(X) my_namespace_prefix_ ## X
#endif
We then declare our types and functions using that macro:
int MY_NS(func1)( ... );
struct MY_NS(type1) { ... };
And throughout the implementation and client code we use:
int x = MY_NS(func1)( ... );
Each library which wants to import and rename the API then simply has to redefine MY_NS while building the included mini-library.
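A single-file sketch of the renaming in action. The prefix libfoo_ and the names point and point_sum are made up for this example; in a real build the MY_NS definition would typically come from the embedding library's own header or a compiler flag, before the mini-library's header is seen.

```c
#include <assert.h>
#include <stddef.h>

/* The embedding library picks its own prefix BEFORE the mini-library's
   header is processed. "libfoo_" is a hypothetical prefix. */
#define MY_NS(X) libfoo_ ## X

/* ---- what the mini-library's header would contain ---- */
#if !defined(MY_NS)
#  define MY_NS(X) my_namespace_prefix_ ## X
#endif

struct MY_NS(point) { int x; int y; };
int MY_NS(point_sum)( const struct MY_NS(point) * p );

/* ---- what the mini-library's implementation would contain ---- */
int MY_NS(point_sum)( const struct MY_NS(point) * p )
{
    assert( p != NULL );
    return p->x + p->y;
}
```

After preprocessing, the type is struct libfoo_point and the function is libfoo_point_sum(), so a second copy of the mini-library built with a different prefix can be linked into the same program without symbol collisions.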
The major down-side is that this construct makes reading the docs through tools like doxygen more difficult.
Happy hacking!
Allocating memory in C without dynamic memory
December 1, 2009 on 2:47 am | In General | 1 Comment
Hi, all!
The past year or two i’ve been working on various C-based projects. Through that work i’ve taken a strong interest in conserving memory and reducing the number of calls made to make new memory available (e.g. via malloc()). For example, i’ve spent many hours optimizing the whefs embedded filesystem library to run with fewer than 2kb of dynamic memory for the average use case (and as little as 96 bytes(!!!) for a highly-optimized case).
Over the past few days, that latent fascination with saving calls to malloc() has culminated in a new library:
whalloc (the WanderingHorse.net Allocator) is a small C library which provides an alternative memory management approach. It is initialized with a block of client-supplied memory, typically a stack-allocated char buffer, and then it can slice up that memory and use it for allocations and deallocations. The whole process looks a bit like:
whalloc_bt pool = whalloc_bt_empty;
enum { BufLen = 1024 * 8 };
unsigned char buffer[BufLen];
whalloc_size_t blockSize = sizeof(my_type);
int rc = whalloc_bt_init(&pool, buffer, BufLen, blockSize );
if( whalloc_rc.OK != rc ) { … error … }
my_type * m = whalloc_bt_alloc(&pool, sizeof(my_type));
// ^^^ m now lives somewhere inside of buffer
…
whalloc_bt_free(&pool, m); // makes the memory available for re-use
whalloc_bt_drain(&pool); // “deallocates” all allocated objects at once
The most interesting part is how it stores its memory management information: by taking up a small slice of the memory it is managing. It needs only two bits of storage for each block of memory it manages (there are (memBufferSize/blockSize) blocks in the buffer). The allocator takes up, worst-case (block size of 1 byte), 18-19% of that memory, dropping to 11% for 2-byte blocks, 6% for 4-byte-blocks, and halving for each additional doubling of block size. With a block size of 64 bytes it uses less than 0.5% of the memory for its own purposes. If it’s storing only a small number of blocks (default setting=128) then it can use its own few-byte-long internal cache and must reserve none of the client’s memory for its own purposes.
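To make the two-bits-per-block idea concrete, here is a sketch of a packed 2-bit table. It is not whalloc's actual code: the function names are invented, and what each of the two bits means is left open because the text above does not say. With 64-byte blocks, 2 bits of bookkeeping per 512 bits of data is about 0.4%, which fits the "less than 0.5%" figure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: four 2-bit entries packed into each byte. */
enum { BITS_PER_BLOCK = 2, BLOCKS_PER_BYTE = 8 / BITS_PER_BLOCK };

/* Number of table bytes needed to track blockCount blocks (rounded up). */
size_t table_bytes( size_t blockCount )
{
    return (blockCount + BLOCKS_PER_BYTE - 1) / BLOCKS_PER_BYTE;
}

/* Reads the 2-bit entry (0..3) for the given block. */
unsigned get_block_bits( const unsigned char * table, size_t blockIndex )
{
    size_t shift = (blockIndex % BLOCKS_PER_BYTE) * BITS_PER_BLOCK;
    return (table[blockIndex / BLOCKS_PER_BYTE] >> shift) & 0x3u;
}

/* Overwrites the 2-bit entry for the given block, leaving the other
   three entries in the same byte untouched. */
void set_block_bits( unsigned char * table, size_t blockIndex, unsigned value )
{
    size_t shift = (blockIndex % BLOCKS_PER_BYTE) * BITS_PER_BLOCK;
    unsigned char * byte = &table[blockIndex / BLOCKS_PER_BYTE];
    assert( value <= 3u );
    *byte = (unsigned char)((*byte & ~(0x3u << shift)) | (value << shift));
}
```

A pool of 128 blocks needs table_bytes(128), i.e. 32 bytes, of bookkeeping under this layout.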
For the average use case, where objects are destroyed in reverse of their allocation order, it can perform O(1) if the allocations are equal to or smaller than the defined block size. Its worst-case performance is O(N), with N being a function of the number of blocks being allocated, the total number of blocks managed, and current memory pool fill status. Deallocation is always O(N), with N being the number of blocks being deallocated. For deallocation, finding the underlying management data is always O(1) - a simple hashing operation which uniquely maps any given pointer to its memory block index.
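The constant-time lookup can be pictured as plain offset arithmetic. This is a guess at the general technique, not whalloc's implementation; block_index_of() and NO_BLOCK are hypothetical names, and the sketch assumes the platform provides uintptr_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returned when a pointer does not refer to the start of a pool block. */
#define NO_BLOCK ((size_t)-1)

/* Maps a pointer to the index of the block it starts, in O(1).
   base/poolSize describe the client-supplied buffer; blockSize is the
   allocator's fixed block size. Addresses are compared as integers so
   that pointers from outside the buffer can be rejected safely. */
size_t block_index_of( const unsigned char * base, size_t poolSize,
                       size_t blockSize, const void * ptr )
{
    uintptr_t b = (uintptr_t)base;
    uintptr_t p = (uintptr_t)ptr;
    size_t offset;
    assert( blockSize > 0 );
    if( p < b || (p - b) >= poolSize ) return NO_BLOCK; /* not ours */
    offset = (size_t)(p - b);
    if( offset % blockSize ) return NO_BLOCK; /* not on a block boundary */
    return offset / blockSize;
}
```

The resulting index is what would select a block's entry in the management table, so no searching is needed to find the bookkeeping for a pointer being freed.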
There’s also a variant of the allocator which requires 2 bytes (instead of 2 bits) per managed block. The primary difference is that it stores the requested size of each allocation, whereas the optimized variation simply knows whether a block as a whole has been allocated or not.
The allocators optionally support a client-provided mutex, to lock the allocator, and fallback allocation/deallocation functions, which they can use if their own pool runs out of space (e.g. falling back to malloc() and free()).
For those of you interested in C, in particular in shaving off a few bytes of dynamic memory in your C apps, the code is available (it’s in the Public Domain) over on the whalloc web page:
http://fossil.wanderinghorse.net/repos/whalloc
Happy hacking!
The Chicken/Egg Scenario
August 10, 2009 on 8:29 pm | In General | 3 Comments
The age-old question goes:
Which came first: the chicken or the egg?
And i think the point of it is that nobody really knows. Kind of like the ancient Egyptian saying, “if a tree falls in the dunes of the Sahara desert and no one hears it, did it really emit audible vibrations?” (Though the original hieroglyph looks something like a man with an axe in one hand and the other hand cupped to his ear.)
But i think there might be a solution:
A chicken, by definition, comes from an egg. Not just any egg, mind you, but a chicken egg. Thus it is impossible that the chicken came before the egg. Ergo, what laid the first chicken egg was not a chicken. That creature might indeed have been hatched from an egg, but because it cannot have been a chicken egg (as we just established), the first chicken egg still came before the first chicken.
Elementary, dear Watson!
whefs: an embedded filesystem library for C
June 18, 2009 on 1:15 pm | In General | 1 Comment
Hi, all!
Late last year i started working on an embedded filesystem library for C. The library uses “container files” (called embedded filesystems, or EFSes), in which the client can store “files” (i call them “pseudofiles”). The API allows random read/write access to them using a flexible i/o device interface. If you’ve ever programmed with sqlite: whefs is similar in concept but creates an embedded/virtual filesystem instead of an embedded/virtual SQL server.
While whefs is still beta and open for lots of experimentation, it seems to be in a usable state.
To give whefs a more public home, a couple days ago i moved whefs over from its original source repository to Google Code:
http://code.google.com/p/whefs/
The project is of course open to collaboration, and i invite any interested C hackers out there to get in touch.
Happy hacking!