On this day in 2004, i made a post with the subject line “3… 2… 1.0!” to announce the release of libs11n, which had been in heavy development for almost 18 months at that time. A year later, 1.2.0 was released. Since then, libs11n’s 1.2.x branch has been fairly stable and has undergone only relatively minor updates and a few bug fixes.
This year i don’t have a new libs11n release to make :`(. That library is stable and useful, and the improvements i’d like to make would require some significant rearchitecting in areas where (A) i’m simply not the right man for the job (e.g. i18n/wide character support) and (B) i have little interest and therefore correspondingly little drive to work on them.
But that doesn’t mean i haven’t been coding. In hindsight, 2008 has been a record year for me in terms of number of lines of code produced, on par with 2004 (when i would sometimes spend 60+ hours per week hacking on s11n - one particular week saw about 100 hours of hacking and less than 20 hours of sleep). Quite unlike 2004, where i expended 100% of my energy on libs11n (and its supporting code/subprojects), 2008 saw the birth of several new pet projects.
So, though i don’t have another 1.0 (or 1.1, or 1.5) release to present at the end of this year, i thought i’d post a bit about what software i’ve been working on this year. This post is not to brag, but does serve a few purposes:
- i won’t deny it: publicity for new personal projects which might be of interest to some other hackers out there.
- i’m trying to justify to myself why i spent so little time with my family (Simone and Baako) this year. Likewise, my parents and friends probably wonder why i’m so uncommunicative.
- Help me internally sum up what’s just passed, to help me sort out what comes next.
- We all started coding somewhere, and we all draw inspiration from different places. It is my hope that some young hacker out there may be inspired to explore his love of computing. With some effort on his part he may someday surpass us all in ability.
- And lastly, to set up the scene for my “coming out” at the end of this post.
So, here it goes…
(In case you’re more interested in a quick summary, just read the text marked in bold.)
The first half of the year was quite slow, in terms of coding. i had just moved back to Munich and was settling in. On this very night one year ago we didn’t yet have a flat in Munich, and were stuck in a hotel room so small you could practically step into the bed from the doorway.
The code i do remember working on includes what i now call whprintf, a custom printf() implementation which i took from the sqlite3 source tree and refactored so that it can send its output to an arbitrary destination (e.g. a GUI widget, stdout, a socket, or a memory buffer) via a callback mechanism. Aside from that, i mainly experimented with utility classes for C, such as managed memory buffers and hashtables (i couldn’t code my own hashtable from scratch, but i hacked quite a lot on one written by Christopher Clark).
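The callback idea can be sketched in a few lines of C. This is not whprintf's code: whprintf is described above as a refactoring of the sqlite3 formatter, whereas this sketch simply wraps the C99 `vsnprintf()` and passes the result to a callback. The names `out_sink_f`, `sink_printf`, `mem_sink`, `mem_sink_write`, and `file_sink_write` are made up for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: receives one chunk of formatted output.
   Returns 0 on success, non-zero on error. */
typedef int (*out_sink_f)(void *state, const char *data, size_t n);

/* Formats like printf() but hands the result to sink() instead of stdout.
   Returns the number of bytes sent, or -1 on error. */
int sink_printf(out_sink_f sink, void *state, const char *fmt, ...)
{
    va_list args, copy;
    int need;
    char *buf;

    assert(sink && fmt);
    va_start(args, fmt);
    va_copy(copy, args);
    need = vsnprintf(NULL, 0, fmt, copy); /* first pass: measure only */
    va_end(copy);
    if (need < 0) { va_end(args); return -1; }
    buf = malloc((size_t)need + 1);
    if (!buf) { va_end(args); return -1; }
    vsnprintf(buf, (size_t)need + 1, fmt, args); /* second pass: format */
    va_end(args);
    if (sink(state, buf, (size_t)need) != 0) need = -1;
    free(buf);
    return need;
}

/* Example sink: appends to a fixed-size, NUL-terminated memory buffer. */
typedef struct { char data[64]; size_t used; } mem_sink;

int mem_sink_write(void *state, const char *data, size_t n)
{
    mem_sink *m = state;
    if (n >= sizeof m->data - m->used) return 1; /* no room left */
    memcpy(m->data + m->used, data, n);
    m->used += n;
    m->data[m->used] = '\0';
    return 0;
}

/* Example sink: forwards to any FILE handle, e.g. stdout. */
int file_sink_write(void *state, const char *data, size_t n)
{
    return fwrite(data, 1, n, (FILE *)state) == n ? 0 : 1;
}
```

The formatter never needs to know where the bytes end up: `sink_printf(file_sink_write, stdout, "%d\n", 42)` prints to the console, while passing `mem_sink_write` and a `mem_sink` collects the same output in memory. A GUI widget or socket would just be another callback.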
In April or May of 2008 i came across PEGTL, a C++0x library for creating PEG parsers using C++ templates. PEGTL inspired me tremendously, and PEGTL’s author (Dr. Colin Hirsch) and i exchanged over 100 emails on the topic inside of a month or so. Unsatisfied with some of his design decisions, i of course took it upon myself to take a crack at the problem. My first attempt, named parse0x, was also a C++0x library. It did almost everything PEGTL did and i was happy with it. i was, however, unhappy that i couldn’t use parse0x in real projects because it requires C++0x support, which is still far from leaving beta status in the next generation of C++ compilers.
And all was good. Two projects behind me and the year not quite half over, i got a sudden urge (for reasons i don’t remember), to rewrite an old program of mine for playing boardgames on the PC (tactical/strategy games are a hobby of mine, though i haven’t actually played any in some years). So i spent much of June getting back into Qt by writing QBoard. As is usual with Qt apps, QBoard grew way beyond the minimalistic app i wanted to write (Qt just makes it easy to keep adding features), and within a month or two it was doing 90% of everything i would probably ever want it to do, and QBoard now sits quietly, awaiting the next urge to hack on it. If you use (or are aware of) libs11n, it might interest you to know that libs11n was originally written to support the rewrite which would become QBoard. That is, QBoard is largely the reason libs11n ever came into existence.
The last time i did any significant work on QBoard was September, after which i was again enchanted by the idea of PEG parser generators…
At the end of 2007 i got involved on the fringes of the Fossil project, where i contribute patches now and then. Fossil re-awakened my interest in C (which i used heavily in 1992-1995, but not since discovering higher-level languages), and since early 2008 i have spent a significant amount of time banging out little C libraries, both to get back into practice and to build up components for other planned projects. That re-emergence of interest in C, combined with my fascination with the concept of PEG parsers, led me to try tackling the PEG problem again, but from a much different direction than before.
And thus pegc was born. pegc was to be my third PEG parsing library in 2008, but this time it was implemented in C (and in fact has turned out to be somewhat more interesting than the C++ variants). As far as i have been able to determine, pegc is the only C library of its kind (there are some C code generators for PEGs, but no C libraries). After getting pegc to a “90% there” point, i put it down for a while to put some more thought into a few of the internals, and didn’t hack on it for a couple months. Unaware that anyone else knew about pegc (other than google, of course, but i thought he could keep a secret), another coder surprised me by sending me an email in which he explained that he had implemented a LISP-based PEG generator using pegc as the back-end. That has (yet again) re-awakened my interest in PEGs, and there is certainly more work to be done in this area in 2009.
The year was a couple months short of ending and i had accomplished much coding and solved some problems which interested me. But, as i would later find out, the year was far from over. My two greatest challenges were to be found hiding in the bushes up ahead…
Encouraged by pegc’s development, i decided to expend some effort on a problem which i had originally discounted as “too much trouble to be worth the effort” - the generic serialization of objects in the C programming language. (Ouch!) So in late October i sat down to hack. Within a few days c11n was born. While c11n cannot reach the ease of use levels of C++ serialization libraries (because C is not “dynamically expressive” enough to do so), it does work and wasn’t nearly as difficult to implement as i had initially anticipated (it was simply a matter of finding (err… stumbling across) a useful model).
It was sometime in early December, the year almost over, when i got the itch to work on yet more C code.
There’s a problem i’ve contemplated for years but never really knew where to start - a virtual/embedded filesystem. Google reveals little non-commercial activity in this area, so there aren’t many decent starting points to study. This type of problem is C’s bread and butter, and having lost much of my previous distaste for C, i took what i’d learned over the previous year and tried to apply it to what was (in my mind) my most challenging C program yet. Truth be told, i was largely anticipating a crash-and-burn coding session, at the end of which i would be so frustrated as to leave C forever.
After a day or two of hacking i had the basic filesystem generator in place, but wasn’t happy with the i/o model (based on the C-standard (FILE*) API). So i scraped out the i/o layer API from c11n (which i had grown quite happy with), extended it to support random-access devices, and forked that to create the whio i/o library. The primary reason for this step was so that the VFS could use arbitrary back-end storage (provided it’s capable of random-access), and to that end i added implementations for treating standard file handles and in-memory buffers as i/o devices.
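For readers who have not seen this kind of layering in C, here is a rough sketch of what a random-access device abstraction can look like. It is emphatically not the whio API: the type `io_dev`, its members, and the helpers `mem_dev_init`, `file_dev_init`, and `store_tag` are all invented for this example, which only assumes that a device can read, write, and seek.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical device interface (NOT the actual whio API). */
typedef struct io_dev io_dev;
struct io_dev {
    size_t (*read)(io_dev *self, void *dest, size_t n);
    size_t (*write)(io_dev *self, const void *src, size_t n);
    int    (*seek)(io_dev *self, long pos); /* absolute; 0 on success */
    void   *impl;                           /* back-end private state */
};

/* Back-end 1: a fixed-size in-memory buffer. */
typedef struct { unsigned char *mem; size_t size; size_t pos; } mem_state;

static size_t mem_read(io_dev *self, void *dest, size_t n)
{
    mem_state *m = self->impl;
    size_t avail = m->size - m->pos;
    if (n > avail) n = avail;          /* short read at end of buffer */
    memcpy(dest, m->mem + m->pos, n);
    m->pos += n;
    return n;
}

static size_t mem_write(io_dev *self, const void *src, size_t n)
{
    mem_state *m = self->impl;
    size_t avail = m->size - m->pos;
    if (n > avail) n = avail;          /* short write: buffer never grows */
    memcpy(m->mem + m->pos, src, n);
    m->pos += n;
    return n;
}

static int mem_seek(io_dev *self, long pos)
{
    mem_state *m = self->impl;
    if (pos < 0 || (size_t)pos > m->size) return -1;
    m->pos = (size_t)pos;
    return 0;
}

io_dev mem_dev_init(mem_state *st, void *mem, size_t size)
{
    io_dev d = { mem_read, mem_write, mem_seek, NULL };
    assert(st && mem);
    st->mem = mem; st->size = size; st->pos = 0;
    d.impl = st;
    return d;
}

/* Back-end 2: a standard FILE handle. */
static size_t file_read(io_dev *self, void *dest, size_t n)
{ return fread(dest, 1, n, (FILE *)self->impl); }
static size_t file_write(io_dev *self, const void *src, size_t n)
{ return fwrite(src, 1, n, (FILE *)self->impl); }
static int file_seek(io_dev *self, long pos)
{ return fseek((FILE *)self->impl, pos, SEEK_SET); }

io_dev file_dev_init(FILE *f)
{
    io_dev d = { file_read, file_write, file_seek, NULL };
    assert(f);
    d.impl = f;
    return d;
}

/* Client code sees only io_dev: store a 4-byte tag at a given offset. */
int store_tag(io_dev *d, long offset, const char tag[4])
{
    if (d->seek(d, offset) != 0) return -1;
    return d->write(d, tag, 4) == 4 ? 0 : -1;
}
```

Code like `store_tag()` works unchanged whether the device wraps a file on disk or a block of memory, which is the property a filesystem layer needs in order to sit on top of arbitrary storage.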
With whio in place i reimplemented the embedded filesystem (now called whefs) around it, and within a week or ten days i had gotten the rest of the significant bits in place. As of a few days ago i’ve got a working embedded filesystem library, which is like my little Christmas present to myself (just a few days late). Now i’ve just got to find a use case for it. (That said, googling has revealed very little open source code in this area, so there is potential for whefs to become a useful niche market product.)
And now i’m tired and have sworn not to program a single line of code for the rest of 2008. All 4 hours and 39 minutes of it. I might write some documentation, but i’ll (somehow) avoid the temptation to code. i think i can do it. Wish me luck.
It wasn’t my intention to turn 2008 into a running hackathon, nor to set a personal record, but that’s essentially how it turned out. Truth be told, little of this is code which i will use on a regular basis. Nonetheless, i immensely enjoyed hacking on these projects, and some of them will certainly see continued development for some time (namely c11n, pegc, and whefs).
According to David Wheeler’s SLOCCount, it would have cost a company around half a million dollars to get it all written and out the door, whereas a garden variety hacker can do it all from the comfort of his living room using nothing more than freely-available tools like XEmacs, GNU Make, gcc, and google. Though he also isn’t likely to get paid for it.
And speaking of hackers…
(Now for the “coming out” announcement…)
Per long-standing traditions, programmers are never to call themselves hackers until another hacker calls them a hacker. This is fair and respectful, and in deference to this ideal i have always been careful about who i publicly dub to be a “hacker.” i have in fact been called a hacker by other hackers, but i don’t normally proclaim myself to be a hacker. Part of the reason is the misinformed public opinion that a hacker is one who breaks into computer systems (something i’ve never had an interest in and certainly never done), whereas we (that is, anyone who would read so far into this blog post!) all know that a hacker is someone who not only loves working on software, but is also particularly good at it. Another reason i’ve avoided using the word in reference to myself has been because i have not always felt that i am quite qualified to wear the title. i won’t claim to be a guru in any given area of computer science (and i’m certainly a zero in many areas!), but i can confidently say that i am a fairly good general-purpose programmer.
But i’m also now convinced that i am indeed a hacker. The past year i somehow managed to implement three programs in particular (c11n, pegc, and whefs) which i would have thought impossible (for me) half a decade ago. That might be reason enough to be dubbed a hacker. The more compelling reason, however, has nothing to do with lines of code or architecture or the number of functions in one’s API. Simone has sometimes asked me, “how can you be so tired from your six hours at work, and then sit here for 12 hours programming?” The answer took me some time to find (as i had never given it any thought before), but is in principle simple. Everything we do requires an expenditure of energy. Working as a Unix system administrator (my current job) takes a lot of energy. It sucks me dry at times. Programming, on the other hand, not only takes relatively little energy (per unit of time), but often literally gives more energy than it takes. That has convinced me that i may unashamedly use the title Hacker (though i’ll use a small “h”, to keep it in proportion ;).
So, there you have it. i’ve just written the world’s longest “i’m a hacker” post.
PS: i’ve still got 3 hours and 14 minutes before i may hack again.
Update: 1 hour and 47 minutes
i’m no lisp programmer. Though i’ve used some variant of emacs as my primary editor since 1996, i’ve never bothered to learn lisp (the lingua franca of emacs). But today i got a bit more interested in it…
First we need to briefly introduce pegc, a PEG (Parsing Expression Grammar) parser generator library for C (it’s not a code generator). A few months ago i spent some weeks working on pegc, mainly just to see how far i could stretch/push/mold that model using only the C language (it’s been demonstrated many times over in C++ and other high-level languages, but AFAIK pegc is the first C library of its kind). It was an interesting problem with interesting implications and future uses, but not something i really needed “right now.” So, in short, pegc is a beta piece of software which i wrote but which i’ve never actually used (only in pegc test code!), so i haven’t gotten around to figuring out what needs to be refined/changed. So it surprised me when i got an email about pegc today.
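To illustrate the difference between a parser library and a code generator, here is a minimal sketch of the library approach in C: grammar rules are ordinary data structures, composed in code and interpreted by a matcher function. None of this is pegc's actual API. The names `rule`, `rule_kind`, `peg_match`, and `demo_parses` are hypothetical, and the sketch supports only literals, sequence, ordered choice, and optional rules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule representation (NOT pegc's actual API). */
typedef enum { R_STR, R_SEQ, R_OR, R_OPT } rule_kind;

typedef struct rule rule;
struct rule {
    rule_kind kind;
    const char *text;   /* R_STR: the literal to match */
    const rule *a, *b;  /* operands; b is unused for R_OPT */
};

/* Tries to match r at the start of in. Returns a pointer just past
   the matched text, or NULL if the rule does not match. */
const char *peg_match(const rule *r, const char *in)
{
    const char *p;
    assert(r && in);
    switch (r->kind) {
    case R_STR: {
        size_t n = strlen(r->text);
        return strncmp(in, r->text, n) == 0 ? in + n : NULL;
    }
    case R_SEQ:  /* a then b */
        p = peg_match(r->a, in);
        return p ? peg_match(r->b, p) : NULL;
    case R_OR:   /* ordered choice: the first alternative to match wins */
        p = peg_match(r->a, in);
        return p ? p : peg_match(r->b, in);
    case R_OPT:  /* always succeeds, consuming input only if a matches */
        p = peg_match(r->a, in);
        return p ? p : in;
    }
    return NULL;
}

/* Demo grammar: ("the " / "a ")? ("cat" / "dog") " sat"
   Returns 1 if the whole input matches, else 0. */
int demo_parses(const char *input)
{
    static const rule the      = { R_STR, "the ", NULL, NULL };
    static const rule a_       = { R_STR, "a ",   NULL, NULL };
    static const rule article  = { R_OR,  NULL, &the, &a_ };
    static const rule opt_art  = { R_OPT, NULL, &article, NULL };
    static const rule cat      = { R_STR, "cat",  NULL, NULL };
    static const rule dog      = { R_STR, "dog",  NULL, NULL };
    static const rule subject  = { R_OR,  NULL, &cat, &dog };
    static const rule sat      = { R_STR, " sat", NULL, NULL };
    static const rule head     = { R_SEQ, NULL, &opt_art, &subject };
    static const rule sentence = { R_SEQ, NULL, &head, &sat };
    const char *end = peg_match(&sentence, input);
    return end != NULL && *end == '\0';
}
```

Because the grammar is plain data built at runtime, nothing has to be generated or compiled ahead of time, and in principle another language could assemble the same structures through a foreign function interface.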
With that in mind…
A couple hours ago i got a mail from Zajcev Evgeny (a.k.a. “lg”), a member of the SXEmacs project, where he revealed something pretty impressive. As part of the SXEmacs project he’s integrated pegc with lisp, such that people can write parser generators in lisp, and those parser generators will use pegc for the back end. (His intention is to use it to implement a syntax highlighter.) As an example he says:
yeah, i’ve done this on Emacs Lisp level, you can specify either raw
ruleset like this:
(setq pp (pegc-intern-ruleset
  '((sentence (and (opt article) subject verb (and (opt preposition) (opt article) (or object (and object preposition)))))
    (article (and spaces (or (str "the") (str "the two") (str "a"))))
    (preposition (and spaces (or (str "on") (str "at") (str "to") (str "with"))))
    (subject (and spaces (or (str "man") (str "men") (str "dog") (str "dogs") (str "cat"))))
    (verb (and spaces (or (str "sat") (str "saw") (str "shot") (str "gave"))))
    (object (and spaces (or (str "cannon") (str "hat") (str "mat"))))
    )))
and then use it like:
(pegc-parse (pegc-create-parser "the cat sat on the mat") pp)
which assumes first rule in ruleset as start symbol
Or otherwise you can use user friendly expression like this:
(setq pp #<peg
  sentence - article? subject verb (preposition? article? (object / object preposition object))
  article - spaces ("the" / "the two" / "a")
  preposition - spaces ("on" / "at" / "to" / "with")
  subject - spaces ("man" / "men" / "dog" / "dogs" / "cat")
  verb - spaces ("sat" / "saw" / "shot" / "gave")
  object - spaces ("cannon" / "hat" / "mat")>
which produces the same interned ruleset as previous raw
Pretty cool, in my humble opinion. i was extremely impressed with what he’d accomplished, but also extremely surprised that pegc could do the things he’s doing. The exchange went like this:
Stephan Beal wrote:
> Holy cow! You got all that working with the existing pegc code??? And
> it works!?!?
yes, actually pegc is quite solid and extensible. I did not modified
a bit in pegc code to implement this. I just created two levels of
abstractions - 1) low level - is direct FFI to pegc code and 2) high
level - is written to omit low level details and just do what user
Just what a software developer wants to hear! :-D
When i started pegc, i never envisioned the possibility of using it as the back-end for parsers in any language other than C. Now that the proof of concept is out there, the possibilities would seem to be pretty limitless. We could theoretically use SWIG to generate bindings for just about any scripting language.
Just what i needed - yet another interesting project to divide my time amongst!
Back to hacking!
—– stephan beal, 23 Dec 2008
i’ve never really done anything like this before, but let’s give it a try…
2008 is almost at its end and there’s one product/technology which has been so exceedingly helpful to me the past six months, that i thought i’d spend some time evangelizing it and giving it the completely unofficial title “My Technology Pick of the Year 2008”.
Dropbox (http://www.getdropbox.com) has revolutionized the way i do backups. It’s not just useful for backups, but for transparently keeping files synchronized across several machines.
Dropbox provides free access to up to 2GB of storage. For a nominal fee, 50GB of storage can be had (curiously, there are currently no packages between 2GB and 50GB, nor more than 50GB).
For backups, i used to reserve a second hard drive, run a makefile once a month or so, and create huge tarballs of selected directories. For the files i wanted to keep synced across machines i’d use subversion. Subversion is all fine and good, but (a) i don’t like using it for large groups of binary files, (b) i hate having to manually sync the trees, do commits, etc., and (c) it’s dog slow for large repositories.
In comes Dropbox. Dropbox is a daemon process which runs on your PC (Windows, Linux, or Mac). During the installation you choose a specific directory (a.k.a. your “dropbox”), and everything under that directory is automatically/transparently synchronized with your dropbox account. When you change a file in your dropbox, it will be transparently synced at some point.
You can currently link up to five computers to one account, so that you can synchronize your files across up to five machines. By “you can synchronize”, i mean “you copy a file into your dropbox on machine A and it will magically appear on machines B, C, and D.”
One of the most useful aspects for me has been sending myself files over the web interface. For example, i’m at a buddy’s house and he has a file for me. From his PC i can upload it to my dropbox via the web interface. When i get home, the file is literally already on my hard drive (in my dropbox folder). Convenience at its finest.
Dropbox also has features which i haven’t personally used but are obviously useful in certain cases, such as sharing a dropbox folder with multiple dropbox users, and sharing a photo album (i personally use Google Picasa for that).
The only annoying thing about dropbox is that the developers haven’t yet documented their data protocol (they claim they will eventually get around to documenting it), so i haven’t been able to start writing a CLI client for dropbox. They do distribute a plugin for Nautilus (the GNOME file manager) which adds dropbox support (e.g. it can show you which files are synched and which are not), so in theory we have the information we need to write our own apps, but separating the file manager code from the dropbox code is tedious. Once they open up their protocol specs, i expect we’ll see a flood of useful dropbox add-ons, up to and including a dropbox filesystem driver (to allow one to mount a remote dropbox as a filesystem).
If you haven’t tried Dropbox, try it out. i fell in love with it inside of 27 seconds, and there hasn’t been a day since then on which i haven’t been thankful for dropbox.
—– stephan beal, 23 Dec 2008
Ever since the middle of 1998, i’ve been a KDE user (it was right before the 1.0 release, if i recall correctly). KDE was, at the time, light years ahead of other Unix window managers, and it still is to this day. Or was, until KDE 4.0 came out.
i was recently (due to a hard drive problem) prompted to upgrade from Kubuntu 8.04 to 8.10. Part of that change is the switch from KDE 3.x (which is essentially Desktop Nirvana for me) to KDE 4.x. Man, am i disappointed.
KDE 4… here’s what i have to say about it:
- It’s frigging slow. Even with the proper graphics drivers and everything set up properly, every single thing about it is dog slow. My PC isn’t slow. KDE4 is slow.
- It’s got fewer customization options than even GNOME has. That is, next to none.
- The desktop is now unusable. i can only add new “plasmoids” to it, and i don’t frigging need to have my weather forecast on the desktop. The new folder views are a half-ass attempt at approximating a desktop-within-a-desktop, but it fails miserably, IMO. i want folder and application icons on my desktop. i like icons. KDE won’t let me do that simple thing.
- It took me 5 minutes to figure out how the new GUI (Graphical Unusable Interface) for configuring the panel is supposed to work. It’s extremely non-intuitive and only marginally logical.
- Try dragging an application to the panel (to create a quick-launch button for it). All fine and good. Now try changing that app’s icon from the default “gear” icon to something more useful. This most utterly basic feature is missing - it is impossible to visually differentiate quick-launch apps added to the panel this way (and i haven’t yet seen another way to add apps to the panel).
- They went all out on look and feel and put an inverse amount of effort into usability and functionality. i’m sure some KDE 4 applications are (or will be) pretty cool, but KDE 4 as a desktop environment cannot do anything useful except sit there and look good. Try to use it, and it will fight you.
- The Konsole app (where i spend about 20% of my computing time) no longer saves its sessions, meaning i have to re-setup some 6 or 8 terminal sessions when i log in.
- To paraphrase Terry Pratchett: whoever developed the UI concept for KDE4 had apparently heard of the desktop idiom before, but had never actually used it. (While some do not like it, i do like the conventional desktop idiom.)
- KDE4 should still be labeled as Beta. It should not be the default desktop for any distribution just yet.
After fighting with KDE4 for a few days, i have given up on it. Kubuntu 8.10 doesn’t have official repos for KDE3, so i did the unthinkable and switched to GNOME. GNOME is not a bad desktop, really, but for power-users it needs about 3x the number of customization options. i’ve never used GNOME on a regular basis because (A) the lack of customization options is insulting to me and (B) i have some deep philosophical differences with some of GNOME’s underlying architecture.
Nonetheless, given the useless state of KDE4, and the impending obsolescence of KDE3, it would seem to be time to move on to other (though not necessarily greener) pastures…
[i originally wrote this post on 27 November 2008, but didn't post it because i try to avoid posting flame-bait or "bad mood" posts. Two weeks after writing it, however, i've decided i still agree with all of the above points, and i am quite certain that i wasn't just in a bad mood when i wrote it.]
Update: 26 January 2009:
According to http://www.linuxworld.com/news/2009/012209-open-source-identity-linux-founder.html?page=6, Linus Torvalds has also dropped KDE for GNOME. i unfortunately can’t quote it here because the fuckers use an image for the article text, as opposed to something one can copy/paste. Linus does say, however, “… the whole ‘break everything’ model is painful for users and they can choose to use something else… I suspect that I’m not the only person they lost.”