Saturday, 4 July, 2009 - 1:46pm (Sydney Australia)

Thoughts

The webmaster's thoughts on a variety of topics. In accordance with the spirit of this site, all thoughts are open for discussion, so feel free to express your comments on them. While some thoughts are more informational in nature, others are just pure opinion.

Selecting one of the browsing options is the easist way to navigate through this section, as you can choose thoughts in the category of your choice. Alternatively, a chronological listing is given below.

Hook soup

Of late, I seem to keep stumbling upon Drupal hooks that I've never heard of before. For example, I was just reading a blog post about what you can't modify in a _preprocess() function, when I saw mention of hook_theme_registry_alter(). What a mouthful. I ain't seen that one 'til now. Is it just me, or are new hooks popping up every second day in Drupal land? This got me wondering: exactly how many hooks are there in Drupal core right now? And by how much has this number changed over the past few Drupal versions? Since this information is conveniently available in the function lists on api.drupal.org, I decided to find out for myself. I counted the number of documented hook_foo() functions for Drupal core versions 4.7, 5, 6 and 7 (HEAD), and this is what I came up with (in pretty graph form):

Drupal hooks by core version

And those numbers again (in plain text form):

  • Drupal 4.7: 41
  • Drupal 5: 53
  • Drupal 6: 72
  • Drupal 7: 183

Aaaagggghhhh!!! Talk about an explosion — what we've got on our hands is nothing less than hook soup. The rate of growth of Drupal hooks is out of control. And that's not counting themable functions (and templates) and template preprocessor functions, which are the other "magically called" functions whose mechanics developers need to understand. And as for hooks defined by contrib modules — even were we only counting the "big players", such as Views — well, let's not even go there; it's really too massive to contemplate.

Filed in:

Installing the uploadprogress PECL extension on Leopard

The uploadprogress PECL extension is a PHP add-on that allows cool AJAX uploading like never before. Version 3 of Drupal's FileField module is designed to work best with uploadprogress enabled. As such, I found myself installing a PECL extension for the first time. No doubt, many other Drupal developers will soon be finding themselves in the same boat.

Unfortunately, for those of us on Mac OS X 10.5 (Leopard), installing uploadprogress ain't all smooth sailing. The problem is that the extension must be compiled from source in order to be installed; and on Leopard machines, which all run on a 64-bit processor, it must be compiled as a 64-bit binary. However, the gods of Mac (in their infinite wisdom) decided to include with Leopard (after Xcode is installed) a C compiler that still behaves in the old-school way, and that by default does its compilation in 32-bit mode. This is a right pain in the a$$, and if you're unfamiliar with the consequences of it, you'll likely see a message like this coming up in your Apache error log when you try to install uploadprogress and restart your server:

PHP Warning: PHP Startup: Unable to load dynamic library '/usr/local/php5/lib/php/extensions/no-debug-non-zts-20060613/uploadprogress.so' - (null) in Unknown on line 0

Hmmm… (null) in Unknown on line 0. WTF is that supposed to mean? (You ask). Well, it means that the extension was compiled for the wrong environment; and when Leopard tries to execute it, a low-level error called a segmentation fault occurs. In short, it means that your binary is $#%&ed.

But fear not, Leopard PHP developers! Read on for some instructions for how to install uploadprogress by compiling it as a 64-bit binary.

Filed in:

Self-referencing symlinks can hang your IDE

One of my current Drupal projects has been giving me a headache lately, due to a small but very annoying problem. My PHP development tools of choice, at the moment, are Eclipse PDT and TextMate. Both of these generally work great for me. I prefer TextMate if I have the choice (better config options + much more usable), but I switch to Eclipse whenever I need a good debugger (or a bit of contextual help / autocomplete). However, they haven't been working well for me in this case. Every time I try to load in the source code for this one particular project, the IDE either hangs indefinitely (in Eclipse), or it slows down to a crawl (in TextMate). I've been tearing my hair out, trying to work out the cause of this problem, which has forced me to edit individual files for several weeks, and which has meant that I can't have a debugger or an IDE workspace for this project. Finally, I've nailed it: self-referencing symlinks are the culprit.

The project is a Drupal multisite setup, and like most multisite setups, it uses a bunch of symlinks in order for multiple subdomains to share a single codebase. For each subdomain, I create a symlink that points to the directory in which it resides; in effect, each symlink points to itself. When Apache comes along, it treats a symlink as the "directory" for a subdomain, and it follows it. By the time Drupal is invoked, we're in the root of the Drupal codebase shared by all the subdomains. Everything works great. All our favourite friends throw a party. Champagne bottles pop.

The bash command to create the symlinks is pretty simple — for each symlink, it looks something like this:

ln -s . subdomain

Unfortunately, a symlink like this does not play well with certain IDEs that try to walk your filesystem. When they hit such a symlink, they get stuck infinitely recursing (or at least, they keep recursing for a long time before they give up). The solution? Simple: delete such symlinks from your development environment. If this is what's been dragging your system down, then removing them will instantly cure all your woes. For each symlink, deleting it is as simple as:

rm subdomain

(Don't worry, deleting a symlink doesn't also delete the thing that it's pointing at).

Filed in:

Deployment and migration: hot at DrupalCon DC

There was no shortage of kick-a$$ sessions at the recent DrupalCon DC. The ones that really did it for me, however, were those that dealt with the thorny topic of deployment and migration. This is something that I've been thinking about for quite a long time, and it's great to see that a lot of other Drupal people have been doing likewise.

The thorniness of the topic is not unique to Drupal. It's a tough issue for any system that stores a lot of data in a relational database. Deploying files is easy: because files can be managed by any number of modern VCSes, it's a snap to version, to compare, to merge and to deploy them. But none of this is easily available when dealing with databases. The deployment problem is similar for all of the popular open source CMSes. There are also solutions available for many systems, but they tend to vary widely in their approach and in their effectiveness. In Drupal's case, the problem is exacerbated by the fact that a range of different types of data are stored together in the database (e.g. content, users, config settings, logs). What's more, different use cases call for different strategies regarding what to stage, and what to "edit live".

Read on to find out about:

  • Context, Spaces and Exportables
  • Deploy module
  • Other solutions presented
  • How we've come a long way
Filed in:

First DrupalCamp Australia was a success

A few weeks ago (on Sat 18th Oct 2008), we (a.k.a. the Sydney Drupal Users' Group) held the first ever DrupalCamp Australia. Sorry for the late blog post — but hey, better late than never. This was Sydney's second full-day Drupal event, and as with the first one (back in May), it was held at the University of Sydney (many thanks to Jim Woulfe from the Faculty of Pharmacy, for providing the venue). This was Sydney's biggest Drupal event to date: we had an incredible turnout of 50 people (that's right — we were booked out), and for part of the day we had two presentation tracks running in adjacent rooms.

DrupalCamp Australia logoDrupalCamp Australia in lovely Sydney.

Morning welcome to DrupalCamp AustraliaMorning welcome to DrupalCamp Australia.

Geeks in a roomSydney Drupal geeks gather for some... errr... free breakfast?

Filed in:

Listing every possible combination of a set

The problem is simple. Say you have a set of 12 elements. You want to find and to list every possible unique combination of those elements, irrespective of the ordering within each combination. The number of elements making up each combination can range between 1 and 12. Thanks to the demands of some university work, I've written a script that does just this (written in PHP). Whack it on your web server (or command-line), give it a spin, hack away at it, and use it to your heart's content.

Filed in:

A count of Unicode characters grouped by script

We all know what Unicode is (if you don't, then read all about it and come back later). We all know that it's big. Hey, of course it's big: its aim is to allow for the representation of characters from every major language script in the world. That's gotta be a lot of characters, right? It's reasonably easy to find out how many unicode characters there are in total: e.g. the Wikipedia page (linked above) states that: "As of Unicode 5.1 there are 100,507 graphic [assigned] characters." I got a bit curious today, and — to my disappointment — after some searching, I was unable to find a nice summary of how many characters there are in each script that Unicode supports. And thus it is that I present to you my count of all assigned Unicode characters (as of v5.1), grouped by script and by category.

Filed in:

Randfixedsum ports in C# and PHP

For a recent programming assignment that I was given at university, I was required to do some random number generation. I decided to write my program in such a way that it needed a set of random numbers (with a fixed set size), each of which had to be within a fixed range, and all of which had to add up to a fixed total. In other words, what I needed was a function that let me say: "give me 50 random numbers, and make sure that each of those numbers is between 1 and 20, and also make sure that the total of all those numbers is 200... and remember, despite all that, they have to be random!" Only problem? Finding a function that returns such data is extremely difficult.

Fortunately, I stumbled across the ingenious randfixedsum, by Roger Stafford. Randfixedsum — as its name suggests — does exactly what I was looking for. The only thing that was stopping me from using it, is that it's written in Matlab. And I needed it in C# (per the requirements of my programming assignment). And that, my friends, is the story of why I decided to port it! This was the first time I've ever used Matlab (actually, I used Octave, a free alternative), and it's pretty different to anything else I've ever programmed with. So I hope I've done a decent job of porting it, but let me know if I've made any major mistakes. I also ported the function over to PHP, as that's my language of choice these days. Download, tinker, and enjoy.

Filed in:

Legislation and programming: two peas in a pod

The language of law and the language of computers hardly seem like the most obvious of best buddies. Legislation endeavours to be unambiguous, and yet it's infamous for being plagued with ambiguity problems, largely because it's ultimately interpreted by subjective and unpredictable humang beings. Computer code doesn't try to be unambiguous, it simply is unambiguous — by its very definition. A piece of code, when supplied with any given input, is quite literally incapable of returning inconsistent output. A few weeks ago, I finished an elective subject that I studied at university, called Legal Method and Research. The main topic of the subject was statutory interpretation: that is, the process of interpreting the meaning of a single unit of law, and applying a given set of facts to it. After having completed this subject, one lesson that I couldn't help but take away (being a geek 'n' all) was how strikingly similar the structure of legislation is to the structure of modern programming code. This is because at the end of the day, legislation — just like code — needs to be applied to a real case, and it needs to yield a Boolean outcome.

Filed in:

On the causes of the First World War

WWI was one of the most costly and the most gruesome of wars that mankind has ever seen. It was also one of the most pointless. I've just finished reading The First Casualty, a ripper of a novel by author and playwright Ben Elton. The novel is set in 1917, and much of the story takes place at the infamous Battle of Passchendaele, which is considered to have been the worst of all the many hellish battles in the war. I would like to quote one particular passage from the book, which I believe is the best summary of the causes of WWI that I've ever read.

Syndicate content