Saturday, July 4, 2009

Using X11 over high-latency ADSL and an AltGr NX solution

X11 has been bad at high-latency links for a long time. Supposedly, the Xlib replacement XCB would fix some of that, but so far no-one seems to have picked it up. I found some work on GTK+ in 2006 but nothing since then. Sigh.

So there's this other proprietary solution, Nomachine NX, with a free companion, FreeNX. I decided to give it a try today. After spending some time contemplating how to get it installed on Debian (it's not in the main repository because the basic architecture is broken: it includes its own copy of X and SSH and some other stuff), I finally realized that the easiest setup was probably just downloading and installing the .debs from Nomachine's web site.

I was hoping to get ssh somehost then emacs & working, but it's a bit weirder than that. You have to start up a graphical connection thing, in which you can fortunately select a console. This gives you a remote xterm. The rest is simple.

I must say I'm very impressed with the performance. Of course, I'm only using Emacs but it works really well. It's like being connected through a local network.

Of course, once I started coding, the first time I needed a curly brace, which happens soon enough with Javascript, it was bam, back to start. AltGr sends an arrow-key left. After some digging and hair-pulling, it turned out I needed to run setxkbmap [layout] where layout is "dk" or "de" or similar. Except setxkbmap wasn't installed. So I installed it. Then it was missing its data. I installed that too. Now everything is fine.

If someone would now just port this kind of thing to the default X.org setup, then I would be a happy man. No more console Emacs sessions.

Friday, May 1, 2009

Safe truncation of HTML

Another recipe, this time for solving the problem of truncating a piece of HTML, i.e. turning "<p>Blah blah blah</p>" into "<p>Blah ...</p>". Google didn't really turn anything useful up, except for a suggestion of using a full-blown HTML parser and then simplifying the result, so I thought I would post the snippet here for Google to pick up.

The code never splits a valid tag or character entity. It should be able to cope with invalid HTML too, but note that it won't sanitize it. So for instance, if there's an unbalanced <a> in the source string, it won't fix it. Character entities are dealt with by counting them as one character.

The basic idea in the snippet is that we just skip through the string unless we encounter an opening tag. If so, we see if we can find the corresponding end tag and save it for later. Once we've got enough non-HTML characters, a ... is put in and any saved but not yet used end tags are added to the output.

Here's the code in Python (it's easily turned into a Django filter). I aimed for readability rather than ultra-regexp ninja tricks:

import re

from django import template

register = template.Library()

tag_end_re = re.compile(r'(\w+)[^>]*>')
# '#?' lets numeric entities such as &#38; count as one character too
entity_end_re = re.compile(r'(#?\w+;)')

@register.filter
def truncatehtml(string, length, ellipsis='...'):
    """Truncate HTML string, preserving tag structure and character entities."""
    output_length = 0
    i = 0
    pending_close_tags = {}

    while output_length < length and i < len(string):
        c = string[i]
        if c == '<':
            # probably some kind of tag
            if i in pending_close_tags:
                # just pop and skip if it's a closing tag we already knew about
                i += len(pending_close_tags.pop(i))
            else:
                # else maybe add tag
                i += 1
                match = tag_end_re.match(string[i:])
                if match:
                    tag = match.groups()[0]
                    i += match.end()

                    # save the end tag for possible later use if there is one
                    match = re.search(r'(</' + tag + '[^>]*>)', string[i:], re.IGNORECASE)
                    if match:
                        pending_close_tags[i + match.start()] = match.groups()[0]
                else:
                    output_length += 1 # some kind of garbage, but count it in

        elif c == '&':
            # possible character entity, we need to skip it
            i += 1
            match = entity_end_re.match(string[i:])
            if match:
                i += match.end()

            # this is either a weird character or just '&', both count as 1
            output_length += 1
        else:
            # plain old characters: skip to the nearest '<' or '&',
            # whichever comes first, so an entity is never split
            stops = [p for p in (string.find('<', i, i + length),
                                 string.find('&', i, i + length)) if p != -1]
            skip_to = min(stops) if stops else i + length

            # clamp
            delta = min(skip_to - i,
                        length - output_length,
                        len(string) - i)

            output_length += delta
            i += delta

    output = [string[:i]]
    if output_length == length:
        output.append(ellipsis)

    # close the tags that are still open, innermost first
    for k in sorted(pending_close_tags.keys()):
        output.append(pending_close_tags[k])

    return "".join(output)
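
A quick way to sanity-check the output of a truncation is to count the characters a reader would actually see. The helper below is only a sketch and not part of the filter: visible_length is a made-up name, the tag-stripping regexp is deliberately naive, and it assumes Python 3 (for html.unescape).

```python
import html
import re

# Hypothetical helper for checking results, not part of the filter above.
# Naive on purpose: anything between '<' and '>' is treated as a tag.
TAG_RE = re.compile(r'<[^>]*>')

def visible_length(markup):
    """Count visible characters: tags are dropped, each entity counts as one."""
    text = TAG_RE.sub('', markup)
    return len(html.unescape(text))

print(visible_length('<p>Blah &amp; blah</p>'))
print(visible_length('<p>Blah ...</p>'))
```

With this, a truncated string should have a visible length of at most the requested length plus the length of the ellipsis.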

Friday, April 17, 2009

Test if uploaded file is JPEG, PNG or TIFF

I've been looking at some of the uploads that went wrong on YayArt lately, and it turns out that people sometimes submit images with the wrong extension, e.g. "someimage.png" when it's really a JPEG. This confuses the image backend we're using to process large images, VIPS, so it reports back an error.

I did a bit of googling, and it seems the easiest way out is to simply check the first few bytes of the file for magic numbers. So here's a bit of Python code for checking whether the file data belongs to a JPEG, PNG or TIFF image:

def is_jpeg(data):
    # matches JFIF files only; Exif JPEGs start with ff d8 ff e1 instead
    return data[:4] == b'\xff\xd8\xff\xe0' and data[6:11] == b'JFIF\0'

def is_png(data):
    return data[:8] == b'\x89PNG\x0d\x0a\x1a\x0a'

def is_tiff(data):
    return data[:4] == b'MM\x00\x2a' or data[:4] == b'II\x2a\x00'

If the file is already on disk, you can grab the first few bytes with

# open in binary mode so the bytes are read untouched
with open("somefile.jpg", 'rb') as f:
    data = f.read(11)
if is_jpeg(data):
    ext = ".jpg"
elif is_png(data):
    ext = ".png"

Of course this won't test that the whole file is valid. But it's easier to do that afterwards with an image library once the extension is correct.

The magic numbers are documented in the specifications for the formats. You can also find some help for other formats in the source code of the file command on Unix systems.

Update: I'm liking this so much that I ended up putting it in a separate file and making a convenience function for getting an extension like '.jpg'. Grab the Python file here. I also added support for GIF. Here's another easy reference for magic file numbers.
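
The convenience function could look roughly like the sketch below. This is not the linked file: guess_extension is a made-up name, returning None for unknown data is my assumption, and the GIF check simply looks for the two GIF signatures, "GIF87a" and "GIF89a".

```python
def guess_extension(data):
    """Return an extension such as '.jpg' for known magic numbers, else None.

    Hypothetical helper; `data` is the first bytes of the file (11 is enough).
    """
    if data[:4] == b'\xff\xd8\xff\xe0' and data[6:11] == b'JFIF\0':
        return '.jpg'
    if data[:8] == b'\x89PNG\x0d\x0a\x1a\x0a':
        return '.png'
    if data[:4] in (b'MM\x00\x2a', b'II\x2a\x00'):
        return '.tif'
    if data[:6] in (b'GIF87a', b'GIF89a'):
        return '.gif'
    return None  # unknown format, let the caller decide what to do

print(guess_extension(b'GIF89a\x01\x00\x01\x00\x00'))
```

Returning None rather than raising keeps the caller free to fall back on the extension the user supplied.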

Wednesday, April 8, 2009

Good art and bad art

There's a couple of things I've learned in the process with YayArt that I'd like to share. First, the key question: what is good art?

If you're like me, you'll probably think that this is impossible to say, or at least so difficult to answer that you need to have studied art for many years, be part of the art elite, to answer in a qualified way.

That's right. But in my opinion also wrong.

One approach to defining good art is the Aristotelian approach: we try to pinpoint the common traits of the subject. This is a slippery path, but we can still get some of the way.

First, art is a result of craftsmanship which is performed with the purpose of engaging you in some way, by moving you, touching your feelings. It must be more than just an everyday thing.

If you're part of the art elite, you would add to this that good new art must bring something fresh to the table every time. It must be innovative. The past is important, to the point that new works that in the past would have been considered good art cease to be interesting.

If you're not part of the art elite but like me, you're probably more interested in the looks than prior art. It must be pleasing to the eye, or at least pleasingly unpleasant. The small details that are hard to define, but nevertheless obvious, must be right.


Good or bad art?

There's a conflict between these opposing views, and I believe this conflict leads directly to museums and obscure works like the infamous diamond skull by Damien Hirst, which most people are likely to agree is pretty ugly.

Why do most people hardly ever go to museums for inspiration? Is it because they're not susceptible to art? They lack the gene that enables them to experience art? This seems like a dubious explanation. Witness the commercial success of photography and Hollywood. Pictures work. And good art is, by definition, capable of engaging people.


Good or bad art?

The explanation is probably rather that going to a museum is cumbersome, and when you finally get there you might only see art that is definitely different, but doesn't engage you. It doesn't speak to you. Art that doesn't engage you is bad art, by definition - for you.

With YayArt we're trying to make it less cumbersome to see the art. And we are trying to replace the elitist notion of good art as being innovative art with something else. It's not that having an art elite is bad. It's just that we think there's a large proportion of people who could enjoy and benefit from art if the current art market served them. Or if a new art market emerged.

The definition of good art we're using is a platonic definition, i.e. it works by examples. For instance, to define a chair to a native from the jungle we wouldn't talk about furniture and four legs but instead point at the six chairs around the dining table and say, now do you understand?

At YayArt we show you a piece of art and ask, is it good or bad?


Good or bad art?

This is not abstract, so it is easy for everyone, not just art pros, to answer. What we're implicitly asking are questions like, does it move you? Do you like it? Everybody can answer that.

And when you think about it, this is really what good art is about.

Thursday, March 26, 2009

Ketil Bjørnstad

I finished reading To Music by Ketil Bjørnstad the other day.

It mixes a realistic presentation of a young aspiring pianist and his chaotic life before his career starts with a sneaky, subtle weirdness, a bit like Twin Peaks; very unnerving but at the same time funny. Of course, once I started on it, it was impossible to put down. For anyone interested in their general health, I'd recommend staying away. Seriously.

It makes me wonder what I look for in a book when I go to the library. My previous book was by Alistair MacLean, which today requires a humorous attitude towards his anachronistic a-man-is-a-man world-view to read, at least for me.

Friday, March 20, 2009

Beautiful code

Anders pointed me to a talk by some Ruby guy (Marcel Molina) speaking about beautiful code. I must admit that I was a bit sceptical, but Anders was very convincing.

The verdict?

When it comes to ruminating on software design, I think there's a big nasty trap which he unfortunately walked straight into (as foreseen) with few extenuating circumstances (not expected).

The problem is that the field is hidden in a fog of mysteries, buried in what's governed by intuition and tacit knowledge rather than explainable reason. Good programming, hah, that's an art, nobody can tell you how to do that!

What we need here is to be able to talk about the thing. More reason, less intuition. Trying to explain things as beautiful or not is a step in the wrong direction. It's romantic self-indulgence, like when you look at people younger than you and think, people these days... An operational set of values for evaluating code is an essential thing for an aspiring programmer. How can beauty be operational if you don't even know how to argue about it with a fellow programmer?


Beauty? Digital graphic art from YayArt.

In any case, I think his main point can be summed up as: ensure that the code is as small as possible, as clear as possible and does what it's supposed to do. As I mentioned last time I wrote about software design, I think this can be simplified to: make it easy to understand.

Save the interesting but tricky beauty discussions for things like this.

Saturday, March 14, 2009

The repair shop

One of the annoying things about modern home electronics is that they are to a large degree black boxes. Like a microwave: cold food in, push buttons, warm food out. Who knows what happened inside?

Most people are probably happy they don't know. But a life without curiosity is a life with less passion. I was fortunate enough to get an excuse to take apart two home appliances recently because they broke. And it wasn't even my fault.

The first successful repair was Janne's younger sister's laptop. The power cord had been loose for some time, and at some point the laptop simply stopped working, with symptoms of no power.

The anatomy of the repair is not too far from debugging software you don't know (I've previously talked about debugging software you've written yourself). Take the system apart, examine the individual components, collect information, reason. Identify the faulty component and apply the easiest fix you can think of.


Taking apart the laptop


The faulty component, the switchboard that the power cord plugs into (visible at the fingertip)

I had an edge here. I'd heard about this problem before. So instead of giving up beforehand, it got me thinking that if it was a common problem, the remedy would probably be well-known amongst electronics hobbyists. A bit of googling revealed that some people had success with desoldering the small housing the power jack is inserted into and replacing it. When we examined the housing, something was in fact wrong with it, as witnessed by a simple continuity test with a multimeter.

So we set out to desolder it. Unfortunately, that didn't work out. The solder wouldn't melt properly. Instead we ended up replacing the whole component, i.e. the small part of the motherboard that the ports were sitting on. The web is extremely handy here. Just jot down the spare part number and search for it, or parts of it.



We ordered a spare part which ended up costing less than 1/10 of the price of a new laptop, and put it in.

In reality, this was a bit harder than it sounds, because the first two places that turned up on the web didn't actually have the part when we tried ordering it. Also, taking apart a laptop is a bit complicated because of all the tiny screws and cords and plastics that have to be bent in awkward positions, sometimes more violently than you'd like to think about.

Last week, I had a much nicer experience changing the BIOS battery on my trusty old Pentium III laptop. Ugly looks, but nice internals.

Continuing on this saga, I've also fixed our microwave oven. Microwave ovens are a bit more complicated in the feature set than I hinted above. They also make the food turn around, slowly. But our oven stopped doing that. So I took it apart, hoping that it would be a bad connection.



One non-rotating Samsung microwave oven


The faulty component in the microwave

However, there was nothing wrong with the connections inside the oven. The funny thing about hardware is that as soon as you take off the shell, it looks complicated and futuristic, but in reality it's just a set of interconnected smaller components. In this case, the sealed turntable motor was broken. Again, with the component number it wasn't hard to find a spare part on the web.

However, this presented me with an interesting real-life dilemma. Is it worth £22 to me to be able to see the food inside the oven carouseling past? My first answer was no. I already got to see the inside of the oven (it's actually pretty simple) and identify the problem. I didn't have to actually fix it.

A couple of weeks later, the oven fried a very useful corn bag we're using to loosen stiffened neck muscles, a useful cure for some kinds of headaches. The oven had burnt a hole at one particular spot in the non-rotating bag.

Today I installed the spare part. Works like a charm.

This may sound silly, but fixing supposedly unfixable things is really rewarding. You feel powerful and virtuous.

A couple of extra garden pictures:


Enjoying the spring sun at noon


Next day it's snowing