Langomatic


I spent a sleepless night working on an algorithm by which a computer could analyse an English sentence into grammatical blocks. And I have come to the conclusion that it's easy as pie to generate sentences from a grammar, and monstrously complicated to analyse grammar from sentences.

Given a grammatical framework, it's easy to drop in words to make sentences. Here's a framework in a loose, BNF-style notation:

Sentence = Subject clause + Verb clause + Object clause + Complement clause
Subject clause = (Adjective)* + Noun
Verb clause = Present verb | Past verb
Object clause = "the" + (Adjective)* + Noun
Complement clause = Adjective

This gives you sentences like:
My dad painted the big door red
Happy children ran the old sneakers threadbare
Old Harry sends the lame horse away

It will also give you sentences like:
Telephonic your paper creeps the biscuit reticular
Sideways flat aunt gives the sticky sticky moose deaf
Clout wobbled Marjorie loud

...so it's not the most robust algorithm in the world, but for making sentences of the form "noun performs verb in such a way as to make noun adjective", it works. And if you try, you can make sense out of the last three examples. It's remarkably difficult to make sentences that have no possible interpretation if they are grammatically correct.
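For the curious, here is a rough sketch of how the generating direction might look in Python. The word lists are made up for illustration (the framework above doesn't come with a lexicon), and the cap of two adjectives per clause is my own assumption so the sentences stay finite.

```python
import random

# Hypothetical word lists, purely for illustration.
ADJECTIVES = ["big", "old", "lame", "sticky", "happy"]
NOUNS = ["dad", "door", "children", "horse", "moose"]
VERBS = ["paints", "painted", "sends", "ran", "wobbled"]  # present | past


def adjectives(rng, max_adjectives):
    # (Adjective)* : zero or more adjectives, capped by assumption
    return [rng.choice(ADJECTIVES) for _ in range(rng.randint(0, max_adjectives))]


def subject_clause(rng, max_adjectives=2):
    # Subject clause = (Adjective)* + Noun
    return adjectives(rng, max_adjectives) + [rng.choice(NOUNS)]


def verb_clause(rng):
    # Verb clause = Present verb | Past verb
    return [rng.choice(VERBS)]


def object_clause(rng, max_adjectives=2):
    # Object clause = "the" + (Adjective)* + Noun
    return ["the"] + adjectives(rng, max_adjectives) + [rng.choice(NOUNS)]


def complement_clause(rng):
    # Complement clause = Adjective
    return [rng.choice(ADJECTIVES)]


def sentence(rng):
    # Sentence = Subject clause + Verb clause + Object clause + Complement clause
    words = (subject_clause(rng) + verb_clause(rng)
             + object_clause(rng) + complement_clause(rng))
    return " ".join(words).capitalize()


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(sentence(rng))
```

Each function mirrors one rule of the framework, which is really the whole point: generation is just walking the rules and picking a word at each leaf, sense or no sense.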

But what about doing it the other way around? Providing a computer with a set of frameworks, and a set of tools for it to try to fit any input sentence into one of them?

"Dad paints it red" is, as above, Subject (Noun) + Verb + Object (Noun) + Complement (Adjective). "Dad paints it fast" ought to have the same structure. But "fast" in this position isn't an adjective - it's an adverb which has an identical form to the adjective from which it derives.

"Fast" also has another adverbial sense, with no corresponding adjective, as in "The door won't budge, it's stuck fast".

Sometimes there's just no way to know from syntax alone how to parse a sentence - and computers by definition can only deal with form, not function.
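To make the problem concrete, here is a toy sketch of the other direction in Python. Everything in it is an assumption for illustration: the tiny lexicon, the tag names, the choice to treat "it" as a noun (as I did above), and the second "adverbial" framework, which isn't part of the grammar earlier in this post. It simply tries every combination of possible word classes and reports which frameworks fit.

```python
from itertools import product

# Hypothetical lexicon: each word maps to every class it could belong to.
LEXICON = {
    "dad": {"NOUN"},
    "paints": {"VERB", "NOUN"},  # "paints" can also be a plural noun
    "it": {"NOUN"},              # pronoun treated as a noun, as in the text
    "red": {"ADJ"},
    "fast": {"ADJ", "ADV"},      # same form, two classes
}

# Hypothetical frameworks: the first matches the post's analysis,
# the second is an assumed variant ending in an adverb.
FRAMEWORKS = {
    "complement": ("NOUN", "VERB", "NOUN", "ADJ"),
    "adverbial": ("NOUN", "VERB", "NOUN", "ADV"),
}


def analyses(sentence):
    """Return the names of every framework the sentence could fit."""
    words = sentence.lower().split()
    # Unknown words get no tags, so the sentence matches nothing.
    options = [sorted(LEXICON.get(word, ())) for word in words]
    found = []
    for tags in product(*options):
        for name, pattern in FRAMEWORKS.items():
            if tags == pattern:
                found.append(name)
    return sorted(found)


if __name__ == "__main__":
    print(analyses("Dad paints it red"))   # ['complement']
    print(analyses("Dad paints it fast"))  # ['adverbial', 'complement']
```

With "red" there is one answer. With "fast" there are two, and nothing in the word forms says which one is meant. A real sentence has far more words and far more frameworks, so the number of combinations to try balloons very quickly.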

One thing you don't want after a long walk is to find you've dropped your keys somewhere. You've locked yourself out, and you don't have a telephone number to call for help, and even if you did your mobile doesn't work and this country doesn't seem to have payphones. Gah.

So I retraced my steps and, miracle of miracles, found my keys. So from now on they don't travel on a belt loop - they go in a zipped pocket. Possibly on a chain, surgically attached to my gall bladder. Or maybe not.

Anyway, there were two pieces of graffiti in English on the way. "Graffiti art for everyone", which was nice, and "Combat 18", which wasn't so nice. Just who in Bulgaria would have even heard of a bunch of British neo-Nazi gun fetishists?

Like all languages, Bulgarian takes words from other languages. I've seen some from German (Tanz - Dance), and some from Turkish (Portokal - Orange juice, Chai - Tea), and a few I thought might be French.

For obvious reasons, English-derived words jump out at me - Taksi, Foto, Resterant, Apartament, Marmalad, Biskvito. But there are also some nice pseudo-anglicisms.

A teller machine is a Bankomat, and a shop where you buy hot sandwiches is a Tosteri. I think that's quite neat.
