GM/BM Friday: Pathological Programming Languages

In real life, I'm not a mathematician; I'm a computer scientist. Still a math geek, mind you, but what I really do is very much in the realm of applied math, researching how to build systems to help people program.

One of my pathological obsessions is programming languages. Since I first got exposed to TRS-80 Model 1 BASIC back in middle school, I've been absolutely nuts programming languages. Last time I counted, I'd learned about 130 different languages; and I've picked up more since then. I've written programs most of them. Like I said, I'm nuts.

Anyway, I decided that it would be amusing to inflict my obsession on you, my readers, with a new feature: the friday pathological programming language. You see, there are plenty of *crazy* people out there; and many of them like to invent programming languages. Some very small number of them try to design good languages and succeed; a much larger number try to design good languages and fail; and *then* there are the folks who design the languages I'm going to talk about. They're the ones who set out to design *bizzare* programming languages, and succeed brilliantly. They call them "esoteric" programming languages. I call them evil.

Today, the beautiful grand-daddy of the esoteric language family: the one, the only, the truly and deservedly infamous: [Brainfuck!][bf], designed by Urban Müller. (There are a number of different implementations available; just follow the link.)

Only 8 commands - including input and output - all written using symbols. And yet Turing complete; and not just Turing complete, but actually based on a *real* [formal theoretical design][pprimeprime]. And it's even been implemented [*in hardware*!][bf-hard]

BrainFuck is based on something very much like a twisted cross between a [Turing machine][turing] and a [Minsky machine][minsky]. It's got the idea of an input tape, like the turing machine. But unlike the turing machine, each cell of the tape stores a number, which can be incremented or decremented, like a Minsky machine. And like a Minsky, the only control flow is a test for zero.

The 8 instructions:

1. **>**: move the tape head one cell forward.
2. **<**: move the tape head one cell backward.
3. **+**: increment the value on the current tape cell.
4. **-**: decrement the value on the current tape cell.
5. **.**: output the value on the current tape cell as a character.
6. **,**: input a character and write it's numeric value onto the current tape cell.
7. **[**: Jump forward to the first instruction after the matching "]" *if* the value on the current tape cell is 0.
8. **]**: jump backward to the matching "[" *unless* the value on the current tape cell is 0.

Here's a hello-world program in BF:

++++++++[>+++++++++<-]>.<+++++[>++++++<-]>-.+++++++..+++.<
++++++++[>>++++<<-]>>.<<++++[>------<-]>.<++++[>++++++<-]>
.+++.------.--------.>+.

Let's pull that apart just a bit so that we can hope to understand.

* "++++++++": store the number "8" in the current tape cell. We're going to use that as a loop index, so the loop is going to repeat 8 times.
* "[>+++++++++<-]": Run a loop: using the tape cell *after* the loop index, add "9" to it. Then go back to the loop index, decrement it, and branch back to the beginning of the loop if it's not zero. So we wind up with the number 72 in the second cell. That's the ascii code for "H".
* ">.": go to the cell after the loop index, and output what's there. That outputs the "72" as a character: "H".
* "<+++++": back to the loop index. This time store 5 in it.
* "[>++++++<-]": same idea as the loop to generate the "H": this time, we're going to add 5 * 6 to the value in the second cell. (Remember that we didn't get rid of the value in that cell - so it's *still* 72.) So now the second cell contains 102.
* ">-.": Advance past the index, subtract one, and output. That's 101, or "e".

Continues in pretty much the same vein, using a couple of tape cells, and running loops to generate the values of the characters. Beautiful, eh?

If that didn't seem impressive enough, [here][bf-fib] is a really gorgeous implementation of a fibonacci sequence generator, with documentation. The BF compiler used to write this ignores any character other than the 8 commands, so the comments don't need to be marked in any way; they just need to be really careful not to use punctuation.

+++++++++++ number of digits to output
> #1
+ initial number
>>>> #5
++++++++++++++++++++++++++++++++++++++++++++ (comma)
> #6
++++++++++++++++++++++++++++++++ (space)
<<<<<< #0
[
  > #1
  copy #1 to #7
  [>>>>>>+>+<<<<<<<-]>>>>>>>[<<<<<<<+>>>>>>>-]

  <
  divide #7 by 10 (begins in #7)
  [
    >
    ++++++++++  set the divisor #8
    [
      subtract from the dividend and divisor
      -<-
      if dividend reaches zero break out
        copy dividend to #9
        [>>+>+<<<-]>>>[<<<+>>>-]
        set #10
        +
        if #9 clear #10
        <[>[-]<[-]]
        if #10 move remaining divisor to #11
        >[<<[>>>+<<<-]>>[-]]
      jump back to #8 (divisor possition)
      <<
    ]
    if #11 is empty (no remainder) increment the quotient #12
    >>> #11
    copy to #13
    [>>+>+<<<-]>>>[<<<+>>>-]
    set #14
    +
    if #13 clear #14
    <[>[-]<[-]]
    if #14 increment quotient
    >[<<+>>[-]]
    <<<<<<< #7
  ]

  quotient is in #12 and remainder is in #11
  >>>>> #12
  if #12 output value plus offset to ascii 0
  [++++++++++++++++++++++++++++++++++++++++++++++++.[-]]
  subtract #11 from 10
  ++++++++++  #12 is now 10
  < #11
  [->-<]
  > #12
  output #12 even if it's zero
  ++++++++++++++++++++++++++++++++++++++++++++++++.[-]
  <<<<<<<<<<< #1

  check for final number
  copy #0 to #3
  <[>>>+>+<<<<-]>>>>[<<<<+>>>>-]
  <- #3
  if #3 output (comma) and (space)
  [>>.>.<<<[-]]
  << #1

  [>>+>+<<<-]>>>[<<<+>>>-]<<[<+>-]>[<+>-]<<<-
]

[bf]: http://www.muppetlabs.com/~breadbox/bf/
[bf-fib]: http://esoteric.sange.fi/brainfuck/bf-source/prog/fibonacci.txt
[turing]: http://goodmath.blogspot.com/2006/03/playing-with-mathematical-machines…
[minsky]: http://goodmath.blogspot.com/2006/05/minsky-machine.html
[bf-hard]: http://www.robos.org/?bfcomp
[pprimeprime]: http://en.wikipedia.org/wiki/P_prime_prime

More like this

I'm currently away on a family vacation, and as soon as vacation is over, I'm off on a business trip for a week. And along the way, I've got some deadlines for my book. So to fill in, I'm recycling some old posts. I decided that it's been entirely too long since there was any pathological…
I'm sure that in the friday pathological programming languages, I have a fondness for languages that make your code beautiful: languages like [SNUSP][snusp], [Befunge][befunge], [Piet][piet], and such. Those languages all you two-dimensional layouts for their programs. [snusp]: http://scienceblogs…
Sorry for the missed weeks of friday pathological programming language columns. To be honest, I'm running out of languages. I'm sure there must be more, but my usual sources (dealers?) are running out - so send links! Anyway, today I'm going to look at a really simple one, which I find fun. It's…
Really we need the data on babies born 30 years ago, but this is still pretty stunning: Argentina: MatÃas, #3; Mateo, #13 Australia/New South Wales: Matthew, #21 Australia/Victoria: Matthew, #21 Austria: Matthias, #19 Belgium: Mathis, #9; Matteo, #22; Mathias, #23; Mathéo, #35; Mats, #89; Mathieu…

Interesting.

One of the Comp.Science professors from the porgramming language group at the Uniersty of Copenhagen have made a language that can only be used for rolling random d10 numbers (useful for generating World of Darkness Characters).

He uses it as an example of a domain specific language.

It's not Turing Complete though.

By Kristjan Wager (not verified) on 14 Jul 2006 #permalink

Domain specific languages can be really nifty things... But they do often set off one of my pet peeves. For example:

A few years ago, I was a talk about XSLT. This was in the very early days of XSL, before the standard was even complete. To me, XSL is a good example of a domain specific language: it's for doing XML transformations, and everything about it is geared towards that.

The guy giving the talk, who was on the standards committee explained some of the design decisions by saying: "We know that it's bad to design new general purpose programming languages. So we deliberately *made sure* that XSL was *not* Turing complete, because if it was TC, then it would be a general purpose language, which would be bad. So we made sure that it *can't* do X, which would require a Turing complete language."

The "X" that it couldn't do was something that someone writing an XML transformer might very well want to do. (It was switching between separator tags and enclosing tags.)

We know that it's bad to design new general purpose programming languages

Huh? Why would that be "bad"? Unnecessary perhaps, but bad?

By Kristjan Wager (not verified) on 14 Jul 2006 #permalink

Yeah, definitely setting the horse before the cart. A Turing-complete language may be *technically* general purpose, since you can theoretically do anything in it, but I'm pretty sure that few people would consider Brainfuck 'general purpose' (despite its Turing complete-ness).

>_<

Sometimes people just need to be reminded of exactly *why* they are believing something. If they forget, it's easy to twist the belief around into crap like that.

Kristjan Wager:

Huh? Why would [designing a universal language] be "bad"? Unnecessary perhaps, but bad?

I can think of specific instances. You might want to guarantee performance or formal semantics using automatic verification. If your language is universal, then these questions (even whether it will ever halt) are undecidable.

Regular expressions (the extended kind such as used in Perl, but also supported in Java and probably other languages) are a good example of a limited language. You can do many useful kinds of string matching and pattern extraction with them but unlike a universal program, you can be assured of efficiency at some level. (Actually, I'm not sure about full-blown Perl regexps but typical use cases stick to a non-universal feature subset). Another example would be SQL queries. In fact, you can write queries that would returned exponentially large cartesian products, but at least you can guarantee halting behavior.

I agree that it's stupid to leave out an obviously useful feature just to protect yourself from the monster under the bed. If the most natural extension of a language is universal, then so be it (it's hard to avoid; for instance, just iterating a Perl substitution expression can easily be used to simulate a universal TM). In many useful cases, you can do formal verification of programs written in a potentially universal language, and perhaps it would be better to provide such tools than to handicap the language itself.

Kristjan:

Why would designing a turing complete language to be bad?

Well, in this case, it's frankly, a very bizzare notion. When people asked questions, his response was, basically, that it was an issue of acceptance. New general purpose programming languages are proposed all of the time, and pretty much all of them are failures - maybe once every decade a new GP language sees widespread acceptance. Since Java was newly successful at the time, that meant that the window for a new general purpose language was closed. So for XSL to have any chance of acceptance, it had to be a non-GP domain specific language; and to make sure that it was domain specific and not general purpose, it had to be crippled.

This, of course, is why XSLT is so widely used on todays web, roughly 8 years later :-)

I can think of specific instances. You might want to guarantee performance or formal semantics using automatic verification. If your language is universal, then these questions (even whether it will ever halt) are undecidable.

Yes, I can think of instances as well, but surely it's dependent upon the situation. To say it's "bad" to make general purpose languages seems rather bizare to me.

By Kristjan Wager (not verified) on 14 Jul 2006 #permalink

Here's another rather impressive program written in Brainfuck: a CSS descrambler. This was written back around the time of the DeCSS court case, and is one of many implementation of DeCSS hosted at the Gallery of CSS Descrabmlers.

By Courtney Bane (not verified) on 14 Jul 2006 #permalink

Yeah, it's a small internet. Although it's not too unexpected, since I found ScienceBlogs from you mentioning it, and somebody on one of the blogs here linked to the old Good Math/Bad Math site, which got me hooked.

By Courtney Bane (not verified) on 15 Jul 2006 #permalink

Does bf really warrant "grand daddy" status? Shouldn't that belong to intercal, or the oisc people? And hey, if we are to talk of some invention that gave birth to other similar inventions, shouldn't we talk about a grandmother? After all, that's the terminology we use when one slime mold colony sprouts off a separate slime mold colony, which seems the most appropriate analogy when dealing with these things.

Looking forward to the unlambda article, and the follow-up article on Lazy-K.

Daniel:

I actually don't know about OISC; I'll go look it up later today.

As far as I know, BrainFuck is the first of a particular kind of pathological language. In some deeply warped sense, BF is a remarkably *clean* language. It's very simple, completely self-consistent, with well-defined clear semantics. And yet - it's still deeply and insanely warped. *That* is what I think defines the interesting pathological language. You can, under the right conditions, actually mistake BF for a real useful language - see the P'' stuff that it's based on. I don't know of anything like that that predates BF.

By comparison, I find Intercal to be a moderately funny joke, but not much more. There's no level on which it can be looked at as a serious language. Things like the "please" ratio (if not enough lines start with please, the compiler rejects it for being rude; if too many start with please, it rejects it for being too kiss-ass) are just too over-the-top for me.

Mark CC:

*That* is what I think defines the interesting pathological language. You can, under the right conditions, actually mistake BF for a real useful language - see the P'' stuff that it's based on. I don't know of anything like that that predates BF.

This is the first I ever heard of BF. My first thought is not to be too excited, since minimal universal computational models are kind of a dime a dozen.

On second thought, I can think of it being useful in one case. It's definitely not useful as a language for humans to write in. But a language that can express short, useful, code segments in such a small alphabet might work well as a "genome" for a Tierra-like evolving system. Has anyone explored the use of genetic algorithms and other optimization techniques to generate minimal BF code segments?

PaulC:

In fact, you're pretty much dead on: a BF-derived language is used by Nanopond for doing evolutionary algorithms.

The first esoteric programming was INTERCAL, I think, not Brainfuck

By Zzo38computer (not verified) on 21 Mar 2010 #permalink