Fleeting glances

Since my eyes’ lenses were replaced, there have been a few times – every week or two, call it – when I see a hallucination out of the corner of an eye.

They move. Fast. And are gone.

What was that?? A rabbit? A car? A leaf falling where it shouldn’t?

They always seem like … something. Something identifiable.

But then they are gone and forgotten.

Nothing to see here, folks. Move on.

Coupled with more apparent oddities of new lenses – weird depth perception, fluid focus, bright blue world – these hallucinations do show that vision is not a camera.

So, now I wonder – just where in the brain is our GPU? Hey, it’s no new story that what we see in our mind’s eye is highly processed. But some gizmo has to render that highly processed information back into pixels. That gizmo is impressive. And it’s not surprising the works of such a gizmo would be gummed up when its raw input is changed.

(Yes. I also believe that “gizmo” is a misleading representation of this aspect of the vision system’s architecture. It might make more sense to think of the image in our head as being heavily shopped. With parts of the image variably selected from many shopped alternatives.)

Back and forth translations

Starting with thoughts of how it would be nice to have a back and forth translator when writing an email to someone who speaks a different language…

You would normally write the email in your language, then crank it through a web translator to get the text for the other person.

Well, shouldn’t the translator show the text translated back from the target language to yours so you can check what the other guy will be reading?

Oddly enough, I didn’t find anything on the web to do that as you type. Weird.

But, then it gets interesting.

The guys developing translation software could use instances of people’s starting text and final text when people process their writings through a back-and-forth-erizer. Figure a person starts by saying what he wants to say in his language. Then, as he modifies what he writes so the translation is better, he’s effectively spelling out a way to translate from his language to his language. He’s showing you a meaning thesaurus, not just simple word substitutions.

(Gunning) Fog index: Wouldn’t the difflib comparison score between the original and the back-translated text be in some way consistent with fog indices? The translation software builders probably use something like this to evaluate their software. I would.
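A rough sketch of what that comparison score might look like, using Python's standard difflib. The function name `round_trip_score` and the sample sentences are made up for illustration, and comparing lowercased words (rather than characters) is just one assumption about how to do it.

```python
from difflib import SequenceMatcher


def round_trip_score(original: str, back_translated: str) -> float:
    """Return a 0..1 similarity between the original and back-translated text.

    The texts are compared word by word, ignoring case. 1.0 means the
    round trip came back word-for-word identical.
    """
    a = original.lower().split()
    b = back_translated.lower().split()
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical example: the round trip changed one word.
original = "Please send me the report by Friday"
round_trip = "Please send me the report before Friday"
print(round_trip_score(original, round_trip))
```

A writer using a back-and-forth translator would, in effect, be editing until this number stops going up. Whether it tracks a fog index is still an open question.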

Certain pairs of languages will translate back and forth better than other pairs. What does that mean?

  • The translation software is better for those two lingoes?
  • The cultures/people are closer?
  • What else?

Over time, what happens? Can the changes in the pair distances be used as a metric of how the world is becoming a global village? Can such changes be used in any way to understand cultural differences? Can translation software improvements be normalized out of the pair distances over time?

Presumably, the translation software guys are monitoring the pair distances between languages so that there are no instances of translations being better going through an intermediate language rather than going direct from one language to another. If such were ever the case then the thing to do would be to train the direct translator using the longer route translations. Doing such a process iteratively sounds like a pretty good way to bring a new language into the system. All the new language needs is a corpus of translations from it to one other language. Of course, this wouldn’t be a binary thing. The more effective pair corpora would be able to bootstrap the less effective links, generally.

What are the implications of a world where people write using a language back-and-forth de-fogger? Does the writing end up bureaucratic? No personality. No sharp meaning. Vanilla.

Should textbooks be run through such a de-fogger? Should speeches? Especially in the education field, it seems important to get things across clearly.

Could these back-and-forth techniques be used to build a new language? A better language? Could they be used to build a creole language?

If a language translation system built a creole language that’s close to an existing one, does that imply that the translation system understands the ingredient languages like a human?

Given net-available text, how much CPU does it take to build an effective language translation system?

Could back-and-forth translations be used to help translate old text into modern language? That is, keep modifying the old text until you get the best back-and-forth for your modified text. It would be interesting to automate this whole process. Proofreader, editor, re-writer system.

Good URL, but probably going away in December (returns JSON translation of the ‘q’ string):

http://ajax.googleapis.com/ajax/services/language/translate?q=come%20to%20papa?&v=1.0&langpair=en%7ces

Compression Methods

I count 3 ways to compress data:

  1. Use short strings/symbols often, long strings rarely (e.g. Huffman coding, Zipf’s law’s effect on words, I me you the / prestidigitation onomatopoeia)
  2. Refer back to things (e.g. zip, gif, lz, “One if by land, two if by sea.”, symbols)
  3. Remove unwanted stuff (e.g. jpg, mpg, stop reading boring stuff)
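Here is a minimal sketch of way 1, Huffman coding, assuming nothing fancier than Python's standard heapq. The function name `huffman_code` and the sample string are just for illustration. The point to notice is that the frequent symbols come out with the short bit strings.

```python
import heapq
from collections import Counter


def huffman_code(text: str) -> dict[str, str]:
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        # A lone symbol still needs one bit.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


codes = huffman_code("aaaaaaaabbbc")
for symbol, bits in sorted(codes.items()):
    print(symbol, bits)
```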

Are there any others?

Harry Potter – the new Star Trek

It occurs to me that Harry Potter will drive a future generation’s idea of where tech should go. Kinda like Star Trek has driven so much over the last 40 years.

Consider spells. They are rather like the search engine query interface. A few simple words and … magic.

But consider the problem from the spell’s point of view. The spell needs quite a lot of processing power. If your spell is to “freeze” something, do you mean the virus 1 meter in front of you? Or, do you mean your friend very near the general direction your wand is pointing? What’s the spell’s target and intention?

And consider the wand. Its purpose, apparently, is to help the spell figure out your intention. Is a wand the best thing to use?

Pointing works very, very well to indicate many things. But what if the “thing” is not a physical thing in space? What if you want to freeze a discussion? “Everyone stop talking for a moment” (everyone being, presumably, slaves – robots – machines). Let’s say you have a half dozen home-building machines sawing and hammering away on the new house. Pointing the wand and saying “freezzzaaam” is really, really ambiguous. Maybe you mean that the place in the house you’re pointing at is “just right”. But maybe you want all of the robots to stop working and take a lunch break or something. Or maybe they are all sorta working at cross purposes and need to stop and take a breath. Or maybe you’re putting another robot in the group and the others need to stop for a moment to regroup.

If things work out the way they apparently will, such things will be important problems to solve.

Consider, for instance, this simple example: Garbage trucks.

How do they work? Let’s assume low tech here. Nothing fancy.

The garbage guy is in a nice, comfortable cab, monitoring what’s happening with his truck. Maybe he takes the wheel in locations that are not handled by the auto-driver – like running through town to the freeway back to the dump, for instance. But, the truck does all the work while slowly prowling up and down the residential streets, flipping garbage cans’ contents into the truck. The arm that grabs the cans can see the cans, grab ’em, empty ’em, and put ’em back on the curb. Not a big deal. Especially late at night or early in the morning. Slowly driving the truck down these streets really amounts to dodging any kids there may be. Late night, early morning hours make that job pretty easy, too.

So the operator, the garbage man, is a monitor, a watchman. He may occasionally need to get out to unjam something. But on the other hand, he’s more like the brakeman on a train, isn’t he?

Which puts him back at the shop rather than in the cab. He’s monitoring a dozen trucks. When he has a problem with more than one at one time he simply stops the others while each problem is dealt with. Freezing a bunch of trucks is troublesome in traffic but not on the garbage-can streets.

So, when the trucks need to go in traffic, do they flock together?

Anyway, what’s this guy in the shop going to point his wand at?

More Expected Characters

Now, it’s expected words.

Or, more exactly, after running the Buffett letters through a program that tracks strings of words (rather than characters), the last letter in the sequence is shown with the words that belong to common strings made small, and unusual words or strings of words made big.

Common strings are small - uncommon strings are big

The effect is the same. Boiler plate paragraphs are small. New stuff is big.
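For the curious, a sketch of how such a word-string tracker might work. This is not the actual program, just one plausible way to do it: the names `word_ngrams` and `surprise_sizes`, the choice of 3-word strings, and the point sizes are all assumptions made for illustration.

```python
from collections import Counter


def word_ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def surprise_sizes(earlier_texts, latest_text, n=3, small=8, big=24):
    """Pair each word of latest_text with a font size.

    Words inside n-word strings that were common in earlier_texts get
    sizes near `small`; words in strings never seen before get `big`.
    """
    seen = Counter()
    for text in earlier_texts:
        seen.update(word_ngrams(text.lower().split(), n))

    words = latest_text.split()
    lowered = [w.lower() for w in words]

    # For each word, keep the highest count of any n-word string containing it.
    covered = [0] * len(words)
    for i, gram in enumerate(word_ngrams(lowered, n)):
        count = seen[gram]
        for j in range(i, i + n):
            covered[j] = max(covered[j], count)

    top = max(covered, default=0)
    sizes = []
    for word, count in zip(words, covered):
        # Linear scale: the most common strings shrink to `small`.
        size = big if top == 0 else round(big - (big - small) * count / top)
        sizes.append((word, size))
    return sizes


# Made-up stand-ins for the earlier letters and the latest one.
earlier = ["to the shareholders of the company", "to the shareholders of the firm"]
latest = "to the shareholders of a zebra farm"
for word, size in surprise_sizes(earlier, latest):
    print(size, word)
```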

Data Compression

I count three ways to compress data:

  1. Make common quanta of data short, uncommon ones long. e.g. Huffman encoding. I, am, not, be, a, or, prestidigitation, gesticulate, onomatopoeia, redundant.
  2. Reference known data. e.g. Symbols. ZIP file encoding of references to repeated byte strings. Refer to a whole book’s worth of information by referencing the title. One if by land, two if by sea.
  3. Drop information that is not needed. e.g. JPG images. MP3 music. Forget it all. Don’t do it.
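A quick way to watch way 2 in action, assuming Python's standard zlib module (its DEFLATE format combines back-references to repeated byte strings with Huffman coding). The sample data is made up. Repeated text has plenty to refer back to, while random bytes have nothing.

```python
import os
import zlib

# Highly repetitive data: every repeat can be encoded as "same as before".
repetitive = b"One if by land, two if by sea. " * 100

# Random data of the same length: no repeats to refer back to.
random_bytes = os.urandom(len(repetitive))

packed_repetitive = zlib.compress(repetitive)
packed_random = zlib.compress(random_bytes)

print("repetitive:", len(repetitive), "->", len(packed_repetitive))
print("random:    ", len(random_bytes), "->", len(packed_random))

# Lossless: the original comes back exactly.
assert zlib.decompress(packed_repetitive) == repetitive
```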

Are there any more?

In a sense, all optimization is data compression, is it not?

How many MIPS does the Earth’s DNA crank?

I’ve often wondered how much processing power the Earth’s DNA has. In a sense, genetic evolution seems to be a search function, not unlike a nervous system, always seeking to represent, in some transformed way, a match to its outside reality.

Hmmm. If there were some reasonable way to measure “MIPS” in a broad sense, then I imagine that some of the parallels ’tween nervous systems and genetic evolution would be clearer than they are now.

Anyway, thinking of genetic evolution as being a “brain” brings up the question: Is it self-aware?

Which brings up the question of: What the heck does “self-aware” mean?

Or, another question: Does it, genetic evolution, that is, have intention? (I’m thinking that anything that is “self-aware”, whatever that means, probably has motives of some sort. Maybe not true, of course, but it sure feels ok to think so.)

Etc.

It’s the “etc” part that’s fun.