My first geocache

It was a sloppy job. Got kinda cold and wet up there. But I left my first geo-cache at the top of a hill, bushwhacked up from Bear Gap. It is one of the old ammo boxes from 25 years in the garage wrapped in reflective tape.

http://www.geocaching.com/seek/cache_details.aspx?wp=GC2EACX

Another hill near Bear Gap

And invented a new art form – web site coming soon … maybe … someday.

Behold, the “LimeriKu”:

There once was a gap named Bear,
who had a cache 6 thou in the air.
The cache it did shine
by flashlight at nine.
The car … … without spare.

Poetic license, ho!

Media numbers and crowd estimates. Always a gas.

The media is famously innumerate. And, credulous with numbers.

So, when I read crowd estimates of yet another march on Washington – this one headed by a recent celebrity, Glenn Beck, I got curious.

The media, no friends of Beck, had the numbers slightly under 90k. Beck people seemed to like 4 or 500k.

That seems like enough of a difference to check without a lot of trouble.

So, here is the best picture I could find of the crowd as a whole:

Original link to slightly larger image:

http://www.therightscoop.com/wp-content/uploads/2010/08/bigcrowd.jpg

DC crowd

One of the odd things that struck me was that there were so many photos of parts of the crowd and area, but none of the whole thing. I imagine it’s hard for political-security reasons to get a plane up in that area, but, golly, where are the raw images and wide angle lenses?

Anyway, from this and other pictures you can do a quick guesstimate by simply figuring everyone is packed in the areas to the north and south of the reflecting pond, from the Lincoln Memorial to the WWII Memorial.

And that area is easy to measure. Google Earth’s ruler has it about 250 feet wide to the north and about 650 feet wide to the south. That includes the area under the trees near the pool. And, it includes a couple hundred, under-tree feet on the south.

The pool is about 2000 feet long. So, we’re talking 900*2000=1,800,000 square feet.

Divide that area up into 18,000 cubicle-sized, 10 by 10 foot tiles. Put 4 people in each and you’ve got about 70k people.
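For anyone who wants to fiddle with the numbers, here is the same back-of-envelope arithmetic as a minimal Python sketch. The constant and function names are made up for illustration; the figures are the ones quoted above.

```python
# Rough crowd count: area / tile size * people per tile.
NORTH_WIDTH_FT = 250    # strip north of the pool (Google Earth ruler, per the post)
SOUTH_WIDTH_FT = 650    # strip south of the pool
POOL_LENGTH_FT = 2000   # approximate length of the pool
TILE_SIDE_FT = 10       # 10 by 10 foot tiles
PEOPLE_PER_TILE = 4

def crowd_estimate(people_per_tile=PEOPLE_PER_TILE):
    """Return (area in square feet, number of tiles, head count)."""
    area_sq_ft = (NORTH_WIDTH_FT + SOUTH_WIDTH_FT) * POOL_LENGTH_FT
    tiles = area_sq_ft // (TILE_SIDE_FT * TILE_SIDE_FT)
    return area_sq_ft, tiles, tiles * people_per_tile

area, tiles, people = crowd_estimate()
print(area, tiles, people)   # 1800000 18000 72000
```

The exact product is 72,000, which the text rounds to 70k.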

OK. This is a really, really rough count. What’s wrong with it?

  1. It doesn’t count people outside this big rectangle.
  2. It doesn’t count people who came and went before or after the photo was taken.
  3. 4 people per 100 square feet? Let’s up the density a bit, eh?

#1: With regard to outsiders: Yes, this skips everyone right at the Lincoln and those people in the picture’s foreground on the hill to the west of Washington. But, I figure that they all can be moved to relatively empty spots in the big rectangle. The crowd’s kinda sparse in the southeast. There seems to be a sparse area half way down on the south side. Anyway, we’re talking percentage differences here. Nothing to write home about. Density estimate inaccuracies dwarf the outsider effect.

#2: Time? I’d have a hard time believing that the total crowd for the day would be more than, say, 50% higher than a high-crowd snapshot. The original .jpg doesn’t have any useful EXIF information, by the way, so I’m assuming that it’s close to high-crowd time. Other, on-the-ground photos don’t seem to refute this notion.

#3: Density. Yes, there’s the nub. If you make it a really tight crowd, then you could quadruple the density. (Double density, that is, causing 4 times as many people to be counted.) But, that’s getting the whole crowd in to some pretty serious intimacy. No room for lawn chairs and coolers. Anyway, taking in to account camera angle and telephoto lens effects for the on-the-ground photos, I’m wondering whether 4 people per 10^2 is a wee bit high. But, even if it is, I figure throwing the outsiders in to the big rectangle will make up for it.
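The "double density means 4 times the people" remark can be checked with a quick sketch. It assumes people stand on a square grid, so 4 people per 10 by 10 tile works out to one person every 5 feet; the function name is hypothetical.

```python
def people_in_area(area_sq_ft, spacing_ft):
    """Head count if one person stands every spacing_ft feet on a square grid."""
    return area_sq_ft / (spacing_ft ** 2)

AREA_SQ_FT = 1_800_000                     # the big rectangle from the post

base = people_in_area(AREA_SQ_FT, 5.0)     # 4 people per 100 square feet
tight = people_in_area(AREA_SQ_FT, 2.5)    # twice as close in each direction

print(base, tight, tight / base)           # 72000.0 288000.0 4.0
```

Halving the spacing in both directions quadruples the count, which is why the density guess swamps every other source of error.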

Bottom line on this guesstimate: 60 to 100k people.

Oh. And one more count for that hippy girl in the pool shouting, “Forrest!”.

Spammers distribute emails to addresses by Pareto’s law

I’ve got an II server that knows a lot of email addresses. Most of them are bogus addresses at tranzoa.com or .net. Emails to the bogus addresses are tossed in the bit bucket.

This email address list also includes legitimate addresses.

What happens when the number of emails to each of these addresses is graphed on a semi-log scale?

Log graph of email counts by To: name

About 10,000 of the 14,000 names are 1-email names. The rest go up in counts to the top name – a name that has had 100,000 emails to it. (Log file emails from the server really run that number up!) Oddly enough, the number two name, by count, is some long, bizarre name involving the following sub-strings:

  • kstc
  • nsdg

I should explore where emails to this name are coming from. I’d presume that they would be coming from a botnet.

Another odd thing is what normal names are high in the list. Sure, “alex” is way up there in many forms. And other legitimate names. But, “dennis”, “fleming”, “gutierrez”, and “garza”? What happened to “john”, “michael” and “smith”? Those are the big kahunas of names out there in the Interwebs.
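The tally behind a graph like the one above can be sketched in a few lines of Python. This is only an illustration: the sample addresses are made up, and counting by the part before the @ (ignoring case and domain) is an assumption about how the names were grouped.

```python
from collections import Counter

def count_by_name(addresses):
    """Tally emails per To: name, i.e. the part before the @, lower-cased."""
    return Counter(addr.split("@", 1)[0].lower() for addr in addresses)

def ranked_counts(counts):
    """Counts sorted from the most- to the least-mailed name.
    Plot these against rank with a log-scale y axis to get a semi-log graph."""
    return sorted(counts.values(), reverse=True)

# Made-up sample standing in for the real list of To: addresses.
sample = [
    "alex@tranzoa.com", "Alex@tranzoa.net", "dennis@tranzoa.com",
    "alex@tranzoa.com", "garza@tranzoa.net",
]
counts = count_by_name(sample)
singles = sum(1 for n in counts.values() if n == 1)

print(ranked_counts(counts))   # [3, 1, 1]
print(singles)                 # 2
```

On the real data, `singles` would be the roughly 10,000 one-email names and the head of `ranked_counts` would be the heavy hitters.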

No Surrender

Cover of one edition

“No Surrender – My Thirty-year War” was a top-notch, one-sitting book.

Hiroo Onoda was one of the last Japanese WWII holdouts. He spent 30 years on a small island in the Philippines, most of the time with companions, convinced that the war was still on and that he lived in a Truman Show world of disinformation.

When my explanations of why something I don’t want to believe can’t be true get long and self-dependent, I’ll try to remember this fascinating book.

Travel Honey GPS Watch from Chinavasion

Quick bottom line:

Chinavasion: Professional, responsive and web-competent.
Travel Honey Watch: Not much value. A bust.

The GlobalSat GH615 watch is falling apart.

I like the watch form-factor for GPS.

  • The wrist is a very handy place to carry a GPS unit.
  • A watch must necessarily be nicely small.
  • Synchronizing the camera’s clock for geo-tagging photos is easy. Take a picture of the watch’s GPS time. Then sync the picture in TrodTrack to the time on the watch. I do this for every hike or photo session. The camera’s clock loses a second every week or so.
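The arithmetic behind the photo-of-the-watch trick is just one subtraction and one addition. Here is a minimal sketch with made-up timestamps; it is not TrodTrack’s code, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

def clock_offset(gps_time_on_watch, camera_time_of_photo):
    """How far the camera clock is from GPS time (positive: camera runs slow)."""
    return gps_time_on_watch - camera_time_of_photo

def corrected(photo_time, offset):
    """Camera timestamp shifted onto GPS time, ready to match against a track."""
    return photo_time + offset

# Hypothetical readings: the watch face in the photo shows 10:15:42,
# while the camera stamped that same photo 10:15:37.
gps = datetime(2010, 9, 4, 10, 15, 42)
cam = datetime(2010, 9, 4, 10, 15, 37)
offset = clock_offset(gps, cam)

print(offset == timedelta(seconds=5))                       # True
print(corrected(datetime(2010, 9, 4, 11, 2, 10), offset))   # 2010-09-04 11:02:15
```

Every other photo from the session gets the same offset before it is matched to the nearest track point.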

What to do?

Turns out, GPS watches are rare. I’ll sadly share credit with GlobalSat for ruining the GH615’s excellent chances for success in the market. Garmin Forerunners can be had for a bit over $100 on eBay. Forerunners look kind of klunky, at best. And, new, they are overpriced.

So what to do?

A web search found Chinavasion and the unfortunately named Travel Honey watch.

Note to manufacturer: In the U.S., “honey” means either the tasty stuff that comes from bees or this. Your products don’t come from bees.

On the subject of names: Chinavasion? Guys, if my experience with you is representative, you’re on your way to the top. But, consider, what would your impression of a Japanese company named Japanvasion be?

It took two or three weeks to get the watch by mail from Hong Kong. No problem there. ‘Bout what you’d expect.

Out of box:

The shipping box perfectly fit around the product’s box. Wow!

Software

The included iTravel software is a finished product. Its Google Maps code is better than my TrodTrack code – faster and with a couple of nice spiffs. The track point editor is a nice thing. The UI layout looks good and well thought out.

It took a product key to get the program to talk to the watch. The key was not in the box so I got one from Chinavasion by on-line chat and email. Who knows whether it’s paid for. Anyway, it worked.

But I won’t be using the iTravel software except on the laptop while traveling. I have used an open source Linux program to pull the tracks off the watch. The watch protocol is documented and if I were to use the watch, I’d probably end up writing Python code to talk to it. But I won’t be using this watch as a primary GPS.

Watch

The watch is smaller than the GH615. That’s nice.

The time-keeping part of the watch, itself, is bargain basement. “Uselessly inaccurate” might be the most accurate description. And, since it’s not a GPS-time watch, it cannot be used to sync the camera time.

The GPS is provided by a SkyTraq Venus 6 GPS chip. In this watch the GPS is clearly inferior to the SirfIII in the GH-615. It loses its way in Northwest forests often and without fail.

This is a killer.

There are other problems. For instance, I have the GPS set for 1-second samples. It occasionally switches to 5-second samples and/or no sampling. The only way to get the watch back working is to reset the settings through the PC software.

So, this Travel Honey watch was a nice experiment. I’d wanted to see how another GPS chip matched up against the SirfIII. Now I know. It could be that the weakness of this GPS is in the small, watch packaging. But why chance it? I’ll probably get a normal GPS logger that uses the SirfIII chip. And, knowing me, I’ll probably end up using the GH-615 for another couple of years.

I’m inclined to get another gadget from Chinavasion. They (and many other outlets like them) certainly open a window in to another world. … so many gadgets at cut-rate prices of probable cut-rate quality.

This other world is interesting. During the 80’s and 90’s Taiwan cranked out a lot of PC boards and such-like in white boxes for low prices. One would have expected that the quality of such devices would be low. But that was not the case. Compared to the “name” brands, they were almost always:

  • Cheaper
  • Simpler to install and use
  • Higher quality
  • More powerful
  • Even with fractured English, often better (geekier) documented

My gut feeling is that these eleven-teen jillion Chinese gadgets are not like that. They give off an aura that matches the Travel Honey watch: cheap junk with a promising core. Think Japanese products from the 50’s and early 60’s.

Anyway, this evolutionary process will be fun to watch.

Ants have eaten my house

Well, this is what you get when you don’t take care of business.

Ant farm in the beam

Just a quick cleaning of that bad spot on the wall, I said to myself.

The drywall gave way in a little spot and thousands of ants were very surprised.

A couple of garbage bags and lots of shop-vac work with the old bag-less canister vacuum cleaner (I’m often glad I saved it.) and voilà:

Time to go to work

The kicker is that there’s a certain satisfaction in watching the little bugs desperately running around like Bond villain minions at the end of the movie. You can forget you’re destroying your own house and revel in just blowing things up. Yesss.

RainFilter gutter thingee

Picked up yet another gutter fixer thingee to try at Costco.

This one is called RainFilter.

RainFilter Package end

Installation was quick and easy. Just stash the foam strips in the gutter. Finishing off the odd-length end was easy, too, as the foam rips easily and accurately.

It hasn’t rained yet.

I am not hopeful for this gutter “solution”, though.

After I put the foam in the lower, garage roof, I swept the main, upper roof, sending lots of pine needles, etc. down to the garage roof. After resweeping the lower roof, here is what it looks like:

RainFilter gutter foam installed with pine needles.

Which shows that, after some rain and wind, one can expect that there will be a pretty nice layer of needles and leaves stuck on top of the foam. The foam is not slick, that’s for sure.

Anyway, we’ll see.

Two needs

Couple of needs from the Panama trip:

  1. Some kind of tiny, packable, cot thingee that can convert an uncomfortable airport seat in to a usable bed that “watches” your things.
  2. A wearable display to replace netbook/laptop/phone screens. The visual equivalent of an earbud, smaller and more robust than a normal screen, but with higher resolution.

Being away from the dual 1900×1280 screens is unpleasant.

And it would be pleasant to get some real sleep pending a flight on Godot Airlines.

Panama

Well, after nearly two weeks in Panama, I should have something to say.

Aaaaarrrrgggghhhhh. My bright red legs hurt!

But that’s not what I’ve found most interesting.

Most interesting is the Albrook Bus Terminal and Mall in Panama City.

Albrook is big, new, effective and popular.

Albrook blows sky high an impression of Latin America I got in the 70’s and 80’s. Where are the gentry – those rich people who fly above the rabble? They aren’t in Albrook. Albrook Mall is straight off the big, indoor mall production line. McDonald’s? Of course. Burger King? You bet. Multi-screen theater? Sure. And all those cloned stores that populate malls everywhere.

The differences: Albrook Mall seemed to have Christmas-level shopping going on. Crowded. And I don’t remember any anchor stores.

But cities? Pfffft. So I went to the San Blas Islands. These islands are on what you might call the north east coast of Panama. Apparently, that section of the country is operated relatively independently from the rest of Panama. It’s populated and run by the Kuna people – an indigenous group.

Though the Panama City Mamallena hostel aims their customers to Franklin Island, I routed myself to the 2nd choice, Robinson Island. It was an easy decision.

The other people on the island, mostly backpacker types, were quite nice and from scattered places. Canada, Switzerland, Italy, Auz, Kiwi-land, GB, Argentina, and EU.

I couldn’t help delighting in the guests’ occasional reverence toward the Kunas’ “traditional way of life”. One guy (to his credit, verified as an independent thinker) was especially keyed on the traditional-Kuna life, Luddite don’t-change-anything, the islands will soon be under the rising global warming sea, etc faith. I was kind and only asked some vaguely probing questions, letting the answers stand for themselves. Perhaps someday he’ll notice the contrast in time scales between two things he noted:

  1. Sinking islands – in a 50 to 100 year time frame, no less!
  2. Robinson Island had “changed so much” since he’d been there a year or two earlier. Yeah. Funny how dough coming in the door “changes everything”.

My off-hand, dinner comment that the local people might be making their living building bio-tech products in 50 years got some fairly blank stares. I didn’t even note that some of the Kuna might be living in Africa or Central Asia or Europe while doing so.

What I saw was an operation that competed with similar places in Fiji, Thailand and the Red Sea – to pick a globe-girdling set of warm places. The island I was on doesn’t have Internet yet, but does have cellular. A couple years from now there will be visitors who lament that “This area used to be so unspoiled. There wasn’t even Internet.”

Robinson Island rooms

Knowing what you want is so hard. Robinson Island had the essence of a greener-grass fantasy I’ve had for years: snorkeling when I want, at a moment’s notice, with no preparation needed. And nice weather. And even “volleyball.” Note the quotes. Soccerball. Medicineball. Whatever. It was fun and exhausting to satiation.

So, I’ve lived the dream.

In reality, I forgot swim trunks. My one pair of shorts were wet all day. And, well, out of the water, air conditioning would have been nice. Restricting swimming to morning and evening is no problem. But, to me, sun-bathing is right up there with watching the grass grow. So, the mid-day needs something good to do out of the sun.

It’s a pity that mid-day reading in a hammock under a shady palm beneath a cloudy sky didn’t work out well for my lower legs. Ouch, ouch, ouch. Can’t stand up without “some discomfort,” as they say. Like, it takes a minute of work to do so.

Scott, Sam and I took a Blue Bird bus down to Yaviza, at the end of the Pan-American Highway. The sense of the road I got was US in the 50’s. There are still (as in the 70’s and 80’s in parts of Latin America I visited) tire repair places at odd intervals. Our bus used one.

Tire change on Pan-Am Highway bus

The housing lagged that of the 50’s US. But the road’s painted lines were often of the modern yellow-middle/white-side type that California, but not Oregon, could afford in the 60’s. And there were cell towers all the way down the line. Say all the bad things I want about cellular companies, but the fact is they are having a profound, positive effect on the world.

The buses were without chickens and unmarked bags of food-stuff. And no one was on top.

Panama is a country with lots of construction – tall buildings in the city and Levittown types of developments north of the city. They are in a moving-forward stage. And it sure doesn’t look all driven by American retirees. I’m impressed.

Panama City building under construction

Though, as others note: the trash is sad. Trash everywhere. I can see why Singapore puts so much effort in to cleanliness. Your town can be broke, but if it’s clean, no one pities you, thank your stars.

Which brings us to Boquete, a town at 3500 feet that, as I write this at 9:30PM, is actually cold. Well, cool, anyway.

Boquete from above

Boquete has a gob of American retirees who have apparently run property values up and expect them to go higher. And the town has a number of hostels for backpacker types. Hippies even sell jewelry on the sidewalk. I’m not comfortable here. There is supposed to be hiking and a very long zip line. Both, I want to do, but with the legs and general tiredness it’s not clear what will happen here. I’m not unaware that I could switch out of the little dorm room here at Mamallena’s and into an up-scale hotel with a lot more comfort to ease the pain. Nah. I’ll hope to get some strength back tomorrow. Maybe rent a scooter and buzz around some. I did walk around today. Saw a rather unusual “garden”. The pictures won the race with this blog post to the Internet. The place, El Explorador, had lots of little thoughtful sayings scattered around. And whimsically painted rocks and other such odd-ball things. And a goat. And a swing.

El Explorador

So. A good walk of some miles.

Fun techie problem solved

How can data be split among three sites such that any two of the sites’ data is enough to reconstruct the original data?

This question led to an hours-long web search for key words I can never remember, “Reed-Solomon.” And a lot of useless information.

Time to think for myself.

Which was fun.

I figured out quickly an easy way to solve the problem is to store 2 bits in each of the three sites for every 3 original data bits. Of the 3 data bits, site A has bits 1 and 2. Site B has bits 2 and 3. And site C has bits 1 and 3. Any pair of sites has all 3 bits between them.
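That first, simple scheme is small enough to write out. Below is a minimal Python sketch of it, using lists of 0/1 values for clarity rather than packed bytes. The names are made up for illustration, and bit positions are 0-based, so "bits 1 and 2" above become positions 0 and 1.

```python
# Which of every 3 data bits each site keeps (0-based positions).
SITE_BITS = {"A": (0, 1), "B": (1, 2), "C": (0, 2)}

def split(bits):
    """bits: list of 0/1, length a multiple of 3. Returns each site's bit list."""
    sites = {name: [] for name in SITE_BITS}
    for i in range(0, len(bits), 3):
        group = bits[i:i + 3]
        for name, keep in SITE_BITS.items():
            sites[name].extend(group[k] for k in keep)
    return sites

def join(name1, data1, name2, data2):
    """Rebuild the original bits from any two sites' data."""
    out = []
    for i in range(0, len(data1), 2):
        group = [None] * 3
        for name, data in ((name1, data1), (name2, data2)):
            for k, bit in zip(SITE_BITS[name], data[i:i + 2]):
                group[k] = bit      # the shared position simply gets written twice
        out.extend(group)
    return out

data = [1, 0, 1, 0, 0, 1]
sites = split(data)
for a, b in (("A", "B"), ("A", "C"), ("B", "C")):
    assert join(a, sites[a], b, sites[b]) == data
print(sites)   # {'A': [1, 0, 0, 0], 'B': [0, 1, 0, 1], 'C': [1, 1, 0, 1]}
```

Each site holds 4 bits for these 6 data bits, so 12 stored bits in total: double the original, as noted.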

This didn’t look optimal – 6 bits stored for every 3 bits of data. Sure, it is only double the original data size, but the two-of-three logic doesn’t really buy any extra reliability over two mirrors of the data. Maybe some obfuscation of the data at each site, and maybe it’s harder for a bad guy to get his hands on two sites’ data than any one, but that’s it.

The other thing I figured quickly was that each site would need to store at least 3 bits for every 6 data bits – 9 stored bits, total. It would be asking a lot to recover 6 bits of data from, say, 5 bits of storage from a pair of sites. Or 100 data bits from 99 site bits. Or 1,000,000,000 data bits from 999,999,999 site bits. Yes. Very difficult.

But how to do better than 6 bits stored for every 3 data bits?

That stumped me. I didn’t like either the idea of using ungrokked, off-the-shelf code or trying to make sense out of the horrid, academic writings that explain how this sort of thing is officially, correctly done.

And when have I ever been interested in doing things the official way?

So.

So.

So, I figured:

  1. There would be some actual minimum at or above 50% of the data that each site would need to store. This minimum would probably be some odd transcendental meditation number depending upon pi to the e’th power or some such.
  2. The simple data recovery operation I already knew for each 3 bits of data has 4 bits from any pair of sites to work with – 2 from each of two sites. 2 of these 4 bits are unique and needed. The other 2 are dupes of each other. We know which bits they are, too, since we know which of the three sites the two recovery sites are! e.g. Sites A and B? Of the 3 data bits, A has bit 1, B has bit 3 and they both have bit 2. Sites B and C? Of the 3 data bits, B has bit 2, C has bit 1 and both have bit 3.
  3. The thing to do is to use half of each of those duped bits to replace one of those 2, unique data bits.

So, I changed to thinking of the original data being 6 bits rather than 3. No more half-bits. A half-bit is just too much to wrap the mind around. Instead, each site would need 3 or more bits for every 6 bits of data.

For instance, as above, when the data is recovered, each of a pair of sites can contribute 2 bits of unique, required data not at the other site. And each site can have 2 bits that dupe the other site’s. That’s 4 bits, total, at each site. 12 bits of storage for 6 data bits. The dupes are required because if either of a pair of sites didn’t have its copies of the duped bits, things would get difficult when that site is paired up with the third site – the one which does not have a copy of those particular bits.

OK. How to spread those duped bits’ information? How to trade those wasted bits for some of the information contained in the needed, unique 4 bits?

Parity combines parts of bits.

What is parity?

Add a bunch of data bits together. That is, count the ones in a list of ones and zeros. Is the count odd? The parity bit is 1. Even? The parity bit is 0. Effectively, some of the information in each of the bits is “stored” in the parity bit. But it’s all smeared together so you can’t reliably get the original data bits back from the parity bit. Parity loses information. In a way, the parity bit contains parts of each of the data bits. If a parity bit is computed from four data bits, then, in effect, the parity bit has a quarter of each of the four bits’ information. But, a “bit” is, by definition, the minimum quantity of data. So, Mr. Parity is one interesting fellow, what with having a part of each, indivisible data bit.
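In code, parity is a one-liner. This small Python sketch (function names made up here) shows both the plain version and the "parity of the bits a mask selects" version that the rest of the scheme leans on.

```python
def parity(bits):
    """1 if the count of ones is odd, else 0."""
    return sum(bits) % 2

def masked_parity(value, mask):
    """Parity of only those bits of value that mask selects."""
    return bin(value & mask).count("1") % 2

print(parity([1, 0, 1, 1]))                  # 1  (three ones)
print(parity([1, 0, 1, 0]))                  # 0  (two ones)
print(masked_parity(0b101101, 0b001111))     # 1  (low four bits are 1101)

# Parity loses information: different inputs, same parity bit.
print(parity([1, 1, 0, 0]) == parity([0, 0, 0, 0]))   # True
```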

But how to use parity?

Each site would store 3 parity bits for every 6 bits of original data. Combine the 3 parity bits from two sites and use the 6-bit result to look up the original 6-bit data value in a 64 element translate/lookup table.
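Here is a sketch of that lookup idea in Python. The parity masks are an assumption for illustration, not the ones from this post: they are about the simplest set that works (site A covers the low three data bits one at a time, site B the high three, and site C pairs them up).

```python
# Hypothetical masks: three per site, each selecting bits of a 6-bit value.
MASKS = {
    "A": (0b000001, 0b000010, 0b000100),
    "B": (0b001000, 0b010000, 0b100000),
    "C": (0b001001, 0b010010, 0b100100),
}

def masked_parity(value, mask):
    return bin(value & mask).count("1") % 2

def site_bits(value, site):
    """The 3 parity bits a site stores for one 6-bit value, packed into an int."""
    out = 0
    for i, mask in enumerate(MASKS[site]):
        out |= masked_parity(value, mask) << i
    return out

def build_table(site1, site2):
    """Map the combined 6 parity bits of a pair of sites back to the data value."""
    table = {}
    for value in range(64):
        key = (site_bits(value, site1) << 3) | site_bits(value, site2)
        table[key] = value
    return table

table_bc = build_table("B", "C")
value = 0b101101
key = (site_bits(value, "B") << 3) | site_bits(value, "C")
print(len(table_bc), table_bc[key] == value)   # 64 True
```

A table with all 64 keys filled means that pair of sites can recover every value. One table is needed per pair of sites, and the order of the two sites in the key matters.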

But, which bits of the original 6 bits of data would the parity bits each cover?

Since I didn’t know the answer to that question I just wrote down on paper parity masks in an arbitrary pattern. Each of the 9 parity masks defined a parity bit for a different combination of data bits. As an example, site A’s first bit was parity for data bits 1, 2, 3, and 4. Site A’s second parity bit was for data bits 3, 4, 5, and 6. Etc.

Problem: I can’t accurately calculate 9 parity bits for even one 6-bit value, let alone all 64. So, how am I to know if each of the 64 possible data values has a unique parity value for each of the three pairs of sites, AB, AC, and BC?

No problem: God invented computers to replace my clerical ineptitude.

And that’s when I got lucky.

I mistyped two parity bit mask definitions.

And ran the program against all 64 possible values that 6 bits of data can have.
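A check of that kind might look like the following Python sketch. It is not the original program. The mask sets are hypothetical, and "extra bits needed" is computed here as the number of bits it would take to tell apart the data values that land on the same combined parity bits.

```python
from collections import Counter
from itertools import combinations
from math import ceil, log2

def masked_parity(value, mask):
    return bin(value & mask).count("1") % 2

def site_bits(value, masks):
    """Pack one site's parity bits for a data value into an int."""
    out = 0
    for i, mask in enumerate(masks):
        out |= masked_parity(value, mask) << i
    return out

def extra_bits_needed(mask_sets, data_bits=6):
    """Per pair of sites: extra bits needed to disambiguate (0 = pair is good)."""
    report = {}
    for s1, s2 in combinations(sorted(mask_sets), 2):
        buckets = Counter(
            (site_bits(v, mask_sets[s1]), site_bits(v, mask_sets[s2]))
            for v in range(1 << data_bits)
        )
        report[s1 + s2] = ceil(log2(max(buckets.values())))
    return report

# Hypothetical mask sets, chosen only to show a pass and a failure.
GOOD = {
    "A": (0b000001, 0b000010, 0b000100),
    "B": (0b001000, 0b010000, 0b100000),
    "C": (0b001001, 0b010010, 0b100100),
}
BAD = dict(GOOD, C=(0b000001, 0b010010, 0b100100))  # C's first mask repeats A's

print(extra_bits_needed(GOOD))   # {'AB': 0, 'AC': 0, 'BC': 0}
print(extra_bits_needed(BAD))    # {'AB': 0, 'AC': 1, 'BC': 0}
```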

The program said that sites A and B together needed 1 extra bit to disambiguate all their shared parity values. Ditto, sites A and C. But B and C were good to go with just their parity bits! And, not only that, only site A needed the extra bit! It could be used for both the AB and AC combinations.

Cool!

That meant that site A could store 4 bits and sites B and C could each store 3 bits. 10 bits, total, to recover the original 6 bits. 12 bits was beat on the very first try.

Then I saw a typo.

One of the “parity” bits in site C was a “parity” bit of only one data bit! It did not have the full mask I’d drawn on paper.

Easy to fix.

Oooops. All three sites needed an extra bit.

Weird.

And, since the code was typed without a lot of thought, I spent a lot of time checking it out. It had been too easy to get the storage requirement down to 10 bits per 6 data bits. Something must be wrong.

Nope. In fact, when I fixed an “x” typo and flipped one of the bits in the mask the typo was in, the program declared that 9 backup bits did the trick. In two tries I was at the true minimum data storage requirement. And all I’d done was type badly.

Pleased, I was.

Very pleased.

Rest of the story:

Packing and unpacking 3/6 bit values? Yuck! I made the three sites have 4 parity bits each, covering 8 data bits. That way, the sites’ data would pack evenly in to bytes.

The program had to find the parity masks for the 12 parity bits. Too much work for me even to make the masks in some fancy pattern. After the bugs were out of the program (the key word being “after” – the start of “after” was a looooong time coming, what with my typing and all), my modern, personal super-computer took about a moment and a half to find a gob of perfect parity mask sets.
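A dumb search along those lines can be sketched as below. Random guessing with a fixed seed is an assumption made for this example, not necessarily how the original search worked, and all the names are hypothetical.

```python
import random
from itertools import combinations

DATA_BITS = 8               # one byte of data at a time
PARITY_BITS_PER_SITE = 4    # one nibble per site per data byte

def masked_parity(value, mask):
    return bin(value & mask).count("1") % 2

def site_nibble(value, masks):
    """Pack one site's 4 parity bits for a data byte into an int."""
    out = 0
    for i, mask in enumerate(masks):
        out |= masked_parity(value, mask) << i
    return out

def pair_is_unique(masks1, masks2):
    """True if the two sites' nibbles together identify every data byte."""
    seen = {(site_nibble(v, masks1), site_nibble(v, masks2))
            for v in range(1 << DATA_BITS)}
    return len(seen) == 1 << DATA_BITS

def works(mask_sets):
    """True if every pair of the three sites can recover the data."""
    return all(pair_is_unique(a, b) for a, b in combinations(mask_sets, 2))

def random_search(rng, max_tries=10000):
    """Try random non-zero masks until a working set of 3 x 4 masks turns up."""
    for _ in range(max_tries):
        candidate = [
            tuple(rng.randrange(1, 1 << DATA_BITS)
                  for _ in range(PARITY_BITS_PER_SITE))
            for _ in range(3)
        ]
        if works(candidate):
            return candidate
    return None

found = random_search(random.Random(0))
print([[format(m, "08b") for m in site] for site in found])
```

Working sets are common enough that blind guessing usually lands on one within a modest number of tries, which fits the "gob of perfect parity mask sets" observation.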

I fooled around for a while letting the program find parity masks with a minimum or maximum of data bits covered per parity bit. And finding parity masks that were as close to each parity bit covering half of the data bits as possible. And noting that some of the mask bits can be either one or zero and they still work fine.

All fun, but like the slide-out at the bottom of the hill, the thrill was over.

The program to split/join files is two_of_three.py.

It actually works.

Stand back and consider. Isn’t it kind of magical that any two parts of the data can be put together to get the whole? And the two parts are, together, no bigger than the whole.

Epilogue:

From an information point of view, notice the sneaky way that information about which site is which is the key to how this whole scheme works so simply. You can’t just take two sites’ 4-bit parity nibbles, make one of them the high nibble and one the low and then do the 256-entry table lookup to find the data byte. No, you have to know which of the sites is the high nibble and which the low. Certainly not a problem. Bad data is pretty easy to spot. Especially encrypted backup data. The program actually stores an extra data byte at the top of the data. The extra byte is used to identify the sites and weakly cross-check that the joining logic uses the same parity tables the splitting logic did. There’s a byte or two at the end of the data, too, to tell whether the file has an even or odd number of bytes in it. The parity nibbles are stored packed in bytes, so half the time there’s an extra nibble.
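The nibble packing can be sketched like this. The layout here (high nibble first, one trailing flag byte) is a hypothetical stand-in and not the actual two_of_three.py file format.

```python
def pack_nibbles(nibbles):
    """Pack 4-bit values two per byte, high nibble first.
    A trailing flag byte is 1 if the last low nibble is just padding."""
    odd = len(nibbles) % 2
    padded = list(nibbles) + [0] * odd
    body = bytes((hi << 4) | lo for hi, lo in zip(padded[0::2], padded[1::2]))
    return body + bytes([odd])

def unpack_nibbles(packed):
    """Inverse of pack_nibbles."""
    body, odd = packed[:-1], packed[-1]
    nibbles = []
    for byte in body:
        nibbles += [byte >> 4, byte & 0x0F]
    return nibbles[:-1] if odd else nibbles

packed = pack_nibbles([0xA, 0x3, 0xF])
print(packed.hex())             # a3f001
print(unpack_nibbles(packed))   # [10, 3, 15]
```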

Post epilogue:

It turns out that the logic can be done without using table lookup. That’s handy if the lookup tables would be too big for memory. The extra cost of going table-free is probably parity translations on the sites’ parity bits, themselves. That is, each original data bit would be a parity bit over a pre-calculated subset of the sites’ parity bits.
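One way to pre-calculate those subsets is to treat a pair of sites’ masks as a matrix of ones and zeros and invert it with XOR arithmetic. The Python sketch below does that with Gauss-Jordan elimination. It is an illustration under that assumption, using hypothetical 6-bit masks for a B and C pair, not code from two_of_three.py.

```python
def masked_parity(value, mask):
    return bin(value & mask).count("1") % 2

def invert_gf2(rows, n):
    """rows[i] is an n-bit mask; parity bit i = parity(data & rows[i]).
    Returns decode masks so that data bit j = parity(parity_word & result[j]),
    or None if the parity bits do not pin down the data."""
    # Tag each row with an identity bit sitting above the n data-bit columns.
    work = [row | (1 << (n + i)) for i, row in enumerate(rows)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if (work[r] >> col) & 1), None)
        if pivot is None:
            return None
        work[col], work[pivot] = work[pivot], work[col]
        for r in range(n):
            if r != col and (work[r] >> col) & 1:
                work[r] ^= work[col]
    # Row j now says: data bit j = XOR of the parity bits flagged in its high part.
    return [w >> n for w in work]

def encode(value, rows):
    return sum(masked_parity(value, m) << i for i, m in enumerate(rows))

def decode(parity_word, decode_masks):
    return sum(masked_parity(parity_word, m) << j
               for j, m in enumerate(decode_masks))

# Hypothetical masks: site B's three parity bits, then site C's three.
ROWS = [0b001000, 0b010000, 0b100000, 0b001001, 0b010010, 0b100100]
DEC = invert_gf2(ROWS, 6)

print(all(decode(encode(v, ROWS), DEC) == v for v in range(64)))   # True
```

The decode masks take the place of the lookup table: six small masks here instead of 64 entries.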

Such logic would nicely handle recovery from M sites out of N where M times N causes the lookup tables to be too big. The lookup tables are sized at 2 to the power of M times N entries. So, if M is 3 and N is 5, then the tables have 32k, 15-bit entries. When M*N gets up in the 20’s and 30’s, tables get big.
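The sizing rule is easy to play with. A tiny sketch (the function name is made up; it only restates "2 to the power of M times N"):

```python
def table_entries(m, n):
    """Lookup table entries when the lookup key is M times N bits wide."""
    return 2 ** (m * n)

print(table_entries(3, 5))    # 32768, the "32k" case
print(table_entries(2, 4))    # 256, matching an 8-bit key made of two nibbles
print(table_entries(5, 6))    # 1073741824, where tables stop being fun
```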

I’ve not yet found a way to quickly generate the parity masks for M and N sizes greater than 3/5 and 6/2. The search program is pretty dumb, though.