Net Connection Monitor

The net connection here was getting pretty flaky. Down for chunks of time in the night, day and in between.

Since our connection’s gears are hidden beneath a couple layers of organizations, it’s been frustrating to deal with the outages. It feels like Qwest / CenturyLink’s problem – the second-hand word is that they say there is only 1 wire pair into the house, so they probably are not wholly on top of things. But who knows?

So I wrote a little script to ping every 10 seconds an address randomly chosen from:

The script prints the exit value from the “ping” program (1 is bad) and the duration the pinging has been bad when a good ping follows any bad one.
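A minimal sketch of a monitor along these lines, in Python. This is not the original script: the target list is a placeholder, the `ping -c 1` flags assume a Linux or macOS `ping`, and the function names are made up for illustration. It prints one line when pings start failing and one line, with the outage duration, when they recover.

```python
import random
import subprocess
import time

# Placeholder targets; the original list is not shown in the post.
TARGETS = ["192.168.1.1", "8.8.8.8"]


def ping_once(address):
    """Return ping's exit status: 0 if a reply came back, non-zero otherwise."""
    # "-c 1" sends a single echo request on Linux and macOS (Windows uses "-n 1").
    result = subprocess.run(
        ["ping", "-c", "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def format_duration(seconds):
    """Format a number of seconds as HH:MM:SS."""
    seconds = int(seconds)
    return "%02d:%02d:%02d" % (seconds // 3600, seconds % 3600 // 60, seconds % 60)


def monitor(ping=ping_once, now=time.time, sleep=time.sleep, rounds=None):
    """Ping a random target every 10 seconds and report outages."""
    bad_since = None  # time of the first failed ping in the current outage
    count = 0
    while rounds is None or count < rounds:
        count += 1
        status = ping(random.choice(TARGETS))
        stamp = now()
        if status != 0 and bad_since is None:
            # First bad ping of an outage: remember when it started.
            bad_since = stamp
            print(status, time.ctime(stamp), sep="\t")
        elif status == 0 and bad_since is not None:
            # Good ping after bad ones: report how long it was bad.
            print(0, time.ctime(stamp), format_duration(stamp - bad_since), sep="\t")
            bad_since = None
        sleep(10)
```

Calling `monitor()` with no arguments runs until interrupted. The `ping`, `now`, `sleep` and `rounds` parameters exist only so the loop can be exercised without a network.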

Here is a day of activity:

1         Sat Oct 22 00:02:11 2011
0      Sat Oct 22 00:02:33 2011 00:00:22
1      Sat Oct 22 00:30:48 2011
0      Sat Oct 22 00:30:59 2011 00:00:11
1         Sat Oct 22 00:31:11 2011
0         Sat Oct 22 00:31:44 2011 00:00:32
1      Sat Oct 22 01:27:28 2011
0         Sat Oct 22 01:27:39 2011 00:00:10
1         Sat Oct 22 01:27:51 2011
0      Sat Oct 22 01:28:02 2011 00:00:10
1      Sat Oct 22 01:32:49 2011
0      Sat Oct 22 01:32:59 2011 00:00:10
1      Sat Oct 22 02:03:45 2011
0         Sat Oct 22 02:04:39 2011 00:00:54
1      Sat Oct 22 02:14:32 2011
0      Sat Oct 22 02:14:42 2011 00:00:10
1      Sat Oct 22 02:21:29 2011
0      Sat Oct 22 02:21:50 2011 00:00:21
1      Sat Oct 22 02:33:26 2011
0      Sat Oct 22 02:33:36 2011 00:00:10
1      Sat Oct 22 02:45:24 2011
0         Sat Oct 22 02:45:34 2011 00:00:10
1      Sat Oct 22 02:57:01 2011
0         Sat Oct 22 02:57:23 2011 00:00:21
1         Sat Oct 22 03:28:36 2011
0      Sat Oct 22 03:29:09 2011 00:00:32
1         Sat Oct 22 03:32:55 2011
0      Sat Oct 22 03:33:06 2011 00:00:10
1      Sat Oct 22 04:27:07 2011
0      Sat Oct 22 04:27:28 2011 00:00:21
1      Sat Oct 22 04:32:22 2011
0      Sat Oct 22 04:32:43 2011 00:00:21
1      Sat Oct 22 04:44:21 2011
0      Sat Oct 22 04:44:31 2011 00:00:10
1         Sat Oct 22 04:56:09 2011
0         Sat Oct 22 04:56:19 2011 00:00:10
1         Sat Oct 22 05:19:54 2011
0         Sat Oct 22 05:20:04 2011 00:00:10
1      Sat Oct 22 05:31:40 2011
0      Sat Oct 22 05:32:01 2011 00:00:21
1      Sat Oct 22 06:18:38 2011
0      Sat Oct 22 06:19:11 2011 00:00:32
1      Sat Oct 22 06:19:23 2011
0      Sat Oct 22 06:19:34 2011 00:00:10
1      Sat Oct 22 06:19:46 2011
0      Sat Oct 22 06:19:56 2011 00:00:10
1      Sat Oct 22 06:54:21 2011
0         Sat Oct 22 06:54:32 2011 00:00:10
1      Sat Oct 22 06:56:35 2011
0      Sat Oct 22 06:56:46 2011 00:00:10
1      Sat Oct 22 07:17:56 2011
0         Sat Oct 22 07:18:06 2011 00:00:10
1      Sat Oct 22 07:53:16 2011
0      Sat Oct 22 07:53:26 2011 00:00:10
1      Sat Oct 22 07:53:38 2011
0      Sat Oct 22 07:53:48 2011 00:00:10
1      Sat Oct 22 08:16:52 2011
0      Sat Oct 22 08:17:02 2011 00:00:10
1      Sat Oct 22 08:40:22 2011
0      Sat Oct 22 08:40:44 2011 00:00:21
1         Sat Oct 22 09:54:46 2011
0      Sat Oct 22 09:54:57 2011 00:00:10
1      Sat Oct 22 09:56:30 2011
0      Sat Oct 22 09:56:52 2011 00:00:21
1         Sat Oct 22 11:54:25 2011
0      Sat Oct 22 11:54:46 2011 00:00:21
1      Sat Oct 22 11:56:32 2011
0      Sat Oct 22 11:57:15 2011 00:00:43
1         Sat Oct 22 12:05:20 2011
0         Sat Oct 22 12:05:41 2011 00:00:21
1         Sat Oct 22 12:40:48 2011
0      Sat Oct 22 12:40:58 2011 00:00:10
1      Sat Oct 22 12:52:24 2011
0      Sat Oct 22 12:52:35 2011 00:00:10
1      Sat Oct 22 12:57:20 2011
0      Sat Oct 22 12:57:41 2011 00:00:21
1      Sat Oct 22 14:23:30 2011
0      Sat Oct 22 14:24:02 2011 00:00:32
1      Sat Oct 22 19:32:21 2011
0         Sat Oct 22 19:32:32 2011 00:00:10
1      Sat Oct 22 21:39:20 2011
0      Sat Oct 22 21:39:30 2011 00:00:10
1         Sat Oct 22 21:53:51 2011
0      Sat Oct 22 21:54:13 2011 00:00:21
1         Sat Oct 22 21:54:36 2011
0      Sat Oct 22 21:54:57 2011 00:00:21
1      Sat Oct 22 23:12:39 2011
0      Sat Oct 22 23:12:49 2011 00:00:10
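
A log like the one above can be totaled with a few lines of Python. This is only a sketch: it assumes every recovery line starts with `0` and ends with an `HH:MM:SS` duration, as in the listing, and `summarize` is a made-up name.

```python
def parse_duration(text):
    """Convert 'HH:MM:SS' into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def summarize(log_lines):
    """Return (outage_count, total_seconds, longest_seconds) for a log."""
    durations = []
    for line in log_lines:
        fields = line.split()
        # Recovery lines look like: 0 Sat Oct 22 00:02:33 2011 00:00:22
        if len(fields) == 7 and fields[0] == "0":
            durations.append(parse_duration(fields[6]))
    if not durations:
        return 0, 0, 0
    return len(durations), sum(durations), max(durations)


sample = [
    "1         Sat Oct 22 00:02:11 2011",
    "0      Sat Oct 22 00:02:33 2011 00:00:22",
    "1      Sat Oct 22 00:30:48 2011",
    "0      Sat Oct 22 00:30:59 2011 00:00:11",
]
print(summarize(sample))  # (2, 33, 22)
```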

Analysis of more data shows that, indeed, the gateway is more reliably pinged than the Speakeasy/Megapath DNS server, which is slightly more reliable than the Google DNS server. So, it’s the wire.

Meanwhile, the net connection currently is faster than it’s ever been. Almost a meg down and a half meg up. Whooo, hoo.

Back and forth translations

Starting with thoughts of how it would be nice to have a back and forth translator when writing an email to someone who speaks a different language…

You would normally write the email in your language, then crank it through a web translator to get the text for the other person.

Well, shouldn’t the translator show the text translated back from the target language to yours so you can check what the other guy will be reading?

Oddly enough, I didn’t find anything on the web to do that as you type. Weird.

But, then it gets interesting.

The guys developing translation software could use instances of people’s starting text and final text when people process their writings through a back-and-forth-erizer. Figure a person starts by saying what he wants to say in his language. Then, as he modifies what he writes so the translation is better, he’s effectively spelling out a way to translate from his language to his language. He’s showing you a meaning thesaurus, not just simple word substitutions.

(Gunning) Fog index: Wouldn’t the difflib comparison score between the original and the back-translated text be in some way consistent with fog indices? The translation software builders probably use something like this to evaluate their software. I would.
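
For what it’s worth, the difflib comparison score is a one-liner. A small sketch, comparing word by word; the function name and the sample sentences are invented, and no real translator is involved here.

```python
import difflib


def round_trip_score(original, back_translated):
    """Similarity in [0, 1] between a text and its back-translation."""
    # Compare word by word rather than character by character.
    matcher = difflib.SequenceMatcher(None, original.split(), back_translated.split())
    return matcher.ratio()


original = "the quick brown fox jumps over the lazy dog"
back = "the fast brown fox jumps over the lazy dog"
print(round(round_trip_score(original, back), 2))  # 0.89
```

A score of 1.0 means the text survived the round trip word for word; lower scores mean more of it changed along the way.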

Certain pairs of languages will translate back and forth better than other pairs. What does that mean?

  • The translation software is better for those two lingoes?
  • The cultures/people are closer?
  • What else?

Over time, what happens? Can the changes in the pair distances be used as a metric of how the world is becoming a global village? Can such changes be used in any way to understand cultural differences? Can translation software improvements be normalized out of the pair distances over time?

Presumably, the translation software guys are monitoring the pair distances between languages so that there are no instances of translations being better going through an intermediate language rather than going direct from one language to another. If such were ever the case then the thing to do would be to train the direct translator using the longer route translations. Doing such a process iteratively sounds like a pretty good way to bring a new language into the system. All the new language needs is a corpus of translations from it to one other language. Of course, this wouldn’t be a binary thing. The more effective pair corpora would be able to bootstrap the less effective links, generally.
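
The check for “better through an intermediate language” might look something like the sketch below. Everything in it is an assumption: the scores are made up, and treating the quality of a two-step route as the product of its two legs is just a crude stand-in for actually measuring it.

```python
def better_indirect_routes(quality):
    """Find language pairs where going through a third language scores higher.

    `quality` maps (source, target) to a score in [0, 1]. The score of a
    two-step route is taken as the product of its legs, which is an assumption.
    """
    languages = sorted({lang for pair in quality for lang in pair})
    found = []
    for (src, dst), direct in sorted(quality.items()):
        for mid in languages:
            if mid in (src, dst):
                continue
            legs = (quality.get((src, mid)), quality.get((mid, dst)))
            if None in legs:
                continue  # no data for one of the legs
            indirect = legs[0] * legs[1]
            if indirect > direct:
                found.append((src, mid, dst, direct, indirect))
    return found


# Made-up scores, for illustration only.
scores = {
    ("fi", "ja"): 0.40,
    ("fi", "en"): 0.80,
    ("en", "ja"): 0.70,
}
for src, mid, dst, direct, indirect in better_indirect_routes(scores):
    print(f"{src}->{dst}: direct {direct:.2f}, via {mid} {indirect:.2f}")
```

Any pair this reports would be a candidate for retraining the direct translator from the longer route.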

What are the implications of a world where people write using a language back-and-forth de-fogger? Does the writing end up bureaucratic? No personality. No sharp meaning. Vanilla.

Should textbooks be run through such a de-fogger? Should speeches? Especially in the education field, it seems important to get things across clearly.

Could using these back-and-forth techniques be used to build a new language? A better language? Could they be used to build a creole language?

If a language translation system built a creole language that’s close to an existing one, does that imply that the translation system understands the ingredient languages like a human?

Given net-available text, how much CPU does it take to build an effective language translation system?

Could back-and-forth translations be used to help translate old text into modern language? That is, keep modifying the old text until you get the best back-and-forth for your modified text. It would be interesting to automate this whole process. Proofreader, editor, re-writer system.
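
The “keep modifying until you get the best back-and-forth” loop could start as simply as picking the best of several candidate rewrites. A sketch under loud assumptions: `translate(text, src, dst)` is a hypothetical stand-in for whatever translation service is at hand, not a real API, and the language codes are arbitrary defaults.

```python
import difflib


def similarity(a, b):
    """Word-level similarity in [0, 1] using difflib."""
    return difflib.SequenceMatcher(None, a.split(), b.split()).ratio()


def best_rewrite(candidates, translate, source="en", target="fr"):
    """Pick the candidate whose back-translation is closest to itself.

    `translate(text, src, dst)` is a hypothetical callable supplied by the
    caller; nothing here talks to a real translation service.
    """
    def score(text):
        there = translate(text, source, target)
        back = translate(there, target, source)
        return similarity(text, back)

    return max(candidates, key=score)
```

An automated re-writer would generate the candidates itself and repeat the selection until the score stops improving.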

Good URL, but probably going away in December (returns JSON translation of the ‘q’ string):

Pulsing Web Page

Years ago, I daydreamed about a heads-up display for futuristic Instant Messaging. The display would show vital signs of the “buddy” you were monitoring. Such an intimate thing might help with communication.

Seemed like a good idea at the time.


Viewing this waveform from the CMS50E requires an HTML5-capable browser (Firefox / Safari / modern phones / WebKit / maybe IE9). I do not know how timely the waveform will show up from outside the house. It updates on localhost about once a second – roughly in time with the heartbeat beeping from the device.

Alex Pulse

The waveform just stays put if the “finger” is out. Who knows what will happen when the device is not connected to the PC that runs the server.

The “finger” is my left index toe.


The project got a little out of hand. The general idea is to have a generic server that makes it easy to stream line/point numbers out to arbitrary numbers of web clients to graph. I cut things short to get this thing online.

CMS50E Pulse Oximeter USB / Serial Protocol

I hate reverse engineering.

You cannot underestimate how little I care that bit 3 of byte 2 of an 11 byte, binary message indicates whether the left flange of the Acme, Incorporated Doohickee 3500-XL is up or down. It just does not matter.

I don’t care to know the details of how someone implemented a device or protocol or whatever. Those details don’t matter.

But, this CMS50E is out of the Far East, so talking to it through the serial/USB port requires reverse engineering.

And a strong stomach.

Now, the CMS50E has a 1-button user interface: Beautifully done. A work of art in design and implementation.

The protocol? … … … Otherwise.

So, here goes:

The device talks to a PC through a serial-to-USB conversion cable. The PC program sets 19200 baud, 8O1 (8 data bits, odd parity, 1 stop bit). Yes. Odd parity. And the PC program actually does a 4800/19200 dance at the top. Is this bug-clearing logic for the cable or device? Who knows.

The device sends a data stream to the PC when the “USB” menu item is “on”.

Any single byte sent to the device appears to turn USB streaming on. Perhaps, any voltage transition on the receive data line turns it on.

Let’s cover this first goof in the PC interface:

If the device is USB powered, then USB streaming should start and stop when the USB power goes on or off. Duh. And in any case, even if the device does not use USB power, a particular command message from the PC should turn streaming on – for a limited time so the battery isn’t drained by the streaming.

Poof. The menu item goes away.

USB powered devices should send identifying heartbeat messages in any case. This would allow a PC program to find the device by opening and only listening on the serial port. The heartbeat should include device identity information.

A menu choice tells the device to dump its recorded data.

Let’s cover this second goof:

A message from the PC should start the dump.

Poof. The menu item goes away.

Turns out, the PC program sends two 0xf5 bytes when it begins the data dump. And it sends three 0xf6 bytes when it has received the dump. I cannot find any reason the PC program does this. The only effect they seem to have is to turn on data streaming. Note: the displayed state of the “USB” option does not visually change until the menu choice is actively scrolled to. No big deal, but this is the reason I’ve not tested the effects of all 256 byte values sent to the device.

Streaming data format:

The streaming data is composed of 5-byte messages sent 60 times a second:

  1. Byte 1: 128 / 0x80 means the “finger” is not in the device. Ignore the other 4 bytes.

    If the high bit is not set then this is not the first byte of a message. Wait for a byte with the high bit set.

    0x0f bits: Outside documentation indicates this is a “signal strength for pulsate” value. I have recorded only values from 0 through 9. No recorded values from 10 through 15. In all cases of a zero value but two, the heart rate and SpO2 values have been zero, but the waveform value has been valid, though also often zero. The two anomalous cases had a spurious heart rate of 132.

    0x10 bit: Outside documentation indicates this bit means “searching too long” when set.

    0x20 bit: Outside documentation indicates this bit means “dropping of SpO2” when set.

    0x40 bit: is set when the device senses a heart “beat” – a peak in the waveform. This “beat” marker comes a few samples after the actual peak and seems to coincide with the beep sound the device makes. There are often two samples together with the beat markers.

  2. Byte 2: The waveform value. 0..127. The high bit is not set. If it is set (and the high bit is set on byte 1, of course), then this is not a streaming data message, but rather a recorded data dump message.

  3. Byte 3: High bit of heart rate and certain status bits.

    The 0x40 bit is the heart rate high bit – allowing heart rates of up to 255 BPM.

    0x0f bits: Apparent duplications of the top 4 bits of the waveform value. I tried to make sense of these bits. Were they a way to get at the instantaneous oxygen saturation? No luck so far. Outside documentation indicates that they are to be used for a bar-graph on a display. In any case the 0x08 bit is always zero as the 0x80 bit of the waveform data in byte 2 is always zero, too.

    0x10 bit: Outside documentation indicates it may be “probe error” if set.

    0x20 bit: Outside documentation indicates it may be “searching” if set.

    I have no instances of the 0x30 bits being set.

    0x80 bit: Must always be zero. Otherwise, this is not a regular sample.

  4. Byte 4: Heart rate: 0..127. The low 7 bits of the heart rate, that is.

    If the third byte is 0xf2 and the fourth byte has its high bit set, then they are the first two bytes of a recorded data dump.

    The heart rate appears to be a calculation on the time difference between the oldest and most recent “beat” in the last 30 seconds plus a few samples.

  5. Byte 5: Oxygen saturation percentage.

    This value seems to be a 30 second average of some sort. Anyway, it lags by 30 seconds.
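
The streaming layout above can be decoded roughly as follows. This is a sketch, not the code behind this post. The function and key names are invented. It assumes the signal strength sits in the low four bits (0x0f) of byte 1, since the high bit and the 0x10, 0x20 and 0x40 bits are all accounted for separately. It also ignores recorded data dump messages entirely.

```python
def frames(stream):
    """Yield 5-byte streaming messages from an iterable of byte values."""
    buf = bytearray()
    for byte in stream:
        if byte & 0x80:
            # A byte with the high bit set starts a new message.
            buf = bytearray([byte])
        elif buf:
            buf.append(byte)
            if len(buf) == 5:
                yield bytes(buf)
                buf = bytearray()
        # Bytes seen before any high-bit byte are skipped (resynchronizing).


def parse_sample(msg):
    """Decode one 5-byte streaming message into a dict."""
    if len(msg) != 5 or not msg[0] & 0x80:
        raise ValueError("expected 5 bytes starting with a high-bit byte")
    b1, b2, b3, b4, b5 = msg
    if b1 == 0x80:
        return {"finger_out": True}  # the other 4 bytes are to be ignored
    if (b2 | b3 | b4) & 0x80:
        raise ValueError("high bit set in bytes 2-4: not a regular sample")
    return {
        "finger_out": False,
        "signal_strength": b1 & 0x0F,          # assumed to be the low 4 bits
        "searching_too_long": bool(b1 & 0x10),
        "spo2_dropping": bool(b1 & 0x20),
        "beat": bool(b1 & 0x40),
        "waveform": b2,                        # 0..127
        "bar_graph": b3 & 0x0F,
        "probe_error": bool(b3 & 0x10),
        "searching": bool(b3 & 0x20),
        "heart_rate": ((b3 & 0x40) << 1) | (b4 & 0x7F),
        "spo2": b5,
    }


sample = parse_sample(bytes([0xC5, 0x40, 0x04, 0x48, 0x61]))
print(sample["beat"], sample["heart_rate"], sample["spo2"])  # True 72 97
```

Feeding `frames()` the bytes read from the serial port and passing each result to `parse_sample()` would give one dict per sample.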

Data dump format:

A recorded data dump is composed of 3-byte messages telling the heart rate and oxygen saturation level once a second.

The first two messages sent contain the HH:MM time value set by the user when the recording was started.

The third message sent tells how many bytes are in the full data dump.

Subsequent messages are the dump, itself.

Once started, the dump continues until finished. I have not tested the effect of pulling the USB connection during a dump.

The three message types:

  1. Time value (from the menu HH:MM time, set by the user when recording was started).

    Two of these messages are sent to start the data dump.

    They can be recognized by:

    (first_byte == 0xf2) and (second_byte & 0x80)

    1. Byte 1:

      0xf2, per the recognition test above.

    2. Byte 2:

      High bit is set.

      The 0x1f bits are the hours: 0..23.

    3. Byte 3:

      Minutes 0..59.

  2. The single message not starting with an 0xf2 value and following an 0xf2 message tells how many bytes of recorded data will be sent in the subsequent messages.

    The calculation is:

    ((first_byte & 0x3f) << 14) | ((second_byte & 0x7f) << 7) | third_byte

    Note: There appear to be bugs in the device which make this byte count subject to adjustments along the way. See the code for my current best guesses.

  3. Recorded data.
    1. Byte 1:

      0xf0 or 0xf1 (possibly 0xf2 and 0xf3, but I doubt it)

      The low bit (or two bits) are the high bit(s) of the heart rate.

      If this byte masked with 0xf0 is not 0xf0, then see the code. It gets gnarly.

      The device appears to be directly dumping its flash memory and the data seems to be organized on 256 byte page boundaries. 256 / 3 (3 being the message length) is not a whole number. So strange things happen 3 times every 256 data messages. It’s baffling why the engineer did things this way. But there it is. Perhaps extra information is encoded by special messages at these page boundaries, but it sure doesn’t look like it. The whole thing just looks incredibly sloppy. This feel of sloppiness is enhanced because there can be obvious glitches in the data and/or dumping during particular recording dumps. The glitches appear to be in memory rather than communications problems.

    2. Byte 2:

      Low 7 bits of the heart rate.

      The 0x80 or 0x180 bits – the high bit(s) – of the heart rate are in the first byte’s low bit(s).

      If byte 2 and byte 3 are both zero, then presumably the finger was out.

    3. Byte 3:

      The oxygen saturation percentage: ?..99. I have never seen 100. At the first two 256-byte boundaries in the data dump for each 256 samples, this value is 255.

      The third 256-byte boundary seems to yield a regular streaming data sample message with bongoed heart rate – or something.
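
The three dump message types can be decoded along these lines. Again a sketch with invented names, not the code referred to above. It uses only the low bit of byte 1 for the heart rate high bit, it reads “byte 2 is zero” as “the low 7 bits of byte 2 are zero”, and it makes no attempt at the page-boundary strangeness.

```python
def is_time_message(msg):
    """True for the 3-byte time messages that open a dump."""
    return len(msg) == 3 and msg[0] == 0xF2 and bool(msg[1] & 0x80)


def decode_time(msg):
    """Return (hours, minutes) from a time message."""
    return msg[1] & 0x1F, msg[2]


def decode_byte_count(msg):
    """Return the announced number of recorded-data bytes."""
    return ((msg[0] & 0x3F) << 14) | ((msg[1] & 0x7F) << 7) | msg[2]


def decode_record(msg):
    """Decode one recorded-data message, or return None if it looks irregular."""
    b1, b2, b3 = msg
    if (b1 & 0xF0) != 0xF0:
        return None  # page-boundary oddities are not handled in this sketch
    return {
        "heart_rate": ((b1 & 0x01) << 7) | (b2 & 0x7F),
        "spo2": b3,
        # Assumption: "zero" for byte 2 means its low 7 bits are zero.
        "finger_out": (b2 & 0x7F) == 0 and b3 == 0,
    }


print(decode_time(bytes([0xF2, 0x95, 30])))          # (21, 30)
print(decode_byte_count(bytes([0x01, 0x02, 0x10])))  # 16656
print(decode_record(bytes([0xF0, 0xC8, 97])))
```

A caller would need to test `is_time_message()` first, since a time message also starts with a byte in the 0xf0 range.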

There you have it. Gosh, I hope the engineer responsible for this can say, “Hey, whadya want? I had an hour to do it in!”