CMS50E Pulse Oximeter USB / Serial Protocol

I hate reverse engineering.

You cannot overestimate how little I care that bit 3 of byte 2 of an 11 byte, binary message indicates whether the left flange of the Acme, Incorporated Doohickee 3500-XL is up or down. It just does not matter.

I don’t care to know the details of how someone implemented a device or protocol or whatever. Those details don’t matter.

But, this CMS50E is out of the Far East, so talking to it through the serial/USB port requires reverse engineering.

And a strong stomach.

Now, the CMS50E has a 1-button user interface: Beautifully done. A work of art in design and implementation.

The protocol? … … … Otherwise.

So, here goes:

The device talks to a PC through a serial-to-USB conversion cable. The PC program sets 19.2 8O1: 19,200 baud, 8 data bits, odd parity, 1 stop bit. Yes. Odd parity. And the PC program actually does a 4800/19200 dance when it starts up. Is this bug-clearing logic for the cable or device? Who knows.

The device sends a data stream to the PC when the “USB” menu item is “on”.

Any single byte sent to the device appears to turn USB streaming on. Perhaps, any voltage transition on the receive data line turns it on.
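A minimal sketch of that port setup, assuming the third-party pyserial package. The names `SERIAL_SETTINGS`, `WAKE_BYTE` and `open_cms50` are made up for illustration and are not taken from tz_cms50.py. The 4800/19200 dance is left out because its purpose is unknown.

```python
# Sketch only: assumes the pyserial package is installed.
SERIAL_SETTINGS = {
    "baudrate": 19200,  # "19.2"
    "bytesize": 8,      # 8 data bits
    "parity": "O",      # odd parity (same value as serial.PARITY_ODD)
    "stopbits": 1,      # 1 stop bit
    "timeout": 1.0,     # read timeout in seconds; an arbitrary choice
}

# Any single byte appears to turn streaming on; 0xf5 is used here only
# because it is a byte the PC program is known to send.
WAKE_BYTE = b"\xf5"


def open_cms50(port_name):
    """Open the port at 19.2 8O1 and send one byte to start streaming."""
    import serial  # pyserial; imported here so the constants work without it

    port = serial.Serial(port_name, **SERIAL_SETTINGS)
    port.write(WAKE_BYTE)
    return port
```

Something like `open_cms50("/dev/ttyUSB0")` or `open_cms50("COM3")` (both port names are placeholders) should then give a port object to read the stream from.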

Let’s cover this first goof in the PC interface:

If the device is USB powered, then USB streaming should start and stop when the USB power is on or off. Duh. And in any case, even if the device does not use USB power, a particular command message from the PC should turn streaming on – for a limited time so the battery isn’t drained by the streaming.

Poof. The menu item goes away.

USB powered devices should send identifying heartbeat messages in any case. This would allow a PC program to find the device by opening and only listening on the serial port. The heartbeat should include device identity information.

A menu choice tells the device to dump its recorded data.

Let’s cover this second goof:

A message from the PC should start the dump.

Poof. The menu item goes away.

Turns out, the PC program sends two 0xf5 bytes when it begins the data dump. And it sends three 0xf6 bytes when it has received the dump. I cannot find any reason the PC program does this. The only effect they seem to have is to turn on data streaming. Note: the displayed state of the “USB” option does not visually change until the menu choice is actively scrolled to. No big deal, but this is the reason I’ve not tested the effects of all 256 byte values sent to the device.

Streaming data format:

The streaming data is composed of 5-byte messages sent 60 times a second:

  1. Byte 1: 128 / 0x80 means the “finger” is not in the device. Ignore the other 4 bytes.

    If the high bit is not set then this is not the first byte of a message. Wait for a byte with the high bit set.

    0x0f bits: Outside documentation indicates this is a “signal strength for pulsate” value. I have recorded only values from 0 through 9. No recorded values from 10 through 15. In all cases of a zero value but two, the heart rate and SpO2 values have been zero, but the waveform value has been valid, though also often zero. The two anomalous cases had a spurious heart rate of 132.

    0x10 bit: Outside documentation indicates this bit means “searching too long” when set.

    0x20 bit: Outside documentation indicates this bit means “dropping of SpO2” when set.

    0x40 bit: is set when the device senses a heart “beat” – a peak in the waveform. This “beat” marker comes a few samples after the actual peak and seems to coincide with the beep sound the device makes. There are often two samples together with the beat markers.

  2. Byte 2: The waveform value. 0..127. The high bit is not set. If it is set (and the high bit is set on byte 1, of course), then this is not a streaming data message, but rather a recorded data dump message.

  3. Byte 3: High bit of heart rate and certain status bits.

    The 0x40 bit is the heart rate high bit – allowing heart rates of up to 255 BPM.

    0x0f bits: Apparent duplications of the top 4 bits of the waveform value. I tried to make sense of these bits. Were they a way to get at the instantaneous oxygen saturation? No luck so far. Outside documentation indicates that they are to be used for a bar-graph on a display. In any case the 0x08 bit is always zero as the 0x80 bit of the waveform data in byte 2 is always zero, too.

    0x10 bit: Outside documentation indicates it may be “probe error” if set.

    0x20 bit: Outside documentation indicates it may be “searching” if set.

    I have no instances of the 0x30 bits being set.

    0x80 bit: Must always be zero. Otherwise, this is not a regular sample.

  4. Byte 4: Heart rate: 0..127. The low 7 bits of the heart rate, that is.

    If the third byte is 0xf2 and the fourth byte has its high bit set, then they are the first two bytes of a recorded data dump.

    The heart rate appears to be a calculation on the time difference between the oldest and most recent “beat” in the last 30 seconds plus a few samples.

  5. Byte 5: Oxygen saturation percentage.

    This value seems to be a 30 second average of some sort. Anyway, it lags by 30 seconds.
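The byte layout above can be sketched as a small decoder. This is an illustration under stated assumptions, not the tz_cms50.py code: the names `Sample`, `split_frames` and `decode_frame` are hypothetical, the signal strength is assumed to live in the low four bits of byte 1, and the status-bit meanings are the unverified ones from the outside documentation.

```python
from typing import Iterator, NamedTuple, Optional


class Sample(NamedTuple):
    signal_strength: int      # byte 1, 0x0f bits (assumed position)
    searching_too_long: bool  # byte 1, 0x10 bit (per outside documentation)
    spo2_dropping: bool       # byte 1, 0x20 bit (per outside documentation)
    beat: bool                # byte 1, 0x40 bit
    waveform: int             # byte 2, 0..127
    bar_graph: int            # byte 3, 0x0f bits
    probe_error: bool         # byte 3, 0x10 bit (per outside documentation)
    searching: bool           # byte 3, 0x20 bit (per outside documentation)
    heart_rate: int           # byte 3 0x40 bit plus byte 4 low 7 bits
    spo2: int                 # byte 5


def split_frames(data: bytes) -> Iterator[bytes]:
    """Yield 5-byte streaming messages, resyncing on the high bit.

    A message starts at a byte with the high bit set and is followed by
    four bytes with the high bit clear. Anything else is skipped one
    byte at a time, which also skips recorded-data dump messages.
    """
    i = 0
    while i + 5 <= len(data):
        frame = data[i:i + 5]
        if frame[0] & 0x80 and not any(b & 0x80 for b in frame[1:]):
            yield frame
            i += 5
        else:
            i += 1


def decode_frame(frame: bytes) -> Optional[Sample]:
    """Decode one streaming message, or return None if the finger is out."""
    b1, b2, b3, b4, b5 = frame
    if b1 == 0x80:
        return None  # finger not in the device: ignore the other 4 bytes
    return Sample(
        signal_strength=b1 & 0x0F,
        searching_too_long=bool(b1 & 0x10),
        spo2_dropping=bool(b1 & 0x20),
        beat=bool(b1 & 0x40),
        waveform=b2 & 0x7F,
        bar_graph=b3 & 0x0F,
        probe_error=bool(b3 & 0x10),
        searching=bool(b3 & 0x20),
        heart_rate=((b3 & 0x40) << 1) | (b4 & 0x7F),
        spo2=b5,
    )
```

Feeding the bytes read from the port through `split_frames` and then `decode_frame` gives one `Sample` (or `None`) per message, roughly 60 a second.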

Data dump format:

A recorded data dump is composed of 3-byte messages telling the heart rate and oxygen saturation level once a second.

The first two messages sent contain the HH:MM time value set by the user when the recording was started.

The third message sent tells how many bytes are in the full data dump.

Subsequent messages are the dump, itself.

Once started, the dump continues until finished. I have not tested the effect of pulling the USB connection during a dump.

The three message types:

  1. Time value (from the menu HH:MM time, set by the user when recording was started).

    Two of these messages are sent to start the data dump.

    They can be recognized by:

    (first_byte == 0xf2) and (second_byte & 0x80)

    1. Byte 1:

      0xf2

    2. Byte 2:

      High bit is set.

      The 0x1f bits are the hours: 0..23.

    3. Byte 3:

      Minutes 0..59.

  2. The single message not starting with an 0xf2 value and following an 0xf2 message tells how many bytes of recorded data will be sent in the subsequent messages.

    The calculation is:

    ((first_byte & 0x3f) << 14) | ((second_byte & 0x7f) << 7) | third_byte

    Note: There appear to be bugs in the device which make this byte count subject to adjustments along the way. See the code for my current best guesses.

  3. Recorded data.
    1. Byte 1:

      0xf0 or 0xf1 (possibly 0xf2 and 0xf3, but I doubt it)

      The low bit (or two bits) are the high bit(s) of the heartrate.

      If this byte masked with 0xf0 is not 0xf0, then see the code. It gets gnarly.

      The device appears to be directly dumping its flash memory and the data seems to be organized on 256 byte page boundaries. 256 / 3 (3 being the message length) is not a whole number, so messages straddle the page boundaries; 256 messages come to 768 bytes, which is exactly three pages. So strange things happen 3 times every 256 data messages. It’s baffling why the engineer did things this way. But there it is. Perhaps extra information is encoded by special messages at these page boundaries, but it sure doesn’t look like it. The whole thing just looks incredibly sloppy. This feel of sloppiness is enhanced because there can be obvious glitches in the data and/or dumping during particular recording dumps. The glitches appear to be in memory rather than communications problems.

    2. Byte 2:

      Low 7 bits of the heart rate.

      The 0x80 or 0x180 bits – the high bit(s) – of the heart rate are in the first byte’s low bit(s).

      If byte 2 and byte 3 are both zero, then presumably the finger was out.

    3. Byte 3:

      The oxygen saturation percentage: ?..99. I have never seen 100. At the first two 256-byte boundaries in the data dump for each 256 samples, this value is 255.

      The third 256-byte boundary seems to yield a regular streaming data sample message with bongoed heartrate – or something.
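The three dump message types can be sketched as follows. This is a simplified illustration, not the tz_cms50.py logic: all function names are hypothetical, only the low bit of byte 1 is used as the heart rate high bit, the byte count is returned unadjusted, and the page-boundary oddities are simply skipped rather than repaired.

```python
from typing import List, NamedTuple, Tuple


class DumpSample(NamedTuple):
    heart_rate: int
    spo2: int
    finger_out: bool


def is_time_message(msg: bytes) -> bool:
    """Time messages start with 0xf2 and have the high bit set in byte 2."""
    return msg[0] == 0xF2 and bool(msg[1] & 0x80)


def decode_time(msg: bytes) -> Tuple[int, int]:
    """Return (hours, minutes) from a time message."""
    return msg[1] & 0x1F, msg[2]


def decode_byte_count(msg: bytes) -> int:
    """Number of recorded-data bytes announced by the device."""
    return ((msg[0] & 0x3F) << 14) | ((msg[1] & 0x7F) << 7) | msg[2]


def decode_sample(msg: bytes) -> DumpSample:
    """Decode one recorded sample (one per second of recording)."""
    heart_rate = ((msg[0] & 0x01) << 7) | (msg[1] & 0x7F)
    return DumpSample(heart_rate, msg[2], msg[1] == 0 and msg[2] == 0)


def parse_dump(data: bytes) -> Tuple[Tuple[int, int], int, List[DumpSample]]:
    """Split a dump into start time, announced byte count and samples."""
    messages = [data[i:i + 3] for i in range(0, len(data) - 2, 3)]
    pos = 0
    start_time = (0, 0)
    # Leading time messages (two are expected).
    while pos < len(messages) and is_time_message(messages[pos]):
        start_time = decode_time(messages[pos])
        pos += 1
    if pos >= len(messages):
        raise ValueError("dump ended before the byte count message")
    byte_count = decode_byte_count(messages[pos])
    pos += 1
    samples = []
    for msg in messages[pos:]:
        # Skip messages that do not look like regular samples, such as
        # the 255 SpO2 markers and other page-boundary strangeness.
        if (msg[0] & 0xF0) == 0xF0 and msg[2] != 255:
            samples.append(decode_sample(msg))
    return start_time, byte_count, samples
```

Skipping the odd messages loses their one-second slots, so anything that needs accurate timestamps would have to account for them instead.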

There you have it. Gosh, I hope the engineer responsible for this can say, “Hey, whadya want? I had an hour to do it in!”

36 thoughts on “CMS50E Pulse Oximeter USB / Serial Protocol”

  1. Hi, I just bought a CMS50E and did a few experiments to understand the streaming serial protocol. It seems that the beat marker is stored in byte 1 and the waveform in byte 2… that seems different from yours.
    My unit is a ver 6.6R, and yours?

  2. Mine is also 6.6R.

    But the problem was in my description – a paragraph in the wrong place – fixed now.

    I also have the beat marker in byte 1 and waveform in byte 2.

    Thanks for the catch!

  3. Thank you…
    Have you ever tested the oximeter with a cable different from the original one? It seems to work with the original cable only!
    Another question… I capture the streaming data with a software serial data logger set at 19.2 8O1, but sometimes the captured data makes no sense while other times I catch the right data. Have you ever experienced this problem??

  4. I suspect that their cable contains the RS-232 to USB converter (a CP2101 or CP2102 chip) in the end of the cable that plugs in to the PC. That is, I suspect that the mini-USB that plugs in to the device is actually at RS-232 levels/protocol. I originally saw the mini-USB on the device end so I did not unpack their cable, but rather used a generic USB cable. It did not work.

    No trouble with 19.2 8O1. Nor 8N1. Any chance you are opening the port in two programs at once? (Depends on OS whether you can do this.)

    If you can run Python and have a situation where there’s garbage, I’d be interested if my program recovers from the situation OK.

  5. Hi again,
    I downloaded your tz_cms50.py updated on October 14, 2011.
    I really don’t know Python but I will try to install it and try your program.
    How should I set the right COM port number?

    About the cable, I have the same suspicion!

    I don’t understand English very well (I’m Italian) and reading your posts I can’t tell whether you were able to get a “low” SpO2. I tried in several ways, but nothing…

    As soon as I am able to run the program I’ll let you know the results!

  6. You might just download the whole tzpython to a new directory you make for it. tzpython is in a zip file at:

    https://www.tranzoa.net/tzpython/tz_python.zip

    That way, all the other of my Python scripts that tz_cms50.py uses will be there.

    But, the tz_cms50.py program needs at least one external library that’s not part of regular Python – the serial port handler at:

    https://sourceforge.net/projects/pyserial/

    pyserial is available for installation on any Ubuntu Linux system through the package manager.

    I think tz_cms50.py looks for the USB serial port and should find it on both Linux and Windows systems (though, I admit that earlier tonight I was breaking that logic. It should be OK now.). It finds the port by looking for the vendor ID and product ID of the CMS50’s USB/serial converter device.

    I did get below 90% SpO2 level, by blowing out as much as I could, then holding my breath for 40 seconds or more. Then, about 30 seconds after I started breathing again, the SpO2 level started going down below 90%.

    There is a 30 second delay in the device for both SpO2 and for heart rate. That delay makes for smooth results, but it certainly is not responsive.

  7. Hi again…
    I just solved the installation of win32api and serial for Python… I tried to run tz_cms50.py but I got an error like this:

    import pygooglechart
    ImportError: No module named pygooglechart

    any ideas??

  8. I just uploaded a new version of tz_cms50.py that can handle there being no pygooglechart.py. Also, it has bugs fixed in the graphing logic that I had put in the code since the last time I actually ran tz_cms50.py directly. 🙂 I run:

    python tz_pulse_server.py –title “Alex’s Current Pulse” –spoof “oxi_*” oxi

    which is a web server that lets a modern, HTML5 browser look at the current waveform (or look at historical data if the finger is not in) at:

    https://www.tranzoa.net/alex_pulse/

    My copy of pygooglechart.py is probably old, so I don’t know whether the tz_cms50.py code will work with the latest versions of pygooglechart from the net. Anyway, Google for it to find it. Or, you might be able to give the command line “easy_install pygooglechart” to install it.

  9. Thank you very much, I will try to get it work…
    I tried again to get the stream data directly with my favorite programming language (Matlab). Now it seems to work… maybe it was a problem in the other serial port logger program…
    I have done very few experiments, but I think that the still-unexplained bits in the first byte are linked to the strength of the raw waveform value. I think they are used to introduce an offset or multiplying factor so the waveform is displayed without slow fluctuations in baseline, amplitude and mean value. I’m not sure I explained myself very well in English… I hope you can understand what I mean…

    I will let you know my further experiments if you like!

  10. I wondered, too, whether the low bits of the first byte were some kind of offset or something related to the waveform. The waveform auto-scales, after all. But, after looking at a *lot* of those bits, it just did not seem to be the case.

    Also, since 3 other bits seemed to be duplicates of 3 of the waveform bits, I wondered whether all these unexplained bits could be information that could be used to derive the SpO2 level. As it is, the waveform data does not appear to be usable for calculating the SpO2 level. But, I could not figure out any way to correlate the unexplained bits with the SpO2 percentage from 30 seconds later.

  11. mmm… we have to investigate these bits more deeply! 😉
    About the SpO2 calculation, I don’t think that we can infer anything more than the direct SpO2 % given in byte 5…
    …The SpO2 has to be calculated from the waveform values for the red and IR LEDs and we don’t have that data. The only thing that we can try is to voluntarily reduce SpO2 and check whether the 3 duplicate bits are really duplicates… maybe at “low” SpO2 they differ… 🙂
    About the delay of 30 seconds… Could it be a “real” delay given by the time needed for the deoxygenated blood to reach the finger???

  12. tz_cms50.py has a commented out print statement on line 517 that prints if those 3 bits are ever different from each other. I’ve done a lot of testing to try to make them different, but never had a printout (when the statement was uncommented). Very depressing. I had hoped that the extra bits would reflect a value from one of the LEDs or a value that could be subtracted from the waveform values to get further information. Or whatever.

    I don’t think the 30 seconds is circulation delay.

    First reason: I’ve gone in to waveform data and seen pretty close to how they are calculating the heart rate from the beat markers. (I believe that the calculation is pretty simple, but that there is filtering on the results to compensate for weird waveforms.)

    Second: The delay is 30 seconds on both me and another person. That person has a body type and size quite different from mine.

    Third: A device that is my “day job” to write the software for says that the full time of circulation throughout the body is rarely anywhere near 30 seconds.

    Fourth: I did some timing comparisons between the waveform from a finger and from a toe. They were a lot closer in time than I expected. Fraction of a second. I expected the toe to lag an easily measurable amount of time.

    Fifth: Vague memory says that the 30 seconds is mentioned somewhere in the documentation or sales literature for the device.

    Hey, maybe some time an engineer from Contec or a supplier for them will see this thread and tell us! 🙂 It would sure be nice if the USB dump had the raw data.

  13. Hi again…. to be sure of what we are speaking of (sorry again for my poor english comprehension!)…
    I was analyzing the lowest 3 bits of the first byte…
    For example, in one of my few logs, they start as 010; after a while they become 011 for about 7 seconds. After that they go back to 010.
    These changes occur at the same moments as certain “beat” bits (0x40).

    I also displayed the corresponding waveform and noticed a “fast” change in amplitude.

    With 010 the amplitude was “naturally” increasing, arriving at a peak waveform value of more than 100/127. After that the change to 011 occurred and the next waveform peak only reached a raw value of 70/127 (not a possible natural change). After some seconds the waveform peak reached a “minimum” of about 50/127. At this time the change back to 010 occurred and, unnaturally, the next waveform peak increased to about 80!!!

    It is also interesting to notice that the changes between 010 and 011 also correspond to abrupt (from one time sample to the next) changes in the waveform curve, as if a multiplying factor were applied…
    It seems that if the instrument observes a peak value increasing or reaching a threshold, it divides the following data by a factor to avoid going over the limit of 127/127. The same thing seems to happen when the signal goes too low…

  14. …WOW…
    I treated these bits as a multiplying factor for the 7 bits of the waveform in byte 2… now the waveform is much “smoother” over time… both in the peak-to-peak intensity changes and in the sample-to-sample changes when the 3 bits change…
    I noticed that I had to shift (delay) the application of the multiplication coefficient by one sample…

  15. You may have the right idea.

    It does look like they only change those 3 low bits at the time of the beat.

    They tend to go down in value when the average for recent beat samples goes lower than some variable level. And they tend to go up when the average of recent beat samples goes above some variable level.

    Too, I notice that the low 3 bits sometimes go down just a few samples before the waveform hits zero for a run of samples. And, the low 3 bits sometimes go up a similar number of samples after the waveform hits near 127 for a run of samples.

    I’ll have to try that multiply idea. I had rejected it because it seemed like it would be too powerful – have too much effect on the waveform values.

  16. Well, just multiplying the waveform value by the low 3 bits seems to make low amplitude waves flatten out. So maybe that’s not what you are doing. 🙂

  17. I do the following as example:
    If the 3 bits are 010, I multiply the next waveform value by 4; if the 3 bits are 011, I multiply the next waveform value by 5.
    I’m not sure whether the delay has to be applied both when the 3-bit multiplier increases and when it decreases, but I will study this.
    But what do you mean by “low amplitude waves flatten out”??

    I also sometimes noticed the 5th bit in byte 1. It is surely not related to the multiplier… it seems to appear just before a “beat” bit if the previous “beats” have been lost…

  18. …another thing… I think that the first 4 bits (not the first 3 bits) of byte 4 are an approximation of the waveform…

  19. My bad. Yes. 4 bits, not 3, are duplicates of the high 4 bits of the 7-bit waveform value.

    With respect to the multiplying, would the multipliers be:

    000 multiply by 2
    001 multiply by 3
    010 multiply by 4
    011 multiply by 5
    100 multiply by 6
    101 multiply by 7
    110 multiply by 8
    111 multiply by 9

    All the 000 samples I have recorded have 001000 or 010000 in the low 6 bits of byte 1. And with only two exceptions, the 010000 samples all have heart rates and SpO2 values of zero. 000, 001, 110 and 111 samples are relatively rare in my recorded data.

    “Flatten out”. The tz_cms50.py printout scales the waveform star characters’ positions relative to the highest and lowest waveform values seen. After a bit, when the waveform does not have high amplitude waves – is relatively flat – the flatness is enhanced in the tz_cms50.py printout. That is, it’s really, really flat. Always the same “Y” value on-screen. That printout only has 100 vertical “pixels”, but it doesn’t feel right.

  20. About byte 4… it is very interesting to notice that the 4 bits are not a real duplication of the 4 high bits of the waveform but an approximation that takes into account the value of the lowest 3 bits of the waveform… It would be very nice to understand WHY they are in the stream!

    About the multiplier… I don’t understand whether you agree that these bits are a multiplier or not… and why do you suspect that we have to add 2 to the 3-bit value so that 010 means 4 and not 2?

    About the “Flatten out”… I’m still not able to run your code in Python, so it is difficult to understand the problem. With Matlab plots I don’t have problems. The only thing I had to take care of is converting the 8-bit integer value to a double precision value before doing the multiplication, so that non-integer results are not rounded to an integer.

  21. I sure wondered about those 4, duplicate bits, too. No idea why they are there, though the output from the device is clearly not a priority for the device. (I say that because the calculations and screen updates are done instead of output if there is not enough CPU to do everything. Which happens at times.) Anyway, I’ve modified tz_pulse_server.py (which is the web server that serves http://www.tranzoa.net/alex_pulse) to print on my screen something if those 4 bits are ever *not* a duplicate of the top bits of the waveform.

    Multiplier: My mistake. I thought from your earlier comment that you had multiplied the values with a +2 multiplier. Certainly, those 3 bits can’t be a multiplier when they are zero! 🙂

    What’s the error from Python now? (You can email it directly to me.) I’d hoped to have fixed it so you could run tz_cms50.py without pygooglechart.

  22. I tried this calculation in the spreadsheet ( =B4 * (D4 + 1) / 4.5 ) to avoid zeroing the waveform values when the “signal strength” value is zero.

    or, in English:

    waveform_value = byte 2
    signal_strength = byte 1, low 4 bits
    y = (waveform_value * (signal_strength + 1)) / 4.5

    The glitches that are smoothed in the spreadsheet are smoothed a bit less than using the (B4*D4)/4 calculation, but are still smoothed.

    I do not really prefer the waveform display using this */4.5 calculation, though, to be honest.

  23. thank you for posting this information and for your hard work. I know it has been a while, however, I wanted to let you know I came across a data spec sheet for the CMS50 Oximeter and would be happy to share it if you contact me by email off line.

  24. Just letting you know that I purchased a CMS-50E in June 2014. The protocol has changed, the device now sends groups of nine bytes many times a second. There is a handshake between the application and device before bytes are sent. The real time data does not look that hard to decipher (have not done so yet) but replicating the handshake to start the data is proving harder. The device says “1.4W” during the start-up sequence. Happy to receive advice from anyone.

  25. Hi. I am the developer behind the OpenSource software project SleepyHead, which supports importing from Contec CMS50D+, E and F models, prior to firmware 3.7.

    I’m looking for serial interception logs or any shred of information at all on any of the newer oximeters, including the CMS50I.

    I’ve been analysing the raw data streams for older models for quite a while now and have come to the conclusion the devices don’t transmit length codes, it appears to be a corruption of the live streaming, breaking through record headers, because of a faulty buffer flush in the 50E firmware. (My 50F does not display this behaviour)

    My (C++) code isn’t by any means exemplary, but it does work reliably since my last CMS50 oximetry module commit for these earlier (pre 3.7) models. (It’s on sourceforge if anyone is interested.)

    I did find a data sheet describing the CMS60 series protocols.

  26. Hi!!

    I’m an electronic engineer and I’m trying to hack this CMS50D+ pulse oximeter. First I want to say this info you posted is the most accurate I’ve found about this device, nice job.

    I’ve read thanks to a sniffer all packages you talk about even the 0xF5(x2) and 0xF6(x3) before and after data dump occurs. Here comes my problem. I can’t get data from device when I manually send these 0xF5 to the device.

    I’ve tried it using my own board Atmega based and Arduino Mega and device does not respond to my request. I wish I knew python but I have no idea so I don’t know how your code works. Were you able to get the data from the device by sending this 0xF5? Did you find out any other way to do it?

    Thanks in advance and hope you can give me a helping hand!!

  27. Looking at the (long forgotten by me) code, I see it starts the serial port at 19.2, 8 bits, odd parity, 1 stop bit. Then it sends an 0xF5 to the device.

    If sending the 0xF5 doesn’t get the thing talking, experience says the problem is either the cable or the serial port configuration.

    “It’s always the cable.” is the old mantra to explain why two machines don’t talk to each other. If your device has a USB-serial cable like mine, the cable is presumably OK. USB I/O may go through logic somewhere out of your control so that may be a problem.

    With serial ports, if the cable is OK, then it’s “always” the speed and framing. This thing uses odd parity. Unusual, that. So, speed or parity would be my guess of what’s actually the problem.

    Good luck!

  28. I would expect it would have pretty much the same protocol as the older devices. But you never know.

    So, let’s check the source code I use: CMS 50 Python code

    It looks like a hex F5 is what’s sent to the device at the start.

    19200 8 bit, odd parity, 1 stop bit. It’s that odd parity that is weird.

  29. Thanks. Can you write me the list of available commands? I cannot find any info on the internet. Maybe you have a manual for it?!
