linkmeter WNDR3700 progress and questions


 

Dave Casazza

Linkmeter Progress...

Two blinks on green led
No Pit Probe shown on LCD Display
On configuration page, only reboot AVR seems to work - I see window that shows Rebooting AVR ... OK
HeaterMeter Version shows blank - ugh?
Update URL link seems not to be active - ugh?

Question:
How do I know that Heatermeter is communicating to the Linkmeter board? The home page shows a grid of 0-100%, and I can reboot the AVR... is there a more positive way of confirming the link?

When I connect probes, should the Heatermeter Version show and Update URL links become active? Or is there another problem I should investigate?

Thanks,

Dave
 
Yup it sounds like there's no serial communication to the HeaterMeter board. The version number and all the configuration information should be filled if the linkmeterd/lucid are running and communicating to the board. Make sure nothing else is using the serial port using the proc filesystem:
Code:
find /proc -type l | xargs ls -l 2> /dev/null | grep /dev/ttyS1
(or ttyS0 if that's what port you're using). The number shown in the path should only match the lucid process.

If you're using ttyS0 then make sure you've edited /etc/config/lucid to point to that device as well.

You can also stop the lucid process with `/etc/init.d/lucid stop` and then cat the serial device with `cat /dev/ttyS1`. When you hit the reset button on the HeaterMeter board, you'll see the initialization output, like "$UCID,HeaterMeter,[version number]".

The reset button only sends the reset command and reports whether linkmeterd got it, not whether the HeaterMeter responded (there's actually no response to the reset command either).
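
The /proc check above can also be written as a small helper that prints only the PIDs holding the device open. This is a minimal sketch, not part of linkmeter: the function name `holders_of` and its optional second argument (an alternate proc root, there only so it can be tried against a fake directory) are made up for illustration, and it assumes `readlink` is available on the router.

```shell
#!/bin/sh
# Sketch: print the PIDs that have a given device open, by reading the
# /proc/<pid>/fd symlinks. The second argument (proc root) is a
# hypothetical override for trying this against a fake directory tree.
holders_of() {
    dev="$1"
    proc_root="${2:-/proc}"
    for fd in "$proc_root"/[0-9]*/fd/*; do
        # Skip anything that is not a symlink (including an unmatched glob)
        [ -L "$fd" ] || continue
        if [ "$(readlink "$fd")" = "$dev" ]; then
            pid="${fd#"$proc_root"/}"   # "123/fd/4"
            pid="${pid%%/*}"            # "123"
            echo "$pid"
        fi
    done | sort -un
}
```

Running `holders_of /dev/ttyS1` should print one PID when only lucid has the port. `cat /proc/<pid>/cmdline` for any PID it prints shows which program that is.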
 
1) - Already done - set up linkmeter to use ttyS0 - because I saw that router didn't enable ttyS1

Additional debugging steps:

Commented out inittab ttyS0 line
Found out I switched TX, RX line, so reversed lines
Rebooted router... router hangs
Unplugged router serial connection to Heatermeter, rebooted, wireless comes up fine
Browsed to 192.168.200.1, see page fine, but without data
Reconnected serial connection, and start seeing data flowing into screen, times start scrolling from right to left...
Clicked on login... web page does not display on 192.168.200.1/luci/lm/login
netstat -an shows nothing running on 80
/etc/init.d/lucid restart issued
netstat -an again shows nothing running on 80
unplugged serial connection
/etc/init.d/lucid restart issued
netstat -an again shows process running on 80
in lucid config, changed daemonized to 0, debug to 1

This is the output when restarting lucid:

root@OpenWrt:/etc/config# /etc/init.d/lucid restart
Stopping LuCId superserver: lucid.
Starting LuCId superserver: lucidlucid[1491]: Initializing daemon http
lucid[1491]: Preparing daemon http
lucid[1491]: Preparing TCP-Daemon http
lucid[1491]: Preparing socket for port 80
lucid[1491]: Sockets bound for http
lucid[1491]: Preparing publishers for http
lucid[1491]: Preparing TLS for http
lucid[1491]: Invoking daemon factory for http
lucid[1491]: Prepared daemon http
lucid[1491]: Initializing daemon lmserver
lucid[1491]: Preparing daemon lmserver
lucid[1491]: Prepared daemon lmserver

<re-plugged in serial connection here>

lua: /usr/lib/lua/luci/lucid/linkmeterd.lua:165: attempt to perform arithmetic on a nil value
stack traceback:
/usr/lib/lua/luci/lucid/linkmeterd.lua:165: in function </usr/lib/lua/luci/lucid/linkmeterd.lua:155>
(tail call): ?
/usr/lib/lua/luci/lucid/linkmeterd.lua:249: in function 'handler'
/usr/lib/lua/luci/lucid.lua:106: in function 'run'
/usr/lib/lua/luci/lucid.lua:47: in function 'start'
(command line):1: in main chunk
[C]: ?
root@OpenWrt:/etc/config#
 
It seems to be blowing up here: last = now - tonumber(vals[idx+3])

Here's the code block:

local function segRfUpdate(line)
  local vals = segSplit(line)
  rfStatus = {} -- clear the table to remove stales
  local idx = 1
  local now = os.time()
  while (idx < #vals) do
    local nodeId = vals[idx]
    rfStatus[nodeId] = {
      batt = vals[idx+1],
      rssi = vals[idx+2],
      last = now - tonumber(vals[idx+3])
    }
    idx = idx + 4
  end
end

What's it trying to do here, and how can I insert debugging statements to figure out what's going on?

Thanks,
 
Making progress! Sounds like the serial data is getting messed up. The current published version doesn't have checksumming, so the linkmeterd code is susceptible to crashing due to bad data.

What I do for debugging is set lucid to run foreground with `uci set lucid.main.daemonize=0`, then add "print" statements to the linkmeterd.lua. A good place to start is in
Code:
local function serialHandler(polle)
  for line in polle.lines do
    print(line) -- add this
    ...
And when you start lucid now it will print each line it receives and you can see the one that is crashing it and we can go from there.
 
Cool, here's the output:


root@OpenWrt:/usr/lib/lua/luci/lucid# /etc/init.d/lucid restart
Stopping LuCId superserver: lucid.
Starting LuCId superserver: lucidlucid[1588]: Initializing daemon http
lucid[1588]: Preparing daemon http
lucid[1588]: Preparing TCP-Daemon http
lucid[1588]: Preparing socket for port 80
lucid[1588]: Sockets bound for http
lucid[1588]: Preparing publishers for http
lucid[1588]: Preparing TLS for http
lucid[1588]: Invoking daemon factory for http
lucid[1588]: Prepared daemon http
lucid[1588]: Initializing daemon lmserver
lucid[1588]: Preparing daemon lmserver
lucid[1588]: Prepared daemon lmserver
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMSU,225,U,U,U,U,0,0,0*06
lucid[1588]: Discarding duplicate update
$HMRF,A,3300,128,2174*6B
lua: /usr/lib/lua/luci/lucid/linkmeterd.lua:165: attempt to perform arithmetic on a nil value
stack traceback:
/usr/lib/lua/luci/lucid/linkmeterd.lua:165: in function </usr/lib/lua/luci/lucid/linkmeterd.lua:155>
(tail call): ?
/usr/lib/lua/luci/lucid/linkmeterd.lua:250: in function 'handler'
/usr/lib/lua/luci/lucid.lua:106: in function 'run'
/usr/lib/lua/luci/lucid.lua:47: in function 'start'
(command line):1: in main chunk
[C]: ?
root@OpenWrt:/usr/lib/lua/luci/lucid#
 
Ah! Are you running the latest linkmeterd from git? The HeaterMeter is putting out the checksums, but I'm guessing the linkmeterd isn't processing them and is instead passing them on to the segment handler, which tries to evaluate the checksum as a value. If it is the latest from github, serialHandler() should look like this:
Code:
local function serialHandler(polle)
  for line in polle.lines do
    local csumOk = segmentValidate(line)
    if csumOk ~= false then
      if hmConfig == nil then
        hmConfig = {}
        serialPolle.fd:write("\n/config\n")
      end

      -- Remove the checksum if it was there
      if csumOk == true then line = line:sub(1, -4) end
      segmentCall(line)
    end -- if validate
  end -- for line
end
If it doesn't you can just replace the script with the latest by deleting it and `wget https://github.com/CapnBry/HeaterMeter/raw/master/openwrt/package/linkmeter/luasrc/lucid/linkmeterd.lua`
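
For checking a captured line by hand, here is a minimal shell sketch of the checksum test. It assumes the trailing `*HH` is an NMEA-style checksum, meaning the XOR of every character between the `$` and the `*` written as two uppercase hex digits. The sample lines posted in this thread are consistent with that, but it is an assumption and not taken from the linkmeterd source. The names `hm_csum` and `hm_line_ok` are made up for this example.

```shell
#!/bin/sh
# Sketch: XOR every character of the payload (the text between '$' and '*')
# and print the result as two uppercase hex digits.
hm_csum() {
    payload="$1"
    sum=0
    while [ -n "$payload" ]; do
        rest="${payload#?}"            # everything after the first character
        ch="${payload%"$rest"}"        # the first character itself
        code=$(printf '%d' "'$ch")     # numeric value of that character
        sum=$(( sum ^ code ))
        payload="$rest"
    done
    printf '%02X\n' "$sum"
}

# Return 0 if the line has the form $payload*HH and HH matches the payload.
hm_line_ok() {
    line="$1"
    case "$line" in
        \$*\*??) ;;                    # starts with '$', ends with '*' + 2 chars
        *) return 1 ;;
    esac
    body="${line#?}"                   # drop the leading '$'
    want="${body##*\*}"                # text after the last '*'
    body="${body%\**}"                 # text before the last '*'
    [ "$(hm_csum "$body")" = "$want" ]
}

hm_csum 'HMSU,225,U,U,U,U,0,0,0'       # prints 06
```

With the checksum left on the line, the last field of `$HMRF,A,3300,128,2174*6B` is `2174*6B`, which is not a number. That matches the nil arithmetic error in the traceback above.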
 
Welp there should just be the one file needed for lucid. Did you just edit the linkmeterd.lua or did you get it from git? You can't just change that one function because there's other changes in that file needed for the checksum code. Usually when it says "slave type not supported" the lua compile failed. You can see why by just running `lua /usr/lib/lua/luci/lucid/linkmeterd.lua`. If all goes as planned nothing should happen at all, just return to the command prompt.
 
Yeah, I had a bad copy. Fixed it, now all good. See lots of pretty numbers and graphs!

What's the best way of grabbing and updating the latest heatermeter+linkmeter code (for fixes and enhancements)?

Dave
 
There really isn't a good way yet from the user perspective, it is on the TODO list. As a developer you just git pull and rebuild, installing the linkmeter openwrt package and then the AVR firmware update with the new hex file. If I ever have time I'd like to have an easy way to update just what is needed with a sort of one-click setup. Still a long ways off though.
 
Everything checks out, manual control of fan rocks, alarm works...

but... cold booting router hangs the router, and warm booting (reboot from the console) does not hang the router. Unplugging the router serial pins to the custom board and cold booting does not hang the router. I assume lucid is barfing on a cold boot, and not allowing the networking/wireless to start.

Two questions:
1) Any way of telling lucid to start after networking/wireless startup?
2) What's the best way of debugging this problem? ttyS0 isn't exactly available for monitoring the serial port during boot up. Any way of dumping lucid debug to a log during startup, and examining the log file?

Thanks,

Dave
 
Dave,

I don't have putty handy to log in and look at how it is set now, but in /etc/init.d, you can see the boot priority of various tasks.
 
Yeah you can rearrange the order things start in by changing the value of their symlink in /etc/rc.d. All the "S" (for "Start") run in numerical order.

Did you make sure to turn lucid daemonize back on? If that doesn't fork, then nothing after it will start. Also, you can read its log output using the `logread` command. Finally, you can use the OpenWrt failsafe mode to stop the boot process before the init scripts run... I can't remember if telnet is started but I know it has a fixed IP at that point.
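
To see where lucid falls in the start order described above, something like the following could be used. It is only a sketch: `start_order` is a made-up helper name, its optional argument (an alternate directory) exists only so it can be tried outside the router, and it assumes the usual `S<two digits><name>` naming for the links in /etc/rc.d.

```shell
#!/bin/sh
# Sketch: print "priority name" for each start link in /etc/rc.d, lowest
# (earliest) first. The directory argument is a hypothetical override.
start_order() {
    rc_dir="${1:-/etc/rc.d}"
    for link in "$rc_dir"/S[0-9][0-9]*; do
        # Skip an unmatched glob pattern
        [ -e "$link" ] || [ -L "$link" ] || continue
        base="${link##*/}"             # e.g. "S50lucid"
        prio="${base#S}"               # "50lucid"
        name="${prio#[0-9][0-9]}"      # "lucid"
        prio="${prio%"$name"}"         # "50"
        echo "$prio $name"
    done | sort -n
}
```

Run `start_order` with no arguments on the router and compare lucid's number to the network and wireless entries. After a boot, `logread | grep lucid` pulls lucid's lines out of the system log.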
 
I thought the best way of syncing up everything is to get current as of 8/21, and I've built the openwrt image based on that, as well as flashed the arduino chip.

My current problem is
The linkmeter graphs and the heatermeter version don't show correctly, nor does any set command work.

When I put in print statements, I see statements such as this, communicating heatermeter versions:
lucid[1168]: Buried thread: 1216
HMSU,77,$HMSU,77$HMSU,77$HMSU,77,U,U,U,6UCID,HeaterMeter,20120628B*3A
lucid[1168]: Created thread: 1217

I also see checksum errors, and under the Heatermeter Version in the config page, no version is shown here:

HeaterMeter Information
Version
Serial checksum errors: 12

A typical data stream looks like this:

SU,77,MRF,A,3$HMSU,77$HMSU,77$HMSU,77,U,U,U,66.0,0,0,0*78
$HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,66.1,0,0,0*79
$HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,66.0,0,0,0*78
MSU,77,$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,66.0,0,0,0*78
$HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,66.1,0,0,0*79
MSU,77,$HMSU,77$HMRF,A,$HMSU,77$HMSU,77,U,U,U,66.1,0,0,MSU,77,U,U,U,66HMSU,77,U,U,U,66.1,0,0,0*79
$HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,66.1,0,0,MRF,A,3300,128,126*5E
lucid[1168]: Checksum failed: $HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,66.1,0,0,MRF,A,3300,128,126*5E
$HMSU,77,U,U,U,66.1,0,0,0*79
$HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,65.9,0,0,0*72
$HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,65.7,0,0,0*7C


I should be seeing some sort of graph update, even with some checksum errors, correct?

I probably should check my serial connections between the router and the custom board; perhaps even shorten them, or hard wire them in place... any other suggestions here?

Having said that, there should be some update on the graph - is there a possibility that lucid code has changed in the Heatermeter git, such that the graph does not update?

How do I check that the data being pulled in via lucid (on my ttyS0), is being processed correctly by the linkmeterd graphing subsystem?

Thank you,

Dave
 
Oof that looks bad. There will only be graph updates if there are lines which don't have checksum errors. You can try the command `lmclient @LMSS` and that will spit out an update message every time a status line is received from the HeaterMeter. The data you're getting looks pretty bad though, sort of like the stuff I was getting when I was getting tty overruns.

In this linkmeterd commit I switched the serial port to raw mode, which is the only thing I can think of that would cause the serial to get messed up. If you stop lucid and just `cat /dev/ttyS0`, see if the data is all messed up at that level. If it is, try different tty options like switching it back to sane: `stty -F /dev/ttyS0 sane` (and then run the cat again)
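
One rough way to compare what the tty delivers with what linkmeterd prints is to count badly framed lines in a capture. The sketch below treats a line as well framed only if it starts with `$` and contains exactly one `$`. That rule is an assumption based on the samples in this thread, and `count_garbled` is a made-up name. It does not check the checksum.

```shell
#!/bin/sh
# Sketch: read lines on stdin and print "<total> <garbled>", where a line
# counts as garbled unless it starts with '$' and has exactly one '$'.
count_garbled() {
    awk '
        length($0) == 0 { next }               # ignore blank lines
        {
            total++
            n = gsub(/\$/, "$")                # number of "$" characters
            if (n != 1 || substr($0, 1, 1) != "$") bad++
        }
        END { printf "%d %d\n", total + 0, bad + 0 }
    '
}
```

With lucid stopped, `head -n 100 /dev/ttyS0 | count_garbled` gives a quick figure for the raw port. A clean result there while linkmeterd still logs merged lines would point away from the tty settings.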
 
Two things...

1) running lmclient @LMSS when lucid is running spits out this message:

root@OpenWrt:~# lmclient @LMSS
nil eof
root@OpenWrt:~#

running it without lucid running, spits out this message

root@OpenWrt:/usr/lib/lua/luci/lucid# lmclient @LMSS
nil connect
root@OpenWrt:~#

2) There isn't much difference (if any at all) between raw and sane modes:

root@OpenWrt:/usr/lib/lua/luci/lucid# cat /dev/ttyS0
$HMSU,77.5,0,$HMSU,77,U,U,U,66.5,0,0,0*7D
$HMSU,77,U,U,U,66.5,0,0,0*7D
$HMSU,77,U,U,U,66.5,0,0,0*7D
$HMSU,77,U,U,U,66.6,0,0,0*7E
$HMSU,77,U,U,U,66.7,0,0,0*7F
$HMSU,77,U,U,U,66.7,0,0,0*7F
$HMSU,77,U,U,U,66.7,0,0,0*7F
^C
root@OpenWrt:/usr/lib/lua/luci/lucid# stty -F /dev/ttyS0 sane
root@OpenWrt:/usr/lib/lua/luci/lucid# cat /dev/ttyS0
$HMSU,77,U,U,U,66.6,0,0,0*7E
$HMSU,77,U,U,U,66.6,0,0,0*7E
$HMSU,77,U,U,U,66.5,0,0,0*7D
$HMSU,77,U,U,U,66.5,0,0,0*7D
$HMSU,77,U,U,U,66.5,0,0,0*7D
^C
root@OpenWrt:/usr/lib/lua/luci/lucid# stty -F /dev/ttyS0 raw -echo
root@OpenWrt:/usr/lib/lua/luci/lucid# cat /dev/ttyS0
$HMSU,77,U,U,U,66.7,0,0,0*7F
$HMSU,77,U,U,U,66.7,0,0,0*7F
$HMSU,77,U,U,U,66.7,0,0,0*7F
$HMSU,77,U,U,U,66.7,0,0,0*7F
$HMSU,77,U,U,U,66.7,0,0,0*7F
$HMRF,A,3300,128,6622*6B
$HMSU,77,U,U,U,66.7,0,0,0*7F
^C
root@OpenWrt:/usr/lib/lua/luci/lucid#

I gotta think that at least the heatermeter version should have made it to the config page to display, especially since I see it coming through the serialHandler routine. What am I missing?
 
The other strange thing that is happening is lucid seems to be too busy to process data...

I am capturing this output here:

lucid[1363]: Checksum failed: $HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,67.2,0,0,U,U,67.3,0,0,0*7A
lucid[1363]: Buried thread: 1380
lucid[1363]: Created thread: 1381
lucid[1363]: Buried thread: 1381
lucid[1363]: Created thread: 1382
$HMSU,77$HMSU,77HMRF,A,3$HMSU,77$HMSU,77,U,U,U,67.2,0,0,0*7B
lucid[1363]: Checksum failed: $HMSU,77$HMSU,77HMRF,A,3$HMSU,77$HMSU,77,U,U,U,67.2,0,0,0*7B
lucid[1363]: Buried thread: 1382
lucid[1363]: Created thread: 1383
lucid[1363]: Buried thread: 1383
lucid[1363]: Created thread: 1384
$HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,67.2,0,0,0*7B
lucid[1363]: Buried thread: 1384
lucid[1363]: Created thread: 1385
$HMSU,77$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,67.1,0,0,0*78
lucid[1363]: Buried thread: 1385
lucid[1363]: Created thread: 1386
lucid[1363]: Buried thread: 1386
lucid[1363]: Created thread: 1387
$HMRF,A,$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,67.1,0,0,$HMSU,77,U,U,U,67.1,0,0,0*78
lucid[1363]: Checksum failed: $HMRF,A,$HMSU,77$HMSU,77$HMSU,77$HMSU,77,U,U,U,67.1,0,0,$HMSU,77,U,U,U,67.1,0,0,0*78
lucid[1363]: Buried thread: 1387
lucid[1363]: Created thread: 1388
lucid[1363]: Buried thread: 1388
lucid[1363]: Created thread: 1389
$HMSU,77$HMSU,77$HMSU,77
lucid[1363]: Buried thread: 1389
lucid[1363]: Created thread: 1390
UCID,H$HMSU,77$HMSU,77$HMSU,77
lucid[1363]: Buried thread: 1390
lucid[1363]: Created thread: 1391

And in the middle of this process, I reset the arduino custom board. I should have seen the HeaterMeter id come through, but lucid seems to be spending time creating and burying threads, rather than processing serial data? I may be misunderstanding what I see, of course.
 
That is really crazy that the serial data seems to be fine when you cat from it, but linkmeter can't seem to read it properly. The create thread happens when it forks a new process to handle an HTTP request. Is it doing that even when you're not hitting it from the web browser? If you want to give me ssh access from the Internet, I'll connect up and poke around to see if I can see what's going on. Just send a private message with the login and IP.
 
Hey Brian, I think it is a bad router board... too many shorts on it while building... I'll try another board, and I'll let you know if it's a software problem. I'll report back in a few days.
 

 
