Originally posted by Brian Hilgert:
/backfire/staging_dir/target-mipsel_uClibc-0.9.30.1/usr/lib/liblua.so: undefined reference to `crypt'
Doh! I'll blow away the directory and try it again from scratch and see if I can reproduce it. Can you verify that your heatermeter/openwrt/package/rrdtool/patches/140-fix_lua_detect.patch contains -lcrypt on the 4 lines that start with +?
Also, you squeezed it down to 2MB? Whoa! You musta removed some serious amounts of packages to get it that low. I think the lowest I've had it was around 2.3MB. Tell me more!
Haha yeah John, well, you see it is complicated. I'll explain it though because someone may find the design interesting or inspirational.
As we all know, inside LinkMeter there is a HeaterMeter board which is connected via serial inside the router. There is a daemon running on the router that reads serial data as it comes in, then builds a "current snapshot" file used to display the current data on the web page. It also keeps the database up to date with the latest info. The web pages have to use these external files to see what's going on because they can't read the serial port directly without messing up the running daemon. That's ok because some of the data (such as RFM12 status information) only comes every once in a while so the request would just hang for a very long time waiting for the right info to come through the serial port.
For setting values from the web pages, I currently do a Very Bad Thing and something that really shouldn't work at all. I open the serial port a second time from the web server and write the configuration directly without involving the daemon. Again, I am shocked this even works properly and I have to be careful to not issue any reads because this would mess with the daemon process. I can't "talk" with the HeaterMeter because of this lack of ability to get feedback from the device.
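Just to make the hack concrete, here's a minimal sketch of what that second, write-only open looks like from the web server side. The device path and the config string are made up for illustration, not the actual ones LinkMeter uses.

```
/* Sketch of the "Very Bad Thing": open the serial port write-only
   (path is hypothetical) and shove a config command at the HeaterMeter,
   never reading, so the daemon's reads aren't disturbed. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/ttyS1", O_WRONLY | O_NOCTTY);  /* hypothetical device */
    if (fd < 0)
        return 1;
    const char *cmd = "/set?sp=225\n";   /* hypothetical config command */
    write(fd, cmd, strlen(cmd));         /* write-only: no feedback possible */
    close(fd);
    return 0;
}
```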
This is a pretty common problem in programming where you have multiple things that want to access a limited resource. The solution is obviously to have one thing that handles talking and everybody else talks to that. The big question to consider is how everybody will talk to the arbitrator. You can't just make function calls from the client into the server because they're in different processes and therefore protected from each other.
The quick and dirty method is to do what we're doing now: files. The server drops out some files and the clients read the ones they're interested in. For talking back to the server this gets more complicated if there is more than one client active at a time (a real possibility, because the web server can handle multiple requests simultaneously). It also has high overhead from reading and writing files, as well as high latency because the server and clients have to poll directories for these files.
For getting data into and out of a process, there is nothing better than a pipe. You use pipes all the time to tunnel the output of one program into the input of another; for example `ls -l | grep myfile` starts two processes and links them via a pipe. Pipes are also called FIFOs because the data coming through them arrives in the order it is sent (first in, first out). In addition to input and output pipes, processes can create "named pipes" which have filenames and exist in the filesystem under Linux. Because a single FIFO is unidirectional, a server which needs to "talk" with a client would need to create two named pipes: one to read from the client and one to write back to it. Pipes are a great, super-lightweight way to communicate from one process to another.
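Here's a rough sketch of that two-FIFO arrangement in C; the /tmp/lmd.req and /tmp/lmd.rsp paths are invented for the example.

```
/* Sketch: a server creates two named pipes (FIFOs) so a client can send
   a request on one and read the reply on the other. Paths are made up. */
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    /* One FIFO per direction, since a FIFO only carries data one way */
    mkfifo("/tmp/lmd.req", 0666);
    mkfifo("/tmp/lmd.rsp", 0666);

    /* Server side: block until a client opens the request pipe, read the
       question, then write an answer back on the response pipe. */
    char buf[256];
    int req = open("/tmp/lmd.req", O_RDONLY);
    ssize_t n = read(req, buf, sizeof(buf) - 1);
    if (n > 0) {
        int rsp = open("/tmp/lmd.rsp", O_WRONLY);
        write(rsp, "ok\n", 3);
        close(rsp);
    }
    close(req);
    return 0;
}
```

A client would do the mirror image: open /tmp/lmd.req write-only, send its question, then open /tmp/lmd.rsp read-only and wait for the answer.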
Their downfall is that the data going through a pipe is "single instance": there's only one copy of any bit of data sent down it. Therefore if client B opens the pair of named pipes and asks the server a question while client A is still waiting for a response, whoever is able to read data faster will get all the responses. Client A may get client B's answer and discard it, leaving client B waiting for a response indefinitely.
So now we can see that to serve data to many clients in different processes, we need a bidirectional channel per client. The answer to this is to use a socket. We could use a TCP/IP or UDP socket, but under UNIXes there's a special type of socket that doesn't rely on a network connection at all, called a UNIX-domain or AF_LOCAL socket. These can be unidirectional or bidirectional, can have names, and each client can have its own connection. This is the most common method used for communicating with Linux daemons, used in small things like wpa_supplicant all the way up to bigger things like a MySQL connection.
Instead of connecting like a TCP/IP socket, with an IP address and a port, a named UNIX socket can live in one of two places. It can have a filename in the filesystem like /var/run/mysql/sock (a pathname UNIX socket), or it can exist in the abstract namespace, which is sort of like a virtual filesystem that lives only in kernel memory. Abstract namespace sockets still have names so clients can find them, but are automatically deleted when the last socket referencing them is closed. Generally, you'll want to expose your server's interface on a pathname socket because A) it is visible and B) the permissions of the file provide security over who can connect to it. There's also a third type of UNIX socket address, called unnamed. This is a socket which has not been bound to a pathname or an abstract name. An unnamed socket has no "address", whereas pathname and abstract sockets do.
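Here's a quick sketch of what binding the two kinds of named addresses looks like. The pathname /var/run/linkmeterd.sock and the abstract name "linkmeterd" are placeholders for the example, not what the daemon actually uses.

```
/* Sketch of pathname vs abstract-namespace UNIX socket addresses. */
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
    /* Pathname socket: shows up in the filesystem, so file permissions
       control who may talk to it. */
    int fd1 = socket(AF_UNIX, SOCK_DGRAM, 0);
    struct sockaddr_un path_addr;
    memset(&path_addr, 0, sizeof(path_addr));
    path_addr.sun_family = AF_UNIX;
    strncpy(path_addr.sun_path, "/var/run/linkmeterd.sock",
            sizeof(path_addr.sun_path) - 1);
    unlink(path_addr.sun_path);              /* bind fails if the file exists */
    bind(fd1, (struct sockaddr *)&path_addr, sizeof(path_addr));

    /* Abstract socket: first byte of sun_path is '\0'; the name lives only
       in kernel memory and vanishes when the last reference closes. */
    int fd2 = socket(AF_UNIX, SOCK_DGRAM, 0);
    struct sockaddr_un abs_addr;
    memset(&abs_addr, 0, sizeof(abs_addr));
    abs_addr.sun_family = AF_UNIX;
    memcpy(&abs_addr.sun_path[1], "linkmeterd", 10);
    bind(fd2, (struct sockaddr *)&abs_addr,
         offsetof(struct sockaddr_un, sun_path) + 1 + 10);

    close(fd1);
    close(fd2);
    return 0;
}
```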
For the client's end of the socket you can't use a fixed pathname for all clients, because if more than one tries to connect, they'll conflict over who has the name. Usually the programmer either creates these in the abstract namespace, so the client doesn't have to worry about cleaning up pathname files left on the filesystem, or uses an unnamed socket.
Now that we know we want a pathname UNIX domain socket for the server, we need to decide if it is connection-oriented (stream) or connectionless (datagram). For larger amounts of data, stream sockets are best, but for our simple query-response structure datagrams are fine. They have less overhead and don't require you to keep track of a separate socket descriptor for each client you're talking to. Also, because we're using a 2.6 Linux kernel, UNIX datagram sockets have guaranteed in-order delivery and atomic writes up to 4k (any write of less than 4k of data completes in one call). This means no reassembling or reordering packets like you'd experience using a standard network UDP socket.
That's how the client talks to the server, but how does the server talk back? Well if we used stream sockets, the server could just write back on the same socket descriptor that it read from. However, we're using datagram sockets so we need to direct our write back to the correct client. The way to do this is to use recvfrom() on the server socket, which will receive data *and* tell you who it is from so you can use sendto() to talk back to them.
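Here's roughly what that receive loop looks like on the server side; again, the socket path is just a placeholder.

```
/* Sketch of the server: a pathname datagram socket that answers every
   request with a reply aimed back at whichever client sent it. */
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);

    struct sockaddr_un srv;
    memset(&srv, 0, sizeof(srv));
    srv.sun_family = AF_UNIX;
    strncpy(srv.sun_path, "/var/run/linkmeterd.sock", sizeof(srv.sun_path) - 1);
    unlink(srv.sun_path);                     /* clean up a stale socket file */
    bind(fd, (struct sockaddr *)&srv, sizeof(srv));

    for (;;) {
        char buf[4096];                       /* datagrams up to 4k arrive whole */
        struct sockaddr_un cli;
        socklen_t clilen = sizeof(cli);

        /* recvfrom() gives us the request *and* the sender's address... */
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                             (struct sockaddr *)&cli, &clilen);
        if (n < 0)
            break;

        /* ...so sendto() can aim the reply back at that exact client. */
        sendto(fd, "ok\n", 3, 0, (struct sockaddr *)&cli, clilen);
    }
    close(fd);
    return 0;
}
```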
Remember how I said that unnamed UNIX domain sockets have no address? Well, now you see where the problem arises. If the client creates an unnamed datagram UNIX socket, the server gets no address from recvfrom() and can't write back to it. This is where the bug was in OpenWrt's LuCI nixio package. If you recvfrom()ed an unnamed datagram UNIX socket, you'd get an address back, but it was just garbage off the stack. If another client connected with a named UNIX socket, the unnamed recvfrom() would coincidentally return the name of the named socket, so the unnamed client would never get any data back and the named client would get extra data.
So what I ended up doing, in addition to fixing the bug, was to make sure every client had a named UNIX socket. I use a feature of Linux called "autobind": if you try to bind a datagram UNIX socket to a blank name, the kernel will pick a unique name for your socket in the abstract namespace.
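And here's roughly what the client side looks like with autobind. Passing an address length of just sizeof(sa_family_t) (a blank name) is what triggers it; the server path is, as before, a placeholder.

```
/* Sketch of a client using Linux autobind: bind with only the family set
   and the kernel assigns a unique abstract-namespace name, so the server's
   recvfrom() sees a real address to reply to. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);

    /* Autobind: the address is nothing but the family field. */
    struct sockaddr_un me;
    memset(&me, 0, sizeof(me));
    me.sun_family = AF_UNIX;
    bind(fd, (struct sockaddr *)&me, sizeof(sa_family_t));

    /* Send a question to the server, then wait for the reply to arrive
       on our freshly autobound abstract name. */
    struct sockaddr_un srv;
    memset(&srv, 0, sizeof(srv));
    srv.sun_family = AF_UNIX;
    strncpy(srv.sun_path, "/var/run/linkmeterd.sock", sizeof(srv.sun_path) - 1);
    sendto(fd, "status\n", 7, 0, (struct sockaddr *)&srv, sizeof(srv));

    char buf[4096];
    ssize_t n = recvfrom(fd, buf, sizeof(buf) - 1, 0, NULL, NULL);
    if (n > 0) {
        buf[n] = '\0';
        printf("%s", buf);
    }
    close(fd);
    return 0;
}
```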
So there you have it: bidirectional communication working with an autobound abstract namespace UNIX domain datagram socket. The web server process(es) can now talk to the LinkMeter daemon and have live bidirectional communication without relying on temporary files like we do now (e.g. /tmp/json).