OpenEnergyMonitor aggregator

Keeping track of time across different systems is not as straightforward as it might seem: every clock has its own notion of time, and always drifts to a certain extent. There’s an entire community around the challenge of trying to keep track of time as accurately as possible. In an absolute sense, you can’t actually win this game - it’s very Heisenberg-like, in that there will always be some uncertainty, even if only at the femtosecond level!

Fortunately, this isn’t a big deal for simple home monitoring and automation scenarios, except that with battery-powered nodes lacking an accurate crystal, this issue needs addressing.

The simplest solution is to timestamp all data in a single place, based on a single clock source - preferably a stable one, obviously. In Linux, tracking real time is well understood: when connected to a network, all you need are the ntp, ntpdate, or chrony tools (thanks for the tip, Thomas!) to keep the time-of-day clock accurate to within a millisecond or so.

Hub timestamping service

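JET/Hub provides a simple millisecond-timestamping service for anything arriving via MQTT: a message published to “logger/<device>” is re-published as “logger/<device>/<timestamp>”, with the timestamp given in milliseconds since Jan 1st, 1970.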

The <device> part can be used to distinguish between the different message sources.

The contents of these messages can be anything - they pass through without change. Note that the timestamped topics change endlessly - these messages should not be published with the RETAIN flag set, else the MQTT server will have to store an unbounded number of messages!
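As a rough illustration of the renaming involved, here is a minimal Python sketch. It assumes the scheme suggested by the “logger/mydev/1453986012808” example further on (the incoming topic with a millisecond timestamp appended); the function name is made up for this example and is not part of the hub:

```python
import time


def timestamped_topic(topic, now_ms=None):
    """Return the topic under which a message would be re-published.

    Assumed scheme: "logger/<device>" becomes "logger/<device>/<millis>".
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # milliseconds since Jan 1st, 1970
    return f"{topic}/{now_ms}"


print(timestamped_topic("logger/mydev", 1453986012808))
```

Since every message ends up on a fresh topic, it is easy to see why retaining them would make the MQTT server's storage grow without bound.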

Logging raw incoming data

A mechanism introduced in JeeMon and HouseMon many years ago was the “logger”, which takes all incoming (presumably plain text) data, and saves it to disk. This mechanism is also included in the hub, and serves two purposes:

as a future-proof permanent record of every “raw” message, independent of the processing applied to it later - i.e. even before these messages get “decoded”
as a way to “replay” the original data at a later time, for testing or simulation purposes, but also to be able to completely rebuild all time-series and redo all statistics and other processing - this has proven invaluable to support major changes to the software, since a new installation can quickly be fed all historical data as if it were coming in real-time

The structure and format of these log files have remained the same for many years now:

there is one log file for every day, running from midnight to midnight (UTC!)
the name of each log file is “YYYYMMDD.txt”, e.g. “20160128.txt”
each entry in the log is one line of text
the format of each line is: “L HH:MM:SS.TTT <device> <message...>”

Here is an example of some log entries, taken from the “20160128.txt” file:

L 12:27:51.979 usb-USB0 OK 3 65 112 8 0
L 12:27:52.941 usb-USB0 OK 9 161 25 58 235 31 159 228 5 13 219 234 62
L 12:27:55.937 usb-USB0 OK 9 161 25 58 235 33 159 230 5 13 219 234 62
L 12:27:58.936 usb-USB0 OK 9 161 25 58 235 35 159 224 5 14 219 198 61
L 12:28:00.574 usb-USB0 OK 19 96 16 1 28 13 28 0
L 12:28:01.934 usb-USB0 OK 9 161 25 58 235 37 159 224 5 14 219 198 61
L 12:28:03.080 usb-USB0 OK 6 199 149 143 0

This functionality is included in the hub - it subscribes to MQTT topic “logger/+/+”, causing it to receive all timestamped messages (and only those), which it then saves to the proper log file, with automatic daily roll-over to a new log file at 0:00 UTC. The file paths include the year, so that no more than a few hundred text files end up in a single directory - the above example was actually copied from the file located at “logger/2016/20160128.txt”.
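The log format described above is simple enough to reproduce in a few lines. The following Python sketch is only an illustration of the file naming and line layout, not the hub's actual code; the function names are hypothetical, and it assumes every entry has a non-empty message:

```python
from datetime import datetime, timezone


def _utc(ms):
    """Split a millisecond timestamp into a UTC datetime and leftover millis."""
    secs, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(secs, tz=timezone.utc), millis


def log_path(ms, root="logger"):
    """Daily log file for this timestamp, e.g. logger/2016/20160128.txt."""
    dt, _ = _utc(ms)
    return f"{root}/{dt:%Y}/{dt:%Y%m%d}.txt"


def format_entry(ms, device, message):
    """Build one log line: L HH:MM:SS.TTT <device> <message...>"""
    dt, millis = _utc(ms)
    return f"L {dt:%H:%M:%S}.{millis:03d} {device} {message}"


def parse_entry(line):
    """Split a log line back into (time of day, device, message)."""
    tag, clock, device, message = line.split(" ", 3)
    if tag != "L":
        raise ValueError("not a log entry: " + line)
    return clock, device, message


print(log_path(1453986012808))
print(parse_entry("L 12:27:51.979 usb-USB0 OK 3 65 112 8 0"))
```

Note that the date is not part of each line: to replay a log, the file name supplies the day and each entry supplies the time of day (in UTC).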

The hub has a heartbeat

For completeness, this is probably a good place to mention that the hub also implements a one-tick-per-second “heartbeat” - i.e. a periodic message, published to the “hub/1hz” topic. The value of this text message is the current time in milliseconds since Jan 1st, 1970 (as commonly used in JavaScript). The hub will continuously adjust its heartbeat timing to happen exactly on the second mark, as you can see in this example:

$ jet sub hub/1hz
hub/1hz = 1453988511000
hub/1hz = 1453988512000
hub/1hz = 1453988513000
hub/1hz = 1453988514000
hub/1hz = 1453988515000
hub/1hz = 1453988516000
hub/1hz = 1453988517000
...

An issue to watch out for is that these messages can end up a few milliseconds late - even before propagating through MQTT - since Unix / Linux does not offer any real-time guarantees.

One use for this heartbeat is to detect and track clock skew in other machines on the network.
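Both ideas (ticking on the second mark, and using the tick to estimate skew) come down to a bit of arithmetic. Here is a small Python sketch under the assumption that the payload is the decimal millisecond value shown above; the function names are invented for this example, and the skew figure includes network and scheduling delays, so it is only a rough estimate:

```python
def ms_until_next_tick(now_ms):
    """Delay in milliseconds until the next whole-second mark."""
    return 1000 - now_ms % 1000


def clock_skew_ms(heartbeat_payload, local_now_ms):
    """Local clock minus hub clock, in ms (positive: local clock is ahead)."""
    return local_now_ms - int(heartbeat_payload)


print(ms_until_next_tick(1453988511250))
print(clock_skew_ms("1453988512000", 1453988512037))
```

Averaging the skew over many heartbeats would smooth out the occasional late message.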

Tying it all together

The serial port interface, timestamping, and raw logging services are built into the hub but fully independent of each other. And - by design - they can be chained together very nicely:

a retained message at “serial/mydev” sets up the serial port and starts the hub’s listener
sendto is set to “logger/mydev”, directing all incoming data to the timestamping service
these messages will then be re-published as “logger/mydev/1453986012808”, etc.
since the logger is listening to “logger/+/+”, it gets a copy of each timestamped message
as a result, all this data also ends up being stored in daily log files

This addresses a major design goal for the hub: to keep raw serial data collection and logging going at all times, even when other parts of JET are in development, being replaced, or failing.
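The chain above hinges on how MQTT matches topic filters: “logger/+/+” has three levels, so it matches the timestamped topics but not the two-level “logger/mydev” that the serial listener publishes to. The following Python sketch is a simplified matcher written for this article, not code from the hub or from Mosquitto, and it ignores special cases such as topics starting with “$”:

```python
def topic_matches(pattern, topic):
    """Simplified MQTT filter match: "+" is one level, "#" is the remainder."""
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, p in enumerate(p_levels):
        if p == "#":
            return True  # matches everything from here on
        if i >= len(t_levels):
            return False  # pattern is longer than the topic
        if p != "+" and p != t_levels[i]:
            return False  # literal level differs
    return len(p_levels) == len(t_levels)


print(topic_matches("logger/+/+", "logger/mydev/1453986012808"))
print(topic_matches("logger/+/+", "logger/mydev"))
```

This is what keeps the logger from seeing each message twice: only the re-published, timestamped copy matches its subscription.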

Installation… ah, the “joys” of modern computing!

1. Prerequisites

To try out JET, you need:

a Raspberry Pi, Odroid, OLinuXino, or similar ARM-based Linux board
a working Linux setup on your particular board, e.g. Raspbian or Ubuntu
the ability to login on your system via SSH over the network
some basic familiarity with Linux and the “shell” command line
a link to the latest JET release on GitHub: here it is…
a bit of patience to skim through this setup guide (yep, go right ahead!)

Because JET/Hub is going to run as an unattended background task (next to the MQTT server), we need to set up a few things to automate this process (marked in bold in the diagram below):

And since the idea is to make this as painless as possible: let’s get started!

2. Hardware setup

This will depend on what board you have and is beyond the scope of this guide, but here are some links you might want to check out to get going:

https://raspguide.wordpress.com
https://github.com/raspberrypi/noobs#how-to-automatically-install-an-os
or just google for “getting started with raspberry pi and ssh”, etc

You need to get to a state where your board is running properly, is connected to the internet (because you’ll need to fetch a few files), and you are logged in with the ability to become “superuser” via sudo or su (again, because you’ll need to install a package into the system).

If Linux is new to you: there are lots of ways to get familiar with it, for example with this book.

3. Software setup

You’ll need to get two packages installed and running: the JET/Hub core and the MQTT server. Start with the latter: for MQTT, install a package called Mosquitto - it should be available as a standard package, so on systems such as Raspbian, Debian, or Ubuntu, just run this command:

sudo apt-get install mosquitto

Press “y” when asked to accept the installation. If all goes well, Mosquitto will be installed and set up to automatically run in the background, also after reboots of the system.

Part two is to download and run the latest JET/Hub binary from the Releases page on GitHub. You should create a fresh directory, download the package, and unpack as follows:

mkdir ~/jet-v4
cd ~/jet-v4
wget <url-of-hub-linux-arm.gz>   # from the above Releases page
gunzip hub-linux-arm.gz
ls -l hub-linux-arm

That last command should produce output similar to this:

-rwxr-xr-x 1 jcw jcw 6664656 Jan 26 11:56 hub-linux-arm

That’s it. The essential parts of JET are now installed.

4. Starting JET for the first time

It’s time to launch the hub (Mosquitto will already be running, see above), and the first time around you should do it manually by entering this command:

./hub-linux-arm

Here is what should appear (this was done with a slightly older version):

2016/01/26 11:57:28 [JET/Hub] v4.0-2-gd80ce94 2016-01-26 (go1.5.3)
2016/01/26 11:57:28 connected as hub/384389 to tcp://localhost:1883
2016/01/26 11:57:28 opening data store: store.db
2016/01/26 11:57:28 starting HTTP server at localhost:8947

Are you seeing something similar? Then all is well - congratulations! If not, please verify all the above steps, and if you can’t fix the problem, get in touch via the Forum, Gitter, or GitHub.

At this point the hub is running, but if you press CTRL-c or log out, it will stop again.

5. Launching JET automatically

We need a more permanent setup, which doesn’t require login or manually starting up the JET system. One way to do this is via the “crontab” utility, which can set up commands to launch the moment the system starts up, even when you are not logged in.

Let’s edit our “crontab” entry (every user has a different one). Enter the following command:

crontab -e

An editor screen pops up, allowing you to edit text. Add the following line at the end of the file:

@reboot sleep 5 && cd ~/jet-v4 && ./hub-linux-arm -http :8947 >out.log 2>&1

Then save these changes and exit the editor. With the default “nano” editor, instructions for this will be on-screen (to change the default editor: “update-alternatives --config editor”).

The “sleep 5” adds a little time for the rest of the system startup to complete (such as MQTT). The “-http :8947” option enables outside access to the hub’s web server on port 8947.

From now on, the hub will automatically start when you power up your board. Log output will be saved in a file called out.log (use “tail -f ~/jet-v4/out.log” to watch the latest output).

There’ll be two servers running in the background: MQTT on port 1883 and HTTP on port 8947.

6. JET admin utility

There is one last step to make it easier to tinker with a running JET system. See this example shell script - which is a special wrapper to control a running system, called - drumroll - “jet” !

When properly set up, you get some nice conveniences for use from the command line, such as:

jet - display JET’s exact version and build details
jet config - list the persistent configuration, i.e. all retained messages
jet pub <topic> <value> - a way to manually publish a message to MQTT
jet sub <topic> - subscribe to an MQTT topic (for example: “jet sub '#'”)

More admin, control, and debugging options will be added once the hub’s functionality grows.

To set this up, create a file called “jet” with the example’s contents, and adjust as follows:

edit the file, changing the “PROG=…” line to PROG=$HOME/jet-v4/hub-linux-arm
save and quit the editor, then make the script executable with: chmod +x jet
consider moving the file to a dir on your $PATH for easy access from anywhere
… otherwise you’ll only be able to run jet by typing: “~/jet-v4/jet ...”

This utility can also be used from a different machine, i.e. you can perform the above actions without logging in to your Linux box by adjusting the “MQTT=…” line to the target IP address. Note that you’ll need to install a second copy of the hub, but built for the originating machine! (never mind if this doesn’t make sense yet - for local use the above instructions should be fine)

7. Security

Once installed, an internet connection is no longer strictly required, but at a minimum you’ll probably want to keep the local network enabled for SSH and web browser access (unless you use a keyboard and monitor, and attach all your devices directly to your board). JET will never connect to anything “out there”, nor accept any incoming connections - unless you tell it to.

JET does not require superuser privileges, but you may have to fix some permissions to enable permanent access to the serial ports (tip: try “sudo usermod -aG dialout <username>” if you run into this particular issue). JET should run fine with just standard user permissions.

The current JET setup does not have any access control. Authentication and TLS will be added later, both in MQTT (via Mosquitto’s config file) and in the hub.

The JET/Hub process has two types of configuration settings:

startup configuration, such as how to connect to MQTT and the name of the data store
run-time configuration, such as which serial ports to open, and which packs to launch

Command-line options

Since the hub is intended to run virtually “indefinitely” once started, only very few configuration settings are specified via command-line options, and any changes will require a hub restart:

-data filename

the filename of the hub’s persistent data store (default: ./store.db)

-logger dirname

the directory where the logger stores its daily log files (default: ./logger)

-mqtt url

the host and port of the MQTT server (default: tcp://localhost:1883)

-packs dirname

the directory where packs are launched from (default: ./packs)

Serial port configuration

Serial port configuration in the hub uses a more flexible mechanism: the hub continuously listens for topics matching the pattern “serial/+” and treats them as serial port configuration requests, as described earlier. To set up a serial port listener for a device on USB port 0, you simply need to send a message to MQTT with a specific format:

topic = serial/<name>
payload = {"device":"/dev/<usb-name>","sendto":"<publish-topic>"}

The payload must be a valid JSON object, with device and sendto fields.
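To make the payload requirement concrete, here is a small Python sketch of the kind of check a receiver could apply. It is an illustration only: the hub's own validation may differ, and the function name is made up for this example:

```python
import json


def parse_serial_config(payload):
    """Return (device, sendto) from a serial/<name> payload, or raise ValueError."""
    cfg = json.loads(payload)  # invalid JSON raises a ValueError subclass
    if not isinstance(cfg, dict):
        raise ValueError("payload must be a JSON object")
    for field in ("device", "sendto"):
        if not isinstance(cfg.get(field), str) or not cfg[field]:
            raise ValueError("missing or invalid field: " + field)
    return cfg["device"], cfg["sendto"]


print(parse_serial_config('{"device":"/dev/ttyUSB0","sendto":"logger/jeelink"}'))
```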

The “jet” utility makes it very easy to set this up from the command line, once the hub and MQTT server are up and running. Here is an example:

jet pub serial/jeelink '{"device":"/dev/ttyUSB0","sendto":"logger/jeelink"}'

Note the use of single quotes to simplify passing JSON’s double quotes without further escapes.

If you have more serial ports, just send more messages and use different names:

jet pub serial/arduino '{"device":"/dev/ttyUSB1","sendto":"logger/arduino"}'

And if you don’t know which device is on which port, or if this might change from one power-up cycle to the next, then there’s a trick for that too in Linux:

Each serial device is listed in a little-known directory called /dev/serial/by-id/. By looking up your device and using that (much longer) device name instead of ttyUSB0, you can force the hub to always open a specific device. Here is an example:

long=/dev/serial/by-id/usb-FTDI_FT232R_USB_UART_A40117UK-if00-port0
json='{"device":"XYZ","sendto":"logger/jeelink"}'
jet pub serial/jeelink `echo $json | sed "s|XYZ|$long|"`

As you can see, this may require some nasty massaging to avoid quoting hell and keep all the double quotes in that JSON payload intact (the sed delimiter is “|” here, because the device path itself contains slashes). Note that if you replace “jet” by “echo” in that last line, you can see what’s going on without publishing to MQTT.
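One way to sidestep the quoting entirely is to let a script build the payload. The sketch below uses Python's standard json and shlex modules to print a ready-to-paste command line; the helper name is invented for this example, and it only prints the command rather than running it:

```python
import json
import shlex


def jet_pub_command(topic, payload_obj):
    """Build a shell-safe "jet pub" command line for a JSON payload."""
    payload = json.dumps(payload_obj, separators=(",", ":"))  # compact JSON
    args = ["jet", "pub", topic, payload]
    return " ".join(shlex.quote(a) for a in args)


long_name = "/dev/serial/by-id/usb-FTDI_FT232R_USB_UART_A40117UK-if00-port0"
print(jet_pub_command("serial/jeelink",
                      {"device": long_name, "sendto": "logger/jeelink"}))
```

The printed line uses the same single-quote convention as the hand-written examples above, so it can be pasted straight into the shell.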

To close a serial port, send an empty payload. This can be done with JET’s “delete” command:

jet delete serial/jeelink

These commands can be sent at any time. There is no need to stop-and-restart the hub.

Persistent settings

The above messages sent to MQTT are once-only, i.e. on restart the hub won’t reopen serial ports. But there is a simple solution for this in MQTT: the RETAIN flag. By adding the -r flag to the above “jet pub” commands, the messages will be sent as before, but also stored and re-sent when the hub is restarted and reconnects to MQTT at a later time:

jet pub -r serial/jeelink '{"device":"/dev/ttyUSB0","sendto":"logger/jeelink"}'
jet pub -r serial/arduino '{"device":"/dev/ttyUSB1","sendto":"logger/arduino"}'

The RETAIN flag is also sent by “jet delete”, i.e. a deletion / close request is also permanent. One little detail to keep in mind is that when a retained message is stored in MQTT, subsequent non-retained messages do not affect it: the original message will still be sent after a hub restart.
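The retained-message rules can be summed up in a toy model. This Python sketch mimics what an MQTT broker keeps per topic (it is not Mosquitto's code, and the class name is made up): a retained publish replaces the stored message, a retained empty payload removes it, and a non-retained publish leaves it alone:

```python
class RetainedStore:
    """Toy model of how an MQTT broker keeps retained messages."""

    def __init__(self):
        self.retained = {}

    def publish(self, topic, payload, retain=False):
        if not retain:
            return  # delivered to current subscribers only, nothing is stored
        if payload == "":
            self.retained.pop(topic, None)  # empty retained payload clears it
        else:
            self.retained[topic] = payload

    def on_subscribe(self, topic):
        """What a newly connected subscriber receives for this exact topic."""
        return self.retained.get(topic)


broker = RetainedStore()
broker.publish("serial/jeelink", '{"device":"/dev/ttyUSB0"}', retain=True)
broker.publish("serial/jeelink", '{"device":"/dev/ttyUSB1"}')  # not retained
print(broker.on_subscribe("serial/jeelink"))
```

After a restart, the hub is in the position of that new subscriber: it sees the retained ttyUSB0 setting, not the later non-retained one.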

You can use “jet config” as a quick way to see all retained messages (i.e. persistent settings).

Managing JET packs

Apart from routing some messages to and from serial ports, logging, and a few other built-in features, one of the hub’s main tasks is to manage “JET packs”, i.e. separate processes (in any language) which are tied into the system as equal citizens and communicate through MQTT.

As with serial ports, the hub supports starting and stopping JET packs at any time. And again, this is driven via MQTT messages. Here is the bird’s eye view:

Here is how to add a new JET pack and manage it with the hub:

place the JET pack’s executable or a small shell script wrapper in the “packs” directory - it must have the executable bit set (“chmod +x”), and in the case of a shell script, it must also start with the line “#!/bin/sh” so the hub can launch it
send an MQTT message to “packs/<name>” with as payload a JSON array, containing the name of the executable or shell script plus optional arguments

So for example, if packs/abc.sh exists, we can issue the following command:

jet pub packs/abc '["abc.sh","arg1","arg2"]'

The hub will report what it’s doing in the log, as well as any errors it runs into.

For security reasons, the hub will only launch packs present in the “packs” directory. Path names are not accepted as first item of the JSON payload.

All stdout and stderr output from the pack is also reported by the hub and sent out as MQTT messages to “packs/<name>/log”. Output lines from stderr will be prefixed with "(stderr)".

Send an empty message to stop the pack again, e.g. “jet delete packs/abc”. If the pack is running when another launch request comes in, the old one will be killed first (using SIGQUIT). If you want to temporarily prevent a pack from starting up, you can remove its executable bit (“chmod -x packs/abc.sh”) and add it back in later (“chmod +x packs/abc.sh”).

As with serial ports, JET pack launch requests only persist across hub restarts if you include the RETAIN flag by using “jet pub -r ...”.
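The launch rules above (a JSON array, no path names as first item, an empty payload to stop) can be captured in a short check. This Python sketch is a guess at what such validation might look like, not the hub's actual implementation, and the function name is hypothetical:

```python
import json


def parse_pack_request(payload):
    """Return the argument list for a packs/<name> payload, or None to stop."""
    if payload == "":
        return None  # an empty message stops the pack
    argv = json.loads(payload)
    if not isinstance(argv, list) or not argv:
        raise ValueError("payload must be a non-empty JSON array")
    if not all(isinstance(a, str) for a in argv):
        raise ValueError("all array items must be strings")
    if "/" in argv[0]:
        raise ValueError("path names are not accepted: " + argv[0])
    return argv


print(parse_pack_request('["abc.sh","arg1","arg2"]'))
```

Rejecting any slash in the first item is what confines launches to the “packs” directory: something like “../other.sh” or “/bin/sh” never gets through.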

The “database” built into JET/Hub is set up as a general-purpose key-value store - i.e. persistent data storage which can only store and retrieve (arbitrary) data by key. The keys are treated as a hierarchical structure, separated by forward slashes (“/”). This database looks very much like an ordinary file system, and also very much like an MQTT topics hierarchy.

This is by design: it makes it easy to treat this persistent data in the hub as a tree-of-files, where each “file” is usually a JSON object. The size of these objects can range from one byte to multiple megabytes, since the underlying storage implementation can easily accommodate them all.

Speaking of implementation: the hub’s store is now based on BoltDB, which is widely used in Go, is in active development, and appears to be well-designed and robust. And it’s open source.

BoltDB is an “embedded database” package, which means that its implementation is part of the application using it. As an important consequence, no outside access by any other process is possible while the database is open. In the case of the hub, this won’t be very restrictive, as all its functionality is going to be exposed via MQTT requests anyway: the hub is the database server.

Store semantics

Let’s call this a “store” from now on, since it isn’t a full-scale database (no indexes, no multi-user facilities, no user management, just keys and values - albeit arbitrarily many, and nestable).

To store data, we need a (string) key to specify where to store things, and a value (which can be an arbitrary set of bytes).

All keys must start with a slash. Another constraint is that we can’t store any data at the top level, we need to include at least one extra “directory level” in our keys. Empty key segments should be avoided and will at some point probably be rejected. So these are valid keys:

/abc/def        (an "abc" directory, with a "def" entry in it)
/a/b/c/d/e/f    (5 directory levels, with an "f" entry in the innermost one)

… while these are not:

abc/def         (not starting with a slash)
abc/            (same, and empty entry name)
/abc            (need at least one dir and one entry)
/abc/           (empty entry name)
//def           (empty directory name)

In the case of “/abc/def/ghi”, there are two directory levels, “abc” and “def”, plus the final “file-like” entry, which is where the value gets stored. It really helps to think of these keys as being very much like file-system paths, except that there are no “.” and “..” special entries, since there is no concept of a “current directory”.
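The key rules above fit in a few lines of code. Here is a Python sketch of such a check, written for this article (the function name is made up, and the hub itself may be more lenient about empty segments for now):

```python
def is_valid_key(key):
    """Check a store key: leading slash, at least one dir plus one entry."""
    if not key.startswith("/"):
        return False
    segments = key[1:].split("/")
    # need a directory and an entry, and no empty names anywhere
    return len(segments) >= 2 and all(segments)


for key in ["/abc/def", "/a/b/c/d/e/f", "abc/def", "abc/", "/abc", "/abc/", "//def"]:
    print(key, is_valid_key(key))
```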

The values stored can be any non-empty byte sequence, although other parts of JET might impose some further restrictions, such as being valid JSON.

Empty values cannot be stored - they represent the absence of an entry, as will become clear below. But you can store the empty JSON string, represented as a pair of double quotes ("").

The hub’s store listens to MQTT topics starting with “!” and “@”, i.e. patterns “!/#” and “@/#”.

Storing a value

To store the value “123” in directory “foo”, entry “bar”, send this message to MQTT:

topic:   !/foo/bar
payload: 123

I.e. when using the “jet” utility:

jet pub '!/foo/bar' 123

(the “!” needs to be quoted, because it has special significance in the shell)

In the case of JSON objects as values, you can use quoting to get things across without the shell messing things up. For example:

jet pub '!/foo/bar' '{"name":"John","age":21}'

In both cases, the “foo” top-level directory will be automatically created if it doesn’t exist. This applies to all intermediate levels.

Deleting a value

To delete a value, store the empty value in it (i.e. a value zero bytes long). There are two ways to do this with “jet”:

jet pub '!/foo/bar' ''
jet delete '!/foo/bar'

To delete all values under directory “foo”, as well as the directory itself, use either of these:

jet pub '!/foo' ''
jet delete '!/foo'

This will delete “foo” and everything below it, including all sub-directories. Use with caution.

Fetching a value

Now it gets interesting. How do you fetch a value? Or rather, what do you do with that result?

This is implemented in JET as a “request/reply” pair: the request to fetch a value includes a topic to which the reply (i.e. the fetched value) should be sent. So to fetch the value of directory “foo”, entry “bar”, we need to set up a listener to a unique topic, and send this MQTT message:

topic:   @/foo/bar
payload: "<topic>"      (a JSON-formatted string)

For example, if we want to receive the value back on topic “abc”, we can send:

jet pub @/foo/bar '"abc"'

Note the extra quotes again here, to get that JSON string across properly.

When this request is picked up by the hub, it’ll obtain the requested value from the store and send a message to topic “abc” with that value. It’s a bit like a “call” with a “return value”, and indeed this is essentially a Remote Procedure Call, mapped on top of MQTT.

If the requested key does not exist, an empty value will be returned instead.

Note that if any of the directories don’t exist, nothing will be returned at all.
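To make the request/reply flow a bit more tangible, here is a Python sketch of what the receiving side might do with a fetch request. It is a simplification written for this article: the values live in a flat dict (so the “missing directory” case is not modelled), the publish callback stands in for a real MQTT client, and all names are made up:

```python
import json


def handle_fetch(values, topic, payload, publish):
    """Answer a fetch request of the form "@/<key>" with payload "<topic>"."""
    if not topic.startswith("@/"):
        raise ValueError("not a fetch request: " + topic)
    reply_topic = json.loads(payload)  # the payload is a JSON-formatted string
    if not isinstance(reply_topic, str):
        raise ValueError("payload must be a JSON string")
    key = topic[1:]  # "@/foo/bar" -> "/foo/bar"
    publish(reply_topic, values.get(key, b""))  # empty value if key is absent


replies = []
handle_fetch({"/foo/bar": b"123"}, "@/foo/bar", '"abc"',
             lambda t, v: replies.append((t, v)))
print(replies)
```

The requester picks the reply topic, which is why it should be unique: two clients using the same one would see each other's answers.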

Listing the entries

One last function is needed to make the above suitable for general use: the ability to find all the entries in a specific directory. This can be done by “fetching” the value of that directory:

jet pub @/foo '"abc1"'

The reply sent to the “abc1” topic might be something like:

{"bar":3,"baz":100,"sub":0,"text":30}

Each entry is a key in the returned JSON object, with as value the number of bytes stored for that entry. In the case of sub-directories, this number will be zero.

So in this example, there are three entries, called “bar”, “baz”, and “text”, as well as one sub-directory, called “sub”. We could then follow-up with a new request:

jet pub @/foo/sub '"abc2"'

And a reply would be sent to “abc2”, perhaps this one:

{}

Which indicates that “sub” does exist as directory, and that it doesn’t have any entries.
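Putting the store, delete, fetch, and list rules together, here is a toy in-memory model in Python. It only mirrors the semantics described in this article (nested directories, empty value means delete, listings report sizes with zero for sub-directories); it says nothing about how the hub uses BoltDB, and the class and method names are invented:

```python
class Store:
    """Toy in-memory model: nested dicts are directories, bytes are values."""

    def __init__(self):
        self.root = {}

    def _parent(self, key, create=False):
        """Return (directory holding the final name, final name).

        The directory is None if an intermediate level is missing.
        """
        *dirs, name = key.strip("/").split("/")
        node = self.root
        for d in dirs:
            if d not in node and create:
                node[d] = {}  # create intermediate levels on demand
            node = node.get(d)
            if not isinstance(node, dict):
                return None, name
        return node, name

    def put(self, key, value):
        """Store bytes under key; an empty value deletes the entry or directory."""
        parent, name = self._parent(key, create=bool(value))
        if parent is None:
            if value:
                raise ValueError("cannot store below a plain value: " + key)
            return  # nothing to delete
        if value:
            if parent is self.root:
                raise ValueError("cannot store at the top level: " + key)
            parent[name] = value
        else:
            parent.pop(name, None)

    def get(self, key):
        """Return a value, a {name: size} listing, or None for "no reply"."""
        parent, name = self._parent(key)
        if parent is None:
            return None  # a directory on the way is missing
        entry = parent.get(name, b"")  # absent entry: empty value
        if isinstance(entry, dict):
            return {k: 0 if isinstance(v, dict) else len(v)
                    for k, v in entry.items()}
        return entry


store = Store()
store.put("/foo/bar", b"123")
store.put("/foo/sub/tmp", b"x")
store.put("/foo/sub/tmp", b"")  # delete again, leaving "sub" empty
print(store.get("/foo"), store.get("/foo/sub"))
```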

Current status

As of this writing (end Jan 2016), the above is work-in-progress. An early implementation with slightly different semantics has been implemented and is being adapted to match the above API.

Note that this data store - like all other elements of the hub - is totally optional and independent of the rest of JET. Each pack can choose to use this MQTT-based key-value store, or store data using its own approach. One benefit of this “store” is that it’s always available, to every JET pack.

The acronym “WSN” stands for Wireless Sensor Network. Ok sure, we all have one or more wireless sensor nodes, JeeNodes or whatever, and they probably work nicely. But how do we manage them? What about code revisions?

Let’s go into this for a moment. Because the usual approach of: “a sketch, an upload, and off you go” doesn’t really scale well. How do you manage all those nodes, which may be different in functionality, in their hardware, or even just be different models or revisions of the same type?

But first, I’ll start off this week with a note about running the hub in the background:

And just to keep a clear model of the hub’s main role in front of you, here’s the diagram from last week’s configuration guide again:

Latest hub v4.0-45 builds now on GitHub.

The crontab “@reboot” approach mentioned in the hub’s installation guide has as benefit that it’s very easy to do, without the risk of messing up anything serious, because it doesn’t involve “sudo”. It also should work on just about any system - cron has been around for a long time.

But if you’re willing to do just a little more work, there’s actually a more flexible mechanism in recent Linux distributions, called systemd: it will take care of starting and stopping a service, all its output logs, and catching any runaway or otherwise failing launches.

Here’s how to set up the hub to run under systemd as a service, but first:

type “systemctl” to verify that “systemd” is actually available in your system
make sure the “@reboot” entry in your crontab is commented out! (crontab -e)
also make sure that the hub is no longer running, as you’ll move some stuff around

Now create a file called “jet.service”, with the following lines in it:

[Unit]
Description=JeeLabs JET Daemon
After=mosquitto.service
After=network.target

[Service]
WorkingDirectory=/home/jcw/jet-v4
ExecStart=/home/jcw/jet-v4/hub-linux-arm
User=jcw
Restart=always

[Install]
WantedBy=multi-user.target

Note that this is set up to wait for both Mosquitto and the network to be ready.

Be sure to check the ExecStart, WorkingDirectory, and User settings, and adjust as needed for your situation. If you prefer to put the hub (and its data store and packs) in a more central directory: there’s an “/opt” area intended for just that purpose. Here’s how you can migrate the hub to it:

sudo mkdir -p /opt
sudo mv ~/jet-v4 /opt/

In which case the jet.service file will need to be adjusted to:

WorkingDirectory=/opt/jet-v4
ExecStart=/opt/jet-v4/hub-linux-arm

And if you’ve set up a “jet” script, you’ll need to adjust the path in there as well.

The last step is to put the service in place:

sudo chown root:root jet.service
sudo mv jet.service /etc/systemd/system/

Now you can start and stop the hub (and its child processes, i.e. active JET packs) at will:

sudo systemctl start jet
sudo systemctl stop jet

One thing to beware of is that you need to enable the service if you want it to also start automatically on power-up or after a reboot:

sudo systemctl enable jet

You only need to do this once, it’ll stay that way until you disable it again.

To see the status and the last few lines of the hub’s output, use … you guessed it:

sudo systemctl status jet

Here is some sample output with a freshly-installed hub:

$ sudo systemctl status jet
● jet.service - JeeLabs JET Daemon
   Loaded: loaded (/etc/systemd/system/jet.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2016-02-02 16:37:45 CET; 7s ago
 Main PID: 11645 (hub-linux-arm)
   CGroup: /system.slice/jet.service
           └─11645 /opt/jet/hub-linux-arm

Feb 02 16:37:45 xudroid systemd[1]: Started JeeLabs JET Daemon.
Feb 02 16:37:45 xudroid systemd[1]: Starting JeeLabs JET Daemon...
Feb 02 16:37:45 xudroid hub-linux-arm[11645]: 2016/02/02 16:37:45 [JET/Hub] ...)
Feb 02 16:37:45 xudroid hub-linux-arm[11645]: 2016/02/02 16:37:45 connected ...3
Feb 02 16:37:45 xudroid hub-linux-arm[11645]: 2016/02/02 16:37:45 opening da...b
Feb 02 16:37:45 xudroid hub-linux-arm[11645]: 2016/02/02 16:37:45 starting H...7
Hint: Some lines were ellipsized, use -l to show in full.

(note: the above still shows duplicate timestamps - this has been fixed in the latest hub revision)

If you want to shorten this last command, add the following line to your “~/.bashrc” script:

alias jets='sudo systemctl status jet'

(as always with a .bashrc change: re-load or re-login to put it in effect)

Now, typing “jets” will give you a quick glimpse of the JET/Hub’s status.

It’s really convenient and the new “standard” way to run services in Linux: letting you start, stop, and check up on the hub at any time. Thanks to Thomas L. for his suggestion and help with this.

So you’ve set up a Wireless Sensor Network, i.e. 2 or more “nodes”, talking to each other via RF. This being a JeeLabs article, let’s say you’re using a JeeNode, JeeLink, or other ATmega/tiny-based board with an RFM12 or RFM69 wireless radio module - or perhaps some self-made variation, using some neat flashy ARM µC.

Everything is working fine. Great! And as time passes, you add more nodes. Fun! Some nodes are just like the previous one, for example a “Room Node” to monitor temperature, humidity, motion, and light levels in a room, and some nodes are different, such as an OOK Relay, or a LED Node to control LED strips, or … whatever, you get the idea.

In many cases, similar nodes will only differ in the “node ID” used to identify them, and such nodes can all run the same firmware, with the difference stored in EEPROM. Sometimes, you need to set up a node in just a slightly different way, and you start editing the source code to upload a slightly different build. Easy! The Arduino IDE, Eclipse, or Makefile-based build environment can make this sort of tinkering oodles of (sheer endless!) fun.

What could possibly go wrong, then?

The trouble with this way of working with more than one node, is that the tools used to build the code usually operate in “edit, build, upload, fire-and-forget” mode: each build stands on its own, but the build environment in no way helps you manage multiple nodes and all their variations.

What variations, you ask?

Here are a few, most of them will be pretty obvious:

one node needs to report on wireless as ID 16, the other as ID 17
one node is running on a JeeNode v4, the other on a JeeNode v6
one node is a room node with a Pressure Plug added, the other without
one node is running with an RFM12, the other with an RFM69
one node has an ATmega328P, the other one uses the new ATmega328PB
one node runs as an AVR-based door sensor, the other is ARM-based
one node talks directly to the central node, the other uses a repeater

The list is endless. This is what happens over time. And this is the way to create a (potentially) huge mess when it comes to keeping track of all such variations.

With the Arduino IDE, you may have to readjust the build settings in the “Tools” menu for each different “sketch”, for example. Or worse: re-adjust a #define hidden somewhere deep in one of the included libraries.

In some scenarios, this variability can be ignored: nodes get set up, you test them, you install them in their remote spot, and that’s it. In fact, here at JeeLabs, over a dozen nodes have been running this way for years on end, with only occasional battery replacements to keep ‘em going.

But the world is not a fixed place, and neither is the home. A lot can change in the course of a few years - more nodes, to the point where things become a bit crowded on the RF band perhaps, or maybe improvements such as tracking RF signal levels and adjusting the TX/RX sides to optimise for lowest packet loss - it’s most unlikely that all your home-grown designs will stay the same over the course of a few years, especially if you’ve got tinkering-for-fun in your genes!

One of the main underlying issues is the disconnect between source code and remote nodes: the source code is cross-compiled - by necessity, since a remote node can’t do it - and therefore lives - by definition! - on a different machine from the one where the resulting code is intended to be used.

We can use wonderful version control tools such as Git and GitHub all we like, they can’t address the fact that at the end of the day, the generated machine code will be out there on some small embedded µC, with no mechanism in place to track the association to the exact source code, tool versions, and upload style used when it was originally “flashed” into that device. Say bye, bye to tweaking or bug fixing, after a while.

And then there are the technological advances: of course there will be new, flashier, smarter, cheaper, more flexible, better performing options over time. Should we replace our entire setup every year? Of course not: a working room node is still a working room node, even if it isn’t the latest trend, buzz word, or fad. The reality is that over the years, any home environment is bound to become a collection of old and new. And newer still. Sure, it’d be neat to run 28V DC all over the house and base everything on LED lighting with simple central control - but are we really going to rip out existing AC mains wiring, with its hazards and security requirements? Nah…

This is the situation here at JeeLabs now, and it may actually be in a better shape than some, since many - if not all - of the nodes here have been described and documented as weblog posts over the years, with their code incorporated as examples in the JeeLib repository on GitHub.

Node-by-node development doesn’t scale, neither for hardware, nor for software!

If you’re setting up a home-monitoring / home-automation system, and you’re not assembling it 100% from stable, officially long-term-supported products, then it probably isn’t such a great idea to just keep on adding stuff to what is essentially becoming a personalised environment which no-one else, including your family members, will be able to deal with in the LONG run. And what if… ehm, you’re not always around? Or you simply forgot some of the details as the years go by? (as years tend to do…)

Spouse calls significant other: “the heating isn’t working, what do I do now?” … sound familiar?

There are a number of ways to avoid the long-term mess just described. One is to make the target environments aware of source code. For example by running a Basic / Forth / JavaScript / whatever interpreter on them - then we can work on the target environment as with a regular development platform: the source stays on the target, and whenever we come back to it, we can view that code and edit it, knowing that it is exactly what has been running on the node before.

There are some serious drawbacks with this source-on-the-target approach:

it requires a fairly hefty µC, able to not only run our code, but also to act as an editor and mini development environment - but it is quite effective, and probably a reason why BASIC became one of the mainstream languages decades ago, even in “serious” industrial & laboratory settings
you end up with the worst of all worlds, by today’s standards: a target µC, struggling to emulate a large system, and a very crude editing context
if that target machine breaks, you lose the code - there is no backup
no revision control, no easy code sharing with other nodes, no history

Trying to turn a remote node into a “mini big system” may not be such a great idea after all, in the context of one-off development at home with a bunch of remote nodes: it really is too risky, especially for the tinkering and experimentation that comes with physical computing projects.

Some recent developments in this area, such as Espruino and MicroPython, do try to mitigate the drawbacks by offering a front end which keeps the source code local - but then you end up back at square one: with a potential disconnect between what’s currently running on each node and the source code associated with it, and stored on the central/big development setup.

Another option, which takes some discipline, is to become very good at taking snapshots of your development environment setup, and in particular at taking notes of which build ended up where. With proper procedures, everything becomes traceable, recoverable, and repeatable.

The problem with it: discipline? notes? backups? constantly? for hobby projects? … yeah, right!

To re-iterate: the central problem is that development happens in a different context than actual use - embedded µCs can’t come anywhere near the many very convenient capabilities of modern development environments, with their fancy programmer’s editors, elaborate IDEs, revision control systems, cross-compiler toolchains, debuggers, and uploaders.

The issue here is not that our development tools are lacking. The problem is that they tend to be used in a node-by-node “fire and forget” development style, which doesn’t help with the entire (evolving) home collection of nodes and gadgets. Which node was compiled how again?

The best we can probably do is to aim for maximum automation, and to focus all development in a single spot - not just on a node-by-node basis, but for the entire network and collection of devices we’re gradually setting up. And not just for one node type, or even one vendor’s products, but for everything we’re tying together, from home-grown one-off concoctions to commercially obtained ready-to-use devices and gadgets.

If all design and development takes place in one place, and if all results are pushed out to the remote node “periphery” in a semi-automated way, then we may stand a chance of being able to re-use our work and re-generate new revisions in the same way at a (much) later date. Whereby “one place” doesn’t imply always developing on the same machine (that too, is bound to evolve after all) - we just need to have remote access to that “one place”, the fixed point of it all.

In the longer term, i.e. a decade or more, there is no point trying to find a single tool or setup for all this. Technology changes too fast, and besides: we’re much too keen on trying out the latest new fad / trick / language / gadget. We really need to approach this all with a heterogeneous set of technologies in mind. The goal is not one “perfect” choice, but a broad approach to keeping track of everything over longer periods of time. Much longer than our attention span for any specific new node we’re adding to our home-monitoring/-automation mix.

Maybe it’s time to treat our hobby as a “multi-project”: lots of ideas, lots of experimentation, hopefully lots of concrete working parts, but by necessity it’ll also be a bit like herding cats: alternative / unfinished designs, outdated technologies alongside shiny new ones, and lots of loose ends, some actively worked on, some abandoned, some mature and “in production”.

In terms of keeping things organised to avoid the predictable mess described in the previous article, there really is no other sane option than to at least track the entire home-monitoring and home-automation project in one place. And there’s a fairly simple way to make this practical: simply add a web server on top, which allows browsing through all the files in the project. It can be password-protected if needed, but the key point is that a single area somewhere needs to represent the state of our entire “multi-project”.

How do we get there? Some options come to mind: we could add a web server on the same machine as where our home server is running (JET or whatever), and make sure that all the related code, tools, documentation, and design notes live there. We could turn that entire area into one massive Git repository, and even keep a remote master copy somewhere (on GitHub, why not?). Note that this is not really about sharing, it’s merely a way to keep track of what is inevitably going to be a unique and highly personal setup. And if putting it in public view doesn’t feel right, then of course you shouldn’t be placing your copy on GitHub. Put it in a personal cloud folder instead, or keep it on a server within your own house (you do have a robust backup strategy in place, right?). The main point is: treat your hobby setup as if it were an “official” project, because it’s even more important to create a durable structure for such a unique and evolving configuration, than with public open-source stuff which is going to be replicated all over the place anyway.

As you can see, this isn’t about “the” solution, or “the” technology. There is no single one. In a way, it’s about the greater context of “sustainable tinkering”, thinking about where your projects and hobbies will take you (and your family members) ten years from now. You’re probably not doing all this to become a “sysadmin for your own house”, right?

What we need to do, is design and implement “in the open”, so that we can go back and tweak / fix / improve things later, possibly many years later, when all the neat ideas and builds will be fond memories, but their details long-forgotten. Note that “in the open” does not imply “in public”, it may well be open to an audience of just one person: you. What “open design” actually means here, is: resumable design.

Keep in mind that this is a long-term, small-scale, personal, bursty, hobby-mode context. Life is too short to allow it to turn into a long-term mess - yet that seems to be exactly what happens a lot, well… at least here at JeeLabs. It’s time to face up to it, and to try to avoid these problems.

From this perspective, this hobby may become a whole different ball game. Tools which could come in handy include Hugo, to easily manage notes (ignore all the flashy “themes”), and Gogs, to set up a personal git repository browser. Heck… taking notes, documenting your ideas and progress, and tracking the evolution of your own designs over time could actually be fun!

Just to avoid any possible confusion: the case made in the previous article for “centralised node management” is not meant to imply that nodes need to operate in a centralised fashion!

This is where design and development are really distinct from the decisions on what remote nodes (wireless as well as wired) actually do. We may well want to design a home automation infrastructure where a light-switch node sends out a signal when pressed, only intended for another node in the same room, controlling a lamp attached to it.

JET uses MQTT as its central message exchange “switchboard” (or “bus”, rather), which is indeed a centralised design. A model of automation where all decisions are made centrally is probably also going to be the main (or at least initial) mode of operation for JET and its hub. But all such decisions can be made on a case-by-case basis: you could for example, decide to use the central system only as a “data collection centre”, with all home-automation decisions taking place in the periphery, by the actual nodes involved in (and affected by) a particular event.

This leads to a design whereby the central node doesn’t end up becoming a potential “single point of failure”, an important concept in reliability engineering. A system without central decision-taking authority is able to limit the effect of any failure, allowing the rest of the system to continue to function. So that a faulty light switch in the garage will not affect the heating control system (assuming that different nodes are involved in these two functions).

Warning: what follows is HIGHLY tentative - it’s just a (wild) thought experiment for now!

The “bigger” plan for JET is to create a range of different node types, each with a specific set of sensors and other hardware, and to manage each of these remote units from a central node attached to the hub. This includes the source code, its cross-compilation, and the upload process to get the resulting firmware “flashed” into the remote units (either over the air, or by wiring them up temporarily). This will allow managing versions and revisions, especially when tied to each remote µC’s built-in unique “hardware ID”.
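As a rough illustration of what tying versions and revisions to each hardware ID could look like, here is a minimal Python sketch of a build registry. Everything in it is hypothetical: the field names, the record_flash and current_build helpers, and the example ID and revision strings are made up for this sketch, and JET does not define such a format.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class FlashRecord:
    """One upload event; all field names are hypothetical."""
    hardware_id: str   # the µC's built-in unique ID, as a hex string
    node_type: str     # e.g. "room-node"
    source_rev: str    # revision of the source tree that was built
    toolchain: str     # compiler name and version used for the build
    upload: str        # "ota" (over the air) or "wired"

def record_flash(registry: dict, rec: FlashRecord) -> None:
    # Keep the full history per device, newest entry last.
    registry.setdefault(rec.hardware_id, []).append(asdict(rec))

def current_build(registry: dict, hardware_id: str):
    # The last recorded upload is what should be running on that node.
    history = registry.get(hardware_id, [])
    return history[-1] if history else None

registry = {}
record_flash(registry, FlashRecord("0123abcd", "room-node", "rev-1", "gcc-x", "wired"))
record_flash(registry, FlashRecord("0123abcd", "room-node", "rev-2", "gcc-x", "ota"))
print(json.dumps(current_build(registry, "0123abcd"), indent=2))
```

The point of such a record is only that "which node was compiled how?" can be answered from data kept in one place, instead of from memory.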

But the story does not end there. The idea in JET is to separate the functionality (i.e. basic capabilities and hardware drivers) from the wiring (i.e. the way all the available functions are tied together). The basic capabilities will have to be hard-coded into the firmware as C/C++ code, but the wiring is going to be implemented as (soft) data, i.e. as a (possibly quite elaborate) data structure describing all the periodic events, sensor triggers, and actual “readings”, and how these should be routed - both inside each remote node and between these nodes (as messages).

For the existing (currently all-wireless) nodes here at JeeLabs - which all happen to be various generations of JeeNodes - very little will change: they’ll continue to broadcast their sensor readings as is, without any notion of where that data ends up or what is done with it.

For future nodes, the aim is to build a lot more flexibility into them, by adding support for rules and a certain amount of decision-making capability. All driven from “wiring diagrams” and customisable for each individual node. An example of this could be a set of rules to behave as follows: “report motion detection immediately, but also send a trigger to a specific lamp if it’s dark outside”. Then again, maybe this logic should not be the motion sensor’s role, but the lamp’s role, i.e. we could instead tell the lamp node to listen for motion and light level messages, and let it decide for itself whether its lamp should be turned on. We can explore both avenues.
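To make the motion-sensor example a bit more tangible, here is a small Python sketch of rules expressed as data. It is purely a thought experiment in the same spirit as the text: the rule format, the key names ("on", "when", "send"), and the evaluate function are all invented for this illustration and say nothing about what JET will end up using.

```python
# Hypothetical "wiring" for a motion-sensor node, expressed purely as data.
WIRING = [
    {"on": "motion", "send": "report"},                           # always report
    {"on": "motion", "when": {"dark": True}, "send": "lamp/on"},  # only when dark
]

def evaluate(wiring, event, state):
    """Return the messages a node would send for one incoming event."""
    out = []
    for rule in wiring:
        if rule["on"] != event:
            continue
        conditions = rule.get("when", {})
        # A rule fires only if every condition matches the node's current state.
        if all(state.get(key) == value for key, value in conditions.items()):
            out.append(rule["send"])
    return out

print(evaluate(WIRING, "motion", {"dark": True}))
print(evaluate(WIRING, "motion", {"dark": False}))
```

Moving the decision to the lamp node instead would mean giving the lamp a similar list with rules triggered by incoming motion and light-level messages; the evaluator itself would not need to change.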

The idea here is not to come up with a design right now, but to illustrate how configurable “smarts” could be encoded as rules, sent out (and managed) by the central node, but leading to autonomous behaviour in the remote nodes. This can really only work when this kind of behaviour is designed for the network as a whole, and not simply by throwing new nodes in and leaving the old ones alone. Clearly, all of this will need to evolve over time - as we gradually find out and evaluate what sort of behaviour we actually want for our home!

With changes limited to sending out new wiring diagrams, we can greatly reduce the risk of failure and serious disruption in the case of (occasional, but inevitable) mistakes. These wiring diagrams are likely to be quite small, compared to a full firmware upgrade, and since we’re not altering the firmware itself, we’ll also benefit in the extremely important area of security: it’ll be impossible to make nodes do things they were definitely not intended to do, if their firmware stays intact (assuming it properly validates all incoming wiring changes!).

A nice side-effect (for everyone who is not a deep-diving C/C++ programmer), is that firmware recompiles will not be required to change a node’s behaviour: that’ll be controlled by the wiring.

Another benefit with this data-driven approach: with the separation of code and data, different node architectures can be tried out, with different µC boards and different RF technologies. As long as all the nodes understand the same basic common data structure conventions, we’ll be able to mix / replace / upgrade as needed, and make them inter-operate as much as we like.

This article supersedes this one - but it’s probably still useful to read that original article first.

While many of the design choices remain the same, the API has changed a bit. The description below matches what is currently implemented on GitHub and included in the latest release.

Terminology and semantics

What hasn’t changed, is the way the data store is tied into MQTT. The hub listens for messages matching “!/#” and “@/#” topic patterns and interprets these as stores and fetches, respectively.

Here is how to store the text “abc” in an item “c” inside a bucket “b”, which is in turn inside bucket “a” at the top level:

jet pub '!/a/b/c' abc

(the topic has to be quoted, because “!” has special significance in the shell)

The term bucket is from the underlying BoltDB package. As you can see, this looks very much like storing text in a file “c” in the directory “/a/b/”. From now on, we’ll stop calling these nested level buckets, and use the term directories and directory paths instead. But to avoid confusion with real files, let’s also continue to call “c” an item, and not a file. It’s not accessible via the file system after all, it only exists somewhere inside the BoltDB data file.

There is one important limitation in BoltDB: it can only store directories at the top level. Items must be placed inside a directory, i.e. “!/c” is an invalid item reference, since it’s not in a dir.

Another convention added in this redesign, is that directories must always be specified with a trailing slash, whereas items may not end in one. So “!/a/b/” and “@/a/b/” refer to the (nested) sub-directory “b”, while “!/a/b” and “@/a/b” refer to the item “b”. Also, empty names are no longer allowed, i.e. a path cannot contain two slashes next to each other (“…//…” is invalid).
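The naming rules above are easy to capture in code. The following Python sketch is not taken from the hub: the classify function is a hypothetical helper which only restates the conventions from the text, and it assumes that a bare “!/” or “@/” refers to the top level.

```python
def classify(topic: str):
    """Split a data-store topic into (operation, kind, names)."""
    prefix, sep, path = topic.partition("/")
    if sep != "/" or prefix not in ("!", "@"):
        raise ValueError("topic must start with '!/' or '@/'")
    op = "store" if prefix == "!" else "fetch"
    if path == "":
        return op, "dir", []               # top level (an assumption, see above)
    is_dir = path.endswith("/")            # trailing slash means directory
    names = path[:-1].split("/") if is_dir else path.split("/")
    if "" in names:
        raise ValueError("empty names are not allowed")
    if not is_dir and len(names) < 2:
        raise ValueError("items must live inside a directory")
    return op, ("dir" if is_dir else "item"), names

print(classify("!/a/b/c"))
print(classify("@/a/b/"))
```

With this, “!/c” and “!/a//b” are both rejected, while “@/a/b” and “@/a/b/” are told apart as item and directory.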

Extracting / fetching data is a two-step process: you send a message with a topic corresponding to the item of interest, and as payload a “reply topic”. Like this, for example:

jet pub @/a/b/c '"abcde"'

Note the extra double quotes: “abcde” is a JSON string which names the topic where the reply will be sent (it should not normally start with “@/…”!). To see what’s happening, we have to subscribe to that topic before sending out the fetch request, i.e. by keeping a separate terminal window open and running this command:

jet sub abcde

No double quotes this time: the topic is always a plain string, not JSON. If we now re-send that “jet pub '@/a/b/c' '"abcde"'” request, we’ll see this output appear in the subscription:

abcde = abc

In summary: to store a value, send it as payload to the proper “!/…” topic. To fetch a value, set up a subscription listener to pick up the reply, then send a message to the proper “@/…” topic and specify our listener topic as payload, formatted as JSON.
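For clients other than the “jet” command-line tool, the only subtle part is the JSON quoting of the reply topic. Here is a minimal Python sketch of how the two kinds of requests could be assembled; store_request and fetch_request are hypothetical helper names, and actually publishing the resulting topic and payload is left to whatever MQTT client library is in use.

```python
import json

def store_request(path: str, payload: bytes):
    """Topic and payload for storing raw bytes at an item path like 'a/b/c'."""
    return "!/" + path, payload

def fetch_request(path: str, reply_topic: str):
    """Topic and payload for a fetch; the reply topic travels as a JSON string."""
    return "@/" + path, json.dumps(reply_topic).encode()

print(store_request("a/b/c", b"abc"))
print(fetch_request("a/b/c", "abcde"))
```

The subscription to the reply topic itself uses the plain string (“abcde”), without the JSON quotes.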

This approach turns MQTT into an RPC mechanism for the data store. Any MQTT client can use the data store if it adheres to the above convention; this is not limited to JetPacks. As long as the hub is active, the store will process these requests.

Payload considerations

MQTT topics are always plain text strings, with “/” to segment the key space. Null bytes and control characters should be avoided, but UTF-8 is fine.

MQTT payloads can be anything: plain text, JSON-formatted text, or binary data. The same holds for the data store: it takes a number of bytes, whatever their format might be, and returns them as is. There is no hard limit for the size of a payload.

Note that in JET, many parts of the system do expect JSON-formatted payloads. For numeric values, there is no difference, but strings will need to be double-quoted when this is the case.

Storing data

As already shown above, you store an item by sending it as “!/…” message:

jet pub '!/a/b/c' 123

If the item exists, it will be overwritten. If the directory “/a/b/c/” exists, you’ll get an error - items and directories cannot have the same name.

All intermediate directory levels are automatically created if necessary. Again, this will fail if any of the directory names already exist as item names.
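These rules can be modelled with a few lines of Python, using nested dicts for directories and bytes for items. This is only a sketch of the behaviour described here, under that assumed representation; it is not how the hub or BoltDB work internally, and store_item is a made-up name.

```python
def store_item(root: dict, path: str, payload: bytes) -> None:
    """Store payload at an item path such as 'a/b/c' in a nested-dict model."""
    *dirs, name = path.split("/")
    node = root
    for d in dirs:
        child = node.setdefault(d, {})     # create missing directories
        if not isinstance(child, dict):
            raise ValueError(f"'{d}' already exists as an item")
        node = child
    if isinstance(node.get(name), dict):
        raise ValueError(f"'{name}' already exists as a directory")
    node[name] = payload                   # create or overwrite

root = {}
store_item(root, "a/b/c", b"123")
store_item(root, "a/b/c", b"456")          # overwrites the previous value
print(root)
```

Trying to store to “a/b” or to “a/b/c/d” at this point raises an error in the model, since “b” is a directory and “c” is an item.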

You can also store multiple items in one go, by storing a JSON object to a directory. The above could also have been written as:

jet pub '!/a/b/' '{"c":123}'

Same effect, and to store multiple items, we could have done:

jet pub '!/a/b/' '{"c":123,"d":456}'

This creates (or overwrites) two items in the “/a/b/” directory. This is an atomic operation: all the items are saved as part of a single transaction.

Multi-stores can be convenient to “unpack” an object into separate items, but since the request uses JSON, you can only use it to store JSON-formatted data. To store arbitrary text or binary data, you have to use the single version.

The following request is a no-operation, except that it will create “/a/b/” if it didn’t exist:

jet pub '!/a/b/' '{}'

Note that a multi-store does not affect other items in the same directory. Items are “merged into” the directory, leaving the rest unchanged; nothing gets deleted. Speaking of which…

Deleting data

Deleting an item is done by sending it an empty payload:

jet pub '!/a/b/c' ''

Or, equivalently:

jet delete '!/a/b/c'

Note that this cannot be done via a multi-store:

jet pub '!/a/b/' '{"c":""}'

This will store the empty JSON string (with its double quotes), not a zero-length payload, which may not be what you had in mind.
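The merge semantics of a multi-store, including this corner case, can be sketched in Python as follows. The directory is modelled as a plain dict of bytes, and multi_store is a hypothetical name; the compact JSON formatting used for each stored value is an assumption, the text only tells us that numbers and strings end up as their JSON text.

```python
import json

def multi_store(directory: dict, payload: bytes) -> None:
    """Merge a JSON object into a directory (modelled as a dict of bytes)."""
    obj = json.loads(payload)
    if not isinstance(obj, dict):
        raise ValueError("multi-store payload must be a JSON object")
    # Encode everything first, so that a bad request changes nothing at all.
    encoded = {key: json.dumps(value, separators=(",", ":")).encode()
               for key, value in obj.items()}
    directory.update(encoded)              # other items are left untouched

d = {"x": b"keep me"}
multi_store(d, b'{"c":123,"d":456}')
multi_store(d, b'{}')                      # no-op
multi_store(d, b'{"c":""}')                # stores two quote characters
print(d)
```

After these calls, “x” is unchanged, “d” holds 456, and “c” holds a two-byte payload consisting of the double quotes, so it has not been deleted.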

You can also delete a directory and everything it contains, including any sub-directories, by sending the empty payload to the directory:

jet pub '!/a/b/' ''

As before, the item vs. directory distinction is made through the trailing slash.
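In a toy model with nested dicts as directories and bytes as items, deletion could look like the Python sketch below. The delete function is hypothetical, and so is its handling of paths which don’t exist (they are silently ignored here); the text does not say what the hub does in that case.

```python
def delete(root: dict, path: str) -> None:
    """Model of sending an empty payload to '!/<path>'.

    A trailing slash selects a directory, which is removed together with
    everything inside it; without one, only an item of that name is removed.
    """
    is_dir = path.endswith("/")
    *dirs, name = path.rstrip("/").split("/")
    node = root
    for d in dirs:
        node = node.get(d)
        if not isinstance(node, dict):
            return                         # parent missing: nothing to delete
    target = node.get(name)
    if target is not None and isinstance(target, dict) == is_dir:
        del node[name]

root = {"a": {"b": {"c": b"1", "sub": {"x": b"2"}}, "k": b"3"}}
delete(root, "a/b/c")                      # removes item "c" only
delete(root, "a/b/")                       # removes "b" and its sub-directory
print(root)
```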

Empty payloads

As you can see, empty payloads play a special role. An empty payload is not the same as the empty JSON string ("") or even JSON’s “null”, which both consist of a small number of bytes, even if they represent “nothingness”.

Storing empty payloads deletes stuff from the store. But since fetching a non-existing item also returns the empty payload, you can often ignore this behaviour. The only difference is in directory listings, as described below.

Fetching data

The fetching behaviour of the store has already been described above, but for completeness, here is a quick example anyway:

jet pub @/a/b/c '"abcde"'

This is what will happen when this message is sent:

the hub picks up the request
it retrieves the content of item “c” in directory “b” inside directory “a”
it sends the result to MQTT as payload, using “abcde” as topic

If the item did not exist, an empty payload will nevertheless be sent. But in case “/a/b/” doesn’t exist, the hub will report an error on its log instead, and not send anything back.
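The two outcomes (an empty reply versus no reply at all) are easy to mix up, so here is a small Python model of the lookup. It uses nested dicts for directories and bytes for items, which is an assumption made for this sketch, and fetch is a hypothetical name; returning None stands for “the hub logs an error and sends nothing”.

```python
def fetch(root: dict, path: str):
    """Model of a fetch for an item path such as 'a/b/c'.

    Returns the reply payload, or None when no reply would be sent because
    the enclosing directory does not exist.
    """
    *dirs, name = path.split("/")
    node = root
    for d in dirs:
        node = node.get(d)
        if not isinstance(node, dict):
            return None                    # the hub logs an error instead
    value = node.get(name, b"")            # missing item: empty payload
    # Assumption: asking for a directory by its item name also yields nothing.
    return value if isinstance(value, bytes) else b""

root = {"a": {"b": {"c": b"abc"}}}
print(fetch(root, "a/b/c"))
print(fetch(root, "a/b/zzz"))
print(fetch(root, "a/x/c"))
```

A client waiting for a reply therefore needs a timeout: for a missing directory, nothing will ever arrive on the reply topic.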

Listing directories

One request type has not yet been presented. The data store also offers a way to scan its contents, allowing you to enumerate all items in either the top level or any existing directory.

This again, uses the “@/…” notation, with a reply topic as payload. The difference is that now the topic refers to a directory. An example:

jet pub @/a/b/ '"abcde"'

The result, as reported by “jet sub abcde”, might be something like:

abcde = {"a":2,"b":0,"c":4}

Here, “/a/b/” contains items “a” and “c”, with payloads of size 2 and 4, respectively, as well as a subdirectory “b”. Subdirectories always have zero size, which is never the case for normal items.

Names are stored in sorted order (sorted as raw bytes that is, not UTF-8 or anything fancy), but JSON object attributes aren’t always kept in order (they’re usually implemented as hash tables).
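As an illustration of the listing format, the following Python sketch builds such a reply from a directory modelled as a dict (nested dicts for subdirectories, bytes for items). The list_dir function is hypothetical and only mirrors the description above; the compact JSON output is chosen to match the example.

```python
import json

def list_dir(directory: dict) -> bytes:
    """Build a listing reply: names mapped to payload sizes, 0 for subdirs."""
    # Sort by raw bytes, as the store does for its keys.
    names = sorted(directory, key=lambda n: n.encode("utf-8"))
    sizes = {n: 0 if isinstance(directory[n], dict) else len(directory[n])
             for n in names}
    return json.dumps(sizes, separators=(",", ":")).encode()

d = {"c": b"4444", "a": b"22", "b": {}}
print(list_dir(d).decode())                # prints {"a":2,"b":0,"c":4}
```

A client which needs the names in order should sort them again after parsing, since the parsed JSON object may not preserve the order in which the attributes were sent.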

More advanced searches - such as ranges and globs - can be implemented later, by passing in more information than just a reply topic string. This could also be used for on-the-fly statistics, i.e. scanning and summarising data on the hub, and reporting only the resulting metrics.

Reply topics

As you can see, all accesses require some reply topic to get the results back to the requesting app. These topics should be unique, to avoid confusion about which reply relates to which request.

The plan is to have a convention for any JetPack to easily come up with such reply topic names, and to add some utility code which will wait for a reply and timeout if nothing comes in quickly. Since each JetPack has a unique name when it connects to MQTT, and since the hub manages these names when it starts them up, we can probably choose topics with the following structure:

packs/<packname>/replies/<seqnum>

This way, each pack can easily track and issue its own sequence numbers. Other (non-JetPack) applications will have to come up with their own unique reply topics.

For the time being (early Feb 2016), reply topics are not yet automated.
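Until then, a client can generate such topics itself. Here is a minimal Python sketch of the naming scheme proposed above; the ReplyTopics class and its new_topic method are hypothetical, and starting the sequence at 1 is an arbitrary choice.

```python
import itertools

class ReplyTopics:
    """Hands out unique reply topics for one pack."""

    def __init__(self, packname: str):
        self._prefix = f"packs/{packname}/replies/"
        self._seq = itertools.count(1)     # per-pack sequence numbers

    def new_topic(self) -> str:
        return self._prefix + str(next(self._seq))

topics = ReplyTopics("demo")
print(topics.new_topic())
print(topics.new_topic())
```

Each fetch then gets its own topic, so a late reply to an earlier request cannot be mistaken for the answer to a newer one.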

Here’s a “JET” engine, for your amusement:

This week, I’m going to go into the practical aspects of the JET project: for production, i.e. the always-running hub + MQTT server, and for development, i.e. what is needed to work on this software and take it further.

As you’ll see, these end up at two extremes of the spectrum: a tiny setup with minimal dependencies is enough for production, whereas development requires a slew of tools (albeit standard and well-supported), and a hefty machine if you want a snappy development cycle.

But first, I’ll revisit the redesigned JET data store, and its new - improved! - API:

All these aspects are still evolving and in flux, but hey… ya’ gotta start somewhere!

Latest hub v4.0-66 builds now on GitHub.

One of the main design goals for JET, is that it must run well on very low-power Linux boards.

The reasoning is that you really don’t need much computing power at all to manage all the data flows involved with home monitoring and automation. It’s a very slow kind of real-time system, with at most a few events per second, and most of the time nothing else to do than collecting the incoming sensor data for temperature / light / door / motion sensors, etc.

Then again, a good responsive user interface which can update a flashy graphical screen and is able to show live graphs with potentially a lot of historical data is considerably more demanding. The insight here, is that all of this processing can take place in the browser, which is always very performant nowadays, on desktops as well as mobile platforms.

In fact, with modern “reactive” and “single-page” applications, we don’t even need a very capable web server - if it can serve 100% static files and handle a WebSocket connection on the side for all the (bi-directional) real time stuff, we’ll be fine. There’s no need for any server-side rendering of web pages, i.e. no templating, no embedded scripting language, nothing.

That means that on the server side, JET must include these functions:

a message dispatcher, i.e. MQTT with Mosquitto
accept incoming sensor data, via serial, gpio, i2c, LAN, WiFi, etc.
keep raw data logs, i.e. the current “logger” module in the hub
a scalable data store, in this case a key-value database using BoltDB
a web server, serving static web pages, and JS/CSS/image assets
websocket handler, tied into MQTT (either in Mosquitto or via the hub)
the ability to run arbitrary tasks as custom JetPacks, launched by the hub
optional extensions, such as statistics, historical data, and a rule engine

But that’s about it. As long as none of these pull in large applications, we really can keep it all very lightweight. And indeed, it really is a very light load - after two weeks of running, the MQTT + hub combination has proven to require extremely few resources:

the hub process (in Go) needs less than 5 MB RAM as working set
Mosquitto (in C) needs well under 1 MB of RAM to do its thing
on an Odroid U3, the hub uses 15 min/day of CPU, and Mosquitto 1.5 min/day
that’s with 1,500 incoming messages per hour, about 3 MB of raw logs per day

This is a few percent of a Raspberry Pi - even an “ancient” model B is more than enough:

(model A is a bit inconvenient due to its lack of Ethernet and scarcity of USB ports)
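To put the Odroid U3 figures quoted above into perspective, here is the arithmetic as a few lines of Python. It assumes that the CPU minutes refer to a single core and that 1 MB means one million bytes; neither is stated in the measurements.

```python
MIN_PER_DAY = 24 * 60

hub_cpu = 15 / MIN_PER_DAY                 # fraction of the day spent on the CPU
mosquitto_cpu = 1.5 / MIN_PER_DAY
msgs_per_day = 1500 * 24
bytes_per_msg = 3_000_000 / msgs_per_day   # assuming 1 MB = 10**6 bytes

print(f"hub: {hub_cpu:.1%}, mosquitto: {mosquitto_cpu:.2%}")
print(f"{1500 / 3600:.2f} msg/s, about {bytes_per_msg:.0f} bytes logged per message")
```

In other words: roughly 1% CPU load for the hub on that board, less than one message every two seconds, and some 80 bytes of log data per message.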

Now let’s look at the software side of things…

The minimal JET setup has virtually no dependencies on other software, since the hub is built as a fully static Go executable, and Mosquitto is a one-line install (“sudo apt-get install mosquitto”) which also pulls in virtually no other packages.

For a truly minimal Raspberry Pi setup, all you need is some small Debian or Raspbian build - version 8.x (“Jessie”) is the latest incarnation these days. For a tiny distro, check out pipaOS.

If you’re using HardKernel’s Odroid C1 board, which is really a RasPi clone, then you could use this 1.0-20160131-C1 image for it. It’s a very nice minimal setup, as described in this forum post. The result will fit on even a small 1 GB (µ)SD card, with enough room for a year’s data collection.

And that’s about it. Since the hub can be cross-compiled from Go on any machine (see the JET releases page for a few builds), and Mosquitto is ready to, ehm, go from the “apt-get” package repository used in Debian, Raspbian, and Ubuntu, there is not even a requirement to install the gcc compiler toolchain on such a machine. The hub’s installation has been described earlier.

Since JET is intended to remain always-on, at least as far as the hub is concerned, we need to be a little careful how we introduce changes. The way to do this in JET, is to treat the whole system as one, whereby development simply happens to be inactive some of the time:

The hub, and all the interfaces and JetPacks it has been told to activate, will continue to run no matter what (unless they crash or fail in some other way, evidently). This is what keeps the home monitoring and automation going at all times.

For development, we’re going to need a whole slew of tools:

a command-line shell, to perform ad-hoc tasks
a web browser client, to examine the public interface of the system
NodeJS to (re-)generate all the client-side code and assets
a programmer’s editor or IDE, obviously
an SSH terminal session to connect to the machine running the hub
Go, or whatever language environment we are using in our JetPack
Git, to manage versions and revisions of all the source code
… and probably a variety of other applications and “dev tools”

This is what it looks like on a dual-machine setup, connected via a (local) network:

There are a number of points to make here:

a JetPack does not have to be running on the same machine as the hub - although this will depend somewhat on its role: it might need to access a hardware interface, for example
the hub is not in charge of what happens on the development machine (shown here as being at the receiving end of the arrows) - it is not necessarily the parent of all JetPacks
the hub does not need NodeJS, even when it’s serving web browser requests, but if you would like to use NodeJS functionality, you can install it there as well, of course
the above dual-machine split is optional - when making sweeping (or risky) changes, or simply to try out JET first, this can all be set up on a single development machine

The development system at JeeLabs is a Mac OSX laptop, with the Homebrew package manager installed to grab all the pieces of software, and to easily keep them up to date. Currently, this is:

‘brew install go’ - Go version 1.5.3
‘brew install node’ - NodeJS version 5.6.0
‘brew install macvim’ - the Vim editor, as GUI variant
ssh is pre-installed, gcc and git are in the standard Xcode command-line toolset

If you develop on a Windows or Linux PC, you’ll need to locate these packages for your system. Version requirements are not very stringent: try to use fairly recent versions of Go and NodeJS.

The one missing piece is cross-compilation for ARM and AVR µCs. Here are some options:

for ARM, you can get good ready-made builds from the launchpad.net site
or set up your search paths to use the gcc cross-compilers included in the Arduino IDE
or you could choose to set up cross-compilation on the hub’s Linux machine - this will be slower but can be automated with some remote commands through ssh

More details about these different options will be the subject of a separate article.

Lastly, as example of a dual-machine configuration, here is a permanent-ish setup for the hub:

Actually, this is going to be used as the basis for a secondary test setup at JeeLabs. This should make it easier to experiment with more radical design ideas involving the hub itself. The “main” production setup is separate, and already running 24 / 7 on a battery-backed Odroid U3.

The components above are, in clockwise order:

an Odroid C1+ with 8 GB eMMC, running a lightweight version of Debian Jessie 8.3
also mounted inside the case: the RasPi RF board
a JeeLink Classic with RFM12 @ 868 MHz
an Odroid-branded WiFi stick
underneath the JeeLink, a short USB cable to …
a HyTiny-STM32 board, programmed to act as Black Magic Probe, driving …
a second HyTiny-STM32 board, with RFM69 and OLED display
a 10,000 mAh LiPo-based USB Power Bank - it’ll easily power all this for a day
a small experimental node, based on an LPC824 and RFM69
all mounted on … foam board!… with … tie wraps!

More options are likely to be added later, e.g. for trying out an ESP8266.

This configuration has multiple radios, so it can also be used to generate test packets and see how the receiving node (and hub) processes the data. And the OLED is a nice debugging aid.

JET is going to need a web interface. In fact, it’s likely that a major part of the total development effort will end up being poured into this “front-end” aspect of the system.

After many, many explorations, a very specific set of tools has been picked for this task, here at JeeLabs. It’ll all be JavaScript-based (ES6 as much as possible), since that’s what web browsers want. But unlike a number of earlier trials and actual builds of HouseMon, this time we’ll go for a fairly plain approach: no CoffeeScript and … no ClojureScript. They’d complicate things too much for casual development, despite their attraction (ClojureScript is pretty amazing!).

We do, however, want a very capable development context, able to create a UI which is deeply responsive (“reactive”, even), and can keep everything on the screen up to date, in real-time.

Here is the set of tools selected for upcoming front-end development in JET:

ReactJS - should be easier to learn than AngularJS (but also totally different!)
WebPack - transparently builds and re-builds code during development
Hot Reload - an incredible way to literally edit a running app without losing context
ImmutableJS - a new trend in ReactJS, coming from Clojure and other FP languages
PureCSS (probably) as a simple and clean grid-based CSS styling framework

This front end will be called JET/Web and has been based on the react-example-es2015 project template on GitHub. It has all the main pieces in place for a truly fluid mode of development. A very preliminary setup can be found in the “web/” directory inside the JET repository - but note that the current code still has sample content from the template project.

Front-end development is a lot different from back-end development, i.e. the JetPacks and the hub itself. In development mode, a huge amount of machinery is activated, with NodeJS driving WebPack, “injecting” live-reload hooks into the web pages, and automatic change detection across all the source files. Once ready for “deployment”, the front-end development ends with a “build” step, which generates all the “assets” as static files, and compresses (“uglifies”) all the final JavaScript code into a fairly small single file - optimised for efficient use by web browsers. In JET/Web, the final production code is only about 150 KB of JavaScript (including ReactJS).

If you’re new to any of the tools mentioned above - you may well find that there’s an immense amount of functionality to learn and get familiar with. This is completely unlike programming with a server-side templating approach, such as PHP, Ruby on Rails, or Django. Then again, each and every one of these tools is incredibly powerful - and it’s guaranteed to be fun!

That’s the consequence of today’s breakneck speed and progress w.r.t. web development. But these choices have not been made lightly. Some considerations involved in these choices were:

a low-end (even under-powered, perhaps) server, which can’t handle a lot of processing
the desire to have everything on a web page updated in real-time, without page refreshes
the hub’s web server can’t be restarted, at least not for work on the web client software

In its current state, JET/Web is next to useless. It doesn’t even connect to the MQTT server yet, so there’s no dynamic behaviour other than inside the browser itself (see for yourself, the demo is quite interesting, especially when you start to look into how it’s all set up in the source code).

One final note about these “decisions”: obviously, you have to pick some set of software tools to be able to implement anything meaningful. But with JET, “big” decisions like these are actually quite inconsequential, because many different front ends can easily co-exist: anyone can add another JetPack, and implement a completely different (web or native) front end!

In 1965, computing history was made when DEC introduced a new computer, called the PDP-8. It was the start of a long series of incrementally improved models, later to be followed by the even more successful and ground-breaking PDP-11 series, and then the VAX. It’s a fascinating story, because so many technological trends of that time have shaped what we now use on a daily basis, from laptops, to mobile phones, to watches.

This week’s episode is about the PDP-8 and an amazing replica/kit, called the PiDP-8/i:

Let’s dive right in - as always there will be one article each day this week:

The PDP-8 was before my time, although I did play with it once … clunky doesn’t even start to describe it, by today’s measures!

Let’s get some historical context first - we’re in the year 1965:

the transistor had been commercially available for about a dozen years
RAM memory was made of magnetic cores - it was used between 1955 and 1975
the ASR-33 “teletype” was introduced in 1962 and produced until 1981
block-addressable DECtape had just been invented, holding ≈ 280 KB per reel
hard disk drives were “a few megabytes” and extremely large and expensive
the first 4-bit “computer on a chip” microprocessor was still half a dozen years away
for a lot more context, see this timeline at the Computer History Museum

But perhaps the most telling metric of that time is the amazing revolution made possible by the introduction of the integrated circuit - before the PDP-8, everything had to be constructed from individual components. Here’s the IC’s development path, as summarised by Wikipedia:

Think about it: a few years after the first PDP-8 was produced, a “chip” (from the incredibly successful 7400-series introduced by TI) could replace no more than a handful of gates!

And to get a bit more perspective about the mindset of those days: Ken Olsen, the founder of Digital Equipment Corporation which created the PDPs, said in 1977 that:

“There is no reason for any individual to have a computer in his home.”

(note that he was talking about home automation, not ruling them out for other purposes)

So here we are, half a century ago. Computers were huge colossi, sitting in large noisy rooms, drawing kilowatts of power, and operated by a small group of specialists. Nobody considered these things useful, other than to speed up numerical calculations. Time on “the machine” was so costly, that everything was focused on optimising the computer’s time, not that of us people.

And then came the first minicomputers - most notably the DEC PDP-8 and the Data General Nova. Here’s a picture of one of the first PDP-8’s, with the panels removed to show its innards:

Yeah, it’s called a mini-computer! (image from the SMECC museum in Arizona, US).

There have been many PDP-8 models, over the span of about a decade. Here are a few of the highlights, from Doug Jones’ site - a huge resource for everything related to these machines:

PDP-8 - 1965..1968 - 4K (12-bit word) memory, 1.5 µs memory cycle time - $18,000
PDP-8/i - 1968..1971 - M-series “flip-chips” with wirewrap backplane - $12,800
PDP-8/e - 1970..1978 - SSI/MSI 3-board design, bus instead of backplane - $6,500
PDP-8/a - 1974..1984 - single-board CPU, “workstation” with diskettes - $1,835

Here is a PDP-8/f, a slightly newer version of the PDP-8/e, from The Old Computer Hut site:

A hefty switched power supply on the side and cards which push into an “OMNIBUS” connector board on the bottom. This still uses 4K words of core memory, expandable to a whopping 32K.

One of the distinguishing features of computers from this era is their “programmer’s console” - a row of switches and a bunch of lights which indicate the content of some registers in real time. You can stop the machine dead in its tracks by flipping the STOP switch, examine memory, even “deposit” new values in it, and then continue execution. How’s that for debugging, eh?

Ok, so now we have our computer. How does it interface to the real world? How do we talk to it? How do we tell it what to do? Do we login to it? Or is it all about switches and blinkenlights?

No tablets, no LCDs, no video screens, no internet yet, no Ethernet yet, no local area networks!

The main interface was the teletype, a big and noisy hardcopy printer, keyboard, paper tape reader, and paper tape punch, all in one. A marvel of mechanical (not electronic!) engineering:

The communication speed was 110 baud serial, using the same start/stop bit stream we still use today. That’s about 10 characters per second. The paper punch ran in parallel with the output, so you could “save” what was being sent to you and then later re-enter it, as if you typed it in.

In the early days - or if you had no budget for anything fancier - that was it!

Here’s a 3-minute video, showing a fairly “high-end” setup - s l o w l y ! - typing out a file listing.

To save 2 KB of text, i.e. roughly one typed page, on paper tape you had to load empty paper tape in the punch, start the printout, and listen to a very loud paper punch, pressing holes in the tape for well over three minutes. Oh, and add some “lead-in” and “lead-out”, i.e. blank pieces of tape at the front and the back to make it possible to load the tape and run it through without problems.
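As a rough check of these numbers, here is a small Python sketch. It assumes the usual teletype framing of 11 bits per character (1 start bit, 8 data bits, 2 stop bits), and the function names are made up for this illustration:

```python
# Assumed serial framing: 1 start bit + 8 data bits + 2 stop bits.
BITS_PER_CHAR = 1 + 8 + 2


def chars_per_second(baud: int) -> float:
    """Characters per second at the given line speed."""
    return baud / BITS_PER_CHAR


def punch_time_seconds(num_chars: int, baud: int = 110) -> float:
    """Time needed to print (and punch) num_chars characters."""
    return num_chars / chars_per_second(baud)


cps = chars_per_second(110)
secs = punch_time_seconds(2 * 1024)  # 2 KB of text, lead-in/lead-out not counted
print(f"{cps:.0f} chars/sec, 2 KB takes about {secs / 60:.1f} minutes")
```

The blank lead-in and lead-out tape would come on top of that.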

The paper tape got jammed, you say? Pity. Just start over, if it wasn’t damaged too much.
You want to make a safety copy, just in case? Sure, just start the reader and punch in parallel.

Later on, much faster “high-speed” optical paper tape readers were introduced, which greatly reduced the noise and time spent, but paper tape just isn’t such a great medium when it comes to kilobytes of data. Not to mention the storage needs and keeping track of it all (in handwriting).

Meet the DECtape unit (image from the Computer History Museum):

Here is an 18-second video of how they worked. Each tape can store up to 280 KB of data, and because of the “tape marks” it was able to seek to any block on the tape and read or re-write it as needed. Beats paper tape, but it still took a lot of time just shuttling around to access each block.

The DECtape unit was quite expensive (a TU56 dual unit w/ controller was $5,500 in 1974), but the tapes themselves were cheap, so you could have virtually unlimited storage. Some technical specifications, from pdp8.net:

start/stop/turnaround time: 150/100/200 ms
tape speed: 2.4 m/sec - transfer rate: 8,325 12-bit words/sec
power consumption: 325 watts - weight: 36 kg
tape reels: 10 cm in diameter - tape length: 78 meter

DECtape was pretty convenient to handle, and one tape can store about 140 pages of text, but its Achilles’ heel was seek time: over half a minute just to get from one end of the tape to the other.
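Both figures follow from the specifications listed above. A minimal Python sketch of the arithmetic, assuming 2 KB per typed page as mentioned earlier and ignoring start/stop/turnaround times (the names are only illustrative):

```python
TAPE_LENGTH_M = 78.0    # tape length in meters
TAPE_SPEED_M_S = 2.4    # tape speed in meters per second
CAPACITY_KB = 280       # storage per reel
PAGE_KB = 2             # assumption: roughly one typed page


def end_to_end_seconds(length_m: float = TAPE_LENGTH_M,
                       speed_m_s: float = TAPE_SPEED_M_S) -> float:
    """Time to wind the tape from one end to the other at full speed."""
    return length_m / speed_m_s


def pages_per_tape(capacity_kb: int = CAPACITY_KB, page_kb: int = PAGE_KB) -> int:
    """Number of typed pages that fit on one reel."""
    return capacity_kb // page_kb


print(f"end to end: {end_to_end_seconds():.1f} s, pages per reel: {pages_per_tape()}")
```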

As today, technology trends evolved rapidly. Disks with fixed platters as well as removable ones became more widespread and more affordable year by year:

DF32 - fixed head - 32..128 (12-bit) kilowords - 17 ms seek, 16 kw/sec
RX02 - 8” floppy disk - 256 kilowords - 262 ms avg seek, 20 kw/sec
RK05 - removable pack - 1.6 megawords - 70 ms seek, 1500 rpm, 100 kw/sec xfer

Note the units: kilowords and megawords. That RK05 became a workhorse, also for the PDP-11 later on, but at a steep price: $7,900 per drive + controller. And to be practical you really needed two, otherwise there’s no way to make backup copies or move large amounts of data around!

Let’s compare this to today: a 16 GB µSD card costs around €9, read and write speeds are in the tens of MB/sec range, and there’s no seek time, as the card has no moving parts. Oh… and no controller either - any µC with 4 spare I/O pins can read and write from this thing. That’s 6,000 x the storage of an RK05 pack, 10,000 x as fast, and 1/1000th of its price (per unit, not byte!). Not to mention physical size and power consumption differences…
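To make the word-based units a bit more tangible, here is a small Python sketch which converts 12-bit words to bytes and compares the capacity of an RK05 pack with a 16 GB card. It assumes decimal units (1 megaword = 1,000,000 words, 1 GB = 1,000,000,000 bytes), and the helper name is invented for this example:

```python
BITS_PER_WORD = 12


def words_to_bytes(words: float) -> float:
    """Convert a count of 12-bit words to 8-bit bytes."""
    return words * BITS_PER_WORD / 8


rk05_bytes = words_to_bytes(1.6e6)   # one RK05 pack: 1.6 megawords
sd_bytes = 16e9                      # 16 GB card, decimal units assumed
ratio = sd_bytes / rk05_bytes

print(f"RK05 pack: {rk05_bytes / 1e6:.1f} MB, card holds about {ratio:,.0f} x as much")
```

The result lands in the same ballpark as the rounded storage figure quoted above.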

If you would like to experience for yourself how a computer such as a PDP-8 looks and feels, there are several possible avenues to choose from:

get in touch with a museum, friend, or hobbyist who has “The Real Thing”, and see if you can get a demo or schedule a session
look for old equipment dumps, maybe some company, university, or individual wants to give, sell, or lend such a machine to you
get hold of schematics, spare parts, and go try and build one yourself, possibly re-using any original parts available to you
download a software simulator for the computer model you’re interested in, and have a go at running this virtual environment
design your own emulation, possibly adding some fancy lights and switches to make it more realistic and tangible than a software-only emulation
look for a kit and build it yourself, knowing that others have done the same, with support from the kit maker and/or other builders who went before you

This article is about that last option. Oscar Vermeulen has a site with the wonderful name of Obsolescence Guaranteed where he has collected a lot of information and offers a kit for what he calls the PiDP-8/i (note the “Pi” in there!). Here’s his PiDP-8/i kit in front of a real PDP-8/i:

The PiDP-8/i looks like a 2:3 scale model of the real thing, but inside is a Raspberry Pi, running SIMH, as an extremely elaborate and complete emulation of the PDP-8 (and others) as well as tons of peripherals. So you can make the machine think it has a paper tape reader, or a few DECtape drives, or some RK05 diskpacks, or all of them at the same time. Storage media will then be emulated as files on the Raspberry Pi’s SD card or on USB sticks.

The Obsolescence Guaranteed site is a joy to read, and has tons of details - about the kit, the assembly process, the original hardware, as well as things you can try out with it.

Two nice videos are the introduction (7 min) and the Hackaday 2015 presentation (20 min).

As noted, the PiDP is completely different inside - it has nothing to do with the original clunky, energy slurping machines of 50 years ago. It just looks the same and it behaves very much like an original PDP (if you imagine the paper tape, teletype, and other peripherals yourself, that is).

Here’s the PiDP, with a Raspberry Pi A+ on the left, and running off a blue (18650-based) LiPo battery pack from eBay - there’s not much behind that front panel, as you can see:

Construction of this kit is very straightforward. It’s all very nicely documented on the website. You have to solder in 89 LEDs, a dozen or so resistors, and the most unusual part: a series of 22 switches (some toggle, some spring-action), carefully mounted and positioned to give the whole thing a nice well-spaced appearance. It took an afternoon - it’s not hard, it just takes patience …

For this build, the goal was to create a completely self-contained unit (hence the battery pack), and to control it entirely via a network connection over WiFi. To that end, an FTDI interface had to be brought out, both to charge the battery pack and to create a serial connection for adjusting WiFi settings. Nothing a bit of Dremel-cutting and hot glue can’t handle:

WiFi is a matter of inserting a WiFi dongle in the A+’s only USB port, but because this was a few millimeters too large to fit inside the box, its plastic cover has been removed - revealing a WiFi board which is even smaller than an ESP8266:

The last puzzle to solve was how to turn power on and off to this thing. The battery pack has a very convenient button, but it would require making another ugly hole in the box. The solution was to place the battery holder right behind the front panel (with a bit of cardboard behind it):

This way, if you know where to push on the front panel, you can bring this PiDP-8/i to life!

Based on a quick measurement, the PiDP-8/i draws about 230 mA, so it ought to last about a day on the LiPo battery before needing to be plugged in. How’s that for mobile computing?

That front panel is quite extraordinary, by the way: not only can you see (and change) the contents of memory and the accumulator, and of course single-step the whole beast - you can even single-cycle it, i.e. go through each of the different phases of an instruction, and see the instruction decoder in action on the right hand side of the panel.

See that vertical set of 8 lights? That’s the instruction type: the PDP-8 has only eight different opcodes! Although one of them has a sub-division for additional “micro” operations. Since only the Jmp and Iot instructions are lit here, the program must be idling, waiting for some I/O.

The PiDP-8/i comes with 32 Kwords of memory, the maximum supported in this architecture, and the simulator is able to connect every possible type of hardware to it, in a virtual sense that is. These options are part of SIMH and can be adjusted through a serial or SSH connection.

So what can you do with 8 banks of 4,096 words of memory, organised as 128-word pages?

The introduction of the PDP-8 series was a disruptive, game-changing event, in that it made computers available to a large group of scientists, engineers, and tinkerers. For the first time, more or less, you could walk into a room, sit down at a teletype, and start programming. No more “batch jobs” and “reserving” time on a huge, scarce, over-booked, expensive machine.

Instructions and memory

The PDP-8’s instruction set is very well documented elsewhere. It only has 3 bits to store the “opcode”, i.e. 8 combinations in all. One is for I/O, one is for special “micro” instructions - that leaves a mere 6 operations: a jump, a subroutine call, and only four other instructions, with 2 special bits and a 7-bit operand field. Can this thing really be Turing-complete?
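The 3 + 2 + 7 bit layout can be sketched in a few lines of Python. The mnemonics below are the standard DEC names for the eight opcodes, and the two special bits are the "indirect" and "current page" bits. The function and field names are made up for this illustration, and for the I/O and "micro" opcodes the lower nine bits actually have a different meaning, which this sketch ignores:

```python
from typing import NamedTuple

# The eight opcodes, indexed by the top 3 bits of the 12-bit word.
MNEMONICS = ("AND", "TAD", "ISZ", "DCA", "JMS", "JMP", "IOT", "OPR")


class Decoded(NamedTuple):
    mnemonic: str
    indirect: bool       # special bit 1: operand is a pointer
    current_page: bool   # special bit 2: offset is in the current page, else page 0
    offset: int          # 7-bit operand field


def decode(word: int) -> Decoded:
    """Split a 12-bit instruction word into its fields."""
    if not 0 <= word <= 0o7777:
        raise ValueError("a PDP-8 word is 12 bits")
    return Decoded(
        mnemonic=MNEMONICS[word >> 9],
        indirect=bool(word & 0o400),
        current_page=bool(word & 0o200),
        offset=word & 0o177,
    )


print(decode(0o1234))   # a TAD, direct, current page, offset 0o34
print(decode(0o5200))   # a JMP to offset 0 of the current page
```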

That’s not all: those 32 Kwords of a maximally-configured PDP-8 are split into eight 4 Kword “banks”, and each bank is split into thirty-two pages of 128 words each. Since a word is 12 bits, it can address at most 4096 locations, so you can only easily access words in a single 4 Kw memory bank. Multiple instructions will be needed for anything beyond 4 Kw.
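Here is a minimal Python sketch of that memory layout: a 15-bit address splits into a 3-bit bank number, a 5-bit page number, and a 7-bit offset. The second function shows how a 7-bit operand can reach either page 0 or the page the instruction itself is in. All names are hypothetical, and indirect addressing and bank switching are left out:

```python
def split_address(addr: int) -> tuple[int, int, int]:
    """Split a 15-bit address into (bank, page, offset)."""
    if not 0 <= addr < 8 * 4096:
        raise ValueError("address must fit in 32 Kwords")
    bank = addr >> 12            # 8 banks of 4096 words
    page = (addr >> 7) & 0o37    # 32 pages per bank
    offset = addr & 0o177        # 128 words per page
    return bank, page, offset


def direct_target(instr_addr: int, current_page: bool, offset: int) -> int:
    """12-bit address reached by a direct reference from instr_addr."""
    page_base = instr_addr & 0o7600 if current_page else 0
    return page_base | (offset & 0o177)


print(split_address(0o7756))                      # last page of bank 0
print(oct(direct_target(0o0200, True, 0o34)))     # stays within the current page
print(oct(direct_target(0o0200, False, 0o34)))    # refers to page 0
```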

There is no “load accumulator” instruction, there is only “add to accumulator”. Storing the accumulator clears it! (which makes a lot of sense combined with add-to-accumulator) - for some interesting notes about the instruction set, see this page.
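Why this combination works so well can be shown with a toy model in Python. This is only a sketch: it models the "add to accumulator" (TAD) and "store and clear" (DCA) instructions on a 12-bit accumulator and ignores the carry ("link") bit of the real machine. The class name is made up:

```python
MASK = 0o7777  # 12-bit words


class TinyPdp8:
    """Toy model with just an accumulator and 4 Kwords of memory."""

    def __init__(self) -> None:
        self.ac = 0
        self.mem = [0] * 4096

    def tad(self, addr: int) -> None:
        """Add a memory word to the accumulator (12-bit wrap-around)."""
        self.ac = (self.ac + self.mem[addr]) & MASK

    def dca(self, addr: int) -> None:
        """Store the accumulator, then clear it."""
        self.mem[addr] = self.ac
        self.ac = 0


cpu = TinyPdp8()
cpu.mem[0o100] = 5
cpu.mem[0o101] = 7
cpu.tad(0o100)   # accumulator was 0, so this add acts as a "load"
cpu.tad(0o101)   # accumulator is now 5 + 7
cpu.dca(0o102)   # result stored, accumulator back to 0
cpu.tad(0o100)   # ... so the next add is again a "load"
print(cpu.mem[0o102], cpu.ac)
```

Since a store leaves the accumulator at zero, the next add behaves like a load, which is why a separate load instruction is rarely missed.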

Let’s look at memory: in those days, random-access memory was magnetic core memory. It has some very unusual properties by today’s measures: reading a memory address is destructive - you have to write it back after a read if you want to preserve it! As a consequence, reading and writing and even modifying a memory address all take the same amount of time.
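The destructive read can be illustrated with a toy model in Python. This is a simplified sketch, not a description of the actual hardware timing: every access is modelled as one "sense" half (which wipes the word) followed by one "write" half, so a read, a write, and a read-modify-write each cost exactly one cycle. The class and method names are invented:

```python
from typing import Callable

MASK = 0o7777  # 12-bit words


class CoreMemory:
    """Toy core memory: sensing a word erases it, so it must be rewritten."""

    def __init__(self, size: int = 4096) -> None:
        self._cores = [0] * size
        self.cycles = 0

    def _sense(self, addr: int) -> int:
        """First half of a cycle: read the word, leaving zeros behind."""
        value = self._cores[addr]
        self._cores[addr] = 0
        return value

    def read(self, addr: int) -> int:
        value = self._sense(addr)
        self._cores[addr] = value          # second half: write it back
        self.cycles += 1
        return value

    def write(self, addr: int, value: int) -> None:
        self._sense(addr)                  # old content is discarded
        self._cores[addr] = value & MASK   # second half: write new value
        self.cycles += 1

    def modify(self, addr: int, fn: Callable[[int], int]) -> None:
        value = self._sense(addr)
        self._cores[addr] = fn(value) & MASK   # write back the changed value
        self.cycles += 1


mem = CoreMemory()
mem.write(5, 0o1234)
mem.modify(5, lambda v: v + 1)
print(oct(mem.read(5)), mem.cycles)
```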

And then this: core memory retains its contents when powered off. That means you can stop a PDP-8 from its front panel, turn the machine off, power it up again, and restart it.

Despite the limitations of a PDP-8, people have built various operating systems for this thing, and implemented more than half a dozen programming languages for it. It boggles the mind.

Languages

There are two categories of programming languages for the PDP-8:

Compiler-based languages - you write your code in some editor, then you save it (to paper tape, or magnetically if you’re lucky), then you start up the compiler, possibly multiple passes, then you start the linker, and at the end you have a binary, which you can start to see if it works.

This process is tedious, to put it mildly. With a disk, a Fortran compilation of a simple “Hello world” program takes 10 seconds or so, but that increases to about 10 minutes with DECtapes, and even more if you have to save to paper tape and also load each new program that way.

Only then will you know whether you mis-typed anything or forgot a comma.

Some languages for PDP-8 were: Fortran II and IV, Algol, Pascal, Lisp (!), and Forth.

Many of these require at least 8 Kwords of core memory, sometimes even 28 Kw. If you only have 4 Kw, the minimal and default PDP-8 configuration, then all you could probably use is the machine-level instruction “assembly language”.

The compilers and linkers themselves were invariably written with an assembler. It’s hard to imagine how much time and effort it must have taken to design, implement, and test these elaborate software systems, fitting their code and data structures into that quirky 32 Kword, 4096 w/bank, 128 w/page memory layout. Text was stored as two 6-bit characters per word: no lowercase, and only a very limited set of special characters! Six-char var names, what a luxury!
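To get a feel for the two-characters-per-word packing, here is a Python sketch. It assumes a common 6-bit convention in which each character is its ASCII code with the top bits stripped (the code modulo 64), so only the 64 characters from space to underscore exist. Padding an odd-length string with a space is a choice made for this example only, and the function names are hypothetical:

```python
def pack_sixbit(text: str) -> list[int]:
    """Pack text into 12-bit words, two 6-bit characters per word."""
    codes = []
    for ch in text.upper():              # there is no lowercase
        code = ord(ch)
        if not 0x20 <= code <= 0x5F:
            raise ValueError(f"{ch!r} has no 6-bit representation")
        codes.append(code & 0o77)
    if len(codes) % 2:
        codes.append(ord(" ") & 0o77)    # assumption: pad with a space
    return [(codes[i] << 6) | codes[i + 1] for i in range(0, len(codes), 2)]


def unpack_sixbit(words: list[int]) -> str:
    """Inverse of pack_sixbit."""
    chars = []
    for word in words:
        for code in (word >> 6, word & 0o77):
            # codes 0..31 are '@'..'_', codes 32..63 are ' '..'?'
            chars.append(chr(code + 0o100 if code < 0o40 else code))
    return "".join(chars)


words = pack_sixbit("Hello")
print([oct(w) for w in words], repr(unpack_sixbit(words)))
```

Five characters fit in three words, and the lowercase letters are gone after the round trip.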

Interpreted languages - imagine sitting at the teletype, entering some commands and getting a computed reply back within a second - nirvana!

That was the promise of interpreted programming languages then, and that’s the appeal of scripting languages today (that distinction is all but gone with today’s performance levels).

On the PDP-8, there was BASIC, which incidentally was designed at just about the same time as the PDP-8 hardware. It lets you enter commands in immediate mode, as well as edit them into a larger program by starting each line with a line number. You could enter strange things like:

20 GOTO 10
10 PRINT "HELLO WORLD"

And the computer would guarantee to execute them in line-number order, creating an infinite loop in this case. By hitting Control-C (sound familiar?), you could abort the running program and regain control. The actual line number values were irrelevant, only their order mattered, but by keeping gaps you could then insert additional lines later, such as:

15 PRINT "1 + 2 =", 1+2

Typing “LIST” would print out the entire program:

10 PRINT "HELLO WORLD"
15 PRINT "1 + 2 =", 1+2
20 GOTO 10
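The editing model behind this is tiny. As a sketch of the idea (in Python, not how any actual BASIC interpreter was implemented), a program is just a table keyed by line number, and listing it means walking that table in sorted order. The class and method names are made up for this illustration:

```python
class ProgramStore:
    """Keeps numbered lines; typing a line with an existing number replaces it."""

    def __init__(self) -> None:
        self.lines: dict[int, str] = {}

    def enter(self, line: str) -> None:
        """Store a line such as '10 PRINT "HELLO WORLD"'."""
        number, _, text = line.strip().partition(" ")
        self.lines[int(number)] = text

    def listing(self) -> list[str]:
        """Return the program in line-number order, as LIST would print it."""
        return [f"{n} {self.lines[n]}" for n in sorted(self.lines)]


prog = ProgramStore()
prog.enter('20 GOTO 10')
prog.enter('10 PRINT "HELLO WORLD"')
prog.enter('15 PRINT "1 + 2 =", 1+2')
print("\n".join(prog.listing()))
```

The order of entry does not matter, only the numbers do.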

All the essential tools were present for interactive use: a command line, a crude-but-effective editor (with “LOAD” and “SAVE” commands for paper tape or disk files), and your code, waiting to be executed, enhanced, or debugged. In many ways, we still use this same process today.

This approach, and BASIC in general, was definitely the mainstream model for the next twenty years, when 8-bit hobby computers and CP/M and MS-DOS became the dominant systems.

The other interpreted language on the PDP-8 was FOCAL, developed by DEC. Just like BASIC, this was a completely self-contained system. It ran (barely) in just 4 Kw, and there was no “operating system” in sight. Focal-69, the most widespread variant, was the operating system.

Again, looking at the hardware this all ran on, and the fact that these systems themselves had to be programmed in assembly language, raising the conceptual bar to make a PDP-8 an interactive and highly responsive system was quite a revolution at the time.

Operating systems

Then came magnetic storage. Even if an expensive (but fast!) fixed-head DF32 with 4 platters could only hold 128 Kwords of memory, it changed the landscape for good. Gone were the time sinks of loading, saving, re-loading, and damaged or lost paper tapes. The operating system turned these disks (and DECtapes) into data filing cabinets. That’s why they’re called “files”!

File names were up to 6 characters, with a 2-letter “extension” to indicate the type of file (does this ring a bell?). This was also the start of utilities such as “PIP”, the Peripheral Interchange Program which could shuttle data around from one file to the next, from paper tape to disk, from disk to teletype, and so on.

The computer was starting to become more of an information processor, and less of a purely computational / number-crunching engine. And the PDP-8 was right in the middle of all this, with well over half a million units in the field.

The PDP-8 was fertile territory for several groundbreaking operating systems:

OS/8 was the first and main one - a PDP-8 + disk or DECtape + OS/8 was all you needed to get tons of work done (or games). A slow but very respectable precursor of the Personal Computer.

And then more people wanted to join in on the game. Most of the time, all these computers were just sitting, twiddling their thumbs after all, waiting for a bunch of sluggish carbon-based life-forms to press the next key on their keyboard. What a silly waste of (the computer’s) time!

Meet TSS-8, the time-sharing system: it gave each user a “time slice” of a single shared PDP-8, swapping data to and from a disk, as needed to maintain the illusion. While one person was typing, another one could be running a calculation, and they’d both get good mileage out of the system. Just hook up a few more terminals to the machine, and you’re off. Apparently, up to 17 users could share a PDP-8, and its smallest configuration only needed 12 Kwords of RAM!

There’s also ETOS/8 - a virtualising OS, giving each user the illusion of a complete machine.

SIMH and the PiDP-8/i

Last but certainly no less impressive, there’s the SIMH emulator and Oscar’s PiDP-8/i mod to display SIMH’s internal state (see the “Software” section on this page) - he does this by poking around in the (simulated) memory space - a clever way to let the simulator run full speed while still presenting a continuous glimpse inside via the LEDs. All thanks to multi-tasking in Linux.

Everything mentioned above can be tried on the PiDP-8/i. The front panel has a special trigger, where the three INST-FIELD switches in combination with the SINGLE-STEP toggle can be used to start up different software sets, as prepared by Oscar on his pipaOS-based SD card image (which will boot quite a bit faster than a standard Raspbian distro).

Here are the front-panel quick-launch cheat codes:

Octal  IF-sw  Description
--------------------------------------------------------------
  0     000    (used on power-up, set to same as slot 7)
  1     001    RIM Loader at 7756, paper tape/punch enabled
  2     010    TSS/8 multi-user system. Up to 7 telnet logins
  3     011    OS/8 on DECtape (a lot slower, also simulated)
  4     100    Spacewar! With vc8e output on localhost:2222
  5     101    (empty slot, 10-instr binary counter demo)
  6     110    ETOS/8, also with telnet logins on port 4000
  7     111    OS/8 on RK05 10MB disk cartridge

The best resource for all this is really the PiDP-8/i website. It has all the information you might want and lots of pointers to other documentation, software, and the pidp-8 discussion group.

Note that on a single-core Raspberry Pi A+ or B+, SIMH runs flat-out, consuming nearly 100% CPU - yet the system remains fairly responsive, even when logged in via a network session. To regain most of the processor time, you can suspend SIMH by entering “ctrl-e” - and later enter “cont” to resume the simulation and blinking. You don’t need to quit SIMH to get a shell prompt: just type “ctrl-a ctrl-d” to suspend the session and “~/pdp.sh” to resume it.

That’s it for our brief excursion into the world of computing 50 years ago - fun from the 60’s!

In the beginning, there were computers. Programmed with wires, then with binary data (the “stored-program” computer), then with assembly language, and from there on, a relentless stream of new programming languages. Today’s web browsers all “run” JavaScript.

Here’s a summary of that evolution again, in terms of technology:

wires: just hardware circuits and manually inserted patch cords, yikes!
binary data, i.e. “machine code”: a very tedious and low-level way of programming
assembly: symbolic notation, macros, you have to keep track of all registers in your head
Fortran, Cobol, Algol, Pascal, C: yay, it gets compiled for us!
Basic, Lisp, Icon, Snobol, Perl, Python, Ruby - no compiler: immediate interpretation!
but interpreters are slow - we can compile on-the-fly and just-in-time…
and today, with JavaScript / NodeJS, Java, etc: the compiler has become invisible

The story could end here, but then there is that embedded microcontroller world, with smart chips in just about anything powered by electricity. While powerful and capable of generating byte code and even machine code, they do not have the storage and memory to run a high-end optimising compiler. Even if projects such as Espruino and MicroPython have come a long way to bring complete self-contained environments to the µC world - they still depend heavily on a larger machine to produce those run-time environments we can flash into the µC.

This has an important implication: everything not implemented and linked into Espruino or MicroPython has to be written in the higher-level language (JavaScript or Python, respectively). That works and can be quite convenient, but you lose performance big time (think 1000-fold and more) - these are still interpreted languages, after all. For some cases, this is irrelevant - reading out an I2C sensor and analysing its values can easily be done slowly, if the I2C support is present and if we’re only reading out that sensor once a second or so.

But what if we want more performance? - or run on a smaller µC with 32..128 KB of flash?

One solution is the Arduino IDE: a cross compiler which runs on a large “host” and generates code for our very limited “target” µC. Or some similar “ARM embedded gcc toolchain”.

Which is where we stand today, in 2016: tethered software development, with the source code and tools living in one world (our laptops or the web), and the µC being sent a firmware upload to perform the task we’ve coded up for it, after translation by our toolchain:

you have to set up that toolchain (for your choice of Windows, Mac OSX, Linux)
you have to keep track of the source code, in case you ever need to change it
the µC will do its thing, but any change to it will require going back to the host
software debugging is tedious: add a print statement, compile, upload, try, rinse, repeat
hardware debugging requires proper toolchain support and maybe also learning “gdb”

What if we just want to investigate the hardware, check out a few settings in the chip, briefly toggle a pin or adjust a hardware register setting? Tough luck: you have to leave the flow of design and implementation, and enter the (completely different) world of remote debugging.

Our µC might as well be on Mars. With all our fancy tools (constantly updated, improved, changed) we’re virtually coding in the dark nowadays. We’re adding layer upon layer of technology and infrastructure, just to make that darn LED blink! Or read out a sensor, or make something turn, or respond to sensor changes, whatever. Does it really have to be so hard?

(speaking of Mars: Forth has been used in several NASA space missions)

What if we could talk to an embedded µC directly over a serial port connection, give it simple commands, tell it things to do now, or save for later, or do continuously? As we gradually build up our application, the µC records what we’ve done, lets us change things as much and as often as we like, selectively wiping some previously saved definitions.

Forth can do that. It’s a programming language, but it’s also a full-blown development system. Once you store the Forth “core” into the µC, you’re done. From then on, you can type at it, make it do things, and go wild. If you make a mistake (as we all do, especially while trying out stuff), you simply press reset to return to a stable system.

There is hardly a toolchain involved. The Mecrisp Forth core is written in GNU “assembler”, producing a 16..20 KB “.bin” or “.hex” file, and that’s it. You never need to go back and change it. Everything else can be built on top. Mecrisp Forth is extremely fast, so what you write can also be. There’s an assembler for the ARM Cortex written in Forth: if you load it in, you can extend the core by adding assembler code (using a Forth-like syntax). There’s even a disassembler…

(please note that assembly language is there if you want it, but hardly ever needed in Forth)

But there is one major (and very painful) drawback, in today’s world with millions of lines of code written in C and C++: Forth and C don’t really mix. A µC running the Forth core cannot easily interoperate with C code, although it can be tricked into calling external routines with C linkage (Forth can generate assembler instructions for any purpose, after all).

To sum it all up: think of the Mecrisp Forth core as a boot loader - you have to get it onto the µC once, and then it becomes the new “communication language” for the chip. From there on, this µC will understand plain text Forth commands, including saving potentially large amounts of (your!) Forth definitions after its own flash memory area. All you need is a serial port + terminal interface, plus a robust way to send larger amounts of Forth source code to the chip.

With Forth, you don’t have a “build environment”. Forth is the environment and it’s running on the chip you’re programming for. It’s intensely interactive and there are no layers of complexity. There is no compiler, no linker, no uploader (other than a text-mode send tool), no bytecode, no firmware image, no object code, there are no binary libraries, no conditional compilation, no build flags.

For turnkey use, you can define a function called “init” and save it in flash memory. Then your chip will run that code on every reset. But beware: if you don’t include a mechanism to get back to command mode, then the only way to get back control is to reflash the chip with a fresh core…

There is one other “drawback”: Forth blows away every notion of language syntax and software development methodology you’re probably used to - but that’s for the next articles…