Category: Programming

Raspberry Pi VLANs

The Raspberry Pi is a lovely small computer, but sometimes the documentation leaves much to be desired, especially when searching for information online. Take networking: if you just need the default configuration, everything just works. And that’s one of the joys of working with a Raspberry Pi: many tasks just work. But as soon as you want to do anything outside the norm, things become difficult.

Take, for instance, adding a VLAN to a Pi. Searching online will bring up lots of details on what people have done in the past to add a VLAN and configure it. Sadly, the how-tos have changed over time and most of the information out there is worse than wrong. It leads users through steps that just don’t work, followed by lots of time spent trying to figure out what was done wrong. It’s a rather frustrating aspect of working with a Pi.

To help me remember the steps, and for anyone else who needs to add a VLAN, here are the steps for Raspbian 9 (Stretch):

$ sudo apt-get install vlan
$ sudo vconfig add eth0 2
$ sudo bash -c 'echo "interface eth0.2" >> /etc/dhcpcd.conf'
$ sudo ifconfig eth0.2 up
  1. Install the vlan package.
  2. Add VLAN 2 to interface eth0. Change 2 to whichever VLAN you need and eth0 to the physical Ethernet interface.
  3. Add a new interface entry to the dhcpcd.conf file so that an IP address can be assigned.
  4. Bring up interface eth0.2.

Update 2021-03-05: There are a couple of steps that I neglected to include. With the steps above, the VLAN will be lost on reboot. Actually, those steps won’t work at all since the VLAN kernel module, 8021q, isn’t loaded. You’ll need to load it before running the vconfig command and add it to /etc/modules so that it is loaded again at boot.

$ sudo modprobe 8021q
$ sudo bash -c 'echo 8021q >> /etc/modules'

Next, add

vconfig add eth0 2

to /etc/rc.local. If your rc.local has an

exit 0

line at the end, the vconfig command needs to be added before the exit.

With these changes, your VLAN configuration will be re-created when the system is rebooted.
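The rc.local edit is easy to get wrong by appending the command after the exit line. As a rough sketch only, the placement rule can be written as a small Python helper. The function name add_before_exit is hypothetical, and the helper works on the file contents as a string rather than editing /etc/rc.local itself.

```python
def add_before_exit(rc_local_text: str, command: str) -> str:
    """Return rc.local contents with `command` placed before the final `exit 0`."""
    lines = rc_local_text.splitlines()
    if command in (line.strip() for line in lines):
        return rc_local_text  # already present, nothing to do
    # Walk backwards to find the last line that is exactly "exit 0".
    for index in range(len(lines) - 1, -1, -1):
        if lines[index].strip() == "exit 0":
            lines.insert(index, command)
            break
    else:
        # No exit line at all: appending at the end is fine.
        lines.append(command)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "#!/bin/sh -e\n# rc.local\n\nexit 0\n"
    print(add_before_exit(sample, "vconfig add eth0 2"), end="")
```

Running the helper twice with the same command leaves the text unchanged, so it is safe to re-apply.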

Influx Telegraf and inputs.exec

I’m a fan of InfluxDB for capturing data over time. Couple it with Grafana and interesting dashboards come to life.

Part of Influx’s tool set is Telegraf, their data collection tool. It comes with a slew of data input and output plugins that are reasonably easy to configure and use. I use two of them fairly regularly, inputs.snmp and inputs.exec. The inputs.snmp plugin uses the long-standing SNMP protocol to pull data from network devices. Configuration is fairly straightforward. Here’s a sample for collecting data from a network switch:

[[inputs.snmp]]
  agents = ["NAME_OR_IP"]
  version = 2
  community = "COMMUNITY"
  timeout = "60s"

  [[inputs.snmp.field]]
    oid = "RFC1213-MIB::sysUpTime.0"
    name = "uptime"

  [[inputs.snmp.field]]
    oid = "RFC1213-MIB::sysName.0"
    name = "source"
    is_tag = true

  [[inputs.snmp.table]]
    oid = "IF-MIB::ifTable"
    name = "interface"
    inherit_tags = ["source"]

    [[inputs.snmp.table.field]]
      oid = "IF-MIB::ifDescr"
      name = "ifDescr"
      is_tag = true

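Change NAME_OR_IP to the device name / IP address and COMMUNITY to the SNMP community configured on the device, and Telegraf will pull data from the switch on every collection interval. Note that timeout = "60s" in the sample is the SNMP request timeout; how often the switch is polled comes from the agent’s interval setting unless the plugin sets its own interval.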

I put one of these configuration files in the

/etc/telegraf/telegraf.d

directory for each device. I use the device name as the file name. So for network switch ns1, the configuration file is

/etc/telegraf/telegraf.d/ns1.conf

At home, the network has 4 switches and there are 4 .conf files in the telegraf.d directory. The inputs.snmp plugin handles all the .conf files and processes the data from all the network devices as expected.

The second Telegraf plugin I often use is inputs.exec. This will launch a program and collect the output to send to the influx database. CSV, JSON, etc. all work to feed the Influx engine.

A typical configuration file looks like:

[[inputs.exec]]
  commands = [
    "/usr/local/bin/purpleair_json.py https://www.purpleair.com/data.json?show=DEVICEID&key=APIKEY"
  ]

  interval = "60s"
  timeout = "10s"
  data_format = "json"
  name_suffix = "_purpleair"
  tag_keys = [
    "ID",
  ]

In this case, the exec will run the /usr/local/bin/purpleair_json.py program and capture the data from a PurpleAir device every 60 seconds.
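For readers who have not written a collector for inputs.exec before, here is a minimal sketch of the shape such a program can take. It is not the purpleair_json.py script itself; the field names and values are made up for illustration. The only things it relies on from the configuration above are data_format = "json" and the "ID" entry in tag_keys.

```python
#!/usr/bin/env python3
import json
import sys


def build_reading(device_id: str, temperature_f: float, humidity: float) -> dict:
    """Build one flat reading; the field names here are hypothetical."""
    return {
        "ID": device_id,                 # listed in tag_keys, so stored as a tag
        "temperature_f": temperature_f,  # numeric values are stored as fields
        "humidity": humidity,
    }


def main() -> None:
    # A real collector would fetch these values from the device or an API.
    reading = build_reading("12345", 71.5, 40.0)
    # Telegraf reads whatever the program writes to standard output.
    json.dump(reading, sys.stdout)
    sys.stdout.write("\n")


if __name__ == "__main__":
    main()
```

Keeping the output to a single flat JSON object keeps the Telegraf side of the configuration short.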

The problem I ran into is that the inputs.exec plugin doesn’t allow for multiple instances the way the inputs.snmp plugin does. If there is more than one .conf file with inputs.exec, only the last one read by Telegraf is used. As such, more than one program cannot feed InfluxDB through a single Telegraf instance. Rather annoying.

To get around this, I create another instance of the telegraf service. That includes a new systemd service file, a separate /etc/telegraf_EXECNAME folder and supporting configuration files.
In /lib/systemd/system/telegraf_EXECNAME.service:

[Unit]
Description=The plugin-driven server agent for reporting metrics into InfluxDB
Documentation=https://github.com/influxdata/telegraf
After=network.target

[Service]
EnvironmentFile=-/etc/default/telegraf_EXECNAME
User=telegraf
ExecStart=/usr/bin/telegraf -config /etc/telegraf_EXECNAME/telegraf.conf -config-directory /etc/telegraf_EXECNAME/telegraf.d $TELEGRAF_OPTS
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartForceExitStatus=SIGPIPE
KillMode=control-group

[Install]
WantedBy=multi-user.target

In the /etc/systemd/system/multi-user.target.wants directory, create a symbolic link to the new service file:

cd /etc/systemd/system/multi-user.target.wants/
ln -s /lib/systemd/system/telegraf_EXECNAME.service .

In the /etc/telegraf_EXECNAME/telegraf.conf:

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = true
  logtarget = "file"
  logfile = "/var/log/telegraf/telegraf_EXECNAME.log"
  logfile_rotation_interval = "1d"
  logfile_rotation_max_size = "50MB"
  logfile_rotation_max_archives = 10
  hostname = ""
  omit_hostname = false


[[outputs.influxdb]]

[Add the needed options to the influxdb section for where the influxdb is hosted]

Note that this .conf file has all the collection information for the localhost removed; that remains in the original Telegraf instance.

The .conf file for the inputs.exec plugin is placed in

/etc/telegraf_EXECNAME/telegraf.d/EXECNAME.conf

To kick off the new service:

systemctl daemon-reload
systemctl enable telegraf_EXECNAME
systemctl start telegraf_EXECNAME

[In all the examples above, replace EXECNAME with a name that describes what’s being run.]

Creating multiple instances of the Telegraf service is annoying, but it does allow me to collect data from multiple places by running programs that reach out, gather the data and format it for Telegraf, which then sends it on to an InfluxDB database. See https://github.com/pkropf/telegraf for some examples.


Finding Raspberry Pis

I’m using Raspberry Pis for all sorts of projects. They’re fabulous, inexpensive computers that run Linux and can interact with the physical world. But sometimes, I have trouble finding them on my network. I know they’re there but I don’t have any idea of their IP address. For Pis that are connected to a display, keyboard and mouse, this isn’t much of a problem. But when the Pi is running headless, it’s a bit annoying trying to find it.

An easy way is to use nmap to find all the hosts on the local network with a specific MAC address prefix. For Raspberry Pis, there are two: B8:27:EB for the older RPi models 1, 2 and 3, and DC:A6:32 for the RPi 4.

#! /bin/sh

nmap -sP 192.168.1.0/24 | awk '/^Nmap/{ip=$NF}/B8:27:EB/{print ip}'
nmap -sP 192.168.1.0/24 | awk '/^Nmap/{ip=$NF}/DC:A6:32/{print ip}'

nmap will find hosts on the specified network and awk will pull out the IP address of each host whose MAC address prefix matches those of the RPis.

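Note that you’ll need to change the 192.168.1.0/24 to match your local network, and that the script needs to run as root, since nmap only reports MAC addresses when it has root privileges.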
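The same matching logic can be sketched in Python for anyone who wants to build on it. This is an illustration, not a replacement for the script above. The function name find_pi_addresses is made up, and it expects the text output of the nmap scan as a string. One small difference from the awk version: when nmap prints a hostname followed by the address in parentheses, the parentheses are stripped.

```python
PI_MAC_PREFIXES = ("B8:27:EB", "DC:A6:32")


def find_pi_addresses(nmap_output: str, prefixes=PI_MAC_PREFIXES) -> list:
    """Return the IP of each host whose MAC address starts with a given prefix."""
    prefixes = tuple(prefixes)  # str.startswith needs a tuple, not a list
    found = []
    current_ip = None
    for line in nmap_output.splitlines():
        if line.startswith("Nmap scan report for"):
            # The address is the last token on the line.
            current_ip = line.split()[-1].strip("()")
        elif line.startswith("MAC Address:") and current_ip:
            mac = line.split()[2].upper()
            if mac.startswith(prefixes):
                found.append(current_ip)
    return found


if __name__ == "__main__":
    sample = (
        "Nmap scan report for 192.168.1.1\n"
        "Host is up (0.0010s latency).\n"
        "MAC Address: AA:BB:CC:00:11:22 (Unknown)\n"
        "Nmap scan report for pi4.lan (192.168.1.42)\n"
        "Host is up (0.0020s latency).\n"
        "MAC Address: DC:A6:32:01:02:03 (Raspberry Pi Trading)\n"
    )
    print(find_pi_addresses(sample))
```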

Nagios Monitoring of Directories

I had a recent occurrence at work that caused me to look around for a tool to monitor a directory for any changes. Since there didn’t seem to be anything out there, I created a check called dirchanged. It looks at all the files in a directory and creates a SHA-256 hash of the names and contents of the files. That hash is compared to a known value to determine whether any changes have been made.

There are a couple of issues with this check: it doesn’t look into subdirectories, and the hash for comparison is passed on the command line from within the Nagios configuration files. I think the first issue will be fixed soon enough with a flag to indicate whether the directory tree is to be traversed. The second issue is more cumbersome in that the hash value has to be stored somewhere. I’m not yet certain that putting it in the Nagios configuration files is better than putting it somewhere on the target file system. From a security standpoint, keeping the hash off the target file system is better: there’s much less chance of it being changed by bad guys.
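To make the idea concrete, here is a minimal sketch of a check along these lines. It is not the dirchanged source; the function names and the output messages are assumptions. What it takes from the description is the SHA-256 hash over file names and contents, the lack of subdirectory traversal, and an expected hash supplied by the caller. The return values follow the usual Nagios plugin convention of 0 for OK and 2 for CRITICAL.

```python
import hashlib
from pathlib import Path


def directory_digest(directory: str) -> str:
    """Hash the names and contents of the regular files directly in `directory`."""
    digest = hashlib.sha256()
    # Sort so the result does not depend on directory listing order.
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # subdirectories are skipped, matching the limitation above
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")  # separator so name and content cannot run together
        digest.update(path.read_bytes())
    return digest.hexdigest()


def check(directory: str, expected: str) -> int:
    """Return a Nagios-style status: 0 for OK, 2 for CRITICAL."""
    actual = directory_digest(directory)
    if actual == expected:
        print(f"OK - {directory} unchanged")
        return 0
    print(f"CRITICAL - {directory} changed, hash is now {actual}")
    return 2
```

A wrapper script would pass the directory and the expected hash in from the command line and exit with the value check returns.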

I’ll let it run for a while and see how it behaves and if changes are warranted.


Chasing a Rollover Crash

I was reminded recently that when writing code in C, you have to take care to understand how variables are going to be used when declaring them. I had just finished working on the code used to control the fire effects at The Crucible’s Maker Faire 2013 booth when the system just seemed to come to a halt. That’s not quite what it was supposed to do.

The system was designed to have three 24′ towers as the central part of the booth. On top of the towers would be accumulator-based fire effects: a 24″ sphere with a 2″ exhaust port, a 9″ x 24″ oblong tank with a massive 3″ pneumatic solenoid / exhaust and three smaller accumulators based on old fire extinguishers. The solenoids on the fire effects would all be controlled with an Arduino. The idea was that there would be no direct user interaction this year; the system would run automatically. Plug in the Arduino and away we go.

The code would run one of a number of possible sequences, pause between 30 and 90 seconds, randomly run the next sequence, pause . . . And it did that, most of the time. A couple of times after starting up the Arduino, several sequences would run and then nothing else would happen. Made me wonder if I had crashed the Arduino.

I added some Serial.print statements to the code to dump out details on what was happening internally and ran the code again. This time it ran without issue for almost 2 hours before coming to a halt. Looking at the output on the serial console showed that the pause value was -31438. Of course everything came to a halt: the system was attempting to pause for negative 31,438 milliseconds! This didn’t make much sense until I reread the Arduino docs and saw that ints are 16-bit values. Of course it rolled over into a negative number.

Digging into the code, I realized that I had used ints in several places where an unsigned long was needed. Once fixed, all was right with the world and the system went on to work just fine for both days of the Maker Faire.
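The arithmetic behind that log line can be checked in a few lines. This is a sketch in Python, not the Arduino code. The 34098 millisecond pause is an assumption, chosen because it is a value within the 30 to 90 second range that wraps to the logged number. The helper names are made up.

```python
def to_int16(value: int) -> int:
    """Wrap an integer the way a 16-bit two's-complement int stores it."""
    value &= 0xFFFF
    return value - 0x10000 if value >= 0x8000 else value


def to_uint32(value: int) -> int:
    """Wrap an integer the way a 32-bit unsigned long stores it."""
    return value & 0xFFFFFFFF


pause_ms = 34098              # hypothetical pause of roughly 34 seconds
print(to_int16(pause_ms))     # -31438, the value seen on the serial console
print(to_uint32(pause_ms))    # 34098, what an unsigned long would hold
print(to_int16(90 * 1000))    # 24464, a 90 second pause silently shortened
```

Anything above 32,767 milliseconds cannot be held in a 16-bit signed int, so most pauses in the 30 to 90 second range were at risk.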

Perhaps I need to start writing these systems on a Raspberry Pi where I can use Python 😉
