Category: Sys Admin

Influx Telegraf and inputs.exec

I'm a fan of InfluxDB for capturing data over time. Couple it with Grafana and interesting dashboards come to life.

Part of Influx's tool set is Telegraf, their data collection tool. It comes with a slew of data input and output plugins that are reasonably easy to configure and use. I use two of them fairly regularly, inputs.snmp and inputs.exec. The inputs.snmp plugin uses the long-standing SNMP protocol to pull data from network devices. Configuration is fairly straightforward. Here's a sample for collecting data from a network switch:

[[inputs.snmp]]
  agents = ["NAME_OR_IP"]
  version = 2
  community = "COMMUNITY"
  timeout = "60s"

  [[inputs.snmp.field]]
    oid = "RFC1213-MIB::sysUpTime.0"
    name = "uptime"

  [[inputs.snmp.field]]
    oid = "RFC1213-MIB::sysName.0"
    name = "source"
    is_tag = true

  [[inputs.snmp.table]]
    oid = "IF-MIB::ifTable"
    name = "interface"
    inherit_tags = ["source"]

    [[inputs.snmp.table.field]]
      oid = "IF-MIB::ifDescr"
      name = "ifDescr"
      is_tag = true

Change NAME_OR_IP to the device's name or IP address and COMMUNITY to the SNMP community configured on the device, and Telegraf will pull data from the switch every 60 seconds.

I put one of these configuration files in the

/etc/telegraf/telegraf.d

directory for each device. I use the device name as the file name. So for network switch ns1, the configuration file is

/etc/telegraf/telegraf.d/ns1.conf

At home, the network has four switches, so there are four .conf files in the telegraf.d directory. The inputs.snmp plugin handles all of them and processes the data from all the network devices as expected.
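
One way to sanity check a new device file before the running service picks it up is Telegraf's test mode, which gathers the metrics once and prints them to stdout instead of writing to InfluxDB:

telegraf -test -config /etc/telegraf/telegraf.d/ns1.conf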

The second Telegraf plugin I often use is inputs.exec. It launches a program and collects its output to send to the InfluxDB database. CSV, JSON, and other formats all work to feed the Influx engine.

A typical configuration file looks like:

[[inputs.exec]]
  commands = [
    "/usr/local/bin/purpleair_json.py https://www.purpleair.com/data.json?show=DEVICEID&key=APIKEY"
  ]

  interval = "60s"
  timeout = "10s"
  data_format = "json"
  name_suffix = "_purpleair"
  tag_keys = [
    "ID",
  ]

In this case, inputs.exec runs the /usr/local/bin/purpleair_json.py program every 60 seconds and captures the data it returns from a PurpleAir device.
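
The collector script itself isn't shown here, but the shape of a script like this is simple: fetch the JSON from the URL passed on the command line, pull out the values of interest, and print a single flat JSON object to stdout for Telegraf's json parser. A rough sketch of that idea (the response layout and field names below are assumptions, not the actual PurpleAir schema):

#! /usr/bin/env python3
# Rough sketch of a Telegraf exec collector: fetch JSON from the URL given
# on the command line and print a flat JSON object to stdout for the json
# data_format parser. The response layout and field names are assumptions.

import json
import sys
from urllib.request import urlopen

def main():
    url = sys.argv[1]
    with urlopen(url, timeout=10) as response:
        data = json.loads(response.read())

    # Flatten the values of interest; Telegraf turns numeric values into
    # fields and "ID" (listed in tag_keys) into a tag.
    record = data["results"][0]
    print(json.dumps({
        "ID": str(record.get("ID")),
        "pm2_5": float(record.get("PM2_5Value", 0)),
        "temperature": float(record.get("temp_f", 0)),
        "humidity": float(record.get("humidity", 0)),
    }))

if __name__ == "__main__":
    main()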

The problem is that the inputs.exec plugin doesn't allow for multiple instances the way inputs.snmp does. If there is more than one .conf file with an inputs.exec section, only the last one read by Telegraf is used. As a result, more than one program can't be used to feed InfluxDB through a single Telegraf instance. Rather annoying.

To get around this, I create another instance of the telegraf service. That includes a new systemd service file, a separate /etc/telegraf_EXECNAME directory and supporting configuration files.

In /lib/systemd/system/telegraf_EXECNAME.service:

[Unit]
Description=The plugin-driven server agent for reporting metrics into InfluxDB
Documentation=https://github.com/influxdata/telegraf
After=network.target

[Service]
EnvironmentFile=-/etc/default/telegraf_EXECNAME
User=telegraf
ExecStart=/usr/bin/telegraf -config /etc/telegraf_EXECNAME/telegraf.conf -config-directory /etc/telegraf_EXECNAME/telegraf.d $TELEGRAF_OPTS
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartForceExitStatus=SIGPIPE
KillMode=control-group

[Install]
WantedBy=multi-user.target
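
The '-' in front of the EnvironmentFile path tells systemd the file is optional, so the instance starts fine without it. If you want to pass extra command line flags to just this instance, that file is where $TELEGRAF_OPTS gets set, for example (the flag is only an illustration):

# /etc/default/telegraf_EXECNAME
TELEGRAF_OPTS="-quiet"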

In the /etc/systemd/system/multi-user.target.wants directory, create a symbolic link to the new service file:

cd /etc/systemd/system/multi-user.target.wants/
ln -s /lib/systemd/system/telegraf_EXECNAME.service .

In the /etc/telegraf_EXECNAME/telegraf.conf:

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = true
  logtarget = "file"
  logfile = "/var/log/telegraf/telegraf_EXECNAME.log"
  logfile_rotation_interval = "1d"
  logfile_rotation_max_size = "50MB"
  logfile_rotation_max_archives = 10
  hostname = ""
  omit_hostname = false


[[outputs.influxdb]]

[Add the needed options to the outputs.influxdb section for wherever the InfluxDB server is hosted.]
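
For an InfluxDB 1.x server, a minimal set of options looks something like this; the URL, database name and credentials below are placeholders:

  urls = ["http://influxdb.example.com:8086"]
  database = "telegraf"
  username = "telegraf"
  password = "PASSWORD"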

Note that this .conf file drops all the collection inputs for the localhost; those remain in the original telegraf instance.

The .conf file for the inputs.exec plugin is placed in

/etc/telegraf_EXECNAME/telegraf.d/EXECNAME.conf

To kick off the new service:

systemctl daemon-reload
systemctl enable telegraf_EXECNAME
systemctl start telegraf_EXECNAME
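
To confirm the new instance came up cleanly, check the service and tail its log (the path matches the logfile setting in the agent section above):

systemctl status telegraf_EXECNAME
tail -f /var/log/telegraf/telegraf_EXECNAME.log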

[In all the examples above, replace EXECNAME with a name that describes what’s being run.]

Creating multiple instances of the Telegraf service is annoying, but it does let me collect data from multiple places by running programs that reach out, gather the data, and format it for Telegraf to ship into an InfluxDB database. See https://github.com/pkropf/telegraf for some examples.


Finding Raspberry Pi’s

I'm using Raspberry Pi's for all sorts of projects. They're fabulous, inexpensive computers that run Linux and can interact with the physical world. But sometimes I have trouble finding them on my network. I know they're there, but I don't have any idea of their IP addresses. For Pi's connected to a display, keyboard and mouse, this isn't much of a problem. But when a Pi is running headless, it's a bit annoying trying to find it.

An easy way is to use nmap to find all hosts on the local network with a specific MAC address prefix. For Raspberry Pi's, there are two: B8:27:EB for the older models 1, 2 and 3, and DC:A6:32 for the Pi 4.

#! /bin/sh

nmap -sP 192.168.1.0/24 | awk '/^Nmap/{ip=$NF}/B8:27:EB/{print ip}'
nmap -sP 192.168.1.0/24 | awk '/^Nmap/{ip=$NF}/DC:A6:32/{print ip}'

nmap finds the hosts on the specified network and awk pulls out the IP address of any host whose MAC address prefix matches a Raspberry Pi's.

Note that you'll need to change 192.168.1.0/24 to match your local network, and run the script as root since nmap only reports MAC addresses when it has raw socket privileges.

Black Rock City Wifi Summit 2018

The Black Rock City Wifi Summit 2018 took place last Wednesday. As with previous summits that I've attended, it was an interesting discussion among Burning Man tech staff, artists, and various theme camp representatives. The venue was the Thunderdome conference room at Burning Man Headquarters.

Rolf (sp?) from the org led a general presentation on the goals, issues, and plans for this coming year. In general, he's asking for frequency coordination to help facilitate access for everyone, to lower noise and such.

The past two years have had trouble with connectivity. For the most part, things just didn't work: connecting a NanoBeam to the sector antennas on the NOC tower didn't work, the ISP had major routing issues, and they were late in bringing the backbone online.

The plan for this year is to provide configuration files before heading out to the playa. These are designed to configure a NanoBeam NBE-5AC-GEN2. Other Ubiquiti gear may work but they’re testing and providing configuration for the NanoBeam.

The link between a NanoBeam and the NOC tower gear is on the 5 GHz band. The org is requesting that city participants stay off the 5 GHz band to help facilitate infrastructure connections. Local wifi in camps, art installations, mutant vehicles, etc. should be on the 2.4 GHz band. If the access point provides both 2.4 GHz and 5 GHz access, the org requests that the 5 GHz band be disabled. Doing so will help to keep the noise floor lower on that band.

If you already have a project in the works that uses 5 GHz for communications, don't fret too much. The org will not be using the uppermost channel on the 5 GHz band, which is 5.825 GHz with a 20 MHz bandwidth. It should be easy enough to configure any radio gear already in use to move to that channel. Hopefully there won't be too much interference from other users on the band.

The plan is for the backbone to be live by 8/20. It consists of two 130 Mbps connections to the ISP. For folks arriving on playa before the gate opens, net access has the potential of helping communications greatly. For a city with a population north of 70,000, it's going to be interesting to see how the available bandwidth holds up. I can easily see throttling of non-org access occurring as the week progresses. That said, I'm glad that the Burning Man org is working to share the resources they have with the city at large.

There was a short discussion on power. The gist is that the Ubiquiti radios want a stable 24 V power source. Grounding the radios is also a good thing; that means driving a copper ground rod 2 to 3 feet into the playa. And use a surge suppressor. There is lightning and static on the playa that can quickly turn the gear into used carbon.

There was also a mention that microwave-based communication equipment doesn't like to sway, so a pole that's too tall and moves in the wind will cause connectivity issues with the NOC tower.

If you’re planning to attempt local wifi via the org’s backbone, here’s the hardware you’ll most likely need. At least this is the gear I’m planning to bring on playa:

  • Ubiquiti Networks NanoBeam NBE-5AC-GEN2
  • Ubiquiti Networks UniFi AP AC Lite
  • network switch
  • 24 V DC-DC converter
  • some power source, most likely 2 solar panels and 2 12 V deep cycle batteries
  • surge protector for use between the NanoBeam and the switch
  • grounding rod
  • mast / tower along with equipment to secure it

It sounds like the org will be using Ubiquiti Networks Rocket Prism AC radios behind their sector antennas. I'm not sure if I can gain access to one before heading out to the playa, but it would be nice to test the gear and configuration before heading to the dust.

The org asked the community to help each other out during the week with network doctor hours. Basically, we define a schedule and recruit volunteers who are willing to be a network doctor. When someone on playa has an issue, they can come to one of these doctors for help. There was also a mention that doctors may want to be on a particular MURS radio channel during their office hours. I'm intrigued by this idea and intend to host some time at my camp.

There was also a discussion on the use of APRS for tracking mutant vehicle telemetry data. Someone mentioned putting together an on-playa web service that provides a map of the city along with the locations of mutant vehicles or anything else using APRS. Unfortunately I didn't catch that person to talk more before the summit ended, but I really like the idea.

As with many summits like this, my to-do / wish list has just expanded:

  • create a local status dashboard for the network connection (a rough sketch of the ping checks follows this list)
    • raspberry pi based
    • ping to local nanobeam
    • ping to tower antenna
    • ping to google
    • current bandwidth in use through local radio
    • local radio ip address
  • bring some fox hunting gear for 2.4 GHz and 5 GHz
  • set up a server with ubiquiti’s access point management system (unifi controller?)
  • configure for a captive portal w/ timeout
  • allow other access points to be adopted and push a stable, usable configuration to them
  • host network doctor’s hours
  • test the configuration on LiteBeams (LBE-5AC-Gen2) as I have a couple left over from another project
  • find my MURS radios and verify that they still work
  • find a good dc-dc power filter

Good luck and come visit me at Frozen Oasis, we have killer margaritas!


Nagios Monitoring of Directories

I had a recent occurrence at work that sent me looking for a tool to monitor a directory for any changes. Since there didn't seem to be anything out there, I created a check called dirchanged. It looks at all the files in a directory and creates a SHA-256 hash of their names and contents. That hash is compared to a known value to determine whether anything has changed.
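
The heart of such a check is small. Here's a rough sketch of the approach in Python; this is not the actual dirchanged plugin, just an illustration of the idea (hash the sorted file names and contents, compare to the expected value, exit with Nagios status codes):

#! /usr/bin/env python3
# Sketch of the dirchanged idea, not the actual plugin: hash the names and
# contents of the files in a directory and compare against a known value,
# exiting with Nagios status codes (0 = OK, 2 = CRITICAL).

import hashlib
import os
import sys

def directory_hash(path):
    digest = hashlib.sha256()
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if os.path.isfile(full):          # subdirectories are not traversed
            digest.update(name.encode())
            with open(full, "rb") as handle:
                digest.update(handle.read())
    return digest.hexdigest()

if __name__ == "__main__":
    directory, expected = sys.argv[1], sys.argv[2]
    actual = directory_hash(directory)
    if actual == expected:
        print("DIRCHANGED OK - %s unchanged" % directory)
        sys.exit(0)
    print("DIRCHANGED CRITICAL - %s changed, hash is now %s" % (directory, actual))
    sys.exit(2)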

There are a couple of issues with this check, specifically that it doesn't look into subdirectories and that the hash for comparison is passed on the command line from within the Nagios configuration files. I think the first issue will be fixed soon enough with a flag to indicate whether the directory tree should be traversed. The second issue is more cumbersome in that the hash value has to be stored somewhere. I'm not yet certain that putting it in the Nagios configuration files is better than putting it somewhere on the target file system, though from a security standpoint, keeping the expected hash off the target file system means there's much less chance of it being changed by bad guys.

I’ll let it run for a while and see how it behaves and if changes are warranted.


Down, moved and back once again

I didn't realize that the site was down; that's not good.

I discovered this when I started to migrate to a new server on Digital Ocean. I like their service level, server configurations, and very reasonable costs. Plus the fact that they run everything on SSD makes it all nice and blazingly fast 😉

With the site now fully migrated, hopefully things will be back to normal and I'll be posting again. So many ideas and projects to share! And I need to get a Nagios monitor in place to let me know the next time the site goes sideways.