Using smartmontools – smartd and conky (with cron) to monitor HDD health

conky is one heck of a great monitoring tool, but as far as I know it has no built-in way to show S.M.A.R.T. hard-disk health data, such as the information provided by smartmontools.

Here is a rather dirty, but simple and easily extensible, way to include information from smartctl in conky output.

Setup

First, set up smartmontools to run some tests on a regular basis, so that there is something to monitor.

I use the following settings in my /etc/smartd.conf for my 2 SATA hard-disks:

/dev/sda -a -o on -S on -I 190 -I 194 -I 231 -m root -s (S/../.././02|L/../../6/04)
/dev/sdb -a -o on -S on -I 190 -I 194 -I 231 -m root -s (S/../.././03|L/../../6/05)

Explanation:

  • -a: monitor all SMART features
  • -o on: enable automatic offline testing
  • -S on: enable automatic attribute autosave
  • -I 190: do not log changes in Airflow_Temperature_Cel
  • -I 194: do not log changes in Temperature_Celsius
  • -I 231: do not log changes in attribute 231 (a second temperature reading on some drives; power-on hours is normally attribute 9)
  • -m root: mail warning messages to root
  • -s (…): start a short test every day at 2am (sda) or 3am (sdb), and a long test on Saturdays at 4am (sda) or 5am (sdb)

You should also comment out any line that starts with DEVICESCAN in /etc/smartd.conf, because smartd ignores every line that follows a DEVICESCAN directive.
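
With more than two disks, writing these lines by hand gets tedious. The following is a minimal sketch that prints one line per disk with the same options as above and shifts the test hours by one hour per disk. The function name smartd_lines and the starting hours are assumptions made for illustration; review the output before pasting it into /etc/smartd.conf.

```bash
#!/bin/bash
# smartd_lines is a hypothetical helper: it prints one smartd.conf line
# for each device passed as an argument.
# The first disk gets a short test at 02:00 and a long test on Saturdays
# at 04:00; every further disk starts one hour later. No wrap-around is
# done, so this is only meant for a handful of disks.
smartd_lines() {
    local short_hour=2 long_hour=4 dev
    for dev in "$@"; do
        printf '%s -a -o on -S on -I 190 -I 194 -I 231 -m root -s (S/../.././%02d|L/../../6/%02d)\n' \
            "$dev" "$short_hour" "$long_hour"
        short_hour=$((short_hour + 1))
        long_hour=$((long_hour + 1))
    done
}

smartd_lines /dev/sda /dev/sdb
```

For /dev/sda and /dev/sdb this reproduces the two configuration lines shown earlier.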

Note for Ubuntu users:
By default, smartd is not started automatically at boot. You have to enable this in /etc/default/smartmontools by uncommenting the line that says:

start_smartd=yes
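
If you prefer not to edit the file by hand, the uncommenting can be sketched with sed. This assumes GNU sed (for -i) and that the line is commented out with a leading "#"; the helper name enable_smartd_default and the contents of the scratch file are made up for the demonstration. On a real system the argument would be /etc/default/smartmontools and the call would need root.

```bash
#!/bin/bash
# enable_smartd_default is a hypothetical helper: it removes a leading "#"
# (and any spaces after it) in front of the start_smartd=yes line.
enable_smartd_default() {
    local file=$1
    sed -i 's/^#[[:space:]]*start_smartd=yes/start_smartd=yes/' "$file"
}

# Demonstration on a scratch file with illustrative contents.
demo=$(mktemp)
printf '%s\n' '# some other comment' '#start_smartd=yes' > "$demo"
enable_smartd_default "$demo"
cat "$demo"
rm -f "$demo"
```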

What can conky use?

You can always run sudo smartctl -l selftest /dev/sda to get the results of recent disk tests, but the command is not cheap for the system to execute and its output is quite long, so it is not well suited to being printed directly in conky.

The easiest solution is a cron job that fetches fresh data from smartctl once an hour (since the tests run once a day, this refresh rate is more than enough) and stores it in a text file, in a format suitable for printing in conky.

To do so, create a script named smartctl_out using your favourite text editor (and don’t forget to chmod +x it) with the following content:

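#!/bin/bash
# Store the most recent self-test result ("# 1") of each disk, one line per disk.
# The offsets match the fixed-width columns printed by smartctl -l selftest.
text=$(smartctl -l selftest /dev/sda | grep '# 1')
echo 'sda' ${text:5:5} ${text:25:30} R${text:55:3} > /home/[user]/.smartctl_out.txt
text=$(smartctl -l selftest /dev/sdb | grep '# 1')
echo 'sdb' ${text:5:5} ${text:25:30} R${text:55:3} >> /home/[user]/.smartctl_out.txt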

Replace [user] with your user name, and place the script in your /etc/cron.hourly directory.
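
One detail the script leaves open: it writes the file in two steps, so conky can occasionally read it while only the sda line is present. A possible refinement, sketched below, is to collect all lines in a temporary file and rename it over the target in a single step. The helper name write_atomically is an assumption, and the result lines in the demonstration are made up.

```bash
#!/bin/bash
# write_atomically is a hypothetical helper: it saves stdin to a temporary
# file next to the target and then renames it over the target, so a reader
# sees either the old file or the complete new one.
write_atomically() {
    local target=$1 tmp
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    # mktemp creates the file readable by its owner only; cron.hourly runs
    # as root, so the mode is opened up for the user running conky.
    cat > "$tmp" && chmod 644 "$tmp" && mv -f "$tmp" "$target"
}

# Demonstration in a scratch directory with illustrative result lines.
dir=$(mktemp -d)
{
    echo 'sda Short Completed without error R00%'
    echo 'sdb Short Completed without error R00%'
} | write_atomically "$dir/.smartctl_out.txt"
cat "$dir/.smartctl_out.txt"
rm -rf "$dir"
```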

Notice that the script’s file name has no extension (.sh would be normal in this case). This is due to run-parts bug #38022: if the script’s file name contains a dot, run-parts (which cron uses to execute all scripts in the /etc/cron.xxx directories) skips it without any warning.

The above script prints one line of info per hard disk, in the format “sda Short Completed without error R00%”. Notice that smartctl -l selftest reports the “Remaining” percentage of the test, not the “Done” percentage; that is why I put an “R” in front of the percentage in the output.
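
To check a file name before dropping it into a cron directory, the default naming rule of Debian's run-parts (letters, digits, underscores and hyphens only, when neither --lsbsysinit nor --regex is used) can be approximated as below. The function name runparts_name_ok is made up for this sketch. On Debian-based systems, run-parts --test /etc/cron.hourly should also list the scripts that would actually be run.

```bash
#!/bin/bash
# runparts_name_ok is a hypothetical helper: it succeeds only if the name
# consists entirely of ASCII letters, digits, underscores and hyphens.
runparts_name_ok() {
    [[ $1 =~ ^[A-Za-z0-9_-]+$ ]]
}

for name in smartctl_out smartctl_out.sh; do
    if runparts_name_ok "$name"; then
        echo "$name: would be run"
    else
        echo "$name: would be skipped"
    fi
done
```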
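
To see how the substring offsets pick the fields out of a self-test line, here is a small sketch that applies the same offsets to a sample line. The helper name format_selftest is an assumption, and the sample line is constructed by hand to mimic the column layout rather than captured from a real drive.

```bash
#!/bin/bash
# format_selftest is a hypothetical helper that formats one "# 1" line of
# the self-test log for the given disk name.
format_selftest() {
    local disk=$1 line=$2
    # Offset 5, length 5: test type; offset 25, length 30: status;
    # offset 55, length 3: remaining percentage.
    # The expansions are left unquoted on purpose so that the column
    # padding collapses to single spaces.
    echo "$disk" ${line:5:5} ${line:25:30} R${line:55:3}
}

# Illustrative sample: a 5-character prefix, then a 20-character test
# column and a 30-character status column, then the percentage.
sample=$(printf '# 1  %-20s%-30s%s' 'Short offline' 'Completed without error' '00%')
format_selftest sda "$sample"
```

With this sample the function prints "sda Short Completed without error R00%". If your version of smartctl lays out the columns differently, the offsets need adjusting.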

Feel free to edit the script to your liking!

Then in ~/.conkyrc just add the following (again, format to your liking):

${color grey}S.M.A.R.T. tests:${color}
${exec cat /home/[user]/.smartctl_out.txt}

and you have S.M.A.R.T. test results in your conky print out! 🙂

3 responses to “Using smartmontools – smartd and conky (with cron) to monitor HDD health”

  1. Thanks for the post. I was using this instruction – http://sysadmin.te.ua/linux/smartd.html . It has more explanation about separate smartd log files, almost all smartd.conf options and debug all smartd process (email sending, logs output,daemon startup options). Hope it helps to improve.
