Bit Flock–A cloud what?

In a previous post I said that “Bit Flock is a cloud service for hard drive S.M.A.R.T. health data”.

I should also add the fact that it’s free.


Alright, I have to admit, it seems like everything these days has a cloud service. So the question is, who needs Bit Flock? With so many good paid and free S.M.A.R.T. utilities out there, why another one?

In order for something to be useful it needs to solve a concrete problem and it needs to do it well.

Let’s start at the beginning.

Why should anyone care?

If you’re reading this, chances are you own a computer. All your data is stored on either a Solid State Drive or a Hard Disk Drive. Both of these devices will stop working after some number of years. At that time, the data on those drives may become damaged or inaccessible. Drive recovery shops are very expensive. I’ve heard anecdotal stories of some old 20MB hard drive still going strong today, but let’s face it, modern hard drives don’t work for that long.

You should care because you don’t want to lose your data. Even if you have a backup, you will want to know if your current drives are in trouble. A hard drive that’s not healthy can cause system crashes, performance problems, or worse, data corruption.

What is S.M.A.R.T.?

There’s a great article about S.M.A.R.T. on Wikipedia, but I’ll summarize. At the end of the day S.M.A.R.T. has only one job: to tell your system whether your hard disk is going to fail within the next 24 hours. The answer to that question is either yes or no. If it’s yes, then you’d better hope that you have a backup.

But S.M.A.R.T. isn’t good at predicting drive failure?

That’s the common knowledge floating around on the Internet these days. Based on my experience, I absolutely believe that it’s true. But let’s redefine the problem.

What is important to me, and I think to computer users everywhere, is not knowing when the hard drive needs replacement; it’s knowing when my data is in danger of being lost. These are not necessarily the same thing.

Look at it this way: if you save an important file to your documents folder and then come back a week later and can’t access it, is that a problem? S.M.A.R.T. wouldn’t think so. It has a backup plan for situations like this. It remembers the sector on the drive that can’t be read, and the next time you write to it, it will be “re-mapped” to a known good one from the spare pool.

In other words, it assumes that it’s ok for your important file to get lost. It won’t sound any alarm bells or set off any warnings but will instead quietly cover up the evidence the next time you write anything to that damaged sector, self-repairing the drive and making it appear as if nothing had gone wrong.

While S.M.A.R.T. is not good at detecting drive failure, it’s generally really good at detecting when the situation above happens. It just doesn’t do anything about it. What you need is a secondary utility to go in and look at the data to let you know that something like that happened.
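As a rough illustration of what such a secondary utility might check, here is a minimal Python sketch. It assumes the commonly used attribute IDs 5 (Reallocated Sectors Count), 197 (Current Pending Sector Count) and 198 (Offline Uncorrectable); vendors can deviate from these, and the function name and input format are made up for this example. It is not how Bit Flock itself is implemented.

```python
# Commonly used SMART attribute IDs (assumption: the drive follows them).
REALLOCATED_SECTORS = 5
CURRENT_PENDING_SECTORS = 197
OFFLINE_UNCORRECTABLE = 198


def data_loss_warnings(raw_values):
    """Return warnings for a dict of {attribute_id: raw_value}.

    Missing attributes are treated as zero.
    """
    warnings = []

    pending = raw_values.get(CURRENT_PENDING_SECTORS, 0)
    if pending > 0:
        warnings.append(
            f"{pending} sector(s) could not be read and are waiting to be re-mapped"
        )

    uncorrectable = raw_values.get(OFFLINE_UNCORRECTABLE, 0)
    if uncorrectable > 0:
        warnings.append(f"{uncorrectable} sector(s) were found to be uncorrectable")

    reallocated = raw_values.get(REALLOCATED_SECTORS, 0)
    if reallocated > 0:
        warnings.append(
            f"{reallocated} sector(s) have already been re-mapped to spares"
        )

    return warnings


if __name__ == "__main__":
    # Hypothetical readings: 3 unreadable sectors, 14 already re-mapped.
    for line in data_loss_warnings({5: 14, 197: 3, 198: 0}):
        print("WARNING:", line)
```

The point of the sketch is that the drive already records these counters; something just has to read them and tell you.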

This is just one example. Bit Flock is designed to look for these kinds of problems and report them in a clean and user-friendly manner, without dumbing down the information.

What about other S.M.A.R.T. clients? Why do we need a cloud-based solution?

The problem with detecting these types of problems is that you need to dive deep into the different SMART attributes and interpret them in specific ways. That’s because each manufacturer uses SMART in their own proprietary manner. The meaning of an attribute may change from one drive to another and the format of the attribute data may also change from one manufacturer to another. Showing you the actual numbers of the attribute doesn’t tell you enough.

I’ll give you two concrete examples:

1. Seagate

On some Seagate drives, the Seek Error Rate attribute has a special encoding where it’s actually a percentage over a time period.

A SMART client that’s not aware of this fact will just show you some large number. If it’s a good client, it might explain what a seek error rate is, leading you to assume that your drive is having a huge number of seek errors and may fail at any moment.

This would be incorrect. First of all, seek errors are normal for a hard disk; they happen all the time. Second of all, that particular value is not a single count at all. On specific Seagate drives it’s actually a rate. Properly decoded, it can be interpreted as something like “< 0.01% (16 / 261039)”. That means that of the past 261,039 seeks, 16 were not successful and the drive had to try again. This is exactly what Bit Flock does for you, and more.
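To make the decoding concrete, here is a small Python sketch. It assumes the layout that is commonly reported for these Seagate drives: a 48-bit raw value whose upper 16 bits hold the number of seek errors and whose lower 32 bits hold the total number of seeks. That layout is an assumption, not something confirmed by Seagate, and the function names are made up for this example.

```python
def decode_seek_error_rate(raw):
    """Split a raw value into (errors, seeks).

    Assumed layout: upper 16 bits = seek errors, lower 32 bits = total seeks.
    """
    errors = (raw >> 32) & 0xFFFF
    seeks = raw & 0xFFFFFFFF
    return errors, seeks


def format_seek_error_rate(raw):
    """Render the raw value as a percentage plus the underlying counts."""
    errors, seeks = decode_seek_error_rate(raw)
    if seeks == 0:
        return "n/a (0 seeks)"
    percent = 100.0 * errors / seeks
    # Very small non-zero rates are shown as "< 0.01%" instead of "0.00%".
    label = "< 0.01%" if 0 < percent < 0.01 else f"{percent:.2f}%"
    return f"{label} ({errors} / {seeks})"


if __name__ == "__main__":
    # Build the raw value for 16 errors in 261,039 seeks.
    raw = (16 << 32) | 261039
    print(raw)                          # the "large number" a naive client shows
    print(format_seek_error_rate(raw))  # the decoded, readable form
```

A client that prints only the first line leaves you staring at a number in the tens of billions; the second line is the same data, interpreted.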

Having a seek error rate of < 0.01% is not a problem, but having a seek error rate of 5.21%, combined with 14 reallocated sectors, should probably alarm you. Bit Flock will let you know that your data may be in danger with a friendly warning.

This example is taken from two real Seagate hard drives that are part of Bit Flock.

Bit Flock can detect this type of problem because:

  1. It knows that you have a particular Seagate hard drive that encodes its seek error rate in this way.
  2. It understands that a 5% seek error rate is too high, given that other similar models in the flock don’t experience this.
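One simple way to express "too high compared to similar models" is to compare a drive's rate against the median of its peers. The Python sketch below does that. The peer rates, the factor of 10 and the 0.1% floor are all invented for illustration; I'm not claiming these are the rules Bit Flock actually uses.

```python
from statistics import median


def is_abnormal(rate_percent, peer_rates_percent, factor=10.0, floor=0.1):
    """Return True if a rate is far above the median of similar drives.

    factor and floor are illustrative values, not tuned thresholds.
    The floor keeps tiny, harmless differences from triggering a warning.
    """
    if not peer_rates_percent:
        return False  # nothing to compare against
    threshold = max(floor, factor * median(peer_rates_percent))
    return rate_percent > threshold


if __name__ == "__main__":
    # Made-up seek error rates (in percent) for similar models in the flock.
    peers = [0.004, 0.006, 0.010, 0.003]
    print(is_abnormal(0.006, peers))  # a typical drive
    print(is_abnormal(5.21, peers))   # the troubled drive from the example
```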

2. Solid State Disks

Solid state disks support SMART just like hard disks do, but their SMART data needs to be interpreted differently. For example, the “Reallocated Sector Count”, which normally implies data loss, is not that critical on an SSD. That’s because an SSD reallocates damaged sectors before writing your data, unlike an HDD, which typically reallocates sectors on the next write after some of your data can’t be read. This means that you don’t typically lose data when an SSD reallocates sectors.
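In code, the difference boils down to the same attribute mapping to a different severity depending on the drive type. Here is a minimal Python sketch; the severity labels and the function name are made up for this example.

```python
def reallocated_sector_severity(drive_type, reallocated_count):
    """Map a reallocated sector count to a severity for "hdd" or "ssd".

    The labels ("ok", "info", "warning") are illustrative only.
    """
    if drive_type not in ("hdd", "ssd"):
        raise ValueError(f"unknown drive type: {drive_type!r}")
    if reallocated_count == 0:
        return "ok"
    # On an SSD the sector is typically swapped out before data is written,
    # so a non-zero count is worth noting but rarely means lost data.
    if drive_type == "ssd":
        return "info"
    # On an HDD the re-map usually follows a failed read, so data may be gone.
    return "warning"


if __name__ == "__main__":
    print(reallocated_sector_severity("hdd", 14))
    print(reallocated_sector_severity("ssd", 14))
```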

Also, an SSD may have a lifetime indicator, which can be shown as a percentage of useful life left. Just like the seek error rate above, this percentage may be encoded in different ways by different manufacturers.

So what’s the point of all of this?

The bottom line is that you need a SMART client that can identify the model of disk that you have, and one that can look for the thing that’s most important to you, data loss.

It needs to be able to intelligently compare your data with data from other similar hard drives in order to determine what’s abnormal.

It needs to be able to quickly adapt to new drives on the market and update existing data if more information becomes available about any specific drive.

This implies a large database of drive types and attribute interpretation data that can be updated on the fly.
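To give a feel for what such a database could look like, here is a toy Python sketch: a list of model patterns, each with its own decoders per attribute, which can be extended at runtime. Everything in it is an assumption made for illustration: the "ST*" pattern for Seagate models, the 16/32-bit split for attribute 7 (Seek Error Rate), the "EXAMPLE-SSD" model name and the use of attribute 231 for life left. The real Bit Flock database is not shown here.

```python
from fnmatch import fnmatchcase


def plain(raw):
    """Default interpretation: show the raw number as-is."""
    return str(raw)


def split_rate(raw):
    """Assumed encoding: upper 16 bits = errors, lower 32 bits = seeks."""
    errors, seeks = (raw >> 32) & 0xFFFF, raw & 0xFFFFFFFF
    return f"{errors} / {seeks}"


# Each entry is (model pattern, {attribute_id: decoder}); first match wins.
PROFILES = [
    ("ST*", {7: split_rate}),  # illustrative pattern for Seagate models
]


def register_profile(pattern, decoders):
    """Add a profile at runtime; newer profiles take priority."""
    PROFILES.insert(0, (pattern, decoders))


def interpret(model, attribute_id, raw):
    """Decode a raw value using the first matching profile, else show it raw."""
    for pattern, decoders in PROFILES:
        if fnmatchcase(model, pattern) and attribute_id in decoders:
            return decoders[attribute_id](raw)
    return plain(raw)


if __name__ == "__main__":
    print(interpret("ST3000DM001", 7, (16 << 32) | 261039))
    print(interpret("SOME-OTHER-DRIVE", 7, 42))
    # A hypothetical new SSD model appears; teach the database about it.
    register_profile("EXAMPLE-SSD*", {231: lambda raw: f"{raw}% life left"})
    print(interpret("EXAMPLE-SSD-256", 231, 97))
```

Keeping the interpretation rules as data rather than hard-coding them in the client is what makes it possible to update them on the fly.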

Bit Flock has this right now. It’s free. Go check it out.

I’ll talk more about how Bit Flock identifies your drive and what a similar drive type really means in a later post.