The tricky thing about anti-virus software is that it can seem like a fairly mysterious piece of software. How does it work nowadays? What is it meant to do, and what is outside its purview? Every company has its own secret sauce that makes its technology just a little bit different from everyone else’s, so it gets increasingly difficult to say what makes anti-virus go.
Many people know only that anti-virus is supposed to protect them, but they don’t know how, and they may assume it should protect against any nasty thing under the sun. Some people understand that anti-virus software comprises some sort of signature-scanning technology, and a surprising number of people believe this is where anti-virus technology stopped. Any modern anti-virus is going to have some level of additional functionality beyond simple signature scanning, even if you exclude security software suites that have separate, non-scanner technology like a firewall.
In this article, we’ll talk about the three most common types of anti-virus scanning detections you’ll find out there and what they do. It’s not, by any stretch of the imagination, intended to be exhaustive. But it’ll hopefully give you a glimpse into what’s out there and what you should expect your anti-virus product can do.
Specific detection is what a lot of people think of when they think of anti-virus scanners. It looks for known malware by a specific set of characteristics. Each piece of malware uses its own code to do its thing, and a distinctive chunk of that code can serve as its signature. To detect malware specifically, the scanner looks for that signature in a fairly particular place. This technique can be fast, because it can exclude clean files pretty quickly if a researcher does things right, but it’s also fairly easy to evade this sort of detection. Change the code, move it somewhere else, encrypt it, or hide it in some other way, and then the threat doesn’t get detected anymore.
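To make that a bit more concrete, here is a minimal Python sketch of specific detection. It is not how any real product stores or matches its signatures; the `Signature` class, the `scan_specific` function, the malware name, and the byte pattern are all made up for illustration. The only idea it borrows from the description above is "look for known bytes at a particular place."

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Signature:
    """A hypothetical signature: known bytes expected at a fixed offset."""
    name: str
    offset: int
    pattern: bytes


# Made-up example data, not a real signature for any real threat.
SIGNATURES = (
    Signature(name="Example.Trojan.A", offset=4, pattern=b"\xde\xad\xbe\xef"),
)


def scan_specific(data: bytes,
                  signatures: Sequence[Signature] = SIGNATURES) -> Optional[str]:
    """Return the name of the first signature found at its exact offset."""
    for sig in signatures:
        end = sig.offset + len(sig.pattern)
        # Only one small slice is compared, so clean files are ruled out fast.
        if data[sig.offset:end] == sig.pattern:
            return sig.name
    return None


clean = b"MZ\x90\x00just an ordinary program"
infected = b"MZ\x90\x00\xde\xad\xbe\xefpayload"
shifted = b"\x00" + infected  # same code, moved by a single byte

results = (scan_specific(clean), scan_specific(infected), scan_specific(shifted))
```

In this toy version the `shifted` sample slips past the scanner even though its code is unchanged, which is the evasion problem in miniature.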
As malware is getting much more crafty and prevalent, researchers must get more creative to quickly identify known-bad files, lest they be overwhelmed by the ton of samples that arrive every day. So the next step is to group samples by what’s called “families,” which means they’re related by a common code-base.
Generic detection looks for malware that is a variant of a known family; the members of a family are often created by a common group of programmers. Within a generic detection, there is usually common functionality and occasionally common signatures, and the detection is meant to catch both known malware samples and new samples based on that known code-base.
Since malware is all about the money these days, the best way for malware authors to get the biggest bang for the buck is to reuse their code-base. They may add or change minor functionality, or they may just move things around to evade specific detection. Many common malware families are also available as free, open-source code or as malware-creation kits, which makes them quick and easy for others to reuse.
So it makes sense for anti-virus scanners to look for common properties of those popular malware families or known malicious behavior if they want to have any hope of keeping up. These generic detections can be fairly broad or fairly specific. For example, a detection could scan for known exploit code that could be added to known malware or brand-new creations. Or it could look for specific packers that are used by only one malware family.
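Here is a rough Python sketch of what a generic, family-level detection might look like. Everything in it is an assumption made for illustration: the `FamilyRule` class, the `scan_generic` function, the family name, and the code fragments are invented, and real products use far richer descriptions of a family. The point is only that the rule asks for "enough of the family's shared code, anywhere in the file" rather than one exact pattern in one exact place.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class FamilyRule:
    """A hypothetical family rule: match when enough shared fragments appear."""
    family: str
    fragments: Tuple[bytes, ...]
    min_matches: int


# Made-up fragments standing in for code shared across a family's code-base.
FAMILY_RULES = (
    FamilyRule(
        family="Example.Family",
        fragments=(b"connect_c2", b"steal_passwords", b"XOR_KEY_1337"),
        min_matches=2,
    ),
)


def scan_generic(data: bytes,
                 rules: Sequence[FamilyRule] = FAMILY_RULES) -> Optional[str]:
    """Return the first family whose shared fragments appear often enough."""
    for rule in rules:
        # Fragments may sit anywhere in the file and in any order.
        hits = sum(1 for fragment in rule.fragments if fragment in data)
        if hits >= rule.min_matches:
            return rule.family
    return None


known_variant = b"...connect_c2...steal_passwords..."
new_variant = b"XOR_KEY_1337...shiny_new_feature...connect_c2"
unrelated = b"connect_c2 appears here, but nothing else from the family"

results = tuple(scan_generic(s) for s in (known_variant, new_variant, unrelated))
```

Raising or lowering `min_matches` is one simple way to picture a detection being made more specific or more broad.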
Heuristic detection is scanning for previously unknown viruses by looking for known suspicious behavior or file structures. This is where scanning gets broader and more speculative. Basically, heuristic detection applies the “smell test” to files. Is there anything about this file that looks hinky? Are its structures odd in a way that would imply that it’s trying to hide something? Does it behave in a way that benign files generally don’t?
Heuristic scanners usually add up a number of things in order to give the program a rating: traits that count against the file and, possibly, traits that count in its favor. If the rating leans towards a lot of bad behavior, it’s likely the file is malicious. A scanner can determine the rating by examining the file statically (on disk) or dynamically (in action, in a virtual environment or a sandbox). Some scanners will allow you to dial up or down your preferred level of paranoia.
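A toy version of that rating system might look like the following Python sketch. The traits, weights, entropy cutoff, and default threshold are all invented for illustration, and the `is_signed` flag is a hypothetical stand-in for "something that counts in the file's favor." It only covers the static case (looking at the bytes on disk), not running the file in a sandbox.

```python
import math
from collections import Counter

# Invented weights for strings that might look suspicious in a program.
SUSPICIOUS_STRINGS = {b"CreateRemoteThread": 3, b"keylog": 4}
HIGH_ENTROPY_CUTOFF = 7.2  # assumed cutoff; packed or encrypted data scores high
HIGH_ENTROPY_WEIGHT = 4
SIGNED_BONUS = 3           # assumed credit for a trusted signature


def shannon_entropy(data: bytes) -> float:
    """Bits of entropy per byte, from 0.0 (uniform filler) to 8.0 (random-looking)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


def heuristic_score(data: bytes, is_signed: bool = False) -> int:
    """Add up points against the file, then subtract points in its favor."""
    score = 0
    if shannon_entropy(data) > HIGH_ENTROPY_CUTOFF:
        score += HIGH_ENTROPY_WEIGHT
    for text, weight in SUSPICIOUS_STRINGS.items():
        if text in data:
            score += weight
    if is_signed:
        score -= SIGNED_BONUS
    return score


def is_suspicious(data: bytes, threshold: int = 5, is_signed: bool = False) -> bool:
    """A lower threshold means a more paranoid scanner."""
    return heuristic_score(data, is_signed) >= threshold


sample = b"hello " * 100 + b"CreateRemoteThread" + b"keylog"
verdict_default = is_suspicious(sample)
verdict_signed = is_suspicious(sample, is_signed=True)
verdict_paranoid = is_suspicious(sample, threshold=3, is_signed=True)
```

The `threshold` parameter plays the role of the paranoia dial: the same file with the same score can be waved through at one setting and flagged at another.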
Did you notice how many times I used the word “known”? That was no accident. Anti-malware scanners look for known quantities: code, behavior, structures, or signatures. Because of the way anti-virus scanners interact with the system, they are not designed to look for unexpected qualities. Sometimes anti-virus scanners will detect new malware samples or families, but only because those samples contain the known badness the scanner is already looking for. And as anti-virus scanners (and researchers) get smarter, this does happen more often than it did way back when. But this is why you need layered defenses. There are other tools, such as firewalls, that interact with the system in such a way that they can warn you about behavior that is simply anomalous, or block things that you have not specifically authorized.
These different detection techniques bring with them pros and cons. The more specific a detection, the quicker and more accurate it is. The broader the detection, the more likely it is that an innocent file will be flagged as suspicious. Searching for several broad conditions at once can make detection more accurate, but also slower.
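The trade-off can be pictured with a small Python sketch. The condition strings, the two sample files, and the `flag_any` and `flag_all` functions are made up for illustration, and substring checks are a big simplification of what a scanner really evaluates.

```python
# Invented "broad conditions": strings that legitimate tools also contain.
BROAD_CONDITIONS = (b"VirtualAllocEx", b"WriteProcessMemory", b"CreateRemoteThread")


def flag_any(data: bytes) -> bool:
    """Very broad: a single condition is enough to flag the file."""
    return any(condition in data for condition in BROAD_CONDITIONS)


def flag_all(data: bytes) -> bool:
    """Stricter: every condition must be present before the file is flagged."""
    return all(condition in data for condition in BROAD_CONDITIONS)


# A harmless tool that happens to use one of the same calls.
benign_debugger = b"patches breakpoints with WriteProcessMemory"
# A sample that combines all three, as a code injector might.
injector = b"VirtualAllocEx; WriteProcessMemory; CreateRemoteThread"

verdicts = {
    "benign, any": flag_any(benign_debugger),
    "benign, all": flag_all(benign_debugger),
    "injector, all": flag_all(injector),
}
```

Here the single broad condition flags the innocent debugger, while the combined check does not. The price is that confirming a match with `flag_all` means checking every condition rather than stopping at the first hit, which gives a rough feel for where the extra scanning time goes.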
Anti-virus scanners all do a delicate balancing act between speed, accuracy, and proactivity, mostly favoring one direction or another depending on the specific needs of their customer base. Home users are more likely to be hit with common threats, so they probably don’t need super-duper paranoid heuristics. Products that focus on home users can be speedier and less proactive. Corporate users have to deal with targeted attacks that may not be seen anywhere else in the world, so they can sacrifice a little speed in the name of extra security. Products that focus on smaller businesses naturally fall somewhere in between.
Hopefully this gives you a better idea of what is going on behind the scenes of your anti-virus scanner so it seems more friendly and approachable and less mysterious.