Communications of the ACM, September 2014, Vol. 57, No.9
contributed articles
DOI:10.1145/2654847
Defense begins by identifying the targets likely to yield the greatest reward for an attacker's investment.
BY CORMAC HERLEY
Security, Cybercrime, and Scale
A TRADITIONAL THREAT model has been with us
since before the dawn of the Internet (see Figure 1). Alice seeks to protect her resources from Mallory, who has a suite of attacks, k = 0, 1, …, Q − 1; assume, for the moment (unrealistically), that Q is finite and all attacks are known to both parties. What must Alice do to prevent Mallory gaining access? Clearly, it is sufficient for Alice to block all Q possible attacks. If she does, there is no risk. Further, assuming Mallory will keep trying until he exhausts his attacks (or succeeds), it is also necessary; that is, against a sufficiently motivated attacker, it is both necessary and sufficient that Alice defend against all possible attacks. For many, this is a starting point; for example, Schneider [14] says, "A secure system must defend against all possible attacks, including those unknown to the defender." A popular textbook [13] calls it the "principle of easiest penetration," whereby "An intruder must be expected to use any available means of penetration." An often-repeated quip from Schneier, "The only secure computer in the world is unplugged, encased in concrete, and buried underground," reinforces the view.
How does this scale? That is, how does this model fare if we use it for an Internet-scale population where, instead of a single Alice, there are many? We might be tempted to say, by extension, that unless each Alice blocks all Q attacks, then some attacker would gain access. However, a moment's reflection shows this cannot always be true. If there are two billion users, it is numerically impossible that each would face the "sufficiently motivated" persistent attacker of our starting assumption; there simply are not two billion attackers, or anything close to it. Indeed, if there were two million rather than two billion attackers (making cybercriminals approximately one-third as plentiful as software developers worldwide), users would still outnumber attackers 1,000 to one. Clearly, the threat model in Figure 1 does not scale.
Sufficient ≠ necessary-and-sufficient.
The threat model applies to some users and targets but cannot apply to all. When we try to apply it to all, we confuse sufficient and "necessary and sufficient." This might appear a quibble, but the logical difference is enormous and leads to absurdities and contradictions when applied at scale.
First, if defending against all attacks is necessary and sufficient, then failure
key insights

• A financially motivated attacker faces a severe constraint unrelated to his technical abilities: the average gain minus average cost of an attack must be positive.

• Without a cost-effective way of telling profitable targets from unprofitable targets, an attack is no use to a financially motivated attacker.

• The difficulty of finding profitable targets is extreme when density is small; a 10x reduction in the density of profitable targets generally results in much more than a 10x reduction in economic opportunity for the attacker.
to do everything is equivalent to doing nothing. The marginal benefit of almost all security measures is thus zero. Lampson [11] expressed it succinctly: "There's no resting place on the road to perfection."
Second, in a regime where everything is necessary, trade-offs are not possible. We have no firm basis on which to make sensible claims (such as that keylogging is a bigger threat than shoulder surfing). Those who adhere to a binary model of security are unable to participate constructively in trade-off decisions.
Third, the assumption that there is only a finite number of known attacks is clearly favorable to the defender. In general, it is not possible to enumerate all possible attacks, as the number is growing constantly, and there are very likely to be attacks unknown to the defenders. If failing to do everything is the same as doing nothing (and Alice cannot possibly do everything), the situation appears hopeless.
Finally, the logical inconsistencies are joined by observations that clearly contradict what the model says is necessary. The fact that most users ignore most security precautions and yet escape regular harm is irreconcilable with the threat model of Figure 1. If the model applies to everyone, it is difficult to explain why everyone is not hacked every day.
Modifying the threat model. The threat model of Figure 1 might appear to be a strawman. After all, nobody seriously believes that all effort short of perfection is wasted. It is doubtful that anyone (especially the researchers quoted earlier) adheres to a strictly binary view of security. Rather than insist that the threat model always applies, many use it as a starting point appropriate for some situations but overkill for others. Some modification is generally offered; for example, the popular textbook mentioned earlier [13] codifies this as the "principle of adequate protection," saying "[computer items] must be protected to a degree consistent with their value." A more realistic view is that we start with some variant of the traditional threat model (such as "it is necessary and sufficient to defend against all attacks") but then modify it in some way (such as "the defense effort should be appropriate to the assets").
However, while the first statement is absolute and involves a clear call to action, the qualifier is vague and imprecise. Of course we cannot defend against everything, but on what basis should we decide what to neglect? It helps little to say the traditional threat model does not always apply unless we specify when it does, and what should be used in its place when it does not. A qualifier that is just a partial and imprecise walkback of the original claim clarifies nothing. Our problem is not that anyone insists on rigid adherence to the traditional threat model, so much as that we lack clarity as to when to abandon it and what to take up in its place when we do. Failure to be clear on this point is an unhandled exception in our logic.
This matters. A main reason for elevated interest in computer security is the scale of the population with security needs. A main question for that population is how to get the best protection for the least effort. It is of the first importance to understand accurately the threats two billion users face and how they should respond. All models may, as it is said, be wrong, but failure to scale, demands of unbounded effort, and inability to handle trade-offs are not tolerable flaws if one seeks to address the question of Internet threats. The rest of this article explores modifications of the traditional threat model.
Financially Motivated Cybercrime
The threat model of Figure 1 tried to abstract all context away. There is no reference to the value of the resource, the cost of the attack, or how
Figure 1. In a traditional threat model, a single user faces a single attacker; given a sufficiently motivated attacker, it is necessary and sufficient to block all attacks.

Figure 2. Dividing attacks into financial and non-financial; here, financial attacks are further divided into scalable and non-scalable.
Mallory came to focus his attention on Alice. The model does not distinguish between finite and infinite gain or between zero and non-zero cost. Abstraction like this is useful; it is far more powerful if we can solve the general problem without resorting to specifics. Unfortunately, the attempt breaks down at scale; the binary view of security must be qualified.
When money is the goal, it seems reasonable to assume Mallory is "sufficiently motivated" when the expected gain from an attack exceeds the cost. I now examine whether focusing on the sub-problem of financially motivated cybercrime will allow progress on the questions of exactly when and how to deviate from the binary model. I propose a bifurcation of attacks (see Figure 2) into those that are financially motivated and those that are not.
Profit at scale. A main reason for concern with cybercrime is the scale of the problem. We might be less concerned if it were a series of one-off or isolated attacks rather than an ongoing problem. Only when the one-time costs can be amortized over many attacks does it become a sustained phenomenon that affects the large online population. To be sustainable, there must first be a supply of profitable targets and a way to find them. Hence, the attacker must do three things: decide who and what to attack, successfully attack (or get access to a resource), and monetize that access.
A particular target is clearly not worthwhile unless gain minus cost is positive: G − C > 0. Thus, when attacks are financially motivated, the average gain from each attack, E{G}, must be greater than the cost, C:

E{G} − C > 0. (1)
C must include all costs, including those of finding viable victims and of monetizing access to whatever resources Mallory targets. The gain must be averaged across all attacks, not only the successful ones. If either E{G} → ∞ or C = 0, then equation (1) represents no constraint at all. When this happens we can revert to the traditional threat model with no need to limit its scope; Alice can neglect no defense if the asset is infinitely valuable or attacks have no cost.
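As a concrete reading of equation (1), the arithmetic can be sketched in a few lines. The success rate, gain, and cost figures below are hypothetical assumptions chosen only to illustrate the constraint, not data from this article.

```python
# Sketch of the profit constraint E{G} - C > 0 from equation (1).
# All numbers are hypothetical, chosen only to illustrate the arithmetic.

def expected_profit_per_attack(success_rate: float,
                               gain_per_success: float,
                               cost_per_attack: float) -> float:
    """E{G} - C, where the gain is averaged over ALL attacks, not just successes."""
    expected_gain = success_rate * gain_per_success  # E{G}
    return expected_gain - cost_per_attack

# A campaign where 1 attack in 1,000 succeeds for a $500 gain, but each
# attack (finding, attacking, and monetizing a target) costs $1 all-in:
print(expected_profit_per_attack(0.001, 500.0, 1.0))  # -0.5: not viable
# The same attack becomes viable only if cost falls or the success rate rises:
print(expected_profit_per_attack(0.01, 500.0, 1.0))   # 4.0: viable
```

Note that the break-even point moves with either term: a technically impressive attack with a tiny success rate can still be economically useless.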
That gain is never infinite needs no demonstration. While it should be equally clear that cost is never precisely zero, it is common to treat cybercrime costs as small enough to neglect. Against this view I present the following arguments: First, if any attack has zero cost, then all targets should be attacked continuously, and all profitable opportunities should be exhausted as soon as they appear. Instead of "Why is there so much spam?," we would ask "Why is there so little?," as it would overwhelm all other traffic. Second, while a script may deliver victims at very low cost, the setup and infrastructure are not free. Even if we grant that a script finds dozens of victims in one day (the Internet is big, after all), why should the same script find dozens more the next day, and again the day after? Why should it do so at a sustained rate? Finally, as discussed later, while scripts might achieve access to resources at low cost, the task of monetizing access is generally very difficult. Thus, I argue that not only is attacker cost greater than zero, it is the principal brake on attacker effort.
Attacks that scale. While I have argued that C > 0, it is clear that the majority of users are regularly attacked by attacks that involve very low cost per attacked user. I find it useful to segment attacks by how their costs grow. Scalable attacks are one-to-many attacks with the property that cost (per attacked user) grows slower than linearly; for example, doubling the number of users attacked increases the cost very little [8]:

C(2N) ≪ 2 · C(N). (2)
Many of the attacks most commonly seen on the Internet are of this type. Phishing and all attacks for which spam is the spread vector are obvious examples. Viruses and worms that spread wherever they find opportunity are others. Drive-by download attacks (where webpage visitors are attacked through browser vulnerabilities) are yet more. Non-scalable attacks are everything else. In contrast to equation (2), they have costs that are proportional to the number attacked: C(N) ∝ N. I add this bifurcation, into scalable and non-scalable attacks, to Figure 2.
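The contrast between the two cost regimes can be sketched with toy cost functions. These functions are assumptions for illustration: a scalable attack is modeled as a fixed setup cost plus a tiny per-target cost, while a non-scalable attack costs roughly the same for every target.

```python
# Toy cost models contrasting equation (2) with C(N) ∝ N.
# The specific constants are hypothetical assumptions.

def scalable_cost(n_targets: int, setup: float = 1000.0,
                  per_target: float = 0.0001) -> float:
    """Spam-like attack: cost is dominated by one-time setup."""
    return setup + per_target * n_targets

def non_scalable_cost(n_targets: int, per_target: float = 50.0) -> float:
    """Per-victim effort (e.g., researched spear phishing): C(N) ∝ N."""
    return per_target * n_targets

# Doubling the audience barely moves the scalable attack's cost,
# satisfying C(2N) << 2·C(N) ...
print(scalable_cost(1_000_000), scalable_cost(2_000_000))    # 1100.0 1200.0
# ... but exactly doubles the non-scalable attack's cost.
print(non_scalable_cost(1_000_000), non_scalable_cost(2_000_000))
```

This is why spam campaigns are ubiquitous while labor-intensive attacks remain rare: only the first cost structure amortizes over an Internet-scale audience.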
Constraints on Financially Motivated Attackers
A financially motivated attacker must decide who and what to attack, attack successfully, then monetize access. The better these activities scale, the greater the threat he represents to the online population. I now examine some of the difficulties of, and constraints on, scalable answers to these questions.
Scalable attacks (attack everybody).
An alternative to solving the problem of deciding whom to attack is to attack everyone. Scalable attacks have inherent advantages over non-scalable attacks. They reach large masses at very low cost, and techniques can be propagated easily. These advantages come with severe constraints, however. Scalable attacks are highly visible; in reaching millions, going unnoticed is
Figure 3. Example ROC curve with line of slope T/d = 20. Only operating points to the left of this line satisfy equation (5) and yield profit. As T/d increases, the true positive rate falls, and fewer viable targets are attacked; for example, with this classifier, when T/d = 10⁴, less than 1% of viable targets will be attacked. (Axes: false positive rate vs. true positive rate.)
difficult. Their broadcast nature gives an alert to both defenders and other would-be attackers. This attracts competition and increases defense efforts.
Scalable attacks are a minority of attack types. It is the exception rather than the rule that costs have only weak dependence on the number attacked. Anything that cannot be automated completely or involves per-target effort is thus non-scalable, as this cost violates the constraint defined in equation (2). Physical side-channel attacks (requiring proximity) are out, as getting close to one million users costs a lot more than getting close to one. Labor-intensive social engineering attacks (such as those described by Mitnick [12]) and the "stuck in London" scam are non-scalable. After an initial scalable spam campaign, the Nigerian 419 scam (and variants) devolves into a non-scalable effort in manipulation. Equally, spear-phishing attacks that make use of information about the target are non-scalable. While the success rate of well-researched spear-phishing attacks may be much higher than that of scatter-shot approaches (such as "Dear Paypal customer"), they are non-scalable. Attacks that involve knowledge of the target are usually non-scalable; for example, guessing passwords based on knowledge of the user's dog's name, favorite sports team, or cartoon character involves significant non-scalable effort. Equally, attacks on backup authentication questions that involve researching where a user went to high school are non-scalable.
While Internet users see evidence of scalable attacks every day, it is actually a minority of attack types that are scalable.
Finding viable targets. Non-scalable attacks resemble the one-on-one attacks of the traditional threat model. However, rather than an attacker who is sufficiently motivated to persist no matter what, we have one who obeys a profit constraint, as in equation (1). The problem (for Mallory) is that profitability is not directly observable. It is not obvious who will succumb to most attacks and who will prove profitable. Since C > 0, the cost of false positives (unprofitable targets) can consume the gain from true positives. When this happens, attacks that are perfectly feasible from a technical standpoint become impossible to run profitably. The cost and difficulty of deciding whom to attack is almost unstudied in the security literature; however, no audit of Mallory's accounts can be complete without it. Unless he has a cost-effective way to identify targets in a large population, non-scalable attacks are of little use to Mallory.
Assume Mallory can estimate a probability, or likelihood, of profit, given everything he observes about a potential target. This is the probability that the target succumbs and access can be monetized (for greater than average cost). Call this P{viable|obs.}. The observables might be address, ZIP code, occupation, and any other factor likely to indicate profitability. Without loss of generality, they can be wrapped into a single one-dimensional sufficient statistic [15]. We assume the cost of gathering the observables is small relative to the cost of the attack. This makes the problem a binary classification [9], so receiver operating characteristic (ROC) curves are the natural analysis tool; the ROC curve is the graph of true positive rate, tp, vs. false positive rate, fp (an example is shown in Figure 3).
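To make the classification view concrete, here is a minimal sketch assuming (as the text does) that the observables collapse to a single statistic, expressed here as a likelihood ratio LR = P{obs.|viable} / P{obs.|non-viable}. The densities and ratio used are hypothetical; the point is only that the posterior shrinks roughly in proportion to d when viable targets are rare.

```python
# Posterior probability that a target is viable, via Bayes' theorem,
# for base rate d and likelihood ratio LR. Numbers are hypothetical.

def p_viable_given_obs(d: float, likelihood_ratio: float) -> float:
    """P{viable|obs.} = d·LR / (d·LR + (1 - d))."""
    return (d * likelihood_ratio) / (d * likelihood_ratio + (1.0 - d))

# The same observables (LR = 90) at two different densities of viable targets:
print(round(p_viable_given_obs(0.01, 90.0), 3))   # 0.476: decent odds
print(round(p_viable_given_obs(1e-5, 90.0), 5))   # 0.0009: a long shot
```

Holding the classifier fixed, a 1,000x drop in density produces a comparable collapse in the chance that any flagged target is actually viable.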
Let us now examine how the binary classification constrains Mallory. Suppose, in a population of size N, a fraction P{viable} = d of targets are viable. From Bayes's theorem:

P{viable|obs.} = d · P{obs.|viable} / (d · P{obs.|viable} + (1 − d) · P{obs.|non-viable}). (3)

When d is small, P{viable|obs.} is approximately proportional to density, so the difficulty of finding a viable target gets worse as d falls. A set of observables that gives a 90% chance of finding a viable target when d = 0.01 gives only a 0.09% chance when d = 10⁻⁵. So observables that promise a near "sure thing" at one density offer a worse than 1,000-to-1 long shot at another.

Mallory presumably decides to attack depending on whether or not P{viable|obs.} is above or below some threshold, T. The threshold T will generally be set by a budget if, say, an attacker needs one attack in every 1/T (such as 1-in-20 or 1-in-100) to be profitable. Then, from equation (3), he must have (when d is small):

P{obs.|viable} ≥ (T/d) · P{obs.|non-viable}. (4)

This constraint says the observables must be a factor of T/d more common among viable targets than non-viable ones. If 1-in-10,000 is viable, and Mallory needs one attack in 20 to succeed, he must identify observable features that are T/d = 500x more common in the viable population than in the non-viable. The ROC curve gives a geometric interpretation. Mallory finds d·tp·N viable targets in d·tp·N + (1 − d)·fp·N attacks. To satisfy the budget constraint, the ratio of successes to attacks must be greater than T, so (when d is small) we get:

tp/fp ≥ T/d. (5)

Thus, only points (fp, tp) on the ROC curve to the left of a line with slope T/d will satisfy Mallory's profit constraint. To illustrate, a line of slope 20 is shown in Figure 3.

Since the slope of the ROC curve is monotonic [15], as we retreat to the left, tp/fp increases; equation (5) can almost always be satisfied for some points, no matter how good or bad the classifier. However, as we retreat leftward, tp decreases, so a smaller and smaller fraction of the true positives, or viable targets, are attacked; for example, for the classifier in Figure 3, when T = 1/10 (Mallory needs one attack in 10 to succeed) and d = 10⁻⁵ (one in 100,000 is viable), Mallory requires tp/fp ≥ 10⁴, which happens only for values tp < 0.01, meaning less than 1% of the viable population is observably profitable. As d decreases, Mallory ends up with a shrinking fraction of a pool that is itself shrinking [9]. Without a very good classifier (with tp high while keeping fp low), most viable victims escape harm.

It is easy to underestimate the difficulty of building good classifiers. Real-world examples from other domains illustrate that this is non-trivial; for example, the rates for mammograms are tp ≈ 0.94 at fp ≈ 0.065 (so tp/fp ≈ 14.5) [4]. For appendectomies, tp ≈ 0.814 at fp ≈ 0.105 (so tp/fp ≈ 7.8) [7]. Even with the benefits of decades of effort and millions of examples of both true and false positives, building a classifier is often extremely difficult. This is especially true when the base rate of sought
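The geometry of the profit constraint can be sketched directly: given a set of ROC operating points, only those with tp/fp ≥ T/d are profitable, and the attacker's reach is the largest true positive rate among them. The toy ROC curve below is an assumption for illustration, not the curve in Figure 3.

```python
# Sketch of the Figure 3 reasoning: only ROC operating points with
# tp/fp >= T/d satisfy the profit constraint of equation (5).
# The toy ROC curve is a hypothetical assumption.

def best_viable_fraction(roc_points, T, d):
    """Largest true positive rate among operating points with tp/fp >= T/d."""
    slope = T / d
    feasible = [tp for fp, tp in roc_points if fp == 0 or tp / fp >= slope]
    return max(feasible, default=0.0)

# A toy concave ROC curve: steep near the origin, flattening toward (1, 1).
roc = [(0.0, 0.0), (1e-6, 0.02), (1e-4, 0.08), (1e-3, 0.3),
       (0.01, 0.6), (0.1, 0.9), (1.0, 1.0)]

# Needing 1 attack in 10 to succeed (T = 0.1) with 1 in 100,000 viable
# (d = 1e-5) demands tp/fp >= 10^4: only the steep left edge qualifies,
# so only a small fraction of viable targets can be profitably attacked.
print(best_viable_fraction(roc, T=0.1, d=1e-5))  # 0.02: 2% of viable targets
```

Rerunning with a larger d (say, 1-in-100 viable) admits most of the curve, which is the sense in which shrinking density shrinks the attacker's economic opportunity faster than linearly.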