Christian Wagner: Leveraging uncertainty in cyber security

The key to the plot of Philip K. Dick’s story The Minority Report (spoiler alert) is that there were three precognitives, and in cases where two of them agreed it was sometimes critical to look at what the dissenter was saying.

Considering dissenting outliers and managing uncertainty are part of the new EPSRC- and NCSC-funded RISCS project Leveraging the Multi-Stakeholder Nature of Cyber Security, led by Christian Wagner from the Lab of Uncertainty in Data and Decision Making (LUCID) at the University of Nottingham. The goal is to establish the foundations for a digital “Online CYber Security System” (OCYSS) that will offer decision support by rapidly bringing together information on system vulnerabilities and alerting organisations that may be affected by them.

For the last five to ten years, Wagner, whose background is in handling uncertain data such as that collected by sensors in robotics or captured explicitly from people, has been working with colleagues from the University of Nottingham on problems of interest to CESG, NCSC’s precursor. “A key aspect they were interested in,” he says, “was that when the vulnerabilities of systems are assessed, usually multiple cyber security experts look at them, some internal, some external. Commonly, they break down these systems by attack vectors and attack paths. Then each path is split into steps – hops. Compromising a system may involve multiple paths and hops, so what they were interested in initially was – how do we capture the unavoidable uncertainty in these assessments and what are the patterns in the differences between individual experts’ assessments?”

In practice, experts in vulnerability assessment are not asked to fix the systems but to highlight potential weaknesses. The fundamental questions, therefore, are: first, how confident can we be in the assessments they provide; and second, how should we deal with the variability between multiple people assessing the same system? If their assessments are different, how do we know which assessment we should follow? The reports made by Philip K. Dick’s three “precogs” were simply studied for majority agreement – which turned out to be the problem. That’s not as valid an approach in cyber security, where often one individual may hold key knowledge, or experts from different areas may be aware of different key aspects. A final key question: what is the minimum – or ideal – number of expert opinions needed to give you confidence that you’ve got the right assessment?

At first glance, the prior work Wagner cites as relevant appears to have little to do with security. The 2015 paper Eliciting Values for Conservation Planning and Decisions: A Global Issue was published in the Journal of Environmental Management. Yet a closer look finds the connection: he and co-authors Ken J. Wallace and Michael J. Smith discussed the difficulty of decision-making about the conservation and use of natural resources in the face of competing human values and, particularly, how to incorporate the value of human well-being into otherwise technical deliberations. Especially relevant was the way stakeholders were asked to rate the importance of each of the values they were considering.

Few of us can have escaped the five-point scales so often used in such surveys, the ones that ask you to locate your reaction along a given range – for example, from strongly agree to strongly disagree, or from very important to very unimportant. These Likert scales are the most common tool in this type of survey, and modern pollsters love them because they’re easily turned into numbers representing our reactions that can be simply compared and quantitatively manipulated. Yet even in the limited context of rating your satisfaction with a hotel room, the approach is frustratingly constraining: was that bed a 3 or a 4? Is there a way to indicate that the mattress was great but the bed’s positioning right under the air conditioning unit doomed its occupant to discomfort? In a more complex realm, like conservation, where the values are fuzzier – human physical health, or the spiritual value of the outdoors – the Likert scale approach throws away the very real uncertainty people feel in marking these assessments and their observable hesitation in picking a single number.

“If you watch people completing these,” Wagner says, “what usually happens is they display hesitancy between points.” He and his co-authors set out to see if they could get something more out of these quantitative techniques. Their solution, which they have since applied to NHS quality of life assessments, asks subjects instead to draw an ellipse around a point on a continuous scale – for example, 1 to 10. The position of the ellipse indicates the response; its width indicates the scale of the uncertainty the subject associates with their answer. “It’s intuitive to people to doodle,” Wagner says. In assessing the results, the researchers measure the uncertainty surrounding the response as well as the response itself.

Figure: Wagner’s diagram showing how uncertainty is captured via a modified Likert scale
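
To make the idea concrete, here is a minimal sketch, in Python, of how such interval-valued responses might be represented and summarised. It is illustrative only, not the LUCID tooling, and the response values are invented; the centre of each drawn ellipse is treated as the rating and its width as the respondent’s uncertainty.

```python
from statistics import mean

# Hypothetical ellipse responses on a 1-10 scale, recorded as (lower, upper)
# bounds: the centre of each interval is the rating, its width the uncertainty.
responses = [(6.0, 8.0), (3.5, 4.5), (7.0, 9.5), (5.0, 5.5)]

centres = [(lo + hi) / 2 for lo, hi in responses]  # the ratings themselves
widths = [hi - lo for lo, hi in responses]         # how unsure each respondent was

print(f"mean rating:      {mean(centres):.2f}")
print(f"mean uncertainty: {mean(widths):.2f}")
```

Unlike a single tick on a Likert scale, both numbers survive into later analysis, so the hesitation Wagner describes becomes data rather than noise.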

The result is a level of quantitative information that goes beyond standard questionnaire approaches, coming a very small step closer to powerful qualitative techniques such as interviews, which can’t be used for this type of work because they cannot be scaled up for the continuous collection of large samples. Wagner’s process captures more information, and rapidly; people actually find this method of answering survey questions faster than using a Likert scale. Research partner Carnegie Mellon is involved in this aspect: what is it people like, and how well does the technique capture the uncertainty in people’s minds as opposed to just showing that people like circling things? These captured intervals also offer greater opportunities for algorithmic analysis and modelling.

The statistical results from initial work suggest that asking five experts to assess a system provides a good level of coverage that can inspire confidence.

Beyond this, a crucial question addressed by the LUCID team when considering attack vectors and their component stages – or hops – is: if you know the vulnerability of each individual hop, can you infer the vulnerability of the overall vector? This isn’t as straightforward as it might seem because of uncertainty and dependencies between hops: hops A and B may be fine in isolation, but together they may be a problem.

In early work, the LUCID team collected vulnerability assessments of attack vectors and associated hops for a series of scenarios, proceeding to show how the individual hop vulnerabilities can be combined to reproduce the overall vulnerability of a vector. “It’s neat that we could show how a well-known mathematical operator – an ordered weighted average – can replicate the judgement of experts when combining the vulnerability of hops into an overall vulnerability assessment for a vector,” Wagner says. “In practice, it means that experts expect the hop with the highest vulnerability to have the largest influence over the vulnerability of the overall vector – which intuitively makes sense.”
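
As a rough illustration of what an ordered weighted average does, the sketch below combines three invented hop vulnerabilities into a single vector score. The weights are made up, but they decrease with rank so that, as Wagner describes, the most vulnerable hop has the largest influence on the result.

```python
def owa(scores, weights):
    """Ordered weighted average: sort the scores, then weight them by rank."""
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))

# Hypothetical hop vulnerabilities for one attack vector (0 = secure, 1 = wide open).
hop_vulnerabilities = [0.2, 0.9, 0.4]

# Illustrative weights, summing to 1 and decreasing so the worst hop dominates.
weights = [0.6, 0.3, 0.1]

print(f"vector vulnerability: {owa(hop_vulnerabilities, weights):.2f}")  # 0.68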

One of the practical uses of this work addresses a really important issue facing the NCSC (formerly CESG): there simply aren’t enough expert assessors to look at all the systems we have. The country’s thousands of systems may each have different – and changing – attack paths, but all these paths are made up of hops, and the hops may be shared across many systems. In an ideal world, one could reduce the required effort by looking at only the constantly changing hop vulnerabilities and aggregating them to tell companies what their highest-risk paths of attack are.
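
A small, purely illustrative sketch of that idea: maintain one score per hop, express each attack path as a sequence of hops, and re-rank the paths whenever a shared hop is reassessed. The hop names, scores and aggregation rule below are all invented, not taken from OCYSS.

```python
def owa(scores):
    """Illustrative ordered weighted average with rank-decreasing, normalised weights."""
    ordered = sorted(scores, reverse=True)
    raw = [2 ** -i for i in range(len(ordered))]  # 1, 0.5, 0.25, ...
    total = sum(raw)
    return sum((w / total) * s for w, s in zip(raw, ordered))

# Hop vulnerabilities, assessed once and shared across many systems' paths.
hop_scores = {"phishing": 0.8, "privilege-escalation": 0.5, "lateral-movement": 0.3}

# Attack paths for one organisation, each a sequence of hops.
paths = {
    "path-A": ["phishing", "privilege-escalation"],
    "path-B": ["lateral-movement", "privilege-escalation", "phishing"],
}

# Rank paths by aggregated vulnerability so the riskiest are flagged first.
for name in sorted(paths, key=lambda p: owa([hop_scores[h] for h in paths[p]]), reverse=True):
    print(name, round(owa([hop_scores[h] for h in paths[name]]), 2))
```

Updating a single entry in hop_scores changes the ranking for every path that includes that hop, which is the effort-saving the project is aiming for.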

“It’s not a panacea,” Wagner warns. “We’re not solving cyber security.” However, “The system specifically addresses how we deal with not having enough experts to look at all these vectors and how we might minimise the time it takes for new knowledge on vulnerabilities to be available to relevant system users.” Eventually, the idea behind OCYSS is to have an online platform that will make it possible for experts and other stakeholders such as software providers to combine their insights rapidly and efficiently.

What is crucial is retaining uncertainty: “In a lot of areas of science there is an expectation that there is a crisp truth,” Wagner says. In this case, the fundamental point is that there may be no one crisp answer, just as you can’t supply a single definitive number to answer the question, “How safe is running?” Therefore, it’s essential to preserve the uncertainty of those inputs all the way through the process. Similarly, it’s important not to ignore outliers: even if four out of your five experts choose one thing and just one chooses another, it’s important to spend a little time looking at that minority report.

About Wendy M. Grossman

Freelance writer specializing in computers, freedom, and privacy. For RISCS, I write blog posts and summaries of meetings and talks.