Awais Rashid: Hard problems with real-world impact

Awais Rashid

“It’s always important to me that the work I do, although foundational research, has impact on the real world,” says Awais Rashid (Lancaster). “The biggest societal and industrial challenges, such as those of cyber security in a highly connected world, are also interesting academic challenges.”

Rashid calls himself a “late starter” with computers: while he was growing up in Pakistan in the late 1980s, acquiring an early computer such as the BBC Micro was something “my parents never thought they had to do”. He did, however, want to learn more about them. His first brush with computers was the big mainframe at his father’s workplace. His first degree was in electronics engineering, a course in which, at that time, students didn’t get an opportunity to program until their third year. “I was very impatient,” Rashid says – and so he argued his way into using the lab where the programming course was taught whenever it wasn’t needed for teaching. From the books in the university library and ones that he could afford to buy, he started learning first Pascal and then assembly language for x86 processors. Around then, he also managed to convince his father to buy him a PC.

At this point, “someone smarter than me” told him that there were two ways he could learn more. First, play games, because he would learn how to beat the algorithms behind them. Second, learn how viruses work. Rashid says he’s still not a great game player, but he can usually figure out how to beat the algorithm without having real gaming skills, “to the irritation of people playing against me”. Viruses were new at the time, and the first PC virus (Brain) had in fact been written in Pakistan. Rashid began collecting them by offering an unpaid “almost-service”: people from around the campus brought him their infected systems, which he cleaned up “for fun”. No one, he says, really knew about “computer security”.

He moved to Britain in 1996 to do a post-graduate degree. He had intended to return to Pakistan after completing his master’s degree in software engineering at the University of Essex. However, not expecting a reply, he sent an email inquiry about doing a PhD to the well-known University of Lancaster software engineer Ian Sommerville. In response, Sommerville, who was going on holiday, asked if he could interview the following day. “I travelled over, and within two days I had an offer of a place. It was an odd thing to have happened.” Taken together, his first two degrees – electronics engineering and software engineering – gave him a deep, multi-faceted understanding of how computers work. His PhD was in computer science, but really, he says, “It was at the boundary of software engineering, database systems, and programming languages.” This mix has set the tone for Rashid’s research career: he likes to work on problems located at the boundaries of different disciplines or sub-disciplines.

Following his PhD he worked on tackling complexity in large-scale software systems, solving the issues inherent in dividing these up into modules and studying how to reason about them. This work led to the influential 2003 paper Modularisation and Composition of Aspectual Requirements. This paper was the result of collaboration with Ana M. D. Moreira and Joao Araujo, both based in Lisbon, and sought to understand system-wide properties such as software compatibility – how different modules interact with each other and the trade-offs that have to be made. Rashid notes two things about this paper. First, it was born of a pub conversation. Second, in order to work with his Lisbon colleagues, Rashid applied for a £1,000 grant to enable his travel. That small bit of funding, he says, had outsized results in the form of not only their research but the follow-up work done around the world by many others that eventually resulted in two large European projects. This work also gave Rashid his first brush with security as a key problem.

Over the next five or six years, he worked primarily on general software engineering and modularity, as well as programming language design. In 2008, he and several other researchers began looking at peer-to-peer file-sharing networks (at the time, Gnutella and eDonkey) and came across what looked like child sexual abuse material being shared. The group started thinking about the challenges of traffic analysis and began talking to law enforcement. The then newly formed Child Exploitation and Online Protection Centre had some tools, but was interested in finding ways to support law enforcement in analysing the mass of data in cases involving grooming of children on social networks and in chat forums. In cases of suspected child abuse, the investigators would get the suspects’ computers, but all they could do was run very simple keyword-based searches. One case had so much data that it would have taken an investigator six months to go through it; the tools that Rashid’s group eventually built reduced that to a couple of days.

Rashid already had a history of applying computational techniques for natural language analysis to understand the requirements of computer systems, and thought he could use some of his modularity work to build systems to help investigate these settings. This project developed digital persona analysis techniques to “fingerprint” individuals’ online chat and enable comparisons to try to detect whether the individual is the age and gender they claim to be. “It’s not 100% accurate,” Rashid says, “but it’s pretty good depending on how you configure the threshold levels – 80% to 90% accuracy.” They were able to trial the technique on a real data set provided through their collaboration with law enforcement, and from that build an investigative toolkit. “Given the topic, we really wanted to build something that people could use,” he says. This work was spun out into the company Relative Insight so that it could live on after the project ended in 2011. Two of Rashid’s postdocs moved on to run the spinout, which has since built a mobile child safety version that kids can use themselves.

This work was also published in the 2013 paper Who Am I? Analyzing Digital Personas in Cybercrime Investigations. The project was difficult to work on because of the subject matter, but Rashid did enjoy its multi-disciplinary nature, as it involved not only computer scientists but sociologists and criminologists, and, as part of data collection, led the group to develop hands-on internet safety exercises they ran in schools. The work they did on understanding such criminal behaviours online has gone on to underpin later work, such as the ICOP project, which developed a tool that’s been handed off to law enforcement to auto-detect new and unknown child abuse media when it comes online. The work Rashid is doing within RISCS now as part of the DAPM project is also rooted in this work; using his understanding of how online victimisation happens he is building computational tools he hopes will help people protect themselves.

Rashid’s security work has had two main streams. The first is the work already discussed, on security behaviours, both adversarial and non-adversarial. The second is the security of cyber-physical infrastructure, which formed the basis for a study of existing incidents to determine how attackers exfiltrate data from critical infrastructure organisations. Carried out as part of CPNI’s iData programme, the result was the 2016 paper Discovering “Unknown Known” Security Requirements. A play on Donald Rumsfeld’s “unknown unknowns”, the paper proposes a method for uncovering vulnerabilities that are hidden in known security breaches, but that we don’t know about yet and for which organisations’ security controls have therefore not been updated. This method combines grounded theory with incident fault tree analysis (drawn from the safety literature) to look systematically at patterns across multiple incidents and see which problems are covered by security controls and which are not. The paper concludes by making recommendations for critical security controls that Rashid believes are being adopted.

This work led Rashid to develop an interest in the security of cyber-physical systems, a growing concern that poses the challenge of how to help people who do not understand security but must increasingly deal with it because their organisations run critical infrastructure. His most recent paper, “The Good, the Bad and the Ugly: A Study of Security Decisions in a Cyber-Physical Systems Game”, in review in 2017, creates a board game for studying security decision-making in the critical infrastructure environment. In this area people don’t want to talk directly about their security incidents. The Lego-based Decisions and Disruptions board game, which depicts a small utility infrastructure that players have to defend against attacks, gives them a way to talk about such incidents and decisions without requiring such disclosures. “It’s almost like a sandbox,” he says. The group has built a number of game boxes which they take into organisations; they benefit from collecting data, while the organisations benefit from helping staff understand security issues. The game is released under a Creative Commons licence for anyone to copy and use (Lego not included).

Besides his involvement in the DAPM project, Rashid is currently leading CyBOK, a project that aims to capture the foundational knowledge that should be used to educate people about cyber security.

Rashid is finding security a particularly satisfying subject area to work in. “It’s a wonderful space,” he says. “I like it because some of the really hard problems that are practical problems are also interesting research problems.” This comes back to his key belief about the kind of research he wants to do: “As researchers what we do should and can have an impact on the world around us.”

About Wendy M. Grossman

Freelance writer specializing in computers, freedom, and privacy. For RISCS, I write blog posts and meeting and talk summaries.