Tristan Caulfield and Andrew Fielder


The presence of unpatched, exploitable vulnerabilities in software is a prerequisite for many forms of cyberattack. Because the discovery of a vulnerability and the creation of an exploit are almost inevitable for all types of software, multiple layers of security are usually used to protect vital systems from compromise. Accordingly, attackers seeking to access protected systems must circumvent all of these layers. Resource- and budget-constrained defenders must choose when to execute actions such as patching, monitoring and cleaning infected systems in order to best protect their networks. Similarly, attackers must decide when to attempt to penetrate a system and which exploit to use when doing so. We present an approach to modelling computer networks and vulnerabilities that can be used to find the optimal allocation of time to different system defence tasks. The vulnerabilities, the state of the system and the actions of the attacker and defender are used to build partially observable stochastic games. These games capture uncertainty about both the current state of the system and its future. The solution to such a game is a policy, which indicates the optimal action to take for a given belief about the current state of the system. We demonstrate this approach using several different network configurations and types of player. We consider a trade-off for the system administrator, who must allocate their time between security-related tasks and other required non-security tasks. The results highlight that, when other tasks must also be performed, following the optimal policy means spending time on only the most essential security-related tasks, while the majority of time is spent on non-security tasks.
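To make the notion of a policy defined over beliefs more concrete, the following is a minimal sketch of how a defender might maintain a belief about a single machine and map it to an action. It is an illustration only and not the model used in the paper: the two states, the transition and observation probabilities, the function names `update_belief` and `choose_action`, and the fixed threshold rule are all assumptions made for this example. In particular, the threshold rule stands in for a policy that would in practice be obtained by solving the game.

```python
from typing import Dict

Belief = Dict[str, float]


def update_belief(
    belief: Belief,
    transition: Dict[str, Dict[str, float]],
    obs_likelihood: Dict[str, float],
) -> Belief:
    """One Bayesian filtering step over a discrete set of hidden states.

    belief[s]          -- current probability that the system is in state s
    transition[s][t]   -- assumed probability of moving from s to t in one step
    obs_likelihood[t]  -- assumed probability of the received observation in state t
    """
    # Prediction: push the current belief through the transition model.
    predicted = {
        t: sum(belief[s] * transition[s][t] for s in belief) for t in belief
    }
    # Correction: weight each state by how well it explains the observation.
    unnormalised = {t: predicted[t] * obs_likelihood[t] for t in predicted}
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return {t: p / total for t, p in unnormalised.items()}


def choose_action(belief: Belief, threshold: float = 0.5) -> str:
    """Hypothetical placeholder policy: clean when compromise seems likely."""
    return "clean" if belief["compromised"] >= threshold else "other_work"


# Illustrative numbers only (not taken from the paper).
prior = {"clean": 0.9, "compromised": 0.1}
transition = {
    "clean": {"clean": 0.95, "compromised": 0.05},
    "compromised": {"clean": 0.0, "compromised": 1.0},
}
# Likelihood of seeing a monitoring alert in each state.
alert_likelihood = {"clean": 0.1, "compromised": 0.8}

posterior = update_belief(prior, transition, alert_likelihood)
action = choose_action(posterior)
```

With these assumed numbers, a single alert raises the belief that the machine is compromised above one half, so the placeholder rule selects cleaning. Without the alert, the defender would continue with non-security work.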

Date: 5 November 2015
Published: Journal of Cybersecurity, 2015, 1–15
Publisher: Oxford Academic