This talk/discussion was part of a RISCS/NCSC workshop on securing software development in November 2016. The day’s discussions led directly to the research call that funded Why Johnny Doesn’t Write Secure Software and Motivating Jenny to Write Secure Software, among others.
Sascha Fahl, from the Universität des Saarlandes, outlined his research and future challenges. His research interest is an often-ignored source of security errors: software developers. There is an assumption that because developers are experts, they don’t need the assistance that end users do. Yet developers by and large are not experts in security, and herein lies the root cause of problems such as Heartbleed, Shellshock, and Gotofail. The community blames developers for not writing secure code or applying secure development techniques, but the better answer is to make secure programming easier. Fahl outlined four recent projects researching this area.
Typically, a developer asked to write secure Android code looks for an example that advises what to use. What they find may include weak cryptography or auto-generate broken SSL without any indication that there is a problem. Fahl wants to replace a blame-the-developer culture by making writing secure code easier.
The default Android and iOS APIs implement correct certificate validation. As long as your app is using a certificate signed by one of about 130 pre-installed trusted certificate authorities, you have no problem. However, a certificate from an authority not on that list requires a custom workaround. A developer encountering this problem who uses a search engine to find a list of possible workarounds is likely to find code for disabling certificate validation or turning off hostname verification.
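The gap between the secure default and the typical copied workaround is only a couple of lines. The following is a minimal sketch in Python rather than Android code (the talk summary itself contains no code). It uses the standard `ssl` module; the function names and the `ca_path` parameter are illustrative assumptions.

```python
import ssl


def default_context() -> ssl.SSLContext:
    """Secure default: verifies the certificate chain and the hostname."""
    return ssl.create_default_context()


def insecure_workaround_context() -> ssl.SSLContext:
    """The kind of 'fix' found by searching online. DO NOT USE.

    It makes the connection error disappear by accepting any certificate
    for any host, which leaves the app open to man-in-the-middle attacks.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # hostname verification switched off
    ctx.verify_mode = ssl.CERT_NONE  # certificate validation switched off
    return ctx


def private_ca_context(ca_path: str) -> ssl.SSLContext:
    """Safer alternative: explicitly trust one private CA.

    `ca_path` is a hypothetical path to the CA certificate in PEM format.
    Certificate validation and hostname checks stay enabled.
    """
    return ssl.create_default_context(cafile=ca_path)
```

The third function is the shape of the correct answer to the "authority not on the list" problem: add the authority to what is trusted, rather than removing the check.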
For the 2012 paper Why Eve and Mallory Love Android: An Analysis of Android SSL (In)Security (PDF), Fahl and his co-authors conducted a static analysis of 13,500 free apps from the Play Store, looking for certificate validation vulnerabilities. Fahl found that 17.28% of the apps that use HTTPS include code that fails in certificate validation. Of these, 1,074 included critical code; 790 accepted all certificates; 284 accepted all hostnames. In a manual study of 100 cherry-picked apps, Fahl again found that 21 trusted all certificates and 20 accepted all hostnames. Worse, these apps came from some of the biggest names: Yahoo, Facebook, American Express, Paypal, Google… The upshot: even banking apps will transfer funds over broken SSL. Fahl notes little has changed since the paper was published.
Fahl has even found an anti-virus app for Android that in manual testing proved to be transferring virus signature updates over a broken SSL connection. He created his own database copying form, added a new virus signature, and ran a man-in-the-middle attack against the app. It triggered an update, scan, and successful identification of the virus, which it then removed.
The iOS system is more difficult to study: the apps are in a walled garden and cannot be easily downloaded, making static analysis a tough challenge. The team did, however, manually download 1,000 apps and mount real man-in-the-middle attacks against them. The results: different platform, same problems.
Fahl’s conclusion is that it’s time to rethink how developers code SSL. The researchers contacted 80 of the app developers, 15 of whom agreed to be interviewed. The interviews brought out two key problems. First, the developers are app experts, not mobile or security experts, and as such they implemented the first working solution they found on the internet. Second, they used self-signed certificates for testing and sometimes then forgot to remove the code when the app was released.
Fahl argues that it’s necessary to reduce the overhead and make the common use cases easy. In particular, certificate handling should be adapted to developers’ needs so they don’t have to become security experts. The developers’ wish list includes: configuration, rather than coding; an easy way to use certificate pinning; standardised, reusable warning messages; and support for self-signed certificates. Several changes along these lines have been adopted for Android 7: a redesign of the system that removes the need to write custom code; validation of certificates via configuration and settings in the user interface; the ability to disable developer options for a single app; and public key pinning as a simple configuration option.
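To illustrate what pinning as configuration might look like, here is a minimal Python sketch under stated assumptions. It pins the SHA-256 fingerprint of the whole certificate, a simplification of the public key pinning mentioned above, because extracting the public key needs a certificate parser that the standard library does not provide. The names `fingerprint`, `matches_pin`, `EXPECTED_CERT`, and `PINS` are hypothetical.

```python
import hashlib
import hmac
from typing import Iterable


def fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER-encoded certificate as hex."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_fingerprints: Iterable[str]) -> bool:
    """True if the presented certificate matches one of the configured pins."""
    presented = fingerprint(der_cert)
    return any(
        hmac.compare_digest(presented, pin) for pin in pinned_fingerprints
    )


# The pins live in configuration, not in connection-handling code.
# The bytes below are a placeholder, not a real certificate.
EXPECTED_CERT = b"placeholder certificate bytes"
PINS = {fingerprint(EXPECTED_CERT)}
```

In a real client the certificate bytes could come from `ssl.SSLSocket.getpeercert(binary_form=True)` after the normal validation has succeeded. The pin check is an additional test, not a replacement for validation.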
In a second project, published in IEEE Security and Privacy, Fahl studied the impact of the information sources developers use on code security. Sites like Stack Overflow publish myriad pieces of code that developers can reuse, but copying and pasting from the internet usually results in insecure code. Fahl set out to determine whether this could be measured; how internet sources compare to the official documentation; and what professional developers actually do. Fahl’s online survey attracted 300 participants and 295 valid responses. The result: most use Stack Overflow and search engines. Only a quarter worked exclusively with official documentation.
Fahl ran a lab study of Stack Overflow to determine its impact by assigning each of a group of developers one of four conditions: official documentation only, Stack Overflow, a programming book, or a free choice of references. They were given a skeleton app and an emulator, were not primed for security or privacy, and were assigned four tasks designed to have both secure and insecure solutions. These were: convert HTTP to HTTPS; limit access to an Android service to only apps from the same developer; securely store user ID and password locally; dial a customer support phone number without requesting extra permission. The resulting code was evaluated for correctness and security, and participants reported their sentiment, including the perceived correctness and usefulness of the resources they were assigned to use.
In exit interviews, the 54 participants, 14 of them professional, all of whom passed basic Android knowledge questions, indicated that the free choice option was the easiest, and the book option the most difficult. However, books and official documentation were considered the most correct. In terms of functional correctness, the official documentation came off worst (40%) and Stack Overflow the best (67%). However, in terms of code security, Stack Overflow came in worst (51%) and official documentation the best (86%). Professional developers produced more functional code than the students did – but not significantly more secure code.
The conclusion: Stack Overflow provides quick, functional solutions where the official documentation doesn’t, but the resulting code is less secure than code written with either the official documentation or books. Fahl therefore believes that we need resources that integrate both, and suggests adding a security rating to influence upvoting and integrating questions and answers into official documentation. In addition, he recommends using Stack Overflow to identify trouble spots and then providing code snippets in the official documentation.
The third project, currently under review, compared the usability of crypto APIs. There is no doubt that getting crypto right is hard. Developers have to make many choices, including the algorithm, mode of operation, key size, and so on. Getting all these choices right is challenging and error-prone. There are, however, alternatives that claim to be more usable, such as libsodium and keyczar. Fahl set out to measure whether this was really true.
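Even a task as small as deriving a key from a password involves several such choices. The sketch below uses only Python's standard library, not any of the libraries in the study, and every constant in it is an illustrative assumption rather than a recommendation from the talk.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Each constant is a decision the developer has to get right.
# The values are assumptions chosen for illustration only.
HASH_NAME = "sha256"   # which hash algorithm?
SALT_BYTES = 16        # how long should the salt be?
KEY_BYTES = 32         # what key size?
ITERATIONS = 200_000   # how much work per guess?


def derive_key(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a key from a password with PBKDF2; returns (salt, key)."""
    if salt is None:
        salt = secrets.token_bytes(SALT_BYTES)  # fresh random salt
    key = hashlib.pbkdf2_hmac(
        HASH_NAME, password.encode("utf-8"), salt, ITERATIONS, dklen=KEY_BYTES
    )
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key and compare it in constant time."""
    _, key = derive_key(password, salt)
    return hmac.compare_digest(key, expected_key)
```

A weak hash, a fixed salt, too few iterations, or a plain `==` comparison would all still produce code that appears to work, which is why such mistakes go unnoticed.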
In an online developer study, participants, who were experienced Python developers from GitHub, were asked to complete short programming tasks in Python that were, again, designed to have both secure and insecure solutions. Developers were offered skeleton code and an online code editor, and had to pick between symmetric and asymmetric cryptography and among various functions within those two categories. They were given access to a group of libraries: PyCrypto, M2Crypto, cryptography.io, keyczar, and PyNaCl. They then completed an exit survey; they were not primed for security or privacy. The 256 participants, 208 of them professionals, were evaluated for correctness, security, and self-reported sentiment.
The researchers found significant differences in functionality by library, with keyczar coming in the worst. Functionality increased as soon as people were able to copy and paste code. Asymmetric cryptography was harder than symmetric to make work; however, asymmetric solutions are more secure. Key generation and storage, required for asymmetric implementations, are hard, and certificate validation is even harder. The most secure solutions used keyczar. These results showed that there is a real tradeoff between functionality and security in terms of the difficulty of coding. In addition, the researchers found that experience in programming had no impact, but the impact of a background in IT security was significant.
The researchers concluded that implementing crypto is hard. API usability helps but is not sufficient on its own. Finally, participants complained about bad documentation. Therefore, future APIs should support common tasks; be better documented; and provide secure, easy-to-use code snippets.
The fourth and final project, still in progress, studied security in Android Lint, a plugin from Android Studio intended to help app developers find bugs. Among the problems Lint is supposed to find are security-related issues. Fahl would like to create Fixdroid, a plug-in which would identify common bad Android security practices; offer developers more secure solutions; provide more thorough data flow analysis; and provide usable security indicators. Fahl is planning a field study on the tool’s usability.
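The idea of flagging a bad practice and suggesting a safer alternative can be sketched with a toy pattern scanner. This is not how Android Lint or Fixdroid works; real tools analyse syntax trees and data flow, while this sketch only matches text. The rule set, the `Finding` type, and the `scan` function are hypothetical.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    line: int
    issue: str
    suggestion: str


# (pattern, issue, suggestion): a hypothetical, far-from-complete rule set.
RULES = [
    (re.compile(r"ALLOW_ALL_HOSTNAME_VERIFIER|AllowAllHostnameVerifier"),
     "hostname verification disabled",
     "keep the default hostname verifier"),
    (re.compile(r"checkServerTrusted\s*\([^)]*\)\s*(?:throws\s+[\w.,\s]+)?\{\s*\}"),
     "trust manager accepts every certificate",
     "use the platform trust manager, or trust a specific CA"),
    (re.compile(r'"http://'),
     "cleartext HTTP URL",
     "use https://"),
]


def scan(source: str) -> List[Finding]:
    """Return findings for Java source text, ordered by line number."""
    findings = []
    for pattern, issue, suggestion in RULES:
        for match in pattern.finditer(source):
            line = source.count("\n", 0, match.start()) + 1
            findings.append(Finding(line, issue, suggestion))
    return sorted(findings)


# Made-up Java fragment with two problems: an empty trust check and an HTTP URL.
SAMPLE = '''class TrustAll implements X509TrustManager {
    public void checkServerTrusted(X509Certificate[] chain, String authType) { }
}
String url = "http://example.com/login";
'''
```

Calling `scan(SAMPLE)` reports the empty `checkServerTrusted` body and the cleartext URL, each paired with a suggested fix. Text matching like this produces false positives and misses, which is one reason more thorough data flow analysis is on the wish list.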
For the future, Fahl’s research agenda has three pieces: measuring the status quo; understanding developers; and methodology and validity. He is interested in questions such as: when do developers think about security? Do they think about it? Which tools do they accept? How does security fit into the development process? These need field, not lab, studies.
In discussion, commenters asked why books were significantly better at providing secure solutions. One assumption the researchers made was that developers would have only a limited time to solve security issues. In the free choice group, it’s possible that participants read the official documentation, then tried Stack Overflow, and then spotted security issues based on what they had read. Fahl believes this did happen in some cases. One commenter pointed out that a significant omission is internal company information, which is often a significant source, and wondered what percentage of the research subjects were self-employed. This was, however, the project with only 15 professionals among the research subjects. Angela Sasse noted that a just-published study of three US organisations found that peer influence in internal groups is really strong. Another commenter noted the need for internal champions in development teams; his organisation had seen the culture change entirely in four years. In addition, he felt it important that the code examples given to customers should be secure so that if they are found in searches the results will be secure by default. Sasse noted the same applies to computer science course coding samples, a suggestion she heard at an Intel security education summit. In one experiment, Fahl found that even in real-world software 95% of code snippets are insecure. Changing this so that insecure snippets are flagged as such and replaced with secure code would be valuable and manageable as a collaborative effort.
One solution, proposed by Sasse and others, is checklists. Using these, as in aviation, would cut out problems such as forgetting to take out elements needed only for development. Other issues raised in discussion included: the varying cultural backgrounds of developers across the world, who respond to different approaches; how to take into account team structure; what more could be done at the testing and code review levels; whether management was allocating sufficient and appropriate resources; security by construction rather than security by design; and the slowness of change in large organisations.