Archive for the 'Security' Category

Mutually Assured Destruction and the Impending AI Apocalypse

Monday, August 13th, 2018

I gave a keynote talk at the USENIX Workshop on Offensive Technologies (WOOT), Baltimore, Maryland, 13 August 2018.

The title and abstract are what I provided for the WOOT program, but unfortunately (or maybe fortunately for humanity!) I wasn’t able to actually figure out a talk to match the title and abstract I provided.

The history of security includes a long series of arms races, where a new technology emerges and is subsequently developed and exploited by both defenders and attackers. Over the past few years, “Artificial Intelligence” has re-emerged as a potentially transformative technology, and deep learning in particular has produced a barrage of amazing results. We are in the very early stages of understanding the potential of this technology in security, but more worryingly, seeing how it may be exploited by malicious individuals and powerful organizations. In this talk, I’ll look at what lessons might be learned from previous security arms races, consider how asymmetries in AI may be exploited by attackers and defenders, touch on some recent work in adversarial machine learning, and hopefully help progress-loving Luddites figure out how to survive in a world overrun by AI doppelgängers, GAN gangs, and gibbon-impersonating pandas.

Cybersecurity Summer Camp

Saturday, July 7th, 2018

I helped organize a summer camp for high school teachers focused on cybersecurity, led by Ahmed Ibrahim. Some of the materials from the camp on cryptography, including the Jefferson Wheel and visual cryptography are here: Cipher School for Muggles.




Cybersecurity Goes to Summer Camp. UVA Today. 22 July 2018. [archive.org]

Earlier this week, 25 high school teachers – including 21 from Virginia – filled a glass-walled room in Rice Hall, sitting in high adjustable chairs at wheeled work tables, their laptops open, following a lecture with graphics about the dangers that lurk in cyberspace and trying to figure out how to pass the information on to a generation that seems to share the most intimate details of life online. “I think understanding privacy is important to that generation that uses Facebook and Snapchat,” said David Evans, a computer science professor who helped organize the camp. “We hope to give teachers some ideas and tools to get their students excited about learning about cryptography, privacy and cybersecurity, and how these things can impact them.”

(Also excerpted in ACM TechNews, 29 June 2018.)




UVa bootcamp aims to increase IT training for teachers. The Daily Progress. 22 June 2018. [archive.org]

Feature Squeezing at NDSS

Sunday, February 25th, 2018

Weilin Xu presented Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks at the Network and Distributed System Security Symposium 2018. San Diego, CA. 21 February 2018.



Paper: Weilin Xu, David Evans, Yanjun Qi. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. NDSS 2018. [PDF]

Project Site

Why hasn’t Cross-Site Scripting been solved?

Sunday, December 31st, 2017

By Haina Li

Introduction

In 2017, Bugcrowd reported that cross-site scripting (XSS) remains the number one vulnerability found on the web, accounting for 25% of the bugs found and submitted to its bug bounty program. XSS has also stayed in the top three of the web's most common vulnerabilities in recent years. In the 17 years since XSS was first recognized by Microsoft in 2000, it has been the focus of intense academic research and penetration testing tool development, yet we are still finding vulnerabilities even in top websites such as Facebook and Google. In this blog post, we explore some of the reasons why XSS is still a major problem today.

XSS has evolved

XSS has evolved as web applications have grown far more complex than the static pages they once were. Reflected and stored XSS have not disappeared, since both server- and client-side logic have become more elaborate, but the pattern of replacing server-side logic with client-side JavaScript gave rise to a third class of vulnerabilities: DOM-based XSS. Server-side prevention tools that examine deviations between the request and the response (such as XSSDS) do not work for DOM-based vulnerabilities, because the entire flow of malicious data from source to sink stays within the browser and never passes through the server.

Newer defenses that do address DOM-based XSS include browser XSS filters and Content Security Policy (CSP). This array of sophisticated tools aims at the seemingly simple goal of neutralizing user-provided content, but none of them catches all XSS vulnerabilities, and escaping everything all the time would break a web application altogether. For example, recent work by Lekies et al. [PDF] describes an attack that was missed by every existing XSS prevention technique: the injected payload is benign-looking HTML that script gadgets already present in the page transform into malicious behavior.
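To see why "just escape user input" is harder than it sounds, consider a small illustrative sketch in Python (the payloads and templates below are hypothetical, not taken from any real application): standard HTML entity escaping neutralizes a script tag injected into the page body, but does nothing against payloads that land in an unquoted attribute or in a URL context.

    # Illustrative only: payloads and templates here are hypothetical.
    from html import escape

    body_payload = "<script>alert(1)</script>"
    attr_payload = "x onerror=alert(1)"      # needs no <, >, or quote characters
    url_payload = "javascript:alert(1)"      # scheme-based, untouched by escaping

    # Body context: entity escaping works; the browser renders inert text.
    print("<p>Hello {}</p>".format(escape(body_payload)))
    # -> <p>Hello &lt;script&gt;alert(1)&lt;/script&gt;</p>

    # Unquoted attribute context: escape() leaves spaces and '=' alone,
    # so the payload still injects an onerror handler.
    print("<img src={}>".format(escape(attr_payload, quote=True)))
    # -> <img src=x onerror=alert(1)>

    # URL context: nothing in the payload needs escaping at all.
    print('<a href="{}">profile</a>'.format(escape(url_payload, quote=True)))
    # -> <a href="javascript:alert(1)">profile</a>

Correct escaping therefore depends on the output context, which is one reason automated defenses keep missing cases.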

The effectiveness of web penetration testing tools is limited

In a study of automated black-box web application vulnerability testing, Bau et al. [PDF] tested commercial scanners from vendors such as McAfee and IBM and found average XSS detection rates of 62.5%, 15%, and 11.25% for reflected, stored, and advanced XSS (using non-standard tags and keywords), respectively. The study found that the scanners were effective at finding straightforward, textbook XSS vulnerabilities, but lacked sufficient modeling of more complex XSS specific to each web application. Web application scanners are designed reactively, converting new vulnerabilities into test vectors only after they have become a problem. For stored XSS, scanners also struggle to link an injection to a later observation. They are often difficult to configure, and they take too long to run when set to fuzz every possible location in a large, complicated web application.

Conclusion

As with most web vulnerabilities, XSS is not going away anytime soon because of the constantly evolving technologies of the web and the challenge of developing penetration testing tools with high true-positive rates. However, we may be able to eliminate most client-side security issues by replacing JavaScript with a language that provides better control-flow integrity, such as WebAssembly.

Highlights from CCS 2017

Saturday, November 18th, 2017

The 24th ACM Conference on Computer and Communications Security was held in Dallas, 30 October – 3 November. Being Program Committee co-chair for a conference like this is a full-year commitment, and the work continues throughout much of the year preceding the conference. The conference has over 1000 registered attendees, a record for any academic security research conference.

Here are a few highlights from the conference week.



PC Chairs’ Welcome (opening session)



Giving the PC Chairs’ Welcome Talk



Audience at Opening Session



ACM CCS 2017 Paper Awards Finalists



CCS 2017 Awards Banquet




At the Awards Banquet, I got to present a Best Paper award to SRG alum Jack Doerner (I was, of course, recused by conflict from being involved in any decisions on his paper).




UVA Lunch (around the table starting at front left): Suman Jana (honorary Wahoo by marriage), Darion Cassel (SRG BSCS 2017, now at CMU), Will Hawkins, Jason Hiser, Samee Zahur (SRG PhD 2016, now at Google), Jack Doerner (SRG BACS 2016, now at Northeastern), Joe Calandrino (now at FTC); Back right to front: Ben Kreuter (now at Google), Anh Nguyen-Tuong, Jack Davidson, Yuan Tian, Yuchen Zhou (SRG PhD 2015, now at Palo Alto Networks), David Evans.

First Workshop for Women in Cybersecurity

Friday, November 17th, 2017

I gave a talk at the First ACM Workshop for Women in Cybersecurity (affiliated with ACM CCS 2017) on Truth, Social Justice (and the American Way?):




There’s also a short paper, loosely related to the talk: [PDF]





In the Red Corner…

Monday, August 7th, 2017

The Register has a story on the work Anant Kharkar and collaborators at Endgame, Inc. are doing on using reinforcement learning to find evasive malware: In the red corner: Malware-breeding AI. And in the blue corner: The AI trying to stop it, by Katyanna Quach, The Register, 2 August 2017.



Antivirus makers want you to believe they are adding artificial intelligence to their products: software that has learned how to catch malware on a device. There are two potential problems with that. Either it’s marketing hype and not really AI – or it’s true, in which case don’t forget that such systems can still be hoodwinked.

It’s relatively easy to trick machine-learning models – especially in image recognition. Change a few pixels here and there, and an image of a bus can be warped so that the machine thinks it’s an ostrich. Now take that thought and extend it to so-called next-gen antivirus.

The researchers from Endgame and the University of Virginia are hoping that by integrating the malware-generating system into OpenAI’s Gym platform, more developers will help sniff out more adversarial examples to improve machine-learning virus classifiers.

Although Evans believes that Endgame’s research is important, using such a method to beef up security “reflects the immaturity” of AI and infosec. “It’s mostly experimental and the effectiveness of defenses is mostly judged against particular known attacks, but doesn’t say much about whether it can work against newly discovered attacks,” he said.

“Moving forward, we need more work on testing machine learning systems, reasoning about their robustness, and developing general methods for hardening classifiers that are not limited to defending against particular attacks. More broadly, we need ways to measure and build trustworthiness in AI systems.”

The research has been summarized as a paper, here if you want to check it out in more detail, or see the upstart’s code on Github.

Adversarial Machine Learning: Are We Playing the Wrong Game?

Saturday, June 10th, 2017

I gave a talk at Berkeley’s International Computer Science Institute on Adversarial Machine Learning: Are We Playing the Wrong Game? (8 June 2017), focusing on the work Weilin Xu has been doing (in collaboration with myself and Yanjun Qi) on adversarial machine learning.



Abstract

Machine learning classifiers are increasingly popular for security applications, and often achieve outstanding performance in testing. When deployed, however, classifiers can be thwarted by motivated adversaries who adaptively construct adversarial examples that exploit flaws in the classifier’s model. Much work on adversarial examples, including Carlini and Wagner’s attacks, which are the best results to date, has focused on finding small distortions to inputs that fool a classifier. Previous defenses have been both ineffective and very expensive in practice. In this talk, I’ll describe a new, very simple strategy, feature squeezing, that can be used to harden classifiers by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different inputs in the original space into a single sample. Adversarial examples can be detected by comparing the model’s predictions on the original and squeezed samples. In practice, of course, adversaries are not limited to small distortions in a particular metric space. Indeed, it may be possible to make large changes to an input without losing its intended malicious behavior. We have developed an evolutionary framework to search for such adversarial examples, and demonstrated that it can automatically find evasive variants against state-of-the-art classifiers. This suggests that work on adversarial machine learning needs a better definition of adversarial examples, and to make progress towards understanding how classifiers and oracles perceive samples differently.
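The evolutionary search mentioned in the abstract can be sketched roughly as follows. This is a hedged illustration, not the actual EvadeML implementation; the function names, the benign-score cutoff, and the parameters are hypothetical. Candidate variants are generated by mutation, scored against the target classifier, and kept only if an oracle confirms that the intended malicious behavior is preserved.

    # A hedged sketch of an evolutionary search for evasive variants
    # (not the actual EvadeML code; names and parameters are hypothetical).
    import random

    def evolve_evasive_variants(seed, mutate, classifier_score, oracle,
                                pop_size=50, generations=100):
        """Search for variants the classifier scores as benign (< 0.5) while
        the oracle confirms the intended malicious behavior is preserved."""
        population = [seed]
        for _ in range(generations):
            # Grow a candidate pool by randomly mutating current survivors.
            candidates = [mutate(random.choice(population)) for _ in range(pop_size)]
            # Prefer candidates the classifier rates as least malicious,
            # but keep only those whose behavior the oracle confirms.
            candidates.sort(key=classifier_score)
            survivors = [c for c in candidates[:pop_size // 2] if oracle(c)]
            population = survivors or [seed]  # restart from the seed if none survive
            evasive = [c for c in population if classifier_score(c) < 0.5]
            if evasive:
                return evasive
        return []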

Modest Proposals for Google

Friday, June 9th, 2017

Great to meet up with Wahooglers Adrienne Porter Felt, Ben Kreuter, Jonathan McCune, Samee Zahur (Google’s latest addition from my group), and (honorary UVAer interning at Google this summer) Riley Spahn at Google’s Research Summit on Security and Privacy this week in Mountain View.

As part of the meeting, the academic attendees were given a chance to give a 3-minute pitch to tell Google what we want them to do. The slides I used are below, but probably don’t make much sense by themselves.

The main modest proposal I tried to make is that Google should take it on as their responsibility to make sure nothing bad ever happens to anyone anywhere. They can start with nothing bad ever happening on the Internet, but with the Internet pretty much everywhere, should expand the scope to cover everywhere soon.

To start with an analogy from the days when Microsoft ruled computing: there was a time when Windows bluescreens were a frequent experience for most Windows users (which, at the time, meant pretty much all computer users). Microsoft analyzed the crashes and concluded that nearly all were caused by bugs in device drivers, so it wasn’t their fault and it was horribly unfair for them to be blamed for the crashes. Of course, to people losing their work because of a crash, it doesn’t really matter whose code was to blame. By the end of the ’90s, though, Microsoft took on the mission of reducing the problems with device drivers, and a lot of great work came out of this (e.g., the Static Driver Verifier), with dramatic improvements in the typical end user’s computing experience.

Today, Google rules a large chunk of computing. Lots of bad things happen on the Internet that are not Google’s fault. As the latest example in the news, the leaked NSA report of Russian attacks on election officials describes a phishing attack that exploits vulnerabilities in Microsoft Word. It’s easy to put the blame on overworked election officials who didn’t pay enough attention to books on universal computation they read when they were children, or to put it on Microsoft for allowing Word to be exploited.

But, Google’s name is also all over this report: the emails went through Gmail accounts, the attacks phished for Google credentials, and the attackers used plausibly-named Gmail accounts. Even if Google isn’t to blame for the problems that enable such an attack, they are uniquely positioned to solve it, both because of their engineering capabilities and resources and because of the comprehensive view they have of what happens on the Internet and their powerful ability to influence it.

Google is a big company, with lots of decentralized teams, some of which definitely seem to get this already. (I’d point to the work the Chrome Security Team has done, MOAR TLS, and RAPPOR as just a few of many examples of things that involve a mix of technical and engineering depth and a broad mission to make computing better for everyone, not obviously connected to direct business interests.) But, there are also lots of places where Google doesn’t seem to be putting serious effort into solving problems it could, instead viewing them as out of scope because it’s really someone else’s fault (my particular motivating example was PDF malware). As a company, Google is too capable, important, and ubiquitous to view problems as out-of-scope just because they are obviously undecidable or obviously really someone else’s fault.



[Also on Google +]

Feature Squeezing: Detecting Adversarial Examples

Monday, April 10th, 2017

Although deep neural networks (DNNs) have achieved great success in many computer vision tasks, recent studies have shown they are vulnerable to adversarial examples. Such examples, typically generated by adding small but purposeful distortions, can frequently fool DNN models. Previous attempts to defend against adversarial examples have mostly focused on refining the DNN models themselves, and have either shown limited success or suffered from expensive computation. We propose a new strategy, feature squeezing, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample.

By comparing a DNN model’s prediction on the original input with that on the squeezed input, feature squeezing detects adversarial examples with high accuracy and few false positives. If the original and squeezed examples produce substantially different outputs from the model, the input is likely to be adversarial. By measuring the disagreement among predictions and selecting a threshold value, our system outputs the correct prediction for legitimate examples and rejects adversarial inputs.
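In rough code form, the detection step amounts to something like the following hedged sketch. The `model.predict` interface and the squeezing functions (sketched after the next paragraph) are assumptions for illustration, not the exact implementation from the paper.

    # A hedged sketch of the detection idea; `model.predict` and the
    # squeezing functions are assumed interfaces, not the paper's exact code.
    import numpy as np

    def is_adversarial(model, x, squeezers, threshold):
        """Flag x as adversarial if any squeezed version produces a prediction
        vector that differs from the original by more than `threshold` in L1."""
        p_original = model.predict(x[np.newaxis])[0]
        for squeeze in squeezers:
            p_squeezed = model.predict(squeeze(x)[np.newaxis])[0]
            if np.abs(p_original - p_squeezed).sum() > threshold:
                return True
        return False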

So far, we have explored two instances of feature squeezing: reducing the color bit depth of each pixel and smoothing using a spatial filter. These strategies are straightforward, inexpensive, and complementary to defensive methods that operate on the underlying model, such as adversarial training.
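For concreteness, here is a minimal sketch of the two squeezers for a grayscale image such as MNIST, assuming pixel values in [0, 1]; the parameter choices are illustrative, not the paper's exact settings.

    # Illustrative versions of the two squeezers described above.
    import numpy as np
    from scipy.ndimage import median_filter

    def reduce_bit_depth(x, bits=1):
        """Quantize pixel values to the given number of bits per pixel."""
        levels = 2 ** bits - 1
        return np.round(x * levels) / levels

    def spatial_smoothing(x, window=2):
        """Smooth the image with a median filter over a window-by-window patch."""
        return median_filter(x, size=window)

A squeezed input can then be fed to the same model and compared against the original prediction, as in the detection sketch above.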

The figure shows the histogram of the L1 scores on the MNIST dataset between the original and squeezed samples, for 1000 non-adversarial examples as well as 1000 adversarial examples generated using both the Fast Gradient Sign Method and the Jacobian-based Saliency Map Approach. Over the full MNIST testing set, the detection accuracy is 99.74% (only 22 false positives out of 5000).

For more information, see the paper:

Weilin Xu, David Evans, Yanjun Qi. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. arXiv preprint, 4 April 2017. [PDF]

Project Site: EvadeML