Archive for the 'Adversarial Machine Learning' Category

DLS Keynote: Is “adversarial examples” an Adversarial Example?

Tuesday, May 29th, 2018

I gave a keynote talk at the 1st Deep Learning and Security Workshop (co-located with the 39th IEEE Symposium on Security and Privacy). San Francisco, California. 24 May 2018




Abstract

Over the past few years, there has been an explosion of research in security of machine learning and on adversarial examples in particular. Although this is in many ways a new and immature research area, the general problem of adversarial examples has been a core problem in information security for thousands of years. In this talk, I’ll look at some of the long-forgotten lessons from that quest and attempt to understand what, if anything, has changed now we are in the era of deep learning classifiers. I will survey the prevailing definitions for “adversarial examples”, argue that those definitions are unlikely to be the right ones, and raise questions about whether those definitions are leading us astray.

SRG at IEEE S&P 2018

Tuesday, May 29th, 2018

Group Dinner


Including our newest faculty member, Yonghwi Kwon, joining UVA in Fall 2018!

Yuan Tian, Fnu Suya, Mainuddin Jonas, Yonghwi Kwon, David Evans, Weihang Wang, Aihua Chen, Weilin Xu

Poster Session


Fnu Suya (with Yuan Tian and David Evans), Adversaries Don’t Care About Averages: Batch Attacks on Black-Box Classifiers [PDF]

Mainuddin Jonas (with David Evans), Enhancing Adversarial Example Defenses Using Internal Layers [PDF]

Huawei STW: Lessons from the Last 3000 Years of Adversarial Examples

Wednesday, May 23rd, 2018

I spoke on Lessons from the Last 3000 Years of Adversarial Examples at Huawei’s Strategy and Technology Workshop in Shenzhen, China, 15 May 2018.

We also got to tour Huawei’s new research and development campus, under construction about 40 minutes from Shenzhen. It comes pretty close to Disneyland, with its own railroad and villages themed after different European cities (Paris, Bologna, etc.).



Huawei’s New Research and Development Campus [More Pictures]

Unfortunately, pictures were not allowed on our tour of the production line. It was not so surprising that nearly all of the work is done by machines, but it was surprising to me how much of the human work that remains is completely robotic. The human workers (called “operators”) mostly scan QR codes on parts and follow the directions that light up when they do, or scan bins and follow directions on a screen to collect parts from them, scanning each part as it goes into the bin. This is the kind of system that leads to remarkably high production quality. The parts are mostly delivered on tapes that are fed into the machines, and many of the machines along the line are primarily for testing. A “bottleneck” marker is placed at any point that is holding up the production line.

The (at least within the factory) public “grapey board” keeps track of the happiness of the workers: each operator puts up a smiley (or frowny) face on the board to show their mood for the day, monitored carefully by the managers. There is also a bunch of grapes to show performance for the month: if an operator does something good, a grape is colored green; if they do something bad, a grape is colored black. There was quite a bit of discussion among the people on the tour (mostly US- and European-based professors) about whether such a management approach would be a good idea for our research groups… (or for department chairs to use with their faculty!)



In front of Huawei’s “White House”, with Battista Biggio [More Pictures]

Feature Squeezing at NDSS

Sunday, February 25th, 2018

Weilin Xu presented Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks at the Network and Distributed System Security Symposium 2018. San Diego, CA. 21 February 2018.



Paper: Weilin Xu, David Evans, Yanjun Qi. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. NDSS 2018. [PDF]

Project Site

Letter to DHS

Saturday, November 18th, 2017

I was one of 54 signatories on a letter from technology experts, organized by Alvaro Bedoya (Georgetown University Law Center), to DHS Acting Secretary Elaine Duke opposing the proposed plans to use algorithms to identify undesirable individuals as part of the Extreme Vetting Initiative: [PDF]. The Brennan Center’s web page provides a lot of resources supporting the letter.

The letter received some media coverage.

SRG at USENIX Security 2017

Saturday, August 12th, 2017

Several SRG students presented posters at the USENIX Security Symposium in Vancouver, BC.


Approaches to Evading Windows PE Malware Classifiers
Anant Kharkar, Helen Simecek, Weilin Xu, David Evans, and Hyrum S. Anderson (Endgame)

JSPolicy: Policied Sandboxes for Untrusted Third-Party JavaScript
Ethan Lowman and David Evans

EvadeML-Zoo: A Benchmarking and Visualization Tool for Adversarial Machine Learning
Weilin Xu, Andrew Norton, Noah Kim, Yanjun Qi, and David Evans

Decentralized Certificate Authorities
Hannah Li, Bargav Jayaraman, and David Evans

In the Red Corner…

Monday, August 7th, 2017

The Register has a story on the work Anant Kharkar and collaborators at Endgame, Inc. are doing on using reinforcement learning to find evasive malware: In the red corner: Malware-breeding AI. And in the blue corner: The AI trying to stop it, by Katyanna Quach, The Register, 2 August 2017.



Antivirus makers want you to believe they are adding artificial intelligence to their products: software that has learned how to catch malware on a device. There are two potential problems with that. Either it’s marketing hype and not really AI – or it’s true, in which case don’t forget that such systems can still be hoodwinked.

It’s relatively easy to trick machine-learning models – especially in image recognition. Change a few pixels here and there, and an image of a bus can be warped so that the machine thinks it’s an ostrich. Now take that thought and extend it to so-called next-gen antivirus.

The researchers from Endgame and the University of Virginia are hoping that by integrating the malware-generating system into OpenAI’s Gym platform, more developers will help sniff out more adversarial examples to improve machine-learning virus classifiers.

Although Evans believes that Endgame’s research is important, using such a method to beef up security “reflects the immaturity” of AI and infosec. “It’s mostly experimental and the effectiveness of defenses is mostly judged against particular known attacks, but doesn’t say much about whether it can work against newly discovered attacks,” he said.

“Moving forward, we need more work on testing machine learning systems, reasoning about their robustness, and developing general methods for hardening classifiers that are not limited to defending against particular attacks. More broadly, we need ways to measure and build trustworthiness in AI systems.”

The research has been summarized as a paper, here if you want to check it out in more detail, or see the upstart’s code on Github.

CISPA Distinguished Lecture

Wednesday, July 12th, 2017

I gave a talk at CISPA in Saarbrücken, Germany, on our work with Weilin Xu and Yanjun Qi on Adversarial Machine Learning: Are We Playing the Wrong Game?.




Abstract

Machine learning classifiers are increasingly popular for security applications, and often achieve outstanding performance in testing. When deployed, however, classifiers can be thwarted by motivated adversaries who adaptively construct adversarial examples that exploit flaws in the classifier’s model. Much work on adversarial examples has focused on finding small distortions to inputs that fool a classifier. Previous defenses have been both ineffective and very expensive in practice. In this talk, I’ll describe a new, very simple strategy, feature squeezing, that can be used to harden classifiers by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different inputs in the original space into a single sample. Adversarial examples can be detected by comparing the model’s predictions on the original and squeezed samples. In practice, of course, adversaries are not limited to small distortions in a particular metric space. Indeed, in security applications like malware detection it may be possible to make large changes to an input without disrupting its intended malicious behavior. I’ll report on an evolutionary framework we have developed to search for such adversarial examples, which can automatically find evasive variants against state-of-the-art classifiers. This suggests that work on adversarial machine learning needs a better definition of adversarial examples, and needs to make progress towards understanding how classifiers and oracles perceive samples differently.

Adversarial Machine Learning: Are We Playing the Wrong Game?

Saturday, June 10th, 2017

I gave a talk at Berkeley’s International Computer Science Institute on Adversarial Machine Learning: Are We Playing the Wrong Game? (8 June 2017), focusing on the work Weilin Xu has been doing (in collaboration with me and Yanjun Qi) on adversarial machine learning.



Abstract

Machine learning classifiers are increasingly popular for security applications, and often achieve outstanding performance in testing. When deployed, however, classifiers can be thwarted by motivated adversaries who adaptively construct adversarial examples that exploit flaws in the classifier’s model. Much work on adversarial examples, including Carlini and Wagner’s attacks, which are the best results to date, has focused on finding small distortions to inputs that fool a classifier. Previous defenses have been both ineffective and very expensive in practice. In this talk, I’ll describe a new, very simple strategy, feature squeezing, that can be used to harden classifiers by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different inputs in the original space into a single sample. Adversarial examples can be detected by comparing the model’s predictions on the original and squeezed samples. In practice, of course, adversaries are not limited to small distortions in a particular metric space. Indeed, it may be possible to make large changes to an input without losing its intended malicious behavior. We have developed an evolutionary framework to search for such adversarial examples, and demonstrated that it can automatically find evasive variants against state-of-the-art classifiers. This suggests that work on adversarial machine learning needs a better definition of adversarial examples, and needs to make progress towards understanding how classifiers and oracles perceive samples differently.

Feature Squeezing: Detecting Adversarial Examples

Monday, April 10th, 2017

Although deep neural networks (DNNs) have achieved great success in many computer vision tasks, recent studies have shown they are vulnerable to adversarial examples. Such examples, typically generated by adding small but purposeful distortions, can frequently fool DNN models. Previous studies on defending against adversarial examples have mostly focused on refining the DNN models; they have either shown limited success or suffered from expensive computation. We propose a new strategy, feature squeezing, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample.

By comparing a DNN model’s prediction on the original input with that on the squeezed input, feature squeezing detects adversarial examples with high accuracy and few false positives. If the original and squeezed examples produce substantially different outputs from the model, the input is likely to be adversarial. By measuring the disagreement among predictions and selecting a threshold value, our system outputs the correct prediction for legitimate examples and rejects adversarial inputs.
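As a rough sketch of this detection rule (in Python; the function and argument names here are illustrative, not taken from the released code), the comparison might look like:

    import numpy as np

    def detect_adversarial(predict, x, squeezers, threshold):
        """Flag x as adversarial if the model's prediction on x disagrees
        (in L1 distance) with its prediction on any squeezed version of x
        by more than the chosen threshold.

        predict:   maps a batch of inputs to softmax probability vectors
                   (e.g., a Keras model's predict method)
        squeezers: list of squeezing functions (see the sketch below)
        threshold: disagreement threshold selected on legitimate examples
        """
        p_orig = predict(x[np.newaxis])[0]
        # Take the largest disagreement over all squeezers.
        score = max(
            np.abs(p_orig - predict(squeeze(x)[np.newaxis])[0]).sum()
            for squeeze in squeezers
        )
        return score > threshold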

So far, we have explored two instances of feature squeezing: reducing the color bit depth of each pixel and smoothing using a spatial filter. These strategies are straightforward, inexpensive, and complementary to defensive methods that operate on the underlying model, such as adversarial training.
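As an illustration, both squeezers can be written in a few lines of NumPy/SciPy. This is only a sketch: it assumes pixel values in [0, 1] and an H×W×C image layout, and the function names are made up here; the actual implementation is available from the project site.

    import numpy as np
    from scipy.ndimage import median_filter

    def reduce_bit_depth(x, bits):
        """Quantize pixel values (assumed to be in [0, 1]) to the given bit depth."""
        levels = 2 ** bits - 1
        return np.round(x * levels) / levels

    def median_smooth(x, size=2):
        """Apply a size-by-size median filter to each channel of an H x W x C image."""
        return median_filter(x, size=(size, size, 1))

For MNIST, settings like 1-bit depth reduction and 2×2 median smoothing are the kind of configuration reported; they could be passed to the detection sketch above as, for example, squeezers = [lambda x: reduce_bit_depth(x, 1), median_smooth].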

The figure shows the histogram of L1 scores between the original and squeezed samples on the MNIST dataset, for 1000 non-adversarial examples as well as 1000 adversarial examples generated using both the Fast Gradient Sign Method and the Jacobian-based Saliency Map Approach. Over the full MNIST testing set, the detection accuracy is 99.74% (only 22 false positives out of 5000).
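One simple way to choose the threshold, consistent with the description above though not necessarily the exact procedure used in the paper, is to take a percentile of the L1 scores observed on legitimate examples so that the false positive rate stays near a chosen target:

    import numpy as np

    def select_threshold(legitimate_scores, target_fpr=0.05):
        """Choose the detection threshold as the (1 - target_fpr) percentile of
        L1 disagreement scores measured on legitimate (non-adversarial) examples."""
        return np.percentile(legitimate_scores, 100 * (1 - target_fpr))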

For more information, see the paper:

Weilin Xu, David Evans, Yanjun Qi. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. arXiv preprint, 4 April 2017. [PDF]

Project Site: EvadeML