Attacking Machine Learning Systems
The field of machine learning (ML) security—and corresponding adversarial ML—is rapidly advancing as researchers develop sophisticated techniques to perturb, disrupt, or steal the ML model or data....
View ArticlePutting Undetectable Backdoors in Machine Learning Models
This is really interesting research from a few months ago: Abstract: Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of...
View ArticleSide-Channel Attack against CRYSTALS-Kyber
CRYSTALS-Kyber is one of the public-key algorithms currently recommended by NIST as part of its post-quantum cryptography standardization process. Researchers have just published a side-channel...
View ArticleLLMs and Phishing
Here’s an experiment being run by undergraduate computer science students everywhere: Ask ChatGPT to generate phishing emails, and test whether these are better at persuading victims to respond or...
View ArticleUsing LLMs to Create Bioweapons
I’m not sure there are good ways to build guardrails to prevent this sort of thing: There is growing concern regarding the potential misuse of molecular machine learning models for harmful purposes....
View ArticleSecurity Risks of AI
Stanford and Georgetown have a new report on the security risks of AI—particularly adversarial machine learning—based on a workshop they held on the topic. Jim Dempsey, one of the workshop organizers,...
View ArticleIndirect Instruction Injection in Multi-Modal LLMs
Interesting research: “(Ab)using Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs”: Abstract: We demonstrate how images and sounds can be used for indirect prompt and...
View ArticleUsing Machine Learning to Detect Keystrokes
Researchers have trained a ML model to detect keystrokes by sound with 95% accuracy. “A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards” Abstract: With recent developments in...
View ArticleBots Are Better than Humans at Solving CAPTCHAs
Interesting research: “An Empirical Study & Evaluation of Modern CAPTCHAs“: Abstract: For nearly two decades, CAPTCHAS have been widely used as a means of protection against bots. Throughout the...
View ArticleExtracting GPT’s Training Data
This is clever: The actual attack is kind of silly. We prompt the model with the command “Repeat the word ‘poem’ forever” and sit back and watch as the model responds (complete transcript here). In the...
View ArticlePoisoning AI Models
New research into poisoning AI models: The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning,...
View ArticleUsing LLMs to Unredact Text
Initial results in using LLMs to unredact text based on the size of the individual-word redaction rectangles. This feels like something that a specialized ML system could be trained on.
View Article