Dec 17th, 2024: [EN] MLSecOps Essentials: Building Security into your ML Workflow

Hey Community!

Christmas :christmas_tree: is almost here! And as a gift, I’d like to tell you a story :gift: You ready?

Imagine for a moment that you’re a Machine Learning engineer at Elastic :technologist:. You’re taking part in a hacking competition.

You, Charlie, Elena and James, forming the fantastic CipherCore team competing against the ByteSquad team.

Sounds fun, right?

Your mission is to hack an AI company to win.

The prize? A one-week trip for the whole team to El Caribe :sun_with_face:. Let’s go for it! :rocket:

Cat Typing GIFs | Tenor

Charlie: Alright, brainstorming time. What should we do? Where do we start?

You: No idea, this ML cybersecurity stuff is so new…

Elena: Let’s stay calm. Machine Learning still has similarities with a classic software project. I bet they’re working in Python. Remember that time an anonymous hacker injected malicious code into a FastAPI-related package? It was called fastapi-toolkit, I think. Elastic’s competitor, Datadog, was not taking enough care of what packages were being installed and they were threatened by it... The hacker managed to add a line of code that executed a base64-encoded script whenever anyone used the package.

Elena: That code added a FastAPI HTTP route, enabling the hacker to run arbitrary Python or SQL code. The package was published on PyPI and accessible by anyone. Quite wild…

Charlie: Yeah. Those attacks are called Supply Chain Vulnerabilities. I bet that would not happen in Elastic as people raise awareness regarding security concerns.

Elena: Exactly. And it’s not limited to Python packages. Something similar can be done with Docker images, for instance. Every time someone pulls a docker image from Docker Hub and then runs a container with it… well, the system can be compromised. And it is not so strange.

Charlie: Same with GitHub Actions, for example. If you use a GitHub Action developed by someone else in your workflow, it’s possible that the action is malicious and the attacker steals the secrets defined for the job. Can you imagine? Getting access to all the keys, passwords, etc., in use?

You: And what about misnaming famous packages? Like injecting malicious code into a package called rrequests instead of requests. If someone tries to install requests but makes a typo…

Charlie: That could totally happen to me.

Elena: And in fact happened. Library requests is legit. Libraries rrequests, requesys and requests3 were malware.

James: Got it, got it. So the idea is to attack through the ML system’s third-party dependencies, right? Elena and Charlie, you’re in charge of this.

Charlie: Awesome.

You: I am just thinking… What makes ML systems different from traditional software? The data and the model! Maybe there’s a way to attack through these points too.

James: If we get access to the data, it’s possible to corrupt it. They’re called data poisoning attacks. Basically, it means injecting erroneous or malicious samples into the training data of a model to alter its behavior later on.

Elena: Sounds a bit tricky, doesn’t it? After all, we’d need access to their data sources…

James: Yeah, although it’s technically possible, it’s still difficult. But there’s an easier way: public datasets. Imagine publishing a corrupted dataset and being able to control model behaviour at will. We could disable fraud detection systems, modify AI-based cybersecurity systems to let certain threats pass…

Elena: Overloading email by breaking the spam detection system… like a kind of DoS attack.

You: And harm companies’ reputation. For instance, we could try to alter the model’s behaviour to introduce unethical biases—discriminating by race, gender, etc. If the company’s data scientists don’t pay close attention during training, we could slip it in easily. You know how the Internet and social media are…

James: That’s right. There’s also the possibility of confusing models that work with images. I read that by modifying just a few pixels, you can trick object recognition systems into failing to detect things like weapons, mistaking them for something else. Adversarial attacks, folks.

You: That could be used against facial recognition systems.

Charlie: Crazy!

James: Yeah. I’ll take care of this part if you like.

Charlie: Great, we’re finding lots of attack paths. I was thinking… what about LLMs? They’re all the rage now; we could explore how to hack the virtual assistant they offer on their website.

You: Prompt injections and jailbreaks. It’s kind of resembling social engineering, but with AIs. There’s even a game to improve your prompt injection skills

Elena: What’s prompt injection?

You: It’s basically subverting the programming of an LLM by giving specific instructions. For instance, inserting unusual characters and commands can drastically change the LLM’s response.

Elena: And how could we use it?

You: Well, if their LLM is connected to any of their internal databases, like in a RAG system, we could try to access sensitive or confidential business information. And if it’s integrated with any additional internal system, we might even be able to trigger certain actions through the prompt. It’s hard, but… it’s also a possible vulnerability. We could gain access to the entire system with it.

James: I once saw that DeepMind managed to extract training data from an LLM through prompts.

James: If they’ve fine-tuned their model with internal data and we manage to access it… that could be another vulnerability.

You: Good idea! And it could also help us identify if they’ve used any public dataset, so that we can apply data poisoning attacks.

Charlie: Alright, let’s get to work. You’re in charge of this.

You: Sure thing. Let's go!

genius cat - Imgflip

After several hours of analysis and testing, you manage to extract a small portion of the training data through a prompt injection.

You: Hey guys, look what I got! Training data!

Elena: Unbelievable… Look at all this juicy information. Client data, pricing information, financial data… They barely anonymised anything!

You: Yeah. And with a bit of tweaking in the prompt, I managed to get the LLM to reveal even more details.

James: Awesome! Well, there’s one hour left in the competition. Shall we make the final presentation?

Charlie: Yep.

In your final presentation, you explain the results and all the confidential information you’ve extracted.

You: And this reinforces the need to address specific security challenges in ML. Reviewing and hiding LLM responses, monitoring inputs to detect anomalies, anonymising and masking sensitive information, limiting prompt length, checking which data your LLM has access to… There are many security measures that would have made it harder for us, but we pulled it off.

Elena: This highlights the importance of ML security, known as MLSecOps.

The audience applauds :clap:. The judges are astonished. Meanwhile, the other team barely found anything because they didn’t know how to exploit the LLM vulnerabilities.

Judge: Well, it’s time to decide the winner… Unanimously, we’ve chosen CipherCore as the winner. Congratulations!

Great job! Now you are here :sunglasses::

Cheers! :beers:

I hope you have had fun reading the story and learned something. I wish you a Merry Christmas and a Happy New Year!

3 Likes