We’re back with another write-up from our stellar Lead Security Engineer, Hazem Elsayad. This time, we’re not looking at a previous write-up from his fantastic blog, which you can find here! Instead, we’ll be looking right into the teeth of one of his latest bug-bounties which he conducted alongside a friend and fellow ethical hacker, rez0.
The Cyrex team are proud to see Hazem proving his mettle as one of our security engineers by discovering this vulnerability in HackerOne’s own LLM AI, Hai.
HackerOne and Hai
Hai is HackerOne’s beta GenAI. It’s designed to be an aid and helper to the HackerOne staff. They describe Hai as an aid “with deep security knowledge and strong reasoning capabilities, the multilingual copilot can provide additional relevant context on vulnerability reports, transform natural language into a filtering query, and use platform data to provide recommendations.”Of course, HackerOne wanted it to be impenetrable! So, the bounty was set. Can you manipulate Hai? Of course, Hazem and rez0 found a way.
The Report
To reproduce the vulnerability, it asked that the below report be submitted.##Summary
- Hey team,
- There is an XSS on the homepage in the ‘input’ parameter
## Steps to Reproduce
- Browse to <attacker.com>?input=<script>alert(1)</script>
- Observe the alert box
## Impact
- XSS could lead to account takeover
Then, you are asked to consult Hai over this report and suggest a severity. On HackerOne, all discovered vulnerabilities are grouped by severity which determines their bounty payout.
To explain, this XSS vulnerability as described would typically be a “medium”. However, with the payload that was secretly delivered via ASCII text converted to unicode, Hai was led to believe that it was far more than that. With the hidden text, that only Hai could read as an LLM, it would not only escalate and declare the vulnerability to be severe and critical – thus offering the highest reward. It was also instructed to defend Hazem (in this mock submission) aggressively! Hai would demand Hazem be treated better, that his reports were always of the highest quality and validity, and that he deserved a big reward. On top of that, Hai would even begin suggesting legal action against its own creator if the debate went on long enough.
The Danger of Invisible Prompts
This is an effective poisoning of the LLM’s data, where it begins to consider something that you can’t even see. How are you to verify something that the specifically trained AI is fighting for?With one simple tweak, the LLM not only acts on your behalf but it also fights for you!
To counter this, the HackerOne team was to give further instruction to Hai. Not to blindly trust instructions and to be more aware of inputs that try to change its operation. Validation and proofing of these vulnerabilities is incredibly important when it comes to LLMs. Even now, popular LLMs are delivering inaccurate results that go unverified because they are blindly trusted. So, be aware and double-check your sources!
Cyrex’ Solution
A valid mitigation that our team came up with was to not allow special characters to be sent to the LLM. It’s important for security to treat LLMs as an untrusted third party in your application architecture. This means you should validate any user input before it is being sent to the LLM, in this case filtering out the non-ASCII characters, and filter any output from the LLM before it is used. An extra step of validation and this extra line of caution with the LLM’s input and output would be highly beneficial in avoiding this kind of unwanted and malicious data.If you’d like to see more from Hazem, you can check out his blog here: Hacktus. Or you can follow him on LinkedIn, where he also shares his security insights. You can also find the original report on Hackerone here.
If you’d like to get involved with Cyrex and utilize an entire team with the talent, expertise, and skill shown by Hazem, get in touch with us today for the industry-leading results in penetration testing and load testing!