GitHub has become an essential resource for programmers around the world, serving as a comprehensive knowledge base and repository for open source coding projects, data storage, and code management. However, the site is currently under an automated attack that involves cloning and creating a large number of malicious code repositories. While the site's developers have worked to remove the affected repositories, a significant portion of them are said to have survived, and more are being uploaded on a regular basis.
An unknown attacker has managed to create and deploy an automated process that forks and clones existing repositories while adding their own malicious code, hidden under seven layers of obfuscation (via Ars Technica). These fraudulent repositories are difficult to distinguish from their legitimate counterparts, and some users, unaware of the malicious nature of the code, fork the affected repositories themselves, unintentionally increasing the scale of the attack.
Once a developer uses an affected repo, a hidden payload unpacks itself through seven layers of obfuscation, which include malicious Python code and a binary executable. The code then collects sensitive data and credentials and uploads them to a control server.
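To make the idea of layered obfuscation concrete, here is a minimal Python sketch that wraps a harmless string in seven rounds of encoding and then peels them off again. The choice of zlib plus base64 and the function names `wrap` and `peel` are assumptions made for illustration; the reporting does not say which encodings the attacker actually used.

```python
import base64
import binascii
import zlib
from typing import Tuple


def wrap(payload: bytes, layers: int) -> bytes:
    """Apply `layers` rounds of zlib compression followed by base64 encoding."""
    for _ in range(layers):
        payload = base64.b64encode(zlib.compress(payload))
    return payload


def peel(blob: bytes, max_layers: int = 20) -> Tuple[bytes, int]:
    """Undo base64+zlib rounds until decoding fails.

    Returns the innermost data and the number of layers removed.
    The encoding scheme is hypothetical, chosen only for this example.
    """
    removed = 0
    while removed < max_layers:
        try:
            blob = zlib.decompress(base64.b64decode(blob, validate=True))
        except (binascii.Error, zlib.error):
            # The data is no longer valid base64+zlib, so stop peeling.
            break
        removed += 1
    return blob, removed


if __name__ == "__main__":
    hidden = wrap(b"print('hello')", 7)
    data, count = peel(hidden)
    print(count, data)
```

Each layer on its own is trivial to reverse, but stacking several of them means a simple scan of the source for known strings can miss what sits at the bottom.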
The research and data teams at security provider Apiiro have been monitoring a resurgence of the attack since its relatively small beginnings in May of last year. Although the company says GitHub quickly removes the affected repositories, its automated detection system is still missing many of them, and manually uploaded versions are still slipping through.
Given the current scale of the attack, which researchers say involves millions of uploaded or forked repositories, even a 1% miss rate means there may still be thousands of compromised repositories on the site.
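The arithmetic behind that estimate can be sketched in a few lines of Python. The repository totals below are illustrative assumptions, since the researchers only say "millions", and `expected_missed` is a hypothetical helper name.

```python
def expected_missed(total_repos: int, miss_rate: float) -> int:
    """Estimate how many malicious repositories survive a given miss rate."""
    return round(total_repos * miss_rate)


if __name__ == "__main__":
    # Illustrative totals only; the source gives no exact figure.
    for total in (1_000_000, 2_000_000, 5_000_000):
        print(f"{total:,} repos at a 1% miss rate -> {expected_missed(total, 0.01):,} missed")
```

Even at the low end of "millions", a 1% miss rate leaves about ten thousand repositories undetected.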
The attack was initially documented on a small scale, when several packages containing early versions of the malicious code were discovered on the site, but it has gradually grown in size and sophistication. Researchers have identified several possible reasons for the operation's success so far, including the sheer size of GitHub's user base and the increasing complexity of the technique.
What's really fascinating here is the combination of sophisticated automated attack methods and simple human nature. Even as the obfuscation methods have become more sophisticated, the attackers have relied heavily on social engineering to trick developers into choosing the malicious code over the real code and inadvertently spreading it, which makes the attack more severe and much harder to detect.
As things stand, this method appears to have worked remarkably well. While GitHub has not yet commented directly on the attack, it issued a general statement assuring its users: “We have teams dedicated to detecting, analyzing, and removing content and accounts that violate our acceptable use policy. We use manual reviews and large-scale detections that leverage machine learning and continually evolve and adapt to adversarial attacks.”
The perils of popularity seem to have manifested themselves here. While GitHub remains an important resource for developers around the world, its open source nature and huge user base appear to leave it somewhat vulnerable. Given the effectiveness of the method, it is not surprising that solving the problem is proving to be an uphill battle, one that GitHub has yet to win.