38 TB error
Between July 20, 2020 and June 24, 2023, Microsoft made a vast trove of data publicly accessible via a public repository on GitHub. Cloud security company Wiz discovered the issue and reported it to Microsoft on June 22, 2023, and Microsoft closed the vulnerability two days later. According to Wiz researchers, Microsoft employees misused the Shared Access Signature (SAS) feature of the Azure platform, exposing 38 terabytes of private data as a result.
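The danger of an over-permissive SAS token can be seen by inspecting its query parameters: Azure encodes the signed resource (`sr`), permissions (`sp`), and expiry (`se`) directly in the URL. The sketch below, using a made-up storage account and token purely for illustration, shows how those fields reveal a token that grants full read/write/delete/list access to an entire container for decades:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical SAS URL for illustration only -- account name, container,
# and token values are invented, but the query parameter names (sv, sr,
# sp, se, sig) are the real Azure SAS fields.
sas_url = (
    "https://examplestorage.blob.core.windows.net/ai-models/data.ckpt"
    "?sv=2020-08-04&sr=c&sp=racwdl&se=2051-10-06T07:00:00Z&sig=REDACTED"
)

def audit_sas(url: str) -> dict:
    """Extract the security-relevant fields of an Azure SAS token."""
    params = parse_qs(urlparse(url).query)
    return {
        "resource": params["sr"][0],     # 'c' = whole container, 'b' = single blob
        "permissions": params["sp"][0],  # r=read, a=add, c=create, w=write, d=delete, l=list
        "expiry": params["se"][0],       # ISO-8601 expiry timestamp
    }

report = audit_sas(sas_url)
print(report)
# -> {'resource': 'c', 'permissions': 'racwdl', 'expiry': '2051-10-06T07:00:00Z'}
```

A token scoped like this ('racwdl' permissions on a whole container, with a far-future expiry) matches the over-scoped pattern Wiz described: anyone holding the URL could not only read but also modify or delete everything in the container until the token was revoked.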
According to Wiz, Microsoft also exposed training data for its AI models, as well as disk backups of two employees' workstations. The backups contained "secrets," private cryptographic keys, passwords, and more than 30,000 internal Microsoft Teams messages belonging to 359 Microsoft employees. Until Microsoft took precautions, anyone with the link could access the full 38 TB of private files.
According to Wiz, Microsoft's multi-terabyte incident highlights the risks that come with AI model training. The researchers emphasize that this emerging technology requires "large data sets to train," so many development teams end up handling "large amounts of data" and routinely sharing it with colleagues, which widens the opportunities for this kind of exposure.