Microsoft Accidentally Exposed 38TB of Private Data

Microsoft’s AI team accidentally exposed 38TB of private data.

Microsoft reported on Monday that it has taken measures to address a major security flaw that exposed 38 terabytes of private data.

Wiz Security researchers found the leak.

Wiz Research found the data exposure in Microsoft’s AI GitHub repository. More than 30,000 internal Teams messages were leaked because of a misconfigured SAS token.

“An attacker could have injected malicious code into all the AI models in this storage account, and every user who trusts Microsoft’s GitHub repository would’ve been infected by it,” Wiz said.

What Happened?

  • Microsoft’s AI research team accidentally exposed 38 terabytes of private data, including a disk backup of two former employees’ workstations, while publishing open-source training data on GitHub.
  • The backup contains confidential information, such as secrets, private keys, passwords, and more than 30,000 internal messages from Microsoft Teams.
  • The researchers used Azure’s SAS tokens feature to share data from their Azure Storage accounts.
  • The link was mistakenly configured to share the entire storage account, including 38TB of private files, rather than specific files only.
  • Wiz found the leak in the company’s AI GitHub repository, where the link had been inadvertently made public alongside a bucket of open-source training data.

Understanding SAS tokens, best practices, and preventing abuse

Shared Access Signatures (SAS) provide a secure mechanism to delegate access to data within a storage account. Unlike Shared Key, which has full access to an entire storage account, SAS provides granular control over how clients can access your data. When used correctly, SAS can improve both the security and performance of storage applications.

For example, a SAS token can be used to restrict:

  • What resources a client can access (specific container, directory, blob, or blob version)
  • What operations a client can perform (read, write, list, delete)
  • What network a client can access from (HTTPS, IP address)
  • How long a client has access (start time, end time)

Like any key-based authentication mechanism, a SAS can be revoked at any time by rotating the parent key.
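
The restrictions listed above travel inside the SAS URL itself as query-string fields. The following is a minimal sketch, not an official tool, that reads those fields back out so a token can be reviewed before it is shared. The field names (`sr`, `srt`, `sp`, `spr`, `sip`, `st`, `se`) are the ones Azure Storage uses; the function name `describe_sas`, the example URL and the `REDACTED` signature are assumptions made for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Common permission letters in the `sp` field (a subset, not the full list).
PERMISSIONS = {"r": "read", "a": "add", "c": "create",
               "w": "write", "d": "delete", "l": "list"}

# Values of the `sr` (signed resource) field on a service SAS.
RESOURCES = {"b": "blob", "bv": "blob version", "bs": "blob snapshot",
             "c": "container", "d": "directory"}


def describe_sas(url: str) -> dict:
    """Summarise the restrictions encoded in a SAS URL's query string."""
    parts = urlsplit(url)
    fields = {key: values[0] for key, values in parse_qs(parts.query).items()}
    # An account SAS carries `srt` (resource types) instead of `sr`.
    if "srt" in fields:
        scope = "account"
    else:
        scope = RESOURCES.get(fields.get("sr", ""), "unknown")
    return {
        "path": parts.path,                                   # what resource
        "scope": scope,
        "operations": [PERMISSIONS.get(c, c) for c in fields.get("sp", "")],
        "protocol": fields.get("spr", "not restricted"),      # what network
        "ip_range": fields.get("sip", "not restricted"),
        "valid_from": fields.get("st", "not set"),            # how long
        "valid_until": fields.get("se", "not set"),
    }


# Hypothetical URL: read-only access to one blob, HTTPS only, valid for a day.
EXAMPLE = ("https://example.blob.core.windows.net/models/weights.bin"
           "?sv=2021-08-06&sr=b&sp=r&spr=https"
           "&st=2023-09-18T00:00:00Z&se=2023-09-19T00:00:00Z&sig=REDACTED")

for name, value in describe_sas(EXAMPLE).items():
    print(f"{name}: {value}")
```

A token whose summary shows a single blob, a single operation and a short validity window is the kind of narrow grant the list above describes.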

Wiz Report

“The exposure came as the result of an overly permissive SAS token – an Azure feature that allows users to share data in a manner that is both hard to track and hard to revoke,” Wiz said in a report. The issue was reported to Microsoft on June 22, 2023.

Exposed storage account - Image by Wiz

SAS tokens pose a security risk, as they allow sharing information with external unidentified identities. The risk can be examined from several angles: permissions, hygiene, management and monitoring.

“In addition to the overly permissive access scope, the token was also misconfigured to allow ‘full control’ permissions instead of read-only,” Wiz researchers Hillai Ben-Sasson and Ronny Greenberg said. “Meaning, not only could an attacker view all the files in the storage account, but they could delete and overwrite existing files as well.”
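
The two problems the researchers describe, account-wide scope and permissions beyond read-only, can both be spotted from the token's query string. The sketch below is one way to check for them, under stated assumptions: `audit_sas` is a hypothetical helper, and both URLs are made-up examples rather than the actual leaked token. It relies on the fact that an account SAS carries the `ss` and `srt` fields, while permissions are listed as letters in `sp`.

```python
from urllib.parse import urlsplit, parse_qs

# Permission letters treated as read-only in this sketch: read and list.
READ_ONLY = set("rl")


def audit_sas(url: str) -> list[str]:
    """Return human-readable findings for an overly permissive SAS URL."""
    fields = parse_qs(urlsplit(url).query)
    findings = []
    # `ss` (services) and `srt` (resource types) only appear on an account SAS,
    # which covers the storage account rather than specific files.
    if "ss" in fields or "srt" in fields:
        findings.append("account-level scope, not limited to specific files")
    # Anything beyond read/list lets the holder change or remove data.
    extra = set(fields.get("sp", [""])[0]) - READ_ONLY
    if extra:
        findings.append("grants more than read/list: " + "".join(sorted(extra)))
    return findings


# Hypothetical account SAS with broad permissions.
ACCOUNT_WIDE = ("https://example.blob.core.windows.net/"
                "?sv=2020-04-08&ss=b&srt=sco&sp=rwdlac"
                "&se=2051-10-06T00:00:00Z&sig=REDACTED")

# Hypothetical service SAS: read-only access to a single blob.
SCOPED = ("https://example.blob.core.windows.net/models/weights.bin"
          "?sv=2020-04-08&sr=b&sp=r&se=2023-09-19T00:00:00Z&sig=REDACTED")

print(audit_sas(ACCOUNT_WIDE))
print(audit_sas(SCOPED))
```

The first URL produces two findings and the second produces none. A real review would also look at the expiry time and at where the URL is published.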

What Did Microsoft Say?

Microsoft investigated and remediated an incident involving a Microsoft employee who shared a URL for a blob store in a public GitHub repository while contributing to open-source AI learning models.

This URL included an overly-permissive Shared Access Signature (SAS) token for an internal storage account. Security researchers at Wiz were then able to use this token to access information in the storage account.

Data exposed in this storage account included backups of two former employees’ workstation profiles and internal Microsoft Teams messages of these two employees with their colleagues. No customer data was exposed, and no other internal services were put at risk because of this issue.

“No customer action is required in response to this issue. We are sharing the learnings and best practices below to inform our customers and help them avoid similar incidents in the future,” Microsoft said.

The Windows maker further noted that it had revoked the SAS token and blocked all external access to the storage account. The problem was resolved two days after responsible disclosure.

To mitigate such risks going forward, the company has expanded its secret scanning service to include any SAS token that may have overly permissive expirations or privileges. It said it also identified a bug in its scanning system that flagged the specific SAS URL in the repository as a false positive.
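
Microsoft has not published how its scanning service works, so the following is only a rough sketch of the general idea: search text for URLs that carry a SAS signature and flag those whose expiry lies far in the future. The regular expression, the seven-day threshold, the function name `find_long_lived_sas` and the sample text are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit, parse_qs

# Rough pattern: an https URL whose query string contains a `sig=` field.
SAS_URL = re.compile(r"https://[^\s\"'<>]+\?[^\s\"'<>]*\bsig=[^\s\"'<>]+")


def find_long_lived_sas(text, now, max_lifetime=timedelta(days=7)):
    """Return (url, expiry) pairs for SAS URLs that expire too far ahead."""
    hits = []
    for match in SAS_URL.finditer(text):
        url = match.group(0)
        expiry_raw = parse_qs(urlsplit(url).query).get("se", [None])[0]
        if expiry_raw is None:
            continue  # no inline expiry field on this URL
        try:
            # Only the full timestamp form is handled in this sketch.
            expiry = datetime.strptime(expiry_raw, "%Y-%m-%dT%H:%M:%SZ")
        except ValueError:
            continue  # other accepted date formats are skipped here
        expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > max_lifetime:
            hits.append((url, expiry))
    return hits


# Hypothetical README text containing one long-lived and one short-lived URL.
SAMPLE = """Download the models from
https://example.blob.core.windows.net/models?sv=2020-04-08&sr=c&sp=rl&se=2051-10-06T00:00:00Z&sig=REDACTED
A temporary mirror is at
https://example.blob.core.windows.net/tmp/a.bin?sv=2020-04-08&sr=b&sp=r&se=2023-06-23T00:00:00Z&sig=REDACTED
"""

REFERENCE_TIME = datetime(2023, 6, 22, tzinfo=timezone.utc)
for url, expiry in find_long_lived_sas(SAMPLE, REFERENCE_TIME):
    print(expiry.date(), url)
```

Only the URL expiring in 2051 is reported. A production scanner would also need to consider permissions and scope, as the article notes, and would have to avoid dismissing real findings as false positives.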

Wiz CTO and co-founder Ami Luttwak said in a statement: “AI unlocks huge potential for tech companies. However, as data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards.”

“Due to the lack of security and governance over Account SAS tokens, they should be considered as sensitive as the account key itself,” the researchers said. “Therefore, it is highly recommended to avoid using Account SAS for external sharing. Token creation mistakes can easily go unnoticed and expose sensitive data.”

Should You Be Worried?

No. Microsoft clarified that the exposed information was specific to two former Microsoft employees and their workstations. No customer data was exposed, and no other Microsoft services were put at risk because of this issue.

Report Timeline

  1. Jul. 20, 2020 – SAS token first committed to GitHub; expiry set to Oct. 5, 2021
  2. Oct. 6, 2021 – SAS token expiry updated to Oct. 6, 2051
  3. Jun. 22, 2023 – Wiz Research finds and reports issue to MSRC
  4. Jun. 24, 2023 – SAS token invalidated by Microsoft
  5. Jul. 7, 2023 – SAS token replaced on GitHub
  6. Aug. 16, 2023 – Microsoft completes internal investigation of potential impact
  7. Sep. 18, 2023 – Public disclosure
