One in Five Public-Facing Cloud Storage Buckets Expose Sensitive Data

November 17, 2022

Public-facing cloud storage buckets are a data privacy nightmare, according to a study released today.

Members of Laminar Labs’ research team recently found that one in five public-facing cloud storage buckets contains personally identifiable information (PII) – and the majority of that data isn’t even supposed to be online in the first place.

The information uncovered by the researchers includes physical addresses, email addresses, phone numbers, driver’s license numbers, names, loan details, and credit scores.

“Because this data contains such highly sensitive information as loan details, Bitcoin addresses and conversations about unemployment benefits, we believe that this data has the potential to put the organizations to whom the information belongs at risk,” Laminar Labs said in a statement.

“Organizations cannot properly protect data they do not know is exposed,” the company added. “And in the shared responsibility model, keeping this data secure is the responsibility of the organization that owns the buckets in which the data resides.”

A Data-Centric View

According to Laminar, the sensitive data found online includes the following – it’s quite a list:

A file containing PII of people who used a third-party chatbot service on different websites, including names, phone numbers, email addresses, and messages sent to the bot (such as people seeking unemployment benefits)
A file containing loan details – names, loan amounts, credit scores, interest rates, and more
A participant report for an athletic competition, including names, physical addresses, zip codes, email addresses, and medical information
A VIP invite list, including names, email addresses, and physical addresses
A file with names, Ethereum and Bitcoin address information, and block card email addresses

Companies need to know what publicly exposed sensitive data is in their environment, Laminar said. Still, doing so can be harder than it seems, since non-public Amazon S3 buckets can contain specific files and objects that are public – and conversely, buckets that are intentionally public, like hosted websites, can contain PII placed there by mistake.
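To make the bucket-versus-object distinction concrete, the following is a minimal Python sketch that checks an individual object's access control list (ACL) for grants to S3's predefined public groups. It is an illustration only, not Laminar's method. It assumes the ACL is a dictionary shaped like the response from boto3's `get_object_acl` call; the function name `is_publicly_readable` and the sample data are hypothetical. ACLs are also only one of several mechanisms that control exposure, alongside bucket policies and Block Public Access settings, so a real audit would need to consider all of them.

```python
# URIs of the predefined S3 groups that extend access beyond a single account.
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
PUBLIC_GROUPS = {ALL_USERS, AUTHENTICATED_USERS}

# Permissions that allow the grantee to read the object's contents.
READ_PERMISSIONS = {"READ", "FULL_CONTROL"}


def is_publicly_readable(acl: dict) -> bool:
    """Return True if any grant gives a public group read access.

    `acl` is assumed to look like a boto3 `get_object_acl` response,
    i.e. {"Grants": [{"Grantee": {...}, "Permission": "..."}]}.
    """
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if (
            grantee.get("Type") == "Group"
            and grantee.get("URI") in PUBLIC_GROUPS
            and grant.get("Permission") in READ_PERMISSIONS
        ):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical ACLs for two objects stored in the same bucket.
    private_acl = {
        "Grants": [
            {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
             "Permission": "FULL_CONTROL"}
        ]
    }
    public_acl = {
        "Grants": [
            {"Grantee": {"Type": "Group", "URI": ALL_USERS},
             "Permission": "READ"}
        ]
    }
    for name, acl in [("report.csv", private_acl), ("export.csv", public_acl)]:
        print(name, "public" if is_publicly_readable(acl) else "private")
```

Running a check like this per object, rather than once per bucket, is what surfaces a single public file inside an otherwise private bucket.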

The answer, according to Laminar, is a data-centric view rather than an infrastructure-centric one, cataloging all data in your cloud environment to ensure that sensitive information is kept private while public files remain accessible.
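As a rough illustration of what cataloging data rather than infrastructure can mean, the Python sketch below scans the text of each stored object for patterns that resemble PII and records which objects contain which types. It is a simplified assumption of how such a catalog might start, not a description of Laminar's product: the regular expressions, the `classify` and `catalog` function names, and the sample objects are all hypothetical, and pattern matching alone will miss many kinds of sensitive data.

```python
import re
from typing import Dict, Set

# Deliberately simple, illustrative patterns; real classifiers are far broader.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def classify(text: str) -> Set[str]:
    """Return the labels of every pattern that appears in the text."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}


def catalog(objects: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each object key to the PII types found in it, skipping clean objects."""
    findings = {}
    for key, body in objects.items():
        found = classify(body)
        if found:
            findings[key] = found
    return findings


if __name__ == "__main__":
    # Hypothetical contents of an intentionally public website bucket.
    sample_objects = {
        "index.html": "<h1>Welcome</h1>",
        "export.csv": "Jane Doe,jane@example.com,555-123-4567",
    }
    for key, types in catalog(sample_objects).items():
        print(key, sorted(types))
```

Combining a content catalog like this with per-object exposure checks is one way to flag the case the article describes: a legitimately public bucket that holds a file placed there by mistake.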

A Pervasive Privacy Problem

Several other companies have warned of similar issues. UpGuard, for one, has detected thousands of breaches related to misconfigured Amazon S3 security settings over the past four years – including 1.8 million personal records from a database of Chicago voters, 14 million Verizon customer records, and GoDaddy trade secrets and infrastructure information.

“As long as S3 buckets can be configured for public access, there will [be] data exposures through S3 buckets,” UpGuard chief marketing officer Kaushik Sen wrote in a blog post earlier this year.

The Mitiga Research Team also recently found hundreds of databases containing PII exposed via the Amazon Relational Database Service (RDS). RDS snapshots are used to back up data, but a snapshot that is shared publicly can expose a range of highly sensitive information.

As the researchers noted in a blog post, “a Public RDS snapshot is a valuable feature when a user wants to share a snapshot with colleagues, while not having to deal with roles and policies. In this manner, the user can share the snapshot publicly for just a few minutes… What could possibly happen?”

Assume the Worst

Among the data the Mitiga researchers found exposed between September 21 and October 20 of this year was a MySQL database with about 10,000 rows recording car rental transactions, including names, phone numbers, email addresses, marital status, and rental information.

Another MySQL database contained information on about 2,200 users of a dating app, including email addresses, password hashes, birthdates, links to personal images, and private messages.

The researchers recommend leveraging AWS Trusted Advisor to assess your security posture, using CloudTrail logs to check for historical use of public snapshots, and separately checking for all currently available RDS snapshots.
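The last two checks can be sketched in Python. The example below is an assumption-laden illustration rather than Mitiga's tooling. The first function expects a dictionary shaped like the `DBSnapshotAttributesResult` returned by boto3's `describe_db_snapshot_attributes`, where a snapshot is public when its `restore` attribute includes the value `all`. The second filters already-parsed CloudTrail records for `ModifyDBSnapshotAttribute` events; the `requestParameters` field names used here (`attributeName`, `valuesToAdd`, `dBSnapshotIdentifier`) are assumptions that should be verified against your own logs. Both function names and the sample data are hypothetical.

```python
from typing import List


def snapshot_is_public(attributes_result: dict) -> bool:
    """Return True if the snapshot's 'restore' attribute is shared with 'all'."""
    for attr in attributes_result.get("DBSnapshotAttributes", []):
        if (
            attr.get("AttributeName") == "restore"
            and "all" in attr.get("AttributeValues", [])
        ):
            return True
    return False


def made_public_events(events: List[dict]) -> List[dict]:
    """Return CloudTrail records in which a snapshot was shared publicly.

    Field names inside requestParameters are assumed, not confirmed.
    """
    hits = []
    for event in events:
        if event.get("eventName") != "ModifyDBSnapshotAttribute":
            continue
        params = event.get("requestParameters") or {}
        if (
            params.get("attributeName") == "restore"
            and "all" in (params.get("valuesToAdd") or [])
        ):
            hits.append(event)
    return hits


if __name__ == "__main__":
    # Hypothetical current state of one snapshot.
    current = {
        "DBSnapshotIdentifier": "example-snapshot",
        "DBSnapshotAttributes": [
            {"AttributeName": "restore", "AttributeValues": ["all"]}
        ],
    }
    print("currently public:", snapshot_is_public(current))

    # Hypothetical historical CloudTrail records.
    history = [
        {"eventName": "CreateDBSnapshot", "requestParameters": {}},
        {
            "eventName": "ModifyDBSnapshotAttribute",
            "requestParameters": {
                "dBSnapshotIdentifier": "example-snapshot",
                "attributeName": "restore",
                "valuesToAdd": ["all"],
            },
        },
    ]
    print("times made public:", len(made_public_events(history)))
```

Checking both current attributes and historical events matters because a snapshot that was public for only a few minutes will no longer appear public today, yet may already have been copied.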

“We think it’s not an overstatement to assume the worst-case scenario – when you are making a snapshot public for a short time, someone might get that snapshot’s metadata and content,” the researchers wrote. “So, for your company and, more importantly, your customers’ privacy – don’t do that if you are not 100% sure there is no sensitive data in the content or in the metadata of your snapshot.”

The risks of publicly exposing personal data are twofold. The first is a loss of customer confidence. The second is the potential for costly fines under data privacy regulations like GDPR and CCPA – see Security Compliance & Data Privacy Regulations for important compliance information on those laws, as well as on China’s new data privacy law.

Jeff Goldman

eSecurity Planet contributor Jeff Goldman has been a technology journalist for more than 20 years and an eSecurity Planet contributor since 2009. He’s also written extensively about wireless and broadband infrastructure and semiconductor engineering. He started his career at MTV, but soon decided that technology writing was a more promising path.