Each new week seemingly brings with it another disturbing discovery of personal data left unsecured and exposed on the internet. Different companies — spanning industries as varied as pornography, cannabis, and medical records — screwing up in hard-to-parse ways, all with the same casualty: your privacy.
While the scale and severity may vary, a single theme often unites each newsworthy incident: An unsecured Amazon S3 bucket containing customer, medical, or financial data that’s left out for anyone with the proper know-how to pilfer.
The question of why this keeps happening is an important one, and sadly it isn’t going away anytime soon.
You may be wondering what, exactly, an Amazon S3 bucket is. The short and simplified answer is that it’s a virtual storage unit where some companies pay to keep their data. But more on that in a bit.
Your next question is likely more pointed: Is this all Amazon’s fault? The massive company certainly has some culpability in the general erosion of our privacy. But the reason your every personal detail has been exposed in the past, and likely will be in the future, is a lot more complicated.
Unfortunately, that means it’s also going to be more difficult to prevent.
All too often, when security researchers or hackers find personal information online, it’s sitting in an unsecured Amazon S3 bucket. We see this time and time again, often with extremely troubling results.
Hundreds of millions of Facebook user records left exposed. Nearly eight hundred thousand applications for birth certificate copies left exposed. Tens of thousands of cannabis users’ private info left practically in the open. Incidents like these, and there are many, many more, are all tied together by one cloud computing platform: Amazon Web Services.
So what’s going on here? Are customers simply misusing the product, or is there some sort of design flaw that makes the accidental exposure of data inevitable?
To answer that question, I spoke with a range of security experts familiar with S3 buckets and cyber crime. I also repeatedly reached out to Amazon for an on-the-record statement. I wanted to provide the company an opportunity to explain, in its own words, why its services are the unifying factor in the loss of so much privacy.
The company declined to comment.
I also reached out to numerous companies that have themselves screwed up and exposed their own customers’ data. I hoped to understand, from the perspective of an Amazon S3 customer, why this keeps happening. Perhaps unsurprisingly, no one responded to my requests.
Thankfully, for anyone trying to understand the sometimes confusing world of misconfigured buckets, security experts are more than happy to share their expertise.
Amazon S3 Buckets
For starters, it’s worth understanding what, in this context, a “bucket” is. Perhaps the simplest way to think of it is as a “folder” on a PC. In other words, it’s a way for Amazon Web Services (AWS) customers to organize the files they’re paying Amazon to store.
“S3 buckets are a great way to host content,” explained Dan Tentler, executive founder of the security company Phobos Group, over email. “You’ll often find shops who leverage AWS doing things like storing logs, storing uploaded user content, output from huge data processing clusters, all sorts of things! For example — every file you’ve ever uploaded to Slack is sitting in an S3 bucket!”
Some buckets — containing, for example, a company’s database of customer emails and phone numbers — should be set to private by admins. Other buckets contain public information and are intentionally set to public. This makes sense.
It’s when the two get confused that problems arise.
“At the end of the day, it’s easy to classify these sorts of mishaps into two piles: People who care, and people who don’t,” wrote Tentler. “In nearly every case, it’s the people who don’t care that are responsible for these types of breaches, because they didn’t take the 5 minutes to read about S3 security settings and deploy the bucket correctly when they set it up.”
Importantly, Amazon secures S3 buckets by default. In other words, for a bucket to be publicly accessible to any old hacker or security researcher who knows where to look — as was the case with the almost 800,000 applications for birth certificate copies mentioned above — someone has to actively screw up. Or, as Tentler put it, not care.
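To make the "private by default" point concrete, here is a minimal sketch of what "someone has to actively screw up" can look like in a bucket's access control list (ACL). It assumes the ACL is a dictionary shaped like the response from boto3's `get_bucket_acl` call; the two sample ACLs and the owner ID are hypothetical. ACLs are only one of the ways a bucket can be opened up (bucket policies are another), so treat this as an illustration rather than a complete check.

```python
# Group URI that S3 ACLs use to mean "everyone on the internet".
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def public_permissions(acl):
    """Return the sorted permissions an ACL grants to the AllUsers group."""
    found = set()
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") == ALL_USERS_URI:
            found.add(grant["Permission"])
    return sorted(found)


# Hypothetical ACL for a freshly created bucket: only the owner has access.
default_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
         "Permission": "FULL_CONTROL"},
    ]
}

# The same ACL after someone adds a public read grant.
exposed_acl = {
    "Grants": default_acl["Grants"] + [
        {"Grantee": {"Type": "Group", "URI": ALL_USERS_URI},
         "Permission": "READ"},
    ]
}

print(public_permissions(default_acl))  # []
print(public_permissions(exposed_acl))  # ['READ']
```

The bucket only becomes readable by strangers once the second grant is added, and adding it is a deliberate action by someone with access to the account.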
While mankind’s capacity for error is limitless, this S3 bucket error tends to happen in three distinct ways: An AWS customer might, for example, take sensitive data and accidentally place it in a bucket that’s set up to be public. Or, more likely, that same AWS customer might mistakenly change an entire bucket’s setting to public. Another, less charitable, explanation is that an admin temporarily changes a private bucket’s setting to public as some sort of one-off data-sharing shortcut and then forgets to switch it back.
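One way a team might guard against the second and third mistakes is to keep a short list of buckets that are meant to be public and flag everything else. The sketch below assumes the public or private status of each bucket has already been worked out by some other means; the function name, bucket names, and inventory are all hypothetical. It would not catch the first mistake, where sensitive files land in a bucket that is public on purpose.

```python
def unexpected_public_buckets(bucket_is_public, intended_public):
    """Return names of buckets that are public but were never meant to be."""
    return sorted(
        name
        for name, is_public in bucket_is_public.items()
        if is_public and name not in intended_public
    )


# Hypothetical inventory: bucket name -> whether it is currently public.
inventory = {
    "marketing-site-assets": True,
    "customer-records": True,   # flipped to public by mistake
    "internal-logs": False,
}

# Buckets the team has deliberately decided to publish.
intended_public = {"marketing-site-assets"}

print(unexpected_public_buckets(inventory, intended_public))
# ['customer-records']
```

Run on a schedule, a check like this would also surface the "temporary" public bucket that nobody remembered to switch back.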
While in all three of those cases the fault lies with the customer, the first two suggest that Amazon is not doing all it could to make it explicitly and immediately clear to an administrator that a bucket is publicly accessible.
Check the image above. It shows, as of late 2017, what the back end of an S3 Console looks like. Notice anything? Specifically, there is an entire “Access” column which tells an admin whether a bucket is or is not public.
It’s pretty hard to miss.
As previously mentioned, I reached out to Amazon in an attempt to determine why it thinks this mistake keeps getting made, but was unable to get an on-the-record response.
Perhaps it’s hard to delicately call some of your customers idiots?
Set up to fail
In some ways, Amazon Web Services is a victim of its own success.
According to Gartner, a technology advisory company, Amazon captured 47.8 percent of the total “infrastructure as a service” market, of which AWS is a part, in 2018. In other words, Amazon is popular, both with website admins who know how to keep buckets properly secured and with those who don’t.
To be clear, it’s not just Amazon Web Services’ products that are sometimes configured incorrectly. Just this past April, we learned of a likely misconfigured Microsoft cloud server that exposed the personal data of 80 million households. Oops.
“Using ‘the cloud’ is a double-edged sword,” wrote Tentler. “On one hand, it’s made it trivial to do wide-scale operations, host terabytes of data, or deploy new sites and applications — on the other hand, since it’s so easy, the barrier-of-entry which used to be ‘you have to be at least this tall to ride’ is gone — and literally anybody can do it.”
Essentially, if admins don’t know anything about properly configuring an S3 bucket, and they don’t take the time to learn, then they’re putting everyone’s data at risk.
Victor Gevers, a security researcher working with the non-profit GDI Foundation to find and disclose security vulnerabilities, agreed with Tentler. He first emphasized over Twitter direct messages that S3 buckets are private by default, but then made an analogy that’s worth teasing out.
“So when files are publicly accessible, then this was done by the customer,” he wrote. “Can you blame a car company that the drivers can cause accidents?”
But what if the car has a faulty design?
UpGuard, a company that bills itself as “[helping] businesses manage cybersecurity risk,” alerts companies when they’ve left data exposed on the internet. Over the years, UpGuard researchers have discovered millions of private records exposed on misconfigured S3 buckets. And, as the company’s vice president of marketing, Kaushik Sen, wrote in a December 2019 blog post, laying the blame solely on Amazon customers is a bit of a cop out.
“Our view is that AWS has made it far too easy for S3 users to misconfigure buckets to make them totally publicly accessible over the Internet,” he wrote. “It’s up to AWS to create better security solutions by default.”
Chris Vickery, an UpGuard risk analyst, added over email: “It mostly boils down to the rule of thumb that ‘if it can be misconfigured, some amount of users will misconfigure it.’”
Vickery shared with Mashable a specific example of Amazon setting its customers up for failure.
“Many people have assumed [the ‘Global Authenticated Users’] optional access setting will open up a storage repository ‘globally’ within their organization, but still not allow the general public to download items within it,” he explained. “However, Amazon considers Global Authenticated Users to be anyone in the world that is logged in to Amazon Web Services (AWS). End users could arguably be justified in not considering this result, as it is baffling to me that such a setting would ever be included as an option in the first place.”
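The confusion Vickery describes is easier to see when the grant is spelled out. The sketch below assumes the setting he mentions corresponds to the `AuthenticatedUsers` group in S3 ACLs, and that the ACL is a dictionary shaped like the response from boto3's `get_bucket_acl` call. The sample ACL and the plain-English audience labels are illustrative, not Amazon's wording.

```python
GROUP_PREFIX = "http://acs.amazonaws.com/groups/global/"

# Plain-English reading of the two global groups an S3 ACL can name.
AUDIENCE = {
    GROUP_PREFIX + "AllUsers": "anyone on the internet",
    GROUP_PREFIX + "AuthenticatedUsers": "anyone signed in to any AWS account",
}


def describe_group_grants(acl):
    """List (permission, audience) pairs for grants made to global groups."""
    results = []
    for grant in acl.get("Grants", []):
        uri = grant.get("Grantee", {}).get("URI")
        if uri in AUDIENCE:
            results.append((grant["Permission"], AUDIENCE[uri]))
    return results


# Hypothetical ACL from an admin who meant "everyone in my organization".
acl = {
    "Grants": [
        {"Grantee": {"Type": "Group",
                     "URI": GROUP_PREFIX + "AuthenticatedUsers"},
         "Permission": "READ"},
    ]
}

for permission, audience in describe_group_grants(acl):
    print(f"{permission}: {audience}")
# READ: anyone signed in to any AWS account
```

Nothing in the group's name or URI refers to the admin's own organization, which is the gap between what users assume and what the setting does.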
Unfortunately, at least at present, whatever steps Amazon is taking to provide clarity don’t appear to be enough to stop a torrent of misconfigured buckets. And so, expect to keep reading accounts of your personal data, once again, being left exposed on the internet for criminals and security researchers to find.
Ain’t technology grand?