Large public cloud providers offer a broad range of services. As of 2021, AWS comprised over 200 products and services, Azure offered more than 160 services, and GCP provided more than 90 different services.
The rich offerings of these large cloud providers make our lives easier, allowing us to focus on building great products that our users enjoy, without having to worry about infrastructure deployment, maintenance, and scalability. Our developers can easily access basic infrastructure elements like virtual machines, as well as Platform-as-a-Service (PaaS) solutions that allow us, for example, to embed advanced AI capabilities into our products and store application data in managed database instances.
Unfortunately, there is a caveat. The more public cloud services we consume, the larger our attack surface becomes. Many of us are used to evaluating the attack surface from a networking point of view: can a bad actor access my environment via the network? Can an infected machine in my environment communicate externally via the network? While securing and monitoring network access is paramount, a different attack vector is often overlooked: permissions, or entitlements.
Large public cloud environments leveraging just a subset of the hundreds of available services can have millions of permissions that need to be managed.
Over the years, multiple best practices have been developed to manage human users. In the public cloud, however, a large share of entitlements belongs to machine (API) identities. Unfortunately, these entitlements are often overlooked, which can lead to serious consequences (e.g. the Capital One breach).
According to Gartner, through 2023, 99% of security failures will be the customer’s fault—and 75% of those failures will be the result of inadequate management of identities, access, and privileges.
So, how can we manage permissions allocation in the public cloud?
One approach may be to monitor permissions usage over a limited period. If a given permission was not used for 90 days, we may assume that it is not needed and revoke it.
Permission usage data can be obtained programmatically from the cloud provider and analyzed using self-written scripts and/or commercial tools.
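As a minimal sketch of this time-window approach, the Python below flags every granted permission that has not been used within the last 90 days. The function name, the identity names, and the shape of the input data are hypothetical; in practice the last-used timestamps would come from whatever usage data your cloud provider exposes.

```python
from datetime import datetime, timedelta, timezone


def find_unused_permissions(granted, last_used, now, window_days=90):
    """Return {identity: [permissions not used within the window]}.

    granted:   {identity: set of permission names}
    last_used: {(identity, permission): datetime of last use}; a missing
               key means the permission was never observed in use.
    """
    cutoff = now - timedelta(days=window_days)
    unused = {}
    for identity, permissions in granted.items():
        stale = sorted(
            perm
            for perm in permissions
            if last_used.get((identity, perm)) is None
            or last_used[(identity, perm)] < cutoff
        )
        if stale:
            unused[identity] = stale
    return unused


# Hypothetical sample data: two machine identities and their usage history.
now = datetime(2021, 6, 30, tzinfo=timezone.utc)
granted = {
    "ci-deployer": {"s3:PutObject", "s3:DeleteBucket"},
    "report-job": {"s3:GetObject"},
}
last_used = {
    ("ci-deployer", "s3:PutObject"): now - timedelta(days=2),
    ("ci-deployer", "s3:DeleteBucket"): now - timedelta(days=200),
    ("report-job", "s3:GetObject"): now - timedelta(days=10),
}

candidates = find_unused_permissions(granted, last_used, now)
print(candidates)  # {'ci-deployer': ['s3:DeleteBucket']}
```

Note that this sketch looks at each identity in isolation and knows nothing about teams or environments.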
While the approach outlined above is straightforward and easy to implement, it has a fundamental flaw. Consider this example: five developers are working on a web application. They were allocated a development AWS cloud account where they deploy and test their code. Suppose only one developer on the team turns virtual machines (EC2 instances) on and off. She constantly uses these permissions:
"ec2:StartInstances",
"ec2:StopInstances"
Because her colleagues have not performed these actions during the monitoring period, their copies of these permissions would be flagged as unused and revoked. If the developer is on leave, or absent for any reason, one of her colleagues would have to turn the virtual machines on and off. With their permissions revoked, they would not be able to do so, which would create a business problem leading to support tickets, approval processes, and lost time.
This revocation could have been avoided if a manual review had taken place and determined that developers should have wide permissions to a development account. Unfortunately, manual review of each unused permission is something most companies cannot afford due to the sheer number of permissions.
Considering the above, we have to find a way to minimize our attack surface by revoking excessive permissions, while simultaneously making sure that we are not causing any disruptions to our colleagues and/or applications. As reviewing permissions manually is not feasible, we have to rely on automation or, to be precise, on machine learning. Such an approach identifies “safe to remove” permissions: not merely permissions that are unused, but permissions that are unused and unlikely to be needed in the future.
According to a Gartner report, “Protecting cloud infrastructure is crucial, especially with more workloads hosted across cloud service providers. Security and risk management technical professionals must deploy tools that enable effective management of cloud infrastructure entitlements and reduce risks caused by unintended access.”
Considering the scale of the cloud, we need a solution that will filter out the noise and present us with a shortlist of actionable items (i.e. low-hanging fruit) that will improve our security posture, reduce friction between security and DevOps teams, and support the adoption of public cloud in the organization.
Lastly, consider the previous example with the five developers. It is OK for all developers to have wide permissions to a development AWS account. However, if one of the developers has unused access rights to a production account, while none of his or her colleagues have this kind of access, that access would be flagged for human investigation.
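The distinction drawn in this example can be sketched as a simple peer comparison. The Python below is a deliberately basic rule-based stand-in, not machine learning and not how any particular CIEM product works. The function name, the verdict labels, the developer names, and the "account:action" permission format are all hypothetical.

```python
def classify_unused_permissions(granted, used, team):
    """Classify each unused permission by comparing it with teammates.

    granted, used: {identity: set of "account:action" strings}
    team:          list of identities treated as peers of one another
    Returns {(identity, permission): verdict}.
    """
    verdicts = {}
    for identity in team:
        peers = [member for member in team if member != identity]
        unused = granted.get(identity, set()) - used.get(identity, set())
        for perm in sorted(unused):
            if any(perm in used.get(peer, set()) for peer in peers):
                # A teammate relies on it, so it may be needed as a backup.
                verdicts[(identity, perm)] = "keep"
            elif not any(perm in granted.get(peer, set()) for peer in peers):
                # Unused and held by nobody else on the team: abnormal.
                verdicts[(identity, perm)] = "investigate"
            else:
                # Held by the team but used by no one.
                verdicts[(identity, perm)] = "removal-candidate"
    return verdicts


# Hypothetical team of five developers sharing a development account.
team = ["alice", "bob", "carol", "dave", "erin"]
dev_perms = {
    "dev:ec2:StartInstances",
    "dev:ec2:StopInstances",
    "dev:s3:DeleteBucket",
}
granted = {name: set(dev_perms) for name in team}
granted["bob"].add("prod:ec2:StartInstances")  # only bob has production access
used = {"alice": {"dev:ec2:StartInstances", "dev:ec2:StopInstances"}}

verdicts = classify_unused_permissions(granted, used, team)
for (identity, perm), verdict in sorted(verdicts.items()):
    if identity == "bob":
        print(identity, perm, verdict)
```

Under these assumptions, bob keeps the start and stop permissions that alice actively uses, while his unused production access is singled out for investigation rather than silently revoked or ignored.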
A mature CIEM (Cloud Infrastructure Entitlement Management) solution, such as Zscaler CIEM, would apply machine learning to identify whether a permission is abnormal in addition to being unused.
CIEM is part of a broader Cloud Native Application Protection Platform (CNAPP) strategy that you should be implementing across your public cloud footprint. You can find more information about Zscaler CIEM and request a demo here.