Building the Next Generation Multi-Cloud Landing Zone for Worldwide Retail Company: Security, Governance and Control
Mar 02, 2024
Introduction
In the ever-evolving realm of retail, our client, a global industry leader, has consistently redefined convenience through innovative retail practices. As our client expanded its digital footprint and embraced cloud technologies, the imperative of securing its multi-cloud environment became paramount. With an unwavering commitment to ensuring uninterrupted business operations, our client partnered with Intuitive.Cloud, a prominent provider of cloud engineering solutions.
This case study unfolds the collaborative efforts between Intuitive.Cloud and our client to fortify the security posture of their multi-cloud environment. Upon recognizing the gaps in the existing setup, our team of cloud engineering experts recommended the implementation of a landing zone to efficiently manage the multi-cloud, multi-account infrastructure. Conducting a thorough security and governance assessment, we delved into the intricacies of the client's current cloud security setup, identifying potential vulnerabilities. Our team provided customized recommendations to bridge the gaps and elevate the client's security posture, aligning it with industry best practices. This case study illustrates how our tailored landing zone expertise and cloud engineering support aligned with our client's vision to be the first choice for convenience anytime, anywhere, reinforcing their commitment to delivering fast, personalized convenience.
Challenges
A large retail company faced difficulties managing security and governance across their multi-cloud environment. Challenges included a lack of secure resource hierarchy, inconsistent access control, missing security policies, limited visibility, and inefficient resource management.
Technology Solution
Our approach emphasized consistency and parity across all clouds to ensure a cohesive and secure infrastructure. We addressed the challenges of a multi-cloud environment spanning AWS, Azure, GCP, and Oracle Cloud by assessing the existing setup, identifying gaps, and providing recommendations to enhance the following key areas:
Resource Hierarchy Optimization:
- Conducted an exhaustive analysis of the customer's current cloud resources on all four platforms.
- Proposed a streamlined resource hierarchy design, ensuring optimal organizational structure and future management.
Identity and Access Management (IAM) Enhancement:
- Conducted a thorough IAM audit to identify and address security risks proactively.
- Proposed IAM Policies and role-based access control (RBAC) design to ensure users and services have precisely the necessary permissions.
Centralized Logging:
- Proposed the establishment of a centralized logging infrastructure for real-time insights across all cloud platforms.
- Designed the integration of robust logging solutions to aggregate logs and events, enhancing overall visibility.
Security Tooling Integration:
- Evaluated and recommended cutting-edge security tools compatible with each cloud provider.
- Designed the seamless integration of security tools for centralized management and analysis.
- Proposed the configuration of alerts and notifications to proactively address potential security incidents.
Guardrails and Policies (Detective and Preventive Controls):
- Defined a comprehensive set of detective guardrails to identify and respond to potential security threats.
- Proposed the implementation of preventive guardrails to enforce compliance with security best practices.
- Designed customized policies based on the unique requirements and regulatory considerations of each cloud platform.
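The difference between the two guardrail types can be illustrated with a minimal Python sketch. The resource fields (`public_access`, `encrypted`) and the two rules are hypothetical, chosen only for illustration; they are not the client's actual policies or any cloud provider's API.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical, simplified view of a cloud resource."""
    name: str
    public_access: bool
    encrypted: bool


def violations(resource: Resource) -> list[str]:
    """Return the illustrative rules that a resource breaks."""
    found = []
    if resource.public_access:
        found.append("public access is not allowed")
    if not resource.encrypted:
        found.append("encryption at rest is required")
    return found


def preventive_check(request: Resource) -> None:
    """Preventive guardrail: reject a non-compliant request before creation."""
    problems = violations(request)
    if problems:
        raise PermissionError(f"{request.name}: " + "; ".join(problems))


def detective_scan(inventory: list[Resource]) -> dict[str, list[str]]:
    """Detective guardrail: report non-compliant resources that already exist."""
    return {r.name: violations(r) for r in inventory if violations(r)}
```

In this sketch both guardrails share the same rules; only the point of enforcement differs. The preventive check blocks the change up front, while the detective scan surfaces drift after the fact.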
Tagging Strategy for Resource Organization:
- Developed a robust tagging strategy to efficiently categorize and organize cloud resources.
- Designed the implementation of standardized tags for cost allocation, compliance tracking, and resource identification.
- Proposed an educational initiative to inform stakeholders about the importance of consistent tagging practices for effective resource management.
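A standardized tagging scheme is easiest to enforce when it can be checked automatically. The following is a minimal sketch of such a check; the required tag keys and allowed environment values are assumptions made for illustration and do not reflect the client's actual tag schema.

```python
REQUIRED_TAGS = {"cost-center", "environment", "owner"}  # hypothetical tag keys
ALLOWED_ENVIRONMENTS = {"prod", "non-prod", "sandbox"}   # hypothetical values


def tag_issues(tags: dict[str, str]) -> list[str]:
    """Return a list of problems found in one resource's tags."""
    # Normalize case and whitespace so "Owner" and "owner" count as the same key.
    normalized = {k.strip().lower(): v.strip().lower() for k, v in tags.items()}
    issues = [f"missing tag: {key}" for key in sorted(REQUIRED_TAGS - set(normalized))]
    env = normalized.get("environment")
    if env is not None and env not in ALLOWED_ENVIRONMENTS:
        issues.append(f"invalid environment: {env}")
    return issues
```

A check like this could run in a provisioning pipeline or as a periodic scan, feeding the cost allocation and compliance tracking use cases listed above.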
Design Solution
Our customized design solution proposes a strong foundation by enhancing resource hierarchy, IAM, centralized logging, security tooling, preventive and detective guardrails, and a tagging strategy, optimizing future cloud operations.
1.1 AWS Cloud Landing Zone – Solution Design
Management Account:
The management account creates the AWS organization itself. It can create AWS accounts within the organization, invite other existing accounts to join as member accounts, remove accounts from the organization, and attach policies to the root, OUs, or accounts within the organization. It is responsible for deploying universal security guardrails via service control policies (SCPs), which apply to all member accounts in the organization.
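As an illustration of what such a guardrail might look like, the sketch below builds an SCP document in Python. The helper name, the statement ID, and the choice of denied actions are assumptions for this example and are not taken from the client's design.

```python
import json


def deny_scp(sid: str, actions: list[str]) -> dict:
    """Build a service control policy document that denies the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": sid, "Effect": "Deny", "Action": actions, "Resource": "*"}
        ],
    }


# Illustrative guardrail: member accounts cannot leave the organization
# or stop or delete CloudTrail trails.
guardrail = deny_scp(
    "ProtectOrgAndAuditTrail",
    ["organizations:LeaveOrganization", "cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
)
print(json.dumps(guardrail, indent=2))
```

The resulting JSON could then be attached to the root or to an OU through AWS Organizations. SCPs of this kind restrict member accounts only; they do not limit the management account itself.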
Security OU:
The Security organizational unit (OU) is dedicated to the security posture of the AWS organization. It uses two accounts, Log Archive and Security, to store aggregated security logs and to deploy security services such as Security Hub, GuardDuty, and the Config aggregator.
Network Account:
The network account manages the gateway between the applications and the internet. It isolates networking services, configuration, and operations from individual application workloads, security, and other infrastructure. This isolation limits connectivity, permissions, and data flow to the teams and workloads that need them, while maintaining the principles of separation of duties and least privilege for the teams involved.
Application OU:
The application OU houses accounts dedicated to deploying and managing applications. These accounts can be further categorized into production and non-production environments. Production accounts store and run critical applications that directly serve end users. These accounts typically have stricter security configurations and access controls to ensure the reliability and stability of core services. Non-production accounts, on the other hand, are used for development, testing, and staging purposes. These accounts offer a flexible environment for experimentation and allow developers to build and test applications without impacting the production systems. Each application account, whether production or non-production, can have its own set of resources, such as EC2 instances for compute power, S3 buckets for object storage, and RDS databases for data management.
1.2 Azure Cloud Landing Zone – Solution Design
- Intuitive.Cloud has meticulously crafted and proposed a Management Group and Subscription Hierarchy adhering to industry standards, tailored specifically for our Enterprise Customer. This design enhances the segregation and operational efficiency of their environment.
- Our recommended Identity and Access Management (IAM) strategy ensures effective Role Based Access Control (RBAC). A comprehensive RBAC matrix was curated, featuring roles such as reader, contributor, and owner. Additionally, our approach allows for the customization of roles to align seamlessly with their IAM strategy.
- As part of our design solution, the integration of Microsoft Entra ID's Privileged Identity Management (PIM) feature was advocated. This not only bolsters their security posture but also introduces functionalities like Access reviews, Approval Workflows, and Just in Time (JIT) access.
- To fortify and secure the customer's environment, a set of custom preventive guardrails and detective guardrails were introduced. This proactive and reactive approach ensures a robust defence mechanism against potential threats and vulnerabilities.
- Platform Operations Management Group:
- The Platform Operations Management Group, under the Root Management Group, was recommended for streamlining operations in a centralized manner. It involved the creation of distinct Management Groups for Identity, Management, and Connectivity, each dedicated to specific functions.
- A subscription named "Identity" was organized under the "Identity" Management Group. This subscription houses Domain Controllers that sync with the customer's On-premises Active Directory. Additionally, a Key Vault is placed in this subscription to efficiently manage secrets, keys, and certificates, ensuring periodic rotation for enhanced security.
- Another subscription named "Management" was carefully positioned under the "Management" Management Group. This subscription acts as a centralized hub for logging and monitoring services, including Log Analytics Workspace, Blob Storage, Dashboards, etc. It collects various logs, enhancing visibility, monitoring, and compliance capabilities.
- Furthermore, to streamline connectivity resources, the creation of a centralized connectivity subscription was recommended. This subscription is placed within the "Connectivity" Management Group and serves as a dedicated space for hosting the customer's networking resources, ensuring efficient management and organization.
- Workloads Management Group:
- The Workloads Management Group was strategically organized to handle diverse applications across various domain areas efficiently. It includes separate Non-prod and Prod Management Groups, allowing clear segregation of environments.
- Sandbox Management Group:
- The Sandbox Management Group was designed to provide a secure and controlled space for testing, prototyping, and experimenting with new applications or configurations. It will operate separately from production and non-production environments, promoting innovation, accelerating development cycles, and mitigating risks by isolating experimental work from critical systems.
- Suspended Management Group:
- The proposed Suspended Management Group serves as a dedicated space for decommissioned subscriptions. Resources from subscriptions no longer in active use are placed here to maintain compliance, auditability, and security protocols.
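The management group layout described above can be summarized as a small tree. The Python sketch below is illustrative only: placing Workloads, Sandbox, and Suspended directly under the Root Management Group is an assumption, since the text states the parent only for Platform Operations.

```python
# Parent -> children, following the management groups described above.
# Placing Workloads, Sandbox, and Suspended directly under Root is an assumption.
HIERARCHY = {
    "Root": ["Platform Operations", "Workloads", "Sandbox", "Suspended"],
    "Platform Operations": ["Identity", "Management", "Connectivity"],
    "Workloads": ["Non-prod", "Prod"],
}


def path_to(group: str, root: str = "Root") -> list[str]:
    """Return the chain of management groups from the root down to `group`."""
    if group == root:
        return [root]
    for child in HIERARCHY.get(root, []):
        sub = path_to(group, child)
        if sub:
            return [root] + sub
    return []  # group not found under this root
```

Because policy and role assignments made on a management group are inherited by everything beneath it, the path returned for a group shows every level at which a guardrail could apply to it.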
1.3 GCP Cloud Landing Zone – Design Solution
Resource Hierarchy
- The highest level of the resource hierarchy, containing all resources within it, is the organization.
- Intuitive.Cloud designed a landing zone folder, which is the primary folder within the organization containing the entirety of the Landing Zone.
- The following folders were designed under the Landing Zone folder:
- Infrastructure: The main purpose of designing this folder is to serve as a container for managing different aspects of infrastructure, such as networking.
- Workloads: This folder was designed for the clear separation and management of workloads across different environments, ensuring that development, testing, and production activities are well-defined and controlled.
- Audit & Security: The reason behind having this folder is to centralize audit and security processes.
- Additionally, the team also designed sandbox, automation, and shared service folders.
Centralized Logging
- Different kinds of logs, including activity logs, data access logs, and network logs, are gathered.
- Our team designed a log sink using Google Cloud's Logging service, directing logs from various services or resources to the centralized log bucket. Additionally, the logs were configured to be archived in a GCS bucket situated within the audit project for long-term storage.
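The sink design can be sketched as plain data. The example below is a rough illustration: the function and sink names are hypothetical, the `global` bucket location is an assumption, and mapping "activity, data access, and network logs" to the three log names shown is our interpretation. It only builds configuration dictionaries and does not call any Google Cloud API.

```python
def sink_configs(audit_project: str, log_bucket: str, archive_bucket: str) -> list[dict]:
    """Describe two sinks: one to a central log bucket, one to a GCS archive."""
    log_filter = " OR ".join(
        f'logName:"{suffix}"'
        for suffix in (
            "cloudaudit.googleapis.com%2Factivity",     # Admin Activity audit logs
            "cloudaudit.googleapis.com%2Fdata_access",  # Data Access audit logs
            "compute.googleapis.com%2Fvpc_flows",       # VPC Flow Logs
        )
    )
    return [
        {
            "name": "central-log-bucket-sink",  # hypothetical sink name
            "destination": (
                f"logging.googleapis.com/projects/{audit_project}"
                f"/locations/global/buckets/{log_bucket}"
            ),
            "filter": log_filter,
        },
        {
            "name": "gcs-archive-sink",  # hypothetical sink name
            "destination": f"storage.googleapis.com/{archive_bucket}",
            "filter": log_filter,
        },
    ]
```

Keeping both sinks on the same filter means the long-term archive holds the same records as the central log bucket, which simplifies later audits.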
RBAC
- Google Groups were provisioned from Azure Active Directory, and permissions were granted to these groups using Cloud IAM.
- Permissions were not granted to individual users; instead, they were allocated to Google Groups, adhering to the principle of least privilege.
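The group-only rule lends itself to a simple automated check. The sketch below assumes IAM bindings in the familiar `role` / `members` shape; the group addresses and the function name are hypothetical.

```python
def non_group_members(bindings: list[dict]) -> list[str]:
    """Return members that are not Google Groups, in order of appearance."""
    return [
        member
        for binding in bindings
        for member in binding["members"]
        if not member.startswith("group:")
    ]


# Hypothetical bindings that follow the group-only rule.
bindings = [
    {"role": "roles/logging.viewer", "members": ["group:secops@example.com"]},
    {"role": "roles/compute.networkAdmin", "members": ["group:netops@example.com"]},
]
```

An empty result means every grant goes through a group. Note that this strict version would also flag service accounts, so a practical check would likely allow `serviceAccount:` members explicitly.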
Guardrails:
- Intuitive.Cloud designed two types of guardrails:
- Detective Guardrails
- Preventive Guardrails
- Detective guardrails aid in post-incident analysis and detecting threats. Our team designed these guardrails with the help of the Security Command Center.
- Preventive guardrails are policies and controls designed to proactively restrict or block certain actions or configurations, minimizing the risk of security, compliance, or cost management issues before they occur. Our team designed these guardrails using various organization policies.
Budgets and Alerts:
- Budgets and alerts play a crucial role in managing project expenditures and ensuring timely notifications when budget thresholds are approached or exceeded. Threshold rules for budgets were defined, along with corresponding actions to be taken when these thresholds are met.
- A monitoring notification channel was designed to promptly notify stakeholders when certain budget limits are reached. Access to the billing and budget dashboards is restricted via IAM.
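The threshold logic behind such alerts is straightforward arithmetic. The following sketch uses illustrative threshold fractions (50%, 90%, and 100% of the budget); the actual thresholds and actions defined for the client are not stated in this case study.

```python
def crossed_thresholds(budget: float, spend: float, thresholds=(0.5, 0.9, 1.0)) -> list[float]:
    """Return the threshold fractions of `budget` that `spend` has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return [t for t in sorted(thresholds) if spend >= t * budget]
```

For example, with a budget of 1000 and current spend of 920, the 50% and 90% thresholds have been reached, so stakeholders would be notified before the budget is exhausted.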
1.4 Oracle Cloud Landing Zone – Design Solution
Resource Hierarchy
- Suggested a structured approach to compartmentalization, considering workload and best practices. This included dedicated compartments for network, security, workloads, and others.
- Suggested strengthening overall security by configuring separate security policies, access controls, and compliance measures for the "Prod" and "Non-Prod" compartments, providing tighter control, financial transparency, and optimized resource allocation.
Security
Cloud Guard
- Customized detector rules for each compartment in Oracle Cloud by setting targets per compartment instead of at the root. This allows rule severity to be defined according to specific use cases, giving compartments such as Prod, Non-Prod, and Test their own risk definitions.
- Suggested customized policies to restrict unauthorized changes in Cloud Guard, including deleting recipes and altering risk statuses.
Security Zones
- For Security Zone recipes, Oracle offers the Maximum Security Recipe, which contains curated policies for various resource types. These predefined policies streamline compliance with security standards.
- Extending these principles to individual compartments, suggested customized Security Zone policies to meet the unique security requirements of each compartment, enhancing the overall security framework of the Oracle Cloud environment.
Audit and service logs
- To enhance visibility and security through service logs, suggested using Service Connector Hub to archive logs in Object Storage beyond the service retention periods (365 days for audit logs, 6 months for service logs). Suggested custom policies for secure, tamper-proof audit log storage in log groups as well as in Object Storage buckets.
- Suggested using Log Analytics in OCI for centralized log management, providing powerful search tools, real-time alerts, and log retention policies to enhance monitoring, analysis, and compliance across various sources.
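Whether a log type needs archiving to Object Storage comes down to comparing the required retention with what the logging service keeps. The sketch below encodes the two retention windows mentioned above, approximating 6 months as 180 days; the function name and the example retention requirements are hypothetical.

```python
# Retention windows taken from the text; 6 months is approximated as 180 days.
SERVICE_RETENTION_DAYS = {"audit": 365, "service": 180}


def needs_archive(log_type: str, required_days: int) -> bool:
    """True when the required retention exceeds what the logging service keeps."""
    return required_days > SERVICE_RETENTION_DAYS[log_type]
```

Under these assumptions, a two-year requirement for audit logs would call for archiving, while a 90-day requirement for service logs would not.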
Governance
- For enhanced financial control, advised implementing budgets with customized alerts, ensuring timely notifications before limits are exceeded. Used cost-tracking tags in budget settings for targeted spending alerts, improving the customer's ability to manage and monitor expenses.
- Utilized resource tagging by using bulk editing with defined tags, enforcing Tag Defaults at the tenancy level, and implementing default tags to ensure consistent assignment during resource creation.
Results & Outcomes
The recommendations by the Intuitive.Cloud team will help the customer achieve the following results:
- Multi Account Resource Hierarchy: Implementing a robust multi-account resource hierarchy will ensure enhanced security isolation, simplified resource management, granular access control, and streamlined compliance efforts, fostering resilience, agility, and operational efficiency across diverse cloud environments while minimizing risks and ensuring regulatory compliance.
- Centralized Access and Privilege Management: Centralized access and privilege management will ensure enhanced control and security over accounts, paving the way for streamlined operations and reduced security risks.
- Organizational Policy and Compliance Enhancement: Security policies and governance mechanisms will ensure future compliance with evolving regulations and industry standards. This proactive approach will strengthen the security boundaries and foster a culture of continuous improvement and resilience. By enforcing centralized security policies, the organization will maintain consistency and mitigate security risks, enhancing its overall security posture.
- Enhanced Logging and Visibility: Centralized logging and audit log visibility will enable improved control over resources and access, enabling proactive threat detection and response mechanisms in the future.
- Streamlined Infrastructure Provisioning: Leveraging a centralized toolchain for infrastructure provisioning sets the stage for agile deployment processes and rapid scalability in the future, facilitating efficient resource management.
- Standardized Tagging and Naming Strategy: Future implementation of a standardized tagging and naming strategy for resource management is expected to optimize cost allocation, enhance resource tracking, and strengthen security monitoring efforts, ensuring resource efficiency and cost-effectiveness.
Lessons Learned
Designing a landing zone for a large-scale retail customer working across multiple clouds requires careful planning, execution, and ongoing management. The following lessons were learned while aligning the client’s multi-cloud environment with industry best practices:
- Standardization: Streamline operations by standardizing architectures, configurations, and processes across multiple clouds using advanced Infrastructure as Code (IaC) tools, ensuring consistency and compliance.
- Centralized Governance: Ensure uniform security, compliance, and operational policies through centralized governance. Apply robust policies, controls, and monitoring procedures universally across clouds while accommodating unique provider requirements.
- Security Focus: Prioritize security with IAM controls, encryption, network segmentation, and tailored threat detection solutions for each cloud provider's specific needs.
- Compliance Framework: Implement an efficient compliance framework, auditing configurations, monitoring logs, and enforcing standards using automated tools across all clouds.
Conclusion
In summary, the large-scale retail customer encountered several governance and security challenges as they managed their legacy cloud infrastructure. Through their close partnership with Intuitive.Cloud, they strengthened their Landing Zone infrastructure, leading to improved visibility and governance. Consequently, the client successfully built a resilient, secure, and compliant cloud environment, facilitating the acceleration of their cloud adoption initiatives across various platforms.