Architected Solutions

Bridging the gap between Supply Chain Operations and Cloud Engineering.

Serverless Data Lakehouse

Live Production

A production-grade data lakehouse designed to cost under $1.00 per month at idle and to scale without manual maintenance. The architecture ingests raw CloudFront edge logs, processes them with serverless ETL on AWS Glue, and stores them in Apache Iceberg format for ACID-compliant analytics.
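
To make the log-processing step more concrete, here is a minimal Python sketch of turning one CloudFront standard access log file into typed rows. It is an illustration only: the production job runs on Glue Spark, and the sample record, the `normalise` and `parse_log` helpers, and the shortened field list are assumptions, not the pipeline's actual code.

```python
import io

# Two header lines plus one record, shaped like a CloudFront standard access log.
# The values are invented for illustration; real logs carry many more fields.
SAMPLE_LOG = (
    "#Version: 1.0\n"
    "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status\n"
    "2024-01-15\t08:30:00\tFRA56-P1\t5120\t203.0.113.7\tGET\t"
    "d111111abcdef8.cloudfront.net\t/index.html\t200\n"
)

# Columns that should be stored as integers in the cleaned table (assumed selection).
INT_COLUMNS = ("sc_bytes", "sc_status")


def normalise(name):
    """Turn a log field name such as 'cs(Host)' into a column name such as 'cs_host'."""
    return name.lower().replace("(", "_").replace(")", "").replace("-", "_")


def parse_log(text):
    """Parse tab-separated log records, using the '#Fields:' header for column names."""
    columns = []
    rows = []
    for line in io.StringIO(text):
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Field names in the header are separated by spaces.
            columns = [normalise(f) for f in line[len("#Fields:"):].split()]
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines and other header lines
        row = dict(zip(columns, line.split("\t")))
        for key in INT_COLUMNS:
            if key in row:
                # CloudFront writes '-' when a value is not available.
                row[key] = None if row[key] == "-" else int(row[key])
        rows.append(row)
    return rows


rows = parse_log(SAMPLE_LOG)
print(rows[0]["cs_uri_stem"], rows[0]["sc_status"])
```

Reading column names from the `#Fields:` header, rather than hard-coding positions, keeps the sketch tolerant of fields being added to the log format.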

The Stack

  • Ingest: CloudFront + S3 (Real-time)
  • Storage: S3 + Apache Iceberg (Bronze raw logs, Silver Iceberg tables, Gold Parquet aggregations)
  • Compute: AWS Glue (Serverless Spark)
  • Orchestration: EventBridge (Scheduled ETL)
  • IaC: Terraform (100% Code)
  • Security: WAF + KMS + Geo-Restriction + OAC
  • DR: S3 CRR + Dual CloudFront + Lambda AssociateAlias
  • Observability: CloudWatch + SNS Alerts

Key Metrics

  • Cost: < $1.00 / month (Idle)
  • Latency: < 5 mins (Log to Insight)
  • Scale: Zero-maintenance auto-scaling

Architecture Diagram

flowchart TB
    subgraph Users
        Browser[Browser]
    end

    subgraph GitHub
        Repo[GitHub Repository]
        Actions[GitHub Actions]
    end

    subgraph AWS_Global["AWS Global"]
        WAF[AWS WAF<br/>Web ACL]
        Route53[Route53<br/>DNS]
        ACM[ACM Certificate<br/>us-east-1]
        CloudFront[CloudFront Pro<br/>Primary Distribution]
        CloudFrontDR[CloudFront Pro<br/>DR Distribution]
        LambdaDR[Lambda<br/>DR Failover]
        CWHealthAlarm[CloudWatch Alarm<br/>Health Check]
    end

    subgraph AWS_EU["AWS eu-central-1"]
        S3[S3 Bucket<br/>allaboutdata.eu]
        KMS[KMS<br/>Website CMK]
        KMSLake[KMS<br/>Datalake CMK]
    end

    subgraph AWS_DR["AWS eu-central-2 (DR)"]
        S3DR[S3 Bucket<br/>allaboutdata.eu-dr]
        KMSDR[KMS<br/>DR Customer Managed Key]
    end

    subgraph DataLake["AWS Data Lake"]
        S3Lake[S3 Data Lake Bucket]
        GlueCatalog[Glue Data Catalog<br/>Bronze Table]
        EventBridge[EventBridge<br/>Scheduler]
        GlueJob[Glue Spark Job<br/>Bronze → Silver]
        Iceberg[Iceberg Warehouse<br/>Silver Table]
        GoldJob[Glue Spark Job<br/>Silver → Gold]
        Gold[Gold Warehouse<br/>Parquet Aggregations]
        Athena[Athena<br/>SQL Analytics]
        LambdaDash[Lambda<br/>Dashboard Export]
        GlueDQ[Glue Data Quality<br/>DQDL Rulesets]
    end

    subgraph Observability["AWS Observability"]
        CWDashboard[CloudWatch<br/>Dashboard]
        CWAlarms[CloudWatch<br/>Alarms]
        SNS[SNS<br/>Email Alerts]
    end

    subgraph External
        GoogleWorkspace[Google Workspace<br/>Email]
    end

    %% User Flow
    Browser -->|HTTPS Request| Route53
    Route53 -->|DNS Resolution| CloudFront
    WAF -->|Protection| CloudFront
    WAF -->|Protection| CloudFrontDR
    CloudFront -->|Origin| S3
    CloudFrontDR -->|Origin| S3DR
    ACM -.->|TLS Certificate| CloudFront
    ACM -.->|TLS Certificate| CloudFrontDR

    %% DR Failover Flow
    Route53 -.->|Health Check| CWHealthAlarm
    CWHealthAlarm -->|SNS → ALARM/OK| LambdaDR
    LambdaDR -.->|AssociateAlias| CloudFront
    LambdaDR -.->|AssociateAlias| CloudFrontDR

    %% CI/CD Flow
    Repo -->|Push to main| Actions
    Actions -->|S3 Sync| S3
    Actions -->|Cache Invalidation| CloudFront

    %% Email (Google Workspace)
    Route53 -.->|MX Record| GoogleWorkspace

    %% Data Lake Flow
    CloudFront -->|Access Logs| S3Lake
    S3Lake -->|TSV Logs| GlueCatalog
    S3Lake -.->|S3 Events| EventBridge
    EventBridge -->|6h Trigger| GlueJob
    EventBridge -->|Daily Trigger| GoldJob
    GlueCatalog -->|Incremental Read| GlueJob
    GlueJob -->|Write Iceberg V2| Iceberg
    Iceberg -->|Delta Read| GoldJob
    GoldJob -->|Write Parquet 128MB| Gold
    Iceberg -->|Query| Athena
    Gold -->|Query| Athena

    %% Dashboard Export Flow
    GoldJob -.->|Job Complete| EventBridge
    EventBridge -->|Trigger| LambdaDash
    LambdaDash -->|Query Gold| Athena
    LambdaDash -->|Write JSON| S3

    %% Cross-Region Replication
    S3 -->|S3 CRR| S3DR

    %% Encryption
    KMS -.->|SSE-KMS| S3
    KMSDR -.->|SSE-KMS| S3DR
    KMSLake -.->|SSE-KMS| S3Lake

    %% Data Quality (Daily at 3 AM)
    EventBridge -->|Daily 3 AM| GlueDQ
    GlueDQ -->|Validate| Iceberg
    GlueDQ -->|Validate| Gold

    %% Observability Flow
    GlueJob -.->|Metrics| CWDashboard
    GoldJob -.->|Metrics| CWDashboard
    LambdaDash -.->|Metrics| CWDashboard
    CWAlarms -->|Alert| SNS
    SNS -->|Email| GoogleWorkspace

    %% Styling
    classDef aws fill:#FF9900,stroke:#232F3E,color:#232F3E
    classDef github fill:#24292E,stroke:#24292E,color:#fff
    classDef user fill:#4285F4,stroke:#1a73e8,color:#fff
    classDef external fill:#34A853,stroke:#1e8e3e,color:#fff
    classDef datalake fill:#8C4FFF,stroke:#232F3E,color:#fff
    classDef observability fill:#DD3522,stroke:#232F3E,color:#fff
    classDef security fill:#1A8FE3,stroke:#232F3E,color:#fff
    class Route53,ACM,CloudFront,CloudFrontDR,S3,S3DR aws
    class WAF,KMS,KMSDR,KMSLake security
    class Repo,Actions github
    class Browser user
    class GoogleWorkspace external
    class S3Lake,GlueCatalog,EventBridge,GlueJob,Iceberg,GoldJob,Gold,Athena,LambdaDash,GlueDQ datalake
    class CWDashboard,CWAlarms,CWHealthAlarm,SNS observability
    class LambdaDR datalake
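
The DR failover flow in the diagram (health-check alarm, SNS notification, Lambda, `AssociateAlias`) could be sketched roughly as follows. This is a hedged illustration, not the deployed function: the distribution IDs, the alias value, and the `choose_target` helper are hypothetical, and the only real API call assumed is boto3's CloudFront `associate_alias`, which moves an alternate domain name to the target distribution.

```python
import json

# Hypothetical identifiers: the real distribution IDs and alias are not given in the source.
PRIMARY_DISTRIBUTION_ID = "EPRIMARYEXAMPLE"
DR_DISTRIBUTION_ID = "EDREXAMPLE"
ALIAS = "allaboutdata.eu"


def choose_target(alarm_state):
    """Map a CloudWatch alarm state to the distribution that should own the alias."""
    if alarm_state == "ALARM":
        return DR_DISTRIBUTION_ID
    if alarm_state == "OK":
        return PRIMARY_DISTRIBUTION_ID
    return None  # e.g. INSUFFICIENT_DATA: leave routing untouched


def handler(event, context, cloudfront=None):
    """Lambda entry point for an SNS-delivered CloudWatch alarm notification."""
    if cloudfront is None:
        # Imported lazily so the routing logic can be exercised without AWS access.
        import boto3
        cloudfront = boto3.client("cloudfront")

    # SNS wraps the alarm notification as a JSON string in the Message field.
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    target = choose_target(message.get("NewStateValue"))
    if target is None:
        return {"moved": False}

    cloudfront.associate_alias(TargetDistributionId=target, Alias=ALIAS)
    return {"moved": True, "target": target}
```

Accepting the client as an optional argument is a design choice for this sketch: it lets the ALARM and OK paths be checked locally with a stub before anything touches a live distribution.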

Static Site on Cloudflare R2

Live Production

A cost-optimized static website served from Cloudflare R2 with long-lived CDN caching (7 days at the edge, 4 hours in the browser), deployed via a fully automated GitOps pipeline. Built as a benchmark comparison against the AWS S3 + CloudFront stack above, demonstrating R2's zero-egress pricing model.

The Stack

  • Storage: Cloudflare R2 (S3-compatible)
  • CDN: Cloudflare Global Network
  • Routing: Transform Rules (index rewrite)
  • Caching: Edge 7d + Browser 4h
  • IaC: Terraform (Cloudflare Provider v5)
  • CI/CD: GitHub Actions (aws s3 sync)
  • TLS: Cloudflare Managed Certificate
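
The routing and caching entries above can be illustrated with a small Python sketch. It only mimics the behaviour locally and is not the Terraform ruleset itself: `rewrite_path` and `cache_control` are hypothetical helpers, and applying the index rewrite to every path ending in `/` (not just the root) is an assumption.

```python
from datetime import timedelta

# TTLs taken from the stack list: 7 days at the edge, 4 hours in the browser.
EDGE_TTL = timedelta(days=7)
BROWSER_TTL = timedelta(hours=4)


def rewrite_path(path):
    """Mimic the index rewrite: a directory-style path is served from its index.html."""
    # Assumption: any path ending in '/' is rewritten, which includes the root '/'.
    return path + "index.html" if path.endswith("/") else path


def cache_control(ttl):
    """Render a TTL as a Cache-Control max-age directive, in seconds."""
    return f"max-age={int(ttl.total_seconds())}"


print(rewrite_path("/"))           # /index.html
print(cache_control(EDGE_TTL))     # max-age=604800
print(cache_control(BROWSER_TTL))  # max-age=14400
```

The 7-day edge TTL works out to 7 × 24 × 3600 = 604800 seconds, and the 4-hour browser TTL to 4 × 3600 = 14400 seconds.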

Key Metrics

  • Egress: $0.00 (R2 zero egress fees)
  • Deploy: Push-to-live via GitOps
  • Infra: 4 Terraform-managed resources

Architecture Diagram

flowchart TB
    subgraph DevWorkflow["Developer Workflow"]
        Dev[Developer]
        GitPush[Git Push to main]
    end

    subgraph GH["GitHub"]
        GHRepo[Repository]
        subgraph GHA["GitHub Actions"]
            Trigger[Trigger: push to main<br/>web/** or .github/**]
            S3Sync[aws s3 sync to R2<br/>S3-compatible API]
            CacheCtl[aws s3 cp<br/>Cache-Control: max-age=604800]
        end
    end

    subgraph CF["Cloudflare"]
        CFDNS[DNS Zone<br/>allaboutdata-test.uk]
        CFTLS[HTTPS / TLS<br/>Managed Certificate]
        CFCDN[Cloudflare CDN<br/>Global Edge Network]
        subgraph CFRules["Rulesets (Terraform)"]
            Rewrite[Transform Rule<br/>/ → /index.html]
            CacheRule[Cache Rule<br/>Edge: 7d / Browser: 4h]
        end
        subgraph CFR2["R2 Object Storage"]
            Bucket[R2 Bucket<br/>WEUR Region]
        end
    end

    subgraph IaC["Infrastructure as Code"]
        TF[Terraform<br/>Cloudflare Provider v5]
    end

    User[End User<br/>Browser]

    %% Deploy Flow
    Dev --> GitPush
    GitPush --> GHRepo
    GHRepo --> Trigger
    Trigger --> S3Sync
    S3Sync --> CacheCtl
    CacheCtl -->|S3-compatible endpoint| Bucket

    %% Terraform Provisioning
    TF -->|Provisions| Bucket
    TF -->|Provisions| CFDNS
    TF -->|Provisions| Rewrite
    TF -->|Provisions| CacheRule

    %% Request Flow
    User -->|HTTPS Request| CFDNS
    CFDNS --> CFTLS
    CFTLS --> CFCDN
    CFCDN --> Rewrite
    Rewrite --> CacheRule
    CacheRule -->|Cache MISS| Bucket
    CacheRule -->|Cache HIT| User
    Bucket -->|Origin Response| CFCDN
    CFCDN -->|Response| User

    %% Styling
    classDef cloudflare fill:#F6821F,stroke:#F6821F,color:#fff
    classDef github fill:#24292E,stroke:#24292E,color:#fff
    classDef user fill:#4285F4,stroke:#1a73e8,color:#fff
    classDef terraform fill:#7B42BC,stroke:#5C2D91,color:#fff
    classDef storage fill:#F6821F,stroke:#E05D00,color:#fff
    class CFDNS,CFTLS,CFCDN,Rewrite,CacheRule cloudflare
    class Bucket storage
    class GHRepo,Trigger,S3Sync,CacheCtl github
    class User,Dev,GitPush user
    class TF terraform

Project: Financial Breakdown of AWS vs. Cloudflare

Coming Soon

A detailed cost comparison of the two static website hosting approaches above: AWS S3 + CloudFront and Cloudflare R2.

Project: Dr. Calpice

Coming Soon

Supply Chain Data Cleaning & Anomaly Detection Pipeline.
