{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": "3.13.0"
    },
    "blog_metadata": {
      "topic": "Three full-stack data platforms in a weekend: what Fabric and Azure AI Foundry enable for rapid delivery",
      "slug": "three-full-stack-data-platforms-in-a-weekend-what-fabric-and",
      "generated_by": "LinkedIn Post Generator + Azure OpenAI",
      "generated_at": "2026-05-06T02:38:03.226Z"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Three full-stack data platforms in a weekend: what Fabric and Azure AI Foundry enable for rapid delivery\n",
        "\n",
        "This notebook turns the blog post into a hands-on validation flow. It focuses on the core claim that rapid delivery is increasingly a platform architecture outcome, especially when data landing, transformation, AI access, APIs, and governance are designed to work together.\n",
        "\n",
        "The examples below simulate a practical path from raw data ingestion to curated metrics, governed AI access, internal API wrapping, and lightweight governance checks."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "%pip install -q pandas pyarrow fastapi pydantic requests azure-identity"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import os\n",
        "import json\n",
        "from pathlib import Path\n",
        "\n",
        "import pandas as pd\n",
        "import requests\n",
        "from fastapi import FastAPI\n",
        "from pydantic import BaseModel\n",
        "from azure.identity import DefaultAzureCredential"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Platform architecture map\n",
        "\n",
        "The blog frames weekend delivery as a reduction in coordination drag across three connected platform layers: data ingestion, analytics and semantic access, and AI-enabled app workflows. This cell renders that architecture as a Python data structure so it can be inspected and validated in a notebook."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "platform_flow = {\n",
        "    \"Weekend delivery goal\": {\n",
        "        \"Platform 1: Data ingestion\": [\n",
        "            \"Fabric Data Factory pipelines\",\n",
        "            \"Lakehouse landing zone\",\n",
        "            \"Governed shortcuts / OneLake\"\n",
        "        ],\n",
        "        \"Platform 2: Analytics + semantic model\": [\n",
        "            \"Notebook or SQL transform\",\n",
        "            \"Warehouse / Lakehouse tables\",\n",
        "            \"Power BI semantic model\"\n",
        "        ],\n",
        "        \"Platform 3: AI-enabled app workflow\": [\n",
        "            \"Azure AI Foundry managed endpoint\",\n",
        "            \"Internal app or API\",\n",
        "            \"Entra ID token-based access\"\n",
        "        ],\n",
        "        \"Guardrails\": [\n",
        "            \"Span ingestion\",\n",
        "            \"Span analytics\",\n",
        "            \"Span AI workflow\"\n",
        "        ]\n",
        "    }\n",
        "}\n",
        "\n",
        "print(json.dumps(platform_flow, indent=2))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Fabric-style ingestion into a lakehouse-friendly Parquet layout\n",
        "\n",
        "This example validates the landing pattern described in the post. The goal is not sophistication, but a repeatable raw-data structure that downstream transforms and semantic models can consume without format churn."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import pandas as pd\n",
        "from pathlib import Path\n",
        "\n",
        "orders = pd.DataFrame(\n",
        "    [\n",
        "        {\"order_id\": 1001, \"customer\": \"Contoso\", \"amount\": 420.50},\n",
        "        {\"order_id\": 1002, \"customer\": \"Fabrikam\", \"amount\": 199.99},\n",
        "    ]\n",
        ")\n",
        "\n",
        "output = Path(\"Files/bronze/orders\")\n",
        "output.mkdir(parents=True, exist_ok=True)\n",
        "file_path = output / \"orders_2026_05_06.parquet\"\n",
        "orders.to_parquet(file_path, index=False)\n",
        "\n",
        "print(f\"Wrote {len(orders)} rows to {output}\")\n",
        "print(f\"File exists: {file_path.exists()}\")\n",
        "print(pd.read_parquet(file_path))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Lightweight curation into a KPI-ready dataset\n",
        "\n",
        "This step promotes raw operational records into a simple curated layer. The resulting dataset is closer to a semantic model, dashboard tile, or app-facing API payload than the original transaction-level landing data."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import pandas as pd\n",
        "\n",
        "bronze = pd.DataFrame(\n",
        "    [\n",
        "        {\"order_id\": 1001, \"customer\": \"Contoso\", \"amount\": 420.50},\n",
        "        {\"order_id\": 1002, \"customer\": \"Fabrikam\", \"amount\": 199.99},\n",
        "        {\"order_id\": 1003, \"customer\": \"Contoso\", \"amount\": 80.00},\n",
        "    ]\n",
        ")\n",
        "\n",
        "silver = (\n",
        "    bronze.groupby(\"customer\", as_index=False)\n",
        "    .agg(total_revenue=(\"amount\", \"sum\"), order_count=(\"order_id\", \"count\"))\n",
        "    .assign(avg_order_value=lambda df: df[\"total_revenue\"] / df[\"order_count\"])\n",
        ")\n",
        "\n",
        "print(silver.to_string(index=False))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## App-to-data-to-AI interaction pattern\n",
        "\n",
        "The post recommends a specific order of operations for internal AI-enabled apps: establish identity, read approved business context from curated data, call a governed AI endpoint, and log metadata rather than sensitive payloads. This cell expresses that sequence in executable Python form."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "interaction_sequence = [\n",
        "    {\"step\": 1, \"actor\": \"Internal App\", \"action\": \"Request token for AI resource\", \"target\": \"Entra ID\"},\n",
        "    {\"step\": 2, \"actor\": \"Entra ID\", \"action\": \"Return access token\", \"target\": \"Internal App\"},\n",
        "    {\"step\": 3, \"actor\": \"Internal App\", \"action\": \"Read approved business context\", \"target\": \"Fabric Data Product\"},\n",
        "    {\"step\": 4, \"actor\": \"Fabric Data Product\", \"action\": \"Return curated facts\", \"target\": \"Internal App\"},\n",
        "    {\"step\": 5, \"actor\": \"Internal App\", \"action\": \"POST prompt + context + bearer token\", \"target\": \"Governed AI Endpoint\"},\n",
        "    {\"step\": 6, \"actor\": \"Governed AI Endpoint\", \"action\": \"Return grounded response\", \"target\": \"Internal App\"},\n",
        "    {\"step\": 7, \"actor\": \"Internal App\", \"action\": \"Log request metadata only\", \"target\": \"Internal App\"},\n",
        "]\n",
        "\n",
        "for item in interaction_sequence:\n",
        "    print(f\"{item['step']}. {item['actor']} -> {item['target']}: {item['action']}\")"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Required environment variables for governed AI endpoint access\n",
        "\n",
        "The reusable client pattern expects these environment variables:\n",
        "\n",
        "- `AI_ENDPOINT`: Base URL for the governed AI service\n",
        "- `AI_SCOPE`: OAuth scope or resource identifier used to request a token\n",
        "\n",
        "In a real environment, the endpoint would be protected by Entra ID and the notebook or app would authenticate using a managed identity, service principal, or developer credentials."
      ]
    },
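    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a quick pre-flight check, the next cell reports whether those variables are present. This is a minimal sketch; `AI_ENDPOINT` and `AI_SCOPE` are placeholder names used throughout this notebook, not official SDK settings."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import os\n",
        "\n",
        "# Report which of the expected variables are set without printing their\n",
        "# values, so secrets never land in notebook output.\n",
        "for name in (\"AI_ENDPOINT\", \"AI_SCOPE\"):\n",
        "    status = \"set\" if os.environ.get(name) else \"missing (dry-run mode)\"\n",
        "    print(f\"{name}: {status}\")"
      ],
      "execution_count": null,
      "outputs": []
    },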
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Reusable governed AI client\n",
        "\n",
        "This example wraps token acquisition and endpoint invocation behind a small client class. To keep the notebook safe and runnable without live cloud dependencies, the code includes a dry-run fallback when environment variables are not set."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import os\n",
        "import requests\n",
        "from azure.identity import DefaultAzureCredential\n",
        "\n",
        "class GovernedAIClient:\n",
        "    def __init__(self, endpoint: str, scope: str):\n",
        "        self.endpoint = endpoint.rstrip(\"/\")\n",
        "        self.credential = DefaultAzureCredential()\n",
        "        self.scope = scope\n",
        "\n",
        "    def invoke(self, prompt: str, context: dict) -> dict:\n",
        "        token = self.credential.get_token(self.scope).token\n",
        "        headers = {\"Authorization\": f\"Bearer {token}\", \"Content-Type\": \"application/json\"}\n",
        "        payload = {\"input\": prompt, \"context\": context}\n",
        "        response = requests.post(f\"{self.endpoint}/invoke\", json=payload, headers=headers, timeout=30)\n",
        "        response.raise_for_status()\n",
        "        return response.json()\n",
        "\n",
        "endpoint = os.environ.get(\"AI_ENDPOINT\")\n",
        "scope = os.environ.get(\"AI_SCOPE\")\n",
        "\n",
        "if endpoint and scope:\n",
        "    client = GovernedAIClient(endpoint, scope)\n",
        "    try:\n",
        "        result = client.invoke(\"Summarize account risk\", {\"account_id\": \"A-1024\"})\n",
        "        print(result)\n",
        "    except Exception as e:\n",
        "        print(f\"Live invocation failed: {e}\")\n",
        "else:\n",
        "    print({\n",
        "        \"dry_run\": True,\n",
        "        \"message\": \"Set AI_ENDPOINT and AI_SCOPE to enable live invocation.\",\n",
        "        \"sample_request\": {\n",
        "            \"input\": \"Summarize account risk\",\n",
        "            \"context\": {\"account_id\": \"A-1024\"}\n",
        "        }\n",
        "    })"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Internal API wrapper for governed AI access\n",
        "\n",
        "Rather than letting every app call the model endpoint directly, the post recommends a small internal service boundary. This pattern centralizes approved context handling and reduces connector sprawl across Power Apps, internal web apps, and workflow tools."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "from fastapi import FastAPI\n",
        "from pydantic import BaseModel\n",
        "\n",
        "app = FastAPI()\n",
        "\n",
        "class AskRequest(BaseModel):\n",
        "    account_id: str\n",
        "    question: str\n",
        "\n",
        "@app.post(\"/copilot/ask\")\n",
        "def ask(req: AskRequest):\n",
        "    approved_context = {\"account_id\": req.account_id, \"source\": \"curated_customer_360\"}\n",
        "    answer = {\n",
        "        \"question\": req.question,\n",
        "        \"answer\": f\"Governed response for {req.account_id}\",\n",
        "        \"context_used\": approved_context,\n",
        "    }\n",
        "    return answer\n",
        "\n",
        "sample = AskRequest(account_id=\"A-1024\", question=\"What is the current supplier risk summary?\")\n",
        "print(ask(sample))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Governance check simulation: app registrations with Microsoft Graph access\n",
        "\n",
        "The original post includes a PowerShell example for reviewing app registrations with Graph permissions. Since this notebook is Python-based, the following cell simulates the same review logic using sample application metadata."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "applications = [\n",
        "    {\n",
        "        \"DisplayName\": \"Internal Risk App\",\n",
        "        \"AppId\": \"11111111-1111-1111-1111-111111111111\",\n",
        "        \"RequiredResourceAccess\": [\n",
        "            {\n",
        "                \"ResourceAppId\": \"00000003-0000-0000-c000-000000000000\",\n",
        "                \"ResourceAccess\": [\n",
        "                    {\"Id\": \"User.Read.All\"},\n",
        "                    {\"Id\": \"Directory.Read.All\"}\n",
        "                ]\n",
        "            }\n",
        "        ]\n",
        "    },\n",
        "    {\n",
        "        \"DisplayName\": \"Finance Dashboard\",\n",
        "        \"AppId\": \"22222222-2222-2222-2222-222222222222\",\n",
        "        \"RequiredResourceAccess\": []\n",
        "    }\n",
        "]\n",
        "\n",
        "graph_app_id = \"00000003-0000-0000-c000-000000000000\"\n",
        "review_rows = []\n",
        "\n",
        "for app_item in applications:\n",
        "    graph_access = [\n",
        "        rra for rra in app_item.get(\"RequiredResourceAccess\", [])\n",
        "        if rra.get(\"ResourceAppId\") == graph_app_id\n",
        "    ]\n",
        "    if graph_access:\n",
        "        permission_ids = []\n",
        "        for entry in graph_access:\n",
        "            permission_ids.extend([ra.get(\"Id\") for ra in entry.get(\"ResourceAccess\", [])])\n",
        "        review_rows.append({\n",
        "            \"DisplayName\": app_item[\"DisplayName\"],\n",
        "            \"AppId\": app_item[\"AppId\"],\n",
        "            \"GraphAccess\": \", \".join(permission_ids)\n",
        "        })\n",
        "\n",
        "print(pd.DataFrame(review_rows))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Governance check simulation: service principals without owners or preferred SSO mode\n",
        "\n",
        "The post also highlights the importance of catching identity posture issues early. This Python version simulates a review for service principals that either have no owners or are missing a preferred single sign-on mode."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "service_principals = [\n",
        "    {\n",
        "        \"DisplayName\": \"Supplier Risk API\",\n",
        "        \"AppId\": \"33333333-3333-3333-3333-333333333333\",\n",
        "        \"Owners\": [\"alice@contoso.com\"],\n",
        "        \"PreferredSingleSignOnMode\": \"oidc\"\n",
        "    },\n",
        "    {\n",
        "        \"DisplayName\": \"Legacy Workflow Connector\",\n",
        "        \"AppId\": \"44444444-4444-4444-4444-444444444444\",\n",
        "        \"Owners\": [],\n",
        "        \"PreferredSingleSignOnMode\": \"\"\n",
        "    },\n",
        "    {\n",
        "        \"DisplayName\": \"Analytics Automation\",\n",
        "        \"AppId\": \"55555555-5555-5555-5555-555555555555\",\n",
        "        \"Owners\": [],\n",
        "        \"PreferredSingleSignOnMode\": \"saml\"\n",
        "    }\n",
        "]\n",
        "\n",
        "rows = []\n",
        "for sp in service_principals:\n",
        "    owner_count = len(sp.get(\"Owners\", []))\n",
        "    preferred_sso = sp.get(\"PreferredSingleSignOnMode\", \"\")\n",
        "    needs_review = owner_count == 0 or not str(preferred_sso).strip()\n",
        "    rows.append({\n",
        "        \"DisplayName\": sp[\"DisplayName\"],\n",
        "        \"AppId\": sp[\"AppId\"],\n",
        "        \"OwnerCount\": owner_count,\n",
        "        \"PreferredSsoMode\": preferred_sso,\n",
        "        \"NeedsReview\": needs_review\n",
        "    })\n",
        "\n",
        "result = pd.DataFrame(rows)\n",
        "print(result[result[\"NeedsReview\"]].sort_values(\"DisplayName\").to_string(index=False))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## End-to-end rapid delivery pattern\n",
        "\n",
        "This final architecture sketch summarizes the blog's recommended operating model: land data quickly, curate it into reusable metrics, expose it through internal app experiences, connect to governed AI endpoints, and automate governance checks early rather than late."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "rapid_delivery_pattern = {\n",
        "    \"Rapid build in a weekend\": [\n",
        "        \"Fabric data landing\",\n",
        "        \"Curated metrics and semantic layer\",\n",
        "        \"Internal app experience\",\n",
        "        \"Azure AI Foundry endpoint\"\n",
        "    ],\n",
        "    \"Governance automation\": [\n",
        "        \"App registration review\",\n",
        "        \"Consent and owner checks\",\n",
        "        \"Environment security baseline\"\n",
        "    ],\n",
        "    \"Control links\": {\n",
        "        \"App registration review\": \"Internal app experience\",\n",
        "        \"Consent and owner checks\": \"Azure AI Foundry endpoint\",\n",
        "        \"Environment security baseline\": \"Fabric data landing\"\n",
        "    }\n",
        "}\n",
        "\n",
        "print(json.dumps(rapid_delivery_pattern, indent=2))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Summary\n",
        "\n",
        "This notebook validated the blog's central idea with runnable examples: integrated platforms reduce handoffs across ingestion, curation, AI access, APIs, and governance. The practical advantage is not just better tooling in isolation, but fewer control surfaces and fewer transitions before a business user can interact with something useful.\n",
        "\n",
        "## Next Steps\n",
        "\n",
        "- Replace the sample bronze and silver datasets with a real operational extract.\n",
        "- Connect the governed AI client to a live protected endpoint using managed identity or service principal auth.\n",
        "- Turn the FastAPI example into a deployed internal service with logging and policy enforcement.\n",
        "- Expand the governance simulations into real Microsoft Graph checks and approval workflows.\n",
        "- Measure where delivery time drops most in your environment: data prep, API exposure, or AI endpoint governance."
      ]
    }
  ]
}