{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": "3.13.0"
    },
    "blog_metadata": {
      "topic": "The enterprise case for data sovereignty in autonomous AI systems",
      "slug": "the-enterprise-case-for-data-sovereignty-in-autonomous-ai-sy",
      "generated_by": "LinkedIn Post Generator + Azure OpenAI",
      "generated_at": "2026-05-15T16:41:33.385Z"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# The enterprise case for data sovereignty in autonomous AI systems\n",
        "\n",
        "Autonomous AI raises the stakes of governance because these systems are no longer limited to reading data: they can retrieve across boundaries, trigger tools, and take action. This notebook turns the article's core ideas into hands-on validation patterns you can run locally in Python, focusing on identity, retrieval vs. action controls, auditability, and runtime enforcement.\n",
        "\n",
        "The goal is to make sovereignty concrete as an architectural control boundary: what can be accessed, what can be acted on, where requests are processed, and what evidence is produced for review."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "%pip install pandas"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "from dataclasses import dataclass, field\n",
        "from typing import Dict, Set, Callable, List\n",
        "from datetime import datetime, timezone\n",
        "import json\n",
        "import pandas as pd"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Architecture flow: governed data plane and constrained action plane\n",
        "\n",
        "The article argues for a simple but powerful operating model: users interact with an autonomous agent, the agent calls policy middleware before doing anything, retrieval is limited to approved data sources, actions are limited to approved enterprise tools, and both paths generate audit evidence.\n",
        "\n",
        "The next cell renders a lightweight text version of that architecture so you can trace the control flow without external diagram tooling."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "flow = {\n",
        "    'Business User': ['Autonomous Agent'],\n",
        "    'Autonomous Agent': ['Policy Middleware'],\n",
        "    'Policy Middleware': ['Permission Type', 'Data Sovereignty Rules'],\n",
        "    'Permission Type': ['Approved Data Sources', 'Approved Enterprise Tools'],\n",
        "    'Approved Data Sources': ['Audit Log'],\n",
        "    'Approved Enterprise Tools': ['Audit Log'],\n",
        "    'Audit Log': ['Governance Review']\n",
        "}\n",
        "\n",
        "for node, edges in flow.items():\n",
        "    print(f\"{node} -> {', '.join(edges)}\")"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Separate retrieval permissions from action permissions\n",
        "\n",
        "A core sovereignty principle is that access to data should not automatically imply authority to act. This example models retrieval and action permissions independently so you can verify that an agent may read from several sources while being allowed to invoke only a narrower set of tools."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Policy model separating retrieval permissions from action permissions.\n",
        "from dataclasses import dataclass, field\n",
        "from typing import Dict, Set\n",
        "\n",
        "@dataclass\n",
        "class SovereigntyPolicy:\n",
        "    retrieval: Dict[str, Set[str]] = field(default_factory=dict)\n",
        "    actions: Dict[str, Set[str]] = field(default_factory=dict)\n",
        "\n",
        "    def can_retrieve(self, actor: str, source: str) -> bool:\n",
        "        return source in self.retrieval.get(actor, set())\n",
        "\n",
        "    def can_act(self, actor: str, tool: str) -> bool:\n",
        "        return tool in self.actions.get(actor, set())\n",
        "\n",
        "policy = SovereigntyPolicy(\n",
        "    retrieval={\"finance-agent\": {\"eu-sales-lake\", \"policy-wiki\"}},\n",
        "    actions={\"finance-agent\": {\"ticketing-api\"}}\n",
        ")\n",
        "\n",
        "print('retrieve eu-sales-lake:', policy.can_retrieve('finance-agent', 'eu-sales-lake'))\n",
        "print('act ticketing-api:', policy.can_act('finance-agent', 'ticketing-api'))\n",
        "print('act crm-write-api:', policy.can_act('finance-agent', 'crm-write-api'))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Audit every invocation with region and decision metadata\n",
        "\n",
        "Sovereignty becomes real when it produces evidence. This example logs each attempted operation with timestamp, actor, operation, target, allow/deny result, and region so incident review can reconstruct what happened."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Audit logger for every tool invocation with region and decision metadata.\n",
        "from datetime import datetime, timezone\n",
        "import json\n",
        "\n",
        "def audit_event(actor: str, operation: str, target: str, allowed: bool, region: str) -> str:\n",
        "    event = {\n",
        "        'ts': datetime.now(timezone.utc).isoformat(),\n",
        "        'actor': actor,\n",
        "        'operation': operation,\n",
        "        'target': target,\n",
        "        'allowed': allowed,\n",
        "        'region': region,\n",
        "    }\n",
        "    line = json.dumps(event, separators=(',', ':'))\n",
        "    print(line)\n",
        "    return line\n",
        "\n",
        "logs = []\n",
        "logs.append(audit_event('finance-agent', 'retrieve', 'eu-sales-lake', True, 'EU'))\n",
        "logs.append(audit_event('finance-agent', 'act', 'crm-write-api', False, 'US'))\n",
        "\n",
        "pd.DataFrame([json.loads(x) for x in logs])"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Enforce sovereignty checks before retrieval or action execution\n",
        "\n",
        "This middleware pattern validates each request before execution. It checks whether the target is allowed and treats the request's region as part of the authorization decision, which helps prevent accidental cross-region drift."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Middleware enforcing sovereignty checks before retrieval or action execution.\n",
        "from dataclasses import dataclass\n",
        "\n",
        "@dataclass\n",
        "class Request:\n",
        "    actor: str\n",
        "    kind: str\n",
        "    target: str\n",
        "    region: str\n",
        "\n",
        "class Middleware:\n",
        "    def __init__(self, retrieval_allowed, action_allowed):\n",
        "        self.retrieval_allowed = retrieval_allowed\n",
        "        self.action_allowed = action_allowed\n",
        "\n",
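        "    def authorize(self, req: Request) -> bool:\n",
        "        # Region is part of the decision; unknown request kinds are denied (fail closed).\n",
        "        if req.region != 'EU':\n",
        "            return False\n",
        "        if req.kind == 'retrieve':\n",
        "            return self.retrieval_allowed(req.actor, req.target)\n",
        "        if req.kind == 'act':\n",
        "            return self.action_allowed(req.actor, req.target)\n",
        "        return False\n",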
        "\n",
        "mw = Middleware(lambda a, t: t == 'eu-sales-lake', lambda a, t: t == 'ticketing-api')\n",
        "\n",
        "requests = [\n",
        "    Request('finance-agent', 'retrieve', 'eu-sales-lake', 'EU'),\n",
        "    Request('finance-agent', 'retrieve', 'eu-sales-lake', 'US'),\n",
        "    Request('finance-agent', 'act', 'ticketing-api', 'EU'),\n",
        "    Request('finance-agent', 'act', 'crm-write-api', 'EU'),\n",
        "]\n",
        "\n",
        "results = []\n",
        "for req in requests:\n",
        "    allowed = mw.authorize(req)\n",
        "    results.append({**req.__dict__, 'allowed': allowed})\n",
        "    print(req, '=>', allowed)\n",
        "\n",
        "pd.DataFrame(results)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Wrap an autonomous agent with authorization and audit logging\n",
        "\n",
        "This example combines policy enforcement and audit logging in an agent wrapper. The agent logs every attempted call, executes only approved requests, and raises a clear error for denied actions."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Autonomous agent wrapper that logs every tool call and blocks unauthorized actions.\n",
        "from dataclasses import dataclass\n",
        "\n",
        "@dataclass\n",
        "class Request:\n",
        "    actor: str\n",
        "    kind: str\n",
        "    target: str\n",
        "    region: str\n",
        "\n",
        "def audit(actor, kind, target, allowed, region):\n",
        "    print(f\"{actor}|{kind}|{target}|allowed={allowed}|region={region}\")\n",
        "\n",
        "class Agent:\n",
        "    def __init__(self, authorize):\n",
        "        self.authorize = authorize\n",
        "\n",
        "    def invoke(self, req: Request) -> str:\n",
        "        allowed = self.authorize(req)\n",
        "        audit(req.actor, req.kind, req.target, allowed, req.region)\n",
        "        if not allowed:\n",
        "            raise PermissionError(f\"Denied: {req.kind} -> {req.target}\")\n",
        "        return f\"Executed {req.kind} on {req.target}\"\n",
        "\n",
        "# Allow-list of (kind, target) pairs; every request must also come from the EU region.\n",
        "ALLOWED = {('retrieve', 'eu-sales-lake'), ('act', 'ticketing-api')}\n",
        "agent = Agent(lambda r: r.region == 'EU' and (r.kind, r.target) in ALLOWED)\n",
        "\n",
        "print(agent.invoke(Request('finance-agent', 'act', 'ticketing-api', 'EU')))\n",
        "\n",
        "try:\n",
        "    print(agent.invoke(Request('finance-agent', 'act', 'crm-write-api', 'EU')))\n",
        "except PermissionError as e:\n",
        "    print('ERROR:', e)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Sequence of an allowed vs. denied request\n",
        "\n",
        "The article emphasizes that policy middleware should log attempted invocations before forwarding approved calls or returning a safe denial. This cell prints a simple sequence trace for both paths."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "def sequence_trace(allowed: bool):\n",
        "    print('User -> Agent: Goal / prompt')\n",
        "    print('Agent -> Policy Middleware: Request(kind=retrieve|act, target, region)')\n",
        "    print('Policy Middleware -> Audit Store: Log attempted invocation')\n",
        "    if allowed:\n",
        "        print('Policy Middleware -> Enterprise Tool: Forward approved call')\n",
        "        print('Enterprise Tool -> Agent: Result')\n",
        "        print('Agent -> User: Response')\n",
        "    else:\n",
        "        print('Policy Middleware -> Agent: PermissionError')\n",
        "        print('Agent -> User: Safe refusal with reason')\n",
        "\n",
        "print('--- Allowed path ---')\n",
        "sequence_trace(True)\n",
        "print('\\n--- Denied path ---')\n",
        "sequence_trace(False)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Minimal rule engine for region pinning and connector allow-lists\n",
        "\n",
        "Even a small policy can prevent a surprising amount of boundary drift. This example checks two things: whether the request and data stay in the same region, and whether the connector is on an approved allow-list."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Minimal sovereignty rule engine for region pinning and connector allow-list checks.\n",
        "def evaluate(request_region: str, data_region: str, connector: str) -> dict:\n",
        "    allowed_connectors = {'sharepoint-eu', 'ticketing-api'}\n",
        "    same_region = request_region == data_region\n",
        "    connector_ok = connector in allowed_connectors\n",
        "    return {\n",
        "        'same_region': same_region,\n",
        "        'connector_ok': connector_ok,\n",
        "        'allowed': same_region and connector_ok,\n",
        "    }\n",
        "\n",
        "tests = [\n",
        "    ('EU', 'EU', 'sharepoint-eu'),\n",
        "    ('EU', 'US', 'sharepoint-eu'),\n",
        "    ('EU', 'EU', 'crm-write-api'),\n",
        "]\n",
        "\n",
        "rows = []\n",
        "for request_region, data_region, connector in tests:\n",
        "    result = evaluate(request_region, data_region, connector)\n",
        "    rows.append({\n",
        "        'request_region': request_region,\n",
        "        'data_region': data_region,\n",
        "        'connector': connector,\n",
        "        **result,\n",
        "    })\n",
        "    print(result)\n",
        "\n",
        "pd.DataFrame(rows)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Simulate an identity inventory review in Python\n",
        "\n",
        "The original article includes PowerShell for exporting Azure application and service principal metadata. For notebook validation, this Python version creates a mock inventory, joins applications to service principals, and highlights ownership gaps that represent sovereignty debt."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "apps = pd.DataFrame([\n",
        "    {'DisplayName': 'Finance Agent App', 'AppId': 'app-001', 'Id': 'appobj-001', 'BusinessOwner': 'Finance Platform'},\n",
        "    {'DisplayName': 'HR Agent App', 'AppId': 'app-002', 'Id': 'appobj-002', 'BusinessOwner': None},\n",
        "    {'DisplayName': 'Ops Agent App', 'AppId': 'app-003', 'Id': 'appobj-003', 'BusinessOwner': 'IT Operations'},\n",
        "])\n",
        "\n",
        "spns = pd.DataFrame([\n",
        "    {'DisplayName': 'Finance Agent SP', 'AppId': 'app-001', 'Id': 'spn-001'},\n",
        "    {'DisplayName': 'HR Agent SP', 'AppId': 'app-002', 'Id': 'spn-002'},\n",
        "    {'DisplayName': 'Ops Agent SP', 'AppId': 'app-003', 'Id': 'spn-003'},\n",
        "])\n",
        "\n",
        "joined = apps.merge(spns[['AppId', 'Id']], on='AppId', how='left', suffixes=('_Application', '_ServicePrincipal'))\n",
        "joined = joined.rename(columns={'Id_Application': 'ApplicationObject', 'Id_ServicePrincipal': 'ServicePrincipal'})\n",
        "joined['OwnerMissing'] = joined['BusinessOwner'].isna()\n",
        "joined"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Simulate Microsoft 365 enterprise permission review\n",
        "\n",
        "The article recommends reviewing broad scopes and tenant-wide consent. This Python example builds a mock permission grant report so you can sort by risky consent patterns and identify apps that deserve immediate review."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "service_principals = pd.DataFrame([\n",
        "    {'Id': 'spn-001', 'DisplayName': 'Finance Agent'},\n",
        "    {'Id': 'spn-002', 'DisplayName': 'HR Agent'},\n",
        "    {'Id': 'spn-003', 'DisplayName': 'Ops Agent'},\n",
        "])\n",
        "\n",
        "grants = pd.DataFrame([\n",
        "    {'ClientId': 'spn-001', 'ResourceId': 'graph', 'ConsentType': 'Principal', 'Scope': 'Files.Read'},\n",
        "    {'ClientId': 'spn-002', 'ResourceId': 'graph', 'ConsentType': 'AllPrincipals', 'Scope': 'User.Read.All Directory.Read.All'},\n",
        "    {'ClientId': 'spn-003', 'ResourceId': 'servicenow', 'ConsentType': 'Principal', 'Scope': 'ticket.write'},\n",
        "])\n",
        "\n",
        "report = grants.merge(service_principals, left_on='ClientId', right_on='Id', how='left')\n",
        "report = report.rename(columns={'DisplayName': 'AppDisplayName'})[['AppDisplayName', 'ClientId', 'ResourceId', 'ConsentType', 'Scope']]\n",
        "report['BroadScope'] = report['Scope'].str.contains('All', regex=False)\n",
        "report['TenantWideConsent'] = report['ConsentType'].eq('AllPrincipals')\n",
        "report.sort_values(['TenantWideConsent', 'BroadScope'], ascending=False)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Simulate Power Platform environment and connector review\n",
        "\n",
        "Environment isolation and scoped connectors are key sovereignty controls. This example creates a mock environment inventory and shows how to spot production adjacency, region spread, and connector exposure."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "environments = pd.DataFrame([\n",
        "    {'EnvironmentName': 'env-dev', 'DisplayName': 'Development', 'Location': 'EU'},\n",
        "    {'EnvironmentName': 'env-test', 'DisplayName': 'Test', 'Location': 'EU'},\n",
        "    {'EnvironmentName': 'env-prod', 'DisplayName': 'Production', 'Location': 'US'},\n",
        "])\n",
        "\n",
        "connectors = pd.DataFrame([\n",
        "    {'EnvironmentName': 'env-dev', 'ConnectorName': 'sharepoint-eu', 'ConnectorId': 'con-001', 'Tier': 'Standard'},\n",
        "    {'EnvironmentName': 'env-dev', 'ConnectorName': 'crm-write-api', 'ConnectorId': 'con-002', 'Tier': 'Custom'},\n",
        "    {'EnvironmentName': 'env-test', 'ConnectorName': 'ticketing-api', 'ConnectorId': 'con-003', 'Tier': 'Premium'},\n",
        "    {'EnvironmentName': 'env-prod', 'ConnectorName': 'finance-write-api', 'ConnectorId': 'con-004', 'Tier': 'Custom'},\n",
        "])\n",
        "\n",
        "report = environments.merge(connectors, on='EnvironmentName', how='left')\n",
        "report = report.rename(columns={'DisplayName': 'EnvironmentDisplayName'})\n",
        "report['ProductionAdjacent'] = report['EnvironmentName'].eq('env-prod') | report['ConnectorName'].str.contains('write', case=False, na=False)\n",
        "report[['EnvironmentDisplayName', 'EnvironmentName', 'Location', 'ConnectorName', 'ConnectorId', 'Tier', 'ProductionAdjacent']]"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## End-to-end sovereignty validation scenario\n",
        "\n",
        "This final hands-on example combines policy separation, middleware checks, and audit logging into a single validation flow. It demonstrates approved retrieval, approved action, denied cross-region retrieval, and denied unapproved action."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "from dataclasses import dataclass, field\n",
        "from typing import Dict, List, Set\n",
        "from datetime import datetime, timezone\n",
        "import json\n",
        "import pandas as pd\n",
        "\n",
        "@dataclass\n",
        "class SovereigntyPolicy:\n",
        "    retrieval: Dict[str, Set[str]] = field(default_factory=dict)\n",
        "    actions: Dict[str, Set[str]] = field(default_factory=dict)\n",
        "\n",
        "    def can_retrieve(self, actor: str, source: str) -> bool:\n",
        "        return source in self.retrieval.get(actor, set())\n",
        "\n",
        "    def can_act(self, actor: str, tool: str) -> bool:\n",
        "        return tool in self.actions.get(actor, set())\n",
        "\n",
        "@dataclass\n",
        "class Request:\n",
        "    actor: str\n",
        "    kind: str\n",
        "    target: str\n",
        "    region: str\n",
        "\n",
        "class Middleware:\n",
        "    def __init__(self, policy: SovereigntyPolicy, required_region: str = 'EU'):\n",
        "        self.policy = policy\n",
        "        self.required_region = required_region\n",
        "        self.audit_log: List[dict] = []\n",
        "\n",
        "    def authorize(self, req: Request) -> bool:\n",
        "        # Check the permission that matches the request kind; unknown kinds are denied (fail closed).\n",
        "        if req.kind == 'retrieve':\n",
        "            permitted = self.policy.can_retrieve(req.actor, req.target)\n",
        "        elif req.kind == 'act':\n",
        "            permitted = self.policy.can_act(req.actor, req.target)\n",
        "        else:\n",
        "            permitted = False\n",
        "        allowed = permitted and req.region == self.required_region\n",
        "        self.audit_log.append({\n",
        "            'ts': datetime.now(timezone.utc).isoformat(),\n",
        "            'actor': req.actor,\n",
        "            'kind': req.kind,\n",
        "            'target': req.target,\n",
        "            'region': req.region,\n",
        "            'allowed': allowed,\n",
        "        })\n",
        "        return allowed\n",
        "\n",
        "class Agent:\n",
        "    def __init__(self, middleware: Middleware):\n",
        "        self.middleware = middleware\n",
        "\n",
        "    def invoke(self, req: Request) -> str:\n",
        "        if not self.middleware.authorize(req):\n",
        "            return f'DENIED: {req.kind} -> {req.target} ({req.region})'\n",
        "        return f'EXECUTED: {req.kind} -> {req.target} ({req.region})'\n",
        "\n",
        "policy = SovereigntyPolicy(\n",
        "    retrieval={'finance-agent': {'eu-sales-lake', 'policy-wiki'}},\n",
        "    actions={'finance-agent': {'ticketing-api'}}\n",
        ")\n",
        "\n",
        "middleware = Middleware(policy, required_region='EU')\n",
        "agent = Agent(middleware)\n",
        "\n",
        "scenario = [\n",
        "    Request('finance-agent', 'retrieve', 'eu-sales-lake', 'EU'),\n",
        "    Request('finance-agent', 'act', 'ticketing-api', 'EU'),\n",
        "    Request('finance-agent', 'retrieve', 'eu-sales-lake', 'US'),\n",
        "    Request('finance-agent', 'act', 'crm-write-api', 'EU'),\n",
        "]\n",
        "\n",
        "for req in scenario:\n",
        "    print(agent.invoke(req))\n",
        "\n",
        "pd.DataFrame(middleware.audit_log)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Summary\n",
        "\n",
        "This notebook illustrated the article's main claim with mock data: data sovereignty in autonomous AI is an architectural requirement, not just a legal concern. The practical controls that matter are separation of retrieval and action permissions, identity clarity, region-aware middleware enforcement, environment and connector governance, and audit evidence that can explain what the agent saw, why it acted, and which policy allowed or denied the step.\n",
        "\n",
        "## Next Steps\n",
        "\n",
        "- Inventory every agent, connector, and automation path that can read or write enterprise data.\n",
        "- Map each production agent to four boundaries: data, identity, action, and audit.\n",
        "- Add region-aware policy checks before execution, not after.\n",
        "- Require audit logs for every attempted invocation, including denied requests.\n",
        "- Review high-impact actions for approval gates, idempotency, bounded retries, circuit breakers, and safe degradation.\n",
        "- Rate your current posture from 1 to 5 and identify the weakest boundary today: data, identity, action, or audit."
      ]
    }
  ]
}