{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": "3.13.0"
    },
    "blog_metadata": {
      "topic": "The Real Enterprise Value of Agentic Coding with Azure Cosmos DB",
      "slug": "the-real-enterprise-value-of-agentic-coding-with-azure-cosmo",
      "generated_by": "LinkedIn Post Generator + Azure OpenAI",
      "generated_at": "2026-06-19T18:10:47.732Z"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# The Real Enterprise Value of Agentic Coding with Azure Cosmos DB\n",
        "\n",
        "This notebook turns the blog post into a hands-on validation flow focused on durable state for agentic systems. The goal is not to make an agent smarter, but to validate patterns for memory, checkpointing, partition-aware access, and secure connectivity with Azure Cosmos DB."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "%pip install azure-cosmos azure-identity"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "from azure.cosmos import CosmosClient, PartitionKey, exceptions\n",
        "from azure.identity import DefaultAzureCredential\n",
        "import os\n",
        "import json\n",
        "from pprint import pprint"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Architecture mental model\n",
        "\n",
        "This cell captures the core architecture from the post: agent requests write to conversation state, tool outputs, and workflow checkpoints, all backed by Cosmos DB. The key design choice is partitioning, because it directly affects latency, cost, and recovery behavior."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "architecture = {\n",
        "    \"flow\": [\n",
        "        \"Agent Request -> Conversation State Container\",\n",
        "        \"Agent Request -> Tool Output Container\",\n",
        "        \"Agent Request -> Workflow Checkpoints Container\",\n",
        "        \"Containers -> Azure Cosmos DB\",\n",
        "        \"Azure Cosmos DB -> Resume / Replay\",\n",
        "        \"Resume / Replay -> Agent Continues Safely\"\n",
        "    ],\n",
        "    \"partitioning\": {\n",
        "        \"tenant\": \"tenantId\",\n",
        "        \"conversation\": \"tenantId#conversationId\",\n",
        "        \"workflow\": \"tenantId#workflowId\"\n",
        "    }\n",
        "}\n",
        "\n",
        "pprint(architecture)"
      ],
      "execution_count": null,
      "outputs": []
    },
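    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The partitioning map above can be made concrete with a small helper. This is a minimal sketch rather than the post's implementation: the function name `build_pk` and its validation rule are assumptions introduced for illustration. It joins a tenant ID with a conversation or workflow ID using the `#` separator shown above, and rejects components that already contain the separator so a key can always be split back unambiguously."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "SEPARATOR = \"#\"  # assumed separator, matching the tenantId#conversationId pattern above\n",
        "\n",
        "def build_pk(tenant_id: str, scope_id: str) -> str:\n",
        "    \"\"\"Join tenant and scope (conversation or workflow) IDs into one partition key value.\"\"\"\n",
        "    for part in (tenant_id, scope_id):\n",
        "        # Empty parts or parts containing the separator would make the key ambiguous.\n",
        "        if not part or SEPARATOR in part:\n",
        "            raise ValueError(f\"Invalid partition key component: {part!r}\")\n",
        "    return f\"{tenant_id}{SEPARATOR}{scope_id}\"\n",
        "\n",
        "print(build_pk(\"tenantA\", \"conv42\"))\n",
        "print(build_pk(\"tenantA\", \"wf-42\"))"
      ],
      "execution_count": null,
      "outputs": []
    },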
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Required environment variables\n",
        "\n",
        "Set these before running the live Cosmos DB examples:\n",
        "\n",
        "- `COSMOS_ENDPOINT` - Cosmos DB account endpoint\n",
        "- `COSMOS_KEY` - Cosmos DB key for key-based auth examples\n",
        "- `COSMOS_DATABASE` - optional, defaults to `agentdb`\n",
        "\n",
        "For managed identity examples, run from an Azure-hosted environment with an assigned managed identity and appropriate Cosmos DB data-plane permissions."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Define containers and document shapes\n",
        "\n",
        "This example creates the three baseline containers from the post: `conversations`, `toolOutputs`, and `checkpoints`. It also defines representative document shapes and TTL policies, which encode retention intent directly in the operational state model."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "endpoint = os.getenv(\"COSMOS_ENDPOINT\", \"https://example.documents.azure.com:443/\")\n",
        "key = os.getenv(\"COSMOS_KEY\", \"REPLACE_WITH_KEY\")\n",
        "database_name = os.getenv(\"COSMOS_DATABASE\", \"agentdb\")\n",
        "\n",
        "print({\"endpoint\": endpoint, \"database\": database_name, \"using_placeholder_key\": key == \"REPLACE_WITH_KEY\"})\n",
        "\n",
        "# Uncomment to run against a real Cosmos DB account.\n",
        "# client = CosmosClient(endpoint, credential=key)\n",
        "# db = client.create_database_if_not_exists(id=database_name)\n",
        "# db.create_container_if_not_exists(id=\"conversations\", partition_key=PartitionKey(path=\"/pk\"), default_ttl=604800)\n",
        "# db.create_container_if_not_exists(id=\"toolOutputs\", partition_key=PartitionKey(path=\"/pk\"), default_ttl=259200)\n",
        "# db.create_container_if_not_exists(id=\"checkpoints\", partition_key=PartitionKey(path=\"/pk\"), default_ttl=-1)\n",
        "\n",
        "conversation_doc = {\n",
        "    \"id\": \"msg-001\",\n",
        "    \"pk\": \"tenantA#conv42\",\n",
        "    \"tenantId\": \"tenantA\",\n",
        "    \"conversationId\": \"conv42\",\n",
        "    \"role\": \"user\",\n",
        "    \"content\": \"Summarize QBR notes\",\n",
        "    \"ttl\": 604800\n",
        "}\n",
        "\n",
        "tool_output_doc = {\n",
        "    \"id\": \"tool-001\",\n",
        "    \"pk\": \"tenantA#conv42\",\n",
        "    \"toolName\": \"search\",\n",
        "    \"inputHash\": \"sha256:abc\",\n",
        "    \"result\": {\"hits\": 3},\n",
        "    \"ttl\": 259200\n",
        "}\n",
        "\n",
        "checkpoint_doc = {\n",
        "    \"id\": \"wf-42#step-2\",\n",
        "    \"pk\": \"tenantA#wf-42\",\n",
        "    \"workflowId\": \"wf-42\",\n",
        "    \"step\": 2,\n",
        "    \"status\": \"completed\",\n",
        "    \"resumeToken\": \"next:3\"\n",
        "}\n",
        "\n",
        "print(\"Sample conversation document:\")\n",
        "pprint(conversation_doc)\n",
        "print(\"\\nSample tool output document:\")\n",
        "pprint(tool_output_doc)\n",
        "print(\"\\nSample checkpoint document:\")\n",
        "pprint(checkpoint_doc)"
      ],
      "execution_count": null,
      "outputs": []
    },
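    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The TTL values above are in seconds: `604800` is 7 days and `259200` is 3 days. The sketch below models how a container default and a per-item `ttl` combine, following the documented Cosmos DB rules: an item-level `ttl` only takes effect when TTL is enabled on the container, and `-1` means no expiry. The helper `effective_ttl` is a hypothetical name used for illustration, not an SDK function."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "def effective_ttl(container_default, item_ttl=None):\n",
        "    \"\"\"Return the lifetime in seconds, or None if the item never expires.\n",
        "\n",
        "    Hypothetical helper mirroring the documented TTL rules; not part of the SDK.\n",
        "    \"\"\"\n",
        "    if container_default is None:\n",
        "        return None  # TTL disabled on the container: item-level ttl is ignored\n",
        "    ttl = container_default if item_ttl is None else item_ttl\n",
        "    return None if ttl == -1 else ttl\n",
        "\n",
        "SECONDS_PER_DAY = 86400\n",
        "print(effective_ttl(604800) / SECONDS_PER_DAY)      # conversations default, in days\n",
        "print(effective_ttl(-1))                            # checkpoints: no expiry by default\n",
        "print(effective_ttl(-1, 259200) / SECONDS_PER_DAY)  # per-item override, in days"
      ],
      "execution_count": null,
      "outputs": []
    },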
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Partition-aware CRUD for conversation state\n",
        "\n",
        "This example demonstrates the operational pattern the post emphasizes: keep reads and writes scoped to a single partition key whenever possible. That is how conversation history stays predictable in both latency and RU consumption."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "endpoint = os.getenv(\"COSMOS_ENDPOINT\", \"https://example.documents.azure.com:443/\")\n",
        "key = os.getenv(\"COSMOS_KEY\", \"REPLACE_WITH_KEY\")\n",
        "database_name = os.getenv(\"COSMOS_DATABASE\", \"agentdb\")\n",
        "pk = \"tenantA#conv42\"\n",
        "\n",
        "print({\"endpoint\": endpoint, \"database\": database_name, \"partition_key\": pk, \"live_run_ready\": key != \"REPLACE_WITH_KEY\"})\n",
        "\n",
        "item = {\n",
        "    \"id\": \"msg-002\",\n",
        "    \"pk\": pk,\n",
        "    \"role\": \"assistant\",\n",
        "    \"content\": \"Here is the summary.\",\n",
        "    \"sequence\": 2\n",
        "}\n",
        "\n",
        "query = \"SELECT * FROM c WHERE c.pk = @pk ORDER BY c.sequence\"\n",
        "params = [{\"name\": \"@pk\", \"value\": pk}]\n",
        "\n",
        "print(\"Prepared item:\")\n",
        "pprint(item)\n",
        "print(\"\\nPrepared query:\")\n",
        "print(query)\n",
        "print(params)\n",
        "\n",
        "# Uncomment to run against a real Cosmos DB account.\n",
        "# client = CosmosClient(endpoint, credential=key)\n",
        "# container = client.get_database_client(database_name).get_container_client(\"conversations\")\n",
        "# container.upsert_item(item)\n",
        "# read_back = container.read_item(item=\"msg-002\", partition_key=pk)\n",
        "# read_back[\"content\"] = \"Here is the revised summary.\"\n",
        "# container.replace_item(item=read_back[\"id\"], body=read_back)\n",
        "# messages = list(container.query_items(query=query, parameters=params, partition_key=pk))\n",
        "# print([m[\"content\"] for m in messages])\n",
        "# container.delete_item(item=\"msg-002\", partition_key=pk)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Idempotent checkpoint creation with dedupe keys\n",
        "\n",
        "Long-running workflows need duplicate-safe execution. This pattern treats duplicate delivery as a normal case by using a deterministic checkpoint ID, allowing the system to detect whether a step has already started or completed."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "tenant_id, workflow_id, step = \"tenantA\", \"wf-42\", 3\n",
        "pk = f\"{tenant_id}#{workflow_id}\"\n",
        "dedupe_id = f\"{workflow_id}#step-{step}#tool-search#sha256:abc123\"\n",
        "\n",
        "checkpoint = {\n",
        "    \"id\": dedupe_id,\n",
        "    \"pk\": pk,\n",
        "    \"workflowId\": workflow_id,\n",
        "    \"step\": step,\n",
        "    \"status\": \"started\"\n",
        "}\n",
        "\n",
        "print(\"Checkpoint payload:\")\n",
        "pprint(checkpoint)\n",
        "\n",
        "# Local simulation of idempotency behavior.\n",
        "seen = {}\n",
        "if dedupe_id not in seen:\n",
        "    seen[dedupe_id] = checkpoint\n",
        "    print(\"First execution: proceed with work\")\n",
        "else:\n",
        "    existing = seen[dedupe_id]\n",
        "    print(f\"Duplicate detected: status={existing['status']}\")\n",
        "\n",
        "# Uncomment to run against a real Cosmos DB account.\n",
        "# endpoint = os.getenv(\"COSMOS_ENDPOINT\")\n",
        "# key = os.getenv(\"COSMOS_KEY\")\n",
        "# database_name = os.getenv(\"COSMOS_DATABASE\", \"agentdb\")\n",
        "# client = CosmosClient(endpoint, credential=key)\n",
        "# container = client.get_database_client(database_name).get_container_client(\"checkpoints\")\n",
        "# try:\n",
        "#     container.create_item(checkpoint)\n",
        "#     print(\"First execution: proceed with work\")\n",
        "# except exceptions.CosmosResourceExistsError:\n",
        "#     existing = container.read_item(item=dedupe_id, partition_key=pk)\n",
        "#     print(f\"Duplicate detected: status={existing['status']}\")"
      ],
      "execution_count": null,
      "outputs": []
    },
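    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `sha256:abc123` suffix above stands in for a hash of the tool input. One way to derive it deterministically is sketched below using only the standard library. The function name `make_dedupe_id`, the `_` separator, and the truncation to 16 hex characters are assumptions for illustration. The property that matters is that identical inputs always yield the same ID, regardless of dictionary key order."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import hashlib\n",
        "import json\n",
        "\n",
        "def make_dedupe_id(workflow_id, step, tool_name, tool_input):\n",
        "    \"\"\"Build a deterministic checkpoint ID from the step and a hash of the tool input.\"\"\"\n",
        "    # sort_keys and fixed separators give a canonical JSON form for hashing.\n",
        "    canonical = json.dumps(tool_input, sort_keys=True, separators=(\",\", \":\"))\n",
        "    digest = hashlib.sha256(canonical.encode(\"utf-8\")).hexdigest()\n",
        "    # Digest is truncated for readability only (an assumption of this sketch).\n",
        "    return f\"{workflow_id}_step-{step}_tool-{tool_name}_sha256:{digest[:16]}\"\n",
        "\n",
        "first = make_dedupe_id(\"wf-42\", 3, \"search\", {\"q\": \"QBR notes\", \"top\": 3})\n",
        "retry = make_dedupe_id(\"wf-42\", 3, \"search\", {\"top\": 3, \"q\": \"QBR notes\"})\n",
        "print(first)\n",
        "print(first == retry)  # same input in a different key order gives the same ID"
      ],
      "execution_count": null,
      "outputs": []
    },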
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Resume from the latest completed checkpoint\n",
        "\n",
        "Recovery should start from persisted workflow state rather than logs or manual reconstruction. This example queries the latest completed checkpoint for a workflow and computes the next step and resume token."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "pk = \"tenantA#wf-42\"\n",
        "query = \"\"\"\n",
        "SELECT TOP 1 c.step, c.resumeToken\n",
        "FROM c WHERE c.pk = @pk AND c.status = 'completed'\n",
        "ORDER BY c.step DESC\n",
        "\"\"\"\n",
        "params = [{\"name\": \"@pk\", \"value\": pk}]\n",
        "\n",
        "print(\"Prepared resume query:\")\n",
        "print(query)\n",
        "print(params)\n",
        "\n",
        "# Local simulation.\n",
        "mock_checkpoints = [\n",
        "    {\"pk\": pk, \"step\": 1, \"status\": \"completed\", \"resumeToken\": \"next:2\"},\n",
        "    {\"pk\": pk, \"step\": 2, \"status\": \"completed\", \"resumeToken\": \"next:3\"},\n",
        "    {\"pk\": pk, \"step\": 3, \"status\": \"started\", \"resumeToken\": None}\n",
        "]\n",
        "completed = [c for c in mock_checkpoints if c[\"pk\"] == pk and c[\"status\"] == \"completed\"]\n",
        "latest = sorted(completed, key=lambda x: x[\"step\"], reverse=True)[:1]\n",
        "next_step = 1 if not latest else latest[0][\"step\"] + 1\n",
        "resume_token = None if not latest else latest[0][\"resumeToken\"]\n",
        "print({\"next_step\": next_step, \"resume_token\": resume_token})\n",
        "\n",
        "# Uncomment to run against a real Cosmos DB account.\n",
        "# endpoint = os.getenv(\"COSMOS_ENDPOINT\")\n",
        "# key = os.getenv(\"COSMOS_KEY\")\n",
        "# database_name = os.getenv(\"COSMOS_DATABASE\", \"agentdb\")\n",
        "# client = CosmosClient(endpoint, credential=key)\n",
        "# container = client.get_database_client(database_name).get_container_client(\"checkpoints\")\n",
        "# latest = list(container.query_items(query=query, parameters=params, partition_key=pk))\n",
        "# next_step = 1 if not latest else latest[0][\"step\"] + 1\n",
        "# resume_token = None if not latest else latest[0][\"resumeToken\"]\n",
        "# print({\"next_step\": next_step, \"resume_token\": resume_token})"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Checkpoint-and-resume sequence as executable state transitions\n",
        "\n",
        "The post also describes the workflow as a sequence: create a started checkpoint, execute the tool, persist outputs, then mark the checkpoint completed. This cell models that sequence in Python so you can validate the control flow before wiring it to live services."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "state = {\n",
        "    \"agent\": \"Agent Service\",\n",
        "    \"checkpoint_container\": [],\n",
        "    \"conversation_container\": []\n",
        "}\n",
        "\n",
        "dedupe_key = \"wf-42#step-3#tool-search#sha256:abc123\"\n",
        "existing_ids = {doc[\"id\"] for doc in state[\"checkpoint_container\"]}\n",
        "\n",
        "if dedupe_key not in existing_ids:\n",
        "    started = {\"id\": dedupe_key, \"status\": \"started\", \"resumeToken\": None}\n",
        "    state[\"checkpoint_container\"].append(started)\n",
        "    tool_result = {\"hits\": 3, \"items\": [\"doc1\", \"doc2\", \"doc3\"]}\n",
        "    state[\"conversation_container\"].append({\n",
        "        \"id\": \"msg-tool-3\",\n",
        "        \"pk\": \"tenantA#conv42\",\n",
        "        \"role\": \"tool\",\n",
        "        \"content\": json.dumps(tool_result)\n",
        "    })\n",
        "    started[\"status\"] = \"completed\"\n",
        "    started[\"resumeToken\"] = \"next:4\"\n",
        "    print(\"First execution completed\")\n",
        "else:\n",
        "    existing = next(doc for doc in state[\"checkpoint_container\"] if doc[\"id\"] == dedupe_key)\n",
        "    print(\"Duplicate delivery\", existing)\n",
        "\n",
        "pprint(state)"
      ],
      "execution_count": null,
      "outputs": []
    },
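    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Marking a checkpoint completed is a read-modify-write, so two workers can race on the same document. ETag-based optimistic concurrency guards against that. The sketch below simulates the idea locally with a hypothetical in-memory `FakeContainer`; the class and its `replace_if_match` method are illustrative and are not SDK APIs. The commented lines show the assumed shape of the equivalent conditional replace in `azure-cosmos` 4.x; verify the keyword arguments against the SDK version you install."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import uuid\n",
        "\n",
        "class FakeContainer:\n",
        "    \"\"\"Hypothetical in-memory stand-in that assigns a new _etag on every write.\"\"\"\n",
        "\n",
        "    def __init__(self):\n",
        "        self.docs = {}\n",
        "\n",
        "    def upsert(self, doc):\n",
        "        stored = {**doc, \"_etag\": uuid.uuid4().hex}\n",
        "        self.docs[doc[\"id\"]] = stored\n",
        "        return dict(stored)\n",
        "\n",
        "    def replace_if_match(self, doc, etag):\n",
        "        # Reject the write if the stored document changed after the caller read it.\n",
        "        if self.docs[doc[\"id\"]][\"_etag\"] != etag:\n",
        "            raise RuntimeError(\"Precondition failed: document changed since it was read\")\n",
        "        return self.upsert(doc)\n",
        "\n",
        "fake = FakeContainer()\n",
        "original = fake.upsert({\"id\": \"wf-42_step-3\", \"status\": \"started\"})\n",
        "\n",
        "# Worker A completes the step using the ETag it read.\n",
        "fake.replace_if_match(dict(original, status=\"completed\"), etag=original[\"_etag\"])\n",
        "\n",
        "# Worker B still holds the stale ETag, so its write is rejected.\n",
        "try:\n",
        "    fake.replace_if_match(dict(original, status=\"failed\"), etag=original[\"_etag\"])\n",
        "except RuntimeError as err:\n",
        "    print(err)\n",
        "\n",
        "print(fake.docs[\"wf-42_step-3\"][\"status\"])\n",
        "\n",
        "# Assumed shape of the live call; a stale ETag surfaces as an HTTP 412 error.\n",
        "# from azure.core import MatchConditions\n",
        "# container.replace_item(item=doc[\"id\"], body=doc, etag=doc[\"_etag\"], match_condition=MatchConditions.IfNotModified)"
      ],
      "execution_count": null,
      "outputs": []
    },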
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Required environment variables for managed identity validation\n",
        "\n",
        "For this example, you need:\n",
        "\n",
        "- `COSMOS_ENDPOINT` - Cosmos DB account endpoint\n",
        "- Azure-hosted runtime with a managed identity enabled\n",
        "- Cosmos DB data-plane permissions granted to that identity\n",
        "\n",
        "No embedded key is required."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Use managed identity with Cosmos DB\n",
        "\n",
        "This example shows the preferred enterprise access pattern from the post: use managed identity instead of embedded secrets. The notebook keeps the live write commented so you can validate the code shape safely before enabling it in Azure."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "endpoint = os.getenv(\"COSMOS_ENDPOINT\", \"https://cosmosagentdemo1234.documents.azure.com:443/\")\n",
        "database_name = os.getenv(\"COSMOS_DATABASE\", \"agentdb\")\n",
        "item = {\n",
        "    \"id\": \"msg-100\",\n",
        "    \"pk\": \"tenantA#conv99\",\n",
        "    \"role\": \"system\",\n",
        "    \"content\": \"Managed identity access works\"\n",
        "}\n",
        "\n",
        "print({\"endpoint\": endpoint, \"database\": database_name})\n",
        "pprint(item)\n",
        "\n",
        "# Uncomment to run from an Azure-hosted environment with managed identity configured.\n",
        "# credential = DefaultAzureCredential()\n",
        "# client = CosmosClient(endpoint, credential=credential)\n",
        "# container = client.get_database_client(database_name).get_container_client(\"conversations\")\n",
        "# container.upsert_item(item)\n",
        "# print(container.read_item(item=\"msg-100\", partition_key=\"tenantA#conv99\")[\"content\"])"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Design validation checklist\n",
        "\n",
        "Use this quick checklist to validate whether Cosmos DB is a good fit for your agentic system:\n",
        "\n",
        "- Is the primary need durable operational state rather than relational reporting?\n",
        "- Are conversation threads, tool outputs, and checkpoints semi-structured and fast-changing?\n",
        "- Can you choose partition keys that localize reads and writes?\n",
        "- Do you need resumability after delays, retries, or duplicate delivery?\n",
        "- Can you control cost with TTL, selective indexing, and partition-aware queries?\n",
        "- Do you need managed identity and least-privilege access for the memory layer?"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Next Steps\n",
        "\n",
        "Azure Cosmos DB adds enterprise value to agentic coding when it becomes the durable state backbone for conversations, tool traces, and workflow checkpoints. The key validation points are partition design, idempotent recovery, retention policy, and secure access patterns.\n",
        "\n",
        "Next steps:\n",
        "1. Replace placeholder endpoints and keys with a real Cosmos DB account.\n",
        "2. Create the three baseline containers and test partition-local reads.\n",
        "3. Simulate duplicate workflow delivery and verify dedupe behavior.\n",
        "4. Add ETag-based optimistic concurrency for shared workflow state.\n",
        "5. Review TTL, indexing, and RBAC settings before production rollout."
      ]
    }
  ]
}