Databricks Agent Bricks GA: Managed Infrastructure for the Agent Production Gap

Only 19% of enterprises have deployed AI agents at scale, according to Databricks' own State of AI Agents report. The other 81% are stuck somewhere between "it works in a notebook" and "it runs in production." Databricks just shipped its answer to that gap: Agent Bricks Custom Agents and Supervisor Agent are now both generally available, giving teams a managed path from prototype to production without rebuilding their deployment stack.

What went GA

Two capabilities hit general availability in February 2026.

Custom Agents lets developers build agents with whatever framework they prefer, then deploy them as fully managed Databricks Apps on serverless compute. Supported frameworks include LangGraph, PyFunc, and OpenAI's Agents SDK, but the framework constraint is minimal: Databricks uses MLflow's ResponsesAgent interface as a wrapper. Build your agent however you want, wrap it in ResponsesAgent, and you get automatic compatibility with Databricks' evaluation tooling, tracing, and deployment pipeline.
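The wrapper pattern is simple enough to sketch in plain Python. The stand-in types below (ResponsesRequest, ResponsesResponse, WrappedAgent) are hypothetical, not MLflow's actual classes; the real interface lives in MLflow and its signatures may differ. The point is the shape: adapt whatever call signature your agent has to a standard predict(request) → response contract.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for MLflow's ResponsesAgent request/response types.
# The real interface is MLflow's ResponsesAgent; exact signatures may differ.
@dataclass
class ResponsesRequest:
    messages: list  # e.g. [{"role": "user", "content": "..."}]

@dataclass
class ResponsesResponse:
    output_text: str

class MyLangGraphAgent:
    """Placeholder for whatever agent you built (LangGraph, raw Python, ...)."""
    def run(self, user_input: str) -> str:
        return f"echo: {user_input}"

class WrappedAgent:
    """Sketch of the wrapper pattern: adapt your agent's own call signature
    to the standard predict(request) -> response contract that evaluation,
    tracing, and deployment tooling can hook into."""
    def __init__(self, agent: MyLangGraphAgent):
        self.agent = agent

    def predict(self, request: ResponsesRequest) -> ResponsesResponse:
        # Pull the latest user turn and delegate to the underlying agent.
        user_input = request.messages[-1]["content"]
        return ResponsesResponse(output_text=self.agent.run(user_input))
```

Because the tooling only sees the standard contract, the agent behind the wrapper can be swapped out without touching the deployment pipeline.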

Supervisor Agent provides managed multi-agent orchestration. A supervisor agent acts as the entry point, analyzing incoming requests and routing them to the right sub-agent: Genie Spaces for structured data queries, Knowledge Assistant agents for unstructured data, or MCP servers for external tools. It handles task delegation and result synthesis across those sub-agents, all under Unity Catalog's access controls.
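The routing pattern Supervisor Agent manages can be illustrated with a minimal sketch. Everything here is hypothetical (the function names, the keyword classifier); a real supervisor would use an LLM to pick the sub-agent and would synthesize results across several of them, but the control flow is the same: classify, delegate, return.

```python
from typing import Callable

# Hypothetical sub-agents standing in for Genie Spaces (structured data)
# and Knowledge Assistant agents (unstructured data).
def genie_agent(q: str) -> str:
    return f"[structured-data answer to: {q}]"

def knowledge_agent(q: str) -> str:
    return f"[document answer to: {q}]"

SUB_AGENTS: dict[str, Callable[[str], str]] = {
    "structured": genie_agent,
    "unstructured": knowledge_agent,
}

def classify(q: str) -> str:
    # A real supervisor uses an LLM to route; a keyword heuristic stands in
    # here so the sketch is self-contained.
    keywords = ("revenue", "sql", "table", "metric")
    return "structured" if any(w in q.lower() for w in keywords) else "unstructured"

def supervise(q: str) -> str:
    """Entry point: analyze the request, delegate to the right sub-agent."""
    return SUB_AGENTS[classify(q)](q)
```

The hard parts that the managed version adds on top of this skeleton are exactly the ones listed above: propagating the caller's permissions into each sub-agent call and merging multi-agent results into one answer.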

Both features are available in select US regions for workspaces without Enhanced Security and Compliance features. HIPAA-compliant workspaces will get Supervisor Agent GA soon, according to Databricks' roadmap.

The infrastructure it replaces

If you've built an agent that works locally, you already know what stands between you and production. Memory and state management. Evaluation and testing beyond "I tried five prompts." Monitoring and tracing. Authentication and authorization. Deployment automation. Scaling. Each of those is a real engineering project.

Agent Bricks collapses several of those into managed services:

  • Deployment: Agents deploy as Databricks Apps on serverless compute via a single CLI command (databricks apps deploy). You can also deploy directly from Git repositories, a feature that went beta on February 6.
  • Memory: Lakebase, Databricks' serverless Postgres service, provides persistent agent memory. LangGraph agents can use it as a checkpointer, keeping conversation state and context consistent with the lakehouse. Lakebase Autoscaling hit beta on Azure in February, with scale-to-zero and database branching.
  • Observability: MLflow's AgentServer handles request routing, logging, and tracing automatically. It aggregates streamed responses in traces, which feed directly into Agent Evaluation for testing.
  • Evaluation: Built-in evaluation tools (uv run agent-evaluate) test agent quality before deployment. This matters more than most teams realize: Databricks' data shows that companies using evaluation tools achieve six times more production deployments than those that don't.
  • Governance: Unity Catalog governs data access, model permissions, and tool usage. Supervisor Agent uses on-behalf-of access controls, so sub-agents inherit the caller's permissions rather than running with blanket access.
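The evaluation gate is the piece most teams can approximate even off-platform. A minimal sketch of the idea, with entirely hypothetical names (this is not the agent-evaluate tooling): run a fixed suite of cases against the agent and refuse to deploy unless the pass rate clears a threshold.

```python
from typing import Callable

# Illustrative eval suite; real suites are larger and often LLM-judged.
EVAL_CASES = [
    {"input": "2 + 2", "expect": "4"},
    {"input": "capital of France", "expect": "Paris"},
]

def evaluate(agent: Callable[[str], str], cases: list, threshold: float = 0.9) -> bool:
    """Return True only if the agent's pass rate meets the threshold."""
    passed = sum(1 for c in cases if c["expect"] in agent(c["input"]))
    rate = passed / len(cases)
    print(f"pass rate: {rate:.0%}")
    return rate >= threshold

def deploy_if_green(agent: Callable[[str], str]) -> str:
    # The hard gate: no deploy without a passing evaluation run.
    if not evaluate(agent, EVAL_CASES):
        return "blocked: evaluation below threshold"
    return "deployed"
```

The design choice worth copying is that the gate is unconditional: evaluation is a step in the deployment path, not a separate activity a team can skip under deadline pressure.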

When it makes sense (and when it doesn't)

Agent Bricks is a strong fit if your team already runs on Databricks. The integration with Unity Catalog, Lakebase, and Mosaic AI Model Serving means you're not bolting on a separate agent platform. You get governance, data access, and model hosting from the same stack. For teams building agents that query enterprise data, especially structured data through Genie Spaces, the Supervisor Agent pattern handles a real coordination problem that's painful to build from scratch.

It's a harder sell if you're not on Databricks. The value of Agent Bricks comes from the integration with the rest of the Databricks platform. If your data lives in Snowflake or BigQuery, or you're using a different governance layer, you're essentially buying into the full Databricks platform to get agent infrastructure. That's a bigger decision than picking an agent framework.

There's also a clear boundary around what Agent Bricks doesn't cover. It's not an agent-building framework; it's deployment and orchestration infrastructure. You still write the agent logic yourself using LangGraph, OpenAI SDK, or raw Python. Databricks provides the wrapper (ResponsesAgent), the deployment target (Databricks Apps), and the supporting services (memory, eval, governance). If your bottleneck is "how do I build an agent that reasons well," Agent Bricks won't help. If your bottleneck is "how do I get my working agent into production with proper monitoring and access controls," this is directly relevant.

The bigger picture

Databricks is making the same bet with agents that the industry made with containers a decade ago. Building your own container orchestration was possible but wasteful; Kubernetes became the default because the infrastructure layer wasn't the differentiator. Databricks is positioning Agent Bricks the same way: the agent logic is your differentiator, the infrastructure shouldn't be. The same shift is playing out across the software stack: AI agents are eating the middleware layer, and the winners will be the platforms that own orchestration, not the individual point tools.

The 6x deployment gap between teams that use evaluation tools and those that don't tells you something specific. The production bottleneck isn't compute or deployment mechanics. It's confidence: teams can't ship agents they can't test. We ran into this same wall directly, spending four nights building an 11-agent system that produced zero publishable articles because there was no hard gate between "drafted" and "done." By bundling evaluation directly into the deployment pipeline, Databricks is attacking the actual constraint, not just making deployment faster.

For teams already on the platform, Agent Bricks removes months of infrastructure work. Pair it with agents that can now drive any software through visual computer control rather than custom API integrations, and the integration bottleneck that has slowed enterprise adoption starts to dissolve. For everyone else, it's a signal that managed agent infrastructure is becoming a product category, not a DIY project. Expect AWS, Google Cloud, and Microsoft (beyond its Azure Databricks partnership) to follow with similar managed agent deployment stacks before year-end.