Foundry Hubs, Projects, Models, and Deployments

Key Takeaways

  • Microsoft Foundry is the Azure platform layer for building, optimizing, deploying, tracing, evaluating, and governing AI apps and agents.
  • A hub or Foundry resource provides shared governance and infrastructure, while a project organizes workload-specific assets such as deployments, indexes, datasets, evaluations, files, and connections.
  • The model catalog is for discovery and selection; a deployment is the named callable endpoint that an app or agent actually invokes.
  • Standard Foundry resource deployments are the preferred path when supported, while serverless API and managed compute deployments serve different model, isolation, and operational needs.
  • AI-103 scenarios often test tradeoffs: model capability versus latency and cost, project isolation versus shared governance, and standard capacity versus provisioned throughput or managed compute.
Last updated: June 2026

Microsoft Foundry is the platform environment for AI apps and agents on Azure. Microsoft describes it as a unified platform-as-a-service for enterprise AI operations, model builders, and application development. For AI-103, that means Foundry is not just a playground: it is where teams organize projects, choose models, configure deployments, connect data and tools, run evaluations, trace behavior, and govern access.

The terminology can look confusing because Microsoft documentation includes both newer Foundry projects and classic hub-based projects. The exam-safe mental model is simple: shared governance and infrastructure sit above workload workspaces, and projects are where teams build the actual app or agent.

Concept | What it owns | Exam clue
Foundry resource | Administrative, security, monitoring, networking, and policy boundary | The scenario asks for centralized governance or RBAC across projects
Hub or hub-based project | Shared settings such as data connections, compute, network configuration, and classic advanced features | The prompt mentions hubs, prompt flow, managed compute, or Azure Machine Learning compatibility
Project | Workload workspace for app or agent assets | The team needs deployments, indexes, datasets, evaluations, files, and connections scoped to one solution
Connection | Reusable reference to another service and its authentication method | The app should use Azure OpenAI, AI Search, Storage, APIs, or tools without embedding credentials

Model Catalog to Deployment

The model catalog is the discovery and selection surface. Use it to compare task fit: large language models for broad generation, small language models for low-cost narrow tasks, reasoning models for complex multi-step work, embedding models for retrieval, code models for developer scenarios, and multimodal models when text alone is not enough.

A deployment is different from a model. The model is the artifact or family; the deployment is the named endpoint configuration the application calls. A deployment can have its own model version, throughput choice, content filter, region, quota, and operational settings. A project can hold many deployments, and several of them can expose the same model for different workloads. Deployment names such as gpt-4o-mini-chat-dev, gpt-4o-prod, and embedding-rag-indexer make the model, workload, and environment visible to whoever calls them.
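
The model-versus-deployment distinction can be sketched in plain Python. This is a minimal illustration, not Foundry SDK code: the Deployment class, the DEPLOYMENTS registry, the resolve helper, the name gpt-4o-mini-chat-prod, and the version and region values are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """Hypothetical record of one deployment's settings (not an SDK type)."""
    name: str           # the name the application passes when it calls the endpoint
    model: str          # the catalog model this deployment exposes
    model_version: str  # placeholder version label
    throughput: str     # for example "standard" or "provisioned"
    region: str         # placeholder region


# Two deployments expose the same model for different workloads;
# a third exposes an embedding model for an indexing job.
DEPLOYMENTS = {
    d.name: d
    for d in [
        Deployment("gpt-4o-mini-chat-dev", "gpt-4o-mini", "v1", "standard", "eastus"),
        Deployment("gpt-4o-mini-chat-prod", "gpt-4o-mini", "v1", "provisioned", "eastus"),
        Deployment("embedding-rag-indexer", "text-embedding-3-small", "v1", "standard", "eastus"),
    ]
}


def resolve(deployment_name: str) -> Deployment:
    """Return the deployment an app would call, or fail if no deployment has that name."""
    try:
        return DEPLOYMENTS[deployment_name]
    except KeyError:
        raise LookupError(
            f"No deployment named {deployment_name!r}; "
            "selecting a model in the catalog does not create one."
        ) from None
```

In this sketch, resolve("gpt-4o-mini-chat-dev") succeeds, while resolve("gpt-4o-mini") raises LookupError because a catalog model name is not a deployment name.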

Deployment choice | Best fit | Tradeoff to notice
Standard deployment in Foundry resources | Common Azure OpenAI and Foundry Models workloads | Fastest managed path, but subject to shared regional capacity and quota
Provisioned throughput | High-volume production workloads that need predictable throughput and latency | Higher planning commitment; capacity must be sized and monitored
Serverless API deployment | Certain catalog models where managed endpoint access is enough | Availability and billing depend on model/provider support
Managed compute deployment | Open-source, custom, or isolated model hosting with dedicated managed VMs | More infrastructure choices and cost responsibility
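
One way to practice reading the table is to encode it as a small decision function. The sketch below is a study aid under stated assumptions: the Requirements fields and the order of the checks are illustrative choices, not an official Microsoft decision tree, and real availability depends on the model and region.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical scenario flags; the field names are illustrative only."""
    supported_by_standard: bool         # model is offered as a standard Foundry resource deployment
    needs_predictable_throughput: bool  # high-volume production with latency targets
    needs_dedicated_hosting: bool       # custom or open-source model, or isolation on dedicated VMs
    available_as_serverless_api: bool   # model/provider supports serverless API access


def choose_deployment_type(req: Requirements) -> str:
    """Pick a deployment type, preferring the standard path when it is supported."""
    if req.needs_dedicated_hosting:
        return "managed compute"
    if req.supported_by_standard:
        if req.needs_predictable_throughput:
            return "provisioned throughput"
        return "standard"
    if req.available_as_serverless_api:
        return "serverless API"
    # Assumption: nothing lighter fits, so fall back to hosting the model yourself.
    return "managed compute"
```

The ordering reflects the guidance in this section: standard deployments come first when supported, and managed compute is reserved for cases that actually need dedicated hosting.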

Planning Pattern

Start with the business requirement, then work inward:

  1. Choose the project boundary: team, workload, environment, and data sensitivity.
  2. Choose the model family by task, modality, context size, latency, cost, and safety profile.
  3. Choose the deployment type by capacity, isolation, model availability, and network requirements.
  4. Create project connections for retrieval indexes, storage, APIs, and monitoring.
  5. Test with evaluations before routing user traffic.
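
The five steps above can be sketched as an ordered checklist. This is a hypothetical planning record for study purposes: DeploymentPlan, its fields, and next_open_step are illustrative names, not Foundry settings or SDK calls.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DeploymentPlan:
    """Hypothetical planning record; one field per step in the pattern."""
    project_boundary: str = ""                            # step 1
    model_family: str = ""                                # step 2
    deployment_type: str = ""                             # step 3
    connections: List[str] = field(default_factory=list)  # step 4
    evaluations_passed: bool = False                      # step 5


def next_open_step(plan: DeploymentPlan) -> Optional[str]:
    """Return the first unfinished step in planning order, or None when all are done."""
    checks = [
        ("choose the project boundary", bool(plan.project_boundary)),
        ("choose the model family", bool(plan.model_family)),
        ("choose the deployment type", bool(plan.deployment_type)),
        ("create project connections", bool(plan.connections)),
        ("test with evaluations", plan.evaluations_passed),
    ]
    for description, done in checks:
        if not done:
            return description
    return None


def ready_for_user_traffic(plan: DeploymentPlan) -> bool:
    """A plan is ready only when every step, including evaluations, is complete."""
    return next_open_step(plan) is None
```

Keeping the checks in a fixed order mirrors the "work inward" advice: a plan with a deployment type but no project boundary still reports the project boundary as the next open step.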

A strong AI-103 answer usually avoids two extremes. It does not put every experiment into one shared project with broad keys, and it does not overbuild managed GPU infrastructure when a standard Foundry resource deployment satisfies the requirement. The right design is the smallest governed workspace and deployment plan that can meet the app's behavior, cost, security, and operations targets.

Test Your Knowledge

A team selected a model from the Foundry model catalog, but their app still cannot call it. What is the missing step?
