Managing LLMs & Customer LLM Endpoints
This guide explains how to manage Large Language Models (LLMs), token pricing, and customer-provided LLM endpoints in Deepdesk. For technical architecture, see LLM Gateway.
1. Managing LLMs
Viewing LLMs
Navigate to:
Admin → LLM Configs → LLMs
You can view:
- LLM code
- Name
- Model type
- Current token costs (per 1M tokens)
Use filters to narrow by model type or realtime support.
Creating or Editing an LLM
- Click Add LLM or select an existing LLM
- Configure:
- Code (immutable identifier, e.g. gpt-4)
- Name (human-readable label)
- Model type (Chat completion, Realtime, or Embeddings)
- Save changes
Changing an LLM code may impact references across pricing and configurations.
2. Managing Token Costs
Token pricing is versioned and time-based.
Viewing Token Costs
Navigate to:
Admin → LLM Token Costs
Each entry defines:
- LLM
- Input token price (per 1M)
- Output token price (per 1M)
- Currency
- Start date
- Optional end date
Adding or Editing Token Costs
- Click Add LLM Token Cost or select an existing entry
- Configure:
- LLM
- Start date
- End date (optional)
- Text input tokens
- Text output tokens
- Audio input/output tokens (if applicable)
- Currency
The system automatically selects the pricing valid at request time.
- Do not overlap date ranges for the same LLM
- Always add a new pricing entry for changes
- Avoid editing historical prices
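The time-based selection described above can be sketched as follows. This is a minimal illustration, assuming non-overlapping date ranges per LLM as the guidelines require; the `TokenCost` fields and function names are hypothetical, not Deepdesk's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class TokenCost:
    llm_code: str
    start_date: date
    end_date: Optional[date]   # None = open-ended, still in effect
    input_per_1m: float        # input token price per 1M tokens
    output_per_1m: float       # output token price per 1M tokens
    currency: str

def cost_at(entries: list[TokenCost], llm_code: str, when: date) -> Optional[TokenCost]:
    """Return the pricing entry valid for `llm_code` on `when`.

    Because ranges must not overlap, at most one entry can match."""
    for e in entries:
        if e.llm_code != llm_code:
            continue
        if e.start_date <= when and (e.end_date is None or when <= e.end_date):
            return e
    return None

# Example: a price change is a new entry with a future start date;
# the old entry is closed, never edited.
entries = [
    TokenCost("gpt-4", date(2024, 1, 1), date(2024, 5, 31), 30.0, 60.0, "EUR"),
    TokenCost("gpt-4", date(2024, 6, 1), None, 10.0, 30.0, "EUR"),
]
```

A request dated in July 2024 resolves to the second entry; one dated in March resolves to the first.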
3. Managing LLM Configs (Endpoints)
Navigate to:
Admin → LLM Configs → Add LLM Config
The default Deepdesk LLM endpoints are provisioned automatically and are not visible in the Admin interface; only customer-managed endpoints appear here.
Supported Models
Each LLM Config explicitly lists which models it supports. Only selected models can be routed through the endpoint.
Provider Configuration
Azure (Deepdesk-managed)
Fields:
- Base URL
- API key
Used for Deepdesk-provisioned Azure OpenAI deployments.
Azure (Customer-managed)
Customer endpoints authenticate using OAuth.
Required fields:
- Base URL
- OAuth token URL
- OAuth client ID
- OAuth client secret
- OAuth scopes
Optional:
- Deployment prefix
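The OAuth fields above map onto a standard client-credentials token request (RFC 6749). A minimal sketch of the request the gateway would send to the token URL; the function name is illustrative, not Deepdesk's code:

```python
def oauth_token_request(token_url: str, client_id: str,
                        client_secret: str, scopes: list[str]):
    """Build a client-credentials token request from an LLM Config's
    OAuth fields. Parameter names follow RFC 6749."""
    return token_url, {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": " ".join(scopes),  # scopes are space-delimited in OAuth 2.0
    }
```

The resulting access token is then sent as a Bearer header on requests to the customer endpoint.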
Deployment Naming
By default, Deepdesk assumes that deployment names match the model code.
For example, when a request is made for the model gpt-4, the corresponding endpoint is https://customer-base-url/openai/deployments/gpt-4/chat/completions.
Customers can override this behavior with a deployment prefix, which is useful when they have provisioned dedicated deployments for Deepdesk alongside others.
If a deployment prefix is set (e.g., custom-), the gateway looks for deployments named custom-gpt-4, i.e. https://customer-base-url/openai/deployments/custom-gpt-4/chat/completions.
Note that the deployment name must still match the model code, apart from the optional prefix.
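The naming rule above reduces to simple string concatenation. A sketch, using the URL shape shown in the examples; the function name is hypothetical:

```python
def deployment_url(base_url: str, model_code: str, prefix: str = "") -> str:
    """Build the Azure OpenAI chat-completions URL for a model.

    The deployment name is the model code, optionally prefixed."""
    deployment = f"{prefix}{model_code}"
    return f"{base_url}/openai/deployments/{deployment}/chat/completions"
```

With no prefix, gpt-4 maps to .../deployments/gpt-4/...; with prefix custom-, it maps to .../deployments/custom-gpt-4/....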
Secrets are stored in Secret Manager and loaded at runtime. Config propagation may take several minutes after saving, as secrets are synced every 10 minutes.
4. Customer-Provided Endpoints
Customer endpoints allow clients to:
- Use their own Azure OpenAI subscription
- Retain compliance and data locality
- Control quotas and models
Deepdesk acts as a secure proxy via the LLM Gateway.
Request Flow
- Backend sends request to LLM Gateway
- Gateway resolves:
- LLM Config
- Endpoint
- Authentication is applied
- Request forwarded to Azure OpenAI
- Response returned to Deepdesk
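The resolution steps above can be sketched as a lookup from model code to endpoint and auth header. This is an illustrative model only, assuming API-key auth for Deepdesk-managed configs and a pre-fetched OAuth bearer token for customer-managed ones; none of the names are Deepdesk's actual API.

```python
from dataclasses import dataclass

@dataclass
class LLMConfig:
    base_url: str
    supported_models: set[str]
    auth_mode: str    # "api_key" (Deepdesk-managed) or "oauth" (customer-managed)
    credential: str   # API key, or an already-fetched OAuth access token

def resolve_request(configs: dict[str, LLMConfig], model_code: str):
    """Mirror the gateway flow: find a config that supports the model,
    build the endpoint URL, and attach the right auth header.

    Deployment-prefix handling is omitted for brevity."""
    for cfg in configs.values():
        if model_code in cfg.supported_models:
            url = f"{cfg.base_url}/openai/deployments/{model_code}/chat/completions"
            if cfg.auth_mode == "api_key":
                headers = {"api-key": cfg.credential}
            else:
                headers = {"Authorization": f"Bearer {cfg.credential}"}
            return url, headers
    raise LookupError(f"No LLM Config supports model {model_code!r}")
```

Only models explicitly listed in a config's supported models can be resolved; anything else fails before a request is forwarded.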
5. Load Balancing & Failover
For Deepdesk-managed endpoints:
- Primary and secondary Azure regions are configured
- Automatic failover is handled by the LLM Gateway
Customer-managed endpoints are responsible for their own redundancy.
6. Common Workflows
Add a New Model
- Create LLM
- Add token pricing
- Enable model in LLM Configs
Update Pricing
- Add new token cost entry
- Set a future start date
- Leave previous pricing unchanged
Onboard Customer Endpoint
- Create LLM Config
- Select provider = Azure
- Configure OAuth credentials
- Select supported models
- Save