# **Model Configuration**

The model repository system and `config.pbtxt` format are inherited from the **Triton Inference Server**. For full documentation, refer to the official Triton guide: [Model Configuration](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md).

Below is a brief overview of some powerful Triton configuration options that can simplify your workflow and enhance performance (illustrative `config.pbtxt` sketches for each option follow at the end of this page):

- **[Model Warmup](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#model-warmup)**: If your model performs expensive computation or compilation when it first runs, you can configure warmup requests that execute at startup, avoiding delays on the first real request.
- **[ONNX ORT Optimization](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/optimization.md#onnx-with-tensorrt-optimization-ort-trt)**: Enables conversion of ONNX models to TensorRT during initialization, often resulting in significant performance improvements.
- **[Dynamic Batching](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#dynamic-batcher)**: Configures the server to batch multiple incoming requests together, which can balance server load and improve throughput.
- **[Instance Groups](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups)**: Specifies how many instances of a model should be loaded and assigns each instance to a specific GPU or to the CPU. This setting can optimize performance and enables CPU-only deployments. *(See more details in [CPU-only Run](./DEPLOY.md#cpu-only-mode).)*
- **[Version Policy](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#version-policy)**: By setting the **Version Policy** to `"specific"` in combination with `crate::Options::model_control_mode(EXPLICIT)` and `crate::Server::load_model`, you can load only a single model version instead of all available versions. This is especially useful for:
  - Managing repositories with multiple versions of a heavy model.
  - Avoiding loading broken model versions while keeping healthy ones.

---

This guide provides a starting point for using Triton’s advanced configuration features to streamline your machine learning workflows. For a detailed walkthrough of these options and more, refer to the official [Triton Inference Server documentation](https://github.com/triton-inference-server/server).
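
As referenced above, the sketches below show roughly how each option might look in a `config.pbtxt`; they are illustrative only, not drop-in configurations. First, model warmup: the input name `INPUT0`, data type, and dimensions are placeholders and must be replaced with your model's actual inputs.

```protobuf
# Hypothetical warmup entry: sends one random-data request at startup so that
# lazy initialization (e.g. kernel compilation) happens before the first real
# request arrives. Adjust the input name, type and dims to match your model.
model_warmup [
  {
    name: "random_sample_warmup"
    batch_size: 1
    inputs {
      key: "INPUT0"
      value: {
        data_type: TYPE_FP32
        dims: [ 3, 224, 224 ]
        random_data: true
      }
    }
  }
]
```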
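
For ONNX Runtime with the TensorRT accelerator, the `optimization` block takes roughly this shape; the FP16 precision mode and 1 GB workspace size are example values, not recommendations.

```protobuf
# Ask the ONNX Runtime backend to offload supported subgraphs to TensorRT.
# Precision and workspace size are example values; tune them for your model.
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [
      {
        name : "tensorrt"
        parameters { key: "precision_mode" value: "FP16" }
        parameters { key: "max_workspace_size_bytes" value: "1073741824" }
      }
    ]
  }
}
```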
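
A minimal dynamic batching sketch; dynamic batching requires a non-zero `max_batch_size`, and the preferred batch sizes and queue delay below are illustrative.

```protobuf
# Group individual requests into server-side batches.
# Requires max_batch_size > 0 in the same config.
max_batch_size: 8
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```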
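
An `instance_group` sketch that loads two model instances on GPU 0, with a commented-out CPU-only alternative; the counts and device index are placeholders.

```protobuf
# Two copies of the model on GPU 0 (replace with your own GPU index and count).
instance_group [
  {
    count: 2
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]

# CPU-only alternative, e.g. for the CPU-only deployment described in DEPLOY.md:
# instance_group [
#   {
#     count: 4
#     kind: KIND_CPU
#   }
# ]
```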
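
Finally, a version policy sketch that keeps only a single version loaded; version `2` stands in for whichever version directory you actually want to serve, and it is most useful together with the `EXPLICIT` model control mode and an explicit `load_model` call as noted above.

```protobuf
# Serve only version 2 of this model; other version directories are ignored.
version_policy: { specific: { versions: [ 2 ] } }
```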