Scans Azure OpenAI capacity across all regions in real time and routes your deployment to the lowest-latency endpoint with available quota. Falls back if your preferred region is full.
Best for: Engineers deploying to Azure who need to avoid quota exhaustion and latency spikes.
Creator's repository · microsoft/azure-skills
License: MIT