Automated

Introduction

This guide covers setting up your AMESA training cluster using Pulumi, an Infrastructure as Code tool.

This example uses Azure Kubernetes Service (AKS), but it can be adapted to other supported providers.

Prerequisites

  1. An Azure subscription with sufficient permissions to create and update resource groups, container registries, and AKS clusters

  2. If you're following along in TypeScript, a working installation of Node.js

  3. A new Pulumi project, created as described in the Pulumi documentation. See the Azure section of that documentation for provider-specific setup.

Overview

We will be deploying the following resources to your Azure subscription:

  1. A resource group, containing all resources

  2. A container registry, to hold simulator images

  3. An AKS cluster

Resource group

The resource group will contain all resources. It also determines the Azure location in which the resources are deployed.

At the end, we export the name of the resource group (to which Pulumi appends a random suffix) so that it can be referenced elsewhere in our definition.
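
A minimal sketch of this step, assuming the `@pulumi/azure-native` provider. The object below mirrors the `location` argument of its `ResourceGroup` resource; the logical name `amesa` and the `westeurope` location are placeholders, not values from this guide. The Pulumi wiring is shown as comments so that the snippet runs without the SDK installed.

```typescript
// Subset of the arguments accepted by the Azure Native ResourceGroup resource.
interface ResourceGroupArgs {
  location: string;
}

// Assumption: "westeurope" is a placeholder; use the region you want to deploy to.
const resourceGroupArgs: ResourceGroupArgs = {
  location: "westeurope",
};

// In the Pulumi program (index.ts), this would be wired up roughly as:
//
//   import * as azure_native from "@pulumi/azure-native";
//
//   const resourceGroup = new azure_native.resources.ResourceGroup("amesa", resourceGroupArgs);
//
//   // Pulumi appends a random suffix to "amesa", so export the actual name.
//   export const resourceGroupName = resourceGroup.name;
```

Later resources then pass `resourceGroup.name` as their `resourceGroupName` rather than hard-coding the generated name.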

Container registry

The container registry is where you can privately store your simulator Docker images, if any.
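
The registry could be sketched as follows, again assuming `@pulumi/azure-native`. The `Basic` SKU and the logical name `amesaregistry` are illustrative assumptions; choose the tier that matches your image volume.

```typescript
// Subset of the arguments accepted by the Azure Native Registry resource.
interface RegistryArgs {
  sku: { name: "Basic" | "Standard" | "Premium" };
}

// Assumption: the Basic tier is used here only as a low-cost placeholder.
const registryArgs: RegistryArgs = {
  sku: { name: "Basic" },
};

// In the Pulumi program, next to the resource group definition:
//
//   const registry = new azure_native.containerregistry.Registry("amesaregistry", {
//     resourceGroupName: resourceGroup.name,
//     ...registryArgs,
//   });
//
//   // Host name to use when tagging and pushing simulator images.
//   export const registryLoginServer = registry.loginServer;
```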

Kubernetes Cluster

The cluster is where both the AMESA components and your training will run. This configuration is more complex, so additional information is provided as comments in the TypeScript definition.
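
The following is a sketch of what such a definition might look like, not the exact AMESA template. Property names follow the `agentPoolProfiles` argument of the Azure Native `ManagedCluster` resource; the pool names, VM sizes, and node counts are assumptions chosen for illustration.

```typescript
// Subset of the agent pool properties used in this guide.
interface AgentPoolProfile {
  name: string;                     // AKS requires short, lowercase alphanumeric names
  mode: "System" | "User";          // System pools host cluster services
  vmSize: string;
  osType: "Linux";
  type: "VirtualMachineScaleSets";  // scale sets are required for autoscaling
  count: number;                    // initial node count
  enableAutoScaling?: boolean;
  minCount?: number;
  maxCount?: number;
}

const agentPoolProfiles: AgentPoolProfile[] = [
  {
    // Fixed-size pool for system pods (placeholder size).
    name: "system",
    mode: "System",
    vmSize: "Standard_D4s_v3",
    osType: "Linux",
    type: "VirtualMachineScaleSets",
    count: 1,
  },
  {
    // Autoscaled pool for training workloads (placeholder size and limits).
    name: "training",
    mode: "User",
    vmSize: "Standard_D8s_v3",
    osType: "Linux",
    type: "VirtualMachineScaleSets",
    count: 1,
    enableAutoScaling: true,
    minCount: 0,
    maxCount: 10,
  },
];

const clusterArgs = {
  dnsPrefix: "amesa",                   // assumption: any DNS-safe prefix works
  identity: { type: "SystemAssigned" }, // let Azure manage the cluster identity
  agentPoolProfiles,
};

// In the Pulumi program:
//
//   const cluster = new azure_native.containerservice.ManagedCluster("amesa", {
//     resourceGroupName: resourceGroup.name,
//     ...clusterArgs,
//   });
```

Keeping the pool list in its own constant makes it easier to append further pools later without touching the rest of the cluster definition.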

GPU Training and simulators

If you want to enable GPU training and GPU-enhanced simulators, you will also need to add the following pools.
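
As an illustration, GPU pools could be described like this and appended to the cluster's `agentPoolProfiles`. The pool names (`gputrain`, `gpusim`), the GPU VM sizes, and the scaling limits are hypothetical; check which GPU sizes are available in your region and quota.

```typescript
// Same subset of agent pool properties as used for the CPU pools.
interface GpuPoolProfile {
  name: string;
  mode: "User";
  vmSize: string;
  osType: "Linux";
  type: "VirtualMachineScaleSets";
  count: number;
  enableAutoScaling: boolean;
  minCount: number;
  maxCount: number;
}

// Helper that fills in the settings shared by both GPU pools.
function gpuPool(name: string, vmSize: string, maxCount: number): GpuPoolProfile {
  return {
    name,
    mode: "User",
    vmSize,
    osType: "Linux",
    type: "VirtualMachineScaleSets",
    count: 0,                // start empty; the autoscaler adds nodes on demand
    enableAutoScaling: true,
    minCount: 0,             // scale back to zero when idle to reduce costs
    maxCount,
  };
}

// Assumption: placeholder names and GPU instance sizes.
const gpuPools: GpuPoolProfile[] = [
  gpuPool("gputrain", "Standard_NC6s_v3", 4),
  gpuPool("gpusim", "Standard_NC4as_T4_v3", 4),
];

// In the Pulumi program, extend the cluster's pool list:
//
//   agentPoolProfiles: [...agentPoolProfiles, ...gpuPools],
```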

You will also need to install the nvidia-gpu-operator on the cluster, following the instructions on the project website.

Finally, GPU_ENABLED must be set to true on the AMESA controller deployment, if it hasn't been already.

Notes:

  1. Autoscaling:

    • This template enables autoscaling to have the cluster automatically scale to the required size and back down afterward to reduce costs.

    • You can disable autoscaling by removing the minCount, maxCount and enableAutoScaling properties, but you'll have to set the count value accordingly.

  2. vmSize: The vmSize values used above can be changed to instance types that better fit your needs.
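
The autoscaling note can be sketched as a small helper that turns an autoscaled pool into a fixed-size one. `toFixedSize` and the example pool are hypothetical names introduced here for illustration; they are not part of Pulumi or the AMESA template.

```typescript
// Subset of the agent pool properties relevant to scaling.
interface PoolProfile {
  name: string;
  vmSize: string;
  count: number;
  enableAutoScaling?: boolean;
  minCount?: number;
  maxCount?: number;
}

// Returns a copy of the pool with autoscaling removed and a fixed node count.
function toFixedSize(pool: PoolProfile, count: number): PoolProfile {
  const fixed: PoolProfile = { ...pool, count };
  delete fixed.enableAutoScaling;
  delete fixed.minCount;
  delete fixed.maxCount;
  return fixed;
}

// Example: an autoscaled training pool (placeholder values).
const autoscaledPool: PoolProfile = {
  name: "training",
  vmSize: "Standard_D8s_v3",
  count: 1,
  enableAutoScaling: true,
  minCount: 0,
  maxCount: 10,
};

// Fixed at three nodes; the original object is left unchanged.
const fixedPool = toFixedSize(autoscaledPool, 3);
```

With autoscaling removed, `count` is the only sizing control, so it should be set to the number of nodes your largest training run needs.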
