This session is intended for:
AI Infrastructure / DevOps Managers who want to:
- Automatically scale Pods and Nodes up and down on AWS (see the autoscaling sketch below)
- Run multiple Inference Workloads on a single GPU (see the GPU Fractioning sketch below)
- Monitor Latency, Throughput, and Compute Utilization in one Dashboard (see the latency measurement sketch below)
MLOps Managers who want to:
- Streamline Model Deployment
- Meet SLA and uptime targets for Model Serving
Run:ai features you will see:
- GPU Fractioning
- Compute Utilization Monitoring
- Using NVIDIA Triton with Run:ai
- Native AWS Integration
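
To give a flavor of Pod autoscaling ahead of the session: on AWS (typically EKS), a standard HorizontalPodAutoscaler scales inference replicas against a utilization target, while Node scaling is usually handled by Cluster Autoscaler or Karpenter. The sketch below is a minimal example using the Kubernetes Python client; the `triton-inference` Deployment name and the 70% CPU target are hypothetical placeholders, not values from the session.

```python
# Minimal sketch: create a HorizontalPodAutoscaler (autoscaling/v2) for an
# inference Deployment. Deployment name and targets are illustrative only.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="triton-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="triton-inference"
        ),
        min_replicas=1,
        max_replicas=8,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(
                        type="Utilization", average_utilization=70
                    ),
                ),
            )
        ],
    ),
)

client.AutoscalingV2Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```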
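
For GPU Fractioning, the idea is that a pod requests a fraction of a GPU rather than a whole device, so several inference workloads can share one card. The sketch below creates such a pod with the Kubernetes Python client; the `gpu-fraction` annotation key, the `runai-scheduler` scheduler name, and the pod and image names are assumptions for illustration, so check the Run:ai documentation for the exact keys your cluster version uses.

```python
# Minimal sketch: request a fraction of a GPU for an inference pod.
# The annotation key and scheduler name are assumed, not confirmed here.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="triton-inference",                # hypothetical pod name
        annotations={"gpu-fraction": "0.5"},    # assumed Run:ai fraction annotation
    ),
    spec=client.V1PodSpec(
        scheduler_name="runai-scheduler",       # assumed Run:ai scheduler name
        containers=[
            client.V1Container(
                name="triton",
                image="nvcr.io/nvidia/tritonserver:23.10-py3",  # example tag
                args=["tritonserver", "--model-repository=/models"],
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```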
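
And for the Latency and Throughput side of the dashboard, a simple client-side measurement against a Triton HTTP endpoint can look like the sketch below. The model name, tensor names, and shapes are placeholders for whatever model sits in your repository; this is a rough measurement loop, not the monitoring stack shown in the session.

```python
# Minimal sketch: measure per-request latency and rough throughput against a
# Triton server over HTTP (requires `pip install tritonclient[http]`).
import time
import numpy as np
import tritonclient.http as httpclient

triton = httpclient.InferenceServerClient(url="localhost:8000")

# Placeholder input matching a hypothetical "resnet50" model in the repository.
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
inp = httpclient.InferInput("INPUT__0", list(batch.shape), "FP32")
inp.set_data_from_numpy(batch)

latencies = []
for _ in range(100):
    start = time.perf_counter()
    triton.infer(model_name="resnet50", inputs=[inp])
    latencies.append(time.perf_counter() - start)

print(f"p50 latency: {np.percentile(latencies, 50) * 1000:.1f} ms")
print(f"throughput:  {len(latencies) / sum(latencies):.1f} req/s")
```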