When would I use Inf1 vs. C5 vs. G4 instances for inference in Amazon EC2?


1 Answer

Indian · Answer 1 · 2020-05-26T05:09:16+0000

Customers running machine learning (ML) models that are sensitive to inference latency and throughput can use Inf1 instances for high-performance, cost-effective inference. For ML models that are less sensitive to inference latency and throughput, customers can use EC2 C5 instances and take advantage of the AVX-512/VNNI instruction set. For ML models that require access to NVIDIA’s CUDA, cuDNN, or TensorRT libraries, we recommend using G4 instances.

Model characteristics and libraries used                              EC2 Inf1  EC2 C5  EC2 G4
Models that benefit from low latency and high throughput at low cost  X         -       -
Models not sensitive to latency and throughput                        -         X       -
Models requiring NVIDIA’s developer libraries                         -         -       X
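
The same guidance can be written as a small decision rule. This is only a sketch: the function name and parameters are hypothetical, and the order of the checks (NVIDIA library requirement first) is an assumption, since the answer does not say which criterion wins when more than one applies.

```python
def pick_instance_family(needs_nvidia_libs: bool, latency_sensitive: bool) -> str:
    """Hypothetical helper mapping the criteria in the table to an EC2 family."""
    # Assumption: a hard dependency on CUDA, cuDNN, or TensorRT is checked first.
    if needs_nvidia_libs:
        return "G4"
    # Models sensitive to inference latency and throughput.
    if latency_sensitive:
        return "Inf1"
    # Models less sensitive to latency and throughput.
    return "C5"


if __name__ == "__main__":
    print(pick_instance_family(needs_nvidia_libs=False, latency_sensitive=True))
```

The returned strings are family names only; choosing a specific instance size within the family is outside what the answer covers.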
