Datagen Migrates to AWS and Yields Improved Response Time and Reduced Costs

AWS migration with Automat-it enabled Datagen to increase its business and improve customer satisfaction

About Datagen

Datagen develops AI-powered 3D simulation technologies to generate training and testing data at scale for computer vision applications. We provide research and data science teams with an accurate representation of their target domain, covering the scenarios and situations that can arise in AR/VR, robotics, smart cars, smart stores, and IoT. Using our platform, companies can generate high-fidelity 3D synthetic data, with all the associated ground truth, in a seamless and scalable way. This accelerates R&D cycles and brings their computer vision research to the next level.

The Challenge

Datagen initially generated its images on on-premises consumer GPU machines. This setup lacked the flexibility and scalability required for larger requests, so processing the data for a single customer could take days or even weeks.

Datagen’s requirement was a scalable system that supports both large-scale generation of 3D environments, a CPU-intensive process, and rendering of images from within those environments, a GPU-intensive process.

To reduce the high GPU cost, the requirements also called for the use of spot instances, which added further complexity.

The Solution

The chosen solution was to deploy the Datagen workloads on Kubernetes using Amazon Elastic Kubernetes Service (EKS), the AWS managed Kubernetes service. The following design provides a scalable solution that runs multiple CPU and GPU workloads:

  • The EKS workloads are triggered by consuming jobs, each of which describes a specific rendering task.
  • The EKS cluster contains two node groups: one for CPU-based workloads and one for GPU-based workloads.
  • The GPU node group can contain spot instances, which dramatically reduces the cost of GPU instances. The node group includes several GPU instance types, so when a GPU instance is required, the system does not have to wait for one particular instance type to become available.
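
The design above can be illustrated with a minimal Python sketch. It is not Datagen's or Automat-it's actual implementation: the node group names, the job kinds, and the instance types listed are all assumptions chosen for illustration. It shows only the two ideas described in the list, namely routing each job to the CPU or GPU node group, and falling back across several GPU instance types so that a job is not blocked on a single type.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class NodeGroup:
    name: str                  # hypothetical node group name
    instance_types: List[str]  # candidate instance types, in order of preference
    capacity_type: str         # "ON_DEMAND" or "SPOT"


# Assumed configuration: names and instance types are illustrative only.
CPU_GROUP = NodeGroup("cpu-workers", ["c5.4xlarge"], "ON_DEMAND")
GPU_GROUP = NodeGroup(
    "gpu-workers",
    ["g4dn.xlarge", "g4dn.2xlarge", "g5.xlarge"],
    "SPOT",
)


def select_node_group(job_kind: str) -> NodeGroup:
    """Route 3D environment generation to CPU nodes and rendering to GPU nodes."""
    if job_kind == "generate_environment":
        return CPU_GROUP
    if job_kind == "render":
        return GPU_GROUP
    raise ValueError(f"unknown job kind: {job_kind!r}")


def pick_instance_type(group: NodeGroup, available: Set[str]) -> Optional[str]:
    """Return the first preferred instance type that currently has capacity."""
    for instance_type in group.instance_types:
        if instance_type in available:
            return instance_type
    return None  # no capacity right now; the caller would retry later


if __name__ == "__main__":
    group = select_node_group("render")
    # Pretend only one of the GPU types currently has spot capacity.
    print(group.name, pick_instance_type(group, {"g5.xlarge"}))
```

In a real EKS cluster this selection is normally handled by the node group configuration and the cluster's autoscaling rather than by application code; the sketch only makes the fallback logic explicit.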

The EKS solution was built according to the pillars of the AWS Well-Architected Framework:

  1. Operational excellence – system health is fully monitored, allowing continuous improvement.
  2. Security – a secured 3-tier architecture is used.
  3. Reliability – the EKS workloads are deployed across several AWS Availability Zones.
  4. Performance efficiency – Datagen’s workloads are fully scalable.
  5. Cost optimization – FinOps practices are used to continuously improve Datagen’s cost efficiency.

Benefits

  1. Shortened processing time from days (even weeks) to hours.
  2. Reduced costs by 70% through the use of spot instances.
  3. Modernized staging and production environments.
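
As a rough illustration of how a spot discount translates into a percentage saving, the arithmetic can be sketched as follows. The hourly prices are hypothetical and were picked only so that the result matches the 70% figure; they are not Datagen's actual prices, and the reported saving may also reflect factors other than the instance price.

```python
def spot_savings(on_demand_hourly: float, spot_hourly: float) -> float:
    """Return the fractional saving of spot pricing relative to on-demand pricing."""
    if on_demand_hourly <= 0:
        raise ValueError("on-demand price must be positive")
    return 1.0 - spot_hourly / on_demand_hourly


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    saving = spot_savings(on_demand_hourly=1.00, spot_hourly=0.30)
    print(f"Saving: {saving:.0%}")
```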