The Government of India (India AI) has issued a document titled “Inviting Applications for Empanelment of Agencies for providing AI services on Cloud.” It invites in-country GPU-as-a-Service (GPUaaS) providers to bid for sovereign opportunities. It is a detailed and thoughtful document, and it will likely spur innovation at all levels of the AI/ML stack within India.

If you are responding to this invitation, or plan to, congratulations! However, some of the requirements in sections 6.7 “Admin Portal”, 6.8 “Service Provisioning”, 6.9 “Operational Management”, and 6.12 “SLA Management” are complicated. Taken together, they essentially require a GPU Cloud Management Software layer, and that software needs to be up and running within six months of the LOI (t0 + 6 months).

Let’s explore your options, since this is the classic “make” vs. “buy” situation. Here are the pros and cons of each.

“Make” Option
  Pros
  • Full control of the software with ability to differentiate and customize (it may actually not be possible to differentiate at the IaaS layer, so the differentiation argument might be questionable)
  Cons
  • Requires very strong in-house development skills, especially given the tight development timelines
  • Matching ongoing feature requirements will get challenging in the long term
“Buy” Option
  Pros
  • Get access to a purpose-built 3rd party product
  • Save cost (since 3rd party will be less expensive than in-house)
  • Focus precious development resources on AI/ML rather than Infra
  Cons
  • Customization will be possible, but might be more difficult than in-house software

If you are going for the “make” option, the rest of this blog is moot. However, if you want to explore the “buy” option, we can help you with the requirements below.[1]

Section Requirement
General
  • Admin portal available within 6 months of LOI
  • Dynamically manage 1,000+ GPUs
6.7 “Admin Portal”
  • User registration/account creation
  • Service catalog and prices
  • Capacity dashboard
  • Utilization monitoring
  • Incident management
  • Service Health Dashboard
  • Ability to customize dashboard for the subsidy workflow
6.8 “Service Provisioning”
  • Online, on-demand instances that can be scaled up/down
  • Management portal
  • Public internet access with VPN
  • Support for BMaaS and VMs
  • MTTR SLAs and recovery
  • User notifications
  • Data destruction (so it cannot be forensically recovered)
6.9 “Operational Management”
  • Patch management
  • OS images with latest security patches
  • Root cause analysis and timely repairs
  • System usage
6.12 “SLA Management”
  • SLA measurement and MTTR improvement to meet incident management SLA (99.95% or higher)
  • Service availability measurement
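
To make the SLA arithmetic concrete, here is a minimal Python sketch of how an availability figure and MTTR could be computed from incident durations. It assumes the 99.95% target is read as an availability percentage over a 30-day measurement window; that interpretation, the window length, and the function names `allowed_downtime`, `availability_pct`, and `mttr` are illustrative assumptions and are not taken from the tender document. Under those assumptions, 99.95% over 30 days leaves a downtime budget of about 21.6 minutes.

```python
from datetime import timedelta


def allowed_downtime(target_pct: float, window: timedelta) -> timedelta:
    """Downtime budget that still meets target_pct availability over window."""
    return window * (1 - target_pct / 100)


def availability_pct(downtimes: list[timedelta], window: timedelta) -> float:
    """Measured availability (in percent) given per-incident downtime durations."""
    total_down = sum(downtimes, timedelta())
    return 100 * (1 - total_down / window)


def mttr(downtimes: list[timedelta]) -> timedelta:
    """Mean time to repair: average downtime per incident (zero if no incidents)."""
    if not downtimes:
        return timedelta(0)
    return sum(downtimes, timedelta()) / len(downtimes)


if __name__ == "__main__":
    window = timedelta(days=30)  # assumed measurement window
    incidents = [timedelta(minutes=5), timedelta(minutes=10)]  # hypothetical data

    print("Downtime budget:", allowed_downtime(99.95, window))
    print("Measured availability (%):", availability_pct(incidents, window))
    print("MTTR:", mttr(incidents))
```

The budget is small enough that a single slow repair can consume it, which is why the requirement pairs SLA measurement with MTTR improvement.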

Finally, to our knowledge, we are the only GPU Cloud Management Software company in the market. If this blog sounds interesting, learn more:

  • Our GPU Cloud Management Software demo

And please feel free to contact us.