Aarna.ml

Blog

Amar Kapadia

Server Architecture Changing After Six Decades?

At the Partner Reception at GTC last week, Jensen Huang stated that computer architecture is changing for the first time since 1964 with the advent of accelerated computing.

I tend to agree. After 60 years, server architecture is changing for the first time, from retrieval-based to generative. The diagram below captures my thinking, which is centered around the Human Machine Interface (HMI).

Human Machine Interface

From the mid-1960s to the mid-1980s, the HMI was the CLI, and the “servers” were minicomputers and mainframes built by companies such as IBM, Digital Equipment Corporation, and HP using highly proprietary architectures. The focal point was the CPU. From the mid-1980s to the mid-2020s, the HMI of choice has been a GUI (largely based on Xerox PARC research) or REST APIs, which led to client-server computing and its variations, such as the current front-end↔back-end split. This era has been dominated by industry-standard servers with a CPU focus. The winner has been the x86 CPU and its ecosystem. Networking, memory, I/O, storage, and datacenters have undergone a tremendous renaissance during this era.


Moving forward, the interface will be GenAI. We will no longer interact with computers in highly structured ways to retrieve information; instead, communication will be human-like, based on dynamic generation of responses. Both input and output will be based on GenAI. After all, when we talk to humans, we don’t provide inputs through point-and-click screens and view outputs through dashboards. This era will be dominated by accelerated computing, where the winner will be the GPU and its ecosystem. This doesn’t mean the CPU disappears; the CPU will always be needed, it will just take a back seat.

In my mind, there are three key tenets to this new architecture:

  • CPU, memory, I/O, networking, storage, and the datacenter all have to cater to the GPU and will change in fundamental ways
  • Utilization has to approach 100% given the cost of GPUs; utilization has not been a major concern so far
  • Use of a completely greenfield technology stack

This new world creates massive new opportunities for us (Aarna):

  1. The infra needs to be orchestrated and managed in new ways. In the NVIDIA context that could take the shape of DGX-as-a-service and MGX-as-a-service.
  2. Workload orchestration and management will take a front-row seat given the utilization concern. Sophisticated techniques are required such as bringing in secondary workloads on the same GPU cluster when the primary workload is easing up. The GPU owner may need to sell off GPU capacity to aggregators as a “spot instance” during periods of underutilization. 
  3. Given our use of the Kubernetes-Nephio framework, greenfield is music to our ears. We don’t have to worry about VMs or bare metal instances based on old operating systems.
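The utilization concern in point 2 can be made concrete with a toy scheduler: when the primary workload's GPU demand eases, the scheduler backfills with queued secondary jobs and offers any remaining idle capacity as spot instances. This is only an illustrative sketch; the class and field names (`GpuCluster`, `secondary_queue`, etc.) are my own, not any real scheduler's API.

```python
# Illustrative sketch: backfill idle GPUs with secondary jobs, then
# offer leftover capacity as spot instances. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class GpuCluster:
    total_gpus: int
    primary_demand: int            # GPUs the primary workload needs right now
    secondary_queue: list = field(default_factory=list)

    def idle_gpus(self) -> int:
        return self.total_gpus - self.primary_demand

    def schedule(self, spot_price_per_gpu: float) -> dict:
        """Fill idle GPUs with queued secondary jobs, then sell the rest as spot."""
        idle = self.idle_gpus()
        placed = []
        # Admit secondary jobs in FIFO order while capacity remains.
        while self.secondary_queue and idle >= self.secondary_queue[0]["gpus"]:
            job = self.secondary_queue.pop(0)
            idle -= job["gpus"]
            placed.append(job["name"])
        return {
            "secondary_jobs_placed": placed,
            "gpus_offered_as_spot": idle,
            "spot_revenue_per_hour": idle * spot_price_per_gpu,
        }

cluster = GpuCluster(
    total_gpus=64, primary_demand=40,
    secondary_queue=[{"name": "batch-inference", "gpus": 16},
                     {"name": "fine-tune", "gpus": 16}])
print(cluster.schedule(spot_price_per_gpu=2.5))
# → {'secondary_jobs_placed': ['batch-inference'], 'gpus_offered_as_spot': 8,
#    'spot_revenue_per_hour': 20.0}
```

A real implementation would, of course, account for preemption, topology (NVLink domains), and job priorities, but the economics are the same: every idle GPU-hour is revenue left on the table.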

I’d love to hear your thoughts on these topics. Do reach out to me for a discussion.

Amar Kapadia

Insights from Mobile World Congress 2024

The Mobile World Congress (MWC) 2024 in Barcelona brought together leaders from the telecommunications industry to discuss the future of digital connectivity. This event served as a platform for sharing advancements and exploring new technologies that are shaping the way we connect.

Photo: Amar Kapadia and Subramanian Sankaranarayanan at the IEEE booth

Aarna.ml participated in MWC 2024, with Amar Kapadia and Subramanian Sankaranarayanan showcasing our work in edge and private 5G management. We demonstrated the capabilities of NVIDIA Grace Hopper servers and AMCOP's orchestration abilities in collaboration with NVIDIA and IEEE. These demonstrations aimed to show practical applications of our technology in improving network performance and scalability. For those interested, we have made a demo video available here.

The congress highlighted several key developments in the industry. The evolution of 5G technology was a major topic, with discussions on how it can improve connectivity speeds and reduce latency. The use of artificial intelligence (AI) in network operations and service delivery was also emphasized, pointing towards more efficient telecommunications infrastructure. Quantum computing's potential in enhancing data security and processing power was explored, and there was a notable focus on sustainability, with talks on making operations more energy-efficient. These discussions at MWC 2024 indicate a move towards more efficient, secure, and environmentally friendly digital connectivity.

MWC 2024 offered a clear view of the current state and future possibilities in telecommunications, highlighting the industry's ongoing efforts to improve and innovate. Aarna.ml is excited to contribute to this progress, focusing on solutions that enhance connectivity and pave the way for future advancements.

Sriram Rupanagunta

Empowering Edge Computing: The Imperative Role of Automation Solutions

We are often asked why automation or orchestration is needed for Edge computing, given that Edge computing is not a new concept. In this blog, you'll learn about the role of an orchestrator in unleashing the true potential of Edge computing environments.

Recapping the uniqueness of Edge environments, the blog 'Why Edge Orchestration is Different' by Amar Kapadia highlights the following attributes:

  1. Scale
  2. Dynamic in nature
  3. Heterogeneity
  4. Dependency between workloads and infrastructure

We will look at the challenges each of these poses, and it will become obvious why automation plays a critical role. Also, as explained in the previous blog, Edge environments include both the infrastructure and the applications that run on it (physical, virtual, or cloud-native), so all of the above factors need to be considered for both.

The scale of Edge environments clearly prohibits operating them manually, since this involves bringing up each environment, with its own set of initial (day-0) configurations, independently of the others. The problem is compounded when these environments need to be managed on an ongoing basis (day-N). This also brings up the challenge of the dynamic nature of these environments, where configurations can keep changing based on the business needs of the users. Each such change may mean tearing down the previous environment and bringing up another, possibly with a different set of configurations. Some of these environments may take days to bring up, requiring expertise from different domains (another challenge), and any change, even a potentially minor one, can mean a few more days to bring the environment up again.

Another challenge with Edge environments is their heterogeneous nature, unlike the public cloud, which is generally homogeneous. This means that multiple tools, possibly from different vendors, need to be used. These tools could be proprietary or standard tools such as Terraform, Ansible, Crossplane, and so on. Each vendor of the infrastructure or the applications/network functions could be using a different tool, and even where vendors use standard tools, there may be multiple versions of the artifacts (e.g., Terraform plans, Ansible scripts) to deal with.
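One common way an orchestrator tames this tool heterogeneity is an adapter layer: each tool (Terraform, Ansible, a proprietary vendor tool) is wrapped behind one common interface, and the orchestrator dispatches each artifact to the matching driver. The sketch below is illustrative only; the class and function names are my own, not AMCOP's API, and the drivers return strings where real ones would shell out to the tools.

```python
# Illustrative adapter pattern for driving heterogeneous IaC tools
# behind one interface. Names here are hypothetical.
from abc import ABC, abstractmethod

class InfraDriver(ABC):
    @abstractmethod
    def apply(self, artifact: str) -> str: ...

class TerraformDriver(InfraDriver):
    def apply(self, artifact: str) -> str:
        # In practice this would run `terraform apply` on the plan.
        return f"terraform: applied plan {artifact}"

class AnsibleDriver(InfraDriver):
    def apply(self, artifact: str) -> str:
        # In practice this would run `ansible-playbook` against an inventory.
        return f"ansible: ran playbook {artifact}"

DRIVERS = {"terraform": TerraformDriver(), "ansible": AnsibleDriver()}

def roll_out(site_artifacts: list) -> list:
    """Apply each (tool, artifact) pair with the matching driver."""
    return [DRIVERS[tool].apply(artifact) for tool, artifact in site_artifacts]

print(roll_out([("terraform", "edge-site-1.plan"), ("ansible", "5g-core.yml")]))
```

Adding a vendor-proprietary tool then means writing one more driver, not changing the orchestration logic.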

The workloads on the edge may need to talk to or integrate with applications in a central location or cloud. This requires setting up connectivity between the edge and other sites as desired. The orchestrator should also be able to provision this in an automated manner.

Lastly, as we saw in the previous blog, there may be dependencies between the infrastructure and the workloads, as well as between various workloads (e.g., network functions such as 5G that are used by other applications). This makes it extremely difficult to bring them up manually or with home-grown solutions.

All these challenges mean that unless the Edge environment is extremely small and contained, it will need a sophisticated automation framework, i.e., an orchestrator. The only scalable way to accomplish this is to specify the topology of the environment as an intent, which is rolled out by the orchestrator. In addition, the orchestrator should constantly monitor the deployment and make the necessary adjustments (constant reconciliation) to the deployment and management of the topologies in the Edge environment. When a change is required, a new intent (configuration) is specified, which should be rolled out seamlessly. The orchestrator should also be able to work with various tools such as Terraform/OpenTofu, Ansible, and so on, as well as provide ways to integrate with proprietary vendor solutions.
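The constant-reconciliation idea can be sketched in a few lines: the orchestrator repeatedly compares desired state (the intent) against observed state and emits corrective actions. This is the same pattern Kubernetes-style controllers use. The sketch is a minimal, assumed model; the intent schema and action strings are illustrative, not any product's actual format.

```python
# Minimal sketch of intent-based reconciliation: diff desired vs. observed
# state and emit corrective actions. Schema and names are illustrative.
def reconcile(intent: dict, observed: dict) -> list:
    """Return the corrective actions needed to converge observed -> intent."""
    actions = []
    for workload, spec in intent.items():
        if workload not in observed:
            actions.append(f"deploy {workload} with {spec['replicas']} replicas")
        elif observed[workload]["replicas"] != spec["replicas"]:
            actions.append(f"scale {workload} to {spec['replicas']} replicas")
    for workload in observed:
        if workload not in intent:
            actions.append(f"tear down {workload}")
    return actions

intent = {"upf": {"replicas": 2}, "video-analytics": {"replicas": 3}}
observed = {"upf": {"replicas": 1}, "legacy-app": {"replicas": 1}}
print(reconcile(intent, observed))
# → ['scale upf to 2 replicas', 'deploy video-analytics with 3 replicas',
#    'tear down legacy-app']
```

A real orchestrator runs this loop continuously, re-reading the intent whenever the operator submits a new configuration, so a change rollout is just another reconciliation pass.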

At Aarna.ml, we offer an open source, zero-touch, intent-based orchestrator, AMCOP (also offered as a SaaS, AES), for lifecycle management, real-time policy, and closed-loop automation for edge and 5G services. If you’d like to discuss your orchestration needs, please contact us for a free consultation.

Sriram Rupanagunta

Enabling RAGOps

This is a follow-up to the earlier blog “From RAGs to the Riches” from my colleague, Amar Kapadia.

Setting up GenAI for an enterprise involves multiple steps, which can be categorized as:

  • Infrastructure Orchestration, which includes the servers/GPUs with cloud software, virtualization tools, and the networking infrastructure. There may be additional requirements depending on the enterprise's needs, such as:
      ◦ SD-WAN setup between their locations
      ◦ Access to enterprise data from their SaaS infrastructure (Confluence, Jira, Salesforce, etc.)
      ◦ Connectivity to public clouds, if needed
      ◦ Connectivity to the repos where the GenAI models are hosted (Hugging Face, etc.)
      ◦ If this is set up in cloud edge DCs (such as Equinix), configuring the fabric to connect to other edge locations or the public clouds, using network edge devices (routers/firewalls that run as xNFs)
  • GenAI Orchestration, which includes bringing up the GenAI tools, either for training or for inferencing.
  • RAG Orchestration, which includes building the necessary vector DB from various enterprise sources and using it as part of the inferencing pipeline.
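The RAG orchestration step can be illustrated with a self-contained toy: build a vector index from enterprise documents, then use retrieval inside the inference pipeline. A bag-of-words "embedding" stands in for a real embedding model (which in practice would be served by something like Triton); all function names here are my own for the sketch.

```python
# Toy RAG sketch: build a vector DB from documents, retrieve the best
# match for a query. Bag-of-words vectors stand in for real embeddings.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Real deployments use a neural embedding model; bag-of-words
    # keeps this sketch self-contained.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def build_vector_db(docs: dict) -> dict:
    return {doc_id: embed(text) for doc_id, text in docs.items()}

def retrieve(db: dict, docs: dict, query: str, k: int = 1) -> list:
    ranked = sorted(db, key=lambda d: cosine(db[d], embed(query)), reverse=True)
    return [docs[d] for d in ranked[:k]]

docs = {"jira-101": "gpu cluster outage on edge site 3",
        "wiki-7": "vacation policy for new employees"}
db = build_vector_db(docs)
context = retrieve(db, docs, "what caused the gpu outage")
print(context)  # → ['gpu cluster outage on edge site 3']
```

The retrieved passages are then prepended to the LLM prompt at inference time; the orchestrator's job is to keep the vector DB fresh as the enterprise sources (Confluence, Jira, etc.) change.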

All of the above requires a sophisticated orchestrator that can work in a generic manner and provide single-click (or single-command) functionality.

The flow will be as follows: 

  • The Admin creates a high-level intent that describes the necessary infrastructure, connectivity requirements, site details, and tools
  • The Orchestrator takes the intent as input and sets up the necessary infrastructure and applications
  • The Orchestrator also monitors the infra/applications for failures or performance issues and makes the necessary adjustments (it could work with one of the existing tools, such as TMS, for this function)
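To make the flow above concrete, here is a sketch of what such a high-level intent might look like and how an orchestrator could expand it into ordered provisioning steps. The schema, field names, and site identifier are all hypothetical, chosen only to mirror the three bullets; they are not an actual AMCOP intent format.

```python
# Hypothetical intent document and a function that expands it into
# ordered provisioning steps. Schema and names are illustrative only.
INTENT = {
    "site": "equinix-sv5",
    "infrastructure": {"servers": 4, "gpus_per_server": 8,
                       "connectivity": ["sdwan:hq", "fabric:aws-us-west"]},
    "genai": {"mode": "inference", "model_repo": "huggingface"},
    "rag": {"sources": ["confluence", "jira"], "vector_db": "required"},
}

def orchestrate(intent: dict) -> list:
    """Expand the intent into the ordered steps the orchestrator performs."""
    infra = intent["infrastructure"]
    steps = [f"provision {infra['servers']} servers at {intent['site']}"]
    steps += [f"set up link {link}" for link in infra["connectivity"]]
    steps.append(f"deploy GenAI stack for {intent['genai']['mode']}")
    if intent["rag"].get("vector_db") == "required":
        steps.append(f"build vector DB from {', '.join(intent['rag']['sources'])}")
    # Closing the loop: monitoring feeds back into reconciliation.
    steps.append("start monitoring and closed-loop remediation")
    return steps

for step in orchestrate(INTENT):
    print(step)
```

The point of the intent abstraction is that the Admin only ever edits this one document; the orchestrator derives and re-derives the steps from it.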

I hope this sheds some light on the topic and gives some clarity on how to go about setting up the underlying infrastructure for RAGOps. 

AMCOP can orchestrate AI (and more specifically, GenAI) workloads on various platforms. At Aarna.ml, we offer an open source, zero-touch orchestrator, AMCOP (also offered as a SaaS, AES), for lifecycle management, real-time policy, and closed-loop automation for edge and 5G services. If you’d like to discuss your orchestration needs, please contact us for a free consultation.

Next Steps

Contact us for help getting started with RAGOps. The Aarna.ml Multi Cluster Orchestration Platform orchestrates and manages edge environments, including support for RAGOps. We have specifically created an offering suitable for NSPs by focusing not just on the FM and related ML components, but also on the infrastructure, e.g., using Equinix Metal to speed up deployment and Equinix Fabric for seamless data connectivity. As an NVIDIA partner, we have deep expertise with server platforms like the NVIDIA Grace Hopper and platform components such as NVIDIA Triton and NeMo.

Amar Kapadia

A Glimpse from PTC'24

At the recent Pacific Telecommunications Council (PTC) '24 event held in Honolulu, Hawaii, Subramanian Sankaranarayanan, AVP at Aarna.ml, took the stage to deliver an insightful talk on “Multi-Domain Edge Connectivity Services for Equinix Metal, Network Edge, Fabric, and Multi-Cloud.”

Subbu’s presentation centered on the dynamic evolution of data centers towards Infrastructure-as-a-Service (IaaS) and the complexities inherent in multi-vendor IaaS deployments. He highlighted the innovative solutions offered by the Linux Foundation Edge Akraino PCEI, an award-winning blueprint, for orchestrating and managing cloud edge infrastructures.

A focal point of his discussion was Aarna Edge Services (AES), a SaaS platform instrumental in simplifying the deployment and orchestration of infrastructure, apps, and network services at the cloud edge. Subbu illustrated various use cases of AES, demonstrating its efficiency in reducing deployment time from weeks to less than an hour and optimizing cloud-adjacent storage and GenAI processes.

The session provided valuable insights into the future of cloud and edge computing, emphasizing the importance of seamless integration and efficient management in today's interconnected digital world.


Subbu's expertise and the innovative approaches discussed at PTC'24 paint an exciting picture of the future of cloud edge management and multi-cloud deployments, promising a more streamlined, efficient, and interconnected digital ecosystem.


We are grateful to the Pacific Telecommunications Council (PTC) for this amazing opportunity, the memorable exposure, and all the time we spent at PTC’24.

If you couldn't connect with us at the event, feel free to contact us to arrange a meeting.

Amar Kapadia

Exploring Edge-Native Application Design Behaviors

In December 2023, the tech community welcomed a groundbreaking whitepaper titled "Edge-Native Application Design Behaviours." This comprehensive document delves into the dynamic realm of Edge-native application design, providing invaluable insights for developers and architects navigating the unique challenges of Edge environments.
Evolution from CNCF IoT to Edge-Native Principles

Building upon the foundational principles outlined in the CNCF IoT Edge Native Application Principles Whitepaper, this latest release adapts and refines these principles specifically for Edge environments. The result is a guide that serves as an indispensable resource for those working on Edge-native applications, offering practical guidelines and illuminating insights.

Navigating Key Aspects of Edge-Native Design

The whitepaper meticulously explores key aspects crucial for Edge-native design, unraveling the intricacies of concurrency, scale, autonomy, disposability, capability sensitivity, data persistence, and operational considerations. A particular highlight is a real-world scenario, illustrating the application of these design behaviours in a tangible context.

Decoding Edge Native Application Design

Understanding Edge-native application design necessitates recognizing its departure from cloud-native design. Edges, as autonomous entities, play a pivotal role in ingesting, transforming, buffering, and displaying data locally. Distributed edge components complement these entities, handling functions to reduce bandwidth consumption and adhere to location-based policies.

Design Constraints and Principles

Edge-native applications face distinct design constraints, such as connectivity, data-at-rest, and resource constraints. The whitepaper emphasises the importance of evolving cloud-native application design principles to address these constraints effectively. Key principles include the separation of data and code, stateless processes, share-nothing entities, and the separation of build and run stages.

Guidelines for Edge-Native Development

For developers venturing into Edge-native applications, the whitepaper provides a detailed reference guide. Topics such as concurrency and scale, edge autonomy, disposability, capability sensitivity, data persistence, metrics/logs, and operational considerations are meticulously explored.

A Glimpse into the Future

As the digital landscape evolves, Edge-native application design becomes increasingly vital. The whitepaper not only serves as a guide but also charts a course for future development in this dynamic field. The principles and insights shared pave the way for innovation, ensuring that Edge-native applications are not just efficient but also resilient in the face of evolving technological landscapes.

Click here to download the whitepaper.