
Generative AI at the Edge: Engineering Challenges

Michael Sarpa

Article originally posted on Electronic Design.

The increasing demands for generative AI at the edge call for new solutions that can achieve high performance within the low-power requirements for edge apps.

What you’ll learn:
 
  • Transition from the cloud to the edge for generative AI.
  • New product challenges for generative AI at the edge.
  • Choosing an edge AI solution that meets these new challenges.

When most people think about artificial intelligence (AI), they typically contemplate applications that generate new content in text, images, or voice. Popular text applications like ChatGPT, which gained over 1M users within days of launch, have exploded onto the scene and are being adopted quickly. Mobile phone users rely on voice search constantly.

What do these applications have in common? They all rely on the cloud to process AI workloads. Despite the high costs associated with cloud generative AI, virtually unlimited memory and power capacity mean that cloud-based platforms will continue to drive these popular generative AI applications.

However, of more concern to many design engineers, embedded engineers, and data scientists is the explosion in AI applications at the edge. The desire to perform complex generative AI processing at the edge creates many new challenges, such as:

  • Real-time processing needs
  • Restrictive cost requirements
  • Limited memory resources
  • Tight space requirements
  • Mandatory power budgets

Challenge #1: Real-Time Processing Needs

One of the biggest downsides of cloud AI is the time delay in sending and receiving data. For applications like ChatGPT, this isn't an issue; users don't notice a delay of a few seconds while their text is generated. However, in many edge applications, this delay is unacceptable.

For example, in a self-driving car, steering and braking in response to detected objects such as pedestrians and other vehicles is critical to life-and-death decisions. Waiting on a round trip to the cloud is completely untenable. Edge AI applications must process data in real time at the edge, where it's collected, to meet these time-critical requirements.
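A back-of-the-envelope latency budget makes the point concrete. The sketch below compares a cloud round trip against on-device inference; all of the timing figures are illustrative assumptions, not measurements of any particular network or device:

```python
# Illustrative latency budget: cloud round trip vs. on-device inference.
# All numbers below are assumptions for illustration, not measured values.

NETWORK_RTT_MS = 60.0       # assumed cellular round-trip time to a cloud region
CLOUD_INFERENCE_MS = 15.0   # assumed inference time on a cloud GPU
EDGE_INFERENCE_MS = 20.0    # assumed inference time on an edge accelerator
DEADLINE_MS = 33.0          # e.g., one frame at 30 frames per second

cloud_total = NETWORK_RTT_MS + CLOUD_INFERENCE_MS   # 75 ms
edge_total = EDGE_INFERENCE_MS                      # 20 ms

for name, total in [("cloud", cloud_total), ("edge", edge_total)]:
    verdict = "meets" if total <= DEADLINE_MS else "misses"
    print(f"{name}: {total:.0f} ms total, {verdict} the {DEADLINE_MS:.0f} ms deadline")
```

Even with a fast cloud GPU, the network round trip alone can blow the per-frame deadline, while the slower local accelerator finishes comfortably inside it.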

Challenge #2: Restrictive Cost Requirements

For companies with huge customer bases, using expensive cloud solutions to process their AI workloads makes fiscal sense. However, in most edge applications, cost is a critical factor in creating competitive products in the marketplace. Effectively adding generative AI capability within a reasonable cost profile will be a challenge for AI designers and data scientists.

For example, in a smart-city application, adding advanced functionality to cameras and power-monitoring systems is desirable, but it must remain affordable within tight government budgets. Thus, the cloud becomes impractical; low-cost edge AI processing is a must.

Challenge #3: Limited Memory Resources

Another benefit of cloud AI is virtually unlimited memory capacity, which is especially useful for processing the huge datasets needed for accurate text analysis and generation. Edge applications don't have this luxury: memory capacity is limited by size, cost, and power. With such a limited resource, using memory efficiently and at maximum bandwidth becomes critical for edge applications.
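A quick calculation shows how fast model weights alone outgrow an edge memory budget. The parameter count below is an illustrative assumption; the arithmetic is the point:

```python
# Rough weight-memory footprint of a model at different numeric precisions.
# The 1B parameter count is an illustrative assumption.

PARAMS = 1_000_000_000  # a 1-billion-parameter generative model

BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "INT8": 1, "INT4": 0.5}

for precision, nbytes in BYTES_PER_PARAM.items():
    gib = PARAMS * nbytes / 2**30
    print(f"{precision}: {gib:.2f} GiB of weights")
# FP32: 3.73 GiB down to INT4: 0.47 GiB -- quantization matters at the edge.
```

This is weights only; activations, KV caches, and input buffers add more, which is why precision and memory efficiency dominate edge design decisions.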

Challenge #4: Tight Space Requirements

Data centers are housed in large spaces, so those AI applications don't face the space concerns often experienced in edge designs. Whether it's the limited under-dash space in an automotive design or the size and weight considerations of an aviation or space application, edge designs usually have limited space in which to implement AI functionality.

It's mandatory that such designs meet these specific size requirements for both the computing and memory resources needed to efficiently process AI workloads at the edge.

Challenge #5: Mandatory Power Budgets

The final challenge in edge-based designs may be the most critical. Cloud AI consumes an inordinate amount of power, but with direct access to grid power, the constraint comes down to the cost of electricity.

In edge AI applications, power may be readily available, yet power consumption remains a key consideration because its relative cost can loom large in the overall cost of the application. It becomes even more important in battery-powered edge AI products, where a higher-power device shortens the product's runtime and useful lifetime.
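The effect on battery life is simple arithmetic. The battery capacity and device power draws below are assumed figures for two hypothetical designs:

```python
# Battery life vs. accelerator power draw. All values are assumptions.

BATTERY_WH = 20.0      # assumed battery capacity in watt-hours
BASE_SYSTEM_W = 2.0    # assumed power draw of the rest of the system

for name, accel_w in [("low-power accelerator", 8.0), ("embedded GPU", 30.0)]:
    hours = BATTERY_WH / (BASE_SYSTEM_W + accel_w)
    print(f"{name}: {hours:.1f} hours of runtime")
# 20 Wh / 10 W = 2.0 h vs. 20 Wh / 32 W = 0.6 h
```

A 3X difference in accelerator power translates directly into a 3X difference in how long the product runs between charges.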

Solutions for the Edge AI Design Challenges

So, the question becomes: "How do engineers address these challenges and develop successful edge AI products while implementing the needed functionality?" Five keys to consider are:

1. Recognize the limitations of GPUs at the edge

GPUs have many strengths for cloud AI, and their characteristics line up well with cloud requirements: cloud AI applications can exploit their huge processing capability, and they meet real-time processing requirements.

However, when considering GPUs for edge applications, it's important to look at the weaknesses of GPU implementations. GPUs are more costly than dedicated AI accelerator devices, often 5X to 10X the cost, so they rarely meet edge AI cost requirements. They come in large device packages and enclosures that often exceed the size limits of edge applications. Finally, GPUs consume much more power than typical AI accelerators, making power budgets hard to achieve.

2. Realize the need for a complete software solution

Many AI accelerator developers focus first on the silicon design, which is understandable from a historical perspective, and silicon remains an important component of a quality edge AI solution. However, when software must be created and modified to fit an already-fixed silicon design, the resulting solution may not take full advantage of every opportunity for efficient AI processing.

Unlike in the past, when the majority of designers at an engineering company focused on hardware, software engineers now vastly outnumber hardware designers, making software capability and ease of use increasingly important. An excellent overall AI solution requires software development that enables the silicon to process AI models as efficiently as possible. Designing the software first, and then creating the silicon, can lead to the best AI solution.

3. Easy integration into existing systems

While some AI designs may be developed from scratch, more often than not designers are trying to integrate AI functionality into an existing system. By adding AI processing, these existing systems can offer more features to the end customer within the same form factor.

It's important to consider the ease of integration into a current design. Therefore, AI solutions that offer heterogeneous support for a wide range of existing processors will make it easier to add AI functionality to existing designs.

4. Efficient neural-network processing

Ultimately, the overall effectiveness of an AI solution depends on its ability to process neural-network workloads efficiently. While it may be difficult to sort through competing vendor claims, it's critical to understand the structure and capability of the "guts" of the AI solution. Ask how models are processed, what innovations have been implemented to increase processing efficiency, and how specific models compare head-to-head, as the simple harness below illustrates.
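One practical way to get past datasheet claims is to time the same model on each candidate platform yourself. This sketch uses ONNX Runtime as a stand-in inference engine; the model file name and execution provider are placeholders, and a vendor's own runtime and compiler would slot into the same timing loop:

```python
# Minimal head-to-head benchmark harness using ONNX Runtime.
# "model.onnx" and the execution provider are placeholders; swap in the
# model and vendor runtime you are actually evaluating.
import time
import numpy as np
import onnxruntime as ort

def benchmark(model_path: str, runs: int = 100) -> float:
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    inp = session.get_inputs()[0]
    # Replace symbolic dimensions (e.g., batch size) with 1 for the test input.
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    data = np.random.rand(*shape).astype(np.float32)

    for _ in range(10):                      # warm-up runs
        session.run(None, {inp.name: data})

    start = time.perf_counter()
    for _ in range(runs):
        session.run(None, {inp.name: data})
    return (time.perf_counter() - start) / runs * 1000.0  # ms per inference

if __name__ == "__main__":
    print(f"mean latency: {benchmark('model.onnx'):.2f} ms")
```

Running the same model, input shape, and precision on each platform keeps the comparison honest in a way that vendor-selected benchmarks often don't.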

5. DRAM capacity and bandwidth

An often-overlooked aspect of an AI design is the solution's ability to use memory effectively. Some accelerator solutions have limited or no on-chip memory, which can reduce AI processing capability: going off-chip to access memory is time-consuming and can significantly increase latency.

Other solutions may have on-chip memory, but if the DRAM bandwidth isn't high enough, the delay in accessing memory can also limit AI processing capability.
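A simple roofline-style estimate clarifies why bandwidth matters as much as capacity: if moving a layer's data takes longer than computing it, the accelerator sits idle waiting on DRAM. The hardware figures below are illustrative assumptions, not any specific device:

```python
# Roofline-style estimate: is a layer compute-bound or memory-bound?
# Hardware figures are illustrative assumptions.

PEAK_TOPS = 30.0            # assumed peak compute, trillions of ops/second
DRAM_BANDWIDTH_GBS = 60.0   # assumed DRAM bandwidth, gigabytes/second

def layer_time(ops: float, bytes_moved: float) -> str:
    compute_ms = ops / (PEAK_TOPS * 1e12) * 1e3
    memory_ms = bytes_moved / (DRAM_BANDWIDTH_GBS * 1e9) * 1e3
    bound = "memory-bound" if memory_ms > compute_ms else "compute-bound"
    return f"compute {compute_ms:.3f} ms vs. memory {memory_ms:.3f} ms -> {bound}"

# A layer whose weights spill to DRAM: 2 GOPs of math, 400 MB of traffic.
print(layer_time(ops=2e9, bytes_moved=4e8))
```

In this example the math takes well under a tenth of a millisecond, but the DRAM traffic takes several milliseconds, so raw TOPS figures say little unless on-chip memory and bandwidth keep the compute units fed.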

Look for the Right Vendor for Edge AI Solutions

As AI transitions from the cloud to the edge, and increasingly becomes adopted across various industries, embedded designers are encountering numerous challenges. Selecting the right AI hardware vendor to support current and future edge AI solutions is crucial for the successful implementation of the product.

Evaluating an AI accelerator's performance against the key design challenges outlined in this article can be insightful. Co-processors like the EdgeCortix SAKURA-II Edge AI Accelerator offer solutions that address these industry challenges and enable designers to create energy-efficient, high-performing edge AI solutions.

