Article originally posted on Electronic Design.
When most people think about artificial intelligence (AI), they typically contemplate applications that generate new content as text, images, or voice. Popular text applications like ChatGPT, which gained over a million users within days of launch, have exploded onto the scene and are being adopted quickly, while mobile phone users rely on voice search constantly.
What do these applications have in common? They all rely on the cloud to process AI workloads. Despite the high costs associated with cloud generative AI, the cloud's virtually unlimited memory and power capacity means it will continue to host these popular generative AI applications.
However, of more concern to many design engineers, embedded engineers, and data scientists is the explosion of AI applications at the edge. The desire to perform complex generative AI processing at the edge creates a set of new challenges: latency, cost, memory capacity, physical size, and power consumption.
One of the biggest downsides of cloud AI is the time delay in sending and receiving data. For applications like ChatGPT, this isn't an issue: users don't notice the delay and will happily wait a few seconds for their text generation to complete. In many edge applications, however, this delay is unacceptable.
For example, in a self-driving car, steering and braking in response to identified objects, such as pedestrians and other vehicles, is absolutely critical for life-and-death decisions. Waiting on a round trip to the cloud is completely untenable. Edge AI applications must process data in real-time at the edge, where it's being collected, to meet these time-critical requirements.
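To put this in concrete terms, the short Python sketch below runs a back-of-the-envelope calculation with assumed figures (a 200-ms cloud round trip, a 10-ms on-device inference, a vehicle at 100 km/h; none of these are measured values) to show how far a vehicle travels while waiting for a decision:

```python
# Illustrative latency-budget arithmetic (assumed numbers, not measurements).

SPEED_KMH = 100.0                       # assumed highway speed
SPEED_MPS = SPEED_KMH * 1000 / 3600     # ~27.8 m/s

CLOUD_ROUND_TRIP_S = 0.200              # assumed 200-ms network round trip
EDGE_INFERENCE_S = 0.010                # assumed 10-ms on-device inference

for label, latency_s in [("cloud", CLOUD_ROUND_TRIP_S),
                         ("edge", EDGE_INFERENCE_S)]:
    meters = SPEED_MPS * latency_s
    print(f"{label}: {latency_s * 1000:.0f} ms -> vehicle travels {meters:.1f} m")
```

At highway speed, the assumed cloud round trip costs more than five meters of travel before the system can react, while local inference keeps it to well under a meter.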
For companies with huge customer bases, using expensive cloud solutions to process AI workloads makes fiscal sense. In most edge applications, however, cost is critical to building competitive products. Adding generative AI capability within a reasonable cost profile will be a challenge for AI designers and data scientists.
For example, in a smart-city application, adding advanced functionality to cameras and power-monitoring systems would be desirable, but it would have to fit within tight government budgets. The cloud thus becomes impractical; low-cost edge AI processing is a must.
Another benefit of cloud AI is virtually unlimited memory capacity, which is especially useful for processing the huge datasets needed for accurate text analysis and generation. Edge applications don't have this luxury: memory capacity is limited by size, cost, and power. With memory scarce, using it efficiently at maximum bandwidth becomes extremely critical.
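A quick back-of-the-envelope calculation shows why bandwidth dominates at the edge. The sketch below assumes a hypothetical 1-billion-parameter model with INT8 weights generating 20 tokens per second; every figure is illustrative, not a device specification:

```python
# Back-of-the-envelope DRAM-bandwidth estimate for on-device generation.
# Every figure here is an assumption for illustration, not a device spec.

params = 1e9             # assumed 1-billion-parameter model
bytes_per_param = 1      # assumed INT8-quantized weights
tokens_per_sec = 20      # assumed target generation rate

# Autoregressive decoding reads roughly the full weight set per token,
# so the sustained bandwidth requirement is approximately:
bandwidth_gb_s = params * bytes_per_param * tokens_per_sec / 1e9
print(f"~{bandwidth_gb_s:.0f} GB/s sustained, before activations and KV-cache traffic")
```

Even this modest hypothetical model demands on the order of 20 GB/s of sustained memory bandwidth, which is why efficient memory use is a make-or-break design factor at the edge.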
Data centers are housed in large spaces, so cloud AI applications don't face the space constraints common in edge designs. Whether it's the limited under-dash space in an automotive design or the size and weight considerations of an aviation or space application, edge designs usually have limited room for their AI functionality. Such designs must meet strict size requirements for both the computing and memory resources needed to process AI workloads efficiently at the edge.
The final challenge in edge-based designs may be the most critical. Cloud AI consumes an inordinate amount of power, but because data centers have direct access to the grid, power is primarily a matter of electricity cost.
In edge AI applications, power may be readily available, yet consumption remains a key consideration because its relative cost weighs more heavily in the overall cost of the application. Power consumption becomes even more important in battery-powered edge AI products, where a higher-power device directly shortens the product's battery life.
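The battery impact is simple arithmetic. The sketch below compares two assumed power draws against the same assumed battery capacity; the wattages and capacity are illustrative placeholders, not vendor figures:

```python
# Battery-life arithmetic for an edge AI device (all values assumed).

BATTERY_WH = 10.0                                   # assumed 10-Wh battery pack

for label, watts in [("GPU-class module", 15.0),    # assumed power draws
                     ("edge AI accelerator", 3.0)]:
    hours = BATTERY_WH / watts
    print(f"{label}: {watts:.0f} W -> ~{hours:.1f} h per charge")
```

Under these assumptions, a 5X difference in power draw translates directly into a 5X difference in runtime per charge.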
So, the question becomes: "How do engineers address these challenges and develop successful edge AI products with the needed functionality?" Five keys to consider: choose the right processor rather than defaulting to a GPU, take a software-first approach, make integration easy, scrutinize the core processing efficiency, and use memory effectively.
GPUs have a lot of strengths for cloud AI, where their characteristics line up well: they offer huge processing capability that cloud AI applications can exploit, and they meet real-time processing requirements with ease.
However, when considering GPUs for edge applications, it's important to look at their weaknesses. GPUs are more costly than dedicated AI accelerators, often by 5X to 10X, so they rarely meet edge AI cost requirements. They come in large packages and large boxes that often fail the size requirements of edge applications. Finally, GPUs consume much more power than typical AI accelerators, making power budgets hard to achieve.
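These tradeoffs can be made concrete with a quick efficiency comparison. All figures below are assumed for illustration and don't describe any specific product:

```python
# Rough cost- and power-efficiency comparison (assumed, illustrative figures).

candidates = {
    "discrete GPU":        {"tops": 100.0, "watts": 75.0, "cost_usd": 1000.0},
    "edge AI accelerator": {"tops": 60.0,  "watts": 10.0, "cost_usd": 150.0},
}

for name, c in candidates.items():
    print(f"{name}: {c['tops'] / c['watts']:.1f} TOPS/W, "
          f"{c['tops'] / c['cost_usd'] * 1000:.0f} TOPS per $1k")
```

Even when the GPU wins on raw throughput, the accelerator can come out well ahead on the performance-per-watt and performance-per-dollar metrics that dominate edge designs.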
Many AI accelerator developers focus first on the silicon design, which is understandable from a historical perspective, and silicon remains an important component of a quality edge AI solution. However, when the software must be created and bent to fit an already-fixed silicon design, the resulting solution may not exploit every opportunity for efficient AI processing.
Unlike in the past, when most designers at an engineering company focused on hardware, software engineers now vastly outnumber hardware designers, making software capability and ease of use increasingly important. An excellent overall AI solution requires software that enables the silicon to process AI models as efficiently as possible. Designing the software first and then creating the silicon can lead to the best AI solution.
While some AI designs may be developed from scratch, more often than not designers are trying to integrate AI functionality into an existing system. By adding AI processing, these existing systems can offer more features to the end customer within the same form factor.
It's important to consider the ease of integration into a current design. AI solutions that offer heterogeneous support for a wide range of host processors make it easier to add AI functionality to what already exists.
Ultimately, the overall effectiveness of an AI solution depends on how efficiently its neural-network engine processes AI workloads. While it can be difficult to sort through vendor claims, it's critical to understand the structure and capability of the "guts" of the AI solution. Ask how models are processed, what innovations have been implemented to increase processing efficiency, and how specific models perform head-to-head.
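One way to make such a head-to-head comparison concrete is a simple timing harness. In the Python sketch below, `run_inference` is a hypothetical placeholder for whatever entry point a given vendor's SDK exposes; actual APIs will differ:

```python
import statistics
import time

def benchmark(run_inference, input_batch, warmup=10, iters=100):
    """Time a vendor-supplied inference callable head-to-head.

    `run_inference` is a hypothetical placeholder for whatever entry
    point a given SDK exposes; swap in the real call for each device.
    """
    for _ in range(warmup):            # warm caches, compilers, power states
        run_inference(input_batch)
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        run_inference(input_batch)
        times.append(time.perf_counter() - start)
    return {
        "median_ms": statistics.median(times) * 1000,
        "p99_ms": sorted(times)[int(0.99 * len(times))] * 1000,
    }
```

Feeding the same model and input batch to each candidate device through the same harness, and comparing tail latency as well as the median, puts vendor claims on an equal footing.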
An often-overlooked aspect of an AI design is how effectively the solution uses memory. Some accelerators have limited or no on-chip memory, which can throttle AI processing: going off-chip for every access is time-consuming and significantly increases latency. Other solutions do have on-chip memory, but if the DRAM bandwidth behind it isn't high enough, memory-access delays can still limit AI processing capability.
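The interplay of compute and memory bandwidth can be reasoned about with a roofline-style estimate: attainable throughput is capped by the lesser of peak compute and arithmetic intensity times memory bandwidth. The device figures in the sketch below are assumptions for illustration:

```python
# Roofline-style estimate: attainable performance is bounded by
# min(peak compute, arithmetic intensity x memory bandwidth).
# Both device figures are assumptions for illustration.

peak_tops = 40.0          # assumed peak INT8 compute, TOPS
dram_gbps = 25.0          # assumed DRAM bandwidth, GB/s

def attainable_tops(ops_per_byte):
    """ops_per_byte = arithmetic intensity of the layer or model."""
    return min(peak_tops, ops_per_byte * dram_gbps / 1000)

for intensity in (1, 10, 100, 10000):   # ops per byte moved from DRAM
    print(f"intensity {intensity:>5}: {attainable_tops(intensity):.1f} TOPS")
```

Memory-bound workloads (low arithmetic intensity) never reach the compute peak, which is why on-chip memory and DRAM bandwidth matter as much as headline TOPS figures.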
As AI transitions from the cloud to the edge and is increasingly adopted across industries, embedded designers encounter numerous challenges. Selecting the right AI hardware vendor to support current and future edge AI solutions is crucial to successful implementation of the product.
Evaluating an AI accelerator’s performance against the key design challenges outlined in this article can be insightful. Co-processors like the EdgeCortix SAKURA-II Edge AI Accelerator offer solutions that address these industry challenges and enable designers to create energy-efficient, high-performing edge AI solutions.
Michael is EdgeCortix’s Director of Product Marketing. He has four decades of experience in product marketing for both large-scale worldwide semiconductor companies and small start-up companies in the early stages of emerging markets. His experience includes the integration and marketing of complete product solutions, including silicon, software, and development platforms. Michael holds an MBA from Santa Clara University and a BSEE from California Polytechnic State University, San Luis Obispo.