In our last blog, Which xPU? Understand the Differences Between Edge-Based Processors, we went through the different types of processing elements that might show up on an Edge computer, along with a description of each function type. This list included CPUs (central processing units), GPUs (graphics processing units), TPUs (tensor processing units), and VPUs (vision processing units). Now that we have a basic understanding of each device's capabilities, we will review some of the decision points needed to select the most beneficial option for an OEM Edge computer, where it applies, and in what combination.
Edge computing is defined in many ways, but for our examples we will assume the system requires significant processing power, pointing to a CPU such as an Arm Cortex or x86 equivalent with some level of integrated graphics. Performance and cost are always at odds, so product developers should first determine the level of inference and/or training required by the artificial-intelligence (AI) or machine-learning (ML) algorithms.
Many of the CPUs readily available from companies like WINSYSTEMS provide internal GPUs and other capabilities for AI/ML functions. Intel's OpenVINO development software can be used with the company's CPUs to create a robust AI Edge computing device. NXP's i.MX8 family of processors likewise offers significant AI/ML capabilities and software support.
The first step is to determine whether the application can reasonably be built around a CPU-only solution to reduce cost, complexity, and size. You might be surprised at the capabilities of today's CPUs if you have not worked with them recently.
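One way to frame that first decision is a back-of-envelope compute budget: compare the model's per-inference cost against what the CPU can sustain while leaving headroom for the rest of the application. The sketch below is illustrative only; the GFLOP figures, frame rates, and headroom fraction are placeholder assumptions, not vendor benchmarks.

```python
# Hypothetical sizing helper for the "can we stay CPU-only?" decision.
# All numbers passed in are assumptions the designer must measure or
# estimate for their own model and processor.

def cpu_only_feasible(model_gflops_per_inference: float,
                      target_fps: float,
                      cpu_sustained_gflops: float,
                      headroom: float = 0.5) -> bool:
    """Return True if the CPU can meet the target frame rate while
    reserving `headroom` (fraction of compute) for the rest of the
    application (I/O, networking, control logic)."""
    required = model_gflops_per_inference * target_fps
    available = cpu_sustained_gflops * (1.0 - headroom)
    return required <= available

# Example: a small vision model (~0.6 GFLOPs/inference) at 10 fps on a
# CPU sustaining ~40 GFLOPs fits comfortably; pushing the same model to
# 120 fps exceeds the reserved budget and argues for an accelerator.
print(cpu_only_feasible(0.6, 10, 40.0))   # modest workload
print(cpu_only_feasible(0.6, 120, 40.0))  # demanding workload
```

If the estimate fails by a small margin, quantization or a smaller model may still keep the design CPU-only; if it fails by an order of magnitude, a GPU, VPU, or TPU is the more realistic path.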
When a more powerful or application-specific GPU is needed, the solutions range from low-power modules to extreme Edge computing systems. On the lower end of this scale, but not the very low end, modules such as NVIDIA's Jetson combine an Arm CPU with NVIDIA GPUs. This provides a powerful yet low-power Edge platform with enough processing power for some level of deep-learning or training algorithms, along with a strong inference engine for pre-trained AI models.
At the other end, we see extremely powerful combinations of Intel Xeon CPUs with multiple NVIDIA GPUs for Edge deep learning and advanced detection algorithms. Though there are other options for add-on GPUs, NVIDIA currently dominates this space, which brings us to another decision point. Not only must an OEM determine the processing requirements for the Edge AI device, but they must also consider their own (or contracted) software expertise. If your team of experts is familiar with NVIDIA's CUDA-X AI development software, the cost of learning another development system must be taken into account, in addition to the cost of the hardware itself.
Intel's Movidius VPUs extend the platform for demanding computer-vision and AI applications. Intel has put considerable development into its OpenVINO toolkit (formerly the Intel Computer Vision SDK) to provide cross-platform support for Movidius. The VPU is targeted specifically at machine-vision applications, though it is expanding into more general AI workloads. A Movidius VPU can be added to a single-board computer such as WINSYSTEMS' Intel E3950-based SBC35-427 for a powerful combination focused specifically on machine-vision processing.
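One practical benefit of OpenVINO's cross-platform support is that the same application can prefer the Movidius VPU when one is fitted and fall back to the host CPU otherwise. The helper below is a minimal sketch of that selection logic; the device strings follow OpenVINO's plugin naming ("MYRIAD" for Movidius, "GPU", "CPU"), while the function itself and the preference order are illustrative assumptions, not part of the toolkit.

```python
# Hypothetical device-selection helper for an OpenVINO-based application.
# Only the device name strings come from OpenVINO's plugin naming; the
# preference order is an example policy for a vision-focused workload.

def pick_inference_device(available_devices: list) -> str:
    """Prefer the Movidius VPU for vision workloads, then an integrated
    GPU, then the CPU as the universal fallback."""
    for preferred in ("MYRIAD", "GPU", "CPU"):
        if preferred in available_devices:
            return preferred
    raise RuntimeError("no supported inference device found")

# With OpenVINO installed, this would plug in roughly as follows
# (not executed here; shown only to indicate where the helper fits):
#   from openvino.runtime import Core
#   core = Core()
#   device = pick_inference_device(core.available_devices)
#   compiled = core.compile_model(core.read_model("model.xml"), device)

print(pick_inference_device(["CPU", "MYRIAD"]))  # VPU fitted -> "MYRIAD"
print(pick_inference_device(["CPU"]))            # CPU-only board -> "CPU"
```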
That brings us to applications that combine a TPU with a CPU. Google's TensorFlow is an end-to-end open-source platform for machine learning, which has considerable traction in the market. The TPU modules are specifically designed to assist with predictions and inference for TensorFlow-trained AI models at the Edge. The broad programming and modeling adoption of the TPUs makes them a major resource for ML and AI applications with a large base of developers.
One way to handle this configuration is to connect the TPU to a CPU board through a Mini-PCIe slot, as is the case on the ITX-P-C444 industrial Pico-ITX SBC. Thanks to the board's NXP i.MX8M applications processor, it has the compute power to operate as an Edge computer with camera inputs and an embedded GPU. The TPU can then use the TensorFlow Lite framework for on-device inference and training. The board is also packed with dual Ethernet, industrial I/O, and other expansion options, including that Mini-PCIe slot. A demo of this combination was discussed in our previous blog AI at the Edge: From Demo to Reality.
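In software terms, this hardware split usually shows up as a runtime choice: load a TPU-compiled TensorFlow Lite model through the Edge TPU runtime when the accelerator is present, or fall back to plain CPU execution of the standard model otherwise. The sketch below illustrates that dispatch as a minimal assumption-laden example; the function name, model file naming convention, and fallback policy are all hypothetical, though the commented library calls reflect the commonly documented TensorFlow Lite / Coral entry points.

```python
# Hypothetical runtime dispatcher for a board with an optional Edge TPU
# on Mini-PCIe. The "_edgetpu.tflite" suffix convention and the return
# strings are illustrative assumptions for this sketch.

def choose_runtime(tpu_attached: bool, model_path: str) -> str:
    """Decide which runtime/model pairing the application should load.

    Edge TPU execution requires both the accelerator hardware and a
    model compiled for it; anything else runs on the CPU."""
    if tpu_attached and model_path.endswith("_edgetpu.tflite"):
        # On real hardware this branch would load the model via Coral's
        # Python library (not executed here):
        #   from pycoral.utils.edgetpu import make_interpreter
        #   interpreter = make_interpreter(model_path)
        return "pycoral (Edge TPU delegate)"
    # CPU fallback with the plain TensorFlow Lite runtime:
    #   import tflite_runtime.interpreter as tflite
    #   interpreter = tflite.Interpreter(model_path=model_path)
    return "tflite_runtime (CPU)"

print(choose_runtime(True, "detect_edgetpu.tflite"))
print(choose_runtime(False, "detect.tflite"))
```

The design point worth noting is graceful degradation: the same application image can ship on boards with and without the TPU module, with only throughput differing between the two.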
Artificial intelligence and machine learning aren't just buzzwords anymore. They are actively changing the way embedded computers and people interact with the world around us. These blogs only touch on a few of the terms and considerations when selecting the hardware and software for an AI/ML Edge computer. Please connect with us if you'd like to discuss a specific application in more detail.