We write a lot about Artificial Intelligence and its algorithms, approaches and techniques. As a software development company, we have a keen interest in innovations that we can immediately put into practice in our projects. However, Artificial Intelligence is not only a smart program that works miracles; in many cases it depends on a physical device that runs this program. Like a robot that needs a body, AI software often needs a hardware shell to be productive. In this post, we’ll explore what AI hardware really means.
As AI systems become more sophisticated, they demand more computing power from hardware. To meet this demand, new hardware designed specifically for AI is expected to accelerate the training and inference of neural networks while reducing power consumption. The traditional solution is to shrink logic gates to fit more transistors onto a chip, but shrinking gates below about 5 nm can cause the chip to malfunction due to quantum tunnelling, so the challenge now is to find another way.
What is AI hardware?
First of all, what actually is AI hardware, and how does it differ from the general-purpose hardware we are used to? Essentially, when we talk about AI hardware, we mean AI accelerators: a class of microprocessors, or microchips, designed to enable faster processing of AI applications, especially in machine learning, neural networks and computer vision. They are usually built as manycore designs and focus on low-precision arithmetic, novel dataflow architectures or in-memory computing.
The idea behind AI accelerators is that a large share of AI workloads is massively parallel. A general-purpose GPU (GPGPU), for example, lets a graphics card be used for massively parallel computing, where it can deliver up to 10 times the performance of a CPU.
The second pillar of AI accelerator design is the multicore implementation. Think of a GPU that accelerates such tasks using the many simple cores normally used to deliver pixels to a screen. These cores handle the simple arithmetic functions common to AI, whose sheer number overwhelms traditional computing approaches. With purpose-designed application-specific integrated circuits (ASICs), efficiency can be even greater than with a GPGPU, which benefits edge AI tasks in particular.
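As a rough illustration, here is a minimal sketch (assuming PyTorch is installed and a CUDA-capable GPU is present) that times the same large matrix multiplication on the CPU in 32-bit precision and on the GPU in 16-bit precision, the kind of massively parallel, low-precision arithmetic accelerators are built around. It is not a rigorous benchmark, just a way to see the gap for yourself.

```python
# Hypothetical timing sketch: compares a large matrix multiplication on the CPU
# (fp32) with the same operation on a GPU in fp16. Assumes PyTorch and CUDA.
import time
import torch

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

# CPU baseline in full 32-bit precision.
start = time.perf_counter()
cpu_result = a @ b
cpu_time = time.perf_counter() - start

if torch.cuda.is_available():
    # Move the data to the GPU and drop to 16-bit floats, the kind of
    # low-precision arithmetic AI accelerators favor.
    a_gpu = a.cuda().half()
    b_gpu = b.cuda().half()
    torch.cuda.synchronize()
    start = time.perf_counter()
    gpu_result = a_gpu @ b_gpu
    torch.cuda.synchronize()  # wait for the asynchronous GPU kernel to finish
    gpu_time = time.perf_counter() - start
    print(f"CPU fp32: {cpu_time:.3f}s  GPU fp16: {gpu_time:.3f}s")
else:
    print(f"CPU fp32: {cpu_time:.3f}s  (no GPU available)")
```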
Generally speaking, a purpose-made accelerator delivers greater performance, more features and greater power efficiency to facilitate a given task.
What are the most important new capabilities of AI hardware?
As the computational resources needed to run the newest software soar, the industry is waiting for a new generation of AI chips with new capabilities:
More computational power and cost efficiency: next-generation AI hardware will need to be both more powerful and more cost-efficient to meet the needs of increasingly sophisticated training models;
Cloud and Edge computing: new silicon architectures have to support deep learning, neural networks and computer vision algorithms, training models in the Cloud and delivering ubiquitous AI at the Edge;
Faster insights: to be useful for businesses, AI solutions, both software and hardware, should deliver much faster insights into customer behavior and preferences, which can improve sales and customer satisfaction, strengthen manufacturing processes and uptime, and reduce costs;
New materials: new research aims to move beyond traditional silicon to optical computing chips, developing optical systems that are much faster than traditional CPUs or GPUs;
New architectures: there are also new types of architectures, such as neuromorphic chips, which try to mimic brain cells. This architecture of interconnected “neurons” replaces the von Neumann back-and-forth bottleneck with low-powered signals that travel directly between neurons for more efficient computation. For training neural networks at the edge or in the cloud, such architectures could have a huge advantage.
How to choose your AI hardware provider
Before purchasing AI hardware, businesses should, of course, understand how different types of hardware suit different needs. Nowadays, with the shift towards purpose-made chips, you don’t want to spend a ton of money on specialized hardware you don’t need.
The first step to choosing AI hardware is to map out how improvements in interaction with customers or suppliers could affect business processes. Afterwards, it is possible to look for software solutions that can support these changes, and for the corresponding hardware.
Whether to opt for general-purpose chips like GPUs, more specialized solutions like TPUs or VPUs, or more innovative designs offered by promising startups depends on the AI tasks a business needs to run.
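A practical first step is simply to inventory what accelerators a workstation or cloud instance already exposes. The minimal sketch below assumes PyTorch is installed; TPUs and VPUs are served by separate vendor stacks (for example Cloud TPU runtimes or OpenVINO) and will not show up here.

```python
# Hypothetical inventory sketch: list the CPU threads and CUDA GPUs visible
# to PyTorch before deciding whether more specialized hardware is needed.
import torch

print("CPU threads visible to PyTorch:", torch.get_num_threads())

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
else:
    print("No CUDA-capable GPU detected")
```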
Top 5 AI hardware solutions
The most popular current hardware solutions for AI acceleration include the following:
The Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google specifically for neural network machine learning; it is also offered as a cloud service.
The Nervana Neural Network Processor-I 1000 is a discrete accelerator produced by Intel, designed specifically for the growing complexity and scale of inference applications.
EyeQ is a family of system-on-chip (SoC) devices designed by Mobileye to support complex and computationally intense vision processing while maintaining low power consumption, even when mounted on the windshield.
Epiphany V is a 1,024-core processor chip by Adapteva aimed at real-time image processing, autonomous driving, and machine learning.
Myriad 2 is a vision processing unit (VPU) system-on-a-chip (SoC) by Movidius that comprises a set of programmable processors and a set of dedicated, configurable image and vision accelerators to power computational cameras.
More Promising Startups
Besides the established semiconductor companies, there are a number of well-funded startups working on purpose-specific chips or developing new architectures to build a supercomputer:
Graphcore is a semiconductor company that develops Intelligence Processing Units (IPUs), which hold the complete machine learning model inside the processor. The IPU is designed for complex, high-dimensional models, as it emphasizes massively parallel, low-precision floating-point compute and provides high compute density.
Wave Computing is developing the Wave Dataflow Processing Unit (DPU), which employs a disruptive, massively parallel dataflow architecture. At introduction, Wave’s DPU-based solution was claimed to be the world’s fastest and most energy-efficient family of deep learning computers.
Luminous Computing is developing a supercomputer for AI on a single chip that will replace 3000 TPU boards. The company’s idea is to use photonics to solve all of the major bottlenecks traditional processors have to overcome.
Mythic develops its own take on IPUs, going beyond conventional digital architectures, memory, and calculation elements and rethinking everything: transistors and physics, circuits and systems, and software and AI algorithms.
Prophesee stands a bit apart from chip producers, as it focuses on innovative computer vision sensors and systems for applications in all fields of artificial vision. Its sensor technology is inspired by biological eyes, acquiring and processing visual information in a high-performance, efficient way.
Of the three key parts of hardware infrastructure (computing, storage, and networking), it is computing that has been the focus of the rapidly evolving AI hardware market and that has made significant progress in the last couple of years. The industry is going with the fastest available option and promoting it as the solution for deep learning. Comparable progress in the other two areas, storage and networking, is yet to come.