In recent years, there has been rapid acceleration in quantum computing development and industry (the field is expected to provide £4 billion of economic opportunity by 2024), and the promise that quantum computers could perform exponentially faster than classical CPUs on certain problems could be realised in the coming decades. In view of this, it is both easy and tempting to imagine that “classical” computing, the computation of ‘bits’ of information having a value of exclusively 0 or 1, has had its heyday and will enter a decline in the coming years and decades. It would give way to the exciting blended approach of quantum computing, where a ‘bit’ is now a ‘qubit’ and can be 0 or 1 or, most often, part-0 and part-1 at the same time.
However, classical computing can’t be discounted yet. The recent development of tensor core units has drastically improved the performance of computers when implementing AI algorithms – in particular Deep Learning algorithms (a subset of Machine Learning). In classical computing, the performance of the computer is dictated by the “clock speed” of the processing unit, i.e. how fast the processor can generate pulses. Modern-day computers have GHz clock speeds, meaning that they can execute a billion or more individual calculations per second. This is incredibly fast for standard calculations involving number (or scalar) based arithmetic. However, in deep learning models one of the dominant types of calculation involves matrix algebra, and often matrix multiplication.
Multiplying two 4x4 matrices together requires 64 individual multiplications (plus 48 additions) to determine the answer. On a conventional processor that performs one such operation per clock cycle, this would therefore require at least 64 clock cycles. In contrast, tensor core units have been specifically designed to compute the result of a 4x4 matrix calculation in a single clock cycle. This means that AI processes can now be carried out far faster than previously possible.
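As a rough check on the operation count for a 4x4 matrix product, the following minimal Python sketch multiplies two matrices with the schoolbook method and tallies the scalar multiplications and additions along the way. The function name `matmul_count` and the sample matrices are illustrative assumptions, not anything taken from the paper.

```python
def matmul_count(a, b):
    """Schoolbook matrix product that also counts scalar operations.

    Hypothetical helper for illustration only.
    Returns (product, multiplications, additions).
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [[0] * cols for _ in range(rows)]
    mults = 0
    adds = 0
    for i in range(rows):
        for j in range(cols):
            # First term of the dot product: one multiplication, no addition.
            acc = a[i][0] * b[0][j]
            mults += 1
            # Each remaining term costs one multiplication and one addition.
            for k in range(1, inner):
                acc += a[i][k] * b[k][j]
                mults += 1
                adds += 1
            result[i][j] = acc
    return result, mults, adds


# Sample 4x4 inputs: an arbitrary matrix and the identity matrix.
sample = [[i * 4 + j for j in range(4)] for i in range(4)]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

product, mults, adds = matmul_count(sample, identity)
print(mults, adds)  # 64 multiplications, 48 additions
```

Each of the 16 output entries is a four-term dot product, which is where the 16 x 4 = 64 multiplications and 16 x 3 = 48 additions come from. A tensor core performs this whole block of work as one hardware operation rather than as a sequence of scalar steps.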
Now, researchers from George Washington University have proposed a method to combine the already rapid tensor core unit with the Universe’s fastest information carrier: light itself. In conventional computers, the performance of the processor will ultimately be limited by the speed at which information can be pushed through the circuitry. In their new paper, "Photonic tensor cores for machine learning" published in Applied Physics Reviews, Mario Miscuglio and Volker J Sorger have outlined a device that could replace the conventional circuitry with photonic components to create a photonic tensor core.
By exploiting the wave nature of light, parallel computing can be realised simply by splitting a broadband spectrum into parallel single-wavelength channels, and amplitude modulating each channel separately with a “dot-product engine”. The modulated channels can then be re-combined to obtain a single answer from a parallelised operation.
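The split, modulate and recombine idea described above can be illustrated with a toy numerical model. This is only a sketch under simplifying assumptions, not the authors' device: each wavelength channel is represented by a single number (its amplitude), each weight is treated as a simple multiplicative factor, and the function names `photonic_dot_product` and `photonic_matvec` are hypothetical.

```python
def photonic_dot_product(inputs, weights):
    """Toy model of one wavelength-multiplexed dot product.

    Assumption: each input value rides on its own wavelength channel,
    and each weight scales the amplitude of exactly one channel.
    """
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per wavelength channel")
    # Split: one single-wavelength channel per input value.
    channels = list(inputs)
    # Modulate: every channel is scaled independently of the others,
    # so in hardware all of these products could happen at the same time.
    modulated = [amplitude * weight
                 for amplitude, weight in zip(channels, weights)]
    # Recombine: summing the channels yields a single answer.
    return sum(modulated)


def photonic_matvec(matrix, vector):
    """Matrix-vector product built from one dot-product engine per row."""
    return [photonic_dot_product(vector, row) for row in matrix]


signal = [1, 2, 3, 4]
weights = [0.5, 0.5, 0.5, 0.5]
print(photonic_dot_product(signal, weights))  # 5.0
```

In software the list comprehension still runs one element at a time; the point of the optical version is that the per-channel modulation does not have to be sequenced, because the channels do not interact until they are recombined.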
Advanced simulations indicate that photonic tensor cores bypass the restriction of processor clock speed entirely (because the cycles are now determined by the oscillation of light) and could achieve a throughput for deep-learning and other AI algorithms of almost 1 trillion operations per second, nearly 100 times faster than conventional electronic tensor core units. Additionally, the power consumption of a photonic tensor core unit is a fraction of that of a conventional tensor core unit (8 mW vs. 200 mW), meaning that photonic tensor cores could soon be offering efficient, light-speed artificial intelligence processing.
The promise of light-speed processors is an exciting one for all those engaged in computing, especially anyone interested in the results that can be obtained by artificial intelligence and deep learning.
This blog was originally written by Alexander Savin.