ChatGPT, widely acclaimed as a revolutionary generative AI model, can – true to its description – generate intelligent-sounding answers to human questions. While it has flaws around accuracy, bias and a tendency to state false facts with overconfidence, it also offers rich possibilities as a creative tool and a springboard for ideas. The potential is undeniably there.
Businesses have wasted no time in adopting ChatGPT in an attempt to streamline everyday tasks. And, with OpenAI more recently announcing GPT-4, interest in generative AI will only continue to grow. Its human-like conversational text already supports a wide range of applications, from research to code writing to optimising the customer onboarding process. The prospects already seem boundless, as the chatbot's natural-sounding text output offers quick, easy and transformative approaches to existing tasks. However, this sudden reliance on its capabilities leaves us wondering how it will continue to evolve to support surging computational demands.
Traditional hardware can’t keep up
Even with all the hype surrounding generative AI, the barriers limiting progress are already evident. Aside from limitations in the quality of its responses, ChatGPT is further constrained by its training data. When first released, ChatGPT had been trained on an enormous dataset collected before September 2021, so the chatbot was left operating on outdated information, unable to correctly interpret anything more recent. Of course, it can be retrained frequently with updated datasets, but that comes with drastically higher operating costs.
Beyond this, there are considerations regarding the number of graphics processing units (GPUs) required to run ChatGPT. Originally designed for graphics rendering, GPUs have become the major workhorse for AI in the past ten years because they can process many different data points simultaneously. However, even with today's most advanced GPUs, our analysis shows that it would still take more than 50,000 of them to train ChatGPT if it were updated on a daily basis, and better models need drastically more compute power. What if the next model needs ten times as many GPUs, or a hundred, or a thousand times as many? Clearly there is an issue of scalability.
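A rough back-of-envelope calculation shows how a figure of this order arises. The numbers below are illustrative assumptions (a published GPT-3-scale training compute estimate, NVIDIA A100 peak FP16 throughput and a typical utilisation fraction), not measurements from the analysis cited above:

```python
# Back-of-envelope sketch: how many GPUs to retrain a GPT-3-scale model in one day?
# All constants are illustrative assumptions, not figures from the article's analysis.

TRAINING_FLOPS = 3.14e23   # ~GPT-3 total training compute (published estimate)
GPU_PEAK_FLOPS = 312e12    # NVIDIA A100 peak FP16 tensor throughput
UTILISATION = 0.3          # realistic fraction of peak sustained in practice
SECONDS_PER_DAY = 86_400

# Total GPU-seconds of work, then spread over a 24-hour window
gpu_seconds = TRAINING_FLOPS / (GPU_PEAK_FLOPS * UTILISATION)
gpus_needed = gpu_seconds / SECONDS_PER_DAY
print(f"~{gpus_needed:,.0f} GPUs to retrain in 24 hours")
```

With these assumptions the estimate lands in the tens of thousands of GPUs, the same order of magnitude as the figure quoted above; small changes to the utilisation or hardware assumptions move it up or down considerably.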
Models like ChatGPT use hundreds of billions of 'parameters', so they will eventually hit a cap on computing power as the hardware won't be able to sustain the size of model required. Processing power has become a major hurdle in the advancement of AI models, and it is becoming progressively more difficult for hardware to meet the growing computational demands of breakthrough AI models. There is a dire need for a solution that can overcome the scaling limits set by traditional computing. This is where optics come in.
Unlocking the full potential of next-generation AI with optical computing
The bottom line is that existing transistor-based computing hardware is struggling to keep up with the growing demand for faster computing. Optical computing technology can unlock a whole new realm of compute power, speed and memory access. It is scalable, sustainable and compatible with existing AI algorithms, offering a viable avenue for driving developments in models like ChatGPT at the pace we hope to achieve. But how?
Optical AI accelerators can already be built using existing technologies, and they will outperform traditional digital processors for large generative AI models. Optics perform parallel computing, carrying out far more computation per unit time than digital processors, which compute sequentially. Because it uses light, optical computing does not suffer from the same limitations as digital methods. Instead, optics offer signal propagation at the speed of light and parallel processing that delivers unprecedented scale.
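The contrast can be sketched with the workhorse operation of neural networks, the matrix-vector product. The toy model below is an idealisation of my own, not a description of any specific optical product: a digital processor spends one step per multiply-accumulate, whereas an optical accelerator encodes the inputs in light and the whole product emerges in a single propagation pass:

```python
# Toy comparison: sequential digital MACs vs one idealised optical "pass".

def digital_matvec(M, x):
    """Sequential multiply-accumulate: one operation per clock cycle."""
    steps = 0
    y = [0.0] * len(M)
    for i, row in enumerate(M):
        for j, m in enumerate(row):
            y[i] += m * x[j]
            steps += 1          # each MAC counted as a separate step
    return y, steps

def optical_matvec(M, x):
    """Idealised optical model: all MACs happen in one light propagation."""
    y = [sum(m * xj for m, xj in zip(row, x)) for row in M]
    return y, 1                 # a single pass, regardless of matrix size

M = [[1, 2], [3, 4]]
x = [5, 6]
y_digital, digital_steps = digital_matvec(M, x)
y_optical, optical_steps = optical_matvec(M, x)
```

For an n-by-n matrix the digital step count grows as n squared while the optical pass count stays at one; that gap, in this idealised picture, is the source of the scaling advantage claimed above.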
Using advanced optical technologies, computing can be 100 to 1,000 times faster, with the added benefit of being 100 times more energy efficient. Optics also support significantly higher clock speeds – up to 100GHz, compared with the roughly 1GHz of digital electronics. And there is still plenty of headroom for optics to push further beyond their electronic counterparts.
Evidently, to address this demand for scaling AI, the industry is in need of a paradigm shift towards hardware better suited to the task. Optical computing is an innovative, game-changing technology that can be applied to accelerate these power-hungry AI models. It offers a viable solution to the shortcomings of traditional computing hardware and will enable the industry to enter a new era of AI.
Tim Weil, CEO and Co-Founder, Lumai