In November 2022, OpenAI released ChatGPT, a large language model (LLM) chatbot that astonished AI researchers and introduced the power of AI to the public. GPT stands for Generative Pretrained Transformer. ChatGPT is built on a model from OpenAI's GPT-3.5 series, a pretrained AI model (as the name GPT suggests) that responds to human instructions and can perform tasks ranging from writing essays and poems to explaining various topics, given only a few prompts from the human operator. It is one of several products from OpenAI; another notable one is DALL-E, a separate AI system that creates realistic images and art from a natural-language description supplied by the operator. ChatGPT belongs to one of the many fields of Artificial Intelligence (AI), called Generative AI. AI technology has improved steadily since its inception, and the quality of Generative AI output has now reached the point where it is meaningfully useful for creators and businesses. More people are using Generative AI to write emails and promotional blog posts. Given the current frenzy around Generative AI, it helps to put a few aspects of this key trend in the broader context of Enterprise IT and to describe what we at Service Ventures believe is likely to happen to future human productivity, Enterprise IT, and AI research.
ChatGPT is arguably the first AI technology that much of the public has interacted with directly. Sure, there were Siri and Alexa, and deep learning was already ubiquitous in many commercial applications. But AI in such scenarios mostly worked in the background. With ChatGPT the public has a much more direct experience: a user gives inputs to an LLM and sees its outputs immediately. This directness makes the AI hype feel much more real to the public than earlier developments in AI did. We believe that the best way to educate the masses about a new technology is to let the public openly experiment with the technology and its applications, experience its failures, and iteratively debate and refine its perceptions. The accessibility of such foundation models, especially the free-use precedent set by ChatGPT, will give the public hands-on experience with AI and, in turn, a better-informed understanding of it. This key development could propel AI into the mainstream faster.
For the Enterprise IT ecosystem, ChatGPT-type LLMs are likely to become indispensable. On the software engineering front, AI-based code completion and ChatGPT-style question answering will become essential for learning to code and for understanding an existing codebase. Most software engineers do not need to learn low-level machine code, because powerful compilers turn human-readable code, such as C++, into code that machines read. AI could act as the new “compiler” that translates high-level human instructions into lower-level code. Software engineers may write high-level requirements, and AI coders could write the middle-level code that engineers write today. Software vendors could also add an intermediary natural-language user interface to their offerings by simply letting an AI coder translate language instructions into code that calls the APIs of that product. We expect an explosion of natural-language software UIs across all software platforms, along with a plethora of small-scale LLM deployments that add text classification, summarization, and prediction capabilities to many existing apps that deal with text inputs.
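To make the natural-language UI idea concrete, here is a minimal sketch of how such a layer might sit in front of a product's API. Everything in it is an assumption made for illustration: the invoicing functions, the `API_REGISTRY` table, and the `stub_translate` function, which stands in for the AI coder with a few keyword rules so that the example runs without any model or network access.

```python
import json
from typing import Callable, Dict


# Hypothetical product API: two functions a vendor might already expose.
def create_invoice(customer: str, amount: float) -> str:
    return f"invoice for {customer}: {amount:.2f}"


def list_invoices(customer: str) -> str:
    return f"invoices of {customer}"


# Only calls listed here may be triggered from natural language.
API_REGISTRY: Dict[str, Callable[..., str]] = {
    "create_invoice": create_invoice,
    "list_invoices": list_invoices,
}


def stub_translate(instruction: str) -> str:
    """Stand-in for the AI coder: maps an instruction to a JSON-encoded call.

    A real system would send the instruction and the API schema to an LLM;
    these keyword rules exist only so the sketch runs offline.
    """
    words = instruction.lower().split()
    if len(words) == 3 and words[0] == "bill":
        call = {"name": "create_invoice",
                "args": {"customer": words[1], "amount": float(words[2])}}
    elif len(words) >= 2 and "invoices" in words:
        call = {"name": "list_invoices", "args": {"customer": words[-1]}}
    else:
        call = {"name": "unknown", "args": {}}
    return json.dumps(call)


def handle(instruction: str) -> str:
    """Translate an instruction, validate the proposed call, then run it."""
    call = json.loads(stub_translate(instruction))
    func = API_REGISTRY.get(call["name"])
    if func is None:
        raise ValueError(f"unsupported request: {instruction!r}")
    return func(**call["args"])


if __name__ == "__main__":
    print(handle("bill acme 120"))
    print(handle("show invoices for acme"))
```

The design choice worth noting is the registry check in `handle`: whatever the translator proposes, only functions the vendor has explicitly listed can be invoked, which is one plausible way to keep a language interface from calling arbitrary code.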
The fast improvement cycle of LLMs, coupled with publicly accessible APIs, will likely mean that LLMs become commoditized. As AI models get more capable, they will take over more tasks that were previously done by traditional software, and more software will be optimized simply by optimizing its AI stack. The AI stack usually runs on GPUs and co-processors, and improvements in their performance may in turn spur improvements in the performance of traditional CPUs. At a macro scale, there will be large companies that can afford to train and run their own foundation models and smaller ones that need to pay a foundation model tax to LLM providers. This is not so different from what we have today, where companies either host their own servers or pay AWS, Azure, or GCP. Such an AI Cloud will be a key battleground for future platforms.
As for AI research, ChatGPT shows that it will be quite difficult for academia to develop scale-enabled AI capabilities, because training a base GPT-3-type model will be out of reach for smaller labs. The data collection and model fine-tuning pipeline that led to ChatGPT is too engineering-heavy for academic labs. GPT-3 and other OpenAI developments happened after Microsoft threw the full weight of Azure infrastructure behind OpenAI, with dedicated server farms. By making ChatGPT freely available to the public, OpenAI collects a lot of valuable usage data with which to fine-tune its LLM. That is not something smaller organizations can afford. Open-source efforts and large-scale collaborative academic partnerships may enable academia to move forward, but we think open-source AI models will lag behind those of companies with large budgets in terms of quality and novelty. There could be national-level computing clouds dedicated to academic AI research, with large budgets in line with those of the James Webb Space Telescope and the Large Hadron Collider. Given the strategic importance of AI, we may see such a development.
One of the most important lessons of ChatGPT’s approach is that human feedback is very important for improving the performance of large language models (LLMs). LLMs are first trained on a large data set to predict the next word given a sequence of previous words. The output of the LLM is then fine-tuned using Reinforcement Learning (RL) guided by human preference data, an approach known as reinforcement learning from human feedback (RLHF). The success of ChatGPT, which hinges upon this combination, marks a comeback for RL and is bringing new attention to it as a practical method for improving LLMs.
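The two stages described above can be illustrated with a toy sketch. It is not how ChatGPT is implemented: real LLMs use neural networks rather than word counts, and the helper names here (`train_bigram`, `predict_next`, `preference_loss`) are hypothetical. The first two functions show next-word prediction in its simplest form, a bigram count table. The third shows a pairwise preference loss of the kind commonly used to train a reward model from human comparisons: the loss is small when the response people preferred receives the higher score.

```python
import math
from collections import Counter, defaultdict


def train_bigram(text: str):
    """Count which word follows which (a toy stand-in for pretraining)."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word: str):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    Written as log(1 + exp(-diff)), which is the same quantity.
    """
    diff = reward_chosen - reward_rejected
    return math.log1p(math.exp(-diff))


if __name__ == "__main__":
    counts = train_bigram("the cat sat on the mat the cat ran")
    print(predict_next(counts, "the"))
    # Scoring the human-preferred answer higher gives the lower loss.
    print(preference_loss(2.0, 0.0), preference_loss(0.0, 2.0))
```

In an RLHF-style pipeline, a reward model trained with a loss like this would then score the LLM's outputs, and an RL algorithm would adjust the LLM toward higher-scoring responses; that final step is omitted here for brevity.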
Looking to the future, if there is anything we have learned from the last 10 years of deep learning, it is that accurate predictions about AI, both its development and its deployment, are hard to make. But we can say with good confidence that ChatGPT is merely a small preview of what is to come. For the future of foundation models, we are seeing promising progress in two directions: a ChatGPT-level model that is truly multimodal (e.g. text, audio, image, video), and models that are designed to take actions in an environment. While custom data will always be needed for domain-specific fine-tuning, pretraining models on large-scale free data undoubtedly led to the success of GPT. It will be interesting to see how the AI ecosystem pivots beyond merely using existing digital data to improve foundation model performance. What’s the next frontier for large-scale self-supervised learning? Where do the next 10 or 100 trillion data points come from? We are excited to find out.
/Service Ventures Team