What are LLMs

by **Bad Wolf** Mon 17 Apr 2023, 22:29

What Are Large Language Models (LLMs) and How Do They Work?
By Bob Sharp

Generative AI is all the rage, but how does a large language model work?

Large language models (LLMs) are the underlying technology that has powered the meteoric rise of generative AI chatbots. Tools like ChatGPT, Google Bard, and Bing Chat all rely on LLMs to generate human-like responses to your prompts and questions.
But just what are LLMs, and how do they work? Here we set out to demystify LLMs.

What Is a Large Language Model?
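In its simplest terms, an LLM is a neural network trained on a massive body of text so that it can generate human-like responses to your prompts. It is not a database that looks answers up; it learns statistical patterns from the text and uses them to predict which words should come next. The training text comes from a range of sources and can amount to many billions of words.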

Common sources of training text include:

Literature: LLMs are often trained on enormous amounts of contemporary and classical literature. This can include books, poetry, and plays.
Online content: Training data most often includes a large body of online content, including blogs, web pages, forum questions and responses, and other online text.
News and current affairs: Some, but not all, LLMs can access current news topics. Others, like GPT-3.5, only know what was in their training data up to a cutoff date.
Social Media: Social media represents a huge resource of natural language. Training data can include text from major platforms like Facebook, Twitter, and Instagram.
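
To make the idea of "learning from text" concrete, here is a minimal sketch in Python. It assumes a toy bigram model (it only counts which word follows which) rather than the neural network a real LLM uses, and the corpus string and the function names train_bigram and generate are made up for illustration. The principle is the same, though: the model generates text one word at a time, based on patterns in what it was trained on.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, length, seed=0):
    """Pick one word at a time, weighted by how often it followed the previous word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = follows.get(out[-1])
        if not options:
            break  # the last word never had a follower in the training text
        candidates, counts = zip(*options.items())
        out.append(rng.choices(candidates, weights=counts)[0])
    return " ".join(out)

# Hypothetical, tiny "training data" (a real LLM sees billions of words).
corpus = "the cat sat on the mat . the dog sat on the rug ."
model = train_bigram(corpus)
print(generate(model, "the", 8))
```

Every word pair in the output already appeared somewhere in the corpus, which is why the quality of the training text matters so much.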

What Are the Limitations of LLMs?
LLMs represent an impressive technological achievement. But the technology is far from perfect, and there are still plenty of limitations as to what they can achieve. Some of the more notable of these are listed below:

Contextual understanding: LLMs try to take the context of a prompt into account in their answers. However, they don't always get it right and can misread the context, leading to inappropriate or just plain wrong answers.
Bias: Any biases present in the training data can surface in responses. This includes biases relating to gender, race, geography, and culture.
Common sense: Common sense is difficult to quantify, but humans learn this from an early age simply by watching the world around them. LLMs do not have this inherent experience to fall back on. They only understand what has been supplied to them through their training data, and this does not give them a true comprehension of the world they exist in.
An LLM is only as good as its training data: Accuracy can never be guaranteed. The old computer adage of "Garbage In, Garbage Out" sums this limitation up perfectly. LLMs are only as good as the quality and quantity of their training data allow them to be.

Examples of Popular LLMs
The continuing advance of AI is now largely underpinned by LLMs. So while they aren't exactly a new technology, they have certainly reached a point of critical momentum, and there are now many models.

Here are some of the most widely used LLMs.

GPT
Generative Pre-trained Transformer (GPT) is perhaps the most widely known LLM. GPT-3.5 powers the ChatGPT platform used for the examples in this article, while the newest version, GPT-4, is available through a ChatGPT Plus subscription. Microsoft also uses the latest version in its Bing Chat platform.

LaMDA
This is the LLM that Google Bard, Google's AI chatbot, was first built on. The version Bard was initially rolled out with was described as a "lite" version of the LLM. Google has since moved Bard to its more powerful PaLM model.

BERT
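BERT stands for Bidirectional Encoder Representations from Transformers. The bidirectional characteristics of the model differentiate BERT from other LLMs like GPT: BERT looks at the text on both sides of a word at once, whereas GPT-style models only look at the text that comes before it.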
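
The difference can be pictured as a grid showing which words each word is allowed to "see". The Python sketch below is only an illustration of that visibility pattern, not the actual code of either model; the function names causal_mask, bidirectional_mask, and show are made up for this example.

```python
def causal_mask(n):
    """GPT-style: position i may only look at positions j <= i (itself and earlier words)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    """BERT-style: every position may look at every other position."""
    return [[True] * n for _ in range(n)]

def show(mask):
    """Render a mask as text: 'x' means visible, '.' means hidden."""
    return "\n".join(" ".join("x" if ok else "." for ok in row) for row in mask)

print("GPT-style (left to right):")
print(show(causal_mask(4)))
print("BERT-style (both directions):")
print(show(bidirectional_mask(4)))
```

Each row is one word in a four-word sentence. In the GPT-style grid the first word sees only itself, while in the BERT-style grid every word sees the whole sentence.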

The Future of LLMs
Ethical arguments may yet have a say in how we integrate these tools into society. However, putting this to one side, some of the expected LLM developments include:

Improved Efficiency: With the largest LLMs featuring billions, and in some cases hundreds of billions, of parameters, they are incredibly resource hungry. With improvements in hardware and algorithms, they are likely to become more energy-efficient. This should also shorten response times.
Improved Contextual Awareness: LLMs can be refined with feedback; the more usage data and human feedback their developers gather, the better later versions can become. This does not happen automatically while you chat, though: the feedback has to be folded into further rounds of training. As the technology progresses, this should bring improvements in language capabilities and contextual awareness.
Trained for Specific Tasks: The jack-of-all-trades tools that are the public face of LLMs are prone to errors. But as they develop and are fine-tuned for specific needs, LLMs can play a large role in fields like medicine, law, finance, and education.
Greater Integration: LLMs could become personal digital assistants. Think of Siri on steroids, and you get the idea. LLMs could become virtual assistants that help you with everything from suggesting meals to dealing with your correspondence.
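
To put the efficiency point in numbers, here is a rough back-of-the-envelope sketch in Python. It assumes each parameter is stored as a 16-bit or 32-bit number and uses 175 billion parameters (the published size of GPT-3) as an illustrative figure; the function name weight_memory_gb is made up for this example, and the estimate covers only storing the weights, not the extra memory needed to actually run the model.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Approximate memory needed just to hold the model weights, in gigabytes."""
    return num_parameters * bytes_per_parameter / 1e9

gpt3_parameters = 175e9  # illustrative figure: GPT-3's published parameter count

print(weight_memory_gb(gpt3_parameters))      # 16-bit weights (2 bytes each)
print(weight_memory_gb(gpt3_parameters, 4))   # 32-bit weights (4 bytes each)
```

Even at 16 bits per parameter the weights alone come to hundreds of gigabytes, far more than a typical consumer graphics card can hold, which is why smaller and more efficient models are such an active area of work.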

LLMs Transforming and Educating
LLMs are opening up an exciting world of possibilities. The rapid rise of chatbots such as ChatGPT, Bing Chat, and Google Bard is evidence of the resources being poured into the field.

With this level of investment, these tools are likely to become more powerful, versatile, and accurate. The potential applications are vast, and at the moment, we are only scratching the surface of an incredible new resource.

Read the full original article at: https://www.makeuseof.com/what-are-large-langauge-models-how-do-they-work/