Large Language Models (LLMs)
All of us have used some kind of generative AI capability at this point, whether it's ChatGPT to get answers or generate a code sample, or an image generation or voice generation tool. All of this new AI-based content generation falls under Generative AI.
Among generative AI models, the ones that generate new text do so using a Large Language Model (LLM). This text generation can be anything from summarizing a document to answering a question in natural language.
ChatGPT and Bard are two well-known examples of applications built using LLMs.
Large Language Models are a kind of AI model that uses deep learning techniques to predict or generate words when an input piece of text is provided.
Large Language Models are large, general-purpose language models that are pre-trained and can be fine-tuned for specific purposes.
Sample general uses of LLMs (a code sketch follows the list):
•Summarization
•Answering questions
•Text classification
•Translation
•Text generation
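As a rough illustration, all of these tasks can be handled by one and the same model just by changing the prompt. Here is a minimal sketch, assuming the Hugging Face transformers library; gpt2 is only a stand-in model that downloads quickly, and since it is not instruction-tuned, a chat-tuned model would follow these prompts far more faithfully:

```python
# A minimal sketch, assuming the Hugging Face `transformers` library.
# gpt2 is a stand-in; an instruction-tuned LLM follows such prompts better.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# One general-purpose model, different tasks, selected purely by the prompt:
prompts = [
    "Summarize: The meeting covered Q3 results and next year's hiring plan.",
    "Answer the question: What is the capital of France?",
    "Classify this review as positive or negative: 'Great phone, terrible battery.'",
    "Translate to French: Good morning",
]
for p in prompts:
    result = generator(p, max_new_tokens=30)
    print(result[0]["generated_text"])
```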
They can be fine-tuned for specific purposes in domains such as medicine, law, or any other field.
LLMs are trained with a very large number of parameters (in the billions) on very large data sets. They are normally general purpose and serve many general use cases well. Because they use huge data sets and billions or even trillions of parameters, their pre-training can take months, cost millions of dollars, and demand enormous compute resources. For these reasons, only a few organizations are capable of creating Large Language Models like GPT.
How LLMs Work
1. Prompt is what we give as input to an LLM. It's a piece of text that may come directly from a user, from prompt engineering (explained in the sections below), or from another application.
2. Tokenizer in an LLM application tokenizes the input prompt so it is represented as numbers. There are different tokenizer libraries in use. The tokenized prompt is what an LLM can actually understand.
3. Context can be considered the working memory of an LLM app. It's where the prompt is loaded; once the tokenized prompt is in the context, it is sent to the LLM.
4. LLM does one thing: it predicts the next word for the prompt it is sent, and returns the extended text. For example, if my prompt is "Once upon a time", the LLM predicts the next word based on the knowledge it was trained on and might return "Once upon a time there" to the context. The context then sends that extended text back to the LLM so it predicts the next word again. This loop continues until a stopping condition defined by the LLM app is met, so it might go on until it completes the sentence as "Once upon a time, there lived a king".
5. Detokenizer does the opposite of the tokenizer. When the LLM finishes the completion, the output is still in tokenized (number) form; the detokenizer converts it back into text and sends it as the completion response. (A code sketch of this whole loop follows below.)
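Putting the five steps together, here is a minimal sketch of the loop, assuming PyTorch and the Hugging Face transformers library, with GPT-2 standing in for a large model; the greedy next-token choice and fixed length are simplifications a real LLM app would refine:

```python
# A minimal sketch of prompt -> tokenize -> predict-next-token loop -> detokenize.
# GPT-2 stands in for a large model; any causal language model works the same way.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Steps 1-2: the prompt is tokenized into the numbers the model understands.
prompt = "Once upon a time"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Steps 3-4: the context (input_ids) grows by one predicted token per pass.
for _ in range(10):  # the stop condition here is simply a fixed length
    with torch.no_grad():
        logits = model(input_ids).logits
    next_token = logits[0, -1].argmax()  # greedy: pick the most likely next token
    input_ids = torch.cat([input_ids, next_token.view(1, 1)], dim=1)

# Step 5: the detokenizer turns the token ids back into readable text.
print(tokenizer.decode(input_ids[0]))
```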
So basically, an LLM is a model that generates or completes the given prompt based mainly on its pre-trained knowledge. Applications built on top of LLMs can apply their own logic to control how that completion happens, as sketched below.
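For instance, an app can set stop conditions and sampling behavior. A minimal sketch, again with GPT-2 as a stand-in and purely illustrative parameter values:

```python
# A minimal sketch of the knobs an app can use to control completion.
# GPT-2 is a stand-in; the parameter values are illustrative, not recommendations.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

output_ids = model.generate(
    tokenizer.encode("Once upon a time", return_tensors="pt"),
    max_new_tokens=20,                    # stop condition: length cap
    do_sample=True,                       # sample instead of always taking the top token
    temperature=0.8,                      # sharpen or flatten the next-token distribution
    eos_token_id=tokenizer.eos_token_id,  # stop condition: end-of-text token
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; avoids a warning
)
print(tokenizer.decode(output_ids[0]))
```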
There are also ways to improve or give more knowledge to an LLM, either in the form of the prompt itself or by fine-tuning.
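A minimal sketch of the prompt-based approach; the policy text and question below are made up purely for illustration:

```python
# A minimal sketch of giving an LLM extra knowledge through the prompt itself.
# The context and question are fabricated examples for illustration only.
context = (
    "Company policy: employees may work remotely up to 3 days per week, "
    "subject to manager approval."
)
question = "How many remote work days are allowed per week?"

# The extra knowledge is simply placed in the prompt, so the model can answer
# from the supplied text instead of relying only on its pre-trained knowledge.
prompt = (
    "Use the following context to answer the question.\n\n"
    f"Context: {context}\n\n"
    f"Question: {question}\n"
    "Answer:"
)
print(prompt)  # this string would be sent to the LLM, as in the earlier sketches
```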