Hello, and welcome to the start of an exciting adventure! 🌟
As of today, we’re officially embarking on a journey into the fascinating world of Prompt Engineering. Now, you might be wondering, "Why prompt engineering?" Well, in this ever-evolving world of artificial intelligence (AI), the ability to communicate with AI models effectively is more important than ever. Whether it’s generating creative content, answering complex questions, or automating tasks, the prompts we use to guide AI play a huge role in the results we get.

So, over the next five weeks, we’ll dive deep into the art and science of crafting effective prompts together. We’ll explore everything from the basics to more advanced techniques, all while building practical skills and tools that will empower us to make the most of AI. And don’t worry, we’ll be learning step by step, experimenting, and reflecting on what works best.
What’s Coming Up in Our 5-Week Learning Plan?
Here’s a sneak peek at what we’ll be covering each week of our learning journey:
Week 1: The Basics of Language Models and Prompting
We’ll kick things off by laying a solid foundation. This week, we’ll explore how language models work, what makes them tick, and the magic behind transformers and tokenisation. We’ll also dive into different types of prompts, including zero-shot, few-shot, and chain-of-thought, to understand how each can shape the AI's responses. By the end of this week, we’ll have a good grasp of the fundamental concepts that will guide our future experiments.
Week 2: Getting Hands-On with the OpenAI API
Next, we’ll roll up our sleeves and start working with the OpenAI API. We’ll learn the basics of APIs, how to make requests, and how to get responses from the model using Python. We’ll create a simple Python script that summarises news articles, an easy but practical way to put our skills into action. By the end of this week, we’ll be ready to start building our small AI-powered applications!
Week 3: Mastering Advanced Prompting Techniques
As we get more comfortable, we’ll dive into the world of advanced prompting. This week is all about refining and optimising our prompts to get the most accurate and useful responses from the AI. We’ll also explore prompt chaining, context management, and how to create reusable prompt templates that will save us time and effort. By the end of this week, we’ll be experts in crafting dynamic prompts that can handle more complex tasks.
Week 4: Building Applications with LangChain
Week 4 is going to be hands-on and exciting! We’ll explore LangChain, a powerful framework for building applications that leverage language models. Together, we’ll create a Q&A bot that can read a PDF file and answer questions based on its contents. We’ll also learn how to integrate LangChain with other tools, like Google Search and PDF readers, to create truly interactive AI systems. It’s going to be a fun week of building something real!
Week 5: Unlocking the Power of Chunking and Embeddings
In our final week, we’ll explore retrieval-augmented generation (RAG), a technique that combines the power of AI with external data. We’ll learn how to chunk large documents into manageable pieces, store them in vector databases like FAISS, and retrieve the most relevant information for our AI systems. We’ll also explore how embedding models work, using tools from OpenAI and HuggingFace. By the end of the week, we’ll have built our RAG-powered chatbot, capable of answering questions with data drawn from stored documents.
Why Are We Learning Prompt Engineering?
At first glance, prompt engineering might sound a little niche. But in reality, it’s the core skill that makes AI models truly effective. The better we are at framing questions, structuring requests, and managing context, the more accurate and meaningful the AI’s responses will be. And with AI becoming an integral part of so many fields, whether it’s healthcare, finance, education, or creative industries, knowing how to guide these models will give us a powerful edge in this fast-evolving landscape.
By the end of these five weeks, we’ll have a solid understanding of how to craft effective prompts and the practical skills to build applications and solutions using AI. We’ll be able to leverage the power of AI for everything from summarising documents to answering complex questions to creating smart, data-driven systems.
Week | Pre-requisites |
Week 1 | Basic understanding of programming (Python), Familiarity with machine learning concepts, Basic understanding of neural networks |
Week 2 | Python programming basics, Knowledge of HTTP and web services, Familiarity with REST APIs (will cover the basics as well) |
Week 3 | Week 1 knowledge (prompting basics), Basic programming skills (Python), Familiarity with Python's string manipulation |
Week 4 | Week 1-3 knowledge (prompt engineering, basic API usage), Familiarity with Python libraries (e.g., requests, json), Basic understanding of software architecture |
Week 5 | Week 1-4 knowledge (prompt engineering, API usage, LangChain basics), Familiarity with vector databases and search techniques, Basic understanding of machine learning models (embeddings, FAISS) |
Knowing all of this beforehand is helpful, but I started from scratch myself, so you and I will learn all of these concepts together on this journey.
“The only thing that you need is consistency with high curiosity.”
Let’s Dive In!
Tomorrow, we’re jumping straight into Week 1 with an introduction to language models. I’m thrilled to have you join me on this learning adventure, and I can’t wait to share all the discoveries, challenges, and wins along the way. Together, we’ll figure out the best strategies for working with AI and help shape how it’s used in the future.
So, grab your laptop, bring your curiosity, and let’s dive in. This is going to be a fantastic journey! 🚀
Stay tuned for our first update tomorrow, and feel free to share your thoughts, questions, or your own experiences in the comments. Let’s make this an interactive and collaborative experience!
Week 1: Detailed Content | |
1. What is a Language Model? | |
1.1. Definition and Purpose of LLMs | |
1.1.1. NLP Overview | Understand Natural Language Processing and its applications in AI. |
1.1.2. Machine Learning Models in NLP | Explore statistical and neural network models used for NLP. |
1.1.3. Role of Language Models in NLP | Understand the purpose of a language model—predicting the next word, sentence completion, etc. |
1.1.4. Example LLM Use Cases | Text generation, translation, summarization, sentiment analysis, etc. |
1.2. Evolution of Language Models | |
1.2.1. Pre-Transformer Models | N-grams, bag-of-words, LSTMs, RNNs, GRUs. |
1.2.2. Introduction of Transformers | Attention mechanism, self-attention, and parallelization. |
1.2.3. BERT, GPT, T5 | Evolution of transformer-based models—bidirectional (BERT) vs autoregressive (GPT). |
1.2.4. Models Before Transformers | Rule-based models, statistical models. |
1.3. Core Components of a Language Model | |
1.3.1. Tokenization | How text is broken into tokens (words, subwords, characters). |
1.3.2. Vocabulary | Vocabulary size, encoding tokens as integers. |
1.3.3. Embeddings | Converting tokens into dense vectors in the embedding space. |
1.3.4. Attention Mechanism | Understanding how the model attends to different parts of the input text. |
1.3.5. Loss Functions | Cross-entropy loss, negative log-likelihood. |
1.4. LLM Evaluation Metrics | |
1.4.1. Perplexity | A measure of how well a language model predicts a sample. |
1.4.2. BLEU Score | Evaluation metric for machine translation quality. |
1.4.3. F1 Score | A measure of the balance between precision and recall in NLP tasks. |
1.4.4. ROUGE Score | Evaluation metric for summarization tasks. |
1.5. Language Models and Their Applications | |
1.5.1. Text Generation | How LLMs generate coherent text. |
1.5.2. Text Summarization | Automatic summarization using models like BERT and GPT. |
1.5.3. Translation and Multilingual Models | Use of LLMs in machine translation (e.g., Google Translate). |
1.5.4. Sentiment Analysis | How LLMs can understand and predict sentiment in text. |
2. Transformers & Tokenization Basics | |
2.1. The Transformer Architecture | |
2.1.1. Self-Attention | How attention helps capture long-range dependencies in sequences. |
2.1.2. Multi-Head Attention | Understanding how multiple attention heads capture diverse features. |
2.1.3. Position Encoding | Since transformers don’t have recurrence, position encoding is used to capture word order. |
2.1.4. Feedforward Networks | The MLP that follows each attention layer in transformers. |
2.1.5. Layer Normalization & Residual Connections | Techniques to stabilize training and allow deep architectures. |
2.2. Overview of Transformer Models | |
2.2.1. GPT (Generative Pretrained Transformer) | Auto-regressive model, unidirectional context. |
2.2.2. BERT | (Bidirectional Encoder Representations from Transformers) Masked language model, bidirectional context. |
2.2.3. T5 (Text-to-Text Transfer Transformer) | Unified model for multiple NLP tasks using text input/output. |
2.2.4. Vision Transformers (ViT) | Transformers are adapted for vision tasks. |
2.3. Tokenization Techniques | |
2.3.1. Word-level Tokenization | Breaking text into words, issues like out-of-vocabulary words. |
2.3.2. Subword Tokenization | Byte Pair Encoding (BPE), SentencePiece, and WordPiece (used by BERT). |
2.3.3. Character-level Tokenization | Tokenizing text at the character level, often used in languages with many morphologically complex words. |
2.3.4. Special Tokens | <CLS>, <SEP>, <PAD>, <UNK>—explaining the role of these tokens in different models. |
2.3.5. Pre-trained Tokenizers | How to use pre-trained tokenizers with libraries like HuggingFace. |
3. Prompt Engineering Overview | |
3.1. Introduction to Prompting | |
3.1.1. What is Prompting? | Defining how prompts are used to elicit responses from language models. |
3.1.2. Prompt-Based Learning | Using prompts to fine-tune models or guide their behavior. |
3.1.3. How LLMs Generate Responses | Explanation of the decoding process (greedy, beam search, temperature sampling). |
3.2. Types of Prompts | |
3.2.1. Zero-Shot Prompting | No examples, asking the model to generate responses based on the given instructions. |
3.2.2. Few-Shot Prompting | Providing a few examples for the model to learn from within the prompt. |
3.2.3. Chain-of-Thought Prompting | Encouraging reasoning by asking the model to break down its response step-by-step. |
3.2.4. In-Context Learning | The model can learn from the examples provided in the prompt without explicit retraining. |
3.3. Prompt Strategies | |
3.3.1. Instructions and Clarity | Writing clear instructions reduces ambiguity and improves model performance. |
3.3.2. Template-Based Prompts | Using templates for consistent structure and easier debugging (e.g., asking to generate a formal letter). |
3.3.3. Iterative Prompting | Refining prompts based on model outputs to achieve the best response. |
3.4. Use Cases of Prompting | |
3.4.1. Information Retrieval | Using prompts to ask the model to extract information from structured or unstructured data. |
3.4.2. Text Generation and Summarization | Prompting the model to generate coherent text or summaries of long documents. |
3.4.3. Code Generation and Debugging | Prompting LLMs to write or debug code (e.g., OpenAI Codex). |
3.4.4. Creative Tasks | Using prompts for creative content generation—poetry, stories, and art ideas. |
4. Types of Prompts: Zero-shot, Few-shot, Chain-of-thought | |
4.1. Zero-Shot Prompting | |
4.1.1. Definition | Asking the model to perform a task without any prior examples. |
4.1.2. Benefits and Limitations | High flexibility, but often less reliable than few-shot prompts. |
4.1.3. Examples | Text classification, question answering. |
4.2. Few-Shot Prompting | |
4.2.1. Definition | Providing a few examples to guide the model in completing a task. |
4.2.2. Benefits and Limitations | Improves reliability but still depends on the quality of examples. |
4.2.3. Examples | Text summarization, translation. |
4.3. Chain-of-Thought Prompting | |
4.3.1. Definition | Encouraging the model to explain its reasoning process to generate accurate results. |
4.3.2. Applications in Reasoning | Complex mathematical problems, logic puzzles. |
4.3.3. Example Prompts | “First, let’s break down the problem step by step…” |
This week is all about getting a solid grasp of the foundational concepts: understanding what LLMs are, how they evolved, and how we can work with them using prompts. Let’s start this day with some basic knowledge.
A language model (LM) is a type of machine learning model designed to understand and generate human language. To generate language (the output), the model needs a large amount of text as input, so that it can predict, generate, or even classify text based on what it has learned.
Language models are typically used to model the probability distribution of words, phrases, or sentences within a language. By understanding this distribution, they can predict the likelihood of a sequence of words, generate coherent text, or classify content based on linguistic patterns.
A language model doesn’t just store words; it also captures patterns of sentence structure, grammar, word relationships, and context. All of this helps the LM predict and generate well-formed sentences.
E.g., the sentence “My name is Krishna” could be rearranged as (illustrative variations):
1. "My name is Krishna."
2. "Krishna is my name."
3. "Name my is Krishna."
4. "Is Krishna my name?"
A language model is designed to recognise not only that the first and second sentences are correct, but also that the third sentence does not make sense and that the fourth sentence is a question. This judgment becomes possible when we give the model enough data (as input).
An LLM fulfils purposes such as text generation, translation, summarisation, and sentiment analysis (more on these in section 1.1.4 below).
1.1.1. NLP Overview: Understanding Natural Language Processing and Its Applications in AI
Let’s go one step further and discuss Natural Language Processing (NLP), a branch of artificial intelligence (AI) that focuses on the interaction between computers and human language. It involves designing algorithms and models that enable computers to process and understand human language, both written and spoken. The goal of NLP is to bridge the gap between human communication and machine understanding.
Key Components of NLP:
NLP is a vast ocean, and the language model is one tool within it that does the job we discussed above.
Now look at this example: suppose we ask the model to complete a nonsensical sentence such as “The table plays chess on ___.”
Although this makes no sense to us, if we treat the table as an entity, then it can play chess on ___ (add your creativity). This is called prediction, where the model predicts the possible continuation of a sentence. Remember, in situations where the sentence makes no sense, we (humans) have effectively given the AI two choices: either it will say, “I can’t form a sentence because of…” (some reason), or it will generate a result anyway, and if asked further about that answer, it replies with its reasoning.
Of course, the AI ethics principle of non-maleficence saved me here, otherwise ChatGPT would have got very mad at me for giving it such a prompt and then questioning the 'I' of 'AI'.
Applications of NLP in AI:
NLP is at the heart of many modern AI applications, including virtual assistants, automated customer support, and content generation. Further on in this blog, we’ll learn about and explore other tools that help NLP fulfil its purpose.
1.1.2. Machine Learning Models in NLP: Explore Statistical and Neural Network Models Used for NLP
Machine learning models in NLP can broadly be categorized into two types: statistical models and neural network models.
Statistical models: these use mathematical techniques to capture patterns in language data, typically by analysing word frequencies, probabilities, and statistical dependencies between words.
Training: The model is trained by counting how often each n-gram appears in a large text corpus (collection of texts). For example, in a bigram model, we would count how many times "I love" appears, how many times "love pizza" appears, etc.
Prediction: Once the model is trained, the system can predict the next word in a sequence. In a bigram model, given a word, the model looks at the word immediately preceding it to predict the next one.
Example:
If the model has seen the sequence "I love", it might predict "pizza" as the next word because "love pizza" has appeared frequently in the training data.
NOTE: When a word combination hasn't been seen during training, the model may assign a zero probability to it. To avoid this issue, techniques like Laplace smoothing are used to adjust probabilities slightly to account for unseen combinations.
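To make this concrete, here is a tiny, hedged sketch in Python of a bigram model with add-one (Laplace) smoothing; the corpus and word choices are invented purely for illustration.

```python
# A toy bigram model with Laplace (add-one) smoothing - a sketch of the idea, not production code.
from collections import Counter, defaultdict

corpus = [
    "i love pizza",
    "i love music",
    "i love coffee",
    "you love pizza",
]

# Count unigrams and bigrams in the training corpus.
unigram_counts = Counter()
bigram_counts = defaultdict(Counter)
vocab = set()
for sentence in corpus:
    words = sentence.split()
    vocab.update(words)
    for w1, w2 in zip(words, words[1:]):
        unigram_counts[w1] += 1
        bigram_counts[w1][w2] += 1

def bigram_prob(w1, w2):
    # P(w2 | w1) with add-one smoothing, so unseen pairs never get probability zero.
    return (bigram_counts[w1][w2] + 1) / (unigram_counts[w1] + len(vocab))

def predict_next(w1):
    # Pick the word with the highest smoothed probability of following w1.
    return max(vocab, key=lambda w2: bigram_prob(w1, w2))

print(predict_next("love"))          # likely "pizza", since it follows "love" most often
print(bigram_prob("love", "music"))  # smaller, but non-zero thanks to smoothing
```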
Hidden Markov Models (HMMs) are another classic statistical approach. In the case of language, the states are often "hidden", meaning we can observe the words (emissions) but we don't directly observe the underlying grammatical categories (states).
An HMM is made up of hidden states, observations (emissions), transition probabilities (how likely one state follows another), and emission probabilities (how likely a state produces a given word).
Training: the transition and emission probabilities are estimated from a labelled corpus, by counting how often states follow each other and how often each state emits each word.
Prediction: given a new sentence, an algorithm such as Viterbi finds the most likely sequence of hidden states (for example, part-of-speech tags) for the observed words.
Example in POS tagging:
States: Noun, Verb, Adjective
Observations: "dog", "barked", "quickly"
The system tries to figure out the most likely sequence of POS tags for the words, given what it's learned about how often nouns, verbs, and adjectives appear and how they transition.
Neural network models are deep learning models that revolutionised NLP. They can capture more complex patterns in language by learning from large datasets through layers of interconnected nodes. Types of neural network models include RNNs, LSTMs, GRUs, and transformers.
Neural network models are especially popular because they scale well with large datasets and can model complex relationships in language data.
In this study, we are going to cover the transformer, which is the most advanced of these neural network architectures.
1.1.3. Role of Language Models in NLP: Predicting the Next Word, Sentence Completion, etc.
Language models play a central role in NLP by predicting the next word or sequence of words in a given text, completing sentences, or even generating entirely new text based on an input prompt. These are the primary functions of language models.
The core idea is that language models can predict the most probable next unit (word, phrase, sentence) by leveraging vast amounts of linguistic data they were trained on.
NOTE: Nearly 45 terabytes of compressed text data were used to train GPT-3.
1.1.4. Example LLM Use Cases: Text Generation, Translation, Summarization, Sentiment Analysis, etc.
Large Language Models (LLMs) are deployed across various industries for a wide range of tasks. Some notable use cases include text generation, translation, summarisation, sentiment analysis, code generation, and question answering.
This section will help in understanding the evolution of LLMs and how we moved from simpler models to the complex ones we use now:
1.2.1 Before Transformers: The early days of LMs.
Before transformers (which we will discuss later in this blog post), language models looked very different. N-gram models predicted the next word by looking only at the previous few words. This approach works reasonably well for short phrases, but it struggles with longer sentences and with language as a whole.
For example, in a trigram model (which looks at two words back), if you've seen a lot of examples of:
"I love pizza.",
"I love music.",
"I love coffee."
Then the model would likely predict “pizza” after “I love”.
Problem: It doesn't know what you were talking about before. If the full paragraph was:
“My favourite car is a BMW. I like driving fast. I love___”
The n-gram model might still say “pizza” because it can’t see or understand the wider context.
1.2.2 Then came Recurrent Neural Networks (RNNs). These were a big step forward. RNNs processed sentences word by word and kept a sort of "memory" of what had come before. This made them better at understanding the flow of language. But they had their problems too. Over time, they’d forget earlier words in a long sentence, which made them less reliable for longer texts.
For example, “My favourite car is a BMW. I like driving fast. I love___”
An RNN might do better than an n-gram and guess “cars” or “driving”. But:
Problem: If the sentence is very long, the early words (like “BMW”) get forgotten. The network struggles to “remember” important context far back in the sequence.
1.2.3 To fix that, Long Short-Term Memory networks (LSTMs) were introduced. LSTMs are a type of RNN designed to hold on to important information for longer periods. They could remember relevant words even if they appeared much earlier in a sentence. But despite these improvements, RNNs and LSTMs were slow to train and couldn't take full advantage of modern computer hardware.
For example, “My favourite car is a BMW. I like driving fast. I love___”
An LSTM might remember that the topic is about cars and correctly guess “cars” instead of “pizza”.
Still a Problem: While better than RNNs, they still don’t always handle very long or complex sentences well, especially with multiple topics or paragraphs.
1.2.4 Transformers: A Big Leap Forward: Transformers completely changed the way we build language models. Introduced in 2017, they got rid of recurrence altogether and brought in something called the attention mechanism. This lets the model focus on different parts of a sentence all at once, rather than going through it word by word.
The key idea is self-attention. This allows every word in a sentence to look at every other word, so the model can understand the meaning based on the full context. For example, in the sentence "The cat sat on the mat because it was tired", the word "it" can be linked back to "cat" more easily using self-attention.
Another big benefit of transformers is parallelisation. Because they don't rely on processing words one after another, they can handle lots of words at the same time. This makes training much faster and more efficient, especially on powerful hardware like GPUs.
1.2.5 BERT, GPT, and T5: Different Paths to Language Understanding. With the transformer architecture in place, researchers began building more advanced models.
BERT (Bidirectional Encoder Representations from Transformers) was one of the first big successes. It reads text both forwards and backwards, allowing it to understand the full meaning of a sentence before making a prediction. This is great for tasks like answering questions or finding relevant information in a passage.
On the other hand, GPT (Generative Pre-trained Transformer) takes a different approach. It only reads text in one direction, from left to right, and is trained to predict the next word in a sentence. This makes it better suited for generating new text, like writing stories or having conversations.
Then there’s T5 (Text-to-Text Transfer Transformer), which takes things a step further. It treats every language task (translation, summarisation, answering questions, and so on) as a text-to-text problem. So you give it text as input, and it gives you new text as output, no matter what the task is. It’s a flexible and powerful approach that’s been very effective.
NOTE: I’ve noticed that when I use ChatGPT without logging in, it sometimes forgets the previous chats and responds only to the latest message. However, if I include “…and I hope you remember our last chat regarding…” in my prompt, ChatGPT gives a better response. This shows that, in its present state, there is still a lot to improve in this field, and prompt engineering is a key part of the solution.
Code: let’s check the same behaviour in GPT-4.
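Here is a rough sketch of the idea: the model only "remembers" context you resend with the request. It assumes the pre-1.0 `openai` package's ChatCompletion interface and access to a GPT-4 model; both are assumptions on my part.

```python
# Sketch: the model only "remembers" what is sent again as part of the conversation.
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Without history: the model has no idea what "it" refers to.
no_history = [{"role": "user", "content": "What colour is it?"}]

# With history: the earlier turns are resent, so the model can resolve "it".
with_history = [
    {"role": "user", "content": "My favourite car is a BMW."},
    {"role": "assistant", "content": "Nice choice! BMWs are known for their handling."},
    {"role": "user", "content": "What colour is it?"},
]

for label, messages in [("no history", no_history), ("with history", with_history)]:
    response = openai.ChatCompletion.create(model="gpt-4", messages=messages, max_tokens=60)
    print(f"--- {label} ---")
    print(response.choices[0].message["content"].strip())
```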
Once we understand the theory, let’s dig into the technical bits that make language models tick. These are the core building blocks of large language models (LLMs), and each one plays an important role in helping the model understand and generate human language.
1.3.1 Tokenization: Before a language model can understand text, it needs to break it down into smaller pieces. This is called tokenisation. Tokens are usually words, parts of words, or sometimes even individual characters.
For example, the sentence:
“I’m going to the shop.”
Might be split into tokens like:
["I", "’m", "going", "to", "the", "shop", "."]
Some models, like GPT, use something called subword tokenisation, which means they break rare or unfamiliar words into smaller, more common parts. This way, the model doesn’t get stuck on words it hasn’t seen before.
1.3.2 Vocabulary: Once text is broken into tokens, each one is mapped to a number using the model’s vocabulary. Think of the vocabulary as a huge list of all the tokens the model knows about. Each token has a unique ID, just a number that the model can work with.
But what happens if the model sees a word it’s never seen before? Thanks to tokenisation, the word is usually broken into smaller chunks that are in the vocabulary. This way, the model can still handle new words, even if it’s never seen them in training.
1.3.3 Embeddings: How do embeddings work? Now that we have numbers for each token, we need to turn those numbers into something the model can understand in a meaningful way. That’s where embeddings come in.
An embedding is a dense vector (a list of numbers) that captures the meaning of a word. Similar words have similar vectors. For example, the words “king” and “queen” would have embeddings that are close to each other in the embedding space.
These embeddings are what allow the model to pick up on patterns and relationships between words, like knowing that “Paris” is to “France” as “Tokyo” is to “Japan”.
1.3.4 Attention Mechanism: The attention mechanism is one of the most powerful parts of modern language models. It helps the model decide which words in a sentence are most important to focus on when making predictions.
For instance, in the sentence:
“The dog that chased the cat was barking loudly.”
The attention mechanism helps the model understand that “was barking” is referring to “the dog”, not “the cat”.
In transformers, we use self-attention, which allows every word to look at every other word in the sentence. This helps the model understand the full context, no matter where important words appear.
1.3.5 Loss Functions: When training a model, we need a way to measure how well (or badly) it’s doing. This is the job of the loss function. One common type is called cross-entropy loss.
Here’s how it works: the model makes a prediction (e.g., the next word in a sentence), and the loss function compares that prediction to the actual correct answer. The difference between the two becomes the loss. A high loss means the model got it wrong; a low loss means it got it right.
The model then uses this information to adjust its internal settings (known as weights), so it can make better predictions next time.
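A tiny worked example (with made-up probabilities) shows how cross-entropy loss behaves:

```python
# Illustration of cross-entropy loss for next-word prediction; the probabilities are invented.
import math

# The model's predicted probabilities for the next word after "I love ...".
predicted = {"pizza": 0.55, "music": 0.25, "cars": 0.15, "rain": 0.05}

actual_next_word = "pizza"

# Cross-entropy loss = negative log of the probability assigned to the correct word.
loss = -math.log(predicted[actual_next_word])
print(f"Loss when the model is fairly confident: {loss:.3f}")   # about 0.598

# If the correct word had been "rain" (which the model thought unlikely), the loss is much higher.
loss_bad = -math.log(predicted["rain"])
print(f"Loss when the model is badly wrong: {loss_bad:.3f}")    # about 2.996
```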
Building a language model is one thing, but we also need ways to measure how well it works. In this section, we will look at some common ways to evaluate large language models (LLMs):
1.4.1 Perplexity: This measures how well a model can predict the next word in a sentence. A low perplexity means the model is good at predicting the next word, while a high perplexity means it struggles. I’ll explain how this is used to assess the quality of the model's predictions.
1.4.2 BLEU, ROUGE, and F1 Scores: These are used to evaluate how well a model performs tasks like translating text or summarising it. I’ll explain each of these metrics and how they help us understand how well a model is doing in these areas.
Now let’s play with some code and do some practical work. We’ll use simple Python to understand how these evaluation metrics work in the real world.
Don’t worry: the comments in the sketch below describe what each output means.
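A rough sketch of these metrics in Python, assuming the nltk, rouge-score, and scikit-learn packages are installed; the sentences, labels, and probabilities are made up.

```python
# A rough sketch of common LLM evaluation metrics with toy data.
import math
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer
from sklearn.metrics import f1_score

# Perplexity: derived from the probabilities the model assigned to each actual next token.
token_probs = [0.25, 0.10, 0.60, 0.05]                       # hypothetical per-token probabilities
avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
print(f"Perplexity: {math.exp(avg_nll):.2f}")                 # lower is better

# BLEU: n-gram overlap between a candidate translation and a reference.
reference = [["the", "cat", "sat", "on", "the", "mat"]]
candidate = ["the", "cat", "is", "on", "the", "mat"]
bleu = sentence_bleu(reference, candidate, smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {bleu:.2f}")

# ROUGE: overlap-based metric commonly used for summarisation.
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
rouge = scorer.score("the cat sat on the mat", "the cat is on the mat")
print(f"ROUGE-1 F1: {rouge['rouge1'].fmeasure:.2f}")

# F1: the balance of precision and recall for a classification task (e.g. sentiment labels).
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(f"F1: {f1_score(y_true, y_pred):.2f}")
```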
As we’ve learnt, transformers have revolutionised the field of natural language processing (NLP) by enabling models to understand language with high accuracy and efficiency. This section introduces the fundamentals of the transformer architecture and the concept of tokenisation, which is essential for preparing text for input into these models.
The transformer architecture, first introduced in the paper "Attention is All You Need" (Vaswani et al., 2017), is designed to handle sequential data, such as text, without relying on recurrence. It uses mechanisms like attention and position encoding to process input data in parallel and more effectively.
2.1.1. Self-Attention
Self-attention is a core component of transformers. It allows the model to focus on different parts of the input sequence when processing each word. For instance, when interpreting the word "bank" in the sentence "He sat by the river bank," self-attention helps the model consider surrounding words like "river" to understand the correct meaning.
Self-attention computes the importance of each word in a sentence relative to every other word, enabling the model to capture long-range dependencies. This is crucial in understanding context over long sentences or paragraphs.
2.1.2. Multi-Head Attention
Multi-head attention is an extension of self-attention. Instead of having a single attention mechanism, the model uses multiple attention heads in parallel. Each head learns to focus on different parts or aspects of the input.
This enables the model to capture diverse linguistic features. For example, one head might learn to focus on syntactic structure, while another might capture semantic relationships. The outputs from each head are then combined and passed through the network.
2.1.3. Position Encoding
Transformers, unlike RNNs, do not process words in a specific order by default. To provide the model with information about the position of words in a sentence, position encoding is added to the input embeddings.
These encodings are fixed or learned values that are added to each word’s embedding vector. They allow the model to understand the order of the words, which is essential for meaning.
2.1.4. Feedforward Networks
Each attention layer in a transformer is followed by a feedforward network. This is a simple multi-layer perceptron (MLP) applied independently to each position.
The purpose of this component is to further process the output of the attention mechanism, enabling the model to transform information in a non-linear way and improve its representational power.
2.1.5. Layer Normalisation & Residual Connections
Layer normalisation is used to stabilise and speed up training by normalising the input to each layer. This helps the model converge more quickly and reliably.
Residual connections are shortcuts that add the input of a layer to its output. They help in training deep networks by preventing the vanishing gradient problem and make it easier for the model to learn identity functions where necessary.
Several transformer models have been developed for various NLP tasks, each with unique characteristics and uses.
2.2.1. GPT (Generative Pretrained Transformer)
GPT is an auto-regressive model, meaning it predicts the next word in a sequence based on the previous words. It processes text in a left-to-right fashion, making it unidirectional.
This design is well-suited for tasks like text generation and completion. GPT models are pretrained on large amounts of text and fine-tuned for specific tasks.
2.2.2. BERT (Bidirectional Encoder Representations from Transformers)
BERT is designed to understand the context of a word based on both its left and right sides, making it bidirectional. It uses a masked language modelling approach during training, where some words are hidden and the model learns to predict them.
This allows BERT to generate richer representations of language, making it effective for tasks like question answering and sentence classification.
2.2.3. T5 (Text-to-Text Transfer Transformer)
T5 is a versatile model that treats all NLP tasks as a form of text-to-text problem. Whether it’s translation, summarisation, or question answering, the input and output are always text.
This unified approach simplifies the architecture and makes it easier to apply the model to a wide range of applications.
2.2.4. Vision Transformers (ViT)
Vision Transformers adapt the transformer model for image processing. Instead of using sequences of words, ViTs split images into patches and treat each patch as a token.
This allows the model to learn visual features using the same self-attention mechanism, achieving impressive results in image classification tasks.
Before inputting text into a transformer model, it must be tokenised. Tokenisation is the process of breaking text into smaller units, called tokens, which the model can understand.
2.3.1. Word-level Tokenisation
This method splits text into individual words. It is simple but can struggle with out-of-vocabulary (OOV) words—words not seen during training—which can limit the model’s ability to understand rare or new words.
2.3.2. Subword Tokenisation
Subword tokenisation addresses the OOV problem by breaking words into smaller units. Common techniques include Byte Pair Encoding (BPE), WordPiece (used by BERT), and SentencePiece.
This allows the model to represent even unseen words through combinations of known subwords.
2.3.3. Character-level Tokenisation
This method tokenises text at the level of individual characters. It is useful for languages with rich morphology or when dealing with noisy text (e.g., typos).
However, it results in longer sequences, which can increase computational cost.
2.3.4. Special Tokens
Transformers often use special tokens to manage input and output: <CLS> (a classification/summary token placed at the start), <SEP> (a separator between segments), <PAD> (padding to a fixed length), and <UNK> (an unknown token for out-of-vocabulary pieces).
Understanding these tokens is essential for effectively using pre-trained models.
2.3.5. Pre-trained Tokenizers
Libraries like HuggingFace provide pre-trained tokenizers for many popular models. These tokenizers are trained on large corpora and follow the exact tokenization strategy used during the model's training.
Using a pre-trained tokenizer ensures compatibility and helps maintain the performance of the model on downstream tasks.
Code: Explore Tokenization Using Hugging Face
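Here is a minimal sketch using the HuggingFace transformers library; the example sentence is just an illustration, and the exact token IDs you see will depend on the tokenizer you load.

```python
# Minimal tokenization sketch with a pre-trained HuggingFace tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Image of a pig flipping a pancake"   # illustrative sentence
tokens = tokenizer.tokenize(text)
token_ids = tokenizer.convert_tokens_to_ids(tokens)

print(tokens)      # subword pieces; rare words get split with the ## prefix, e.g. 'flip', '##ping'
print(token_ids)   # each token mapped to its integer ID in the vocabulary
```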
Output:
Here we can see that each word has been assigned a specific token ID, like Image = 15065, pig = 13407, etc. There are roughly 171,000 words in the Oxford English Dictionary, and many of them are related in spelling, like flip, flipper, flipping, and flipped. These are all different English words, but the tokenizer treats them differently: BERT’s vocabulary in the HuggingFace transformers library contains only about 30,000 tokens, and other words are built from subword pieces marked with ##.
e.g. “flipper” becomes “flip” + “##per”.
Case sensitivity also depends on which model variant you use:
bert-base-cased is case sensitive, so “Tip” != “tip”.
bert-base-uncased is case insensitive, so “Tip” == “tip”.
Code: Visualize Attention with BERT
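A minimal sketch of extracting BERT's attention weights with the transformers library; for richer interactive visualisations, a library such as bertviz can render these same weights.

```python
# Extract and inspect BERT's self-attention weights (assumes transformers and torch are installed).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)

sentence = "The dog that chased the cat was barking loudly."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

attentions = outputs.attentions                          # tuple: one tensor per layer
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())

# Attention from the last layer, averaged over heads: rows = query tokens, columns = key tokens.
last_layer = attentions[-1][0].mean(dim=0)
for i, tok in enumerate(tokens):
    top = last_layer[i].topk(3).indices.tolist()
    print(tok, "attends most to", [tokens[j] for j in top])
```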
Output:
Finally, we’ll dive into Prompt Engineering, which is the art of writing good inputs (called prompts) to get useful, accurate, or creative responses from a large language model (LLM). Think of it like asking the right question in the right way. If you know how to guide the model with a clear and well-structured prompt, you're far more likely to get the result you want.
In this section, we will explore different prompting techniques to see how they influence the model’s output.
3.1.1. What is Prompting?
Prompting refers to crafting input text (prompts) that elicit meaningful, accurate, and helpful responses from an LLM.
Use your OpenAI API key and try it yourself; a minimal sketch follows.
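For example, a minimal sketch using the pre-1.0 `openai` package (the same style as the full script later in this post); the prompt is just an illustration.

```python
# Minimal prompting sketch; set the OPENAI_API_KEY environment variable first.
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Explain, in two sentences, what a language model is.",
    max_tokens=60,
    temperature=0.7,
)
print(response.choices[0].text.strip())
```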
3.1.2. Prompt-Based Learning
Instead of training a new model, prompt-based learning leverages clever prompting to teach the model desired behaviour.
Example: instead of fine-tuning a separate sentiment classifier, you can prompt the model directly, e.g. 'Decide whether this review is positive or negative: "The film was a complete waste of time." Sentiment:'
3.1.3. How LLMs Generate Responses
LLMs use decoding algorithms such as greedy decoding (always pick the most likely next token), beam search (keep several candidate sequences), and sampling controlled by temperature and top-p.
Example (temperature setting): see the sketch below.
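This sketch sends the same hypothetical prompt at two different temperatures; it assumes the pre-1.0 `openai` package and an API key in the environment.

```python
# Compare near-deterministic decoding (temperature 0) with more diverse sampling (temperature 1).
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

prompt = "Suggest a name for a coffee shop run by robots."

for temperature in (0.0, 1.0):
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=20,
        temperature=temperature,   # 0.0 = greedy-like, 1.0 = more varied output
    )
    print(temperature, "->", response.choices[0].text.strip())
```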
Now for the main types of prompts.
3.2.1. Zero-Shot Prompting. Example: Translate this sentence to French: "I love reading books."
3.2.2. Few-Shot Prompting. Example: Translate the following sentences to French:
English: "Good morning." → French: "Bonjour."
English: "Thank you." → French: "Merci."
English: "I love reading books." →
The few-shot version gives the model a pattern to follow, which often leads to more accurate or stylistically consistent results.
3.2.3. Chain-of-Thought Prompting. This is helpful for tasks that involve logic or reasoning.
Example:
Basic Prompt:
What is 27 times 14?
CoT Prompt:
Let’s think step-by-step.
What is 27 Ă— 14?
The model might reply:
27 Ă— 14 = (20 Ă— 14) + (7 Ă— 14) = 280 + 98 = 378.
By asking it to think step-by-step, we can improve both the accuracy and the transparency of how the model thinks.
If you want to test these prompt types yourself, you can do so using the OpenAI API; a sketch follows.
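A sketch that sends the three prompt styles above to the API and prints the replies; it assumes the pre-1.0 `openai` package and the OPENAI_API_KEY environment variable.

```python
# Send zero-shot, few-shot, and chain-of-thought prompts and compare the replies.
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

prompts = {
    "zero-shot": 'Translate this sentence to French: "I love reading books."',
    "few-shot": (
        'Translate the following sentences to French:\n'
        'English: "Good morning." -> French: "Bonjour."\n'
        'English: "Thank you." -> French: "Merci."\n'
        'English: "I love reading books." -> French:'
    ),
    "chain-of-thought": "Let's think step-by-step. What is 27 x 14?",
}

for name, prompt in prompts.items():
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=100,
        temperature=0.3,
    )
    print(f"--- {name} ---")
    print(response.choices[0].text.strip())
```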
3.3.1. Instructions and Clarity
Clear and specific instructions lead to better outputs.
Before:
"Write about climate change."
After:
"Write a 100-word paragraph summarizing the causes and effects of climate change in simple language."
3.3.2. Template-Based Prompts
Use templates for repeatable tasks.
Prompt Template (formal letter):
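Here is one possible template, built with a Python f-string; the recipient, topic, and sender values are just placeholders.

```python
# A simple reusable prompt template built with an f-string (example values only).
recipient = "Hiring Manager"
topic = "an application for the Data Analyst role"
sender = "Krishna"

prompt = (
    f"Write a formal letter to {recipient} regarding {topic}. "
    f"Keep it under 200 words, use a polite and professional tone, and sign it as {sender}."
)
print(prompt)
```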
3.3.3. Iterative Prompting
Refine the prompt based on earlier responses to improve the quality.
Example process: start with a broad prompt, review the output, then add constraints (length, tone, format, audience) in the next prompt, and repeat until the response fits your needs.
3.4.1 Information Retrieval
Ask for specific facts or data.
Result by ChatGPT:
3.4.2 Text Generation and Summarization
Generate original content or summarize existing text.
Result:
3.4.3 Code Generation and Debugging
Ask the model to write or fix code.
Result:
3.4.4 Creative Tasks
Use prompting to generate poems, stories, and ideas.
Result:
Prompting refers to the way a user formulates an input or query to guide an AI model to produce a desired output. The three types described here vary by how much context or guidance is given to the model before it performs the task.
In this section, we will explore different prompting techniques to see how they influence the model’s output.
4.1.1. Definition
Zero-shot prompting involves asking the model to perform a task without providing any examples. You simply describe what you want, and the model uses its training to figure out what to do.
Example prompt: "Classify the sentiment of this sentence: 'I love this movie.'"
4.1.2. Benefits and Limitations
Zero-shot prompting is highly flexible and needs no examples, but it is often less reliable than few-shot prompting, especially for unusual tasks.
4.1.3. Examples
Text classification, question answering.
4.2.1. Definition
Few-shot prompting provides the model with a small number of examples to illustrate how to perform the task. These examples help the model understand the format and desired output.
4.2.2. Benefits and Limitations
Few-shot prompting improves reliability and consistency, but the result still depends heavily on the quality and relevance of the examples you provide.
4.2.3. Examples
Text summarisation, translation.
4.3.1. Definition
Chain-of-thought (CoT) prompting encourages the model to explicitly lay out its reasoning process before giving a final answer. This is especially helpful for tasks that require multiple steps of logic.
Example prompt:
"If a train leaves at 3 PM and travels for 2 hours, what time does it arrive?
Let's think step by step:"
The model might reply: the train leaves at 3 PM and travels for 2 hours, so 3 PM + 2 hours = 5 PM; it arrives at 5 PM.
4.3.2. Applications in Reasoning
Used for problems where thinking step-by-step leads to better answers, especially complex mathematical problems, multi-step logic puzzles, and word problems.
4.3.3. Example Prompts
“First, let’s break down the problem step by step…”
“Explain your reasoning before giving the final answer.”
This method boosts performance in complex reasoning tasks, as the model mimics human problem-solving behaviour.
SUNDAY: REVISE THE WHOLE THING FOR BETTER UNDERSTANDING OF THE BASICS
Week 2: Detailed Content | |
1. What is an API? How REST APIs Work? | |
1.1. Basics of APIs | |
1.1.1. What is an API? | Definition of Application Programming Interface, its role in software communication. |
1.1.2. API Types | RESTful, SOAP, GraphQL, WebSockets—focus on REST for now. |
1.1.3. Client-Server Architecture | How the client requests data, and the server processes and responds. |
1.2. REST Architecture | |
1.2.1. RESTful Principles | Statelessness, client-server model, uniform interface, layered system, and caching. |
1.2.2. HTTP Methods | GET, POST, PUT, DELETE—how they correspond to actions in REST. |
1.2.3. HTTP Status Codes | 200 (OK), 404 (Not Found), 500 (Internal Server Error), etc. |
1.2.4. Endpoints | URL structure and how API requests map to resources (e.g., https://api.openai.com/v1/completions). |
1.3. Making API Requests | |
1.3.1. Headers and Authentication | How to authenticate API requests (using API keys). |
1.3.2. Query Parameters | Sending additional parameters in your requests (e.g., temperature, max_tokens). |
1.3.3. POST Requests | Sending data in the body (usually in JSON format). |
2. OpenAI API Key, Environment Setup | |
2.1. Getting API Access | |
2.1.1. Creating an OpenAI Account | Sign up for API access and retrieve the API key. |
2.1.2. Setting up API Key in Python | Use environment variables to securely store your API key. |
2.1.3. Introduction to OpenAI API Playground | Explore the OpenAI API playground to understand how it works before coding. |
2.2. Setting up Development Environment | |
2.2.1. Installing Libraries | How to install the OpenAI Python package using pip install openai. |
2.2.2. Setting Up Virtual Environments | Using venv or conda for environment management. |
2.2.3. Basic Authentication | How to authenticate requests using the OpenAI API key in your code. |
3. Making Basic Requests Using Python (Requests, OpenAI Package) | |
3.1. Using the Requests Package | |
3.1.1. Sending GET/POST Requests | How to make API requests using Python’s requests module. |
3.1.2. Handling JSON Responses | Understanding how to parse and handle JSON data returned by the API. |
3.2. Using the OpenAI Package | |
3.2.1. Text Completion with GPT-3 | Simple example of requesting the GPT-3 API for text generation. |
3.2.2. Parameters for Text Generation | Understanding parameters like temperature, max_tokens, top_p, and frequency_penalty. |
3.2.3. Error Handling | How to handle potential API errors (timeouts, rate limiting). |
4. Build a Small Python Script to Summarize News Articles | |
4.1. Problem Setup | |
4.1.1. Input Format | Collecting raw text data (news articles) from the web or a local file. |
4.1.2. Preprocessing | Cleaning and preparing text (removing HTML tags, stopwords, etc.). |
4.2. Summarization API Request | |
4.2.1. Making the Request | Send the article text to OpenAI’s API with a prompt like “Summarize this article in 200 words.” |
4.2.2. Extracting and Displaying the Summary | Parse the API response and extract the summary. |
4.3. Output | |
4.3.1. Testing the Script | Run the script with different articles to see how it handles various content types. |
4.3.2. Enhancements | Consider adding options like adjusting summary length or summarizing in a specific tone (formal, casual). |
1.1.1. What is an API?
API stands for Application Programming Interface, which is a set of rules that allows two software programs to talk to each other. Think of it as a waiter in a restaurant: the customer (client) tells the waiter what they want, and the waiter brings the food from the kitchen (server).
Example in real life: When you use a weather app on your phone, it sends a request to a weather service's API, which then returns the current weather data.
1.1.2. API Types
There are several types of APIs, including RESTful, SOAP, GraphQL, and WebSockets. We’ll be focusing only on REST for now, as it’s the simplest and most widely used.
1.1.3. Client-Server Architecture
In this setup, the client (your app or script) asks for something, and the server (the API provider) processes the request and sends back the data.
Flow: the client sends a request → the server processes it → the server returns a response → the client uses the data.
1.2.1. RESTful Principles
REST APIs follow a few core principles: statelessness (each request carries everything the server needs), a client-server model, a uniform interface, a layered system, and caching.
1.2.2 HTTP Methods
The main methods correspond to actions on resources: GET (read), POST (create), PUT (update), DELETE (remove).
1.2.3 HTTP Status Codes
Responses include status codes such as 200 (OK), 404 (Not Found), and 500 (Internal Server Error).
1.2.4. Endpoints
An endpoint is the URL used to access a resource.
Example: https://api.openai.com/v1/completions is the endpoint for OpenAI’s text completions.
1.3.1. Headers and Authentication
To use many APIs, you need to authenticate using an API key.
Steps: get an API key from the provider, include it in the request headers (usually as a Bearer token), and then send the request.
Example in Python:
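A minimal sketch with the requests library; the URL is a hypothetical placeholder, and the key is read from an environment variable rather than hard-coded.

```python
# Authenticated GET request with an API key in the headers (hypothetical endpoint).
import os
import requests

api_key = os.getenv("OPENAI_API_KEY")
headers = {
    "Authorization": f"Bearer {api_key}",   # most APIs expect the key as a Bearer token
    "Content-Type": "application/json",
}

response = requests.get("https://api.example.com/v1/status", headers=headers)
print(response.status_code)
```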
1.3.2. Query Parameters
These are extra values added to the URL to filter or control what data you get.
Example (hypothetical endpoint): https://api.example.com/v1/articles?limit=5&sort=latest
1.3.3. POST Requests
POST requests are used to send data to the server (e.g. adding a new user). The data is usually in JSON format.
Steps to send a POST request: build the endpoint URL, prepare the data as a Python dictionary, and send it with requests.post(), passing the data through the json= argument.
Python example:
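A generic sketch of a POST request with a JSON body; the endpoint and fields are hypothetical.

```python
# Send JSON data to a (hypothetical) API endpoint with a POST request.
import requests

new_user = {"name": "Asha", "email": "asha@example.com"}
response = requests.post("https://api.example.com/v1/users", json=new_user)  # json= serialises the dict

print(response.status_code)   # e.g. 200 or 201 on success
print(response.json())        # the server's JSON reply
```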
In this section, we'll discuss and deep dive through the process of getting access to the OpenAI API, setting up your development environment, and using the API securely. Whether you're new to the OpenAI API or just need a refresher, this step-by-step guide will help you get started efficiently.
Before you can use the OpenAI API, you need to set up an account and retrieve your API key. The API key is a unique code that authenticates your requests to OpenAI's servers. Follow these steps to get started:
2.1.1. Creating an OpenAI Account
Sign up on the OpenAI platform, then create an API key from your account dashboard; you’ll need this key to authenticate every request.
2.1.2. Setting up the API Key in Python
To securely use your OpenAI API key in Python, it's recommended to store it as an environment variable, rather than hard-coding it directly in your script. This keeps your key safe, especially when you share or store your code in version control systems like Git.
Here’s how you can set it up:
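A minimal sketch of reading the key from an environment variable; the export/setx commands in the comment are the usual way to set it, but adapt them to your shell.

```python
# Read the key from an environment variable instead of hard-coding it.
# First set the variable in your shell, for example:
#   export OPENAI_API_KEY="sk-..."     (macOS/Linux)
#   setx OPENAI_API_KEY "sk-..."       (Windows; applies to new terminals)
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")   # returns None if the variable isn't set
if openai.api_key is None:
    raise RuntimeError("OPENAI_API_KEY is not set")
```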
2.1.3. Introduction to OpenAI API Playground
Before you dive into coding, it’s useful to explore the OpenAI API Playground. The Playground provides an interactive interface where you can experiment with different GPT models (like GPT-3 or GPT-4), test various prompts, and tweak parameters to see how the model responds.
Now that you’ve set up the API key and explored the Playground, it’s time to prepare your development environment. This ensures that all your packages and dependencies are neatly organised and isolated.
2.2.1. Installing Libraries
To interact with OpenAI's API in Python, you need to install the openai Python package. This library allows you to make requests to the OpenAI API, send prompts, and receive responses.
You can install the package using pip, the Python package installer. Run this command in your terminal: pip install openai
To ensure that the library was installed correctly, try importing it in your Python script with import openai.
If you don't see any errors, you're good to go. I didn't run into any issues myself, so it's hard for me to predict what might go wrong for you, but help is always available: the comment section, online resources, me directly, or ChatGPT itself.
2.2.2. Setting Up Virtual Environments
A virtual environment is a self-contained directory that contains all the necessary dependencies for a project. Using virtual environments is considered good practice because it helps you avoid conflicts between packages across different projects.
1. Creating a virtual environment with venv: run python -m venv venv in your project folder.
2. Activating it: source venv/bin/activate on macOS/Linux, or venv\Scripts\activate on Windows.
3. Installing your packages inside it: pip install openai.
4. Deactivating the Virtual Environment: when you are done working, you can deactivate the virtual environment by simply typing deactivate.
NOTE: This is easy, but if you run into any difficulty, just follow the steps and read them again. As I mentioned, I didn’t have any trouble doing it, but you might face some issues; if so, take help and don’t stop learning.
2.2.3. Basic Authentication
Once your development environment is set up and you have the OpenAI API key, you can begin authenticating and making requests to the OpenAI API.
1. Authentication with the OpenAI API Key: In your Python script, authenticate your requests to the OpenAI API by setting the API key. This can be done by passing the key directly or by using the environment variable you set earlier.
Here’s an example of making an authenticated request to OpenAI:
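A minimal sketch, assuming the pre-1.0 `openai` package (the same style used in the full script at the end of this post).

```python
# Authenticated completion request; the prompt matches the "Paris" example mentioned below.
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Completion.create(
    model="text-davinci-003",
    prompt="What is the capital of France?",
    max_tokens=10,        # keep this small while testing
    temperature=0,
)
print(response.choices[0].text.strip())   # expected: "Paris"
```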
Remember to adjust max_tokens to suit your needs, depending on how many tokens you have available.
2. Test the Connection
Once the code is set up, running this script will send a request to the OpenAI API. If everything is set up correctly, the response will be printed to the console, and you should see the model's answer to the prompt (e.g., "Paris").
Once you have your development environment set up and your API key configured, the next step is making requests to OpenAI’s API. You can do this using Python, either with the requests package (a popular HTTP library) or the official OpenAI Python package. This section will cover both methods, as well as how to handle responses and potential errors.
The requests package is a simple, yet powerful, HTTP library for Python. It's commonly used to make GET and POST requests to external APIs. You can use this to interact with the OpenAI API directly.
3.1.1. Sending GET/POST Requests
To send requests to the OpenAI API using requests, you’ll need to make a POST request. This is because most API endpoints, like the ones for text generation, require data to be sent in the body of the request.
Here’s an example of how to send a POST request with the requests library to OpenAI’s text-generation endpoint, using a GPT-3 model:
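A sketch using the requests library against the completions endpoint mentioned earlier; the prompt is illustrative, and the key is assumed to be in an environment variable.

```python
# POST request to OpenAI's completions endpoint using the requests library.
import os
import requests

headers = {
    "Authorization": f"Bearer {os.getenv('OPENAI_API_KEY')}",
    "Content-Type": "application/json",
}
payload = {
    "model": "text-davinci-003",
    "prompt": "Write a one-line definition of an API.",
    "max_tokens": 40,
    "temperature": 0.5,
}

response = requests.post("https://api.openai.com/v1/completions", headers=headers, json=payload)
print(response.status_code)
```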
Let me explain this in a little more detail: the request needs the endpoint URL, an Authorization header carrying your API key, and a JSON body containing the model name, the prompt, and any generation parameters.
3.1.2. Handling JSON Responses
The OpenAI API responds with data in JSON format. JSON (JavaScript Object Notation) is a lightweight data interchange format that is easy to read and write for both humans and machines.
Once you get the response from the API, it is usually in the form of a JSON object. You can parse and extract information from it using Python’s built-in json() method that comes with the requests package.
Here’s how to handle the JSON response: the requests library exposes the parsed body through the response.json() method, which returns a Python dictionary. A typical completion response contains an id, the object type, a usage section, and a choices array; the generated text lives inside the first element of choices. A minimal sketch follows.
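A minimal sketch of parsing the response; the structure shown in the comment is abridged and only roughly indicative.

```python
# Parse the JSON body and pull out the generated text.
data = response.json()      # parse the JSON body into a Python dict
print(data)                 # inspect the full response

# The typical (abridged) structure looks roughly like:
# {"id": "...", "object": "text_completion",
#  "choices": [{"text": "...", "index": 0, "finish_reason": "stop"}],
#  "usage": {"prompt_tokens": ..., "completion_tokens": ..., "total_tokens": ...}}
generated_text = data["choices"][0]["text"]
print(generated_text.strip())
```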
The OpenAI package is the official library for interacting with the OpenAI API. It's designed to simplify the process of making requests, handling authentication, and working with responses.
3.2.1. Text Completion with GPT-3
Once you’ve installed the OpenAI package, sending a text-generation request with GPT-3 takes two steps.
1. Authentication: set openai.api_key, which we have already done above.
2. Making a Text Completion Request: send a prompt to the API, as in the sketch below.
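A minimal sketch, again assuming the pre-1.0 `openai` package; the prompt is just an example, and the comments explain each part.

```python
# Basic text completion with the official OpenAI package.
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Completion.create(
    model="text-davinci-003",            # GPT-3 completion model
    prompt="Write a one-sentence description of prompt engineering.",
    max_tokens=50,                        # keep this small while testing to limit token usage
    temperature=0.7,                      # moderate creativity
)
print(response.choices[0].text.strip())
```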
3.2.2. Parameters for Text Generation
When interacting with OpenAI's API, we can tweak various parameters to adjust how the model behaves. Some important ones are temperature (randomness/creativity), max_tokens (maximum length of the output), top_p (nucleus sampling), and frequency_penalty (discouraging repetition).
Here’s an example with custom parameters:
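A sketch with illustrative values for each parameter:

```python
# Completion request with custom generation parameters (values are illustrative).
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a short poem about the sea.",
    max_tokens=60,           # upper bound on the length of the generated text
    temperature=0.9,         # higher = more creative / random
    top_p=0.95,              # nucleus sampling: only consider the most probable tokens
    frequency_penalty=0.5,   # discourage repeating the same words
)
print(response.choices[0].text.strip())
```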
3.2.3. Error Handling
When working with APIs, it’s crucial to handle potential errors like network issues, timeouts, and rate limiting. OpenAI’s API provides clear error codes that can help you troubleshoot issues.
Here’s an example of how to handle errors gracefully when using the OpenAI package:
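A sketch of defensive error handling; the exception classes shown are the ones exposed by the pre-1.0 `openai` package, so adjust if you are on a newer version.

```python
# Catch the most common API failures with try/except blocks.
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

try:
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt="Say hello in three languages.",
        max_tokens=30,   # keep this small to avoid burning through your quota
    )
    print(response.choices[0].text.strip())
except openai.error.RateLimitError:
    print("Rate limit or quota exceeded - wait a moment or check your plan.")
except openai.error.AuthenticationError:
    print("Authentication failed - check that OPENAI_API_KEY is set correctly.")
except openai.error.APIConnectionError:
    print("Could not reach the API - check your network connection.")
except openai.error.OpenAIError as e:
    print(f"An unexpected API error occurred: {e}")
```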
NOTE: I must mention that you should adjust max_tokens carefully. If you have purchased credits, each request deducts from them; otherwise, you’ll get an error saying you’ve exceeded your quota or rate limit.
In this example, errors are caught using try-except blocks.
In this section, we'll guide you through the process of building a simple Python script that uses the OpenAI API to summarise news articles. By the end of this tutorial, you'll have a script that can take raw article text and generate concise summaries using GPT-3. Let’s break it down step-by-step.
Before we start writing the script, we need to define the problem and understand how we’ll collect and preprocess the news articles. Summarising a news article involves two main steps: collecting the raw article text and preparing (cleaning) it before sending it to the model.
4.1.1. Input Format: Collecting Raw Text Data
To build a summarisation script, the first thing we need is the text of the article. There are two main ways to collect this text:
You can scrape news articles from the web using web scraping libraries like BeautifulSoup and requests. However, keep in mind that scraping should be done according to the terms of service of the website and with respect for copyright laws.
Alternatively, you can manually store news articles in a text file (e.g., .txt format) and load them into the script.
Here’s an example of how you might load an article from a text file: the load_article() function in the full script at the end of this post does exactly this.
If you wanted to scrape a news article from a website, you'd use something like this:
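A rough sketch using requests and BeautifulSoup (install with pip install beautifulsoup4); the URL is a placeholder, and real sites will need site-specific selectors.

```python
# Fetch a page and pull the paragraph text out of the HTML.
import requests
from bs4 import BeautifulSoup

url = "https://example.com/some-news-article"   # hypothetical article URL
response = requests.get(url, timeout=10)
soup = BeautifulSoup(response.text, "html.parser")

# Grab the text of all paragraph tags and join them into one article string.
paragraphs = [p.get_text() for p in soup.find_all("p")]
article_text = " ".join(paragraphs)
print(article_text[:500])   # preview the first 500 characters
```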
4.1.2. Preprocessing: Cleaning and Preparing the Text
Once you’ve collected the raw text data, it often needs some cleaning before being processed. This is especially true when scraping articles from the web, as they may contain extraneous HTML tags, advertisements, or navigation elements.
A simple way to clean the text is to remove HTML tags and unnecessary whitespace; the clean_text() function in the full script below does this with a regular expression and a string join.
For more advanced cleaning, you can also remove stopwords (common words like "the", "and", etc.), punctuation, or perform stemming/lemmatisation, but for this simple script, we’ll stick with the basics.
Now that we have the article text cleaned and ready, the next step is to send it to OpenAI’s API for summarization. This is where we use the OpenAI package to make a request to the GPT-3 model.
4.2.1. Making the Request
To make a summarisation request, we’ll use the openai.Completion.create() method, passing in the article text along with a prompt like “Summarise this article in 200 words”.
The summarize_article() function in the full script below shows how to structure this API request.
Explanation of parameters: model selects the GPT-3 engine (text-davinci-003 here), prompt combines the instruction with the article text, max_tokens caps the length of the summary, and temperature controls how creative the wording is.
4.2.2. Extracting and Displaying the Summary
After making the request, the API returns a response in JSON format containing the summary. We parse this response and extract the summary text with response.choices[0].text, as the summarize_article() function in the full script does.
This function simply takes the summary and prints it to the console. You can modify this to write the summary to a file or display it in a graphical user interface (GUI) if needed.
Now that we've written the core parts of the script, it’s time to test and refine it. We can run the script with different articles to see how well the summarisation works and consider some enhancements to make it more flexible and user-friendly.
4.3.1. Testing the Script
Let’s put everything together and test the script with different articles. You’ll want to make sure that the article input (whether from a file or the web) is processed correctly and that the summary is a good, concise representation of the original content.
The full script at the end of this post puts everything together so you can test it.
In this test case, we're using a text file (news_article.txt) that contains the article we want to summarise. You can replace the load_article() function with one that fetches articles from the web if preferred.
4.3.2. Enhancements
Once you’ve got the basic functionality working, there are several ways you can enhance the script:
You can add an option to adjust the length of the summary. For example, you could take user input to specify the desired word count, as the main() function in the full script does with input().
Another useful enhancement is to add options for generating summaries in a specific tone, such as formal or casual; the tone parameter of summarize_article() in the full script modifies the prompt to include this.
In this case, you could pass "formal" or "casual" as arguments, and the model will adapt the summary’s style accordingly.
Full Python Script Example
Here is the complete Python script that you can use to summarise news articles:
_________________________________________________________________________________________
import openai
import os
import re

# Set the OpenAI API key (use environment variable for security)
openai.api_key = os.getenv("OPENAI_API_KEY")

def load_article(file_path):
    # Load article from a text file
    with open(file_path, 'r', encoding='utf-8') as file:
        article_text = file.read()
    return article_text

def clean_text(text):
    # Clean the text by removing HTML tags and extra whitespace
    clean = re.sub(r'<.*?>', '', text)  # Remove HTML tags
    clean = ' '.join(clean.split())     # Remove extra whitespace
    return clean

def summarize_article(article_text, word_limit=200, tone="formal"):
    # Send the article to OpenAI and get a summary
    prompt = f"Summarise this article in {word_limit} words and in a {tone} tone:\n\n{article_text}"
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=300,  # Change it according to your choice
        temperature=0.5,
    )
    summary = response.choices[0].text.strip()
    return summary

def print_summary(summary):
    # Print the summary
    print("Summary of the article:")
    print(summary)

def main():
    # Load the article
    article_text = load_article("news_article.txt")  # Replace with the path to your file
    cleaned_text = clean_text(article_text)

    # Ask the user for the summary length
    word_limit = int(input("Enter the number of words for the summary: "))

    # Summarise the article
    summary = summarize_article(cleaned_text, word_limit=word_limit)

    # Display the summary
    print_summary(summary)

if __name__ == "__main__":
    main()
_________________________________________________________________________________________
Now we have a fully functional Python script that can summarise news articles using OpenAI’s GPT-3. The script allows you to load article text, clean it, and then send it to the API for summarisation. You can enhance it further by adding options like adjusting the summary length or changing the tone of the summary, making it more flexible and user-friendly.
In addition, you can explore providers such as HuggingFace, AssemblyAI, DeepSeek, and Mistral AI to broaden your understanding. Companies often use these APIs alongside their own datasets; in such cases, RAG comes into the picture, which we will discuss in Week 5.
SATURDAY: REVISE THE WHOLE THING FOR BETTER UNDERSTANDING OF THE BASICS
SUNDAY: REVISE THE WHOLE THING FOR BETTER UNDERSTANDING OF THE BASICS
Week 3: Master Advanced Prompting Concepts | |
1. Rewriting, Refining Prompts | |
1.1. Prompt Structure | |
1.1.1. Identifying Ambiguity | Detecting vague or unclear prompts and how to make them more specific. |
1.1.2. Precision vs. Recall | Balancing the specificity of the prompt and generalization. |
1.2. Iterative Refining | |
1.2.1. Feedback Loops | How to use the model’s output to refine and improve subsequent prompts. |
1.2.2. Evaluating Responses | How to assess whether a prompt is yielding the desired results and tweak it accordingly. |
1.3. Prompt Engineering for Structured Data | |
1.3.1. Asking for Specific Formats | Example prompts to get results in JSON or CSV format. |
1.3.2. Handling Ambiguous Input | How to design prompts to handle ambiguous or incomplete data. |
2. Prompt Chaining & Context Management | |
2.1. Understanding Prompt Chaining | |
2.1.1. Sequential Prompts | Asking a series of questions that build upon one another (e.g., summarizing, then querying about specific details). |
2.1.2. Example Chains | Creating a chain for a task like summarizing and then performing sentiment analysis. |
2.2. Context Management | |
2.2.1. Memory Management | How to maintain context across multiple prompts (e.g., using previous output as input). |
2.2.2. Handling Long Conversations | How to manage token limits in longer prompts or interactions. |
2.3. Use Cases for Chaining | |
2.3.1. Complex Decision Trees | Chaining prompts for decision-making (e.g., a multi-step reasoning process). |
2.3.2. Role-Playing | Using prompt chaining for interactive applications like customer service bots or therapy bots. |
3. Prompt Templates (Jinja2 or f-strings) | |
3.1. Jinja2 Templates | |
3.1.1. Introduction to Jinja2 | What is Jinja2 and how to use it for templating in Python. |
3.1.2. Building a Template | How to create dynamic prompts using variables with Jinja2. |
3.2. f-string Templates | |
3.2.1. Python f-strings | Learn to use f-strings for cleaner and more readable dynamic prompts. |
3.2.2. Example Use Case | Constructing prompts like "Generate a story about {character_name} who loves {hobby}". |
1.1. Prompt Structure
1.1.1. Identifying Ambiguity
A vague prompt leaves the model guessing about the kind of information you want. For instance, "Tell me about Python" could refer to the programming language or the snake, while "Explain the key features of the Python programming language for beginners" removes that ambiguity. By giving the prompt more context, you guide the model to provide exactly what you want.
1.1.2. Precision vs. Recall
Fact: In natural language processing, the balance between precision and recall is important. Too much precision can make the model "narrow-minded," while too much recall can lead to overly general and irrelevant answers.
Example of Precision:
"List the top five tourist attractions in London."
Example of Recall:
"Tell me about London."
The first example is very precise; it guides the model to give a specific list. The second example is more open-ended, allowing the model to provide a variety of details about London.
1.2. Iterative Refining
Sometimes, the first prompt you write won’t give the perfect result. Instead of rewriting everything, you can refine it based on the model’s output. This is what’s called "iterative refining." The goal is to adjust the prompt to get closer to your ideal response.
1.2.1. Feedback Loops
A feedback loop means using the results from the first response to shape your next prompt. If you don’t get the answer you want, you can clarify or refine your question to guide the model closer to what you're looking for.
For instance, if you ask,
"Give me a recipe for a chocolate cake."
And the response is too basic, you could refine the prompt to:
"Give me a detailed recipe for a moist chocolate cake, including preparation time, ingredients, and step-by-step instructions."
Now the model has a clearer direction, and the output will likely be more useful.
1.2.2. Evaluating Responses
Evaluating responses is about judging whether the output matches your needs. Sometimes, even well-structured prompts will result in responses that don’t quite hit the mark. This is a chance to refine your prompt further.
Fact: Iteratively refining prompts has been shown to substantially improve accuracy and reliability, with the size of the gain depending on the complexity of the task.
1.3. Prompt Engineering for Structured Data
Structured data (like tables, lists, or JSON format) often requires specific formatting. For example, if you’re processing data and need results in a particular format, you should design your prompt accordingly.
1.3.1. Asking for Specific Formats
If you want to get data in a specific format, like JSON or CSV, you can ask the model directly to output that structure.
For example, if you want a list of countries with their capitals in JSON format, you might use a prompt like this:
"List the countries in Europe with their capitals in JSON format."
This way, the model understands it needs to structure the data in JSON.
Here’s an example of what that might look like:
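The exact output varies from run to run, but the result might resemble this truncated, illustrative sample (shown inside a small Python snippet so you can also see how to parse it):
__________________________________________________________________
import json

# Illustrative only: the model's actual output will differ in content and length
sample_output = """
[
  {"country": "France", "capital": "Paris"},
  {"country": "Germany", "capital": "Berlin"},
  {"country": "Spain", "capital": "Madrid"}
]
"""

countries = json.loads(sample_output)  # Parse the model's JSON so your code can use it
print(countries[0]["capital"])         # Paris
__________________________________________________________________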
1.3.2. Handling Ambiguous Input
Sometimes, you might get incomplete or unclear input data. In such cases, designing your prompt to handle this ambiguity becomes important. For example, if you’re working with incomplete data and want the model to infer missing information, your prompt might look like this:
"Fill in the missing data for the following countries: [list of countries] with their capitals, population, and area in square miles."
2. Prompt Chaining & Context Management
When you need the model to perform complex tasks or handle multiple steps, you can use prompt chaining. This involves a series of prompts that build on each other. Context management ensures that the AI remembers what happened earlier in the conversation.
2.1. Understanding Prompt Chaining
2.1.1. Sequential Prompts
Sequential prompts follow a logical progression. Each prompt asks for something slightly more detailed or related to the previous one. It’s a great way to break down a larger task into smaller, manageable steps.
For example:
1. "Summarise this article in three sentences."
2. "From that summary, list the main arguments."
3. "For each argument, suggest one follow-up question."
Each prompt builds on the previous one, creating a chain of responses.
2.1.2. Example Chains
Prompt chaining can also be used to perform tasks that require analysis or processing over time.
For instance:
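Here is a minimal sketch of such a chain, reusing the legacy openai<1.0 Completion API from the Week 2 script; the helper function and the review text are purely illustrative.
__________________________________________________________________
import openai
import os

openai.api_key = os.getenv("OPENAI_API_KEY")

def ask(prompt):
    # Small helper around the legacy Completion endpoint
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=200,
        temperature=0.3,
    )
    return response.choices[0].text.strip()

review = "The hotel room was spotless, but the staff were rude and check-in took an hour."

# Step 1: summarise the review
summary = ask(f"Summarise this review in one sentence:\n\n{review}")

# Step 2: feed the summary into a second prompt for sentiment analysis
sentiment = ask(f"Classify the sentiment of this summary as positive, negative, or mixed:\n\n{summary}")

print(summary)
print(sentiment)
__________________________________________________________________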
2.2. Context Management
Context management is crucial when you have long conversations or multiple prompts. Without context management, the model might forget earlier details, making it harder to stay on track.
2.2.1. Memory Management
Memory management allows the AI to "remember" previous responses. This is vital for multi-turn interactions where the model needs to retain context. For instance, in a chatbot, the model should recall the previous user query and adjust accordingly.
In Python, you can store previous responses and use them in new prompts. Here’s a simple example:
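One way to sketch this, again assuming the legacy openai<1.0 SDK: keep a running history list and prepend it to every new prompt.
__________________________________________________________________
import openai
import os

openai.api_key = os.getenv("OPENAI_API_KEY")
history = []  # Stores previous exchanges so the model keeps context

def chat(user_message):
    # Build the prompt from everything said so far plus the new message
    context = "\n".join(history)
    prompt = f"{context}\nUser: {user_message}\nAssistant:"
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0.5,
    )
    answer = response.choices[0].text.strip()
    # Remember both sides of the exchange for the next turn
    history.append(f"User: {user_message}")
    history.append(f"Assistant: {answer}")
    return answer

print(chat("My name is Priya and I love hiking."))
print(chat("Suggest a weekend trip for me."))  # The model can now use the name and hobby
__________________________________________________________________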
2.2.2. Handling Long Conversations
AI models have token limits, which means they can only process so much text at once. In longer conversations, this could mean the model forgets earlier parts of the chat. One solution is to summarise earlier interactions so the model can still “remember” key details.
2.3. Use Cases for Chaining
Prompt chaining is ideal for situations where you need to break a large task into smaller, manageable chunks or when you want to follow a decision tree.
2.3.1. Complex Decision Trees
Decision trees involve multiple steps, where each step depends on the answer from the previous one. For example, imagine you’re designing an AI to help a user decide what holiday destination to pick:
1. "Do you prefer beaches, cities, or mountains?"
2. "What is your budget per person?"
3. "How many days do you have?"
Based on the answers, the model can help the user make an informed decision.
2.3.2. Role-Playing
Role-playing is when you use a series of prompts to simulate a conversation or interaction. This is useful in customer service, therapy bots, or other interactive scenarios.
For instance, you could create a role-playing scenario where the AI pretends to be a therapist, and you, the user, provide responses based on a given problem.
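As a rough sketch, the chat endpoint of the legacy openai<1.0 SDK makes this easy: the system message defines the role the model should play. The wording below is purely illustrative, and a real well-being bot would need proper safeguards.
__________________________________________________________________
import openai
import os

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        # The system message sets the role the model should play
        {"role": "system", "content": "You are a calm, supportive listening assistant. Ask gentle follow-up questions and never give medical advice."},
        {"role": "user", "content": "I've been feeling overwhelmed at work lately."},
    ],
    temperature=0.7,
)
print(response.choices[0].message["content"])
__________________________________________________________________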
3. Prompt Templates (Jinja2 or f-strings)
Templates are a great way to reuse parts of prompts that change based on certain variables. You can use Jinja2 or f-strings to dynamically generate prompts by inserting values where needed.
3.1. Jinja2 Templates
Jinja2 is a powerful tool in Python, commonly used in web development, that allows you to create templates with placeholders. You can replace the placeholders with real data when you need it.
3.1.1. Introduction to Jinja2
Jinja2 allows you to create templates with dynamic parts. For instance, if you want to create a prompt asking for different characters in a story, you can use Jinja2 to insert the character names automatically.
3.1.2. Building a Template
Example of a simple Jinja2 template:
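A minimal sketch (requires pip install jinja2; the character details are made up for illustration):
__________________________________________________________________
from jinja2 import Template

# The placeholders inside {{ ... }} get filled in at render time
prompt_template = Template(
    "Write a short story about {{ character_name }}, "
    "a {{ profession }} who loves {{ hobby }}."
)

prompt = prompt_template.render(
    character_name="Maya",
    profession="lighthouse keeper",
    hobby="astronomy",
)
print(prompt)
# Write a short story about Maya, a lighthouse keeper who loves astronomy.
__________________________________________________________________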
3.2. f-string Templates
f-strings are a simpler and more modern way to insert variables into strings in Python. They make code easier to read and are more intuitive than older string formatting methods.
3.2.1. Python f-strings
With f-strings, you can directly insert variables into strings. For example:
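A quick illustration, using the example prompt from the outline above:
__________________________________________________________________
character_name = "Maya"
hobby = "astronomy"

# The expressions inside {} are evaluated and inserted into the string
prompt = f"Generate a story about {character_name} who loves {hobby}."
print(prompt)  # Generate a story about Maya who loves astronomy.
__________________________________________________________________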
3.2.2. Example Use Case
You can use f-strings to create different prompts for multiple characters. For instance:
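One way to do it is to loop over a list of characters and build one prompt each (the names and hobbies below are just placeholders):
__________________________________________________________________
characters = [
    {"name": "Maya", "hobby": "astronomy"},
    {"name": "Tom", "hobby": "baking"},
    {"name": "Lena", "hobby": "rock climbing"},
]

# One prompt per character, built from the same f-string template
prompts = [
    f"Generate a story about {c['name']} who loves {c['hobby']}."
    for c in characters
]

for p in prompts:
    print(p)
__________________________________________________________________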
REVISE THE WHOLE THING FOR BETTER UNDERSTANDING OF THE BASICS
Week 4: Learn LangChain to Build Real-World LLM Apps | |
1. LangChain Components: Prompts, Chains, Memory, Agents | |
1.1. LangChain Overview | |
1.1.1. What is LangChain? | Introduction to LangChain and its components for building LLM-powered applications. |
1.2. Components of LangChain | |
1.2.1. Prompts | Building dynamic prompts that can be passed into different chains. |
1.2.2. Chains | Using chains to link together multiple steps (e.g., Q&A, summarization, extraction). |
1.2.3. Memory | Storing state across different interactions and tasks. |
1.2.4. Agents | Creating agents that can take actions based on certain conditions. |
2. Connecting Tools like Google Search, PDF Readers | |
2.1. Integrating Google Search with LangChain | |
2.1.1. Setting Up Search | How to use LangChain to query Google Search and extract relevant information. |
2.2. PDF Integration | |
2.2.1. Using LangChain to Read PDFs | How to extract text from PDFs and pass it into a LangChain pipeline. |
2.2.2. Example Use Case | Build a PDF Q&A system where users can ask questions about the contents of the document. |
3. Build a LangChain Pipeline to Answer Questions from a PDF | |
3.1. Build a Simple Q&A Bot | |
3.1.1. Design the Flow | Define the user input, process steps, and the output. |
3.1.2. Implement Memory | Store context across interactions to ensure consistency in responses. |
3.2. Testing & Optimization | |
3.2.1. Improving Accuracy | Experiment with different models (e.g., GPT-3, GPT-4) and adjust the pipeline. |
3.2.2. Scaling | Explore how to scale the pipeline to handle larger documents or more complex queries. |
Large Language Models (LLMs), like OpenAI’s GPT-4, are incredibly powerful, but building practical applications with them can be tricky. That’s where LangChain comes in. It's a Python-based framework that helps developers build apps using LLMs more effectively.
LangChain offers tools to build chatbots, virtual assistants, Q&A systems, document search tools, and more, all with smarter flows and real memory.
LangChain is modular. It provides four main components that work together to build intelligent, LLM-powered applications.
1.1.1 What is LangChain?
LangChain is like a construction kit for building apps that use LLMs. It helps you manage prompts, chain multiple steps together, keep conversational memory, and connect the model to external tools.
Without LangChain, you'd have to manually connect prompts, API calls, and logic every time. LangChain simplifies and organises this for you.
1.2 Components of LangChain
Let’s understand each one in depth.
1.2.1 Prompts
Prompts are the instructions you give to the language model.
Instead of writing fixed prompts, LangChain allows you to build templates that change based on user input. This makes your app dynamic.
Example:
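A small sketch using the classic LangChain PromptTemplate API (module paths have moved between LangChain versions, so treat the import as approximate):
__________________________________________________________________
from langchain.prompts import PromptTemplate

# A reusable template: the {topic} and {audience} slots are filled per request
template = PromptTemplate(
    input_variables=["topic", "audience"],
    template="Explain {topic} to {audience} in three short paragraphs.",
)

prompt = template.format(topic="prompt chaining", audience="a complete beginner")
print(prompt)
__________________________________________________________________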
1.2.2 Chains
Chains allow you to link multiple steps together into a single flow.
Instead of asking the model just one question, a chain might first summarise a document, then extract the key points, and finally answer a question about them.
Example:
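A sketch of a two-step flow with the classic LLMChain API; it assumes an OPENAI_API_KEY environment variable, and the document text is a placeholder.
__________________________________________________________________
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.3)

# Step 1: summarise the document
summary_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template("Summarise this text in two sentences:\n\n{text}"),
)

# Step 2: extract key points from the summary
points_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template("List the three key points in this summary:\n\n{summary}"),
)

document = "LangChain is a framework for building LLM applications..."  # replace with your own text
summary = summary_chain.run(text=document)
key_points = points_chain.run(summary=summary)
print(key_points)
__________________________________________________________________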
1.2.3 Memory
Memory lets your app remember what the user said before. It’s what makes a chatbot feel human by keeping context.
For example, a support bot can remember the order number a user gave earlier in the conversation, so they don’t have to repeat it.
Use cases include chatbots that recall previous turns, assistants that track user preferences, and multi-step tasks that depend on earlier answers.
1.2.4 Agents
Agents are like AI employees. They can make decisions and use tools based on what the user asks.
They follow a reasoning process:
“I need to answer this question. Should I use the web? Or check the PDF? Let me choose the right tool.”
Agents work well for tasks that require choosing between several tools, multi-step reasoning, or access to live data the model doesn’t already have.
LangChain becomes more powerful when you connect it with external tools like Google Search (via a search API such as SerpAPI) and PDF readers.
2.1.1 Setting Up Search
LangChain can work with search APIs like SerpAPI to pull live data.
Use Case: If someone asks, “Who is the Prime Minister of the UK right now?”, the model’s training data might be out of date. Google Search can fetch a live answer.
Example Code:
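A sketch using the classic LangChain agent API with SerpAPI; it assumes pip install google-search-results plus SERPAPI_API_KEY and OPENAI_API_KEY environment variables, and module paths vary by LangChain version.
__________________________________________________________________
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent, AgentType

llm = OpenAI(temperature=0)

# The "serpapi" tool lets the agent issue live Google searches
tools = load_tools(["serpapi"], llm=llm)

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # Print the agent's reasoning steps
)

print(agent.run("Who is the Prime Minister of the UK right now?"))
__________________________________________________________________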
2.2.1 Using LangChain to Read PDFs
LangChain can extract text from PDF files using libraries like PyMuPDF or pdfplumber.
Example:
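For example, you could pull the raw text out of a PDF with PyMuPDF (pip install pymupdf) before handing it to LangChain; the file name below is a placeholder.
__________________________________________________________________
import fitz  # PyMuPDF

def load_pdf_text(path):
    # Concatenate the text of every page in the PDF
    doc = fitz.open(path)
    text = "\n".join(page.get_text() for page in doc)
    doc.close()
    return text

pdf_text = load_pdf_text("company_policy.pdf")  # replace with your file
print(pdf_text[:500])  # preview the first 500 characters
__________________________________________________________________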
Once you have the text, LangChain can search or summarise it.
2.2.2 Example Use Case – PDF Q&A System
Build a tool where users upload a PDF (like a company policy or user manual) and ask questions such as “How many days of annual leave do I get?”
LangChain will then extract the text, find the passages most relevant to the question, and generate an answer based on them.
Let’s put all the parts together into a real-world pipeline.
3.1.1 Design the Flow
Here’s how the system works: the user uploads a PDF, the text is extracted and split into chunks, the chunks most relevant to each question are retrieved, and the LLM answers using only that retrieved context.
This gives very accurate, document-based answers.
3.1.2 Implement Memory
You can also add memory so the bot remembers the document and previous questions.
Example:
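A sketch using the classic ConversationBufferMemory and ConversationChain (module paths vary by LangChain version):
__________________________________________________________________
from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# The memory object stores every exchange and feeds it back into the next prompt
conversation = ConversationChain(
    llm=OpenAI(temperature=0),
    memory=ConversationBufferMemory(),
)

conversation.predict(input="The document we are discussing is the 2024 leave policy.")
answer = conversation.predict(input="How does it define annual leave?")
print(answer)  # The bot still knows which document is being discussed
__________________________________________________________________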
3.2.1 Improving Accuracy
To improve quality, experiment with different models (e.g., GPT-3.5 vs. GPT-4), adjust chunk sizes, and refine the prompt that wraps the retrieved context.
Search Example with Embeddings:
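A short sketch of what that might look like; Week 5 covers embeddings and FAISS in depth, so treat this as a preview (requires pip install faiss-cpu, and the chunks are made-up policy text):
__________________________________________________________________
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

chunks = [
    "Employees get 25 days of annual leave.",
    "Remote work is allowed two days per week.",
    "The probation period lasts six months.",
]

# Embed the chunks and index them for semantic search
store = FAISS.from_texts(chunks, OpenAIEmbeddings())

# Retrieve the chunk most relevant to the question
results = store.similarity_search("How much holiday do I get?", k=1)
print(results[0].page_content)
__________________________________________________________________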
3.2.2 Scaling
To make the system production-ready, it needs to handle larger documents and more complex queries, for example by indexing documents ahead of time, caching embeddings, and keeping an eye on token limits.
Summary
By the end of this week, you’ll understand how to build dynamic prompts, chain multiple steps together, add memory, connect tools like Google Search and PDF readers, and assemble a working PDF Q&A bot.
LangChain makes it easier to go from LLM experiment to real-world application: fast, efficient, and scalable.
REVISE AND EXERCISE THE WHOLE THING FOR BETTER UNDERSTANDING OF THE BASICS.
EXPERIMENT WITH LLMs, UNDERSTAND HOW IT WORKS WITH YOUR CHANGES.
Week 5: Learn Chunking, Embeddings, Vector Stores | |
1. What is RAG? | |
1.1. Introduction to Retrieval-Augmented Generation (RAG) | |
1.1.1. RAG Concept | How RAG combines document retrieval with text generation for more accurate results. |
1.1.2. RAG Models | Example models like Facebook's RAG and how they work. |
1.2. How RAG Improves Performance | |
1.2.1. Combining Retrieval with Generation | How retrieval-based methods improve the quality of text generation. |
2. Use FAISS to Store and Retrieve Relevant Context | |
2.1. Introduction to FAISS | |
2.1.1. What is FAISS? | A library for efficient similarity search and clustering of vectors. |
2.1.2. Using FAISS for Document Indexing | How to index and search text documents based on semantic similarity. |
2.2. FAISS Integration in LangChain | |
2.2.1. Setting Up FAISS with LangChain | Implement FAISS within LangChain for information retrieval. |
3. Embedding Models (OpenAI, HuggingFace) | |
3.1. Embedding Models Overview | |
3.1.1. What Are Embeddings? | Converting text into fixed-length vectors that capture semantic meaning. |
3.1.2. Pre-trained Embedding Models | OpenAI’s text embeddings, HuggingFace models, and their use cases in document retrieval. |
3.2. Creating and Using Embeddings | |
3.2.1. Generating Embeddings | Using OpenAI’s or HuggingFace’s embedding APIs to create embeddings for text data. |
3.2.2. Using Embeddings for Similarity Search | How to find semantically similar documents using FAISS. |
1.1.1 RAG Concept – What is it?
RAG (Retrieval-Augmented Generation) is a technique that combines document search with language generation. Instead of just using the language model’s pre-trained knowledge, RAG allows the model to search external content (like PDFs or web pages) before generating a response.
Think of it like this:
If GPT doesn’t “know” something, it first retrieves the relevant documents, then generates an answer using that context.
This leads to more accurate, up-to-date answers that are grounded in your own documents, with fewer hallucinations.
1.1.2 RAG Models – Real-World Examples
Key point: RAG isn't a single model. It's a strategy that combines a retriever (embeddings plus a vector store that finds the relevant chunks) with a generator (the LLM that writes the answer). Facebook AI's original RAG model popularised the idea, but the same pattern works with any modern LLM and vector store.
1.2.1 Combining Retrieval with Generation
Let’s break this down step-by-step:
Step-by-Step: How RAG Works
1. The user asks a question.
2. The question is converted into an embedding.
3. The vector store returns the document chunks most similar to the question.
4. Those chunks are added to the prompt as context.
5. The LLM generates an answer grounded in that context.
This means the model no longer relies only on what it memorised during training; it can draw on your own, up-to-date data.
2.1.1 What is FAISS?
FAISS (Facebook AI Similarity Search) is an open-source library designed to store very large collections of vectors and search them quickly for the ones most similar to a query.
It helps you do semantic search: finding meaningfully related information even when the wording is different.
2.1.2 Using FAISS for Document Indexing
Let’s say you split a PDF into chunks. Each chunk is converted into a vector (a list of numbers that represent the meaning of the text). FAISS then stores these vectors and lets you search them by meaning.
Step-by-Step: How FAISS Works
1. Convert each text chunk into an embedding vector.
2. Add the vectors to a FAISS index.
3. Convert the user’s query into a vector in the same way.
4. FAISS finds the stored vectors nearest to the query vector.
5. Return the corresponding chunks as the search result.
2.2.1 Setting Up FAISS with LangChain
You can easily use FAISS inside LangChain with just a few lines of code.
Example Code:
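A sketch of indexing a document and searching it with the classic LangChain API (requires pip install faiss-cpu; the file name and query are placeholders, and module paths vary by version):
__________________________________________________________________
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

# 1. Split a long document into overlapping chunks
long_document_text = open("report.txt", encoding="utf-8").read()  # replace with your own document
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(long_document_text)

# 2. Embed the chunks and store the vectors in a FAISS index
vector_store = FAISS.from_texts(chunks, OpenAIEmbeddings())

# 3. Search the index by meaning, not keywords
matches = vector_store.similarity_search("What does the report say about revenue?", k=3)
for doc in matches:
    print(doc.page_content)
__________________________________________________________________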
3.1.1 What Are Embeddings?
Embeddings are a way to convert text into numbers that represent meaning.
For example, “The cat sat on the mat” and “A kitten rested on the rug” use different words, but their embedding vectors end up close together.
These vectors allow systems to compare meaning, not just keywords.
3.1.2 Pre-trained Embedding Models
There are two major sources of embedding models:
OpenAI: hosted embedding models such as text-embedding-ada-002, accessed through the API.
HuggingFace: open-source models such as the sentence-transformers family (e.g., all-MiniLM-L6-v2) that you can download and run locally.
3.2.1 Generating Embeddings
Step-by-Step: Creating Embeddings
1. Choose an embedding model (OpenAI or HuggingFace).
2. Send each piece of text (or chunk) to the model.
3. Store the returned vectors, typically in a vector database such as FAISS.
OpenAI Example:
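A minimal sketch with the legacy openai<1.0 SDK, using the widely used text-embedding-ada-002 model:
__________________________________________________________________
import openai
import os

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Embedding.create(
    model="text-embedding-ada-002",
    input="The cat sat on the mat.",
)

vector = response["data"][0]["embedding"]  # A list of floats representing the meaning
print(len(vector))  # 1536 dimensions for this model
__________________________________________________________________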
3.2.2 Using Embeddings for Similarity Search
Once you’ve stored your document embeddings in FAISS, you can search them with a user’s question, which is also converted into an embedding.
Step-by-Step: Similarity Search with Embeddings
1. Convert the user’s question into an embedding.
2. Compare it against the stored document embeddings.
3. Retrieve the closest chunks and pass them to the LLM as context.
Code Example:
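A small sketch that retrieves the most relevant chunks with FAISS and then asks the LLM to answer from them; the chunks and question are made up, and module paths vary by LangChain version.
__________________________________________________________________
from langchain.llms import OpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

chunks = [
    "Common side effects include headache and nausea.",
    "Store the medication below 25 degrees Celsius.",
]
store = FAISS.from_texts(chunks, OpenAIEmbeddings())

question = "What side effects should I watch out for?"
# 1. Retrieve the chunks closest in meaning to the question
context = "\n".join(d.page_content for d in store.similarity_search(question, k=2))

# 2. Ask the LLM to answer using only that retrieved context
llm = OpenAI(temperature=0)
answer = llm(f"Answer the question using only this context:\n{context}\n\nQuestion: {question}")
print(answer)
__________________________________________________________________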
By the end of this final week, you’ll be able to chunk large documents, generate embeddings, store and search them with FAISS, and combine retrieval with generation to build a RAG-powered chatbot.
Real-World Example: End-to-End RAG Pipeline
Let’s tie everything together into one final use case.
Build a PDF Question Answering System with RAG
Step-by-Step:
1. Load the PDF and extract its text.
2. Split the text into chunks.
3. Embed each chunk and store the vectors in FAISS.
4. For each question, retrieve the most relevant chunks.
5. Pass the question plus the retrieved chunks to the LLM to generate the answer.
A compact sketch of this pipeline is shown below.
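This sketch uses the classic LangChain RetrievalQA chain; it assumes pymupdf and faiss-cpu are installed, an OPENAI_API_KEY is set, and the PDF name and question are placeholders.
__________________________________________________________________
import fitz  # PyMuPDF
from langchain.llms import OpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA

# 1. Load the PDF and extract its text
doc = fitz.open("user_manual.pdf")  # replace with your PDF
text = "\n".join(page.get_text() for page in doc)

# 2. Split the text into chunks
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_text(text)

# 3. Embed the chunks and store them in FAISS
store = FAISS.from_texts(chunks, OpenAIEmbeddings())

# 4-5. Retrieve relevant chunks and generate the answer in one chain
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),
    retriever=store.as_retriever(search_kwargs={"k": 3}),
)
print(qa.run("How do I reset the device to factory settings?"))
__________________________________________________________________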
EXPERIMENT WITH PROMPTS, UNDERSTANDING HOW THE OUTPUT CHANGES WITH YOUR EDITS.
Happy Learning...!!!
Krishna Sharma.
https://www.linkedin.com/in/krishna-sharma-8176111b3/