Notes on "LangChain for LLM Application Development"

Course Link: LangChain for LLM Application Development - DeepLearning.AI



What is LangChain

  • Open-source development framework for LLM applications
  • Provides both Python and JavaScript/TypeScript packages

It offers modular components that can be combined to build end-to-end applications.

Models, Prompts and Output Parsers



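import openai

# Direct call to the OpenAI API (legacy openai<1.0 SDK interface)
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message["content"]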


# The same kind of model accessed through LangChain's chat model wrapper
from langchain.chat_models import ChatOpenAI
chat = ChatOpenAI(temperature=0.0)

Prompt template

A prompt template refers to a reproducible way to generate a prompt.

from langchain import PromptTemplate

template = """\
You are a naming consultant for new companies.
What is a good name for a company that makes {product}?
"""
prompt = PromptTemplate.from_template(template)
prompt.format(product="colorful socks")
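
Conceptually, a prompt template is a string with named placeholders that are filled in at call time. The sketch below mimics that behavior with plain Python string formatting. It does not use LangChain, and `NAMING_TEMPLATE` and `format_prompt` are hypothetical names chosen for illustration.

```python
# Plain-Python illustration of what a prompt template does.
# NAMING_TEMPLATE and format_prompt are hypothetical names, not LangChain APIs.
NAMING_TEMPLATE = (
    "You are a naming consultant for new companies.\n"
    "What is a good name for a company that makes {product}?"
)


def format_prompt(template: str, **variables: str) -> str:
    """Fill the named placeholders in `template` with the given values."""
    return template.format(**variables)


prompt_text = format_prompt(NAMING_TEMPLATE, product="colorful socks")
print(prompt_text)
```

The same template can be reused with any product, which is what makes the prompt reproducible.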

Output parsers

Language models output text. Output parsers allow us to get structured information out of the LLM response.

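from langchain.output_parsers import PydanticOutputParser

# Joke is a pydantic model describing the fields to extract (its definition is not shown in these notes)
parser = PydanticOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)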
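
To make the parsing step concrete, here is a minimal sketch of turning model text into a typed object using only the standard library. It is not how LangChain's parser is implemented. The `Joke` fields (`setup`, `punchline`), the `parse_joke` helper, and the sample output string are assumptions made for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class Joke:
    # Assumed fields; the notes do not show the real Joke definition.
    setup: str
    punchline: str


def parse_joke(llm_output: str) -> Joke:
    """Parse a JSON string into a Joke, raising ValueError on malformed output."""
    try:
        data = json.loads(llm_output)
        return Joke(setup=data["setup"], punchline=data["punchline"])
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        raise ValueError(f"Could not parse model output: {llm_output!r}") from exc


# Hand-written stand-in for a model response that followed the format instructions.
raw = '{"setup": "Why did the chicken cross the road?", "punchline": "To get to the other side."}'
joke = parse_joke(raw)
print(joke.punchline)
```

The format instructions in the prompt exist so that the model's text is likely to survive this kind of parsing step.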


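Memory

LLMs are “stateless”: each call is processed independently. A chatbot appears to remember earlier turns only because the conversation so far is passed back to the model as context, and LangChain's memory components store and supply that history.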

Example usage:

llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
)
conversation.predict(input="Hi, my name is Andrew")

Additional Memory Types

Vector data memory

  • Stores text in a vector database and retrieves the most relevant blocks of text

Entity memories

  • Uses an LLM to remember details about specific entities

Conversations can also be stored in a conventional database (key-value store or SQL).
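
The idea behind a conversation buffer can be sketched without LangChain: keep every turn, and prepend the whole history to each new prompt. `BufferMemory`, `chat`, and `fake_llm` below are hypothetical names, and `fake_llm` is a hard-coded stand-in for a real model.

```python
from typing import Callable, List, Tuple


class BufferMemory:
    """Keeps every (speaker, text) turn and renders them as prompt context."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def add(self, speaker: str, text: str) -> None:
        self.turns.append((speaker, text))

    def as_context(self) -> str:
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def chat(memory: BufferMemory, llm: Callable[[str], str], user_input: str) -> str:
    """Send the full history plus the new input to a stateless model."""
    memory.add("Human", user_input)
    prompt = memory.as_context() + "\nAI:"
    reply = llm(prompt)
    memory.add("AI", reply)
    return reply


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model: it can only "remember" what is in the prompt.
    return "Your name is Andrew." if "my name is Andrew" in prompt else "I don't know."


memory = BufferMemory()
chat(memory, fake_llm, "Hi, my name is Andrew")
answer = chat(memory, fake_llm, "What is my name?")
print(answer)
```

Because the buffer grows with every turn, long conversations eventually become expensive or exceed the context length, which is the motivation for the other memory types.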


Chains

LangChain provides the Chain interface for “chained” applications that combine an LLM with other components. A chain is defined very generically as a sequence of calls to components, which can include other chains.

chain = SimpleSequentialChain(chains=[chain_one, chain_two])
chain.run("input")


  • SimpleSequentialChain: The simplest form of sequential chain, where each step has a single input and output, and the output of one step is the input to the next.
  • SequentialChain: A more general form of sequential chains, allowing for multiple inputs/outputs.
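
The single-input, single-output case amounts to function composition. The sketch below shows that shape in plain Python. `run_sequential`, `name_company`, and `describe_company` are hypothetical stand-ins for LLM-backed chains, not LangChain APIs.

```python
from typing import Callable, Sequence

Step = Callable[[str], str]


def run_sequential(steps: Sequence[Step], text: str) -> str:
    """Feed `text` through each step; each output becomes the next input."""
    for step in steps:
        text = step(text)
    return text


# Hypothetical stand-ins for two LLM-backed chains.
def name_company(product: str) -> str:
    return f"{product.title()} Co."


def describe_company(name: str) -> str:
    return f"{name} makes products you will love."


result = run_sequential([name_company, describe_company], "colorful socks")
print(result)
```

A general sequential chain differs in that steps read and write named values in a shared dictionary rather than passing a single string along.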


RouterChain: dynamically selects the next chain to use for a given input.

For example, use MultiPromptChain to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt.

chain = MultiPromptChain(
    router_chain=router_chain,
    destination_chains=destination_chains,
    default_chain=default_chain, verbose=True)

See LangChain Router for full example.
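
The routing pattern itself can be illustrated with a toy dispatcher. In LangChain the routing decision is made by an LLM; the sketch below swaps in simple keyword overlap so it runs offline. All names here (`DESTINATIONS`, `route`, the handler functions, and the keyword sets) are hypothetical.

```python
from typing import Callable, Dict, Set, Tuple

Handler = Callable[[str], str]


def physics_chain(question: str) -> str:
    return f"[physics prompt] {question}"


def math_chain(question: str) -> str:
    return f"[math prompt] {question}"


def default_chain(question: str) -> str:
    return f"[default prompt] {question}"


# Each destination pairs a set of trigger keywords with the chain to run.
DESTINATIONS: Dict[str, Tuple[Set[str], Handler]] = {
    "physics": ({"force", "energy", "gravity"}, physics_chain),
    "math": ({"sum", "prime", "integral"}, math_chain),
}


def route(question: str) -> str:
    """Run the destination with the most keyword hits, else the default chain."""
    words = set(question.lower().replace("?", "").split())
    best_handler, best_score = default_chain, 0
    for keywords, handler in DESTINATIONS.values():
        score = len(words & keywords)
        if score > best_score:
            best_handler, best_score = handler, score
    return best_handler(question)


print(route("What is the sum of the first ten prime numbers?"))
print(route("Who wrote Hamlet?"))
```

The default chain plays the same role as `default_chain` in MultiPromptChain: it handles inputs that match no destination.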

Question and Answer

Use an LLM to answer questions over documents.


  • An embedding vector captures the content/meaning of a piece of text
  • Text with similar content will have similar vectors

  1. Split the document into small chunks
  2. For each chunk, create an embedding and store it in a vector database
  3. When a query comes in, first create an embedding for that query
  4. Then compare it with all vectors in the vector database, and pick the n most similar
  5. The matching chunks are then passed to the LLM to get back the final answer
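
The steps above can be sketched end to end with a toy embedding. Word counts stand in for a learned embedding model, and a Python list stands in for the vector database, so this only illustrates the mechanics. `embed`, `cosine`, `top_n`, and the sample chunks are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_n(query: str, index: List[Tuple[str, Counter]], n: int = 2) -> List[str]:
    """Return the n chunks whose vectors are most similar to the query's."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:n]]


chunks = [
    "sun protection shirt with upf 50 fabric",
    "waterproof hiking boots with leather upper",
    "lightweight shirt blocks sun on hot days",
]
index = [(chunk, embed(chunk)) for chunk in chunks]   # steps 1-2
relevant = top_n("shirt with sun protection", index)  # steps 3-4
context = "\n".join(relevant)                         # step 5 would send this to the LLM
print(context)
```

A word-count embedding only matches exact words; learned embeddings are used in practice because they also place paraphrases close together.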

Use LangChain’s OpenAIEmbeddings to create an embedding for a query:

from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
embed = embeddings.embed_query("Hi my name is Harrison")

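from langchain.vectorstores import DocArrayInMemorySearch

# docs: a list of documents loaded beforehand (the loading step is not shown in these notes)
db = DocArrayInMemorySearch.from_documents(docs, embeddings)
query = "Please suggest a shirt with sunblocking"
docs = db.similarity_search(query)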

Use RetrievalQA chain:

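from langchain.chains import RetrievalQA

retriever = db.as_retriever()
llm = ChatOpenAI(temperature = 0.0)

qa_stuff = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=True
)

query =  "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."
response = qa_stuff.run(query)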

Stuff method: simply stuff all data into the prompt context to pass to the language model

  • Pros: it makes a single call to the LLM, which has access to all the data at once.
  • Cons: LLMs have a limited context length, so the prompt may exceed it when there are many or large documents.

Additional methods:

  1. Map reduce: call the LLM on each chunk together with the query, then call the LLM again to combine the per-chunk answers into a final answer.
  2. Refine: process the chunks in sequence, with each call building on the answer from the previous chunk.
  3. Map rerank: call the LLM on each chunk and have it score its answer, then select the highest-scoring answer as the final answer.
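
One practical difference between these methods is how many LLM calls they make. The sketch below implements stuff, map reduce, and refine against a counting stub instead of a real model (map rerank is omitted for brevity). The prompt wording and all function names are assumptions for illustration, not LangChain's actual prompts.

```python
from typing import Callable, Dict, List

LLM = Callable[[str], str]


def stuff(llm: LLM, chunks: List[str], query: str) -> str:
    """One call: every chunk goes into a single prompt."""
    return llm("\n".join(chunks) + f"\nQuestion: {query}")


def map_reduce(llm: LLM, chunks: List[str], query: str) -> str:
    """One call per chunk, then a final call over the partial answers."""
    partials = [llm(f"{chunk}\nQuestion: {query}") for chunk in chunks]
    return llm("\n".join(partials) + f"\nQuestion: {query}")


def refine(llm: LLM, chunks: List[str], query: str) -> str:
    """Sequential calls: each one sees the previous answer plus a new chunk."""
    answer = ""
    for chunk in chunks:
        answer = llm(f"Previous answer: {answer}\n{chunk}\nQuestion: {query}")
    return answer


calls: List[str] = []


def counting_llm(prompt: str) -> str:
    # Stub model that records every prompt it receives.
    calls.append(prompt)
    return f"answer {len(calls)}"


chunks = ["chunk A", "chunk B", "chunk C"]


def count_calls(method: Callable[[LLM, List[str], str], str]) -> int:
    calls.clear()
    method(counting_llm, chunks, "Which shirts block the sun?")
    return len(calls)


call_counts: Dict[str, int] = {m.__name__: count_calls(m) for m in (stuff, map_reduce, refine)}
print(call_counts)
```

With three chunks, stuff makes one call, refine makes three dependent calls, and map reduce makes four, of which the first three are independent and could run in parallel.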



Evaluation

Turn on debug mode to view the inputs and outputs of each step in a chain.

import langchain
langchain.debug = True

Use QAEvalChain to have an LLM grade the predictions against the reference answers:

from langchain.evaluation.qa import QAEvalChain

llm = ChatOpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(llm)
graded_outputs = eval_chain.evaluate(examples, predictions)
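
For contrast, here is a much cruder grader based on normalized string matching rather than an LLM. It is a swapped-in technique, not what QAEvalChain does, and it shows why an LLM grader is attractive: string matching fails on paraphrases. The dictionary keys (`query`, `answer`, `result`) and the sample data are assumptions for illustration.

```python
import string
from typing import Dict, List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    stripped = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(stripped.split())


def grade(examples: List[Dict[str, str]], predictions: List[Dict[str, str]]) -> List[str]:
    """Mark a prediction CORRECT if it contains the normalized reference answer."""
    grades = []
    for example, prediction in zip(examples, predictions):
        correct = normalize(example["answer"]) in normalize(prediction["result"])
        grades.append("CORRECT" if correct else "INCORRECT")
    return grades


examples = [
    {"query": "Does the shirt have pockets?", "answer": "Yes"},
    {"query": "What is the UPF rating?", "answer": "UPF 50+"},
]
predictions = [
    {"result": "Yes, it has two side pockets."},
    {"result": "The rating is not stated."},
]
grades = grade(examples, predictions)
print(grades)
```

A prediction such as "It blocks 98% of UV rays" would be graded INCORRECT here even if it were an acceptable answer, which is the gap an LLM-based grader is meant to close.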


Agents

An agent has access to a suite of tools and determines which ones to use depending on the user input. Agents can use multiple tools, and use the output of one tool as the input to the next. See the LangChain agent documentation for more.
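
The control flow of an agent can be sketched as a loop: ask a planner which tool to call, run it, record the observation, and repeat until the planner decides to finish. In LangChain the planner is an LLM; below it is a hard-coded script so the example runs offline. The tools, the lookup data, and every name here are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
Planner = Callable[[str, List[str]], Tuple[str, str]]


def search(query: str) -> str:
    # Hypothetical lookup table standing in for a web search tool.
    return {"age of example person": "25"}.get(query, "unknown")


def calculator(expression: str) -> str:
    # Only supports "base^exponent", which is all this sketch needs.
    base, exponent = expression.split("^")
    return str(round(float(base) ** float(exponent), 2))


TOOLS: Dict[str, Tool] = {"search": search, "calculator": calculator}


def scripted_planner(question: str, observations: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: returns (tool, tool_input) or ("finish", answer)."""
    if not observations:
        return "search", "age of example person"
    if len(observations) == 1:
        # The output of the first tool becomes the input to the second.
        return "calculator", f"{observations[0]}^0.5"
    return "finish", observations[-1]


def run_agent(question: str, planner: Planner, tools: Dict[str, Tool], max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        action, action_input = planner(question, observations)
        if action == "finish":
            return action_input
        observations.append(tools[action](action_input))
    raise RuntimeError("agent did not finish within max_steps")


final_answer = run_agent("What is the person's age raised to the 0.5 power?", scripted_planner, TOOLS)
print(final_answer)
```

The `max_steps` cap is a design choice worth keeping in any real agent loop, since an LLM planner is not guaranteed to decide to finish.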


from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?")

Create custom tool:

from langchain.agents import tool
from datetime import date

# The docstring doubles as the tool description the agent uses to decide when to call it
@tool
def time(text: str) -> str:
    """Returns today's date, use this for any \
    questions related to knowing today's date. \
    The input should always be an empty string, \
    and this function will always return today's \
    date - any date mathematics should occur \
    outside this function."""
    return str(date.today())

For more, see Define Custom Tools.


My thoughts:

  • LangChain is a very powerful and handy tool for developing LLM-based applications
  • It is still evolving: new functionality keeps being introduced, and APIs may change