Notes on "LangChain for LLM Application Development"

Course Link: LangChain for LLM Application Development - DeepLearning.AI

Previous notes:

Notes on “Building Systems with the ChatGPT API”

Introduction

What is LangChain

Open-source development framework for LLM applications
Provides both Python and JavaScript/TypeScript packages

Modular components that can be combined to build end-to-end applications.

Models, Prompts and Output Parsers

OpenAI API

Example:

import openai

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message["content"]

LangChain

from langchain.chat_models import ChatOpenAI
chat = ChatOpenAI(temperature=0.0)

Prompt template

A prompt template refers to a reproducible way to generate a prompt.

from langchain import PromptTemplate
template = """\
You are a naming consultant for new companies.
What is a good name for a company that makes {product}?
"""
prompt = PromptTemplate.from_template(template)
prompt.format(product="colorful socks")
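To make the idea concrete, here is a minimal stdlib-only sketch of what a template does: it records the placeholder names and fills them in on demand. `SimpleTemplate` is a hypothetical name used for illustration; this is not LangChain's implementation.

```python
from string import Formatter


class SimpleTemplate:
    """Hypothetical stand-in for a prompt template; not LangChain's class."""

    def __init__(self, template: str):
        self.template = template
        # Collect the {placeholder} names that appear in the template.
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, **kwargs: str) -> str:
        # Fail early with a clear message if a placeholder has no value.
        missing = [v for v in self.input_variables if v not in kwargs]
        if missing:
            raise KeyError(f"missing variables: {missing}")
        return self.template.format(**kwargs)


naming = SimpleTemplate(
    "You are a naming consultant for new companies.\n"
    "What is a good name for a company that makes {product}?\n"
)
text = naming.format(product="colorful socks")
```

The same template object can then be reused with different products, which is what makes the prompt reproducible.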

Output parsers

Language models output text. Output parsers allow us to get structured information out of the LLM response.

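from langchain.output_parsers import PydanticOutputParser

# Joke is a pydantic model describing the fields expected in the response
parser = PydanticOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)
# output is the raw text returned by the LLM for the formatted prompt
parser.parse(output)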
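The two halves of an output parser (format instructions that go into the prompt, and a parse step applied to the reply) can be sketched with the standard library alone. The `Joke` fields, `format_instructions` and `parse_joke` below are assumptions made for this illustration, and the "LLM response" is a hand-written string.

```python
import json
from dataclasses import dataclass


@dataclass
class Joke:
    # Hypothetical schema, used only for this illustration.
    setup: str
    punchline: str


def format_instructions() -> str:
    """Text appended to the prompt so the model knows the expected shape."""
    return 'Reply with a JSON object with the keys "setup" and "punchline".'


def parse_joke(output: str) -> Joke:
    """Parse the span from the first '{' to the last '}' as JSON and validate it."""
    start, end = output.find("{"), output.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(output[start : end + 1])
    return Joke(setup=str(data["setup"]), punchline=str(data["punchline"]))


# A hand-written string standing in for a real LLM response.
raw = (
    'Sure! {"setup": "Why did the chicken cross the road?", '
    '"punchline": "To get to the other side."}'
)
joke = parse_joke(raw)
```

After parsing, the fields are available as attributes (`joke.setup`, `joke.punchline`) instead of one free-form string.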

Memory

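LLMs are “stateless”: each call is independent of the previous ones. A chatbot appears to remember only because the conversation so far is sent back to the model as context with every request.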

ConversationBufferMemory
- stores the messages and then extracts them into a variable
ConversationBufferWindowMemory
- keeps a list of the interactions of the conversation over time. It only uses the last K interactions
ConversationTokenBufferMemory
- keeps a buffer of recent interactions in memory, and uses token length rather than number of interactions to determine when to flush interactions
ConversationSummaryMemory
- creates a summary of the conversation over time

Example usage:

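from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)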
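The windowed variant is easy to picture without any library. The sketch below is a hypothetical `WindowMemory` class (not LangChain's) that keeps the last k exchanges and renders them as the prefix that would be sent along with the next prompt.

```python
from collections import deque


class WindowMemory:
    """Keeps only the last k (user, ai) exchanges; not LangChain's class."""

    def __init__(self, k: int):
        # A deque with maxlen silently drops the oldest exchange when full.
        self.turns = deque(maxlen=k)

    def save(self, user: str, ai: str) -> None:
        self.turns.append((user, ai))

    def as_prompt_prefix(self) -> str:
        # The stored history is re-sent as context on every call,
        # which is how a stateless model appears to "remember".
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


memory = WindowMemory(k=2)
memory.save("Hi, my name is Andrew", "Hello Andrew!")
memory.save("What is 1+1?", "2")
memory.save("What is my name?", "I do not know.")
history = memory.as_prompt_prefix()
```

With k=2, the first exchange has already been dropped by the third turn, so the name is no longer in the context. A token-based buffer would follow the same pattern but evict by token count rather than by number of exchanges.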

Additional Memory Types

Vector data memory

store text in a vector database and retrieve the most relevant blocks of text

Entity memories

using an LLM, it remembers details about specific entities

Conversations can also be stored in a conventional database (key-value store or SQL).

Chains

LangChain provides the Chain interface for applications that chain together multiple LLM calls or other components. A Chain is defined very generically as a sequence of calls to components, which can include other chains.

chain = SimpleSequentialChain(chains=[chain_one, chain_two])
chain.run("input")

Sequential

SimpleSequentialChain: The simplest form of sequential chains, where each step has a singular input/output, and the output of one step is the input to the next.
SequentialChain: A more general form of sequential chains, allowing for multiple inputs/outputs.
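Conceptually, a simple sequential chain is function composition: the output string of one step becomes the input string of the next. A minimal sketch, with stub functions standing in for LLM-backed chains (all names here are hypothetical):

```python
from typing import Callable, List

Step = Callable[[str], str]


def simple_sequential(steps: List[Step]) -> Step:
    """Compose single-input/single-output steps left to right."""

    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text

    return run


# Stub steps standing in for LLM-backed chains.
def name_company(product: str) -> str:
    return f"{product.title()} Co."


def describe_company(name: str) -> str:
    return f"{name} is a company worth describing."


pipeline = simple_sequential([name_company, describe_company])
result = pipeline("colorful socks")
```

The more general form would pass a dictionary of named values between steps instead of a single string, so a later step can read any earlier output.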

Router

RouterChain: dynamically selects the next chain to use for a given input.

For example, use MultiPromptChain to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt.

chain = MultiPromptChain(
    router_chain=router_chain,
    destination_chains=destination_chains,
    default_chain=default_chain, verbose=True)

See LangChain Router for a full example.
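The routing pattern itself can be sketched in a few lines. In this illustration a keyword check stands in for the LLM call that a real router would use to pick a destination; `make_router` and `pick_by_keyword` are hypothetical helpers, not LangChain APIs.

```python
from typing import Callable, Dict

Chain = Callable[[str], str]


def make_router(
    pick: Callable[[str], str],
    destinations: Dict[str, Chain],
    default: Chain,
) -> Chain:
    """Send the input to the chain named by `pick`, else to the default."""

    def run(text: str) -> str:
        return destinations.get(pick(text), default)(text)

    return run


# Keyword matching stands in for the LLM-based choice of destination
# (an assumption of this sketch).
def pick_by_keyword(text: str) -> str:
    lowered = text.lower()
    if "integral" in lowered or "sum" in lowered:
        return "math"
    if "velocity" in lowered or "force" in lowered:
        return "physics"
    return "DEFAULT"


router = make_router(
    pick_by_keyword,
    {
        "math": lambda q: f"[math prompt] {q}",
        "physics": lambda q: f"[physics prompt] {q}",
    },
    default=lambda q: f"[general prompt] {q}",
)
```

Any destination name the router does not recognise falls through to the default chain, which mirrors the role of `default_chain` above.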

Question and Answer

Use an LLM to answer questions over documents.

Embeddings:

Embedding vector captures content/meaning
Text with similar content will have similar vectors

Split the documents into small chunks
For each chunk, create an embedding and store it in a vector database
When a query comes in, first create an embedding for that query
Then compare it against all vectors in the vector database, and pick the n most similar
The matching chunks are then passed to the LLM to get the final answer
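The retrieval steps above can be sketched without any external service. The word-count "embedding" below is a toy stand-in for a real embedding model, and a plain list plays the role of the vector database; both are assumptions made to keep the example self-contained.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_n(query: str, index: List[Tuple[str, Dict[str, float]]], n: int) -> List[str]:
    """Rank stored chunks by similarity to the query and keep the best n."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:n]]


chunks = [
    "sun shield shirt with upf 50 sun protection",
    "waterproof hiking boots with ankle support",
    "cotton pajama set for cold nights",
]
index = [(chunk, embed(chunk)) for chunk in chunks]  # the "vector database"
best = top_n("shirt with sun protection", index, n=1)
```

The chunks in `best` are what would be placed into the prompt alongside the question. Real embeddings capture meaning rather than exact word overlap, so they can match a query even when it shares no words with the chunk.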

Use LangChain’s OpenAIEmbeddings to create an embedding for a query:

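from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import DocArrayInMemorySearch

embeddings = OpenAIEmbeddings()
embed = embeddings.embed_query("Hi my name is Harrison")  # a list of floats

# docs: the document chunks loaded and split earlier
db = DocArrayInMemorySearch.from_documents(docs, embeddings)
query = "Please suggest a shirt with sunblocking"
docs = db.similarity_search(query)  # the chunks most similar to the query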

Use RetrievalQA chain:

from IPython.display import display, Markdown
from langchain.chains import RetrievalQA

retriever = db.as_retriever()
llm = ChatOpenAI(temperature=0.0)

qa_stuff = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=True
)

query =  "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."
response = qa_stuff.run(query)
display(Markdown(response))

Stuff method: simply stuff all data into the prompt context to pass to the language model

Pros: it makes a single call to the LLM, which has access to all the data at once.
Cons: LLMs have a limited context length, and the stuffed prompt may exceed it.

Additional methods:

Map reduce: call the LLM on each chunk plus the query, then aggregate the answers and call the LLM again for the final answer.
Refine: process the chunks one at a time, with each call building on the answer from the previous chunk.
Map rerank: have the LLM answer and score each chunk, then select the highest-scoring answer as the final answer.
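The difference between these methods is mostly in how many LLM calls are made and what each call sees. A rough sketch, with a stub `counting_llm` in place of a real model and simplified prompts (not the prompts LangChain uses):

```python
from typing import Callable, List

LLM = Callable[[str], str]


def stuff(llm: LLM, chunks: List[str], query: str) -> str:
    """One call: every chunk goes into a single prompt."""
    return llm("\n".join(chunks) + "\nQuestion: " + query)


def map_reduce(llm: LLM, chunks: List[str], query: str) -> str:
    """One call per chunk, then a final call over the partial answers."""
    partial = [llm(chunk + "\nQuestion: " + query) for chunk in chunks]
    return llm("\n".join(partial) + "\nQuestion: " + query)


def refine(llm: LLM, chunks: List[str], query: str) -> str:
    """Sequential calls: each one sees the previous answer plus the next chunk."""
    answer = ""
    for chunk in chunks:
        answer = llm(f"Existing answer: {answer}\n{chunk}\nQuestion: {query}")
    return answer


calls: List[str] = []


def counting_llm(prompt: str) -> str:
    """Stub model that records each prompt; a real LLM call would go here."""
    calls.append(prompt)
    return f"answer {len(calls)}"


docs = ["chunk A", "chunk B", "chunk C"]
stuff(counting_llm, docs, "q")
calls_after_stuff = len(calls)
map_reduce(counting_llm, docs, "q")
calls_after_map_reduce = len(calls)
refine(counting_llm, docs, "q")
calls_after_refine = len(calls)
```

For n chunks, stuff makes 1 call, map reduce makes n + 1, and refine makes n. The map calls are independent of each other and could run in parallel, while the refine calls cannot, because each depends on the previous answer.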


Evaluation

Turn on debug to view the output of each step.

import langchain
langchain.debug = True

Use QAEvalChain:

from langchain.evaluation.qa import QAEvalChain

llm = ChatOpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(llm)
graded_outputs = eval_chain.evaluate(examples, predictions)

Agents

An agent has access to a suite of tools and decides which ones to use based on the user input. Agents can use multiple tools, and can use the output of one tool as the input to the next. See its documentation for more.

Example:

from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?")
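Under the hood, an agent run is a loop: choose an action, run the tool, feed the observation back, and repeat until there is a final answer. The sketch below shows only that loop. `scripted_policy` is a hard-coded stand-in for the LLM's reasoning, and both tools are stubs with made-up data, so nothing here calls a model or a search API.

```python
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
# A policy returns (tool_name, tool_input) or ("final", answer).
Policy = Callable[[str, List[str]], Tuple[str, str]]


def run_agent(
    policy: Policy, tools: Dict[str, Tool], question: str, max_steps: int = 5
) -> str:
    """Loop: ask the policy for an action, run the tool, feed the observation back."""
    observations: List[str] = []
    for _ in range(max_steps):
        action, arg = policy(question, observations)
        if action == "final":
            return arg
        observations.append(tools[action](arg))
    raise RuntimeError("agent did not finish within max_steps")


def lookup_age(name: str) -> str:
    """Stub search tool that returns a fixed, made-up age."""
    return "25"


def power(expr: str) -> str:
    """Evaluate 'base^exponent' and return the result rounded to 2 places."""
    base, exponent = expr.split("^")
    return str(round(float(base) ** float(exponent), 2))


def scripted_policy(question: str, observations: List[str]) -> Tuple[str, str]:
    """Stands in for the LLM's reasoning; a real agent would prompt a model here."""
    if not observations:
        return ("lookup_age", "some person")
    if len(observations) == 1:
        return ("power", f"{observations[0]}^0.43")
    return ("final", observations[-1])


tools = {"lookup_age": lookup_age, "power": power}
answer = run_agent(scripted_policy, tools, "What is the age raised to the 0.43 power?")
```

The `max_steps` cap matters in practice: a model-driven policy can keep choosing tools indefinitely, so the loop needs a hard stop.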

Create custom tool:

from langchain.agents import tool
from datetime import date

@tool
def time(text: str) -> str:
    """Returns today's date, use this for any \
    questions related to knowing today's date. \
    The input should always be an empty string, \
    and this function will always return today's \
    date - any date mathematics should occur \
    outside this function."""
    return str(date.today())

For more, see Define Custom Tools.

Conclusion

My thoughts:

LangChain is a very powerful and handy tool for developing LLM-based applications
It is still evolving: new functionality keeps being introduced, and APIs may change