
Unleashing AI's Potential: Exploring LangChain and Its Transformative Concepts!

1- What is LangChain?


LangChain is a framework for developing applications powered by large language models (LLMs).

LangChain simplifies every stage of the LLM application lifecycle:

 

Development: Build your applications using LangChain's open-source components and third-party integrations. Use LangGraph to build stateful agents with first-class streaming and human-in-the-loop support.


Productionization: Use LangSmith to inspect, monitor and evaluate your applications, so that you can continuously optimize and deploy with confidence.


Deployment: Turn your LangGraph applications into production-ready APIs and Assistants with LangGraph Platform.



Large Language Models (LLMs) such as those from OpenAI, AI21 Labs, and Meta (LLaMA) have limitations. Their knowledge is often outdated, confined to the data used during their training. This can lead to inaccurate or incomplete responses, especially on recent events or rapidly evolving topics. Furthermore, LLMs struggle with domain-specific knowledge and proprietary data: handling specialized or confidential information poses significant challenges due to privacy concerns and the models' inability to access or process such information effectively. Lastly, working with multiple LLMs can be cumbersome due to their varying APIs and interaction methods, hindering seamless integration and workflow management.


LangChain offers a modular toolkit for building applications powered by large language models (LLMs). Its core components include Models for interacting with LLMs, Prompts for guiding their responses, Indexes for connecting LLMs to external data, Memory for maintaining context, Chains for orchestrating complex workflows, and Agents for enabling autonomous interaction with the environment. This framework empowers developers to create a wide range of applications, from simple chatbots to sophisticated AI-powered systems, by combining these components in a flexible and structured manner.


 

2- Text/Word Embeddings


Word Embedding is a technique in natural language processing (NLP) and machine learning that represents words or phrases in a continuous vector space where semantically similar words are mapped to nearby points. In simpler terms, it’s a way of translating words into numerical representations that capture their meaning and relationships with other words.


Vector databases achieve fast and efficient searches by employing a combination of advanced techniques. Unlike traditional databases that use B-trees or hash-based indexes, they rely on specialized structures like HNSW (Hierarchical Navigable Small World), a graph-based index for approximate nearest neighbor (ANN) search, and IVF (Inverted File Index), which clusters vectors to reduce the search space. ANN algorithms prioritize speed over perfect accuracy, returning results within milliseconds by finding close matches with minimal computation. Additionally, dimensionality reduction techniques help decrease memory usage and query time by preserving similarity relationships with fewer data points. Efficient memory management stores vectors in compact formats, optimizing storage and retrieval. Parallel and distributed processing further enhances performance by running queries simultaneously across multiple nodes, while cache optimization minimizes redundant computations, ensuring faster access to frequently requested data. These methods collectively make vector databases highly scalable and ideal for real-time applications.
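To make this concrete, here is a minimal sketch using the sentence-transformers library and the all-MiniLM-L6-v2 model mentioned later in this post (the sentences are made up for illustration):

from sentence_transformers import SentenceTransformer, util

# all-MiniLM-L6-v2 maps text to a 384-dimensional dense vector
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = ["The cat sat on the mat.",
             "A feline rested on the rug.",
             "Stock markets fell sharply today."]
embeddings = model.encode(sentences)

# Semantically similar sentences land close together in vector space
print(util.cos_sim(embeddings[0], embeddings[1]))  # high similarity
print(util.cos_sim(embeddings[0], embeddings[2]))  # low similarity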


 

3- Prompt Module


a) A prompt template is a structured format used to create prompts for language models, allowing for dynamic input values. It serves as a blueprint that can incorporate various variables, making it easier to generate consistent and desired outputs without hardcoding specific values.

Key Features:

  • Dynamic Inputs: Allows for changing values within the same structure.

  • Reusability: The same template can be used for different inputs, enhancing efficiency.

  • Clarity: Improves code readability and organization, especially in complex projects.

    Example-


Passing a prompt to the LLM without a template:

In the above example we can't pass the input values dynamically. This can be achieved with an f-string (a Python feature).
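As the original screenshot is not reproduced here, a minimal sketch of the f-string approach (the sample text and word count are made up):

# Plain Python f-string: placeholders are filled at string-creation time
text = "LangChain simplifies building LLM applications."
word_count = 20
prompt = f"Can you create a post for tweet in {word_count} words for the above text: {text}?"
# prompt can now be sent to the LLM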



Now, let's do the same with LangChain's prompt template. (It offers features beyond dynamic string substitution, though it is built on f-strings; let's see.)


PromptTemplate is a class in langchain_core.

The key components of a prompt template include:

  • Placeholders: These are variables within the template that will be replaced with actual values when the prompt is executed. For example, {text} and {word_count} can be placeholders.

  • Template Structure: This is the overall format of the prompt, which includes static text and the placeholders. For instance, "Can you create a post for tweet in {word_count} words for the above text: {text}?"

  • Input Variables: These define the specific variables that will be used in the template. They are specified when creating the prompt template, indicating which placeholders will be filled with values.

  • Formatting Function: This is the method or function that takes the template and the input variables to generate the final prompt. It replaces the placeholders with the actual values. Example:
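A minimal sketch putting these components together (import path per recent langchain_core versions; older releases expose PromptTemplate from langchain.prompts):

from langchain_core.prompts import PromptTemplate

template = PromptTemplate(
    input_variables=["text", "word_count"],   # input variables
    template="Can you create a post for tweet in {word_count} "
             "words for the above text: {text}?",  # structure with placeholders
)

# The formatting step replaces the placeholders with actual values
prompt = template.format(text="LangChain simplifies LLM apps.", word_count=20)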


b)

The Example Selector is a component that helps in crafting effective prompts for large language models (LLMs). Here’s a brief summary:

  • Purpose: The Example Selector is designed to enhance the quality of responses generated by LLMs by providing them with relevant examples that guide their output.

  • Functionality:

    • It allows users to specify examples that demonstrate the desired behavior or response style of the model.

    • By selecting appropriate examples, users can influence the model to produce more accurate and contextually relevant answers.

  • Implementation:

    • The process involves creating a structured prompt that includes instructions, examples, and expected responses.

    • This structured approach helps in breaking down complex prompts into manageable parts, making it easier for the model to understand and respond appropriately.

Using the Example Selector effectively can significantly improve the interaction with LLMs, leading to better outcomes in applications like summarization, question answering, and more.

Prompt with few-shot examples in LangChain:

FewShotPromptTemplate is a class in LangChain.


Create an example prompt with PromptTemplate, then create a FewShotPromptTemplate and call the LLM.
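A minimal sketch of a few-shot prompt (the antonym examples are made up for illustration; llm stands for any model you have configured):

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]

# Template used to render each individual example
example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template="Word: {word}\nAntonym: {antonym}",
)

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input.",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
)

# llm.invoke(few_shot_prompt.format(input="big"))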

There is a limit on the number of tokens a model can take, and more words in a prompt may also cost more. LangChain provides a length-based example selector, a tool used to manage the number of examples sent to a language model. Here's a brief summary:

  • Purpose: It controls the maximum length of examples included in the input prompt.

  • Benefits:

    • Prevents Token Overflow: Avoids exceeding token limits, which can cause errors.

    • Cost Efficiency: Reduces costs by limiting the number of tokens processed.

    • Enhances Context: Improves the relevance of responses by including only pertinent examples.

    • Customizable: Allows users to set specific maximum lengths for examples based on their needs.

This selector is essential for optimizing interactions with language models while managing resources effectively.
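A minimal sketch, reusing the examples and example_prompt from the few-shot sketch above (in recent versions the selector lives in langchain_core; older releases expose it via langchain.prompts.example_selector):

from langchain_core.example_selectors import LengthBasedExampleSelector
from langchain_core.prompts import FewShotPromptTemplate

example_selector = LengthBasedExampleSelector(
    examples=examples,              # from the previous sketch
    example_prompt=example_prompt,  # from the previous sketch
    max_length=25,                  # cap on the combined length of examples
)

# The selector, not a fixed list, now decides which examples fit
dynamic_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input.",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
)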


LangChain also provides output parsers, which allow users to format the output from a language model (LLM) into specific structures like CSV or JSON.

The process involves creating an output parser object, which can generate format instructions that guide the LLM on how to structure its response.
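A minimal sketch with the built-in comma-separated-list parser (llm is assumed to be configured; the subject value is illustrative):

from langchain_core.output_parsers import CommaSeparatedListOutputParser
from langchain_core.prompts import PromptTemplate

parser = CommaSeparatedListOutputParser()

prompt = PromptTemplate(
    template="List five {subject}.\n{format_instructions}",
    input_variables=["subject"],
    # Format instructions tell the LLM how to structure its answer
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

# response = llm.invoke(prompt.format(subject="Indian cities"))
# parser.parse(response)  -> ["Mumbai", "Delhi", ...]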


 

4- Memory


As a recap, LangChain provides the following modules: Models, Prompts, Indexes, Memory, Chains, and Agents.



Memory, the ability to retain information for future use, is essential for creating interactive and context-aware chatbots. Yet Large Language Models (LLMs) are inherently stateless, treating each query as a standalone request without recalling past interactions. LangChain addresses this limitation by offering diverse memory types, such as conversation buffer memory, conversation buffer window memory, and entity memory, which cater to varying project needs and are easily implemented in Python. The memory module empowers chatbots to remember previous interactions, enhancing their capacity to deliver relevant, contextually appropriate responses. Conversation buffer memory, for example, maintains a history of exchanges to improve conversational coherence.



Types of memory (these make the LLM more useful for sequential tasks):

ConversationBufferMemory is a type of memory in LangChain that allows chatbots to retain the entire history of interactions within a conversation.


Pros and Cons of ConversationBufferMemory


Wrapping the LLM in a ConversationChain with ConversationBufferMemory:

There is a {history} tag in the prompt template to hold the previous conversation.

All conversation between the bot and the user is stored under history and passed as part of the prompt in subsequent LLM calls.
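A minimal sketch (llm stands for any configured LangChain model; ConversationChain and the memory classes live in the classic langchain package):

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

conversation = ConversationChain(
    llm=llm,                            # any configured model
    memory=ConversationBufferMemory(),  # keeps the full history
    verbose=True,                       # prints the prompt with {history} filled in
)

conversation.predict(input="Hi, my name is Arpit.")
conversation.predict(input="What is my name?")  # answered from history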

Conversation buffer window memory is a type of memory management used in AI chatbots to retain a limited number of recent interactions. It allows the system to keep track of the most relevant conversations while intentionally dropping older ones. This approach helps in efficient memory utilization and reduces the token count, which is important for performance. However, it may lead to loss of context from earlier interactions, potentially affecting the depth of understanding in ongoing conversations.


k defines how many previous conversations to keep.
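A minimal sketch with k=2 (llm as before):

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory

# k=2: only the two most recent exchanges are kept in the history
conversation = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=2),
)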

Conversation summary memory condenses past interactions into a concise summary, enabling the AI to retain essential context while minimizing data processing. This approach offers advantages such as efficient token usage, reducing the number of tokens sent to the model and preventing token limit issues, and context retention, capturing key points to maintain relevance in ongoing interactions. However, it has drawbacks, including loss of detail, where nuances and raw data may be omitted during summarization, and potential context gaps, where essential context might be excluded, potentially causing misunderstandings in future conversations.


An LLM is used to produce the summary, which is passed as part of the prompt.
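A minimal sketch (the same llm is passed to the memory so it can write the summaries):

from langchain.chains import ConversationChain
from langchain.memory import ConversationSummaryMemory

# The memory calls the LLM to condense past turns into a running summary
conversation = ConversationChain(
    llm=llm,
    memory=ConversationSummaryMemory(llm=llm),
)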

Conversation token buffer memory is a memory management technique that specifies a maximum token limit for the interactions retained in a conversation. This helps control the amount of data processed by the LLM.


The most recent 30 tokens of history are included in the prompt when calling the LLM.
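A minimal sketch matching the 30-token example above (the memory needs the llm to count tokens):

from langchain.chains import ConversationChain
from langchain.memory import ConversationTokenBufferMemory

# Only the most recent ~30 tokens of history are kept in the prompt
conversation = ConversationChain(
    llm=llm,
    memory=ConversationTokenBufferMemory(llm=llm, max_token_limit=30),
)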

 

5- Data Connections


Additional data from various sources has to be passed to the LLM for it to work effectively on domain-specific tasks.


Transform may involve tasks like chunking; retrievers use the indexing and querying capabilities of the vector store for fast and efficient search.

Let's look at the python code snippets to understand better.


TextLoader is used to load text files.

When working with large language models, token limits often prevent sending extensive text in a single request. To address this, the data must be divided into smaller, manageable chunks, facilitating easier processing by the model. This process of segmentation and transformation is where LangChain comes into play, offering a structured approach to breaking down and managing large datasets for seamless interaction with the model.


Transform: chunking of the documents (the additional data) passed to the LLM.

Get embeddings of the chunked texts:



Using Hugging Face's sentence embeddings.
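A minimal sketch of the load, transform, and embed steps (sample.txt and the chunk sizes are made up; import paths vary across LangChain versions):

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings

# Load: read the raw file into Document objects
docs = TextLoader("sample.txt").load()

# Transform: split into overlapping chunks that fit within token limits
splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# Embed: encode each chunk with a sentence-transformers model
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")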

Load the embeddings into Chroma DB:


Chunks and their corresponding embeddings.

Performing a similarity search (finding chunks similar to the input) on the chunked data:
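Continuing the sketch above, storing the chunks in Chroma and running a similarity search (the query string is illustrative):

from langchain_community.vectorstores import Chroma

# Store: embed each chunk and index it in the vector DB
db = Chroma.from_documents(chunks, embeddings)

# Query: retrieve the k chunks closest to the question's embedding
results = db.similarity_search("What is the refund policy?", k=3)
for doc in results:
    print(doc.page_content)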

Use of a retriever in question answering, memory-less chat, and variations of Q&A like MCQ generation:



General flow of a question-answering application: documents are chunked and stored in a vector DB; when the user asks a question, the question along with the relevant documents (retrieved by the query) is passed to the LLM.
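A minimal sketch of this flow with the classic RetrievalQA chain (llm and db as in the sketches above):

from langchain.chains import RetrievalQA

# "stuff" simply stuffs the retrieved chunks into the prompt
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(),
)
qa.run("What is the refund policy?")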

If the data is in table form (the text-to-SQL use case), the vector store may hold information about the tables and columns relevant to the input query. This information is then passed to the LLM to generate the SQL query. The end-to-end architecture may look like this:



Other points to note:

a) Every page is considered a separate document. If a PDF has 2 pages, 2 docs will be created.

In the example below there are 2 PDFs, with 1 and 2 pages, so the length of docs is 3.


Every page is a separate document.
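A minimal sketch (report.pdf and summary.pdf are hypothetical 2-page and 1-page files; PyPDFLoader requires the pypdf package):

from langchain_community.document_loaders import PyPDFLoader

docs = []
for path in ["report.pdf", "summary.pdf"]:  # 2-page + 1-page PDFs
    docs.extend(PyPDFLoader(path).load())

print(len(docs))  # 3 -- one Document per page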

b) Chunking further divides these documents into chunks. These chunks are then converted into embeddings using a sentence encoder like all-MiniLM-L6-v2

(a sentence-transformers model: it maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search).


c) Store the embeddings in a vector DB (Chroma, Pinecone, etc.).

d) Fetch the vector embeddings that are most similar to the input query embedding.


e) Pass the identified docs along with the input query to the LLM to get a relevant answer (chains may also be used here). The output may require a different structure, which can be generated using LangChain's output parsers (regex can be used for further formatting).


An output formatter for MCQs can be passed along with the prompt to get the answer in MCQ format (or any other structure).

6) Chains


LangChain chains enable logical connections between large language models (LLMs), simplifying the creation of complex applications by combining various models. There are two primary types of chains: generic chains, which rely on a single LLM for straightforward text generation based on prompts, and utility chains, which use multiple LLMs to perform specialized tasks. Examples of generic chains include the basic LLM chain and simple sequential chain, while utility chains, such as retrieval QA LLM and load summarize chain, enhance the overall functionality of LLMs by providing advanced capabilities for retrieving information and summarizing content.



Simple chain code using LLMChain.

Similarly, there can be another chain, using another LLM call, that gives the expense of staying in each city. One has to get the names of the cities first and then get the cost, so this sequence is important.



Sequential chain: here the place chain and then the expense chain have to run to get the combined answer.
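A minimal sketch of the two chains composed sequentially (the prompts and the country are illustrative; llm as before):

from langchain.chains import LLMChain, SimpleSequentialChain
from langchain_core.prompts import PromptTemplate

place_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template(
        "Suggest five cities to visit in {country}."),
)
expense_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template(
        "Estimate the cost of staying in each of these cities:\n{cities}"),
)

# Output of place_chain becomes the input of expense_chain, so order matters
trip_chain = SimpleSequentialChain(chains=[place_chain, expense_chain])
trip_chain.run("India")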

Here’s a breakdown of common chains in LangChain:


1. LLM Chain

  • Purpose: The simplest chain that connects a prompt with an LLM to generate text responses.

  • Use Case: Basic text generation based on a user-provided input or template.


2. Sequential Chain

  • Simple Sequential Chain: Links multiple chains in sequence, passing the output of one as input to the next.

  • Stuff Documents Chain: Combines multiple inputs or documents into a single prompt before processing.

  • Use Case: Multi-step workflows where each step depends on the previous one.


3. Retrieval Chain

  • RetrievalQA Chain: Fetches documents using a retriever, then passes them to an LLM for answering queries.

  • Vectorstore Retriever Chain: Uses vector databases to find relevant context before querying the LLM.

  • Use Case: Question answering and contextual search from external knowledge bases.


4. Summarization Chain

  • Load Summarize Chain: Loads text documents, splits them into smaller chunks, and summarizes them.

  • Use Case: Generating concise summaries from large texts.


5. Router Chain

  • Multi-Model Router Chain: Directs different inputs to appropriate LLMs or chains based on input type or task.

  • Use Case: Task-specific processing in a multi-model environment.


6. Transform Chain

  • Allows custom logic between input and output transformations.

  • Use Case: Applying data processing or formatting logic before passing data to the LLM.


7. LLM Request Chain

  • Enables making external HTTP requests and processing responses.

  • Use Case: Retrieving external data and extracting information through API calls.


These chains allow developers to build scalable and flexible AI applications, combining LLMs with memory, retrieval systems, and custom logic to automate tasks like question answering, data summarization, and web scraping.


7) Agents


The core idea of agents is to use a language model to choose a sequence of actions to take. In chains, a sequence of actions is hardcoded (in code). In agents, a language model is used as a reasoning engine to determine which actions to take and in which order.

LangGraph is an extension of LangChain specifically aimed at creating highly controllable and customizable agents. We recommend that you use LangGraph for building agents.

Unlike a chain, an agent gives an LLM some degree of control over the sequence of steps in the application. Examples of using an LLM to decide the control of an application:

  • Using an LLM to route between two potential paths

  • Using an LLM to decide which of many tools to call

  • Using an LLM to decide whether the generated answer is sufficient or more work is needed

There are many different types of agent architectures to consider, which give an LLM varying levels of control. On one extreme, a router allows an LLM to select a single step from a specified set of options and, on the other extreme, a fully autonomous long-running agent may have complete freedom to select any sequence of steps that it wants for a given problem.



Types of agents: from a router, where a single-step decision is taken, to a fully autonomous agent, where the identification of all steps is done by the LLM.

ReAct is a popular general-purpose agent architecture that combines three core concepts:

  1. Tool calling: Allowing the LLM to select and use various tools as needed.

  2. Memory: Enabling the agent to retain and use information from previous steps.

  3. Planning: Empowering the LLM to create and follow multi-step plans to achieve goals.




Example of using pre-defined agents: create_pandas_dataframe_agent
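A minimal sketch of creating the agent (titanic.csv is a hypothetical dataset with an age column; in recent versions the function lives in langchain_experimental, and newer releases may also require allow_dangerous_code=True):

import pandas as pd
from langchain_experimental.agents import create_pandas_dataframe_agent

df = pd.read_csv("titanic.csv")  # hypothetical dataset

# The agent lets the LLM write and run pandas code against df
agent_executor = create_pandas_dataframe_agent(llm, df, verbose=True)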

Then any analysis can be done on the data, like:


response = agent_executor.run("What is the average age of passengers?")

print(response)



8) Tools


Tools are functions or capabilities that agents can invoke to perform specific tasks. They act as bridges between the language model (LLM) and external systems, APIs, or custom logic, enabling the agent to interact with the environment, gather data, or process information. Tools expand the functional boundaries of language models, making them interactive and dynamic.


Key Components of Tools

  1. Input Schema: Defines what parameters the tool requires, helping the LLM understand how to invoke the tool correctly.

  2. Functionality: The actual logic or operation performed by the tool, typically implemented as a Python function or API call.


Common Types of Tools

  1. Search Tools: Enable agents to search the web or retrieve data from documents.

    • Example: Google Search, vector database lookups.

  2. Data Analysis Tools: Allow manipulation and analysis of data.

    • Example: Pandas DataFrame tools for querying and analyzing structured data.

  3. Calculation Tools: Perform mathematical or statistical computations.

  4. API Tools: Access external APIs for data retrieval or service interactions.

    • Example: Weather API or stock price API.


Examples of Built-in Tools

  • Python REPL: For executing Python code.

  • Wikipedia API: To fetch information from Wikipedia.

  • LLM-Request Chain: To make HTTP requests.



Here another search tool is used: DuckDuckGo, which provides additional information to the final LLM call. A full list of tool integrations is available at https://python.langchain.com/docs/integrations/tools/
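A minimal sketch of the DuckDuckGo tool (it lives in langchain_community and requires the duckduckgo-search package; the query is illustrative):

from langchain_community.tools import DuckDuckGoSearchRun

search = DuckDuckGoSearchRun()
context = search.run("latest LangChain release")

# The search results can be appended as context to the final LLM call
# llm.invoke(f"Using this context:\n{context}\n\nAnswer the question ...")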

9) Document Loaders


Document loaders in LangChain are used to load external data into a format suitable for language models. They provide specialized methods for different types of files, cloud services, and APIs. Here are some examples:


Common File Types

CSVLoader: Loads CSV files.

PDF Loaders: Includes PyPDF, PDFPlumber, PDFMiner, etc., for loading and parsing PDF documents.

JSONLoader: Loads JSON files.

DirectoryLoader: Loads all files from a specified directory.


Webpages

Web: Uses BeautifulSoup to load HTML pages.

RecursiveURL: Crawls links recursively from a root URL.

Sitemap: Loads all pages from a sitemap.


Cloud Storage

AWS S3 Loaders: Loaders for files and directories in AWS S3.

Google Cloud Storage Loaders: Loaders for files and directories in GCS.

Azure Blob Storage Loaders: Loaders for files and containers in Azure Blob Storage.


Social Media and Messaging Platforms

TwitterTweetLoader: Loads tweets.

RedditPostsLoader: Loads Reddit posts.

TelegramChatFileLoader: Loads chat history from Telegram.


Productivity Tools

SlackDirectoryLoader: Loads data from Slack channels or workspaces.

NotionDirectoryLoader: Loads Notion pages and databases.
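A minimal sketch of two of these loaders (the file name and URL are illustrative; loaders live in langchain_community.document_loaders):

from langchain_community.document_loaders import CSVLoader, WebBaseLoader

csv_docs = CSVLoader("data.csv").load()                 # one Document per row
web_docs = WebBaseLoader("https://example.com").load()  # parsed with BeautifulSoup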





10) Some Use Cases

1) Add real-time web data by scraping and passing it in the LLM call. Use search tools or a scraping tool.


2) Identify relevant content from a site: scrape the site data using a tool and store it after chunking. Do a similarity search of the user query against the stored embeddings.


3) MCQ generation: use an output format template in the prompt that will generate multiple answer options along with the right answer. Documents are stored in a vector DB after chunking. Retrieval is used to identify the correct documents, which are then passed along with the input question to the LLM to get multiple options and the correct answer.


4) Chatbot: use the memory feature in ConversationChain, which takes the LLM and the type of memory to use, to retain previous chats.


5) Chatbot with the capability of raising a ticket if the user didn't get a correct answer from the bot.

First, all documents have to be stored in the vector store after chunking (create a vector DB object and add the files).

Find the documents relevant to the user query and pass them along with the query to the LLM to get a domain-specific response.


Even in Streamlit, a multi-page app can be created now.

A separate classification model (XGBoost, CatBoost, SVM, etc.) can be used to classify the query into a ticket class. Once the model is saved, it can be used to map the class of the ticket and assign it accordingly (make_pipeline from sklearn can be used to build a pipeline that includes data processing and model execution).


6) Resume screening for HR: resume docs are stored as embeddings in a vector DB. Additional metadata (file name, file type, size, etc.) is added for every chunk; this is required to retrieve the resume document (since the mapping is present) for the chunk whose vector similarity with the job description is highest. In most LLM apps, sentence embeddings are used rather than word embeddings because of their better context understanding.



Why sentence embeddings are better than word embeddings.

7) Large-document summary: an LLM with a map-reduce chain. https://js.langchain.com/v0.1/docs/modules/chains/document/map_reduce/

LCEL (LangChain Expression Language) can be used for cleaner composition.
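A minimal LCEL sketch, piping a prompt into the model and a string parser (llm as before; the input text is illustrative):

from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# LCEL composes runnables with the | operator
chain = (
    PromptTemplate.from_template("Summarize in one line: {text}")
    | llm
    | StrOutputParser()
)
chain.invoke({"text": "LangChain is a framework for LLM applications."})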


8) Email generation: topic, sender, recipient, and style (appreciation/question/follow-up, etc.) as input.


LLM: LLaMA 2 is downloaded locally.

9) Invoice extraction: ask the LLM in the prompt to fetch the required columns (information); there is no need to store anything in a vector DB. Once fetched, additional output formatting may be required to get an exact table form.


10) Audio call summary: use the Whisper library for audio-to-text conversion. A LangChain agent can be used to automatically send an email to the caller after call summarization. https://blog.langchain.dev/langchain-zapier-nla/





