
Fine-tune an LLM (from OpenAI) using LangChain (for beginners)

Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. However, using these LLMs in isolation is often insufficient for creating a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.

LangChain aims to assist in the development of these types of applications.


There are six main areas that LangChain is designed to help with, in increasing order of complexity.

One can access the complete documentation here-


Example of using LangChain




Import the required packages-

import os
import langchain
import openai


Here LangChain will use OpenAI's model; one can also use Hugging Face models. One needs an OpenAI account and an OpenAI API key-
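For example, the key can be supplied through the OPENAI_API_KEY environment variable, which both the openai client and LangChain read by default (the key value below is a placeholder, not a real key):

```python
import os

# Provide the OpenAI API key via an environment variable;
# both the openai client and LangChain pick it up automatically.
# The value below is a placeholder - replace it with your own key.
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"
```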


from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.memory import ConversationBufferMemory
from langchain.vectorstores import FAISS



Data load & Vector DB creation

# Load the CSV file using LangChain's CSVLoader
# https://github.com/ArpitSisodia/Data_Sets/blob/main/keto.csv
loader = CSVLoader(file_path="keto.csv")
data = loader.load()

# Preview the raw data with pandas
import pandas as pd
pd.read_csv("keto.csv").head()


# Convert the data to embeddings (vectors)
embeddings = OpenAIEmbeddings()

# Store the vectors in a vector database
# (a vector database is used for semantic search)
vectorstore = FAISS.from_documents(data, embeddings)
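To see what "semantic search" means here, a toy sketch in plain Python: the three-dimensional "embeddings" below are made up (real OpenAI embeddings have over a thousand dimensions), but the ranking step - sort documents by cosine similarity to the query vector - is the same idea the vector database performs.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": each document is already a vector (made up for illustration).
doc_vectors = {
    "keto pancakes": [0.9, 0.1, 0.0],
    "chicken salad": [0.2, 0.8, 0.1],
    "chocolate cake": [0.1, 0.2, 0.9],
}
query_vector = [0.85, 0.15, 0.05]  # pretend this embeds "low-carb breakfast"

# Rank documents by similarity to the query, highest first.
ranked = sorted(doc_vectors,
                key=lambda d: cosine_similarity(query_vector, doc_vectors[d]),
                reverse=True)
print(ranked[0])  # -> keto pancakes
```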



Set up a conversational retrieval chain using ConversationalRetrievalChain

chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(temperature=0.0, model_name='gpt-3.5-turbo'),
    retriever=vectorstore.as_retriever(),
    memory=ConversationBufferMemory(memory_key="chat_history", return_messages=True)
)


Here the model used is gpt-3.5-turbo, which requires an OpenAI API key; memory is needed to retain the previous conversation.
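Conceptually, the conversation buffer just accumulates every human/AI turn so the chain can resolve follow-up questions against earlier context. A rough plain-Python sketch (not LangChain's actual implementation):

```python
# Rough sketch of a conversation buffer (not LangChain's actual
# implementation): every turn is stored so later questions can be
# answered with the earlier context in view.
chat_history = []

def record_turn(question, answer):
    """Append one human/AI exchange to the running history."""
    chat_history.append(("human", question))
    chat_history.append(("ai", answer))

record_turn("what context can you provide?", "Recipes for the keto diet.")
record_turn("which cuisine types are covered?", "American, among others.")

# The chain would pass this whole history to the LLM on each call.
print(len(chat_history))  # 4 messages across 2 turns
```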



Creating a prompt and passing it to the chain-

prompt1 = "what context can you provide?"

res1 = chain({"question":prompt1})

res1['answer']


Output-

I can provide context about different recipes that fall under the "keto" diet type. I have information about the recipe names, cuisine types, protein, carbs, and fat content of each recipe. I also have the extraction day and time for each piece of context. Let me know if you have any specific questions or if there's anything else I can assist you with!


prompt2 = "Give me a recommendation for best Keto diet based on American cuisine type"

res2 = chain({"question":prompt2})

res2['answer']


Output-

Based on the available information, the best Keto diet recommendation based on American cuisine type would be the "Keto Diet: Cost Breakdowns of Popular Recipes." This recipe has the highest protein content (227.31g) and a good balance of carbs (27.27g) and fat (430.02g) for a Keto diet.











