Let’s summarize it all in the following diagram.
Understanding all the modules and how they chain together is essential for building pipeline applications on top of large language models with LangChain. That said, this is just a brief introduction to LangChain.
Practical Applications of LangChain
Without further ado, let’s get straight to building a simple application with LangChain. One of the most interesting applications is a chatbot over your own custom data.
Disclaimer/Warning: This code is only intended to show how such an application can be built. I make no guarantees about its optimization; further improvements may be necessary depending on your specific problem statement.
Start by importing the modules
Import LangChain and OpenAI for the large language model part. If you haven't installed them yet, please install them first.
# IMPORTS
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains import ConversationalRetrievalChain
from langchain.vectorstores import ElasticVectorSearch, Pinecone, Weaviate, FAISS
from PyPDF2 import PdfReader
from langchain import OpenAI, VectorDBQA
from langchain.prompts import PromptTemplate
from langchain.chains import ConversationChain
from langchain.document_loaders import TextLoader
# from langchain import ConversationalRetrievalChain
from langchain.chains.question_answering import load_qa_chain
from langchain import LLMChain
# from langchain import retrievers
import langchain
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI  # used for the chat model below
PyPDF2 is a tool for reading and processing PDF files. In addition, there are different types of memory, such as ConversationBufferMemory and ConversationBufferWindowMemory, which serve specific functions. I will talk about memory in detail in the last section.
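As a quick, hedged illustration of the difference between these two memory types (a minimal sketch; the k value is just an example):

from langchain.chains.conversation.memory import (
    ConversationBufferMemory,
    ConversationBufferWindowMemory,
)

# ConversationBufferMemory keeps the entire chat history in the prompt.
full_memory = ConversationBufferMemory()

# ConversationBufferWindowMemory keeps only the last k exchanges,
# which bounds the prompt size (and therefore the token cost).
window_memory = ConversationBufferWindowMemory(k=2)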
Setting up the environment
I assume you know how to get an OpenAI API key, but I’d like to explain it anyway:
Go to the OpenAI API page,
Click on “Create new secret key”
That will be your API key. Paste it below
import os

os.environ["OPENAI_API_KEY"] = "sk-YOURAPIKEY"
Which model should you use? Davinci, Babbage, Curie, or Ada? GPT-3-based, GPT-3.5-based, or GPT-4-based? There are many questions about models, and different models suit different tasks. Some models are cheaper, and some are more accurate.
For simplicity, we will use the most affordable model, "gpt-3.5-turbo". Temperature is a parameter that controls the randomness of the answers: the higher the temperature, the more random the answers we get.
llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
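As a quick sanity check (a hedged sketch; the question text is just a placeholder), you can call the model directly and compare temperature settings:

from langchain.schema import HumanMessage

# temperature=0 gives near-deterministic answers; higher values vary more
deterministic = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
creative = ChatOpenAI(temperature=1.0, model_name="gpt-3.5-turbo")

question = [HumanMessage(content="Describe LangChain in one sentence.")]
print(deterministic(question).content)
print(creative(question).content)  # rerunning this line yields more varied answers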
Here you can add your own data. You can use any format, such as PDF, text, Doc, or CSV. Depending on your data format, comment or uncomment the following code.
# Custom data
from langchain.document_loaders import DirectoryLoader

pdf_loader = PdfReader(r'YourPDFlocation')
# excel_loader = DirectoryLoader('./Reports/', glob="**/*.txt")
# word_loader = DirectoryLoader('./Reports/', glob="**/*.docx")
We cannot feed all the data to the model at once. Instead, we split the data into chunks and send those chunks off to create embeddings.
Embeddings are represented in the form of numeric vectors or arrays that capture the essence and contextual information of the tokens processed and generated by the model. These embeddings are derived from the parameters or weights of the model and are used to encode and decode input and output text.
This is how Embeddings are created.
In simple terms, an embedding in an LLM is a way to represent text as a numeric vector. This enables language models to understand the meaning of words and phrases and to perform tasks such as text classification, summarization, and translation.
In layman's terms, embedding is a way of turning words into numbers. This is achieved by training a machine learning model on a large corpus of text. The model learns to associate each word with a unique numeric vector. This vector represents the meaning of the word, as well as its relationship to other words.
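To make "words as numbers" concrete, here is a minimal, hedged sketch (the word pair is arbitrary) that embeds two words with OpenAIEmbeddings and compares them with cosine similarity:

import numpy as np

emb = OpenAIEmbeddings()
v1 = np.array(emb.embed_query("dog"))
v2 = np.array(emb.embed_query("puppy"))

# Cosine similarity: close to 1.0 for related words, lower for unrelated ones
cosine = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
print(f"similarity('dog', 'puppy') = {cosine:.3f}")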
Let's do the exact same thing as shown above.
# Preprocessing of file
raw_text = ''
for i, page in enumerate(pdf_loader.pages):
    text = page.extract_text()
    if text:
        raw_text += text

# print(raw_text[:100])

text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
)
texts = text_splitter.split_text(raw_text)
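Optionally (a small hedged check, not part of the original flow), you can inspect what the splitter produced:

# How many chunks were produced, and what does the first one look like?
print(len(texts), "chunks")
print(texts[0][:100])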
In practice, when a user issues a query, a search is performed in the vector store, and the most relevant indexed chunks are retrieved and passed to the LLM. The LLM then reworks the retrieved content into a well-formed response for the user.
I recommend further diving into the concepts of vector storage and embedding to enhance your understanding.
embeddings = OpenAIEmbeddings()
# vectorstore = Chroma.from_documents(documents, embeddings)
vectorstore = FAISS.from_texts(texts, embeddings)
The embedding vectors are stored directly in a vector database. There are many vector databases available, such as Pinecone, FAISS, etc. Here, we will use FAISS.
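To see the retrieval step described above in isolation (a hedged sketch; the query string is just an example), you can search the FAISS store directly:

# Retrieve the k chunks most similar to a query
docs = vectorstore.similarity_search("What is this document about?", k=4)
print(docs[0].page_content[:200])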
prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say GTGTGTGTGTGTGTGTGTG, don't try to make up an answer.

{context}

Question: {question}
Helpful Answer:"""

QA_PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)
You can use your own prompts to refine the queries and answers. After writing the prompt, let's link it into the final chain.
Let's call the final chain, which includes everything chained so far. We use ConversationalRetrievalChain here. It lets us hold a conversation in a human way and remembers the previous chat history.
qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(temperature=0.8),
    vectorstore.as_retriever(),
    qa_prompt=QA_PROMPT,
)
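Before wiring up a UI, you can exercise the chain directly (a hedged sketch; the question is a placeholder):

chat_history = []
result = qa({"question": "Summarize the document in two sentences.",
             "chat_history": chat_history})
print(result["answer"])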
We will use Gradio to create a simple web app. You could instead use Streamlit or another front-end technology. There are also many free deployment options, such as Hugging Face Spaces or a local host, which we can do later.
# Frontend web app
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("## Grounding DINO ChatBot")
    chatbot = gr.Chatbot()
    msg = gr.Textbox()
    clear = gr.Button("Clear")
    chat_history = []

    def user(user_message, history):
        print("Type of user msg:", type(user_message))
        # Get response from QA chain
        response = qa({"question": user_message, "chat_history": history})
        # Append user message and response to chat history
        history.append((user_message, response["answer"]))
        print(history)
        return gr.update(value=""), history

    msg.submit(user, [msg, chatbot], [msg, chatbot], queue=False)
    clear.click(lambda: None, None, chatbot, queue=False)

############################################
if __name__ == "__main__":
    demo.launch(debug=True)
This code will create a link locally where you can ask questions and see answers directly. At the same time, you will see the chat history maintained in your integrated development environment (IDE).
LangChain Snapshot
This is a simple introduction showing how to create the final chain by connecting different modules. By tweaking the modules and the code, you can achieve many different functions. I would say that playing is the highest form of research!
LangChain Tokens and Models
Token
Tokens can be thought of as parts of words. Before processing the prompt, the API will split the input into tokens. The token split position does not necessarily correspond exactly to the start or end position of the word, and may also include trailing spaces or even subwords.
In natural language processing, we usually tokenize text to split paragraphs into sentences or words. Here, sentences and paragraphs are likewise split into small chunks made up of words.
The above image shows how text is segmented into tokens. Different colors represent different tokens. A rule of thumb is that one token is roughly equivalent to 4 characters in common English text. This means that 100 tokens are roughly equivalent to 75 words.
If you want to check the number of tokens for a particular text, you can check it directly on OpenAI’s Tokenizer.
Another way to calculate the number of tokens is to use the tiktoken library.
import tiktoken

# Function that takes a string and returns the number of tokens
def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.encoding_for_model(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens
Finally, using the above function:
prompt = []
for i in data:
    prompt.append(num_tokens_from_string(i['prompt'], "davinci"))

completion = []
for j in data:
    completion.append(num_tokens_from_string(j['completion'], "davinci"))

res_list = []
for i in range(0, len(prompt)):
    res_list.append(prompt[i] + completion[i])

no_of_final_token = 0
for i in res_list:
    no_of_final_token += i

print("Number of final token", no_of_final_token)
Output:
Number of final token 2094
The choice between models is influenced by the number of tokens, because each model has a maximum context length it can process.
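As a hedged illustration (the context sizes below are approximate figures from OpenAI's documentation at the time of writing, and the helper function is hypothetical), you can check whether a prompt fits a given model:

# Approximate context windows (prompt + completion), in tokens
CONTEXT_WINDOW = {
    "text-davinci-003": 4097,
    "gpt-3.5-turbo": 4096,
    "gpt-4": 8192,
}

def fits(model_name: str, prompt_tokens: int, completion_budget: int = 256) -> bool:
    """True if the prompt leaves enough room for the desired completion."""
    return prompt_tokens + completion_budget <= CONTEXT_WINDOW[model_name]

print(fits("gpt-3.5-turbo", 2094))  # the 2094 tokens counted above -> True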
First, let’s understand the different models provided by OpenAI. In this blog, I am focusing on OpenAI models; we could also use Hugging Face or Cohere models.
Let's first understand the basic models.
Model
GPT is powerful because it is trained on large datasets. However, with great power comes a price, so OpenAI offers multiple models, also called engines, to choose from.
Davinci is the largest and most capable engine; it can do everything the other engines can do. Curie is the next most capable, and it can do everything Babbage and Ada can do. Ada is the least capable engine, but it is the fastest and the cheapest.
As GPT continues to evolve, there are many different versions of models to choose from; more than 50 models are available in the GPT family.
Screenshot from OpenAI official model page