In my previous tutorial I showed how to use the Hugging Face API with LangChain, in order to use the free open-source large language models available on their website.
Due to time-out limitations on Hugging Face's free accounts, the results were unfortunately not satisfactory for real-world usage. This might change over time, but for now, we need a reliable alternative.
This is where OpenAI's API and the GPT-3.5 Turbo LLM come into play.
Let's now look at how to use the OpenAI API with LangChain and see if we get better results.
Install LangChain
To do this we first need to install LangChain by using the following command in Google Colab or Jupyter Notebook:
!pip install -qU langchain
The options `-q` and `-U` modify the behavior of the pip package manager as follows:
- `-q` or `--quiet`: This option instructs pip to run in quiet mode, which means that it will only output error messages or warnings, but not the installation progress.
- `-U` or `--upgrade`: This option tells pip to upgrade the package to the latest version if it's already installed. If the package is not installed, this option has no effect.
So in this command, `-qU` combines the two options: pip will install the LangChain package silently, without displaying any output, and will upgrade it to the latest version if it's already installed.
Install OpenAI Library
With LangChain installed, we now need to install the OpenAI prerequisite library by using the following command:
!pip install -qU openai
Install OpenAI API Key
For our Colab or Jupyter file to talk to OpenAI, we also need to set up an API key.
This can be done as follows:
- Create a free account at https://platform.openai.com/account/api-keys
- Click Create new secret key
- Give it a name
- Copy it to your clipboard or text file
- Click Done
Now simply paste the key into the following command, replacing the YOUR_OPENAI_API_KEY part, and remembering to keep the quotes on either side of it:
import os
os.environ['OPENAI_API_KEY'] = 'YOUR_OPENAI_API_KEY'
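Hardcoding a key in a notebook is risky if you ever share the file. As a small safeguard, a helper like the one below (a hypothetical sketch, not part of LangChain or the OpenAI library) fails fast with a clear message if the key was never set:

```python
import os

# Hypothetical helper: raise a clear error if the key is missing,
# instead of hitting an authentication failure mid-chain.
def require_api_key() -> str:
    key = os.environ.get('OPENAI_API_KEY')
    if not key:
        raise RuntimeError('OPENAI_API_KEY is not set')
    return key
```

Calling `require_api_key()` right after setting the variable confirms the notebook is ready to talk to OpenAI.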
Install OpenAI Model
With the API key making a direct connection to OpenAI, we can now use one of their trained Large Language Models (LLM).
For this tutorial, we will be using the gpt-3.5-turbo model, which has a usage cost of $0.002 per 1,000 tokens.
This is their fastest model currently available, but also a very accurate one.
Most importantly it’s very cheap to use in comparison to other models such as GPT-4 which costs anywhere from $0.03 to $0.12 per 1000 tokens.
There are also other fine-tuned models such as Ada, Babbage, Curie, and Davinci available, but they are inferior to GPT-3.5 Turbo, as well as costing more.
Even OpenAI themselves currently recommend using 3.5 Turbo over all these other models. You can get a full breakdown of all models and their pricing on the official OpenAI pricing page.
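To put that pricing in perspective, here is a rough back-of-the-envelope estimate (a sketch, assuming the $0.002-per-1,000-token rate quoted above):

```python
# Rough cost estimate: tokens used times the per-1,000-token rate.
def estimate_cost(total_tokens: int, price_per_1k: float = 0.002) -> float:
    return total_tokens / 1000 * price_per_1k

# A 161-token request at $0.002 per 1,000 tokens costs well under a cent.
print(f"${estimate_cost(161):.6f}")
```

Even thousands of short question-answering requests like the ones in this tutorial would cost only a few dollars at this rate.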
from langchain.chat_models import ChatOpenAI
gpt_model = ChatOpenAI(model_name='gpt-3.5-turbo')
The code above is pretty straightforward:
- Import the chat model class using `from langchain.chat_models import ChatOpenAI`
- Create a variable called `gpt_model` and assign it `ChatOpenAI(model_name='gpt-3.5-turbo')`
How to Create LangChain OpenAI Prompt Template
With the OpenAI library, API key, and model set up, we can test out our first Question & Answer prompt template by using the gpt_model variable that we declared above, in the following code:
from langchain import PromptTemplate, LLMChain

# build prompt template for simple question-answering
template = """Question: {question}
Answer: """

prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(
prompt=prompt,
llm=gpt_model
)
question = "Who is the CEO of Twitter?"
print(llm_chain.run(question))
Code Breakdown
- This code is a simple Q&A template using `template = """Question: {question} Answer: """`, where `{question}` is the input variable from the user.
- We then create an LLM chain using `llm_chain = LLMChain()`, which has 2 arguments: the prompt, via `prompt=prompt`, and the LLM to call, via `llm=gpt_model`.
- For testing purposes, we manually enter a string into the `{question}` variable, to recreate what a user might input, using `question = "Who is the CEO of Twitter?"`
- Finally, we print out the answer to the question using `print(llm_chain.run(question))`
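Under the hood, the PromptTemplate is simply substituting the user's input into the template string before it is sent to the model. A plain-Python sketch of that substitution (no API call involved) makes this concrete:

```python
# What the prompt looks like after {question} is filled in —
# plain str.format stands in for what PromptTemplate does internally.
template = "Question: {question}\nAnswer: "
filled = template.format(question="Who is the CEO of Twitter?")
print(filled)
```

The resulting string, ending in "Answer: ", is what nudges the model to complete the text with an answer.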
Running the Code
If we run this command using the gpt-3.5-turbo model, we will see that it outputs a response of:
- As an AI language model, I am not aware of the current date and time, which is necessary to provide the current CEO of Twitter. However, as of May 2021, the current CEO of Twitter is Jack Dorsey.
While this is correct, it’s also outdated.
The reason that I specifically asked this question was to highlight two facts:
- The GPT-3.5 model was trained on data up to 2021, meaning that it won't know anything that happened after this, as we can see.
- This is a pre-trained model that does not have access to the internet, meaning that it cannot go out and find more updated information.
This won’t affect our tutorial, but I wanted to just make you aware of this before we continued.
In the near future I’m sure this will change, especially with the new plugins.
LangChain PromptTemplate with Multiple Questions
We can take this a step further by creating a `PromptTemplate` that asks multiple questions, passing them in as a single list of question dictionaries `qs`:
qs = [
{'question': "Which NFL team won the Super Bowl in the 2018 season?"},
{'question': "If a person is 5 ft 8 inches, how tall are they in centimeters?"},
{'question': "Who was the first person on the moon?"},
{'question': "How many planets are in our solar system?"}
]
llm_chain.generate(qs)
Code Breakdown
- This is a list of key:value pair dictionaries, with the key being the question input variable and the value being the string-based question that a user would ask.
- We can also have multiple different input variables, so we don't have to use the same one in each dictionary.
- We then generate the output using `llm_chain.generate(qs)`
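For example, a hypothetical template with two input variables (sketched here with plain str.format, which mirrors what PromptTemplate does internally) might look like this:

```python
# Hypothetical two-variable template: both {context} and {question}
# would be listed in input_variables when building the PromptTemplate.
template = "Context: {context}\n\nQuestion: {question}\n\nAnswer: "
filled = template.format(
    context="The 2017 NFL season ended with Super Bowl LII in February 2018.",
    question="Which team won?",
)
print(filled)
```

Each dictionary passed to the chain would then need to supply both a context and a question value.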
Running the Code
Let's see how these answers compare to the previous template's answers and a Google search:
LLMResult(generations=[[ChatGeneration(text='The New England Patriots won the Super Bowl in the 2018 season.', generation_info=None, message=AIMessage(content='The New England Patriots won the Super Bowl in the 2018 season.', additional_kwargs={}))], [ChatGeneration(text='172.72 centimeters.', generation_info=None, message=AIMessage(content='172.72 centimeters.', additional_kwargs={}))], [ChatGeneration(text='Neil Armstrong was the first person on the moon.', generation_info=None, message=AIMessage(content='Neil Armstrong was the first person on the moon.', additional_kwargs={}))], [ChatGeneration(text='There are eight planets in our solar system: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.', generation_info=None, message=AIMessage(content='There are eight planets in our solar system: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.', additional_kwargs={}))]], llm_output={'token_usage': {'prompt_tokens': 103, 'completion_tokens': 58, 'total_tokens': 161}, 'model_name': 'gpt-3.5-turbo'})
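The raw LLMResult above is verbose. Assuming the structure shown, a list of generations where each one holds its answer in a .text attribute, you can pull out just the answer strings. The sketch below mocks that structure with SimpleNamespace so it runs without an API call:

```python
from types import SimpleNamespace

# Stand-in for the LLMResult returned by llm_chain.generate(qs);
# the real object exposes the same .generations / .text shape.
result = SimpleNamespace(generations=[
    [SimpleNamespace(text='The New England Patriots won the Super Bowl in the 2018 season.')],
    [SimpleNamespace(text='172.72 centimeters.')],
])

# Each question's first generation holds the answer text.
answers = [gen[0].text for gen in result.generations]
for a in answers:
    print(a)
```

With the real result object, the same list comprehension gives you one clean answer string per question.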
As you can see this data isn’t exactly accurate, as a Google search yields some correct answers that are different to what this model provided:
- Q: Which NFL team won the Super Bowl in the 2018 season?
- A: The New England Patriots won the Super Bowl in the 2018 season – INCORRECT as the New England Patriots played in the finals, but lost to the Philadelphia Eagles.
- Q: If a person is 5 ft 8 inches, how tall are they in centimeters?
- A: 172.72 centimeters – CORRECT
- Q: Who was the first person on the moon?
- A: Neil Armstrong was the first person on the moon – CORRECT
- Q: How many planets are in our solar system?
- A: There are eight planets in our solar system: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune – CORRECT
While GPT-3.5 Turbo definitely performed better than the google/flan-t5-xxl model on Hugging Face, which only got 2 out of 4 correct, it still got one question wrong, resulting in a score of 3 out of 4.
Multiple Questions Alternative Templates
There are two alternative ways to format the code as well.
The first is very similar, except we print the results instead of generating them; the second is to instruct the model to answer all the questions at the same time.
Let’s see what happens.
Single Prompt with Multiple Questions Template
We can also use the following code to pass all of the questions to the LLM as a single list input qs, instead of giving it four separate question dictionaries as in the previous example:
qs = [
"Which NFL team won the Super Bowl in the 2018 season?",
"If a person is 5 ft 8 inches, how tall are they in centimeters?",
"Who was the first person on the moon?",
"How many planets are in our solar system?"
]
print(llm_chain.run(qs))
Running the Code
Let's again see how these answers compare to the previous template's answers and a Google search:
- Q: Which NFL team won the Super Bowl in the 2018 season?
- A: The NFL team that won the Super Bowl in the 2018 season was the New England Patriots – INCORRECT as the New England Patriots played in the finals, but lost to the Philadelphia Eagles.
- Q: If a person is 5 ft 8 inches, how tall are they in centimeters?
- A: A person who is 5 ft 8 inches tall is approximately 172.72 centimeters tall – CORRECT
- Q: Who was the first person on the moon?
- A: The first person on the moon was Neil Armstrong – CORRECT
- Q: How many planets are in our solar system?
- A: There are eight planets in our solar system: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune – CORRECT
We can see that it got the same answers correct and incorrect as before, with a score of 3 out of 4. Interestingly, however, it rephrased the answers differently.
Multi-Prompt Template with Instruction
Now we can try to answer all the questions in one go by giving the model the instruction "Answer the following questions one at a time."
multi_template = """Answer the following questions one at a time.
Questions:
{questions}
Answers:
"""
long_prompt = PromptTemplate(
template=multi_template,
input_variables=["questions"]
)
llm_chain = LLMChain(
prompt=long_prompt,
llm=gpt_model
)
qs_str = (
"Which NFL team won the Super Bowl in the 2018 season?\n" +
"If a person is 5 ft 8 inches how tall are they in centimeters?\n" +
"Who was the first person on the moon?\n" +
"How many planets are in our solar system?\n"
)
print(llm_chain.run(qs_str))
We again print out the results for a cleaner output.
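As a side note, the string concatenation used to build qs_str above can be written more compactly with str.join, which scales to any number of questions:

```python
questions = [
    "Which NFL team won the Super Bowl in the 2018 season?",
    "If a person is 5 ft 8 inches how tall are they in centimeters?",
    "Who was the first person on the moon?",
    "How many planets are in our solar system?",
]

# Equivalent to concatenating with "+" as above: one question per line.
qs_str = "\n".join(questions) + "\n"
```

The resulting string is identical to the hand-concatenated version, so either form can be passed to `llm_chain.run()`.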
Running the Code
Let's again see how these answers compare to the previous template's answers and a Google search:
- Q: Which NFL team won the Super Bowl in the 2018 season?
- A: The Philadelphia Eagles won the Super Bowl in the 2018 season – CORRECT
- Q: If a person is 5 ft 8 inches, how tall are they in centimeters?
- A: A person who is 5 ft 8 inches tall is approximately 172.72 centimeters tall – CORRECT
- Q: Who was the first person on the moon?
- A: Neil Armstrong was the first person on the moon – CORRECT
- Q: How many planets are in our solar system?
- A: There are eight planets in our solar system – CORRECT
It again rephrases some of the answers differently, but more importantly, it got all of them correct! 4 out of 4!
This shows that altering the LangChain `PromptTemplate` yields different results, and that structuring it in a certain way can actually increase the accuracy of the results.
Conclusion
In this tutorial, we have explored how to use the LangChain package with the OpenAI API to build a question-answering prompt template. By following the steps outlined in this tutorial, we have learned the following:
- How to install LangChain and OpenAI libraries using pip.
- How to set up an OpenAI API key and link it to our Colab or Jupyter Notebook file.
- How to install and use the gpt-3.5-turbo model provided by OpenAI to generate responses to user prompts.
- How to build a question-answering prompt template using LangChain’s PromptTemplate and LLMChain classes.
- How to test our prompt template by providing a question and generating a response.
Through the course of the tutorial, we also discussed the limitations of the free Hugging Face API and explored an alternative in the form of the OpenAI API, specifically the gpt-3.5-turbo model. We saw how the gpt-3.5-turbo model is not only fast and accurate, but also cost-effective compared to other models.
Overall, we have gained a foundational understanding of how to use the LangChain package and OpenAI API to build a question-answering prompt template. Armed with this knowledge, we can now explore more complex use cases and build on top of what we have learned here.