Generating structured LLM outputs.
It turns out that when building products, we can use LLMs for a lot more than just a "chat with AI" feature. (Of course, that requires a little creativity, which is not always easy to come by.)
A common approach is to use LLMs to generate structured outputs for users. This is useful for things like extracting information from text, auto-completing user inputs, providing suggestions based on past behavior, and a lot more. (Again, creativity is key here.)
What makes this difficult is that LLMs are good at generating text but often bad at following specific instructions. So no matter how much you yell at your LLM, DO NOT EXPLAIN YOURSELF, or IF YOU DON'T RETURN JSON JOHN WILL DIE, it will still do whatever it wants.
Thankfully, there are a few tools that help us with this. One of them is Instructor, and it makes it easy to generate structured outputs using LLMs.
Getting Started with Instructor
First, we need to install Instructor (duh), Zod, and OpenAI’s SDK.
```shell
npm i @instructor-ai/instructor zod openai
```
Once we have these installed, we can start by initializing our OpenAI client and Instructor.
```typescript
import OpenAI from 'openai';
import Instructor from '@instructor-ai/instructor';

const client = Instructor({
  client: new OpenAI({ apiKey: 'YOUR_API_KEY' }),
  mode: 'FUNCTIONS'
});
```
With the client initialized, we can then start generating structured outputs.
Defining the Schema
A nice thing about Instructor is that it works with Zod schemas instead of inventing yet another schema language. This makes it easy to reuse existing schemas and share them across every part of your application that already uses Zod.
Let’s say we have a `User` schema and want to extract the user’s name and job title from a piece of text.
```typescript
import { z } from 'zod';

const UserSchema = z.object({
  name: z.string().describe('The name of the user'),
  job: z.string().describe('The job of the user')
});
```
Here, we describe each field so that Instructor can use it to generate better outputs.
Generating Structured Outputs
Once we have the schema defined, we can use the client to generate structured outputs.
```typescript
const user = await client.chat.completions.create({
  messages: [
    {
      role: 'user',
      content: 'My name is John Doe and I am a software engineer.'
    }
  ],
  model: 'gpt-4-turbo-preview',
  response_model: {
    schema: UserSchema,
    name: 'User'
  }
});
```
Here, we pass the schema to the `response_model` field, and Instructor will use it to generate structured outputs.
Once we run this code, we’ll get a structured output that matches the schema.
```json
{
  "name": "John Doe",
  "job": "software engineer"
}
```
Let’s try a different input:
```typescript
const user = await client.chat.completions.create({
  messages: [
    {
      role: 'user',
      content: 'Hey, this is Ellie, and I work as a UX Designer.'
    }
  ],
  model: 'gpt-4-turbo-preview',
  response_model: {
    schema: UserSchema,
    name: 'User'
  }
});
```
And we’ll get a different output:
```json
{
  "name": "Ellie",
  "job": "UX Designer"
}
```
Of course, this is a very simple example, but we can go much further and do more complex things with Instructor.
For example, say we are building a car review website and want to extract some information from a long review. We can define a `Car` schema and extract things like pros, cons, overall sentiment, and more.
```typescript
const CarSchema = z.object({
  car: z.string().describe('Name of the car'),
  sentiment: z
    .enum(['positive', 'negative'])
    .describe('Sentiment of the review'),
  pros: z.array(z.string()).describe('Up to 5 pros of the car'),
  cons: z.array(z.string()).describe('Up to 5 cons of the car')
});
```
Then, we can run the same code as before, but instead feed it the entire car review, or even multiple reviews, and get structured outputs.
```typescript
const car = await client.chat.completions.create({
  messages: [
    {
      role: 'user',
      content: 'The 2024 Tesla Cybertruck...'
    }
  ],
  model: 'gpt-4-turbo-preview',
  response_model: {
    schema: CarSchema,
    name: 'Car'
  }
});
```
And we’ll get a structured output like this:
```json
{
  "car": "2024 Tesla Cybertruck",
  "sentiment": "positive",
  "pros": [
    "Sleek design",
    "Impressive performance",
    "Good range and charging times"
  ],
  "cons": [
    "Dual Motor and Beast models are pricey",
    "Limited interior space",
    "No Apple CarPlay and Android Auto"
  ]
}
```
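As for feeding it multiple reviews: since each call is just a promise, batching is a `map` plus `Promise.all`. Here is a minimal sketch with the per-review call factored out into an `extract` parameter so the batching logic stands on its own (the `Car` interface below is a hand-written mirror of `CarSchema`, and the helper name is my own, not part of Instructor):

```typescript
// Shape of one structured result, mirroring CarSchema above.
interface Car {
  car: string;
  sentiment: 'positive' | 'negative';
  pros: string[];
  cons: string[];
}

// Runs one extraction per review, all in parallel.
async function extractAll(
  reviews: string[],
  extract: (review: string) => Promise<Car>
): Promise<Car[]> {
  return Promise.all(reviews.map((review) => extract(review)));
}
```

In the app itself, `extract` would wrap the same `client.chat.completions.create` call shown above, passing each review as the user message.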
And that’s it! Of course, this is just the tip of the iceberg, and there’s a lot more you can do with Instructor. But this should give you a good starting point to generate structured outputs using LLMs.