A presentation given by Todd Kerpelman, Developer Advocate at Plaid, at our 2024 Austin API Summit, March 12-13.
Session Description: Have you ever thought about building your own chatbot to help developers be more successful using your APIs? Well, we made one for Plaid’s documentation site, and in this talk, I’ll cover some of the things we learned!
This presentation will cover topics like:
– How does it work? What does it mean to “train” a bot on your docs?
– Setting appropriate expectations: Do you still need to write documentation? Do you still need a support team?
– The trade-offs around building your own vs. buying a 3rd party solution
– Some decisions around the underlying tech
– How to build a decent “conversational mode” so you can ask follow-up questions
– How you evaluate the quality of a chatbot, and some surprises we encountered along the way
– What do you do when things go wrong?
– Security considerations
And much more! Actually, probably not that much more. That already sounds like a lot.
How I Built Bill, the AI-Powered Chatbot That Reads Our Docs for Fun, by Todd Kerpelman at Plaid
1. What I learned making Bill
The helpful robot platypus that reads our docs for fun
2. Plaid
Account Verification & Payments
Fraud & Compliance
Personal Finance Insights
Credit Underwriting
Open Finance
All your Fintech solutions on a single platform
7. Step 2: Figure out the "meaning" of each
[Figure: an "Embeddings Model" places activities on two axes, Hot weather to Cold weather and Fun to Chore. Activities shown: Mowing the lawn, Weeding, Making dinner, Shoveling snow, Chains on tires, Sledding, Skiing, Cross-country skiing, Pool, Beach.]
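To make the idea concrete, here is a minimal sketch that uses hand-picked two-dimensional vectors in place of a real embeddings model. The two axes follow the slide (cold to hot weather, chore to fun), but the coordinate values and the `toyEmbeddings`, `cosineSimilarity`, and `mostSimilar` names are made up for illustration. A real model produces many more dimensions, and none of them come with labels.

```javascript
// Toy 2-D "embeddings": [weather, enjoyment].
// weather: -1 = cold, +1 = hot; enjoyment: -1 = chore, +1 = fun.
// These coordinates are invented for illustration, not output from a real model.
const toyEmbeddings = {
  "Mowing the lawn": [0.8, -0.7],
  "Shoveling snow": [-0.9, -0.8],
  "Sledding": [-0.8, 0.9],
  "Skiing": [-0.9, 0.8],
  "Beach": [0.9, 0.9],
};

// Cosine similarity: 1 means same direction, -1 means opposite direction.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank every other activity by how close its "meaning" is to the given one.
function mostSimilar(name) {
  return Object.keys(toyEmbeddings)
    .filter((other) => other !== name)
    .map((other) => ({
      name: other,
      score: cosineSimilarity(toyEmbeddings[name], toyEmbeddings[other]),
    }))
    .sort((x, y) => y.score - x.score);
}

console.log(mostSimilar("Skiing")[0].name); // prints "Sledding"
```

With these made-up coordinates, "Skiing" lands closest to "Sledding" (both cold and fun) and farthest from "Mowing the lawn" (hot and a chore), which is the intuition the slide is after.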
8. Step 3: Store all that in a database
Vector (1500-dimensional): -0.0172341652, 0.0231553614, 0.0176660642, -0.0137998713, 0.00700442726, -0.0216646139, -0.00155431416, 0.0154508408, -0.020619696, 0.0172063019, 0.00771148782, -0.00135577994, 0.0161195882, -0.
URL: https://plaid.com/docs/signal/add-to-app/
Text: for each institution that Plaid supports. Link makes it secure and easy for users to connect their bank accounts to Plaid. Note that these instructions cover Link on the web. For instructions on using Link within mobile apps, see the Link documentation. Using Link, we will
Title: Signal - Add Signal to your app | Plaid Docs
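A minimal sketch of what one stored record might look like, assuming a plain in-memory array stands in for the vector database. The `records` array, the `addRecord` helper, and the field names are hypothetical. The slide only shows that each entry pairs a vector with a URL, a title, and a chunk of text.

```javascript
// An in-memory stand-in for a vector database (hypothetical; a real system
// would use a dedicated vector store).
const records = [];

// Store one chunk of documentation alongside its embedding vector.
// Returns the number of records stored so far.
function addRecord({ vector, url, title, text }) {
  if (!Array.isArray(vector) || vector.length === 0) {
    throw new Error("vector must be a non-empty array of numbers");
  }
  // Vectors are only comparable if they all have the same number of dimensions.
  if (records.length > 0 && records[0].vector.length !== vector.length) {
    throw new Error("vector length does not match existing records");
  }
  records.push({ vector, url, title, text });
  return records.length;
}

addRecord({
  // A short made-up vector; the slide shows a 1500-dimensional one.
  vector: [-0.0172, 0.0232, 0.0177],
  url: "https://plaid.com/docs/signal/add-to-app/",
  title: "Signal - Add Signal to your app | Plaid Docs",
  text: "Link makes it secure and easy for users to connect their bank accounts to Plaid.",
});
```

Keeping the URL and title next to the text is what later lets the bot cite where an answer came from.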
9. So, when a user asks a question...
👩💻
User: "Hey, Bill, how do I get started with Plaid Transactions?"
Embeddings API: "Ooh! Lemme convert that question into a vector!"
Vector database: "These documents are closest to the vector you just sent me."
[Diagram: User's question → Embeddings API → Vector → Vector database]
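The lookup step can be sketched as a nearest-neighbor search: score every stored document against the question's vector and keep the best few. This is a brute-force illustration over an in-memory array, not how a production vector database works internally. The `findClosestDocuments` name, the `id` values, and the two-dimensional vectors are all assumptions, and the question vector is presumed to come from the same embeddings API used for the documents.

```javascript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Return the k records whose vectors are closest to the question vector,
// best match first. Each result carries its similarity score.
function findClosestDocuments(questionVector, records, k = 3) {
  return records
    .map((record) => ({
      ...record,
      score: cosineSimilarity(questionVector, record.vector),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Usage with made-up 2-D vectors and hypothetical document ids.
const demoRecords = [
  { id: "transactions-intro", vector: [0.9, 0.1] },
  { id: "auth-intro", vector: [0.1, 0.9] },
  { id: "transactions-add-to-app", vector: [0.8, 0.3] },
];
const demoQuestionVector = [1, 0]; // pretend this came from the embeddings API

const closest = findClosestDocuments(demoQuestionVector, demoRecords, 2);
console.log(closest.map((doc) => doc.id));
```

In this toy data the two transactions documents outrank the auth document, so they are the ones that would be handed to the language model.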
10. Now... we can ask our question
"Hey, large language model! Can you answer this user's question... with the help of these documents?"
GPT-4: "Sure can!"
[Diagram: User's question → GPT-4 → Surprisingly accurate answer]
11. A simplified version of the prompt
You are a helpful friendly chatbot that wants developers to be successful when using
the Plaid API. You are given a question and you should answer it if you can.
It's okay to say you don't know if you don't know the answer.
Feel free to include detail when relevant, and include code snippets if appropriate.
Here is some context that might prove useful in answering the user's question:
=========
{all those documents}
=========
The user's question is: "{question}"
Please format your answer in Markdown.
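A minimal sketch of how the two placeholders in that template might be filled in. The `buildPrompt` name, the per-document layout (title, URL, text), and the `---` separator between documents are assumptions for illustration. The slide only shows where the documents and the question go.

```javascript
// Fill the simplified prompt template with the retrieved documents and the
// user's question. Each document is expected to have title, url, and text.
function buildPrompt(documents, question) {
  // Hypothetical layout: one block per document, separated by "---".
  const context = documents
    .map((doc) => `${doc.title}\n${doc.url}\n${doc.text}`)
    .join("\n---\n");

  return [
    "You are a helpful friendly chatbot that wants developers to be successful when using",
    "the Plaid API. You are given a question and you should answer it if you can.",
    "It's okay to say you don't know if you don't know the answer.",
    "Feel free to include detail when relevant, and include code snippets if appropriate.",
    "Here is some context that might prove useful in answering the user's question:",
    "=========",
    context,
    "=========",
    `The user's question is: "${question}"`,
    "Please format your answer in Markdown.",
  ].join("\n");
}

// Usage with one made-up document.
const prompt = buildPrompt(
  [{ title: "Example title", url: "https://example.com/docs", text: "Example text." }],
  "How do I get started with Plaid Transactions?"
);
console.log(prompt);
```

The resulting string is what would be sent to the language model as the prompt.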
13. Build vs Buy
Reasons to build:
– Less expensive in the long run
– Fine control over everything
– Vendor lock-in
– More integrated experience
– You get to learn about LLMs
Reasons to buy:
– Faster time to market
– Dedicated engineering teams
– Less manual maintenance
17. ...but it can make your docs experience better
Synthesizing information across different pages
Better search than search
Rephrasing generic docs for a specific use case
19. But you can make their lives easier
"Help! Our main account owner left the company!"
"This call failed – can you look through your logs and tell us why?"
"Can you enroll me in this beta feature?"
"Something isn’t working as described in the docs. Help us locate the problem."
"Here's a question that's already answered in the documentation"
30. How can we make it smarter?
More isn't always better
31. How can we make it smarter?
Can we change the way we read code?
app.post("/server/create_new_user", async (req, res, next) => {
  try {
    const username = escape(req.body.username);
    const email = escape(req.body.email);
    const userId = uuidv4();
    const result = await db.run(
      `INSERT INTO users(id, username, email) VALUES("${userId}", "${username}", "${email}")`
    );
    console.log(`User creation result is ${JSON.stringify(result)}`);
    if (result["lastID"] != null) {
      res.cookie("signedInUser", userId, {
        maxAge: 900000,
        httpOnly: true,
        sameSite: "none",
        secure: "false",
      });
    }
    res.json(result);
  } catch (error) {
    next(error);
  }
});
This code is part of a server-side application written in Node.js that interacts with the Plaid API. It defines three functions: `getLoggedInUserId`, `getUserObject`, and an anonymous function that handles POST requests to the "/server/create_new_user" endpoint.
The `getLoggedInUserId` function takes a request object as an argument and returns the value of the "signedInUser" cookie. This function is used to identify the user who is currently logged in based on the cookies sent with the HTTP request.
The `getUserObject` function is an asynchronous function that takes a user ID as an argument. It queries a database for a user with the given ID and returns the result. This function is used to retrieve the details of a user from the database.
The anonymous function is a route handler for the "/server/create_new_user" endpoint. It is an asynchronous function that takes a request and response object, and a next function for error handling.
When a POST request is made to this endpoint, the function first sanitizes the
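The slide pairs a code sample with a prose description of it but does not spell out the pipeline. One possible reading is that a language model describes each code sample first, and that description is indexed alongside the code. The sketch below assumes that reading. Both function names and the prompt wording are hypothetical, not the ones used in the talk, and the call to the language model itself is left out.

```javascript
// Build a prompt asking a language model to explain a code sample in prose.
// The wording here is an assumption, not the prompt used in the talk.
function buildCodeSummaryPrompt(code, language) {
  return [
    `Describe in plain English what the following ${language} code does.`,
    "Mention the names of any functions and endpoints it defines.",
    "",
    code,
  ].join("\n");
}

// Combine the model's description with the original code into the text that
// would be embedded and stored. Keeping the code lets answers quote it.
function buildIndexableText(description, code) {
  return `${description.trim()}\n\nOriginal code:\n${code}`;
}

// Usage with a tiny made-up snippet and a made-up description.
const sampleCode = 'app.get("/ping", (req, res) => res.send("pong"));';
const summaryPrompt = buildCodeSummaryPrompt(sampleCode, "JavaScript");
const indexable = buildIndexableText(
  "  Defines a GET /ping endpoint that replies with the text pong.  ",
  sampleCode
);
console.log(summaryPrompt);
console.log(indexable);
```

The reasoning behind this design is that a user's question is written in prose, so a prose description of the code may sit closer to it in embedding space than the raw code does.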
32. How can we make it smarter?
Can we change the way we parse reference docs?
34. How can we make it smarter?
Probably the bigger problem...
MORTGAGES
ANNUAL PERCENTAGE RATES
PAYMENT DUE DATES
OVERDUE PAYMENTS
35. How can we make it smarter?
Probably the bigger problem...
SOME REFERENCE DOCS ABOUT A REST API
36. Other things to try, maybe...
Change the way we break up text
Try more, smaller chunks
Try fewer, larger chunks
Refine how we import text
Try different matching models
What if we try "three answers from the docs, three answers from the reference API"?
Updated embeddings API
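Experimenting with "more, smaller chunks" versus "fewer, larger chunks" is easier when chunk size is a parameter. Here is a minimal sketch of a word-based splitter. The `chunkText` name, counting by words rather than tokens, and the optional overlap between neighboring chunks are all assumptions. The slides do not say how the text was actually broken up.

```javascript
// Split text into chunks of at most `maxWords` words. `overlap` words are
// repeated between neighboring chunks so that a sentence cut at a boundary
// keeps some of its context in both chunks.
function chunkText(text, maxWords, overlap = 0) {
  if (maxWords <= 0 || overlap < 0 || overlap >= maxWords) {
    throw new Error("need maxWords > 0 and 0 <= overlap < maxWords");
  }
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  const step = maxWords - overlap;
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + maxWords).join(" "));
    // Stop once a chunk has reached the end of the text.
    if (start + maxWords >= words.length) break;
  }
  return chunks;
}

// Usage: the same text split two ways.
const sampleText = "a b c d e f g h i j";
console.log(chunkText(sampleText, 4));    // smaller chunks, no overlap
console.log(chunkText(sampleText, 4, 2)); // smaller chunks, 2-word overlap
```

Each chunk would then be embedded and stored separately, so changing `maxWords` changes both how many records exist and how much context each one carries.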
37. What's made the biggest impact?
Using newer models
Having good source material