There is promise in what modern-style AI code tools can do to make life easier for engineers and other technically-minded people as a companion to their knowledge and experience. They can also work well for vibe coding a proof of concept to explore ideas. If you’re non-technical and vibe coding some software, please ask at least one technical friend to take a look, even casually, at the generated code or the final website/app.
Disclaimers
- I won’t share who the founder or the startup is; it isn’t relevant.
- No live end-consumer data was disclosed.
- This is meant as an illustrative example, not the one specific thing you need to watch out for. Don’t just tell ChatGPT to check for this vulnerability and think you’re 100% safe.
Context
Knowing the promise of modern AI agents in making it easier to write code, I’ve been curious to see how people are using it. Engineers on my team have liked using it as a companion, filling skills gaps to help keep the work moving. I’ve been exploring it for certain development work, too. But what are startup founders doing? In particular, what are non-technical founders doing?
Non-technical founders are doing more vibe coding, and I wanted to know what kind of code is getting deployed, potentially into production, and potentially in more secure data contexts.
The Target
The target is a B2B2C web application that processes and stores PII, and also provides a front-end for authenticated business accounts to view a scoped set of that PII. It includes one demo account which has access to demo data for one business. Other than the demo business account, I did not get access to the codebase, application db, or app backend. I had only the knowledge and interface that a regular user would have. I’d call it “informal black box testing” to explore how LLM-generated systems work, not explicitly to find any problems.
The Method
Keeping this brief, rather than telling a whole story.
I loaded the public-facing website, and found that it makes a series of requests to load and then display user data for the demo business account. The request sequence looks like:
1. GET index
2. GET companies?select=*&id=eq.demo_guid
3. GET users?select=*,user_contact:user_contacts(email,phone_number,zip_code,birth_year,name)&company_id=eq.demo_guid
4. GET companies?select=id,demo_enabled&slug=eq.company_demo_name
5. GET companies?select=*&id=eq.company_demo_guid
6. GET users?select=*,user_contact:user_contacts(email,phone_number,zip_code,birth_year,name)&company_id=eq.company_demo_guid
7. GET users?select=*,user_contact:user_contacts(email,phone_number,zip_code,birth_year,name)&company_id=eq.company_demo_guid
8. GET companies?select=*&id=eq.company_demo_guid
Some funny notes:
- The page fetches the data for a ‘demo’ and immediately discards it. This was probably part of some early demo/proof of concept work, before support was added for multiple companies, with each company having a demo_enabled state.
- The page fetches information about the demo company multiple times (5, 8), and also fetches information about all users multiple times (6, 7). It is not clear why.
More important notes:
- Requests all use PostgREST, with the full query being built by the web front-end itself.
- The table name is the URL base (users, companies, etc.).
- The where clauses are each URL parameter after the select (e.g. company_id=eq.company_demo_guid).
- All the data being returned is highly-structured JSON.
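To make the notes above concrete, here is a minimal Python sketch of how one of these URLs maps onto a table, a column list, and where clauses. The host name api.example.invalid and the function name describe_request are made up for illustration; only the query shape comes from the requests listed earlier. It does no network I/O and just splits the URL apart.

```python
from urllib.parse import parse_qsl, urlsplit

# Query parameters that act as modifiers rather than column filters.
MODIFIERS = {"order", "limit", "offset"}


def describe_request(url: str) -> dict:
    """Split a PostgREST-style URL into table, selected columns, and filters.

    Hypothetical helper for illustration; it is not part of any library.
    """
    parts = urlsplit(url)
    # The last path segment is the table name (users, companies, etc.).
    table = parts.path.rstrip("/").rsplit("/", 1)[-1]
    select = "*"
    filters = []
    modifiers = {}
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "select":
            select = value
        elif key in MODIFIERS:
            modifiers[key] = value
        else:
            # Filters look like column=operator.operand,
            # e.g. company_id=eq.company_demo_guid
            operator, _, operand = value.partition(".")
            filters.append((key, operator, operand))
    return {
        "table": table,
        "select": select,
        "filters": filters,
        "modifiers": modifiers,
    }


# The host below is a placeholder, not the real application.
example = describe_request(
    "https://api.example.invalid/users?select=*&company_id=eq.company_demo_guid"
)
print(example)
```

The point of the sketch is that everything a SQL query would need (table, columns, where clause) is chosen by the browser, so whoever controls the browser controls the query.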
There’s a DB-like interface exposed to the front end. This is the easiest and first thing to check. Initially I tried using Firefox’s ‘Edit and Resend’ feature to alter the requests and change the parameters, but making requests out of order yielded strange responses: Firefox would say 3.34kB transferred (0B in size). Only requests made in the expected order would return results.
Rather than trying to modify the JavaScript powering the front-end, I used the free and open source Requestly extension to rewrite the requests on the fly. I set up multiple rules to intercept those calls and change the parameters.
A little back-end validation to work around: limit 1. There was some back-end validation to limit certain queries to only return one result, and the server returns an HTTP error with information about that limit. This meant I couldn’t get a full list of every company in one query. I just worked around this by adding an offset=1 parameter to the PostgREST query, and incremented the offset with every call. If there were more than a few dozen customers, I would have automated this or found a faster way to pull this information.
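The offset workaround is ordinary pagination, and a rough Python sketch shows why a one-row limit on its own does not protect a table. The names page_url and walk_rows, the placeholder host, and the in-memory fake_fetch stand-in are all assumptions for illustration. I did this by hand with rewrite rules, not with a script, and the sketch makes no real requests.

```python
from typing import Callable, Iterator, List


def page_url(base_url: str, offset: int, limit: int = 1) -> str:
    """Append limit/offset parameters to a URL that may already have a query."""
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}limit={limit}&offset={offset}"


def walk_rows(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 1,
    max_pages: int = 1000,
) -> Iterator[dict]:
    """Yield rows one page at a time until an empty page comes back.

    fetch_page(offset, limit) is supplied by the caller; max_pages is a
    safety cap so the loop always terminates.
    """
    offset = 0
    for _ in range(max_pages):
        rows = fetch_page(offset, page_size)
        if not rows:
            break
        yield from rows
        offset += len(rows)


# In-memory stand-in for a server that only ever returns one row per call.
fake_table = [{"id": 1}, {"id": 2}, {"id": 3}]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return fake_table[offset:offset + limit]


all_rows = list(walk_rows(fake_fetch))
print(all_rows)
print(page_url("https://api.example.invalid/companies?select=*", offset=2))
```

A per-request row cap slows a reader down but does not decide which rows that reader is allowed to see; that decision has to be made on the server for each row.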
This method allowed me to modify queries to return arbitrary data (with a different/no where clause), and to return the count on any table in the db, and retrieve rows, either in bulk or one at a time.
The Findings
This simple method in this case was enough, without ever having seen a line of the app code, to:
- Get a full list of all companies, whether they had a demo mode or were in production.
- Get a listing of all individual users that shared their information with this app, including their names, emails, locations, and highly-personal details.
- Get data from any other table in the database available to the db user – I didn’t find anything too interesting though. At least not as interesting as the two previous bullet points.
Other than that I also found:
- A semi-functional admin user signup/account creation page
- A password reset function
- A number of redundant calls and data processing functions, probably abandoned by the LLM while iteratively building the codebase.
The Follow-ups and My Recommendations
I did connect with my friend to let them know what I found, and how easy it was. No code-level exploits involved. Minimal technical knowledge. No knowledge of how the back-end was implemented. And thanked them for the opportunity to learn a lot in the process. Thankfully they hadn’t gone live with real people’s data yet, and I offered to take a look before they do, as a friendly courtesy. They’ll take me up on that offer :-)
What should you do? At minimum, get a friend with technical knowledge, someone you trust, to take a look before you launch/publish/go to production. Ideally leverage the LLM as a way to learn more yourself about what your codebase is and what it does. If you’re staking your reputation and your company on it, someone, at some level should understand what it is and what it does.
— Varun
Who’s Varun? I’ve been in Product at companies like CLEAR and Noom, in Hotel Tech with ALICE (now Actabl), and I was previously founder of an HR tech startup, Disqovery. I have worn many hats, and I like making things. I advise startups and early-stage founders. I also like talking business. You can reach me at me@varunmehta.com, Mastodon, Github, and LinkedIn.