How to Find Out if Your Idea Is Worth Building Before You Spend Anything
I built an AI-powered startup idea analyzer. Founders described their idea in plain text and got a structured report back: likely buyers, key risks, what to test next. Over 500 ideas ran through it.
The product still failed. Why? Because the feedback loop was broken from the start.
Founders got reports that sounded rigorous and went away feeling validated. The AI was too agreeable. The signal I was collecting — usage counts and "thank you" replies — was pointing in the wrong direction. The concern that should have killed the idea early: AI tools can't tell you whether people will pay, because they validate enthusiasm rather than challenging assumptions. If I'd designed a test to surface evidence against that concern, I'd have seen it in week two.
Instead I learned it after 500 users and months of work.
That's the problem this article is about. Not a shortage of validation tools — founders have more options than ever. The problem is picking a tool (or approach) before knowing what question it actually answers, collecting data that confirms you should build, and discovering six months later that the data was measuring the wrong thing.
Seven methods follow. Each one generates a specific signal. The signal only matters if you've defined, in advance, what you're trying to contradict.
Before you pick a tool: write down the concern
Before you run any test — before a single interview, before a landing page, before a Stripe payment link — write down the one thing that would kill your idea.
Not a general risk. The specific, falsifiable concern. The thing that, if true, means you shouldn't build.
- "People who have this problem are already paying someone else to handle it."
- "The price I need to charge is above what they'd actually spend in this category."
- "The problem only bothers people once a year — not often enough to build a habit around."
Then define what evidence against it would look like. Set the threshold ("if 6 out of 10 people I speak to...") and commit to it before the first conversation.
Setting the bar after you've collected the data is how you talk yourself into building something you shouldn't. The purpose of defining it early is to make the test honest.
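One way to make that commitment concrete is to write the concern and its threshold down as data before the first conversation. The sketch below is a minimal illustration, not a prescribed tool: the `KillCriterion` class, its field names, and the sample evidence wording are assumptions made up for this example, and the 6-out-of-10 bar simply reuses the figure above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the bar cannot be edited after it is set
class KillCriterion:
    """A concern plus the bar that evidence against it must clear."""
    concern: str
    evidence_against: str   # what a contradicting observation looks like
    min_contradicting: int  # how many such observations are needed
    sample_size: int        # how many conversations to run before judging

    def verdict(self, contradicting: int, interviewed: int) -> str:
        if interviewed < self.sample_size:
            return "keep testing"
        if contradicting >= self.min_contradicting:
            return "concern contradicted"
        return "concern stands"


# Hypothetical example values, written down before any interview happens.
criterion = KillCriterion(
    concern="People who have this problem are already paying someone else to handle it.",
    evidence_against="Handled it themselves last time and paid nobody.",
    min_contradicting=6,
    sample_size=10,
)

print(criterion.verdict(contradicting=3, interviewed=10))  # prints "concern stands"
```

Freezing the dataclass is a small design choice with a point: once results come in, the threshold can't be quietly lowered.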
Every tool below generates a signal. The relevant question is whether that signal contradicts your stated concern — not whether people seem interested.
1. Conversations first
The cheapest tool and the most consistently underused one.
Direct conversations with five to ten people who have the specific problem right now — not friends, not your professional network generally, people who've already been bitten by the problem. People who've tried to solve it and failed, or who've built workarounds that cost them something.
The mistake most founders make in these conversations is asking about the future:
- "Would you use this?"
- "Would you pay for this?"
- "Does this sound useful?"
These questions produce encouraging answers that mean nothing. People are polite. They'll tell you what sounds reasonable without really thinking about whether they'd change behavior.
Past-behavior questions work better:
- "How did you handle this the last time it came up?"
- "What did it cost you in time and money?"
- "What have you tried, and why did it stop working?"
- "Walk me through exactly what you did."
Past behavior is honest in a way that future intent isn't. People can't easily construct a flattering fiction about something that already happened.
Example
I built TinyHR in two evenings. A UAE HR compliance tracker — log renewal deadlines for employee documents, get alerts before anything lapsed, never get fined for a missed visa renewal again. The problem was real and I'd seen it firsthand: a yoga studio had received fines because an accountant missed renewal deadlines and nobody caught it.
Before I went further than a prototype, I sat down with business owners and asked how they actually handled compliance tracking. What they currently used. What had nearly slipped through in the last twelve months. What they'd do differently.
The answers killed the idea quickly.
The market already had two working solutions. Business owners who managed their own compliance used the government portals — Ministry of Human Resources, ICP — which display expiry dates, send reminders, and handle renewals directly. Business owners who didn't want to manage compliance themselves hired PRO service companies to handle all of it. TinyHR was solving a problem that was already solved, in two different ways, for two different types of companies.
A third structural issue emerged from the conversations: the government portals don't expose APIs. Any tool that needed to pull renewal data would require manual entry from the user. The most friction-heavy part of the product was unavoidable by design.
Two evenings of building. A handful of conversations. Clear result. That's not a failure — the method worked.
Best for
"I don't know whether this problem is real enough that people would pay to have it solved."
2. The napkin
For offline businesses — a property management company, a medical clinic, a real estate agency, any business where the problem is operational and the decision-maker is in the room — the most effective pre-build validation tool isn't digital.
You sit across from the business owner. You ask about their current workflow. You pull out a piece of paper and draw what the solution looks like. Then you ask if they'd pay for it.
It sounds informal because it is. That informality is the point.
Example
The real estate agency that became RealEstateCRM ran their listings on a shared spreadsheet. Agent activity happened over WhatsApp. When a deal progressed, someone had to remember to update the spreadsheet — or it didn't get updated. When a property moved from available to under offer, the change lived in someone's head until the next time they opened the file.
The conversation before any code started with workflow questions.
- How does a new listing get added?
- How do you know what's available right now without calling someone?
- When a client asks about a specific property, what do you actually do?
The answers described a set of manual steps that could fail at any point and often did.
I drew the core flow on paper. Properties, statuses, a client record attached to each viewing. A single place where both the agents and the manager could see what was happening without opening two different WhatsApp threads and a spreadsheet. The agency director looked at the sketch and started pointing at parts of it: "This part would save us time. This we don't need. What happens if two agents are showing the same property to different clients?"
Those interruptions are the signal worth listening for. A business owner who is already problem-solving in the sketch is already imagining using it. They've moved from "sounds interesting" to "how does this handle my specific situation" — and that shift is real.
The first version, a simple apartments database, was scoped in that conversation and delivered in two weeks. It went immediately into production. The first feedback came from people actually using it: they needed a customers database too, so the product could actually function as a CRM. That came in the next two weeks. Then came months of near-daily changes driven by real usage. It's still running.
What the napkin conversation tests that other methods don't:
It shows whether the business owner can see themselves using a specific solution. A drawing is concrete enough to react to. "I wouldn't need this part" and "what happens when..." only come out when there's something to push against. An abstract description of a product produces polite interest. A sketch produces objections — and objections are useful data.
It surfaces scope constraints in real time. During the drawing, the business owner tells you what matters and what doesn't. Every constraint that surfaces in that conversation shapes the build before you start, which is where shaping costs nothing.
And if you ask for a deposit before you write the first line of code, and they pay it, the concern about willingness to pay is closed. You don't need a Stripe experiment when the deposit is already in your account.
Best for
Offline and traditional businesses with a visible, operational problem. When you can sit across from the decision-maker and point at the specific workflow that's broken.
3. Landing page + email capture
The most commonly recommended pre-build validation step. Also the most commonly misread.
A landing page tells you whether people will stop scrolling when they see your promise. That's a real signal — it tests framing, headline clarity, who the positioning speaks to. A 2% signup rate on cold traffic means something different than a 12% rate. It tells you whether the copy is working.
What it doesn't tell you: whether those people will pay, whether they'll change their existing workflow, or whether they'll still be using the product in month three.
Email signups measure curiosity. Curiosity is not the same as willingness to pay.
The landing page is a copywriting test more than a demand test. Use it to find the sharpest version of your pitch. Run the headline that got 3% against the one that got 9% and learn something real about positioning. Don't look at the total signup count and call it validation.
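Before trusting a gap like 3% versus 9%, it helps to check whether the traffic was large enough for the difference to be more than noise. The sketch below uses a standard two-proportion z-test with only the Python standard library. The visitor counts (200 per headline) are an assumption for illustration, the function name is made up, and the normal approximation is rough when signup counts are this small, so treat the result as a sanity check rather than proof.

```python
import math


def two_proportion_z(signups_a, visitors_a, signups_b, visitors_b):
    """Return (z, two-sided p-value) for the difference between two signup rates."""
    rate_a = signups_a / visitors_a
    rate_b = signups_b / visitors_b
    # Pooled rate under the assumption that both headlines perform the same.
    pooled = (signups_a + signups_b) / (visitors_a + visitors_b)
    if pooled in (0.0, 1.0):
        raise ValueError("No variation in outcomes; the test is undefined.")
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed traffic: 200 cold visitors per headline, 6 signups (3%) vs 18 signups (9%).
z, p_value = two_proportion_z(6, 200, 18, 200)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With much less traffic per headline, the same percentages could easily come from chance. Either way, the test only says which framing pulls better, not whether anyone will pay.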
When it's genuinely useful
Consumer or prosumer ideas where awareness is the bottleneck. If the reason people don't have a solution is that they don't know one exists, the landing page tells you whether your framing closes that gap. If the reason is that they won't pay for a solution, the landing page won't surface that.
4. Stripe payment link (pre-sell)
The upgrade from the smoke test. Put a real price on it before you've built it.
Stripe, Gumroad, Lemon Squeezy — a payment link takes an afternoon to set up. Post it where your target user already spends time. Be transparent: you'll refund if you don't build. Ask them to pay anyway.
This model is old. Kickstarter has run on it for fifteen years. Early SaaS founders used it before Stripe existed. You're asking people to commit money behind something that doesn't exist yet — which means only the people who actually feel the problem will do it.
The filter is brutal. "Sounds interesting" doesn't convert. "I've been looking for exactly this" does.
Where to run a pre-sell: the niche communities where your specific user already congregates. A Slack group for independent real estate agents. A subreddit where indie developers compare tools. The more specific the community, the more meaningful the signal — cold traffic on a broad ad produces clicks from people whose motivation you can't read. Start with the people most likely to have the problem right now, not the largest audience you can find.
The pre-sell also gives you a number to calibrate against before you write code. Ten people paying $49 upfront is a different signal than ten people clicking a "learn more" button. The size of the commitment matters, not just the count.
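The arithmetic behind that comparison is simple enough to write down. This is a minimal sketch; the `Signal` class and its field names are hypothetical, and the only figures taken from the text are the ten people and the $49 price.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    kind: str           # what people did
    count: int          # how many people did it
    amount_each: float  # dollars each person actually committed

    @property
    def committed(self) -> float:
        return self.count * self.amount_each


presell = Signal("pre-sell payment", count=10, amount_each=49.0)
clicks = Signal("learn-more click", count=10, amount_each=0.0)

for signal in (presell, clicks):
    print(f"{signal.kind}: {signal.count} people, ${signal.committed:.2f} committed")
# pre-sell payment: 10 people, $490.00 committed
# learn-more click: 10 people, $0.00 committed
```

Same head count, very different commitment.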
Best for
B2B or prosumer ideas where you're confident the problem is real but uncertain whether people will pay — or uncertain about the price point they'll accept.
5. Concierge MVP (deliver the value by hand first)
Most founders skip this because it doesn't feel like product development. That's why it works.
Build nothing. Deliver the value manually. Charge for it.
If you're building a tool that generates competitive analysis reports, write the first five by hand in Google Docs, charge a fair price, and deliver them on a clear timeline. The customers get a real output. You get paid while learning what "done" actually means.
What the concierge MVP reveals that no other tool does: the definition of the work itself. Most founders don't know what they're automating until they've done it by hand enough times to see the pattern. The manual version surfaces edge cases, client preferences, and "I actually need X not Y" moments that would otherwise appear as change requests six weeks into a build.
Example
The Automator engagement didn't start with automation. It started with a two-day conversation about a music publishing pipeline that took four hours every day to run manually — seventeen steps, mostly file renaming and folder organization, moving content across platforms, managing uploads across multiple WordPress sites, tracking what had published and what hadn't.
That two-day conversation was the concierge phase. We mapped the workflow step by step. What happened when a file landed in the wrong folder. What happened when an upload failed and nobody noticed for a day. The difference between a priority release and a routine one. What the client actually checked to confirm something had gone right. Every answer shaped what the automation needed to do.
When I built it, I didn't start with all seventeen steps. I started with the highest-friction point — file renaming and folder organization, the most manual and the most error-prone — and built just that. Deployed it. Watched whether my understanding of the process matched what actually happened.
It did. The rest followed quickly.
That sequencing is the concierge approach in practice: map the work manually, build the smallest slice that proves the diagnosis is right, then expand. The Automator has been running for years. It now handles 100–150 posts across multiple sites and takes under ten minutes a day. The accuracy of that build came from starting with the manual map, not from the code.
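A first slice like that can stay very small. The sketch below is not the Automator's code: the naming convention, the folder layout, and the function names are all hypothetical, chosen only to show the shape of a renaming step that can be checked against the manual map before it touches anything.

```python
from pathlib import Path
import re
import tempfile


def plan_renames(inbox: Path, library: Path) -> list:
    """Return (source, destination) pairs without moving any file (a dry run)."""
    plan = []
    for src in sorted(inbox.iterdir()):
        if not src.is_file():
            continue
        # Hypothetical convention: lowercase, non-alphanumeric runs become "-".
        stem = re.sub(r"[^a-z0-9]+", "-", src.stem.lower()).strip("-") or "untitled"
        # Hypothetical layout: one folder per file type, e.g. library/mp3/.
        folder = src.suffix.lower().lstrip(".") or "unknown"
        plan.append((src, library / folder / f"{stem}{src.suffix.lower()}"))
    return plan


def apply_plan(plan) -> None:
    """Carry out a reviewed plan, refusing to overwrite existing files."""
    for src, dst in plan:
        dst.parent.mkdir(parents=True, exist_ok=True)
        if dst.exists():
            raise FileExistsError(dst)  # surface collisions instead of hiding them
        src.rename(dst)


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    inbox, library = Path(tmp, "inbox"), Path(tmp, "library")
    inbox.mkdir()
    (inbox / "My Track (Final).MP3").touch()
    plan = plan_renames(inbox, library)
    for src, dst in plan:
        print(src.name, "->", dst.relative_to(library))
    apply_plan(plan)
```

Splitting the plan from the apply step mirrors the sequencing above: compare the plan with what the person doing the work would have done by hand, and only then let it run.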
Best for
Service-shaped ideas — anything where the value is an output (a report, a processed file, an analyzed dataset, a curated list) rather than the software experience itself. Also useful when you're not yet certain whether the automation is the value or the output is.
6. ProductHunt-like platforms
ProductHunt is a discovery platform built around tech-aware early adopters who try new products and have opinions about them.
A strong launch day generates visibility, a spike in signups, and a round of structured feedback from people who've seen a lot of products. That's useful at the right moment.
The honest version of what it tells you: your idea has appeal to a tech-curious audience. It does not tell you whether your actual target market will pay.
Products that go #1 on ProductHunt and disappear six months later are common enough to be a pattern worth naming. The audience that shows up on launch day is not always the customer. A productivity tool that engineers love might not work for the operations managers it was built for. A developer tool gets upvoted by developers who are flattered someone built it — not by the procurement managers who sign the contracts.
ProductHunt is most useful when your target user IS the ProductHunt audience — makers, indie developers, early-stage founders — or when you've already validated demand through conversations and a pre-sell, have a working version, and want initial visibility and a fast round of structured public feedback.
Used at the right moment, it's valuable. Used as a substitute for demand validation, it produces an encouraging number and then silence.
7. Wizard of Oz / fake door
Build the interface. Don't build the backend.
The user clicks "Generate Report." You generate the report manually and send it within the hour. The experience looks automated. The fulfillment isn't.
This separates two questions that are easy to conflate: "will they want the output?" and "will they find their way through the interface to get it?" A user who wants the output might still abandon a confusing flow. The Wizard of Oz catches that before you've automated anything.
The limitation is obvious: this only works for a small number of early users. It's a prototype validation method, not a scalable one. The useful version is running ten people through a specific interface flow, watching where they hesitate or drop off, and treating that as data about the interface rather than about demand. The friction you observe is more honest than anything they'd tell you in an interview about what they'd find confusing.
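Behind the interface, a Wizard of Oz setup can be little more than a queue that a person works through. The sketch below is one possible shape under stated assumptions: the class and method names are hypothetical, storage is in memory, and the one-hour promise is taken from the example above. A real version would sit behind whatever the "Generate Report" button calls.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional
import itertools


@dataclass
class ReportRequest:
    request_id: int
    user_email: str
    prompt: str
    requested_at: datetime
    report: Optional[str] = None
    fulfilled_at: Optional[datetime] = None


class WizardOfOzBackend:
    """Looks automated from the outside; a person writes every report."""

    def __init__(self, promised_within: timedelta = timedelta(hours=1)):
        self.promised_within = promised_within
        self._ids = itertools.count(1)
        self._requests = {}

    def request_report(self, user_email: str, prompt: str, now: datetime) -> int:
        """Called when the user clicks the button; only queues the work."""
        req = ReportRequest(next(self._ids), user_email, prompt, now)
        self._requests[req.request_id] = req
        return req.request_id

    def pending(self) -> list:
        """The operator's to-do list."""
        return [r for r in self._requests.values() if r.report is None]

    def fulfill(self, request_id: int, report: str, now: datetime) -> bool:
        """Attach the hand-written report; return True if the promise was kept."""
        req = self._requests[request_id]
        req.report, req.fulfilled_at = report, now
        return req.fulfilled_at - req.requested_at <= self.promised_within


backend = WizardOfOzBackend()
start = datetime(2024, 1, 1, 9, 0)
rid = backend.request_report("founder@example.com", "Competitive analysis", now=start)
on_time = backend.fulfill(rid, "Hand-written report text", now=start + timedelta(minutes=40))
print(on_time, len(backend.pending()))  # prints "True 0"
```

Recording request and fulfillment times also shows how long the manual version can keep its promise, which is a practical limit on how many early users it can serve.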
Best for
When the UX itself is the hypothesis — before you've built the logic behind the interface.
How to pick
The question you're trying to answer determines the tool.
| Your concern | The test that addresses it |
|---|---|
| Does this problem actually bother people enough to act on? | Conversations first |
| Will an offline decision-maker commit to a specific solution? | The napkin |
| Is my framing sharp enough to make people stop and look? | Landing page |
| Will people pay for this, not just say they'd consider it? | Stripe payment link |
| What does solving this actually mean in practice? | Concierge MVP |
| Will a tech-aware early adopter audience notice this exists? | ProductHunt |
| Will people navigate the interface without confusion? | Wizard of Oz |
Most ideas need more than one. The sequence usually follows the table from top to bottom: start with conversations to confirm the problem is real, use the result to sharpen the framing, test the framing with a landing page or pre-sell, then deliver the first version concierge-style to the people who paid.
The temptation is to skip to the middle — build a landing page, call it validation, start coding. The landing page is useful, but it answers a different question than the conversation does. Stack the tests so each one builds on what the previous one found.
The concern that runs through all of them
No matter which tool you pick, the signal only matters if you've stated the concern it's designed to contradict — and set the threshold in advance.
CheckMVP, the AI idea analyzer from the opening, failed because I never stated the right concern. I ran the tool as a hypothesis-validation instrument: measuring usage, counting return visits, asking whether founders found the reports helpful. The concern I should have been testing was specific: can AI-generated analysis produce enough disagreement to actually change a founder's direction? If I'd stated that concern explicitly and designed the test to find evidence against it, the answer would have come back in the first two weeks. The AI was too agreeable. Reports validated what founders already believed rather than challenging it.
After 500 uses I had exactly the data I'd designed to collect. None of the data that would have told me whether the core premise worked.
Write the concern before you pick the tool. Define what evidence against it looks like. Set the threshold before you start — then hold the line when the results come in.
The rest is choosing which signal to collect.
If you're at the pre-build stage — an idea, a problem you've seen firsthand, and a question about whether it's worth the build — that's where the MVP development engagement starts. Concern first. Evidence second. Scope third.