I Built an AI That Watches the Market for Me
I wanted a system that would scan public data sources, identify market opportunities, score them by confidence, and let me explore the results from a dashboard. Not a SaaS product I pay $200/month for. Something I own. Running on my own infrastructure. Trained on my own interests.
So I built Innovation Scout. It pulls from 7 data sources, runs opportunity detection with confidence scoring, performs 4-dimension risk assessment, and serves everything through 9 API routes and 6 UI pages. It's integrated into ArgoBox as a module.
199 tests passing. Zero TypeScript errors. Migrated from a standalone repo to the monorepo in one session.
What Innovation Scout Does
At its core, Innovation Scout is a market intelligence engine. You give it a query like "AI healthcare trends" or "edge computing startups" and it:
- Fetches data from 7 sources simultaneously
- Aggregates the results into a unified format
- Detects opportunities with confidence scores
- Assesses risk across 4 dimensions
- Generates ideas based on cross-domain patterns
- Personalizes results based on your feedback history
The output is a structured view of what's happening in a market. Not just links to articles. Scored, categorized, risk-assessed intelligence that I can filter, sort, and drill into.
The 7 Data Sources
| Source | What It Provides |
|---|---|
| Reddit | Community sentiment, trending topics, problem discussions |
| Hacker News | Technical trends, startup launches, developer tools |
| USPTO Patents | Recent patent filings, technology directions |
| Yahoo Finance | Market data, sector performance, company metrics |
| NewsAPI | Global news coverage, event tracking |
| Crunchbase | Startup funding rounds, acquisitions, market moves |
| Cross-domain | Fusion analysis combining signals across sources |
Each source has its own fetcher module. They run in parallel and return normalized data. The aggregator merges everything and deduplicates.
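The parallel-fetch-then-deduplicate flow can be sketched roughly like this. The names (`SignalItem`, `aggregate`, the URL-keyed dedupe) are illustrative assumptions, not Innovation Scout's actual API:

```typescript
interface SignalItem {
  source: string;    // e.g. "hn", "newsapi"
  url: string;       // canonical link, used here as the dedupe key
  title: string;
  fetchedAt: number; // epoch ms
}

type Fetcher = (query: string) => Promise<SignalItem[]>;

// Run every fetcher in parallel; a failing source drops out
// instead of failing the whole analysis.
async function aggregate(fetchers: Fetcher[], query: string): Promise<SignalItem[]> {
  const settled = await Promise.allSettled(fetchers.map((f) => f(query)));
  const items = settled.flatMap((r) => (r.status === "fulfilled" ? r.value : []));

  // Deduplicate by URL, keeping the freshest copy of each item.
  const byUrl = new Map<string, SignalItem>();
  for (const item of items) {
    const existing = byUrl.get(item.url);
    if (!existing || item.fetchedAt > existing.fetchedAt) byUrl.set(item.url, item);
  }
  return [...byUrl.values()];
}
```

`Promise.allSettled` is the key choice here: one flaky API should degrade the result set, not kill the run.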
The cross-domain fusion is where it gets interesting. When a patent filing mentions a technology that's also trending on Hacker News and showing funding activity on Crunchbase -- that's a stronger signal than any single source alone. Innovation Scout detects these cross-source correlations automatically.
Two modes of operation. Heuristic mode runs without any API keys. It uses pattern matching, keyword extraction, and statistical analysis. Good enough to be useful, fast, and free. AI-powered mode activates when you provide an Anthropic, OpenAI, or OpenRouter key. That adds LLM analysis for deeper pattern recognition, natural language summaries, and smarter confidence scoring.
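The mode switch is simple in principle: any provider key flips the engine into AI-powered mode, otherwise heuristic analysis runs keyless. A minimal sketch (the function name and config shape are assumptions):

```typescript
type AnalysisMode = "heuristic" | "ai";

interface ProviderKeys {
  anthropic?: string;
  openai?: string;
  openrouter?: string;
}

// Heuristic mode is the keyless default; any configured LLM
// provider key upgrades the run to AI-powered analysis.
function selectMode(keys: ProviderKeys): AnalysisMode {
  return keys.anthropic || keys.openai || keys.openrouter ? "ai" : "heuristic";
}
```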
Opportunity Detection
This is the main feature. Given aggregated market data, the engine identifies specific opportunities and assigns each one a confidence score.
Confidence scoring factors:
- Signal strength -- how many sources report the same trend
- Signal freshness -- how recent the data points are
- Signal diversity -- whether it appears across different source types
- Momentum -- is the trend accelerating or decaying
- Market size indicators -- funding amounts, patent density, media coverage
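One plausible way to combine those five factors is a weighted sum; the weights below and the factor names are my illustrative assumptions, not the engine's actual coefficients:

```typescript
interface ConfidenceFactors {
  signalStrength: number; // 0-1: how many sources report the same trend
  freshness: number;      // 0-1: how recent the data points are
  diversity: number;      // 0-1: spread across different source types
  momentum: number;       // 0-1: accelerating (1) vs decaying (0)
  marketSize: number;     // 0-1: funding amounts, patent density, coverage
}

// Assumed weights; a real engine would tune these empirically.
const WEIGHTS: Record<keyof ConfidenceFactors, number> = {
  signalStrength: 0.3,
  freshness: 0.2,
  diversity: 0.2,
  momentum: 0.2,
  marketSize: 0.1,
};

function scoreConfidence(f: ConfidenceFactors): number {
  const raw = (Object.keys(WEIGHTS) as (keyof ConfidenceFactors)[])
    .reduce((sum, k) => sum + WEIGHTS[k] * f[k], 0);
  return Math.round(raw * 100) / 100; // two decimals, e.g. 0.87
}
```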
An opportunity might look like: "AI-powered medical imaging -- 3 recent patents, trending on HN, $40M Series B on Crunchbase, confidence: 0.87."
The opportunities page lets me filter by confidence threshold, source, category, and time range. I can sort by any dimension. Click into an opportunity and see the underlying data points that generated it.
4-Dimension Risk Assessment
Every opportunity also gets a risk profile across 4 dimensions.
Technical Risk: How hard is this to build? Does it require breakthroughs or is it engineering work?
Market Risk: Is there proven demand? Who are the competitors? Is the timing right?
Financial Risk: What's the capital requirement? What's the burn rate? When does it break even?
Execution Risk: Can a small team do this? What's the talent requirement? Are there regulatory hurdles?
Each dimension gets a score from 0 to 1. The risk page shows radar charts and gauge visualizations for each assessed opportunity. High technical risk but low market risk? Probably worth exploring. High across all four? Walk away.
The risk assessment uses a different analysis pipeline than opportunity detection. It specifically looks for counter-signals. Competitor density, regulatory filings, patent litigation, negative sentiment threads. If opportunity detection is "what's promising," risk assessment is "what could go wrong."
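The "explore vs walk away" logic from above can be sketched as a simple triage over the four dimension scores. The `0.7` threshold and verdict names are assumptions for illustration:

```typescript
interface RiskProfile {
  technical: number; // each dimension scored 0 (low risk) to 1 (high risk)
  market: number;
  financial: number;
  execution: number;
}

type Verdict = "explore" | "caution" | "walk-away";

function triage(r: RiskProfile): Verdict {
  const HIGH = 0.7; // assumed cutoff for a "high" dimension
  const dims = [r.technical, r.market, r.financial, r.execution];
  const highCount = dims.filter((d) => d >= HIGH).length;
  if (highCount === 4) return "walk-away"; // high across all four
  if (highCount <= 1) return "explore";    // e.g. hard to build, but proven demand
  return "caution";
}
```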
The Architecture
Innovation Scout follows ArgoBox's 3-tier pattern. This matters for portability.
Tier 1: Core Engine (packages/innovation-scout/)
Pure TypeScript. Zero framework dependencies. Contains:
- `analysis/` -- AI analysis providers, opportunity detection, trend identification, streaming
- `data/` -- 6 data source fetchers + aggregator
- `risk/` -- 4-dimension risk scoring engine
- `personalization/` -- User profiles, recommendations, feedback loop
- `config/` -- Service registry, environment helpers
- `db/` -- Schema, D1 adapter, in-memory adapter
- `utils/` -- ID generation utilities
This is the forkable open-source product. Anyone can take this package, wrap it in their own frontend, and run it. It works in Node, Deno, Cloudflare Workers, or any JS runtime.
Tier 2: Adapters
Config-injectable factories: `createDataFetcher({ newsApiKey: '...' })`. No environment assumptions. No framework coupling. You pass config in, you get a working instance out.
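The factory pattern looks roughly like this. The config keys mirror the `createDataFetcher` example, but the exact signature and source names are illustrative:

```typescript
interface FetcherConfig {
  newsApiKey?: string;
  crunchbaseKey?: string;
}

interface DataFetcher {
  activeSources(): string[];
}

// Config in, working instance out. No process.env reads, no framework
// imports -- which is what makes the engine portable across runtimes.
function createDataFetcher(config: FetcherConfig = {}): DataFetcher {
  // Free sources always run; keyed sources activate only when configured.
  const sources = ["reddit", "hackernews", "uspto", "yahoo-finance"];
  if (config.newsApiKey) sources.push("newsapi");
  if (config.crunchbaseKey) sources.push("crunchbase");
  return { activeSources: () => [...sources] };
}
```

The Tier-3 module's only job is to read real keys from its environment (CF Pages bindings, in ArgoBox's case) and pass them into factories like this one.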
Tier 3: ArgoBox Module (modules/innovation-scout/)
The Cloudflare Workers + Astro integration layer. Reads API keys from CF Pages environment bindings. Provides the admin UI pages. Handles rate limiting, caching, SSE streaming, and database wiring.
This tier is ArgoBox-specific. It's how the engine plugs into the admin dashboard. But the engine itself doesn't know or care about Cloudflare.
The UI: 6 Pages
Dashboard (/admin/innovation-scout/): The home base. Analyze form at the top -- type a query, hit "Run Analysis." Getting started guide for first-time users. Navigation tiles to other pages. Latest analysis results.
Opportunities (/admin/innovation-scout/opportunities): Browse and filter detected opportunities. Sort by confidence, recency, source. Click into any opportunity for detail view with supporting data points.
Ideas (/admin/innovation-scout/ideas): Idea generator based on detected patterns. This one is still a stub. The API endpoint exists, the data model is defined, but the full implementation is next on the list.
Risk (/admin/innovation-scout/risk): 4-dimension risk gauges. Visual breakdown of technical, market, financial, and execution risk for any assessed opportunity.
Matrix (/admin/innovation-scout/matrix): Feasibility vs. impact chart. Scatter plot where each opportunity is a dot positioned by how feasible it is and how much impact it could have. Top-right quadrant is where the gold lives.
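The quadrant placement behind that scatter plot reduces to two thresholds. The `0.5` midpoints and quadrant labels here are my assumptions, not the page's actual buckets:

```typescript
type Quadrant = "pursue" | "long-bet" | "quick-win" | "ignore";

// Classify an opportunity by its position on the feasibility/impact
// plane (both scored 0-1). Top-right is where the gold lives.
function quadrant(feasibility: number, impact: number): Quadrant {
  const hiF = feasibility >= 0.5;
  const hiI = impact >= 0.5;
  if (hiF && hiI) return "pursue";    // top-right: feasible and high-impact
  if (!hiF && hiI) return "long-bet"; // high impact, hard to execute
  if (hiF && !hiI) return "quick-win"; // easy, modest payoff
  return "ignore";
}
```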
Settings (/admin/settings/innovation-scout): API key configuration, data source toggles, analysis preferences. Configure which sources are active, set your default query parameters, manage AI provider selection.
All 6 pages use CosmicLayout, which gives them the ArgoBox admin sidebar. They feel native to the platform, not bolted on.
The 9 API Routes
| Route | Method | What It Does |
|---|---|---|
| `/api/admin/scout/analyze` | POST | Run a full analysis on a query |
| `/api/admin/scout/assess-risk` | POST | 4D risk scoring for an opportunity |
| `/api/admin/scout/feedback` | POST/GET | Submit or retrieve user feedback |
| `/api/admin/scout/ideas` | POST/GET | Create or list generated ideas |
| `/api/admin/scout/jobs` | POST/GET | Async job management |
| `/api/admin/scout/opportunities` | GET | List and filter opportunities |
| `/api/admin/scout/recommendations` | GET | Personalized recommendations |
| `/api/admin/scout/sources` | GET | Data source status and health |
| `/api/admin/scout/stream` | GET | SSE streaming for live analysis |
All routes share a utilities module (`_utils.ts`) that handles rate limiting, caching, SSE setup, database wiring, and core engine instantiation. One file, all the shared infrastructure. Clean.
The streaming endpoint is designed but not yet wired to the frontend. The plan is real-time analysis updates as sources return data. Watch the opportunities populate live instead of waiting for the full batch.
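One way such an endpoint could work on Workers is a `ReadableStream` of SSE events, emitted as source results come back. This is a sketch of the idea, not the actual route: event names, payload shape, and the sequential `await` (in-order rather than as-resolved) are all simplifying assumptions:

```typescript
// Stream one SSE event per data source, then a terminal "done" event.
function streamAnalysis(
  sourceResults: Promise<{ source: string; count: number }>[]
): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      for (const p of sourceResults) {
        // Awaits sources in submission order for simplicity; a real
        // endpoint might race them to emit in completion order.
        const result = await p;
        controller.enqueue(
          encoder.encode(`event: source-complete\ndata: ${JSON.stringify(result)}\n\n`)
        );
      }
      controller.enqueue(encoder.encode("event: done\ndata: {}\n\n"));
      controller.close();
    },
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/event-stream", "Cache-Control": "no-cache" },
  });
}
```

On the frontend, an `EventSource` listener on `source-complete` would append opportunities to the page as each batch lands.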
The Monorepo Migration
Innovation Scout started as a standalone Astro 4 repo. Good for prototyping, bad for integration. I migrated the entire thing into the ArgoBox monorepo in one session.
That meant:
- Splitting the engine into `packages/innovation-scout/` (Tier 1+2)
- Creating the module in `modules/innovation-scout/` (Tier 3)
- Registering the module manifest in `src/config/modules/innovation-scout.ts`
- Wiring subpath exports in `package.json` to match API route imports
- Adding `@argobox/innovation-scout` to the `noExternal` list in `astro.config.mjs`
- Writing `install.sh` and `uninstall.sh` scripts (copies 17 files into position)
- Updating all import paths from standalone to monorepo layout
The migration surfaced 7 build issues. Each one was a different kind of integration problem -- missing zod in root deps, lockfile desync, deep subpath imports that needed re-routing, layout component changes, prerender frontmatter syntax, and missing files from other sessions' parallel work.
All 7 fixed. All pushed. 199 tests still passing.
Personalization and Feedback
Innovation Scout learns from your feedback. Rate an opportunity as interesting or irrelevant. Mark ideas as "exploring" or "dismissed." Over time, the recommendation engine weights your preferences.
The personalization module tracks:
- Topics you consistently engage with
- Confidence thresholds that match your risk appetite
- Source types you prefer
- Time horizons that matter to you
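A feedback loop like this can be as simple as a per-topic multiplier that nudges future rankings. This is a hedged sketch of the mechanism; the class name, the 1.1/0.9 nudge factors, and the clamping bounds are all assumptions:

```typescript
type Rating = "interesting" | "irrelevant";

class InterestProfile {
  private weights = new Map<string, number>(); // topic -> score multiplier

  // Each rating nudges the topic's weight up or down multiplicatively.
  record(topic: string, rating: Rating): void {
    const current = this.weights.get(topic) ?? 1;
    const next = rating === "interesting" ? current * 1.1 : current * 0.9;
    // Clamp so no topic can dominate or vanish entirely.
    this.weights.set(topic, Math.min(2, Math.max(0.25, next)));
  }

  // Personalized rank = base confidence scaled by learned interest.
  rank(topic: string, confidence: number): number {
    return confidence * (this.weights.get(topic) ?? 1);
  }
}
```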
This data stays local. No telemetry. No cloud analytics. Your interest profile lives in the same KV/D1 storage as the rest of ArgoBox.
The Test Suite
199 tests across the core engine. That number is from the standalone repo where the test harness lives. Zero TypeScript errors across both the package and the module.
The tests cover:
- Each data source fetcher (mocked HTTP responses)
- Aggregation and deduplication logic
- Confidence scoring algorithms
- Risk assessment calculations
- Personalization feedback loop
- Database adapters (D1 and in-memory)
- API route handlers
I'm not going to pretend 199 tests make the code perfect. But they mean I can refactor aggressively without wondering what broke. When the monorepo migration changed every import path, the test suite told me immediately which connections were severed.
Why Build This Instead of Buying It
Market intelligence tools exist. CB Insights, Crunchbase Pro, Exploding Topics, Glimpse. They're good. They're also $100-500/month and they show you what everyone else sees.
Innovation Scout is mine. I pick the sources. I tune the scoring. I control what "opportunity" means. When I add a new data source, the cross-domain fusion starts finding correlations nobody else is looking for.
And it runs on Cloudflare's free tier. The API keys for premium sources (NewsAPI, Crunchbase) are optional. Heuristic mode with Reddit and HN costs nothing.
What's Next
The immediate roadmap:
- Full Ideas page -- implement the generator, not just the stub
- Frontend SSE streaming -- watch analysis results populate in real time
- Persistent opportunities -- D1/KV storage instead of in-memory (survives cold starts)
- More data sources -- GitHub trending repos, Product Hunt launches, arXiv papers
The longer vision is Innovation Scout as the intelligence layer for all my projects. Surface opportunities, assess them, track the ones I act on, and feed outcomes back into the model. A personal market radar that gets sharper over time.
Seven data sources, 199 tests, 4-dimension risk scoring, and a clean 3-tier architecture. My own market intelligence system, running on my own infrastructure, tuned to my own interests.
That's the kind of tool I want to build.