Methodology
The Dominion List is an open-source database of US-incorporated companies founded or co-founded by people with significant Canadian roots. It tracks 430 companies that have collectively raised $347B in venture capital. This page documents how the database is built, maintained, and verified.
Inclusion Criteria
A company qualifies for The Dominion List if it meets all of the following conditions:
Founder connection
At least one founder or co-founder must have a meaningful Canadian connection, defined as one of the following:
- Birthplace — born in Canada or raised there through formative years
- Education — attended a Canadian university, college, or secondary school
- Citizenship — Canadian citizen or permanent resident
- Birthplace + Education — born in Canada and educated at a Canadian institution
- Education + Citizenship — educated in Canada and holds Canadian citizenship
Brief visits, conference attendance, or short-term work stints do not qualify. The connection must be substantive enough to have shaped the founder's trajectory.
Company requirements
- Must be incorporated or headquartered in the United States
- Must have raised venture capital, achieved meaningful revenue scale, or reached a significant valuation
- Must be a technology company or venture-backed startup — traditional businesses, franchises, and lifestyle businesses are excluded
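The criteria above can be read as a single yes/no check. The sketch below is illustrative only and is not the project's actual tooling. Apart from `canadian_connection_type` and its four values, which appear in the schema documented on this page, every key used here (`founders`, `us_incorporated`, `us_headquartered`, `raised_vc`, `meaningful_revenue`, `significant_valuation`, `is_tech_or_venture_backed`) is a hypothetical name chosen for the example.

```python
# Illustrative sketch. All keys except canadian_connection_type are hypothetical.
QUALIFYING_CONNECTIONS = {"birthplace", "education", "citizenship", "compound"}


def founder_qualifies(founder: dict) -> bool:
    """True if the founder carries one of the recognised connection types."""
    return founder.get("canadian_connection_type") in QUALIFYING_CONNECTIONS


def company_qualifies(company: dict) -> bool:
    """Apply the founder condition and all three company requirements."""
    has_canadian_founder = any(
        founder_qualifies(f) for f in company.get("founders", [])
    )
    in_us = bool(
        company.get("us_incorporated") or company.get("us_headquartered")
    )
    has_traction = any(
        company.get(key)
        for key in ("raised_vc", "meaningful_revenue", "significant_valuation")
    )
    is_tech = bool(company.get("is_tech_or_venture_backed"))
    return has_canadian_founder and in_us and has_traction and is_tech
```

A company fails the check as soon as any one condition is missing, which mirrors the "all of the following" wording.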
Examples
Taxonomy
Industry categories
Each company is assigned one primary industry:
Company stages
Stages reflect the most recent known funding round or company status:
Canadian institutions
The database tracks 29 Canadian institutions, including universities, colleges, and select secondary schools. The most represented are:
Other tracked institutions include Carleton, Concordia, McMaster, University of Calgary, University of Ottawa, Wilfrid Laurier, York, and select feeder schools like Upper Canada College, St. George's School, and Shawnigan Lake School.
Data Pipeline
The database is built through a multi-stage process combining manual research, automated enrichment, and cross-referencing against external data sources.
1. Discovery
Companies are identified through founder network research, venture capital database screening, university alumni records, community submissions via GitHub, and cross-referencing with public datasets such as those published by the CVCA (Canadian Venture Capital and Private Equity Association) and Y Combinator alumni lists.
2. Verification
Each company's Canadian founder connection is verified against public sources: professional profiles, university alumni records, Wikipedia biographies, press coverage, and corporate filings. Companies remain unverified until the Canadian connection can be confirmed through at least one reliable source.
3. Enrichment
Once verified, each entry is enriched with structured data:
- Funding history — full round-by-round data sourced from venture databases, SEC filings, and press releases. Round types are normalized to a standard taxonomy (Seed, Series A–G, Growth, etc.) and exit events (IPO, Acquisition, SPAC) are tagged separately.
- Valuations — post-money valuations from funding rounds where disclosed. Public companies use market capitalization as of the most recent data point.
- Founder profiles — professional profile URLs, X/Twitter handles, Wikipedia links, biographical summaries, and headshot images sourced from company websites and public profiles.
- Institutional mapping — each founder's Canadian institution(s) are mapped to their city for geographic analysis (e.g., University of Waterloo and Wilfrid Laurier both map to Waterloo, ON).
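The institutional mapping step amounts to a lookup table. The sketch below is a minimal illustration, assuming the mapping is keyed by institution name exactly as it appears in a founder's `canadian_institutions` array. Only the two entries stated above are included, and the function name `founder_cities` is hypothetical.

```python
# Illustrative lookup. Only the two mappings stated in the text are included.
INSTITUTION_CITY = {
    "University of Waterloo": "Waterloo, ON",
    "Wilfrid Laurier": "Waterloo, ON",
}


def founder_cities(founder: dict) -> list[str]:
    """Return the distinct cities for a founder's Canadian institutions.

    Institutions missing from the lookup table are skipped.
    """
    institutions = founder.get("canadian_institutions") or []
    cities = {
        INSTITUTION_CITY[name]
        for name in institutions
        if name in INSTITUTION_CITY
    }
    return sorted(cities)
```

Because both institutions resolve to the same city, a founder who attended both contributes a single geographic data point.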
4. Normalization
Funding amounts are standardized to USD. Canadian-dollar figures (primarily from CVCA data used in comparative analytics) are converted at Bank of Canada annual average exchange rates. Round types are normalized from source-specific labels (e.g., "Venture Round - Series B" becomes "Series B"). Amounts are deduplicated to avoid double-counting when multiple sources report the same round.
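The three normalization steps can be sketched roughly as follows. This is a hedged illustration, not the pipeline's real code. The regular expression, the deduplication key (date, round, amount), and all function names are assumptions. The exchange-rate table is passed in by the caller and no real rates are included here.

```python
import re

# Assumed pattern: pick out a known round type embedded in a longer label.
# The lookbehind stops "Pre-Seed" from being collapsed to "Seed".
_ROUND_PATTERN = re.compile(
    r"(?<![\w-])(Seed|Series [A-G]|Growth)\b", re.IGNORECASE
)


def normalize_round_label(label: str) -> str:
    """Reduce a source-specific label to a standard round type.

    Labels with no recognised round type are returned stripped of
    surrounding whitespace but otherwise unchanged.
    """
    match = _ROUND_PATTERN.search(label)
    if not match:
        return label.strip()
    found = match.group(1)
    if found.lower().startswith("series"):
        return "Series " + found[-1].upper()
    return found.capitalize()


def cad_to_usd(amount_cad: float, year: int,
               usd_per_cad_by_year: dict[int, float]) -> float:
    """Convert CAD to USD using the annual average rate for the given year."""
    return amount_cad * usd_per_cad_by_year[year]


def dedupe_rounds(rounds: list[dict]) -> list[dict]:
    """Drop rounds that repeat the same (date, round, amount_usd) triple."""
    seen = set()
    unique = []
    for rnd in rounds:
        key = (rnd.get("date"), rnd.get("round"), rnd.get("amount_usd"))
        if key not in seen:
            seen.add(key)
            unique.append(rnd)
    return unique
```

With this sketch, `normalize_round_label("Venture Round - Series B")` returns `"Series B"`. Matching on an exact triple is deliberately conservative: two sources that report slightly different amounts for the same round would not be merged and would need manual review.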
5. Ongoing maintenance
The database is updated continuously as new rounds close, companies are acquired or go public, and new qualifying startups are discovered. Version history and update dates are tracked in metadata.
Data Schema
Each company in data/companies.json is a structured object. Key fields:
Company fields
| Field | Description |
|---|---|
| name | Official company name |
| website | Primary company URL |
| description | One-line company description |
| hq_city, hq_region | US headquarters (city and state) |
| founding_year | Year the company was founded |
| industry | Primary industry category |
| stage | Current company stage |
| status | Active, Public, Acquired, Inactive, or Defunct |
| yc_batch | Y Combinator batch, if applicable |
| capital_raised_usd | Total capital raised in USD (computed) |
Founder fields
| Field | Description |
|---|---|
| name | Full name |
| role | Title (e.g., Co-Founder & CEO) |
| | LinkedIn profile URL |
| x_url | X / Twitter profile URL |
| wikipedia_url | Wikipedia page, if notable |
| bio | Brief biographical summary |
| canadian_connection_type | birthplace, education, citizenship, or compound |
| canadian_institution | Primary Canadian institution |
| canadian_institutions | Array of all Canadian institutions attended |
Funding round fields
| Field | Description |
|---|---|
| date | Round close date (YYYY-MM-DD) |
| round | Normalized round type (Seed, Series A, etc.) |
| amount_usd | Amount raised in USD |
| valuation_usd | Post-money valuation, if known |
| lead_investors | Array of lead investor names |
| other_investors | Array of participating investors |
| source_urls | Array of source URLs for the round |
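For readers working with the data in Python, the tables above could be expressed as typed dictionaries along these lines. This is a sketch under assumptions: the nesting keys `founders` and `funding_rounds`, the numeric types, and the helper `schema_problems` are all hypothetical, and the founder field whose name is missing from the table above is left out rather than guessed.

```python
from datetime import datetime
from typing import TypedDict


class FundingRound(TypedDict, total=False):
    date: str                  # YYYY-MM-DD
    round: str                 # normalized round type
    amount_usd: float
    valuation_usd: float       # post-money, if known
    lead_investors: list[str]
    other_investors: list[str]
    source_urls: list[str]


class Founder(TypedDict, total=False):
    name: str
    role: str
    x_url: str
    wikipedia_url: str
    bio: str
    canadian_connection_type: str
    canadian_institution: str
    canadian_institutions: list[str]


class Company(TypedDict, total=False):
    name: str
    website: str
    description: str
    hq_city: str
    hq_region: str
    founding_year: int
    industry: str
    stage: str
    status: str
    yc_batch: str
    capital_raised_usd: float
    founders: list[Founder]              # assumed key name
    funding_rounds: list[FundingRound]   # assumed key name


COMPANY_STATUSES = {"Active", "Public", "Acquired", "Inactive", "Defunct"}


def schema_problems(company: dict) -> list[str]:
    """Return human-readable problems found in one company record."""
    problems = []
    if company.get("status") not in COMPANY_STATUSES:
        problems.append(f"unexpected status: {company.get('status')!r}")
    for rnd in company.get("funding_rounds", []):
        try:
            # Checks that the date parses in year-month-day order.
            datetime.strptime(str(rnd.get("date") or ""), "%Y-%m-%d")
        except ValueError:
            problems.append(f"bad round date: {rnd.get('date')!r}")
    return problems
```

Every field is marked optional (`total=False`) because this page does not say which fields are guaranteed to be present.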
Analytics Methodology
The Analytics page presents several computed views of the data. Key methodological notes:
- Combined value uses the best available valuation for each company: last funding round post-money valuation for private companies, market capitalization for public companies. This is a point-in-time estimate, not a sum of capital raised.
- Capital raised sums all venture funding rounds. Exit events (IPO proceeds, acquisition prices) and post-IPO rounds are excluded from the total.
- US vs. Canada comparisons (the "gap is widening" and "more capital, fewer companies" charts) use CVCA Market Overview data for the Canadian side and Dominion List data for the US side. The comparison window is 2010–2025. CVCA 2025 figures are linear estimates based on Q3 YTD data. All amounts are in USD, with CVCA figures converted at Bank of Canada annual average FX rates.
- The "Dominion 5" (OpenAI, Anthropic, xAI, Tesla, SpaceX) are flagged separately in deal-size analysis because their scale distorts averages. Analytics show both inclusive and exclusive views.
- Institution-to-city mapping in the Sankey diagram maps each institution to its physical city (e.g., University of Waterloo and Wilfrid Laurier both appear as "Waterloo").
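A few of the computations described above can be sketched as follows. Treat this as an approximation under stated assumptions rather than the Analytics page's real code. It assumes post-IPO rounds carry a label starting with "Post-IPO", that the linear estimate scales three quarters of data to four, and that each deal record has `company` and `amount_usd` keys. All function names are hypothetical.

```python
EXIT_ROUNDS = {"IPO", "Acquisition", "SPAC"}
DOMINION_5 = {"OpenAI", "Anthropic", "xAI", "Tesla", "SpaceX"}


def capital_raised_usd(rounds: list[dict]) -> float:
    """Sum venture rounds, skipping exit events and post-IPO rounds."""
    total = 0.0
    for rnd in rounds:
        label = rnd.get("round", "")
        # Assumption: post-IPO rounds are labelled "Post-IPO ...".
        if label in EXIT_ROUNDS or label.startswith("Post-IPO"):
            continue
        total += rnd.get("amount_usd") or 0
    return total


def estimate_full_year(q3_ytd_total: float) -> float:
    """Linearly scale a Q3 year-to-date total (3 quarters) to 4 quarters."""
    return q3_ytd_total * 4 / 3


def mean_deal_size(deals: list[dict], exclude_dominion_5: bool = False) -> float:
    """Average deal size, optionally leaving out the Dominion 5."""
    amounts = [
        deal["amount_usd"]
        for deal in deals
        if not (exclude_dominion_5 and deal["company"] in DOMINION_5)
    ]
    return sum(amounts) / len(amounts) if amounts else 0.0
```

Calling `mean_deal_size` twice, once with `exclude_dominion_5=True`, gives the inclusive and exclusive views side by side.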
Data Quality
Each entry carries implicit quality signals based on how it was sourced and verified.
Verification status
A company is considered verified when its Canadian founder connection has been confirmed against at least one independent public source. As of the latest update, 375 of 430 entries (about 87%) are verified. Unverified entries remain in the database but should be treated with caution.
Valuation confidence
Valuations are tagged with a confidence level where available:
- High — public market capitalization or disclosed acquisition price
- Medium — post-money valuation from a reported funding round with a credible source
- Low / None — estimated from round size or comparable companies, or not disclosed at all. The majority of private companies fall into this category.
The valuation_confidence field in each company's metadata object reflects this tier. Aggregated valuation figures (such as combined value) should be understood as order-of-magnitude estimates, not precise sums.
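One way to picture the tiering is as a short decision function. The sketch below is hypothetical. The input keys (`market_cap_usd`, `acquisition_price_usd`, `last_round_valuation_usd`, `valuation_source_url`, `estimated_valuation_usd`) and the returned strings are assumptions made for illustration, and the stored values of `valuation_confidence` may differ.

```python
from typing import Optional


def valuation_confidence(company: dict) -> Optional[str]:
    """Assign a confidence tier following the three levels described above."""
    # High: public market capitalization or a disclosed acquisition price.
    if company.get("status") == "Public" and company.get("market_cap_usd"):
        return "High"
    if company.get("acquisition_price_usd"):
        return "High"
    # Medium: post-money valuation from a reported round with a source.
    if company.get("last_round_valuation_usd") and company.get("valuation_source_url"):
        return "Medium"
    # Low: a figure estimated from round size or comparable companies.
    if company.get("estimated_valuation_usd"):
        return "Low"
    # None: no valuation information at all.
    return None
```

The checks run from strongest evidence to weakest, so a company with both a market capitalization and an older round valuation lands in the High tier.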
Limitations & Known Biases
- Coverage is not exhaustive. The database captures companies discoverable through public records, founder networks, and community submissions. Stealth-mode companies and founders who don't publicly identify their Canadian roots are likely underrepresented.
- VC-funding bias. Inclusion requires venture capital, meaningful revenue scale, or a significant valuation. In practice this underrepresents bootstrapped companies, even successful ones, and skews the dataset toward the venture-backed ecosystem.
- Geographic skew. Discovery methods favor companies in major tech hubs (San Francisco, New York, Seattle). Canadian founders building companies in smaller US cities may be underrepresented.
- Recency bias. Post-2010 companies are better covered than earlier ones. Companies founded before 2000 are likely underrepresented, particularly those that were acquired or shut down before the current era of public startup databases.
- Valuation data is incomplete. Valuations are available for a subset of companies. Private company valuations are based on last known funding rounds and may not reflect current fair market value.
- Industry classification is subjective. Many companies span multiple sectors. Each is assigned a single primary industry based on its core product and revenue model.
- Founding year ambiguity. Some companies report different founding dates across sources. The database uses the earliest credible date.
- Survivorship bias in analytics. The Analytics page reflects companies that have raised capital and reached some scale. It does not capture the full universe of Canadian-founded US startups, many of which failed before reaching visibility.
Changelog
Major changes to the database are tracked here. For granular history, see the commit log.
| Version | Date | Changes |
|---|---|---|
| v1.0.0 | Apr 29, 2026 | Initial public release. 430 companies across 20 industries. Structured funding history, valuations, founder profiles with Canadian institutional connections. Interactive analytics with 7 charts. Full methodology documentation. Open-sourced under MIT License. |
Contributing
The Dominion List is open source. You can contribute by:
- Submitting a new company via GitHub issue
- Suggesting corrections or updates via GitHub issue
- Opening a pull request directly against data/companies.json
All submissions require at least one public source confirming the Canadian founder connection.
Citation
If you use The Dominion List in research, journalism, or analysis, please cite it as:
License
The source code for this project is released under the MIT License. The dataset (data/companies.json and related data files) is released under the same license. Both are free to use, modify, and redistribute with attribution.