ontology & what palantir does
given that we're building the palantir killer it's probably good to share a primer on what palantir does that isn't just "gov stuff + consulting"
this article aims to explain the origin of ontology and how the world developed w/o it. why we need it. how Palantir developed it. why the world needs one.
it was also adapted from an internal textql memo that was horribly written by a total moron — your mileage may vary
how the world developed w/o ontology [setting]
the state of the world today
there are two types of data companies
horizontal data companies sell to IT & R&D, can be customized, can handle generalized data, and can be built on AWS
vertical data companies have rigid specialized data, cannot be customized, sell to business teams, and have to deliver value
no product is both customizable and built for business value delivery
how it came about

if you sell to IT teams [general]
they don’t care about business value
they want generalizability
results in thinner and thinner layers on the cloud providers
you build it for as flexible a data model as possible, w/ no templates
if you sell to business teams [opinionated]
they don’t care about generalizability [in the absolute sense]
they want business value per unit cost to be high
result is competition forces you to forfeit the long tail of generalizable business use cases
you build as little data model as possible - cannot add to it
these budgets are divided by the cloud providers
if your user has to think about the cloud providers, it’s an IT team
if your user doesn’t, it’s a business unit
so how did this ecosystem evolve?
over time, this resulted in both apps and infra becoming thinner and thinner layers of pointier and pointier point solutions
for infra
you have storage and compute [snowflake]
but then transformations on that [dbt]
and then monitoring of those pipelines [monte carlo]
and then catalogs of those monitors [atlan]
for apps
netsuite: need customer object… to run business
salesforce: need specific customer object for sales
gong: need specific customer object for sales call recording
clari: need specific customer object for using notes from sales call records to help forecast sales comp planning
over time this means a combinatorial explosion of data models
there’s nothing connecting them
there is no shared ontology that you can use to bring things together
the apps aren’t built on top of the data
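one way to make the explosion concrete: if every app keeps its own customer object, keeping them in sync takes a translation per pair of apps, while a shared ontology takes one translation per app. the sketch below just counts both. the pairwise reading and the `ontology.Customer` name are assumptions for illustration, not something from the original memo.

```python
from itertools import combinations

# The four apps named above, each with its own customer object.
apps = ["netsuite", "salesforce", "gong", "clari"]


def pairwise_mappings(app_names):
    """One translation per pair of apps (no shared model)."""
    return list(combinations(app_names, 2))


def ontology_mappings(app_names):
    """One translation per app, into a single shared customer object.

    "ontology.Customer" is a hypothetical name used only for this sketch.
    """
    return [(app, "ontology.Customer") for app in app_names]


print(len(pairwise_mappings(apps)), len(ontology_mappings(apps)))  # 6 4

# The gap widens as the number of apps grows.
for n in (10, 50):
    names = [f"app_{i}" for i in range(n)]
    print(n, len(pairwise_mappings(names)), len(ontology_mappings(names)))
# 10 45 10
# 50 1225 50
```

pairwise mappings grow as n(n-1)/2, shared-model mappings grow as n.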

why we need ontology [problem]
this seems fine - why is this a problem?
it’s a frog boiling in water: each point solution feels fine on its own, but together they accrue a debt of complexity
entity resolution is impossible to disentangle
source of truth is unresolvable
ontology / semantic layer is unresolvable
resulting in:
you don’t know what’s going on - too many competing answers
this results in companies taking 10^5 times longer to answer questions than they need to
^this also means AI cannot meaningfully find the right answer
you pay way more money, in both # of apps & amount of data stored
this results in companies paying 5-10x more for software than they need to
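a minimal sketch of what "too many competing answers" looks like in practice. the three exports, their field names, and the numbers are all made up for illustration, and `normalize` is a deliberately crude matching rule, not a real entity resolution method.

```python
from collections import defaultdict

# Hypothetical exports from three apps describing the same customer.
crm = [{"id": "001A", "name": "Acme Corp.", "arr": 120_000}]
billing = [{"cust_no": 77, "legal_name": "ACME CORPORATION", "arr": 118_500}]
forecasting = [{"account": "acme", "arr": 125_000}]

# Matching on the raw name fails: three spellings, three "different" customers.
raw_names = {crm[0]["name"], billing[0]["legal_name"], forecasting[0]["account"]}


def normalize(name: str) -> str:
    """Crude match key: lowercase, strip punctuation, drop company suffixes."""
    cleaned = "".join(c if c.isalnum() else " " for c in name.lower())
    suffixes = {"corp", "corporation", "inc", "llc"}
    return " ".join(w for w in cleaned.split() if w not in suffixes)


# Group each system's ARR figure under the normalized customer key.
answers = defaultdict(dict)
for source, rows, name_field in [
    ("crm", crm, "name"),
    ("billing", billing, "legal_name"),
    ("forecasting", forecasting, "account"),
]:
    for row in rows:
        answers[normalize(row[name_field])][source] = row["arr"]

print(len(raw_names))   # 3 customers if you trust the raw names
print(dict(answers))    # 1 customer, but 3 competing ARR figures
```

even after the records are matched, nothing says which of the three numbers is the source of truth.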
who benefits from this trend?
every vendor that doesn’t deserve to exist benefits from this trend
but in a combinatorial way, the underlying infra benefits [cloud providers]
the cloud providers get 10x more compute as “account info” is stored 100x redundantly
and every calculation is done 100x more times than necessary
so why can’t we build a generalizable business solution for all of this?
it’s cost-prohibitive to get off the ground
to reach business value parity w/ a point solution while keeping a generalizable data model, you need to build out 10x as much
you cannot get a business team to pay 10x as much to subsidize the early R&D
you cannot motivate an IT team to pay 10x as much
if you observe the
how Palantir developed ontology [solution]
so that’s it? everyone’s fucked against a future of more and more layers?
unless there is an industry sector where budgets like this are more integrated
where you can bundle a huge amount of use cases under a large enough value prop that you can justify building the whole thing in a generalized way
this would be able to subsidize the R&D costs of building out so much, and all the unknown unknowns
a solution like this once built could then make its way down the market since the R&D costs are already fronted
the government is a huge use case, w/ huge budgets, and no centralized IT spending but rather end-to-end contracts
served by a company like… Palantir
who’s been making its way into the commercial sector
after years of false starts and failures
w/ a generalized system of record
which it started w/ $100M contracts like the one w/ Swiss Re
and is recently releasing PLG versions of their platform… via AIP
Palantir has this?
yes.
palantir was basically in a hyperbaric chamber training like Vegeta — and subsidized by $4B of losses and costs from peter thiel
to develop a platform where the app layer builds directly on top of generalized storage

thus creating a composable system of record
Palantir has every component of the modern data stack
ingest
storage
transform
query
model
notebook
ML
but also a generalized framework for business app building, for apps like
customer service engine [general]
accounts payable automation [finance]
dynamic scheduling [provider healthcare]
inventory rebalancing [supply chain]
warranty claims [fraud and insurance]
because they owned both sides of the equation - the business case and the general data infra, they needed to build the connective tissue to connect both pieces. so they developed Ontology
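to make "connective tissue" concrete, here is a minimal sketch of the idea: business objects defined once as a mapping over generalized storage, w/ more than one app reading the same objects. this is not Palantir's API. `ObjectType`, `Ontology`, the table name, and both toy apps are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectType:
    name: str
    source_table: str
    properties: dict[str, str]  # property name -> column in the source table


@dataclass
class Ontology:
    tables: dict[str, list[dict]]  # stand-in for generalized storage
    types: dict[str, ObjectType] = field(default_factory=dict)

    def register(self, object_type: ObjectType) -> None:
        self.types[object_type.name] = object_type

    def objects(self, type_name: str) -> list[dict]:
        """Project raw rows into business objects using the type's mapping."""
        t = self.types[type_name]
        return [
            {prop: row[col] for prop, col in t.properties.items()}
            for row in self.tables[t.source_table]
        ]


ontology = Ontology(tables={
    "warehouse.customers": [
        {"cust_id": 1, "cust_nm": "Acme", "days_overdue": 45, "tier": "silver"},
        {"cust_id": 2, "cust_nm": "Globex", "days_overdue": 0, "tier": "gold"},
    ]
})
ontology.register(ObjectType(
    name="Customer",
    source_table="warehouse.customers",
    properties={"id": "cust_id", "name": "cust_nm",
                "days_overdue": "days_overdue", "tier": "tier"},
))


# Two toy "apps" that share one Customer definition instead of owning a copy.
def collections_app(o: Ontology) -> list[str]:
    return [c["name"] for c in o.objects("Customer") if c["days_overdue"] > 30]


def support_app(o: Ontology) -> list[str]:
    return [c["name"] for c in o.objects("Customer") if c["tier"] == "gold"]


print(collections_app(ontology))  # ['Acme']
print(support_app(ontology))      # ['Globex']
```

the design choice being illustrated: the column-to-property mapping lives in one place, so changing the underlying table means updating one `ObjectType` rather than every app.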
Part 2
this is a 101 for what palantir does and how it relates to the modern data stack
this article assumes that you know what the modern data stack is
if you don’t know - do not look at matt turk’s trash; use this https://a16z.com/emerging-architectures-for-modern-data-infrastructure/
it’s a meme that no one knows what palantir does
palantir is the apple of the modern data stack, in that it has built an integrated ecosystem with tightly coupled inter-operability across all the pieces of the data stack
in order of furthest from the user to closest to the user, the world is
| layer | category | MDS Winner | Palantir Foundry Equivalent |
|---|---|---|---|
| Data Infra | Ingestion and Transport | Fivetran | |
| Data Infra | Orchestration & Transformation | Airflow / dbt | |
| Data Infra | Storage & Compute | Snowflake / Databricks | Foundry |
| Data Infra | Semantic Layer | [no winner, LookML] | |
| Analytics | Business Intelligence | Tableau | Quiver |
| Analytics | GraphDB UI | Neo4j | Vertex |
| Analytics | Notebooks | Hex / Jupyter | Contour / Code Workbook |
| Office | Document Processing | Word / Notion | Notepad |
| Office | Spreadsheets | Excel / Equals | Fusion |
| Workflows | App Building [No Code] | Retool | Workshop |
| Workflows | App Building [Low Code] | Webflow | Slate |
| Workflows | Robotic Process Automation | Zapier / UiPath | Automate / Workflow Builder |
claiming something is the “apple” of X is a big claim: it implies that combining the pieces produces very strong ecosystem synergy that no point solution can replicate
we believe this claim is true
the glue that holds a lot of the apple products together is the core suite of apple apps, and how they feel
in the case of palantir - it is ontology
recapping the need for ontology
palantir was the only company that served use cases requiring unique combinations of data integration → user-facing applications
as such they had to build a composable data stack [not unlike the modern data stack] that’s tightly integrated with their composable application stack [which is very different from the rest of the world]
this tight integration is held together by a flexible system of record called “ontology”
parker conrad’s compound startup principle is here - by bundling everything together
they can make use of shared abstractions for interoperability guarantees
[in this case, the ontology]
they can also charge much more by bundling the costs of a ton of components underneath the parent
going to war with SAP
ripping and replacing systems of record by replacing workflows on top, before ripping out the whole thing
palantir’s internal teams view the world this way