AI workloads, data platforms, and infrastructure notes, written from the engineering edge between benchmarks and production.


So, What Does VAST Data Actually Do?



Itzik — VP Mission Alignment, VAST Data



A BBQ-friendly explainer, written down so I stop doing it one friend at a time

1. Why This Post Exists (a BBQ Story)

Earlier this week I was standing in a friend’s backyard when my phone lit up: VAST Data, the company I work for, had just announced a Series F round at a thirty billion dollar valuation. Before I had finished my burger, three friends asked me some version of the same question. ‘Wait, what does your company actually do?’

None of them works in tech. They wanted the BBQ answer, not the jargon one. I tried. It went okay. This post is that answer, written down, so I stop doing it one friend at a time. Two disclosures: I work at VAST, so I am not disinterested; and I will spend almost no time on the funding itself, because the company is the point.

The shortest answer is this: VAST builds the data platform companies use to store, find, move, and use their data for AI in one system, instead of stitching together separate products.

2. The Headline, Briefly

VAST Data closed a Series F of roughly one billion dollars at a thirty billion dollar valuation, led by Drive Capital with Access Industries, NVIDIA, Fidelity, and NEA participating.

Figure 1. Milestones officially announced by VAST Data via vastdata.com press releases. Bars show the amount raised in each round; the gold line shows post-money valuation where disclosed. The shape of the curve matters more than any single number on it.

The official version is on our site: VAST Data Valued at $30 Billion as AI Drives a New Infrastructure Stack. The rest of this post is about what the company is for.

3. What Is an ‘Operating System,’ Really?

We describe our product as an AI operating system, and the phrase is doing a lot of work.

Figure 2. Yesterday’s OS managed apps and files. The new job is to manage data, models, and agents.

An OS is the invisible layer between raw hardware and the things you actually want to do. Windows, macOS, iOS. You do not want to know which sector of the SSD your vacation photos live on; you want a folder called ‘Italy 2024.’ Every computing era produced its own OS, because each era had a different mess to hide. Mainframes, PCs, phones. The claim behind ‘AI OS’ is not that we wrote a new Windows. It is that the plumbing AI needs is different enough from previous plumbing that it deserves a new abstraction layer. The next section explains why.

4. Why AI Needs Its Own OS

Picture a large organization trying to build an AI chatbot that can answer any question about any product, policy, or past interaction.

AI work is not one thing. Training is the long schooling phase, where a model learns broad patterns from huge amounts of data. Fine-tuning is company onboarding, teaching that model your industry, tone, and tasks. Reinforcement learning is coaching with feedback and scores so it gets better at the behavior you want. Inference is game day, when the model is live and answering or acting.

Figure 3. The legacy stack is a plumbing exercise. A unified AI OS collapses it into one system.

The data it needs lives in at least five places: call transcripts, contracts, order history, product docs, product images. To make anything useful, each one has to be copied, converted into a form AI can search by meaning instead of just keywords, parked in a specialized database, and kept in sync forever. Most companies call that a pipeline. In practice it is a Rube Goldberg machine.

Three symptoms: the data comes in too many shapes; the answers need to be fresh within seconds, not overnight; and the scale is strange, mixing the size of a data lake with the speed of a database and the throughput of a factory line.

A Formula 1 analogy. The engine is a marvel, but the engine alone does not make the car fast. The fuel line, tyres, brakes, aerodynamics, and pit crew all have to keep up. AI infrastructure is the same. You can buy the most expensive pile of GPUs ever assembled; if the data cannot reach them in time, they sit idle. Idle GPUs are one of the most expensive mistakes in modern computing. Most enterprise AI projects fail not because the models are bad but because the plumbing collapses. VAST builds the plumbing.

5. VAST’s AI OS in Plain English: The Seven Engines

Think of VAST as one platform that does seven jobs companies buy separately. Full write-up on the VAST AI OS product page. Below is the BBQ version.

Figure 4. Seven engines, one foundation. The integration is the product.

DataStore (the warehouse).

One place for files, cloud-style object data, and the kind of block storage databases use. That means the same data does not have to be copied three times just so three tools can read it. It is built for six nines of availability (99.9999%), roughly thirty seconds of downtime a year. It also includes a ‘similarity engine’ that notices when blocks are nearly the same and stores only the differences, which is a big reason the whole thing can land near 60% lower total cost than legacy storage.
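The similarity idea can be sketched in a few lines. This is a toy delta encoder over equal-length blocks, purely to show the shape of "store only the differences"; it is not VAST's actual algorithm, and the block size and helper names are invented.

```python
# Toy illustration of similarity-based reduction: when a new block is
# nearly identical to one already stored, keep only the byte positions
# that differ instead of a full second copy. NOT VAST's real algorithm.

def delta_encode(reference: bytes, block: bytes) -> list[tuple[int, int]]:
    """Return (position, byte) pairs where `block` differs from `reference`."""
    return [(i, b) for i, (a, b) in enumerate(zip(reference, block)) if a != b]

def delta_decode(reference: bytes, delta: list[tuple[int, int]]) -> bytes:
    """Rebuild the full block from the reference plus the stored differences."""
    out = bytearray(reference)
    for i, b in delta:
        out[i] = b
    return bytes(out)

ref = bytes(4096)                       # an already-stored 4 KiB block of zeros
new = bytearray(ref); new[100] = 0xFF   # a nearly identical new block
delta = delta_encode(ref, bytes(new))

assert delta_decode(ref, delta) == bytes(new)
print(len(delta))  # 1 differing byte stored instead of 4096
```

The payoff is the ratio on the last line: one stored difference instead of a full duplicate block.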

DataBase (the library catalog).

Lets you ask both exact business questions and meaning-based AI questions in the same system, on the same live data, without moving that data into another product first. We measure the meaning-based search path at about eleven times faster than the stitched alternative, at ninety-one percent lower cost.
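To make "exact and meaning-based questions on the same data" concrete, here is a toy sketch: each row carries both structured fields and an embedding vector, and a single query applies a SQL-style filter plus a similarity ranking. The rows, embeddings, and query vector are all made up for illustration; real embeddings would come from a model.

```python
# One table, two kinds of question: an exact predicate (like SQL WHERE)
# and a meaning-based ranking over embedding vectors, in one pass.
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

rows = [
    {"region": "northwest", "doc": "supplier missed delivery", "vec": [0.9, 0.1]},
    {"region": "northwest", "doc": "promotion ended early",    "vec": [0.2, 0.8]},
    {"region": "southeast", "doc": "warehouse flooded",        "vec": [0.8, 0.2]},
]

query_vec = [1.0, 0.0]  # stand-in embedding for "logistics problems"

hits = sorted(
    (r for r in rows if r["region"] == "northwest"),  # exact business filter
    key=lambda r: cosine(r["vec"], query_vec),        # meaning-based score
    reverse=True,
)
print(hits[0]["doc"])  # "supplier missed delivery"
```

The stitched alternative runs the filter in one product and the similarity search in another, with a copy of the data in each; here both happen over the same rows.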

DataSpace (number portability for data).

Treats data across many data centers, cloud regions, and edge sites as one logical namespace. Remember when switching phone carriers meant giving up your number? Regulators forced carriers to keep the number with you. DataSpace does the same for enterprise data.

DataEngine (the kitchen inside the warehouse).

Small bits of event-driven compute that run where the data already lives. An engineer writes ‘when a new PDF lands, extract the text and update the index’ and the system handles it. No shipping ingredients across town to the restaurant; the kitchen is inside the warehouse.
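The ‘when a new PDF lands’ idea can be sketched as a tiny event registry. The decorator, event names, and exact-string matching below are invented for illustration; this is the shape of event-driven compute, not VAST's API.

```python
# A minimal event registry: handlers are registered against event names,
# and the platform fires them when matching data arrives. Real systems
# would glob-match patterns; this toy version matches exact strings.

handlers = {}

def on(event: str):
    """Decorator: register a function to run when `event` fires."""
    def register(fn):
        handlers.setdefault(event, []).append(fn)
        return fn
    return register

@on("object_created:*.pdf")
def index_pdf(path: str, index: dict):
    # A real handler would extract text from the PDF; here we fake it.
    index[path] = f"text extracted from {path}"

def fire(event: str, *args):
    """Invoke every handler registered for `event`."""
    for fn in handlers.get(event, []):
        fn(*args)

index = {}
fire("object_created:*.pdf", "contracts/acme.pdf", index)
print(index)  # {'contracts/acme.pdf': 'text extracted from contracts/acme.pdf'}
```

The point of running this next to the data is that the PDF never leaves the warehouse just to be read once.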

SyncEngine (the newsroom assignment editor).

Discovers, indexes, and continuously pulls data from all the places it actually lives: laptops, collaboration tools, CRMs, ERPs, application logs, sensor feeds, cameras. The assignment editor who absorbs every incoming feed so the rest of the newsroom can do its job.

InsightEngine (the librarian).

Turns raw documents and other data into a form AI can search by meaning, then makes sure the model can look things up in your real current data before it answers. A librarian who reads every new book the instant it arrives and can point you to the exact paragraph that matters.

AgentEngine (the air traffic control tower).

Safe production runtime for AI agents, meaning software workers that take actions instead of just answering questions. Handles credentials, retries, audit trails, guardrails, and coordination between agents. The tower that keeps the planes from colliding and enforces the rules of the airspace.
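What a runtime like this adds around a raw action can be sketched as a wrapper that enforces permissions, retries on failure, and writes an audit trail. Every name here is invented for illustration; this is the general pattern, not VAST's implementation.

```python
# A toy agent-runtime wrapper: permission check, bounded retries, and an
# append-only audit log around every action an agent takes.
import time

AUDIT_LOG = []

def run_action(agent: str, action: str, allowed: set, do, retries: int = 3):
    """Run `do()` on behalf of `agent`, enforcing permissions and logging."""
    if action not in allowed:
        AUDIT_LOG.append((agent, action, "denied"))
        raise PermissionError(f"{agent} may not {action}")
    for attempt in range(1, retries + 1):
        try:
            result = do()
            AUDIT_LOG.append((agent, action, f"ok (attempt {attempt})"))
            return result
        except Exception:
            if attempt == retries:
                AUDIT_LOG.append((agent, action, "failed"))
                raise
            time.sleep(0)  # real systems back off between attempts

result = run_action("billing-agent", "read_invoice", {"read_invoice"},
                    lambda: "invoice #123")
print(result, AUDIT_LOG[-1])
```

Nothing here is exotic; the value is that every agent action passes through the same checkpoint instead of each team reinventing it.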

The point is not that one engine wins every beauty contest. The point is that seven of them on one foundation, with one security model and one set of interfaces, add up to something a stitched stack cannot match easily.

6. Under the Hood, Gently: DASE and the Tradeoffs It Refuses

If you remember one technical idea from this post, make it this one. DASE is the reason VAST says it can be simple, fast, and large scale at the same time. For years buyers were told they had to choose between one big box that was fast but rigid, or lots of smaller boxes that scaled but became a coordination headache.

Figure 5. Three ways to build storage. DASE is the one that does not force a tradeoff.

DASE chooses a third layout. Imagine one shared warehouse of data, with many simple workers in front of it who can all reach the same shelves. Need more speed? Add more workers. Need more room? Add more shelves. Because the workers are not each guarding a private stash, the system spends less time passing requests around and more time doing useful work.
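The warehouse-and-workers picture can be modeled in a few lines: stateless workers that all read one shared store, so adding a worker adds throughput without moving any data. This is a pure illustration of the layout, not a benchmark or VAST's code.

```python
# Toy model of a shared-everything layout: workers hold no data of their
# own, so any worker can serve any key, and scaling out means adding a
# worker, not reshuffling shards (as shared-nothing designs must).

shelves = {f"block-{i}": f"data-{i}" for i in range(8)}  # one shared store

def make_worker(name: str):
    """A stateless worker that reads directly from the shared shelves."""
    def handle(key: str):
        return (name, shelves[key])
    return handle

workers = [make_worker(f"w{i}") for i in range(2)]
workers.append(make_worker("w2"))  # scale out: no data moved, no rebalancing

who, value = workers[2]("block-5")  # the brand-new worker serves any key
print(who, value)  # w2 data-5
```

In a shared-nothing version, `w2` would arrive empty and the cluster would have to migrate a slice of every shard to it before it could serve anything.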

That is where the value claim comes from. VAST says this design lets it separate speed from capacity, keep data fast without copying it all over the place, let busy and quiet jobs live together more peacefully, and protect data without paying the full penalty of triple copies. In plain English: less waste, less waiting, less complexity.

That is also the uniqueness claim. Most vendors started from one of the older layouts, so they can add AI features on top but still inherit the old plumbing underneath. VAST says it started with the shared warehouse design, and everything else compounds from there. If that architectural claim is true, it is the reason the company is different, not just another feature on a slide.

7. A Day in the Life: Bank, Hospital, Retailer

Three quick composites from the industries we work with.

Figure 6. Same platform, three very different industries.

Bank. An advisor meets a client whose life just changed, a new grandchild or a house sale. She asks out loud: ‘summarise this client and suggest three conversation starters.’ A governed agent reads across every internal system in seconds and answers.

Hospital. A radiologist is reviewing an MRI. The system highlights a pattern that looks like a case from two years ago, pulls the prior scans, treatment notes, and outcomes, and hands her the comparison inside the workflow she already uses. She still decides.

Retailer. A regional manager asks ‘why did sales drop in the northwest this week?’ and gets, within seconds, a grounded narrative joining weather, inventory, promotions, and the one supplier that missed a delivery. Not a dashboard. An answer.

8. VAST Polaris and the Cloud Service Providers

A large share of AI runs on ‘neoclouds,’ new GPU cloud providers that rent out AI capacity, plus the familiar hyperscalers and sovereign sites. VAST Polaris is our Kubernetes-based control plane that deploys, governs, and operates VAST AI OS clusters across all of those, as one fabric.

Figure 7. Polaris: one control plane, many clouds and sites.

For a service provider this is table stakes: multi-tenant governance, per-tenant cost visibility, quota enforcement, no drift between regions. For an enterprise it means data sovereignty stops being a spreadsheet and becomes a policy the platform enforces. More on the VAST for cloud service providers page.

9. Where VAST Fits in the Landscape

To help orient readers, one map.

Figure 8. Horizontal: storage-centric to AI-native. Vertical: stitched from many vendors to one unified platform.

Traditional enterprise storage sits upper left: one vendor, but mainly built to store bytes. The lower half is the AI stack most companies run today, stitched together from many specialized products. The lower right is focused AI tooling that still depends on a lot of other infrastructure. VAST is trying to sit in the upper right: one platform that is built for AI from the start and still gives buyers one foundation to run on.

10. What It Means for Your Team

Three practical implications for a non-technical leader.

Figure 9. The strategic question is not which AI tool to buy. It is who owns the data layer underneath all of them.

Treat data plumbing as strategy.

Cool AI demos without a clean data layer stay demos. If your organization is doing AI seriously, whoever owns the data layer owns the ceiling on how far you can go.

Beware the integration tax.

Ten bright point products plus a team of integrators is usually slower, more fragile, and more expensive than one good platform. The savings you imagined show up as duplicate copies, reconciliation bugs, and a security surface nobody fully understands.

Agents are the next management problem.

Once AI stops answering and starts acting, questions of permission, accountability, and audit become daily operating concerns. Decide now where agents will live, who supervises them, and how you will roll them back when something goes wrong.

11. Why This Matters Beyond the Tech Bubble

Back to the BBQ.

Figure 10. The platforms worth building are the ones you do not notice.

If the AI OS thesis is right, the next decade of software will feel a lot like the decade after Windows: a wave of new applications built on top of a common foundation. Most people will not hear the name of that foundation. They will just notice that their bank is easier to deal with, their hospital caught something earlier, their retailer had what they needed, and the people they dealt with seemed a little less hurried. That is the version of AI worth building. It only exists if someone does the tedious work of rebuilding the plumbing underneath. Your neighbor may never hear the name. That may well be the highest compliment.

Analogies and opinions above are mine. Not investment advice, not a formal company communication.
