A Few Thoughts on Human Data
I rarely share my thoughts about the business of human data, online or off. With friends and family, I mostly complain about the details of the job itself, like how busy I am or how frequently I fly to SF. But human data is one of the most interesting industries I can name, and one I find myself constantly morally conflicted about. For the record, I don’t mean that as a bad thing, considering that I’m a software engineer within the space.
For context: I joined Handshake AI in September to help build a leading human data platform. Since then, we’ve grown significantly and become one of the top data partners for frontier AI labs. I’m fully convinced that we’re the right company to be doing this work. But the scope of what we touch is pretty scary: billions of people will end up interacting with one of the fastest-growing technologies of all time. There is also a lot of beauty in it, and I want to talk about both here.
If you were at a US university in the last decade, you’ve probably used Handshake or been told by a career counselor to make an account. The original mission was (and still is) to democratize access to career opportunities for young talent. In early 2025, we ran a simple experiment: could the network of experts and professionals we’d built for that mission also help frontier AI labs train better models?
The answer was yes, and Handshake AI became a serious investment almost immediately.
Conceptually, it’s important to remember that LLMs are next-token predictors. The function of an LLM is to learn a statistical representation of language, so that when we feed it a phrase like “The quick brown fox,” it knows that the next few most likely words are “jumps,” “over,” and so on. The magic is that this generalizes across language, grammar, syntax, and eventually reasoning-like behavior. If I ask an LLM, “What does the quick brown fox do?” it can respond with “It jumps over the lazy dog,” because it has encountered that phrase, and phrases like it, many times in its training set.
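To make the next-token idea concrete, here is a toy sketch in Python. A real LLM learns these statistics with a neural network over subword tokens and very long contexts, so a bigram model over whole words is only the smallest possible stand-in, but the objective is the same: predict the next token from what came before.

```python
from collections import Counter, defaultdict

# A two-sentence toy corpus standing in for internet-scale training data.
corpus = (
    "the quick brown fox jumps over the lazy dog . "
    "the quick brown fox jumps over the fence ."
).split()

# Count how often each word follows each word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word, k=3):
    """Return the k most likely next words with their probabilities."""
    counts = following[word]
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(k)]

print(predict_next("fox"))  # [('jumps', 1.0)]
print(predict_next("the"))  # [('quick', 0.5), ('lazy', 0.25), ('fence', 0.25)]
```

Sampling from those probabilities one word at a time is, in miniature, what generation looks like.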
The process of teaching an LLM about language using internet-scale data is called pre-training. The process of teaching it taste, preferences, behavior, and judgment is called post-training. That is where human data comes in.
“Human data” sounds a little dystopian, but I think the term has stuck in tech partly because it attributes the labor to a real person. It originally made me imagine “data labelers” sorting worked examples into “good” and “bad” piles. But the work I see our fellows create is much more specific, and much more human, than that. Human data looks like a scientist writing a problem that tests whether a model understands molecular structure, or a philosopher creating a moral quandary to probe a new LLM’s moral framework preferences.
The hard part is deciding what good means when evaluating model output. Good according to whom? In what context? Under what constraints? Should it refuse, redirect, hedge, ask a clarifying question, or answer directly? As models become more capable, the difference between a mediocre model and a great one increasingly depends on whether the humans training it can articulate the shape of good judgment.
Post-training and human data have driven the moonshot in AI progress over the last few years. This also makes human data extremely valuable, and as models become more capable, the demand for this data actually continues to rise. It’s a bit paradoxical, but intuitively: a few years ago, it was easy to write a coding prompt that a model would get wrong. That prompt + output + correction data tuple helped researchers understand why early LLMs failed, and as a result, the models got smarter. Way smarter. To the point where it’s now genuinely hard to write a prompt that causes a frontier AI model to straight-up fail at a coding question. That isn’t to say that AI models have “solved software engineering,” but labs will pay high premiums for these kinds of examples now that they’re harder to find.
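To make the shape of that tuple concrete, here is a minimal sketch. The field names are my own invention for illustration, since every lab and vendor defines its own schema, but the prompt + output + correction shape is the core of it.

```python
from dataclasses import dataclass

@dataclass
class CorrectionExample:
    prompt: str        # what the model was asked
    model_output: str  # what the model actually produced
    correction: str    # an expert's fixed version
    rationale: str     # why the output was wrong, often the most valuable part

example = CorrectionExample(
    prompt="Write a function that returns the median of a list.",
    model_output="def median(xs): return sorted(xs)[len(xs) // 2]",
    correction=(
        "def median(xs):\n"
        "    s = sorted(xs)\n"
        "    n = len(s)\n"
        "    mid = n // 2\n"
        "    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2"
    ),
    rationale=(
        "The original ignores even-length lists, where the median is "
        "the mean of the two middle values."
    ),
)
```

The model_output here is subtly wrong in exactly the way that used to be easy to elicit; finding failures this clean in a frontier model today takes real expertise, which is part of why the price keeps rising.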
When we lay it out this way, it seems like we’re treating knowledge as a finite resource, and human data companies like Handshake or Scale AI are trying to extract it from humanity for the sake of feeding it to our AI overlords. This is partly why I think human data is so morally complicated. People like to say that data is the new oil, and the analogy is useful mostly because it makes us uncomfortable. Demand for oil is insatiable, and oil is geopolitically relevant, environmentally destructive, and morally entangled with almost every convenience of modern life. It gives us transportation, plastics, global trade, and an incomprehensible amount of human leverage. Data has a similar duality. The AI industry seems to need more of it every time models get better, not less. Every new capability reveals some new layer of missing judgment.
But unlike oil, human intelligence is not a natural resource sitting underground, waiting to be drilled. It belongs to real people. It comes from their education, their work, their cultures, their failures, and their years of trying to become good at something. This makes the business of human data feel extractive by nature, but not inevitably exploitative. The difference depends on whether we treat human knowledge like cheap fuel, or like something that’s worth our respect. This difference alone is enough to separate the winners from the losers in the human data space.
I am still immensely hopeful for the future and the ways AI will affect the lives of all people, but not because I think better models automatically make a better world. My hope comes from the fact that model behavior is still something we can shape. The way these systems answer, reason, refuse, explain, and admit uncertainty is trained through the accumulated judgment of real people. I have seen tasks move through layers of review and quality checks because the thing being captured is fragile: real judgment and knowledge. Every profession contains some hidden structure. We all carry around some private compression of the world that was never written down, and human data is one way of turning that specific knowledge into something that can steer AI toward better behavior.
This is why the cultural contempt for AI feels more complicated to me than I think it does to a lot of people. I understand the stigma. Some of it is well deserved. Tech culture can be arrogant and careless (“move fast and break things”), and a lot of AI products are being forced into places where they do not belong. But part of me still feels uneasy treating something that contains so much knowledge with pure contempt. I do not think models should be worshipped, or that criticism of AI is small-minded. I mean almost the opposite: knowledge itself deserves reverence, especially when it has been earned by people over years of study, practice, failure, and attention. I have so much respect for the knowledge found in our books, our institutions, and our societies, which together encode more intelligence than any one person could hold. AI is increasingly becoming a vessel through which human knowledge is compressed and returned to the world. That makes it dangerous in proportion to how much it carries, and worth handling with care for the same reason.
To be clear, respect is not the same as trust. In fact, respect often requires more scrutiny. We inspect bridges and elevators because we respect the lives of their passengers. We peer review science because we respect empirical truth. AI should be treated the same way. The right response to intelligence is reliable stewardship, not contempt or blind acceleration.
I’ve learned to love human data, and I’m honestly excited to sit at the frontier of AI research and development. If billions of people are going to think with, learn from, and build alongside AI systems, then the behavior of those systems really matters. The humans in the loop matter, and the design of the loop matters even more. At its worst, human data can become a machine that consumes human attention and converts it into model capability. At its best, which is what we are striving for, it becomes a way of rewarding people for the expertise they spent years developing, preparing the next generation of industry professionals for the AI economy, and shaping systems that will mediate how people understand the world for years to come.
So, if we are going to build machines that learn from humanity, then we should make sure they learn from us carefully, honestly, and with respect for the people doing the teaching.