Alienation
From the outside, communities can be identified by their shared collective behaviors, whether it’s slang, clothes or general values.
How else do you explain the below outfit? For added realism imagine a Blue badge and head tilted downward desperately avoiding eye contact. Just close your eyes and imagine them coding away at a café on side projects on their silver MacBook with corporate stickers. It’s the weekend and who can handle knowing that more successful people exist somewhere.
For self-proclaimed individual-minded folks, we sure do all look the same.
The Quest for meaning
For all we share in the real world our differences are metastasizing far more in our online personas. The superheroes of old are dead, we don’t need a caped crusader to protect us from a tyrannical state, we need a Venture Capitalist live action roleplaying his way to Mayor to help us better manage the decay in our cities. Danger lies in our processes and not our enemies, we crave volatility, seeking refuge in the wisdom of the ancients. It’s not SuperMan that will save us but LindyMan.
So what gives? Why are lots of smart people both alienated and rich? This is perhaps the first time in history where nerds can be so well paid but something seems missing. Maybe moving halfway across the world into a 1 bedroom apartment wasn’t really worth it. What is one to do with their safety net? Daydreaming or hustling don’t feel quite as satisfying as making and contributing to a community.
So why feel upset when identical projects become bigger than ours, does jealousy betray our true intentions? Why do we use software tools at all?
Dev tools as actualization
People use software not (only) because it solves a business problem but because it solves a core emotional need.
Solving a technical problem means your users will exchange money for a solution to that problem.
Solving an emotional problem means your users are instead community members, who will contribute code, ideas, blog posts, videos and a whole bunch of externalities that come with attention to make your success assured.
The successful dev tools of the near future will all be communities.
How many businesses can claim to help you solve both your psychological and self fulfillment needs? The market cap of that is the sum total historical value of states, religions and the entertainment industry.
However, failing to address these emotional needs has made it very difficult for most Machine Learning startups to achieve any kind of widespread success.
The Business Models of AI
Can we use AI to tell us which projects to invest in?
AI won’t help you reach product market fit but it’ll help you scale your business to unprecedented heights once you do.
Business models in Machine Learning startups have traditionally fallen into one of 5 camps which I’ve sorted in increasing profitability.
Service
What: Charge user per inference
Pro: Clear business model
Con: Network distillation makes it impossible to keep a moat. One research paper away from being disrupted.
Example: Clarifai
Consultancy
What: Help businesses without ML expertise to use ML
Pro: Lots of low hanging fruit with high upside
Con: Only sustainable if Machine Learning continues to be hard (it won’t). The vast majority of ML startups are consultancies.
Example: Most ML startups
Exception: If you’re Geoff Hinton tier, starting an LLC lets you get acquired instead of getting hired like the rest of us plebs
Media
What: Publish cool papers and get lots of attention
Pro: You get to work on whatever you want and hire the best
Con: Only possible if you’re already very rich and successful
Example: OpenAI
Note: OpenAI is the hardest to value because the likelihood of AGI is small but the reward is infinite. If they succeed they will join the greats like Einstein and Newton, and if they fail they will be remembered as a Ponzi.
Platform
What: People build their own services on top of your platform
Pro: Value is a percentage of everything built on top of it. This is good.
Con: Hard to build
Example: Weights and Biases
Community
What: Platform + strong in-group identity
Pro: Has benefits of platform, media and religion
Con: Very hard to build
Example: HuggingFace
Knowing which domain you’re in is crucial for success. If you think you’re building a platform but are instead building a media business, then you’re competing against the best media businesses, not just your competitors.
If your audience is primarily legacy financial institutions then openness and quirkiness are liabilities.
Traditional enterprise sales would dictate that you build something useful and then try to sell it to one of the larger cloud companies whose ML modeling pipeline tends to now look like the below.
These large companies don’t need your service, media, consulting services or platform since they have enough internal versions of it. The entire premise of Enterprise Sales seems to have a major weakness when it comes to Machine Learning startups.
The only thing that makes these large ML companies crack is large communities that they end up having to support out of existential necessity.
Embrace, Extend, Extinguish is the only possible strategy for survival against strong communities but it always runs the risk of not delivering the killing blow.
Metcalfe’s law
Metcalfe’s law has been a great way to value social networks before they are profitable. The value of a community increases as a square of the number of users. But a community of makers instead of just consumers increases that exponent even more.
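To make the scaling claim concrete, here is a rough sketch of a Metcalfe-style power law. The function names, the constant k and the 2.5 "maker" exponent are all assumptions made up for illustration; only the exponent of 2 comes from Metcalfe’s law itself.

```python
def network_value(users: int, exponent: float = 2.0, k: float = 1.0) -> float:
    """Value of a network under a power law: k * users ** exponent.

    exponent=2.0 is Metcalfe's law. Any larger exponent is a
    hypothetical stand-in for a community of makers.
    """
    if users < 0:
        raise ValueError("users must be non-negative")
    return k * users ** exponent


def value_ratio(users_a: int, users_b: int, exponent: float = 2.0) -> float:
    """How many times more valuable a network of users_a is than one of users_b."""
    return network_value(users_a, exponent) / network_value(users_b, exponent)


if __name__ == "__main__":
    # Doubling the user base under Metcalfe's law (exponent 2).
    print(value_ratio(20, 10))
    # Same doubling with an assumed, purely illustrative maker exponent.
    print(value_ratio(20, 10, exponent=2.5))
```

Under the exponent of 2, doubling the number of users quadruples the value. With the assumed exponent of 2.5 the same doubling is worth roughly 5.7x, which is the sense in which makers "increase the exponent".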
I’ve previously called OpenAI a media company and this has been a widely misunderstood point. What I should have said was that academia is in the media business, where contributions compete for the sum total of all researchers’ finite attention. OpenAI just happens to be the best at just that.
Media is the new marketing and recruiting strategy, but it’s potential energy that fails to materialize into something more without a platform or community to anchor on.
Reflexivity
How do you go from a cutesy emoji like 🤗, a fork of an existing project released by one of the largest companies in history
To saying:
“I think one of the big challenges that you have in machine learning, it seems these days, is that most of the power is concentrated in the hands of a couple of big organizations,” he said. “We’ve always had acquisition interests from Big Tech and others, but we believe it’s good to have independent companies — that’s what we’re trying to do.”
Is HuggingFace Live Action Roleplaying?
Yes! and this is a particularly effective strategy for any online first community. Reflexivity means that if enough people believe something then it’s more likely to become true.
In 2021, Reflexivity has become a popular and recurring theme in the media. Just buying Tesla and Bitcoin meant you outperformed most hedge funds. Reflexivity is so powerful that groups like WallStreetBets can coordinate by 💎🙌’ing their way to a global peaceful revolution against the existing financial system.
“If enough people believe something then it becomes true” finally entered the mainstream lexicon with “🦍s together strong”
The world is more interesting if HuggingFace believes it will be the ML company of the future, regardless of whether it’s true or not. The harder they believe it, the more likely people like myself will believe it, write articles about them and the feedback loop continues.
All of us are craving good stories.
FOMO insurance
HuggingFace is positioning itself to be the npm of Machine Learning but what are they actually selling? It’s not requests since they’re not a cloud provider, it’s not algorithms because they’ve open sourced them all.
HuggingFace sells FOMO insurance
FOMO is a real emotional problem that nags at you. It forces you to make bets instead of suffering from endless analysis paralysis, and those bets take a lot of time, education and grit to see through. HuggingFace hugs you and tells you it’ll all be OK.
How I would market HuggingFace if I were them: “Don’t worry about the specifics about which transformer architecture you’ll need to learn about or support, we’ll implement them all for you and let you know which one is the best one for you.
Someday you’ll find the right dataset and get to build your startup or get that promotion and we can promise you that the dataset loaders we’ll have for you will be 👨🍳 😘”
No downside, unlimited upside is a compelling value proposition. Especially if Transformers are universal computation engines. IMO, jury is still out on this claim.
But even if the jury is out, reflexivity could still make it so Transformers are actually the best architecture for most problems. Hardware will get hyper-optimized for them and report their public benchmarks, compiler writers will showcase their various IR, fusion and lowering tricks to make it so they are the best at running transformers. And now even if some other random technique would actually be better in the long run, it takes a lot of effort to knock ourselves out of this local optimum.
The Lean Startup Redux
The Lean Startup methodology posits that the best way to build a successful company is to figure out what your users want or need and build it for them. But for dev-tooling specifically we’ve artificially separated what our users want at a personal level vs what they want at a professional level.
TensorFlow was released in 2015, is supported by one of the largest companies in history and has about 155K stars on GitHub. HuggingFace is a project started by a few researchers with 44.8K stars.
Github stars are the best measure of product market fit for any developer tool
Product market fit doesn’t guarantee that you’ll supplant a tech giant but the lack of it guarantees that you won’t. If you can’t convince anyone to use your product for free, how would you ever convince them to pay you?
As another example: Julia the programming language runs programs quickly because of two core features: multiple dispatch and the Julia community.
Multiple dispatch because it helps you write generalizable code that runs fast, and the community because if you publicly announce anywhere that Julia is slow, a Julia community member will correct you by rewriting your entire codebase.
Historical Fiction
History is told from the perspective of the best storyteller.
How would you value a community when comparing businesses? Infinity? Zero? Something in between?
If models assessing the value of a business fail to capture the importance of having an audience in the form of Github stars or Twitter followers then instead of yelling out bubble before heading back to our armchairs it’s time to acknowledge that our existing explanations and stories have failed to provide an adequate explanation for the Rise of HuggingFace.
Some apocryphal story about tulips in the 1600s that lasted about 6 months has prevented us from acknowledging when paradigm shifts are happening.
HuggingFace’s moat is the community not the source code
The Religion of OSS
Religions have arguably the best structure to solve emotional problems. The best open source projects share many similarities with religions.
Founding moment: First commit
Scripture: Git history
Priests: PR mergers
Priests in training: PR committers
Followers: Users, Bloggers, Memers
Every company in the future will be a social network, but by focusing their audience they can create a much more tight-knit group identity.
Tinkering
As an example if I really wanted to see a pomegranate syrup Coke be released, I’d need to win some Coca Cola competition, there would probably be tons of entrants, backroom meetings between Coke execs, a long chain of complex commands that needs to be orchestrated across the entire supply chain. No matter how hard I try I will be a Coke consumer at best and not a community member.
Perhaps the only intelligent thing Marx has ever said was his theory of alienation that posits that workers detached from the right to think and participate in the design of their products invariably feel alienated. OSS communities solve this problem by making everything introspectable, remixable & improvable.
Next time you procrastinate on answering a Github Issue or Pull Request on your project consider that a real person on the other end was hoping to get your attention and feels upset or insecure that they didn’t get it. With more time and attention this is someone you could have turned into a maintainer. We all need pats on the back from the people we admire, so help first time maintainers to scope their contributions, don’t nitpick and make them feel valued and appreciated even if the contribution was simple.
Schrodinger’s benchmarks
Database Companies or Hardware Vendors will straight up sue you if you publish benchmarks insinuating they are slow
Open source companies: open Github issue to improve product performance
Closed source companies: sue people showing their slow performance
Closed source benchmarks can be both fast and slow depending on who runs them.
Open source products get hardened by constant user feedback.
The Recruiter’s Dilemma
We only hire the best
The best all have offers
The number one complaint I’ve heard from hiring managers is that they can’t find good candidates. My hot take is that this is a self-imposed problem caused by a narrow focus on what it means to be good, i.e. having worked at a more prestigious competitor.
An open source product with a strong community means you can hire the best without paying them top dollar and you’ll get a steady stream of contributions and bug fixes for free.
Community is a recruiter’s hack: hire the top contributors as software engineers, the best writers as marketers, the issue closers as product managers. They are the 100x employees hidden in plain sight. Give them a Bay Area salary, tell them they can work remotely and you’ll have an excellent and loyal employee.
This strategy heavily de-risks hiring for a company but also provides a good hedge for the candidate in case HuggingFace doesn’t have a role for them. If you tell the world that a reliable way for them to get a 6 figure ML job is to make a contribution to HuggingFace then a lot of people will make contributions to HuggingFace and a small percentage of them are likely to be very good. The HuggingFace ecosystem and skillset becomes more valuable to other tech companies.
Everything is awesome!
So why do all ML startups publish tutorials and blog posts? It’s not because they want to but because they have to. They are also attempting to LARP their way to a community but are unrelatable. They are your parents’ weird friend giving you real advice on life as you politely nod along and wish you were somewhere else.
“Content Marketing” is for the most part too safe, it’s a processed vanilla ice cream flavor meant to be disliked by as few people as possible but loved by almost no-one. You’re great, your competitor is great, everything is great!
You don’t build a community by bolting a “content strategy” on top of your product. HuggingFace interacts with their users personally, frequently and publicly. You root for the project and founders and you don't need a salary or equity to do so.
A community can only build a strong in-group identity if it’s OK with being disliked. This is something game developers understand very well. A community needs the ability to say I don’t like product X or company Y.
For example, the most hated character in DOTA 2 will never be removed from the game. Techies is a character that places bombs all over the map and can trigger them to kill almost any hero in the game while being on the other side of the map. Techies can make games drag out to 90+ minutes from the average 30 minute DOTA game. But the few players that play Techies love him so much that they often exclusively play this hero.
TL;DR
Online communities are replacing religions
Dev tools need to be open online communities
HuggingFace is a great reproducible case study for ML startups
Have fun, make friends, LARP more
Outro
Most of the above ideas are well known among Game Developers but have recently become more obvious in Open Source communities. HuggingFace was perhaps the ML company that embraced all of the above the most.
I’ve been meaning to write more about game developers and what we can learn from them when building other kinds of software businesses. So stay tuned and make sure to subscribe if that sounds cool.
Acknowledgements
Thank you sudomaze, oiboimishka and mczuggins for hanging out while I was streaming myself writing this on twitch.tv/marksaroufim
Ainur Smagulova, Jason Antic, Hamel Husain, Adam Nemecek & Andrew Carr for invaluable feedback.
And as always the Robot Overlord Discord community for being a seeding ground for many of my recent ideas.