About THE DATASET

arun rao

linkedin twitter github

i co-lead a large team of research engineers working on ML recommenders and ranking models at Meta CoreML; we serve 2.7bn people.

i spend my time thinking about real-world applications of ML, crypto, internet protocols, APIs, contemporary art, and design.

i write the Substack letter Hash Collision, where i report on developments in science, tech, culture, economics, and ethics. i co-wrote “A History of Silicon Valley,” 2nd Ed.

in the past, i led an applications and platform machine learning team at Amazon Music in San Francisco. prior to that, i was a co-founder of Starbutter AI (Skydeck F17) and a quant debt and derivatives trader at PIMCO.

i’m a grad of UCLA, Penn/Wharton, and the Indiana Academy.

i like books, running, yoga, math, grammar and style, dead ancient languages, modern computer languages, black hole physics, bears, and pad see-ew.

writing

What the heck is Web3? A primer.

8 Surprising Things You Didn’t Know about the Metaverse, and How it Will Change Your Life More than Your iPhone

Why Crypto Matters (It’s a Lot More than Bitcoin)

Capitalism, Socialism, and Inequality

Best Books of 2020

Why the new AI/ML language model GPT-3 is a big deal

Capital

Primer on AI and Machine Learning (Part 1, Non-Technical)

Machine Learning (including Deep Learning and Reinforcement Learning) for Engineers (Part 2, Technical)

Musings on Datasets

A List of the World’s Best Artificial Intelligence (AI) Institutes and Think Tanks

resources

Google: The Unreasonable Effectiveness of Data – 2009

Google: The Unreasonable Effectiveness of Data – Revisited 2017

OpenAI: GPT-3: Language Models are Few-Shot Learners

MIT: The Data Nutrition Project

OpenAI/Jack Clark: Import AI on new Machine Learning advances

UC Irvine: Machine Learning Repository of Datasets