(Re)Share | #29 - Hard Problems for 2024
Sub-scale ML models | Robotic chefs | Nuclear power | EV tax credits
Welcome back everyone and happy 2024! I’m very excited to be back from break and I hope everyone reading this can say the same. One of my biggest new year’s resolutions is to make (Re)Share an actually weekly newsletter rather than an occasionally weekly one, so hopefully you’ll be seeing my name in your inbox much more frequently this year. With that said, I missed my personal reading challenge by 75%, so I don’t have a great track record with resolutions.
Shameless Plug
Hard Problems for 2024 - It’s now a personal tradition to kick off the new year with a post summarizing the areas I’m most interested in for investment over the coming 12 months. This year I’m taking things one step further by committing to develop a mini-thesis for each over the coming year. Long-time readers will note that the list broadly reflects what I include in this newsletter. Up first on the docket: robotics.
Prideful Plug
Road Trippin - Ventures Together is one of the strongest founder-only investment syndicates in London and includes some of my favorite people in the ecosystem. While their investment process and member caliber are reason enough to be impressed, their new charity initiative takes it up a notch. A collection of founders and investors is joining forces with the Driving Ukraine team to transport a convoy of vehicles 1,318 miles to Lviv, Ukraine. These vehicles will play a crucial role in saving lives in the conflict, and they’re needed more than ever. If you have the means, please join me in donating to this inspiring cause.
Stuff Worth Sharing
It’s important to be prompt - The horizontal (generalist) vs. vertical (specialist) debate is present in many walks of life, but nowhere is it more active than in language models. Much ink has been spilled over the opportunities and challenges of taking either modeling path, perhaps best articulated by my friend Mike in his AI business models post. This research by Microsoft gives ammunition to the generalist camp by showing that GPT-4 can outperform medical specialist models through novel prompting strategies. The piece builds on their earlier research, but the newer work is far more definitive. Advanced prompting is an interesting area to think about. On the one hand, compare the insanely impressive images some users are able to create with Midjourney against my “artistic” trash. On the other hand, it seems impossible to believe that complex, nuanced prompting will be accepted as a necessary UX for the mass market. SaaS has shown that everyone wants an all-in-one solution until the UX becomes unwieldy, which gives birth to a highly fragmented collection of point solutions, which then starts to buckle under sheer volume, and the cycle continues. LLMs could chart a different course given the plasticity of the capability, the potential abstraction points and, perhaps most critically, the cash in play.
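For the curious, the broad shape of these prompting strategies is easy to sketch. The toy Python below combines two common ingredients, few-shot chain-of-thought exemplars and majority voting over multiple sampled answers. It’s a minimal illustration of the general idea, not Microsoft’s actual pipeline, and `call_model` and the exemplar pool are placeholders you’d swap for a real API and real worked examples:

```python
import random
from collections import Counter

# Hypothetical stand-in for a chat-completion API call; swap in your
# provider's client. Returns one sampled answer string.
def call_model(prompt: str, temperature: float = 0.7) -> str:
    raise NotImplementedError("plug in a real model API here")

# Placeholder pool of (question, worked chain-of-thought answer) exemplars.
FEW_SHOT_POOL = [
    ("<exemplar question 1>", "<step-by-step reasoning> Answer: B"),
    ("<exemplar question 2>", "<step-by-step reasoning> Answer: D"),
    ("<exemplar question 3>", "<step-by-step reasoning> Answer: A"),
]

def build_prompt(question: str, k: int = 2) -> str:
    """Prepend k few-shot exemplars with worked reasoning, then ask the target question."""
    shots = random.sample(FEW_SHOT_POOL, k)
    blocks = [f"Question: {q}\n{a}" for q, a in shots]
    blocks.append(f"Question: {question}\nLet's think step by step.")
    return "\n\n".join(blocks)

def answer_by_vote(question: str, n: int = 5) -> str:
    """Sample n independent reasoning chains and majority-vote on the final answers."""
    votes = [call_model(build_prompt(question)).strip().splitlines()[-1] for _ in range(n)]
    return Counter(votes).most_common(1)[0][0]
```

The point of the sketch is that the “advanced” part is mostly orchestration around the model, which is exactly why it’s hard to see mass-market users doing it by hand.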
Sharing is caring - AI’s on a tear because data is abundant, but that’s much less true in the world of atoms. Robotics has struggled in the past for several reasons, but one of the most significant is the training-data cold-start problem. Typically, training sets are created slowly and tediously by researchers in laboratory environments for very specific tasks, making any broad-scale deployment impossible. The RT-X project aims to overcome this, with 32 robotics labs pooling their efforts to curate a dataset of nearly a million robotic trials across 22 types of robots in the hopes of enabling generalist, cross-embodiment functionality. The post features a range of impressive trial recordings and zero-shot executions, so I encourage a full read.
01011001 01100101 01110011, Chef! - Speaking of RT-X, a research group at Stanford released their work on Mobile ALOHA and it is absolutely delightful. Seriously, if there is one link you click in this entire issue, this is it. The team aimed to develop a system for low-cost, light-touch robotic training on a mobile (i.e. not fixed), bi-manual (i.e. two-handed) platform, and the results are incredible. The system was able to perform a wide range of complex tasks like laundry, house cleaning and, my personal favorite, cooking a 3-course Cantonese meal. It leverages imitation learning, whereby a human tele-operates the rig as a means of demonstration, and achieved success rates of over 80% with fewer than 50 human demonstrations per task. With a total cost of just $32K, this is a very promising development for a multi-functional robotics future. Full paper here.
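To make “imitation learning” concrete: at its core it’s supervised learning on (observation, action) pairs recorded during teleoperation. The toy PyTorch loop below shows that behavior-cloning skeleton. The real Mobile ALOHA system trains a far more sophisticated policy on camera images, so treat every shape, layer and hyperparameter here as invented for illustration:

```python
import torch
import torch.nn as nn

# Invented stand-ins for demonstration data: (observation, action) pairs
# recorded while a human tele-operates the robot.
obs = torch.randn(500, 32)       # 500 timesteps of 32-dim observations
actions = torch.randn(500, 14)   # 14-dim actions (e.g. two 7-DoF arms)

# A small policy network mapping observations to actions.
policy = nn.Sequential(
    nn.Linear(32, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 14),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Behavior cloning: plain supervised regression onto the demonstrator's actions.
for epoch in range(100):
    loss = nn.functional.mse_loss(policy(obs), actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# At deployment the trained policy stands in for the human operator:
# next_action = policy(current_observation)
```

The elegance of the approach is that the “labeling” step is just a human doing the task a few dozen times, which is exactly why the per-task demonstration count matters so much.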
A picture’s worth 1,000 words - Deepmind released their latest text-to-image tool, Imagen 2, and it’s awesome. The post goes into the typical detail of how the model was trained, and I found their approach to aesthetic scoring (weighting based on human preferences for qualities like good lighting, framing, exposure and sharpness) quite intriguing. The piece goes on to highlight more advanced functionality such as in-painting, along with the default integration with Google’s image provenance offering, SynthID, which we’ve discussed in past issues.
It’s not the size that counts - With cries of an impending data drought on the rise, I’m increasingly interested in solutions for “small data”. Microsoft Research released their work on Phi-2, an attempt to show that big things can come in small packages. Phi-2 is a 2.7 billion-parameter language model that demonstrates outstanding reasoning and language understanding, showcasing state-of-the-art performance among base language models with fewer than 13 billion parameters. On complex benchmarks Phi-2 matches or outperforms models up to 25x larger, thanks to new innovations in model scaling and training data curation - more specifically, “textbook quality” data. Microsoft may be tied for the world’s largest company, but the little guys still have a shot.
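One nice side effect of the small size is that you can actually run it yourself. Phi-2’s weights are published on the Hugging Face Hub as “microsoft/phi-2”, so something along these lines should work (the prompt is mine, and older transformers versions may need trust_remote_code=True):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load Phi-2 from the Hugging Face Hub; at 2.7B parameters it fits on
# consumer hardware, unlike its 25x-larger competition.
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype="auto")

inputs = tokenizer("Explain why small language models matter:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```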
Chip stack - What better way to kick off a new year than by fanning the flames of techno-nationalism? This article in The Economist shines a light on the recent trend of national governments anointing AI champions. The piece mainly highlights Abu Dhabi’s AI71, but it also covers Mistral, Aleph Alpha and others. Billions in government spending are being earmarked for compute (see the chart in the piece), which makes for a sobering read for my fellow Anglophiles because we’re really, really behind.
It’s a wonderful half-life - I’ve shared multiple episodes of the Age of Miracles podcast in past issues, but I sort of wish I had waited until now. With the season over, full interviews are being released, and this conversation with Zeno Power founder Tyler Bernstein is awesome. Zeno is harnessing the power of radioisotopes to provide low-wattage power systems for lunar deployments. It’s a super interesting discussion that covers radioactive decay, the state of space mission regulation and NASA’s upcoming Artemis missions. If you’re a space and / or nuclear nerd, this is a good one.
To have yellowcake and eat it too - If you’re still jonesing for some radioactive intrigue, then you’re in luck. This Future of Life Institute podcast episode interviews Carl Robichaud, a nuclear security and weapons policy expert. I’ve been thinking a lot about nuclear deterrence over the past few months as I’ve explored AGI protections, so this discussion comes at a good time. It’s a far-reaching, in-depth discussion, and if you’re interested in the topic it’s great.
Government stimuloss - Apparently EV tax credits are becoming more and more out of reach. Beyond Teslas and a few other models, almost none of the cars on the market today qualify for the full $7,500. Fans of the Chevy Bolt can breathe a sigh of relief, but if you actually like the look of that inbred Prius then you have bigger challenges ahead.
Lost in translation - A court in DC is hearing a securities fraud case that could validate or nullify the legal weight of emojis. This article doesn’t really fit with the rest of the newsletter, I know, but seeing as I “formally” signed my Fly VC job offer with a 🤝, I thought this was entertaining.