(Re)Share | #30 - The Robots are Coming!
Robotic training | Zero-shot navigation | Data provenance | Gene therapy
In the last issue I introduced my 2024 investment interests and specifically highlighted that the first deep dive would be robotics. Since then I’ve done little else with my free time but read through research on the subject and harvest my LinkedIn for people to speak to. You will very much see this new focus in the links below, so if you’re not into robots, adaptable AI training systems and the inevitability of our future mechanical overlords, this may not be the (Re)Share for you.
Stuff Worth Sharing
Autobots, roll out! - In the last issue I included the RT-X project, a multi-lab collective effort to pool training data across a diverse set of rigs and tasks in order to scale generalized capabilities. Google is working on its own effort, Robotics Transformer (RT-1), a model designed to transfer knowledge from task-agnostic datasets to specific tasks, either zero-shot or with small task-specific datasets, at a high level of performance. The compression of implementation (time, cost and effort) is an extremely exciting area of development. Jury’s still out on whether the most attractive manifestation of this scale hacking will be previously impossible / high-skill applications (e.g. maintenance, medicine) or the long tail of highly varied / lowish-value ones (e.g. elderly care).
Search engine - FAIR, Mistral and a bunch of university researchers released their Go To Anything (GOAT) agent, the first universal navigation system that can search for and navigate to any object in unseen environments without pre-computed maps or object locations. It’s pretty remarkable to watch, and several experiment recordings are linked. These terrifying horse robots get dropped into a random Airbnb (imagine the security footage) and are then tasked with running around to look for and retrieve items. The system does this through an interesting architecture of global models (assignment), local models (semantic mapping) and a collection of functional models for perception, depth estimation, planning, etc. Interns around the country beware - well, at least the paid ones.
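For the curious, the global/local split described above can be sketched as a modular agent. Everything here (class names, method names, the toy return values) is hypothetical and only illustrates how assignment, semantic mapping and the functional modules might compose; the actual GOAT system is far richer.

```python
# Hypothetical sketch of a GOAT-style modular navigation agent:
# functional modules feed a local semantic map, which a global
# step uses to assign and plan toward a goal object.
from dataclasses import dataclass, field


@dataclass
class FunctionalModules:
    # Stand-ins for the perception / depth-estimation / planning models.
    def perceive(self, frame):
        return {"objects": ["mug"]}  # open-vocab detections (toy value)

    def plan_path(self, semantic_map, goal):
        return ["forward", "left"]   # low-level actions (toy value)


@dataclass
class GoatAgent:
    modules: FunctionalModules = field(default_factory=FunctionalModules)
    semantic_map: dict = field(default_factory=dict)

    def step(self, frame, goal):
        # Local: update the semantic map from the current observation.
        obs = self.modules.perceive(frame)
        self.semantic_map[goal] = obs["objects"]
        # Global: plan toward the assigned goal using the map so far.
        return self.modules.plan_path(self.semantic_map, goal)
```

The point of the structure is that the map persists across steps, so nothing needs to be pre-computed before the robot is dropped into the Airbnb.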
Do it again monkey! - I can’t tell you the number of times I’ve gone hoarse from verbally abusing my Alexa when it errs. This Stanford group developed a system for language-guided human-robot interaction within a shared autonomy paradigm. Instead of discrete turn-taking between human and robot, the LILAC system splits agency between the two, with language as the core input in a low-dimensional control space. A purely verbal interface could be a transformational unlock for robotics, and it’s one of the areas I’m most curious and excited about.
What’s in the box?! - Questions on ownership of the knowledge extracted by LLMs and the risk of copyright infringement are a hot topic. We talked about this a bit last year, and my personal research suggested that data provenance was effectively impossible. This new research from Imperial suggests that document-level membership inference (has an LLM been trained on a document or not) is within reach. The method queries the model in question for token-level predictions, normalizes those predictions for how common each token is, aggregates the predictions to the document level, and then builds a meta-classifier for a final call. With an AUC of 0.856 for books and 0.678 for ArXiv papers, the results are promising but wouldn’t win a case. Sarah Silverman will have to wait.
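The normalize-then-aggregate step is simple enough to sketch. This assumes you already have token-level log-probabilities from the model under audit and a reference frequency table; all names are hypothetical, and the paper’s actual features and classifier differ.

```python
# Hypothetical sketch of document-level membership inference features.
# Inputs: tokens, their log-probabilities under the audited model,
# and ref_freq, a reference corpus frequency for each token.
import math
from statistics import mean


def normalized_token_scores(tokens, logprobs, ref_freq):
    # Rare tokens are naturally low-probability under any model;
    # subtracting log(reference frequency) isolates the signal that
    # the model is unusually confident, i.e. possible memorization.
    return [lp - math.log(ref_freq.get(tok, 1e-9))
            for tok, lp in zip(tokens, logprobs)]


def document_features(tokens, logprobs, ref_freq):
    scores = normalized_token_scores(tokens, logprobs, ref_freq)
    # Aggregate token-level scores into document-level features that
    # a meta-classifier (e.g. logistic regression) would consume.
    return {"mean": mean(scores), "min": min(scores), "max": max(scores)}
```

The meta-classifier then learns, from documents with known membership, which feature profile says "seen in training" versus "unseen".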
Central planning - Some of the most groundbreaking advancements in machine learning over the past decade have come from the compounding advantage of unsupervised reinforcement learning. For all the obvious atoms-based reasons, that is very difficult in robotics. DeepMind’s AutoRT project provides a system that leverages foundation models to scale up the deployment and orchestration of operational robots in completely unseen scenarios with minimal human supervision. A VLM produces an open-vocabulary description of what the robot sees, that description is passed to an LLM which proposes natural-language instructions, and the proposals are then critiqued by another LLM, the robot constitution, to refine instructions towards safe, completable behaviors. If that sounds like the first of 12 steps ending in murderbots, rest assured that the system’s been built with Asimov’s Laws in mind.
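The describe → propose → critique loop might be wired up like this, with trivial stand-ins for the VLM and LLM calls. All names and toy outputs here are hypothetical; they only show the shape of the pipeline, not AutoRT’s actual implementation.

```python
# Hypothetical sketch of an AutoRT-style propose/critique loop.
BANNED = ("person", "human", "animal")  # stand-in for the "robot constitution"


def describe_scene(image):
    # In the real system a VLM produces an open-vocabulary description.
    return "a sponge and a cup on a table"


def propose_tasks(description):
    # In the real system an LLM generates candidate instructions
    # grounded in the scene description.
    return ["pick up the sponge", "wipe the cup", "hand the cup to a person"]


def critique_task(task):
    # In the real system a second LLM judges each proposal against the
    # constitution; a keyword filter stands in for that judgment here.
    return not any(word in task for word in BANNED)


def plan(image):
    description = describe_scene(image)
    return [t for t in propose_tasks(description) if critique_task(t)]
```

The design choice worth noting is that safety lives in a separate critic stage rather than inside the proposer, so unsafe candidates are filtered out before any robot moves.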
Web of lies - Japanese scientists successfully developed synthetic spider silk through an innovative microfluidic device that manipulates small amounts of fluid through narrow channels and negative pressure. Spider silk has a tensile strength comparable to steel and a strength-to-weight ratio that makes it perfect for biomedical applications, such as sutures and artificial ligaments.
Storm Troopers - A meteorology team out of the University of Oklahoma (who knew?) developed a computational model that leverages lightning flash counts to make weather predictions more reliable. The method, dubbed “Flash Extent Density measurement”, uses geostationary satellites armed with cameras designed to detect lightning through an array of optical sensors. It records not only point-like lightning locations but also the horizontal extent of the detections, making it possible to watch how lightning evolves over time. Initial results showed equivalent predictive power at much lower compute costs, but they might get another lucky strike (😏).
Ledge computing - There have been a lot of takedowns of the increasing use of AI social agents to fight loneliness, but this research in Nature provides a pretty uplifting take. A survey of 1,000+ student users of Replika found that in 3% of cases the user halted their suicidal ideation. Mental health is a massive, massive challenge on several fronts, so any effort to overcome it is welcome. While it may seem weird to divulge to an artificial listener, existing options are limited. I trialed BetterHelp last year and it was the worst product experience since Comcast.
Hear me out - A recent study showed a breakthrough gene therapy could provide a cure for congenitally deaf children. Though a small study (six children in total), the results over 26 weeks are extremely promising, with five showing strong improvement in hearing ability. The children in the trial suffered from a condition called DFNB9, total deafness resulting from a mutation in the OTOF gene. The solution was to deliver a targeted gene therapy via a host virus. Interestingly, the gene therapy was too large for a single AAV (adeno-associated virus) - an interesting area for investment - so it had to be split in two and delivered via separate viral vectors. A massive win for the field and a feel-good story if there ever was one.
Portfolio Flex
Giddy up! - Ophelos CEO and co-founder, Amon Ghaiumy, was featured on the Riding Unicorns podcast to tell his inspiring story of using AI to solve debt collection.
Let’s get clinical - Pear Bio, alongside their partners the Royal Free Hospital, UCL and Kidney Cancer UK, released the first results of their kidney cancer clinical trial to great success.