(Re)Share | #36 - Revenge of the Old Salts
Geo-engineering | AI safety | Multi-agent debate | Synthetic voice | Drug pricing
Greetings from San Francisco! Today marks the first of several weeks that have me everywhere but home: SF —> LA —> Phoenix —> LA again —> Chicago —> NYC for like a minute —> Chicago again —> London —> Berlin —> probably an early grave. So if you’re around and want to see me, the window is closing.
Stuff Worth Sharing
Silver Linings DO NOT Playbook - We’ve talked about geo-engineering a number of times, specifically Solar Radiation Modification (SRM), but to steal a line from Stefon, 'this article has everything': science experiments, naval ships, a group of rag-tag octogenarian climate researchers who reluctantly came out of retirement and call themselves the Old Salts. SRM involves shooting submicron particles, about 1/700th the thickness of a human hair, into the atmosphere at a rate of a quadrillion particles per second. The result is an enrichment of reflective clouds that could, in theory, offset the warming effects of GHGs. As predicted, this approach receives a lot of pushback from well-intentioned but woefully naive activists. For example, '...an extraordinarily dangerous distraction...the best way to address climate change would be to quickly pivot away from burning fossil fuels.' Last year was the hottest in recorded history. In case you missed it, Trump is the nominee, so I don’t think reducing emissions is still the top priority for many.
Safety first among equals - The UK and US announced a collaborative agreement for the development of AI safety standards and systems. The partnership is the first of its kind and builds on last year’s international AI safety summit and the UK’s AI Safety Institute (AISI). Details are light, but the framework seems focused on the two nations sharing ideas and insights on how to evaluate AI systems, as well as physically swapping researchers through secondments. It’s an interesting niche the UK is attempting to carve out for itself as the de facto Switzerland of AI - balancing the capitalist/pseudo-accelerationist US with an increasingly skeptical Europe. To be fair, France is trying to do this too, but they’re pre-Frexit.
Wastewatermarking - A lab at ETH Zurich demonstrated that current Gen AI security measures are inadequate. The team managed to bypass safety systems designed to identify AI-generated text using little more than $50 and some creative prompting. They achieved this by reverse-engineering the watermarking rules through comparative analysis, then deploying a series of spoofing attacks to make AI text appear human-derived. The team had an 85% success rate in their study, conducted as a fairly ad-hoc operation. We’ve talked a lot about watermarking and other embedded authentication services in past issues. The more I dig in, the more I come to view detection as an impossibility, at least in written text. Increasingly, I’m drawn to human-based verification tools, e.g., blockchain-based 'I approve this message'.
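To make the attack surface concrete, here is a toy sketch of how a token-level "green list" watermark (in the style of common LLM watermarking schemes, not necessarily the exact scheme the ETH team attacked) can be gamed once its rules are recovered: an attacker who has reverse-engineered the green lists can steer the detector's score in either direction. All names and the tiny vocabulary are illustrative.

```python
import hashlib
import random

VOCAB = [f"w{i}" for i in range(50)]  # toy vocabulary

def green_list(prev_token, frac=0.5):
    # Deterministic "green" subset of the vocab, seeded by the previous
    # token. Watermarked generation biases sampling toward this subset.
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * frac)))

def green_fraction(tokens):
    # Detector: what fraction of tokens fall in their green list?
    # Watermarked text scores near 1.0; ordinary text near `frac`.
    pairs = list(zip(tokens, tokens[1:]))
    hits = sum(t in green_list(p) for p, t in pairs)
    return hits / max(len(pairs), 1)

def spoof(length=40, seed=0):
    # Attacker who has recovered green_list() via comparative queries
    # can compose text that scores as fully "AI-watermarked" (or, by
    # avoiding green tokens, strip the signal from real AI output).
    rng = random.Random(seed)
    tokens = [rng.choice(VOCAB)]
    for _ in range(length - 1):
        tokens.append(rng.choice(sorted(green_list(tokens[-1]))))
    return tokens
```

The point of the sketch: once the partition rule leaks, the statistical signal the detector relies on becomes a knob the attacker controls, which is why I'm skeptical of detection for written text.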
Imitation game - DeepMind unveiled SIMA, a generalist AI agent that can take natural language instructions for task execution within 3D virtual settings. This is really interesting research that opens numerous opportunities across simulation, robotics and embodied intelligence. DeepMind trained and tested SIMA across nine gaming titles and saw impressive cognition and versatility. The training methods were particularly interesting to me - not just the standard human play monitoring / imitation, but also post-play description. In other words, I play a game, have it recorded, and then verbally describe what I did. This method opens up enormous skill-learning opportunities in the analog world when you consider the mass of video recording already in existence.
Across the AIsle - A few weeks ago, we discussed the growing field of collective / multi-agent intelligence. This research continues that thread by exploring the question of LLM alignment via debate. The researchers structured a series of tests to see if 'non-experts' (i.e., humans or non-specialized models) could select the correct answer from two arguing expert models. Turns out that yelling at one another works! Debate improves the ability of both non-expert models and humans to answer questions correctly, achieving significantly higher accuracy than naive baselines. The researchers also found that optimizing expert debater models for persuasiveness in an unsupervised manner can enhance the ability of non-experts to identify the truth in debates. This study contradicts my own findings over the past seven years of marriage.
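The debate protocol itself is simple to state. Here's a minimal structural sketch, with stub debaters and a stub judge standing in for the LLMs the researchers actually used; function names and the round structure are my own simplification, not the paper's code.

```python
def debate(question, answers, argue, judge, rounds=3):
    # Two experts each defend one candidate answer across several
    # rounds, all arguments accumulate into a shared transcript,
    # and a non-expert judge picks the winner from the transcript.
    transcript = []
    for _ in range(rounds):
        for side, ans in enumerate(answers):
            transcript.append((side, argue(question, ans, transcript)))
    return judge(question, answers, transcript)

# Stubs for demonstration; in the study these are LLM calls.
def argue(question, answer, transcript):
    return f"I claim the answer to '{question}' is {answer}."

def judge(question, answers, transcript):
    # A real judge weighs the arguments; this stub just picks side 0.
    return answers[0]
```

The interesting result is that this structure helps even when the judge is much weaker than the debaters, which is the property that makes debate attractive as a scalable-oversight mechanism.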
Self-care - DeepMind makes its second appearance this week with the release of Promptbreeder, a general-purpose self-improvement mechanism that evolves and adapts prompts for a given domain. The approach was successful across basic tasks (math problems, basic reasoning) and complex ones (identifying hate speech). It does this through a two-part evolutionary system that optimizes task-prompts and mutation-prompts in tandem. Task-prompts essentially mirror the actions we perform in our favorite LLM interfaces, while mutation-prompts operate at a higher cognitive level, iteratively refining how the task-prompts themselves get rewritten. Remember when Prompt Engineering was an exciting new career path? Me neither.
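Stripped of the LLM machinery, the core of this kind of system is a binary-tournament evolutionary loop. This is a toy sketch of that loop under stand-in `mutate` and `score` functions (in Promptbreeder proper, both roles are played by LLM calls, and mutation-prompts are themselves evolved); the names here are illustrative, not DeepMind's API.

```python
import random

def evolve_prompts(seed_prompts, mutate, score, generations=10, seed=0):
    # Binary tournament: sample two prompts, overwrite the lower-scoring
    # one with a mutation of the winner, repeat, return the best found.
    rng = random.Random(seed)
    pop = list(seed_prompts)
    for _ in range(generations):
        a, b = rng.sample(range(len(pop)), 2)
        winner, loser = (a, b) if score(pop[a]) >= score(pop[b]) else (b, a)
        pop[loser] = mutate(pop[winner])
    return max(pop, key=score)

# Stand-ins for demonstration only; real versions are LLM-driven.
def mutate(prompt):
    return prompt + " Think step by step."

def score(prompt):
    return len(prompt)  # proxy fitness; real fitness is task accuracy
```

The design point worth noting: because the mutation operator is itself a prompt that can be mutated, the system improves the way it improves, which is what separates this from ordinary prompt search.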
Don’t Speak - OpenAI announced Voice Engine, a text-to-speech AI model for creating synthetic voices based on a 15-second segment of recorded audio. Voice synthesis is not a new idea, but fidelity and required training data still pose barriers. As an example, one of the leading players in the space, ElevenLabs, requires 30 minutes of recorded speech to create a professional-grade clone. OpenAI’s model has drastically improved on this, so much so that it triggered safety concerns and is not being released. If you’ve watched any of the tech executive congressional hearings, you’ll understand why. Election interference is a huge fear of Congress, and this type of technology, along with deep fakes, has taken center stage. For those interested, OpenAI’s own blog post on the topic is here.
Over the counter and through the roof - You probably already know that Americans pay more for prescription drugs than the rest of the world, but you probably don’t want to know how much. For example, an HIV antiviral costs approximately $25/month in Germany and—wait for it—$4,000/month in the States. This fantastic article by SoltDB offers a wide-ranging global view of 2023’s blockbuster drugs (sales > $1 billion). The article provides a comprehensive analysis of various data points, including the distribution of blockbuster drugs, the conditions they treat, and changes in blockbuster drug revenue from 2022. For political junkies, scroll to the bottom for a take on what drives the asymmetric pricing of American prescriptions. Spoiler alert: Congress.
Chip stack - Micron’s Inventory Management team has a pretty cushy year ahead. The company announced that it has completely sold out its high-bandwidth memory (HBM3e) chips for this year and most of next. This is perhaps not shocking given Micron’s gilded relationship with Nvidia’s H200 accelerator, but it still astounds me just how big the AI boom is.