analog transformers

Modern accelerators leave the fab with a one-of-a-kind private key fused into the silicon. Ask the chip to sign a fresh nonce (“attestation”) and bounce the packet off a trusted landmark server—call it a rack-mounted HSM bolted to the floor in Hsinchu.

Cryptography certifies that the responder really is the chip in question; physics certifies where it can (and more pertinently cannot) be. Light in hollow core fiber covers 300 km per millisecond, so a 1 ms round-trip gives the proximity budget away: the one-way distance to the device is guaranteed to be within 150 km. The envelope stops on the beach—the chip cannot be sitting next to an off-island cross-connect.
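The proximity arithmetic is simple enough to sketch. This is a minimal illustration, not a real attestation protocol: the function name, the `processing_ms` allowance for signing time, and the `velocity_factor` knob are all assumptions introduced here.

```python
# Upper bound on distance implied by a measured round-trip time (RTT).
C_KM_PER_MS = 299.792458  # vacuum speed of light, in km per millisecond


def max_one_way_km(rtt_ms, processing_ms=0.0, velocity_factor=1.0):
    """Largest one-way distance consistent with a measured round trip.

    processing_ms   -- hypothetical time the chip spends signing the nonce
    velocity_factor -- signal speed as a fraction of c (about 1.0 for
                       hollow-core fiber, roughly 0.67 for solid-core glass)
    """
    if rtt_ms < processing_ms:
        raise ValueError("round trip cannot be shorter than processing time")
    flight_ms = rtt_ms - processing_ms
    # Half of the flight time is spent on each leg of the trip.
    return 0.5 * flight_ms * C_KM_PER_MS * velocity_factor


print(max_one_way_km(1.0))                        # hollow-core, instant signing
print(max_one_way_km(1.0, velocity_factor=0.67))  # ordinary solid-core fiber
```

Any signing delay or slower medium only shrinks the envelope, so the 150 km figure is the generous case.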

And so on. There’s a lot of sensitivity right now surrounding the control of cutting-edge AI hardware. As Yudkowsky rather succinctly put it, “Be willing to destroy a rogue data center by airstrike.” Ding dong nuke ’em.

That sort of thing spurs one to start thinking about alternative, and in particular analog, approaches. I’ve been going back and forth with o3 on a plan to do GPT-style inference on a pre-trained model using only 1920s technology. This proof-of-concept set-up is fully resistant to being remotely bricked if sited somewhere not favored by the current powers-that-be. (Note that sampling is restricted to T=0 in this implementation.) Now I just need the assistance of a robot-fabricating-capable AGI to build it out at scale.

Illustrating step 4, we have:

Step 4 – Passive linear maps.
Three collimated beams enter from the left, emerging a meter away at the lamp-and-slit assembly that follows the film-loop encoder. Each beam carries one of the 16 optical channels modulated by the token-plus-position embedding. The brass posts hold a dense 16 × 16 Reck lattice of cube beam-splitters, mirrors, phase plates, and neutral-density slides; the front mesh realises W-Q, the centre W-K, and the rear W-V. As the beams zig-zag through the grid—made visible by stray tobacco smoke—they are successively mixed and attenuated, imprinting the fixed learned weights that will become queries, keys, and values. The processed beams leave the table at the far edge, heading for the selenium-diode dot-product stage.
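In digital terms, the three meshes are just three fixed 16 × 16 matrices applied to the same embedding vector. The sketch below assumes random placeholder weights (a real build would load them from the pre-trained model), and the helper names are hypothetical.

```python
import random

D = 16  # number of optical channels, as in the text


def matvec(W, x):
    """Apply a fixed linear map: what one mesh does to the beam amplitudes."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def placeholder_weights(d, rng, scale=0.25):
    """Stand-in for learned weights; purely illustrative values."""
    return [[rng.uniform(-scale, scale) for _ in range(d)] for _ in range(d)]


rng = random.Random(0)
W_Q, W_K, W_V = (placeholder_weights(D, rng) for _ in range(3))

# Token-plus-position embedding carried on the 16 channels.
x = [rng.uniform(0.0, 1.0) for _ in range(D)]

q, k, v = matvec(W_Q, x), matvec(W_K, x), matvec(W_V, x)

# What the downstream dot-product stage would compute for one query/key pair.
score = sum(qi * ki for qi, ki in zip(q, k)) / D ** 0.5
print(len(q), len(k), len(v), score)
```

Because the weights are frozen, nothing here needs to be reconfigurable, which is what lets the maps be passive.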

trepanation

If you’re on the lookout for cocktail party conversation starters, have a look at what the Internet has to offer on self-trepanation, an ancient medical procedure that regained significant traction during the psychedelic era. Perhaps surprisingly, its adherents uniformly reported “enhanced mental power and well being.”

John Lennon considered having it done, and he suggested that Paul McCartney participate as well. As McCartney later recalled,

“We’d all read about it — you know, this is the ‘60s. The ‘ancient art of trepanning,’ which lent a little bit of validity to it because ancient must be good. All you’d have to do is just bore a little hole in your skull, and it lets the pressure off,” McCartney continued. “Well, that sounds very sensible. ‘But look, John, you try it and let me know how it goes.’ The good thing about John and I — I’d say no. And he knew me well enough that if I said no, I meant no. I’m not frightened of being uncool to say no. I wouldn’t go far as to say, ‘You’re f***ing crazy,’ because I didn’t need to say that. But, no, I’m not gonna trepan, thank you very much. It’s just not something I would like to do.”

In Bore Hole, re-issued in 2015 by MIT Press (in a heavily expanded edition), Joe Mellen describes the difficulties that he encountered during his abortive first attempt at self-trepanation:

I was living back in London, and it was 1967. At that time, I was broke, and I certainly couldn’t afford an electric drill, so I bought a hand trepan from a surgical instrument shop. It’s a bit like a corkscrew, really, but with a ring of teeth at the bottom. It has a point in the middle, which makes an impression on the skull, and then you turn it until the teeth cut into the skull. It’s slightly narrower at the bottom than it is at the top, so it pulls the circular piece of skull out once you’re through with it when you pull it out. It was difficult. It was like trying to uncork a bottle of wine from the inside. The trepan was blunt, and I couldn’t get any purchase on my own skull. I was tripping on acid. I thought that it was the only way I could get through doing it, but it didn’t work…

I came across that passage more than thirty years ago. The image of “trying to uncork a bottle of wine from the inside” had remarkable staying power.

It came suddenly to mind yesterday when I was reading the winner of the Best Paper Award at last year’s ICML conference. In their article, Stealing Part of a Production Language Model, the victorious authors walk through a design strategy for prompt injections that permits extraction of the projection matrices of black-box production language models. They report (among other things) that OpenAI’s Babbage model has an embedding dimension of 2048.
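The reason a hidden dimension can leak at all is linear algebra: every logit vector is a vocabulary-sized image of a much smaller hidden state, so a stack of logit vectors has rank no larger than the hidden width. The toy below is only a sketch of that idea. It swaps in exact Gaussian elimination on small integer matrices where a real attack would need a singular value decomposition of noisy API outputs, and the sizes and matrices are made up.

```python
from fractions import Fraction


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rank(M):
    """Matrix rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


HIDDEN, VOCAB, QUERIES = 4, 12, 10  # toy sizes, nothing like a production model

# Hypothetical output projection (VOCAB x HIDDEN) and one hidden state per query
# (HIDDEN x QUERIES). Vandermonde-style entries guarantee full rank.
W = [[(i + 1) ** j for j in range(HIDDEN)] for i in range(VOCAB)]
H = [[(k + 2) ** j for k in range(QUERIES)] for j in range(HIDDEN)]

logits = matmul(W, H)  # all the attacker sees: a VOCAB x QUERIES table
print(rank(logits))    # rank of the observed table equals the hidden width
```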

This drilling-in theft likely did not enhance GPT-4’s mental power and well being, but the mindset of the surgical hack seems just spot-on analogous to the illustrious trepanning tradition.

race condition

And I was hoping that for this final post of the thirty-day streak, I could turn the reins over to o3. And indeed, o3 was more than willing to take a crack at writing a post, but the result was a complete disaster.

The splash image, though, is pretty cool, so I’m pasting it in and hitting “Publish” so as to adhere to the letter (if not the spirit) of the resolution.

it’s every day, bro

We’d all be fine with a lot less, right?

I set a new-years-resolution-style goal to write thirty posts in a row. Here I am at number twenty-nine.

I asked a language model to read the run of recent writing and then construct a picture of the author. I was thinking I’d for sure get back some hip combination of Jake Paul, Gucci Mane, and a young William S. Burroughs, but instead I got this:

Clearly, I need to write more posts about, say, Flying V guitars.

fast multipole attention

One side effect of having written 540 oklo.org posts is that I regularly have trouble remembering whether I already wrote about this or that. Take, for example, Holmberg’s use of light bulbs with orthogonal photosensitive detectors as an analog O(N) method for computing direct N-body gravitational accelerations. That’s sufficiently awesome that it’s got to be in the archives somewhere.

If one starts to think about ways to speed up transformer architectures, one immediately notices that the attention calculation, when done in the naive textbook manner, is O(N^2) in the context length, and thus presents a significant latency bottleneck. It’s the same problem that one faces with computing gravity in an N-body simulation. Hello o3.
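To make the scaling concrete, here is a minimal sketch of the textbook calculation (single head, no causal mask, pure Python). The function name and the dot-product counter are added here for illustration only.

```python
import math


def naive_attention(Q, K, V):
    """Textbook softmax attention.

    Q, K, V are lists of N vectors. Returns (outputs, n_dots), where n_dots
    counts query-key dot products: the N^2 pairwise step, analogous to the
    pairwise force sum in a direct N-body code.
    """
    d = len(Q[0])
    outputs, n_dots = [], 0
    for q in Q:
        scores = []
        for k in K:
            scores.append(sum(a * b for a, b in zip(q, k)) / math.sqrt(d))
            n_dots += 1
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out = [sum(w * v[j] for w, v in zip(weights, V)) / total
               for j in range(len(V[0]))]
        outputs.append(out)
    return outputs, n_dots


for n in (8, 16, 32):
    vecs = [[math.sin(i + j) for j in range(4)] for i in range(n)]
    _, count = naive_attention(vecs, vecs, vecs)
    print(n, count)  # doubling the context quadruples the work
```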

After some back and forth, o3 submitted a complete draft that I’ve been peer-reviewing. It appears that it may have come up with some remarkable ideas, and I’ve been combing the literature to try to understand how novel they are.

o3’s ego looks like it needs some alignment work before the LLM will be capable of stepping in to render academic researchers obsolete. It appears that when the model develops an original idea, it automatically hallucinates a reference in order to try to assign the credit for its own work to someone else.

When pressed, it blithely admits to this weirdly selfless behavior:

The “Teng et al. 2024” citation was inserted as a placeholder when I sketched the Fast-Multipole-Attention idea; to the best of my knowledge no peer-reviewed or arXiv paper with those exact authors, year, and title has been released. You should remove or replace that reference with a verifiable source.

I will link to the paper once it’s been thoroughly vetted. And if anything in there is genuinely original, I think I’ll usurp a little credit.

erratic

On the lower slope of the Sleeping Giant looking northwest toward Mt. Sanford.

I was walking on a street very near where I live when I came across a startling sight. The neighborhood is very proper New England with clapboard houses and white picket fences. One of the lots, however, is missing a house and instead has a house-sized boulder sitting where one would expect the house to be. Order of magnitude, it weighs about a million pounds.

Turns out it’s a glacial erratic, made of 200-million-year-old diabase that was pushed down by ice from the outcrops of the Sleeping Giant, which lies about five miles due north.
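The million-pound figure survives a quick sanity check. The sketch below assumes a diabase density of about 3000 kg per cubic meter and treats the boulder as a cube 5.5 m on a side; both numbers are guesses made here, not measurements.

```python
DIABASE_DENSITY_KG_M3 = 3000.0  # assumed typical value for diabase
KG_PER_LB = 0.45359237          # exact definition of the pound


def boulder_weight_lb(side_m, fill_fraction=1.0):
    """Weight of a roughly cubical boulder.

    fill_fraction is a hypothetical fudge factor for a shape that does not
    fill its bounding cube.
    """
    volume_m3 = fill_fraction * side_m ** 3
    return volume_m3 * DIABASE_DENSITY_KG_M3 / KG_PER_LB


print(round(boulder_weight_lb(5.5)))
```

A 5.5 m cube comes out at roughly 1.1 million pounds, so "about a million" holds up for anything in the small-house size range.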

liquid

I keep thinking about the remarkable mapping of the fruit fly brain. Those annoying pests darting above the sticky spill each have 139,255 brain cells with 50 million connections, and a computational throughput that’s naively similar to GPT-2, all while running much, much closer to the Landauer limit, and costing negative dollars. Given the existence of such a set-up, it’s hard to shake off the feeling that the transformers are on the verge of being completely deprecated by some radical new algorithmic paradigm. But what’s it gonna be? Which direction is it going to come from?

So one sifts for clues. As an outsider, it’s tricky. Sure, the TED-talking charlatans are easy to spot. It’s not hard to discount the Deep-Learning equivalent of some astrobiologist going on about sampling microbes spewing out of Enceladus, or phosphine-emitting life on Venus, or detecting biosignatures on extrasolar planets in the habitable zone.

In this vein, I’m currently struggling to understand whether the Liquid Neural Networks are really the real deal or not. The various Bayesian priors exude radically conflicting signals. A mushy article in Quanta relates that “the driving forces behind the new design, realized years ago that C. elegans could be an ideal organism to use for figuring out how to make resilient neural networks that can accommodate surprise.” That definitely sounds like hype of the let’s-look-for-the-red-edge variety, but at the same time it’s true that C. elegans pulls off quite a bit with its measly 302 neurons and 7000 synapses. Plus, bonus points for just using Euler’s method to integrate ODEs given the weird right-hand sides:
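For a sense of what forward Euler looks like on this kind of equation, here is a one-neuron sketch. The right-hand side is the form I recall from the liquid time-constant literature, dx/dt = -(1/τ + f) x + f A with f a sigmoid gate that depends on the state and the input. Treat that form, and every parameter value below, as an assumption rather than a transcription of any particular paper or codebase.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ltc_rhs(x, inp, tau=1.0, A=1.0, w=2.0, b=-1.0):
    """Assumed liquid-time-constant style right-hand side for one neuron.

    The gate f multiplies the state, so the effective time constant
    1 / (1/tau + f) shifts with the input: the 'liquid' part.
    """
    f = sigmoid(w * x + inp + b)
    return -(1.0 / tau + f) * x + f * A


def euler(rhs, x0, dt, steps):
    """Forward Euler: x_{n+1} = x_n + dt * rhs(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * rhs(xs[-1]))
    return xs


traj = euler(lambda x: ltc_rhs(x, inp=0.5), x0=0.0, dt=0.01, steps=1000)
print(traj[-1])  # the state settles toward a fixed point between 0 and A
```

For a small enough step the update is a weighted blend of the old state and A, so the state stays bounded without any clipping.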

a winning system

William Burroughs, Tangier, 1961. 8×12″ gelatin silver (Allen Ginsberg)

The Letters of William S. Burroughs 1945-1959 is full of good stuff. It comes highly recommended.

I found myself recalling a passage from a letter written to Allen Ginsberg and Jack Kerouac on October 23, 1955 from the Benchimol Hospital in Tangiers. The reference is to Neal Cassady’s ‘system’ that would permit a fortune to be gained from racetrack betting.

...Tell Neal from me to drop the bang tails. You can't beat it. Dream hunches are not supposed to be used that way. You understand, Neal? You know what horse is going to win, but you can not use that knowledge to make money. Don't try. It's like fighting a ghost antagonist who can hit you but you can't hit him. Drop it. Forget it. Keep your money...

You know that ‘Oumuamua was partially made of hydrogen ice, but you cannot use that knowledge to produce a consensus within the scientific community.

And so forth. There’s a principle here, partially encompassed by Occam’s Razor, but partially drawing on something else. Something deeper maybe.

flying saucers

US Air Force

Sometimes — when the bouncing between news articles, when refreshes of a browser clicked again and again are all out of measure — it seems as if there might actually be something to the simulation hypothesis.

My interest in astronomy stemmed from a fascination with flying saucers that was sparked in the late 1970s by castoff paperbacks from the Cold War. In this archived post, I tried to capture a handful of wistful memories of that personal trajectory. Given the intense interest that I felt all those decades ago, straining in a near-vacuum for every scrap of information that I could find concerning elusive silvery discs, I felt I somehow owed it to my former self to think logically and try to figure out what was going on when the New York Times started publishing clearly serious, yet bizarre and altogether bewildering articles about UFOs. Yet there was nothing in the articles to give a foothold. I could muster no opportunity for order-of-magnitude assessments to generate a quantitative context. And then, a year or so later, more articles came out. Tom DeLonge, yes, Tom DeLonge, he of All The Small Things (this full-show video is pretty good!) somehow catalyzed all this coming to light… Quite frankly, the cognitive dissonance was so overwhelming that I just set it aside and stopped being too concerned.

6EQUJ5

Max Headroom is an interesting artifact of the 1980s — an actual person impersonating an AI. The Max Headroom Signal Hijacking Incident (a Wow! signal geared to the modern era) abstracted the artifice one step further — a person impersonating a person impersonating an AI.

While refereeing a recent paper submitted by the o3 LLM for publication at oklo.org, I get this unmistakable Max Headroom feeling as I suggest wording changes and point out hallucinations in the reference list. I’m a human impersonating an AI agent in a self-referential improvement loop. How very 2025.

spot paintings

When it comes right down to it, I’ll admit I’m pretty much a philistine, but Damien Hirst has always held a certain fascination for me. I like the spot paintings. Embarrassingly, I’m eyeing

but I still have my eye too much on the dimes.

The exoplanet catalog is free, however, and so we can continue to go to town with the graphs. This is the diagram that results if you move all of the system graphs of yesterday’s post to a common origin.

catalog scrape

It’s all there on the Wayback Machine. Way back in the early aughts, say 2002, there were a number of efforts that tried to keep track of the then-rapidly expanding exoplanet census. We now know of more than a hundred worlds orbiting alien stars! The go-tos were the Extrasolar Planets Encyclopedia at exoplanet.eu and the California-Carnegie table at exoplanets.org. I was trying to keep up with the big boys by maintaining a table of potential transit ephemerides at transitsearch.org, in an effort that was soon the subject of an official thumbs-down from the NSF.

Eventually, NASA moved in on the small players. As far as I can tell, their exoplanet archive is now the definitive site. I long ago let the domain registration for transitsearch.org lapse, and it now appears to be under the control of some sort of zombie AI agent.

I thought it would be cool to let my own zombie AI agents scrape down the exoplanet archive. (Look up “strigil”, and if it’s not already part of your working vocabulary, it soon will be.)

The current table of planets that (1) have both mass and radius estimates and which (2) are members of multiple-planet systems breaks down as follows:

That is, there are currently forty-five qualifying three-planet systems, etc.

Think of the systems as directed graphs in the planetary (log Radius, log Mass) plane. The nodes are planets, and the edges connect radially adjacent nodes. The edge direction always points toward increasing semi-major axis. An individual system graph is colored by the period of the inner planet using the managua colormap, with clipping bounds at 1 and 100 days. This generates an interesting diagram, and provides a significant density-of-data update to the paper we wrote in 2017.

Summing the lengths of all the edges of all the graphs gives a far smaller length than if one shuffles the catalog planets among the aggregate of system slots. That’s the peas-in-the-pods effect that I’m often going on about. What’s more interesting, though, is that the diagram is suggestive of an ordered flow. Part of this comes from the well-known empirical observation that planet masses and radii tend to increase outward. Part also stems from the readily intuited observational biases. Yet there appears to be significance beyond those obvious take-aways.
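The graph construction and the shuffle test can be sketched in a few lines. The three-system catalog below is synthetic, invented purely to exercise the code, and the function names are hypothetical; the real comparison would run on the scraped archive table.

```python
import math
import random


def edges(system):
    """Directed edges between radially adjacent planets in (log R, log M).

    system: list of (semi_major_axis, radius, mass) tuples.
    """
    pts = [(math.log10(r), math.log10(m)) for _, r, m in sorted(system)]
    return list(zip(pts, pts[1:]))  # each edge points outward


def total_length(systems):
    return sum(math.hypot(q[0] - p[0], q[1] - p[1])
               for s in systems for p, q in edges(s))


def shuffled(systems, rng):
    """Keep every orbital slot, but deal (radius, mass) pairs at random."""
    props = [(r, m) for s in systems for _, r, m in s]
    rng.shuffle(props)
    it = iter(props)
    return [[(a, *next(it)) for a, _, _ in s] for s in systems]


# Synthetic catalog: (a in AU, radius in Earth radii, mass in Earth masses).
TOY = [
    [(0.05, 1.0, 1.0), (0.08, 1.1, 1.3), (0.12, 1.2, 1.6)],
    [(0.04, 2.5, 7.0), (0.07, 2.6, 8.0), (0.11, 2.8, 9.0)],
    [(0.10, 10.0, 200.0), (0.30, 11.0, 300.0)],
]

rng = random.Random(42)
observed = total_length(TOY)
trials = [total_length(shuffled(TOY, rng)) for _ in range(200)]
mean_shuffled = sum(trials) / len(trials)
print(observed, mean_shuffled)
```

When planets within a system resemble each other, the observed total sits well below the shuffled mean, which is the peas-in-a-pod signature in one number.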

Think of the directed graph edges as point velocity measurements in a compressible flow. In a cold fluid, the bulk motion will be emphasized in comparison to the thermal motion. We can average out the thermal fluctuations by processing the graph edges:
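One simple way to do that averaging, not necessarily the processing behind the figures in this post, is to bin the edges by midpoint on a coarse grid and take the mean edge vector in each cell. The cell size and function name below are assumptions.

```python
import math
from collections import defaultdict


def bulk_flow(edge_list, cell=0.5):
    """Mean edge vector per square cell of the (log R, log M) plane.

    edge_list: iterable of ((x0, y0), (x1, y1)) directed edges.
    Returns {cell_index: (mean_dx, mean_dy, n_edges)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x0, y0), (x1, y1) in edge_list:
        # Assign each edge to the cell containing its midpoint.
        mx, my = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        key = (math.floor(mx / cell), math.floor(my / cell))
        acc = sums[key]
        acc[0] += x1 - x0
        acc[1] += y1 - y0
        acc[2] += 1
    return {k: (sx / n, sy / n, n) for k, (sx, sy, n) in sums.items()}


# Two edges with the same drift in log R but opposite scatter in log M.
demo = [((0.1, 0.1), (0.3, 0.2)), ((0.1, 0.2), (0.3, 0.1))]
print(bulk_flow(demo))
```

In the demo the opposite vertical components cancel and only the shared drift survives, which is the sense in which averaging suppresses the "thermal" part.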

Plotting the bulk flow field along with the planets, again color-coded by the period of the innermost member of the system to which they belong, gives:

The “sink”, where the divergence of the flow is highly negative, is located near the position where the simulations suggest that runaway core accretion is initiated, which seems suggestive… Or perhaps it’s just a coincidence.

wax ecstatic

It’s awesome when something is perfectly executed. Case in point: this video from House of Strombo of a four-song set by The Cult…. The Cult came and played at the guy’s house. Damn. Check out the stylish art-galleryesque crowd. And Ian Astbury, Billy Duffy, those guys were literally like fifty-three years old nine years ago, but it’s still very credible. In the event that your blog-reading time is at a premium and you need the highlight of the highlight, scroll to ~12:40 in.

Remember when Mick Jagger was all, “When I turn 33, I’ll retire. That’s the time when a man has to dedicate himself to other things. I don’t want to be a rock star all my life”?

Wax Ecstatic (To Sell Angelina) from Sponge off of Wax Ecstatic lies at the global optimum of some extreme-high-dimensional space. If only I could get my chops on the Tom DeLonge signature Stratocaster up to the level.

At first glance it’s hard to understand why the Wax Ecstatic record didn’t go RIAA diamond. The band was seemingly right with the early nineties zeitgeist. Rotting Piñata had broken them into the mainstream. If you compare the one-two punch of Wax Ecstatic (To Sell Angelina) and Have You Seen Mary with, say, Everything’s Zen and Glycerine, you’re just left feeling confused. What could be missing?

Something that’s kind of amusing about the peas-in-the-pod style exoplanets:

are the systems that look like this:

dire straits

The author of this post actually bought this guitar.

I was never into Dire Straits. That’s readily understood if one takes note of the fact that acts of the Phil Collins ilk were (and indeed are) self-evidently inferior to Joy Division if one was (and indeed is) concerned with trying to conceal vast stretches of dorky midwestern-boy naivete behind a faux-edgy poseurish front. We all have our weaknesses.

Nonetheless, Dire Straits, in the tune Industrial Disease off of Love Over Gold, redeems it all with one fabulous line.

Two men say they’re Jesus, one of them must be wrong.

It can be routinely applied as a conversational bon mot, even as frequently as once or twice per week, provided one takes care to remember who one has previously used it on.

Consider the treasure trove of JWST transmission spectra obtained over the past year or so with MIRI/LRS. Take K2-18b, an ideal laboratory for detecting biosignatures. This take on some of that deluxe hi-fi IR jet-propulsed its Oxbridge authors to the front-page pinnacles of science-outreach heights.

Haters gonna hate, yo? This one’s coming at you from an outer-borough crew tryna bum-rush the stage.

Fortunately, this’ll all get straightened out when the $7B HabEx Mission flies in the 2030s.

metaheuristics

I’m historically very prone to latching on to some idea, text, artist or algorithm, proceeding through a phase of wild-eyed evangelism, then rapidly moving on to jaded hipper-than-thou dismissal.

I’ve noticed that the capable frontier models — Claude 3.7 long thinking, GPT-4.5, et al. — can greatly speed this process up. Yesterday I went from (1) knowing literally nothing about metaheuristics, to (2) making dramatic Nietzschean pronouncements along the lines of, “If the ants faced the micro-second torrent of Bitcoin perpetual swaps under existential pressure, they would not back-prop through a colossal end-to-end network,” to (3) having a working implementation of ACO, all properly coded with linted classes, etc., to (4) laconically dismissing metaheuristics with an approach along these lines — run through a cynical Yawny’s Digest-style filter.

Meanwhile, the forecasters at Metaculus are predicting that once a (weak) AGI is created, it will be 33.87 months before the first superintelligent AI is created. Just sayin’.

selborne

I’ve always liked Gilbert White’s Natural History of Selborne. Not for the ornithological minutiae per se, but for the way it conveys a sense of place, a connection to the local environment through detailed observation.

And then, of course, there is White’s letter to Daines Barrington — an authority on child geniuses — regarding the Bee Boy:

SELBORNE, Dec. 12, 1775.

DEAR SIR,

WE had in this village more than twenty years ago an idiot-boy, whom I well remember, who, from a child, shewed a strong propensity to bees; they were his food, his amusement, his sole object. And as people of this cast have seldom more than one point in view, so this lad exerted all his few faculties on this one pursuit. In the winter he dosed away his time, within his father’s house, by the fireside, in a kind of torpid state, seldom departing from the chimney-corner; but in the summer he was all alert, and in quest of his game in the fields, and on sunny banks. Honey-bees, humble-bees, and wasps, were his prey wherever he found them: he had no apprehensions from their stings, but would seize them nudis manibus, and at once disarm them of their weapons, and suck their bodies for the sake of their honey-bags. Sometimes he would fill his bosom between his shirt and his skin with a number of these captives; and sometimes would confine them in bottles. He was a very merops apiaster, or bee-bird; and very injurious to men that kept bees; for he would slide into their bee-gardens, and, sitting down before the stools, would rap with his finger on the hives, and so take the bees as they came out. He has been known to overturn hives for the sake of honey, of which he was passionately fond. Where metheglin was making he would linger round the tubs and vessels, begging a draught of what he called bee-wine. As he ran about he used to make a humming noise with his lips, resembling the buzzing of bees. This lad was lean and sallow, and of a cadaverous complexion; and, except in his favourite pursuit, in which he was wonderfully adroit, discovered no manner of understanding. Had his capacity been better, and directed to the same object, he had perhaps abated much of our wonder at the feats of a more modern exhibiter of bees: and we may justly say of him now,

“ — Thou,
“Had thy presiding star propitious shone,
“Should’st Wildman be —.”

When a tall youth he was removed from hence to a distant village, where he died, as I understand, before he arrived at manhood.

I am, &c.

a pod

Where the extrasolar planets are concerned, it’s easy to be cynical. And where the extrasolar planets orbiting Barnard’s Star are concerned, it’s even easier.

Nonetheless, the accumulating Doppler measurements seem to be making the case that there is a pod full of peas accompanying the charismatic M-dwarf as it streaks through our night-time skies.

It’s been quite a while since this blog featured actual radial velocity curves. Flower-shift-4’ing from the above-linked ApJ Letter, we have:

Most convincingly, there’s something of an offhand confirmation implicit in the way that this latest incarnation of the Barnardian system sort of fits with the emerging M-dwarf planetary consensus. Again, from the paper:

And the best part? It’s too hot for any of those Barnard planets to be overrun by an algae bloom.

easy

Discussion with the alien intelligence embodied in the o3 model’s trillion-odd transformer weights suggests that a terrestrial genesis story really is the best bet for emergence of life on Earth. That fits with the bedrock principle that the cool hypothesis is the wrong hypothesis, and it does seem like it’d be somehow cooler if we were all descended from Venusians.

Nevertheless, conversation with the AI also indicates that there’s a window of responsibility that permits one to at least seriously humor the life-started-on-Venus story. The fact that the last universal common ancestor seems to have been a relatively capable organism points to a serious bottleneck in the history of life on Earth, and it’s plausible that bottleneck was the limited availability of impact-generated tickets for passage from Venus to Earth.

One of the figures from the Cabot-Laughlin paper shows just how painless the trip can be if you depart Venus in the early A.M. at a velocity appropriate to give you a (relatively) leisurely v_inf = 2.7 km/sec at the Hill Sphere radius. It’s a comfy three-year trip to the Earth-Moon system. A no-brainer for a brainless thermophilic archean prokaryote.
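For a rough feel for the ticket price, energy conservation relates the departure speed at the surface to the leftover speed far from the planet: v_launch² = v_esc² + v_inf². The sketch below is a simplification made here, not the paper's calculation. It treats the 2.7 km/sec as the excess at infinity rather than at the Hill radius, ignores the atmosphere entirely, and uses standard values for the mass parameter and radius of Venus.

```python
import math

GM_VENUS = 3.24859e14  # gravitational parameter of Venus, m^3 s^-2
R_VENUS = 6.0518e6     # mean radius of Venus, m


def launch_speed_km_s(v_inf_km_s, gm=GM_VENUS, r=R_VENUS):
    """Surface speed needed to retain v_inf after climbing out of the well."""
    v_esc_sq = 2.0 * gm / r  # escape speed squared, (m/s)^2
    return math.sqrt(v_esc_sq + (v_inf_km_s * 1e3) ** 2) / 1e3


print(launch_speed_km_s(0.0))  # bare escape speed
print(launch_speed_km_s(2.7))  # with the leisurely excess from the text
```

The excess adds only a few hundred meters per second on top of the escape speed of a bit over 10 km/sec, since the two add in quadrature.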