k-hole is the game development and design blog of kyle kukshtel
subscribe to my newsletter to get posts delivered directly to your inbox

Heisenfunctions, Incremental Determinism, and The Future of Programming


Exploring what it could mean to program with GPT from the very start

January 8, 2023 Programming GPT Architecture


It’s no secret that I’m into the new AI stuff. Not only that, but I think we’re in a real “adapt or die” moment. It’s obviously not that dire (yet), but the advances the field made in 2022 have definitely got me thinking about how to rearticulate my own creative practice moving forward, and doing so in such a way as to incorporate AI collaborators throughout. To quote Pentiment, “you can’t roll back the wheel of time”.

At the same time, my own practice is at a bit of a crossroads. Work is closing out on Cantata as we approach 1.0 and I’m staring down the start of a new project. I’ll be able to pull some work from Cantata into it so as not to start completely from zero, but for all intents and purposes the game will be started from scratch.

A few years ago, the next step would be to just do it. “Program the game”. Set an entry point, start mocking some data, turn it into objects, add some systems, etc. Same way we’ve been making games forever.

But as mentioned, things have changed. Copilot/GPT-3/ChatGPT, etc. have proven to be “ok” enough to be taken seriously as programming companions. This is, as stated everywhere, a big deal. However, a lot of the use cases posited for using such technology often have to do with working in existing codebases, with demos showing refactors, writing one-off functions, optimizations, etc. This is all well and good (and makes sense because there is a lot of code already written that is being worked on), but what if you’re starting from scratch? What use is the ability to write tests if you have no code to actually test? Or said differently, how can we use AI when starting from zero?

I have no answers to this, but I feel like I got a bit of a taste I’d like to tease out more after reading the recent article on creating an “infinite ai array”. The whole piece is well worth reading, but the gist of it is that GPT, similar to the crazy VM sample, will effectively “hallucinate” functions that aren’t actually defined yet. Not only this, but they can return contextually “smart” answers. Here’s the initial code sample from the post just so you know what I’m talking about (but also seriously just read the article):

>>> from iaia import InfiniteAIArray, InfiniteAIDict
>>> primes = InfiniteAIArray()
>>> primes[:10]
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
>>> minnesota_cities = InfiniteAIArray()
>>> city_populations = InfiniteAIDict()
>>> for i, city in enumerate(minnesota_cities[:6]):
...     print(f"{i+1:>2}. {city:<20} {city_populations[city]}")
 1. Minneapolis          422331
 2. St. Paul             302551
 3. Duluth               87622
 4. Rochester            111264
 5. Bloomington          84863
 6. St. Cloud            68320

You’ve maybe got to do a double take to see what’s so insane about this. Calling InfiniteAIArray() returns a sequence of prime numbers in one case, and in another case, with nothing changed but the assigned name of the return value, returns a list of cities in Minnesota. What the fuck?

Long story short is that InfiniteAIArray() wraps calls to the GPT API and is made “more specific” by passing through the variable name the return value is assigned to. Said differently, it’s very likely that a variable called minnesota_cities contains… cities in Minnesota. This information is passed to InfiniteAIArray() and a relevant list is returned.
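To make the name-as-context trick concrete, here is a minimal sketch of the idea. It is not how iaia is actually written: the name is passed in by hand rather than discovered automatically, and NamedAIList, build_prompt, and fake_gpt are all names I made up for illustration. fake_gpt is a canned stand-in so the example runs without any API.

```python
from typing import Callable, List

# Hypothetical stand-in for a real GPT API call: takes a prompt, returns text.
Completer = Callable[[str], str]


def build_prompt(variable_name: str, count: int) -> str:
    """Turn a variable name into a natural-language request."""
    topic = variable_name.replace("_", " ")
    return f"List the first {count} items of '{topic}', one per line."


class NamedAIList:
    """Toy list whose contents are 'filled in' from its name alone."""

    def __init__(self, name: str, complete: Completer):
        self.name = name
        self.complete = complete

    def __getitem__(self, key: slice) -> List[str]:
        if not isinstance(key, slice) or key.stop is None:
            raise TypeError("this sketch only supports bounded slices like [:6]")
        start = key.start or 0
        # The only context the model gets is the (humanized) variable name.
        prompt = build_prompt(self.name, key.stop)
        reply = self.complete(prompt)
        lines = [ln.strip() for ln in reply.splitlines() if ln.strip()]
        return lines[start:key.stop:key.step]


def fake_gpt(prompt: str) -> str:
    """Canned response so the example runs offline; a real model may differ."""
    if "minnesota cities" in prompt:
        return "Minneapolis\nSt. Paul\nDuluth"
    return ""


minnesota_cities = NamedAIList("minnesota_cities", fake_gpt)
print(minnesota_cities[:3])
```

Swap fake_gpt for a function that actually calls a model and the rest stays the same, which is sort of the whole point: the variable name is doing the work a query would normally do.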

This is, needless to say, insane. It goes against basically every tenet of programming. Not only that, but… it works? Mostly? For the scope in which the functions are used, based on the example above, the functions and their outputs are totally passable. If your goal is to enumerate the populations of cities in Minnesota roughly (there’s no indication of when in the returned data), this code seems to work.

But what happens if you call this code again? Well that’s where things start to fall apart, as there is no guaranteed determinism to GPT responses yet, meaning you may get a different result. It’s also possible that your calls for seeking city populations miss some Minnesota cities you care about. Or it samples a different time period. Etc.

It’s this specific fact, the lack of determinism, that makes people tend to brush off the implications of what could be accomplished with such a model. The HN thread is littered with talks about usefulness being effectively zero due to lack of deterministic output/stability.

Despite a lack of imagination, this mostly makes sense. For so long, programming has had to be deterministic by necessity, in part to develop stable software, but also software that acts consistently and is able to actually be compiled.

The concept of InfiniteAIArray() goes against that, instead suggesting that you can sort-of write code to get sort-of a result, which is good some of the time. In a domain almost entirely built around exactness, this is heresy. On HN someone even goes as far as to remark that using this for anything real would be a “criminal offence”.

This comment’s notion of “real” is the exact point: software, when being built, needs to target the end goal 100% of the time. It needs to be “real”, and as such needs to be complete, deterministic, exact, etc. You can still write bad code, but bad code can still compile. You are scaffolding the end result from day one.

What InfiniteAIArray() suggests, however, is that maybe that doesn’t have to be the case? Maybe it’s okay if your program misses a few cities in Minnesota. Maybe it’s okay if the populations are a bit off. The code still works and we get a result. If we REALLY need the right data, we can pull that in the old way.

The mostly-works-ish-and-needs-basically-no-code of this is… intoxicating. After reading the article I started to think, obviously, about games. Could you extend something like this to be more general purpose?

This isn’t some lead in to me doing exactly that, but this article has started to get me wondering about how, now that there are other options, we may not need 100% determinism from the start when making something. For example, let’s imagine I want to make a Chess game using the same idea:

var inital_chess_board = HF("SetupChessBoard");
var updated_chess_board_state = HF("Move(inital_chess_board,A2,A3)");

Would this work?

[Screenshot: GPT response to the SetupChessBoard call]

Seems like… yeah! Let’s talk about this line though:

var inital_chess_board = HF("SetupChessBoard");

Two things. First, what is HF()? Well, on Twitter recently I said this:

heisenfunction - a programming function hallucinated by an ai at call time and returns a real result then disappears after it’s invoked

I’m calling this invocation a “heisenfunction” (or even HallucinatedFunction), obviously channeling the uncertainty principle. I think it aptly captures what is happening by passing a “fake method” to a function. What HF() would do internally would be very similar to the InfiniteAIArray(), with the notable difference that (secondly), unless you provide new implementations of HF(), it would require dynamic on the return type. This is bad, but you can easily imagine different, domain-specific implementations of HF() that, in addition to added context for GPT through custom struct definitions, would allow you to capture typed responses.
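As a rough illustration of the untyped-versus-typed split, here is a small Python sketch. Everything in it is an assumption on my part: hf, hf_chess_board, ChessBoard, the prompt wording, and the idea of asking the model for JSON are all invented for the example, and fake_gpt is a canned stand-in for a real model call. Python’s Any plays the role that dynamic would play in C#.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, List

Completer = Callable[[str], str]  # hypothetical wrapper around a GPT call


def hf(call: str, complete: Completer) -> Any:
    """Untyped heisenfunction: ask the model to 'run' a call, return parsed JSON."""
    prompt = f"Pretend to execute `{call}` and reply with only the JSON result."
    return json.loads(complete(prompt))


@dataclass
class ChessBoard:
    rows: List[List[str]]  # 8 rows of 8 single-character squares


def hf_chess_board(call: str, complete: Completer) -> ChessBoard:
    """Domain-specific variant: same call, but the reply must look like a board."""
    raw = hf(call, complete)
    is_board = (
        isinstance(raw, list)
        and len(raw) == 8
        and all(isinstance(row, list) and len(row) == 8 for row in raw)
    )
    if not is_board:
        raise ValueError("model reply is not an 8x8 board")
    return ChessBoard(rows=[[str(sq) for sq in row] for row in raw])


def fake_gpt(prompt: str) -> str:
    """Canned reply so the sketch runs offline."""
    back = list("rnbqkbnr")
    empty = [["."] * 8 for _ in range(4)]
    board = [back, ["p"] * 8, *empty, ["P"] * 8, [s.upper() for s in back]]
    return json.dumps(board)


board = hf_chess_board("SetupChessBoard()", fake_gpt)
print(board.rows[0])
```

The typed wrapper doesn’t make the model any more reliable, it just moves the failure to the call site: a reply that isn’t board-shaped raises immediately instead of leaking into the rest of the program.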

On the backend, you could simulate a C# REPL or Eval the function to provide a result (again channeling the VM idea).

What about the Move? It also works:

[Screenshot: GPT response to the Move call]

You can already see some of the pitfalls here though. Our initial SetupChessBoard() function returns a string[,] and Move returns a char[,]. However, this is trivially easy to fix with GPT queries if you add a real signature to the function in HF(), requesting HF("string[,] Move(inital_chess_board,A2,A3)");. Actually even better is using a tuple, which could allow you to better contextualize the return values (as we are basically trying to data-pack our query to GPT):

HF("(bool chessMoveIsValid, string[,] outputChessBoardState) Move(string[,] inputChessBoardState, string initialChessPiecePosition, string destinationChessPiecePosition)");

How would you invoke the same function continually? You could potentially keep using the “thread” of the GPT conversation if they maintain some consistency between API calls. You could instead capture the string output of the GPT call and Invoke the C# code yourself (instead of spoofing a C# REPL). I’m not sure! It’s all new.
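One way to picture the “thread” option is a session object that replays every earlier call and reply in each new prompt, so the model always sees what it said before. This is only a sketch under my own assumptions: HFSession and fake_gpt are hypothetical names, the REPL-style ">>>" transcript is just one possible format, and the canned model here only counts calls to show that history is actually being carried along.

```python
from typing import Callable, List

Completer = Callable[[str], str]  # hypothetical wrapper around a GPT call


class HFSession:
    """Replays earlier calls and replies in every prompt so the model sees them."""

    def __init__(self, complete: Completer):
        self.complete = complete
        self.history: List[str] = []

    def call(self, expression: str) -> str:
        # Append the new call, then send the whole transcript as the prompt.
        self.history.append(f">>> {expression}")
        reply = self.complete("\n".join(self.history))
        self.history.append(reply)
        return reply


def fake_gpt(prompt: str) -> str:
    """Canned model: just reports how many calls it has seen so far."""
    return f"calls seen: {prompt.count('>>>')}"


session = HFSession(fake_gpt)
print(session.call("board = SetupChessBoard()"))
print(session.call("Move(board, 'A2', 'A3')"))
```

The obvious cost is that the prompt grows with every call, so a longer game would need some way to trim or summarize the transcript. I haven’t tried to solve that here.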

If it works? Well you’ve basically programmed a simple chess game with two lines of code. That’s pretty incredible. Granted, you could probably also just ask GPT to program you a chess game and the output would be more consistent, but what I’m really trying to sketch out here is a new model for programming that allows for pretty-good output that doesn’t necessarily rely on determinism from the outset. You can imagine for more non-trivial games you’d spike out maybe 10-50 of these functions (honestly idk!) to start stitching together… something. You enumerate the possible bounds of the game through these HFs, and can (hopefully) get a simple demo of the thing you’re trying to do up and running quickly.

Once you’ve got your game mocked in HFs and ideally working, you can start to better reason about next steps. Where does it make sense to start actually adding in determinism? In our chess example, maybe we want to set up the board in a non-traditional arrangement of pieces. We could substitute our SetupChessBoard heisenfunction with a “real” implementation of SetupChessBoard() that would give us the traditionally-correct, program-y “correct” answer. Notably, our program probably “mostly” worked before this, but some domain requirement meant that, actually, we need more determinism at this step.

I like to think of this concept as “incremental determinism”. In lieu of needing a “perfect” program from the first few keystrokes of writing a new program, you can better sketch the bounds of a program to just test an idea. If something proves actually useful, you swap out the HF version for a real version, as if you are focusing in on the actual program.
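Here is a minimal sketch of what that swap could look like in practice, assuming a tiny dispatcher that prefers real code when it exists and falls back to the model when it doesn’t. Program, implement, and fake_gpt are names I made up for the example, and fake_gpt just echoes the prompt so you can see which path was taken.

```python
from typing import Any, Callable, Dict

Completer = Callable[[str], str]  # hypothetical wrapper around a GPT call


class Program:
    """Dispatches to a real implementation if one exists, else to the model."""

    def __init__(self, complete: Completer):
        self.complete = complete
        self.real: Dict[str, Callable[..., Any]] = {}

    def implement(self, name: str, fn: Callable[..., Any]) -> None:
        """Replace the hallucinated version of `name` with deterministic code."""
        self.real[name] = fn

    def call(self, name: str, *args: Any) -> Any:
        if name in self.real:
            return self.real[name](*args)
        # No real implementation yet: describe the call and let the model answer.
        arg_list = ", ".join(repr(a) for a in args)
        return self.complete(
            f"Pretend to execute {name}({arg_list}) and reply with the result."
        )


def fake_gpt(prompt: str) -> str:
    """Canned model so the sketch runs offline."""
    return "hallucinated: " + prompt


def setup_chess_board() -> list:
    """The 'real' version, written once the hallucinated one stops being enough."""
    return ["rnbqkbnr", "pppppppp"] + ["........"] * 4 + ["PPPPPPPP", "RNBQKBNR"]


game = Program(fake_gpt)
first = game.call("SetupChessBoard")          # answered by the model
game.implement("SetupChessBoard", setup_chess_board)
second = game.call("SetupChessBoard")         # answered by real code
print(first)
print(second[0])
```

The calling code never changes. Only the registry does, which is what lets determinism arrive one function at a time.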

I keep saying this, but this is so incredibly counter to the current model of programming. The current model acts more like chiseling: the “body” of what you are looking for must be mined out of the possibility space of all programs. Writing code provides structure and bounds that give all that possibility some definable shape.

What this new mode posits instead is that you can treat an entire program as if it already exists. You aren’t so much chiseling as you are focusing in on the parts of the program that matter enough to be enumerated. Incidentally, it feels similar to watching something like DALL-E or Midjourney work. The final elements are present in the lowest resolution output, but get incrementally more clear with further revisions.


This whole process reminded me a bit of a post I read a while ago called Minimum Viable Airtable. The premise of the article is simple: instead of wasting time building a real backend for your app or project, just use Airtable to allow you to quickly get intimate with the data you’ll be processing and working with. But it’s this specific nugget that really stood out:

Some systems just don’t deserve the time investment of a full-blown app up front. Worse, good ideas never get started because the upfront cost is high.

Using AIs like this could allow you to more quickly just get “to the point” of what you’re looking for. Similarly, Nick Popovich recently tweeted:

Here’s some gamedev advice for prototyping or the early stages of dev. I think this is especially useful for newer devs bc in those days it’s really easy to get a kind of tunnel vision on anything. In short: skip as much default parts of your game as possible.

The whole thread is great, but it belies the point that the scaffolding still needs to exist to test the differentiating factor. If you’re building a chess game with A Cool Twist, you still probably need to build a whole chess game first. Applying the “Minimum Viable Airtable” theory, you can effectively “build” the “boring” parts of your game that are mostly rote with little upfront effort, and then target in on what you are actually trying to do.

The inability to do this has basically just been a pill all programmers have had to swallow. This obviously makes a lot of us get lost in the proverbial sauce, so worn down by the bounds of scaffolding that by the time we get to our idea we are too exhausted to try to do the thing we wanted to do in the first place.

As I stare down the long road of my next project, I’m obviously thinking about this. I know all the scaffolding I’ll need to do. All that work of meta-building that won’t be reflected much in the end result, but is necessary to get to the good stuff. I don’t think I’ll adopt what I suggest here for the new game, to be clear. It’s not really ready yet, but also I lack the experience to program this way, as does likely everyone else reading this.

Instead, I suspect this is the sort of thing that some kids are sketching out in their bedroom, or college students are tinkering with in computer labs. I’ll continue to experiment as well, and will definitely report back here as the general tech continues to improve, along with our ability to wield it.

Thanks for reading!






