Index

osmarks' website

All processes that are stable we shall predict. All processes that are unstable we shall control.

Blog

Read my opinions via the internet.

2025-01-09 / 1.35k words
Computer algebra systems leave lots to the user and require task-specific manual design. Can we do better?
2024-11-01 / 2.65k words
Has Minecraft become easier?
2024-10-16 / 665 words
A slightly odd pattern I've observed.
2024-10-06 / 2.99k words
Or: why most AI hardware startups are lying.
2024-10-06 / 1.08k words
As ever, AI safety becomes AI capabilities.
2020-06-11 / 4.75k words
A nonexhaustive list of media which I like and which you may also be interested in.
2024-07-06 / 1.58k words
I got annoyed and rewrote everything.
2023-08-28 / 2.59k words
Powerful search tools as externalized cognition, and how mine work.
2024-05-12 / 1.29k words
What exactly is "magic" anyway?
2024-04-27 / 848 words
Please stop making chatbots.
2024-04-22 / 1.54k words
Absurd technical solutions for problems which did not particularly need solving are one of life's greatest joys.
2024-02-25 / 3.08k words
How to run local AI slightly more cheaply than with a prebuilt system. Somewhat opinionated.
2024-03-27 / 1.87k words
RSAPI and the rest of my infrastructure.
2023-09-24 / 1.64k words
This is, of course, all part of my evil plan to drive site activity through systematically generating (meta)political outrage.
2023-06-06 / 2.50k words
The history of the feared note-taking application.
2023-07-02 / 1.61k words
Why programming education isn't very good, and my thoughts on AI code generation.
2022-02-24 / 949 words
Learn about how osmarks.net works internally! Spoiler warning if you wanted to reverse-engineer it yourself.
2023-01-28 / 407 words
A common criticism of school is that it focuses overmuch on rote memorization. While I don't endorse school, I think this argument is wrong.
2022-05-14 / 463 words
RSS/Atom are protocols for Internet-based newsletter/feed services. They're surprisingly well-supported and you should consider using them.
2021-07-08 / 1.07k words
In which I get annoyed at yet more misguided UK government behaviour.
2020-05-20 / 582 words
Is solving Sudoku and similar puzzles by hand really useful in building computer science ability? We don't think so.
2017-08-16 / 940 words
We are not responsible if these tips cause your ship to implode/explode. Contains spoilers in vast quantities.
2018-08-14 / 688 words
Why I think that government programs telling everyone to "code" are pointless.
2020-01-25 / 145 words
It's slightly different now!
2018-06-01 / 737 words
My (probably unpopular in general but... actually likely fairly popular amongst this site's intended audience) opinions on smartphones today.

Microblog

Short-form observations.

Religion has progressed, historically, from:

  • there is a very large quantity of widely dispersed gods and you don't know about the vast majority of them
  • there are quite a few gods, but a bounded amount
  • there is exactly one god
  • there are exactly zero gods

By extrapolation, we can conclude that the next step is that humanity has negative one god, i.e. is in theological debt and must build a god to continue. This is where the EY-style "aligned singleton" came from. But people are now moving toward "we need everyone to have pocket gods" because they are insane, in line with the pattern. The next step is of course "we need to build gods and put them in everything".

It annoys me that my bank makes it so onerous to send payments ever. Five confirm screens and an 8-character base36 OTP I can't fit in working memory. I get why (they are required to reimburse you if you get defrauded and happen to use the bank's push payments while being defrauded, in some circumstances) but this is a very silly consequence.

I finally got round to watching the political documentary "Yes, Minister". It would be very funny if it were fictional, which I am told it is not.

DeepSeek V3 was unexpectedly released recently. It's a decently big (685 billion parameters) model and apparently outperforms Claude 3.5 Sonnet and GPT-4o on a lot of benchmarks. And they release the base model! Very cool. Some notes:

  • They don't make this comparison, but the GPT-4 technical report has some benchmarks of the original GPT-4-0314 where it seems to significantly outperform DSv3 (notably, WinoGrande, HumanEval and HellaSwag). I can't easily find evaluations of current-generation cost-optimized models like 4o and Sonnet on this. Is this just because GPT-4 benefits lots from posttraining whereas DeepSeek evaluated their base model, or is the model still worse in some hard-to-test way? GPT-4 is reportedly 1.8T parameters, trained on about as much data.
  • It's conceivable that GPT-4 (the original model) is still the largest (by total parameter count) model (trained for a useful amount of time). The big labs seem to have mostly focused on optimizing inference costs, and this shows that their SOTA models can mostly be matched with ~600B. We cannot rule out larger, better models not publicly released or announced, of course.
  • DeepSeek has absurd engineers. They have 2048 H800s (slightly crippled H100s for China). LLaMA 3.1 405B is roughly competitive in benchmarks and apparently used 16384 H100s for a similar amount of time. The difference is due to some standard optimizations like Mixture of Experts (though their implementation is finer-grained than usual) and some newer ones like Multi-Token Prediction - but mostly to them fixing everything that made their runs slow. They avoid tensor parallelism (interconnect-heavy) by carefully compacting everything so it fits on fewer GPUs, design their own optimized pipeline parallelism, write their own PTX (roughly, Nvidia GPU assembly) for low-overhead communication so they can overlap it better, fix some precision issues with FP8 in software, casually implement a new FP12 format to store activations more compactly and have a section suggesting hardware design changes they'd like made.
  • It should in principle be significantly cheaper to host than LLaMA-3.1-405B, which is already $0.8/million tokens.
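
To put the figures quoted above side by side, here is a rough back-of-the-envelope sketch. It only uses the numbers stated in these notes (taken at face value, including the 1.8T figure for GPT-4) and assumes the training runs really were of similar duration, so GPU count stands in for compute.

```python
# Figures as quoted in the notes above; none are independently verified here.
deepseek_gpus = 2048        # H800s
llama_405b_gpus = 16384     # H100s
deepseek_params = 685e9     # DeepSeek V3 total parameters
gpt4_params = 1.8e12        # GPT-4 total parameters, as stated above

# With similar run lengths, the GPU ratio approximates the compute ratio.
gpu_ratio = llama_405b_gpus / deepseek_gpus
param_ratio = gpt4_params / deepseek_params

print(f"LLaMA 3.1 405B used {gpu_ratio:.0f}x as many GPUs")
print(f"GPT-4 has about {param_ratio:.2f}x as many total parameters")
```

This ignores the H800/H100 difference and actual utilization, so it is a ratio of headline numbers rather than a proper compute estimate.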

Mass-market robot dogs now beat biological dogs in TCO.

When analyzing algorithms, O(log n) is actually the same as O(1), because log n ≤ 64. Don't believe me? Try materializing 2^64 things on your computer. I dare you.
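
A minimal sketch of the arithmetic behind that claim. The helper name `halving_steps` is made up for illustration; it counts how many times a collection of n items can be halved before one item is left, which is the usual source of a log n factor.

```python
def halving_steps(n: int) -> int:
    """Return ceil(log2(n)) for n >= 1, using exact integer arithmetic."""
    return (n - 1).bit_length()

for exponent in (10, 20, 32, 64):
    print(f"n = 2^{exponent}: {halving_steps(2 ** exponent)} steps")

# Storing 2^64 items at one byte each would take this many EiB (2^60 bytes).
exbibytes_needed = 2 ** 64 // 2 ** 60
print(f"2^64 one-byte items need {exbibytes_needed} EiB")
```

Even at the largest size a 64-bit address space can index, the log factor is 64, and the storage needed to get there is 16 EiB.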

https://pmc.ncbi.nlm.nih.gov/articles/PMC10827157/

What other things are hiding in underanalyzed sequence data?

This paper is kind of hilarious: https://www.nber.org/papers/w31047

Apparently "hyperbolic discounting" - the phenomenon where humans incorrectly weight future rewards ("incorrectly" in that if you use any curve which isn't exponential you will regret it at some point) - isn't necessarily some kind of issue of "self-control", or due to uncertain future gains. It results from humans being really bad at calculating exponentials.
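
The "you will regret it at some point" part can be illustrated with a small sketch. The rewards (100 now versus 110 a day later), the discount rate 0.9 and the hyperbolic constant k = 1 are arbitrary illustrative choices, not values from the paper. The same choice is evaluated when it is imminent and when it is 30 days away.

```python
def exponential(t: float, delta: float = 0.9) -> float:
    """Exponential discount factor after t days."""
    return delta ** t

def hyperbolic(t: float, k: float = 1.0) -> float:
    """Hyperbolic discount factor after t days."""
    return 1.0 / (1.0 + k * t)

def prefers_sooner(discount, wait: float) -> bool:
    """True if 100 after `wait` days beats 110 after `wait + 1` days."""
    return 100 * discount(wait) > 110 * discount(wait + 1)

# Same pair of rewards, judged right before the first one and 30 days ahead.
exp_choices = [prefers_sooner(exponential, w) for w in (0, 30)]
hyp_choices = [prefers_sooner(hyperbolic, w) for w in (0, 30)]

print("exponential:", exp_choices)
print("hyperbolic: ", hyp_choices)
```

Under the exponential curve the preference is the same at both distances, because the ratio between the two discounted values does not depend on the wait. Under the hyperbolic curve the larger, later reward wins from 30 days out but loses once the smaller one is immediate, so the earlier plan gets abandoned.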

Experiments

Various web projects I have put together over many years. Made with at least four different JS frameworks. Some of them are bad.

A game about... apioforms... by Heavpoot.
Collect Arbitrary Points and achievements by doing things on this website! See how many you have! Do nothing with them because you can't! This is the final form of gamification.
Automatic score keeper, designed for handling Monopoly money.
Colorizes the Alphabet, using highly advanced colorizational algorithms.
The Limitless Grid screensaver (kind of) implemented in a somewhat laggy pixel shader.
An unfinished attempt to replicate an Apple screensaver.
Survive as long as possible against emus and other wildlife. Contributed by Aidan.
Fly an ominous flying square around above some ground! Includes special relativity!
A somewhat unperformant generator for pleasant watercolor-y "fractalart" images. Ported from a Haskell implementation by "TomSmeets".
My fork of GUIHacker. Possibly the only version actually on the web right now since the original website is down.
Obligatory (John Conway's) Game of Life implementation.
It is pitch black (if you ignore all of the lighting). You are likely to be eaten by Heavpoot's terrible writing skills, and/or lacerated/shot/[REDACTED]. Vaguely inspired by the SCP Foundation.
Generates ideas. Terribly. Don't do them. These are not good ideas.
The exciting multiplayer game of incrementing and decrementing! No cheating.
Outdoing all other websites with INFINITE PAGES!
Tells you how late Joe's homework is.
Lorem Ipsum (Latin-like placeholder text), eternally. Somehow people have left comments at the bottom anyway.
Instead of wasting time thinking of the best political opinion to hold, simply pick them pseudorandomly per day with this tool.
A Reverse Polish Notation (check Wikipedia) calculator, version 2. Buggy and kind of unreliable. This updated version implements advanced features such as subtraction.
Reverse Polish Notation calculator, version 3 - with inbuilt docs, arbitrary-size rational numbers, utterly broken float/rational conversion and quite possibly Turing-completeness.
Reverse Polish Notation calculator, version 4 - increasingly esoteric and incomprehensible. Contributed by Aidan.
Apply custom CSS to most pages on here.
Your favourite* tic-tac-toe game in 3 dimensions, transplanted onto the main website via a slightly horrifically manual process! Technically this game is solved and always leads to player 1 winning with optimal play, but the AI is not good enough to do that without more compute!
More dimensions. More confusion. Somewhat worse performance. 4D Tic-Tac-Toe.
A basic implementation of the WFC procedural generation algorithm.
Type websocket URLs in the top bar and hit enter; type messages in the bottom bar, and also hit enter. Probably useful for some weirdly designed websocket services.
Dice-rolling webapp. Not very useful pending me writing a good parser.
Unholy horrors moved from the depths of my projects directory to your browser. Theoretically, this is a calculator. Good luck using it.

Get updates to the blog (not experiments) in your favourite RSS reader using the RSS feed.

View some of my projects at my git hosting.

Other blogs

View list
2025-01-16 / Money Stuff
Also Goldman says private markets are the new public markets and Hindenburg quits.
The fun, as it were, is presumably about to begin.
2025-01-16 / ServeTheHome
This is the AMD SMC for its Instinct 8-GPU UBB found in its AI servers to help integrate the GPU assembly into servers.
2025-01-16 / Drew DeVault
Jack Dorsey, former CEO of Twitter, ousted board member of BlueSky, and grifter extraordinaire to the tune of a $5.6B net worth, is giving a keynote at FOSDEM. The FOSDEM keynote stage is one of the biggest platforms in the free software community....
2025-01-15 / Chips and Cheese
Hello you fine Internet folks,
My new novel, The City That Would Eat the World, is the first book in my new Sword and Sorcery Progression Fantasy trilogy More Gods Than Stars.
Everything put into the building that is unnecessary, every cubic foot that is used for purely ornamental purposes beyond that needed to express its use and to make it harmonize with others of its class, is a waste — is, to put it in plain English,...
