Index

osmarks' website

Humans aren't even AGI.

Blog

Read my opinions via the internet.

2025-03-02 / 4.00k words
The TAM for God is very large.
2024-07-06 / 1.62k words
I got annoyed and rewrote everything.
2025-02-10 / 1.55k words
My new main router.
2024-02-25 / 3.44k words
How to run local AI slightly more cheaply than with a prebuilt system. Somewhat opinionated.
2025-01-26 / 1.84k words
Predicting the post-social world.
2025-01-24 / 4.17k words
Downloading and indexing everything* on Reddit on one computer.
2025-01-09 / 1.35k words
Computer algebra systems leave lots to the user and require task-specific manual design. Can we do better?
2024-11-01 / 2.65k words
Has Minecraft become easier?
2024-10-16 / 665 words
A slightly odd pattern I've observed.
2024-10-06 / 2.99k words
Or: why most AI hardware startups are lying.
2024-10-06 / 1.08k words
As ever, AI safety becomes AI capabilities.
2020-06-11 / 4.82k words
A nonexhaustive list of media which I like and which you may also be interested in.
2023-08-28 / 2.59k words
Powerful search tools as externalized cognition, and how mine work.
2024-05-12 / 1.29k words
What exactly is "magic" anyway?
2024-04-27 / 848 words
Please stop making chatbots.
2024-04-22 / 1.54k words
Absurd technical solutions for problems which did not particularly need solving are one of life's greatest joys.
2024-03-27 / 1.87k words
RSAPI and the rest of my infrastructure.
2023-09-24 / 1.64k words
This is, of course, all part of my evil plan to drive site activity through systematically generating (meta)political outrage.
2023-06-06 / 2.50k words
The history of the feared note-taking application.
2023-07-02 / 1.61k words
Why programming education isn't very good, and my thoughts on AI code generation.
2022-02-24 / 949 words
Learn about how osmarks.net works internally! Spoiler warning if you wanted to reverse-engineer it yourself.
2023-01-28 / 407 words
A common criticism of school is that it focuses overmuch on rote memorization. While I don't endorse school, I think this argument is wrong.
2022-05-14 / 463 words
RSS/Atom are protocols for Internet-based newsletter/feed services. They're surprisingly well-supported and you should consider using them.
2021-07-08 / 1.07k words
In which I get annoyed at yet more misguided UK government behaviour.
2020-05-20 / 582 words
Is solving Sudoku and similar puzzles by hand really useful in building computer science ability? We don't think so.
2017-08-16 / 940 words
We are not responsible if these tips cause your ship to implode/explode. Contains spoilers in vast quantities.
2018-08-14 / 688 words
Why I think that government programs telling everyone to "code" are pointless.
2020-01-25 / 145 words
It's slightly different now!
2018-06-01 / 737 words
My (probably unpopular in general but... actually likely fairly popular amongst this site's intended audience) opinions on smartphones today.

Microblog

Short-form observations.

This is ridiculous. Font descriptions mean nothing. We need bitter-lesson font classification.

Theory: people (partly) dislike deep learning because it feels like cheating, like Ozempic - it is "too easy" for what it gets you.

As Robin Hanson says, building the sheer variety of products we have is actually bad, because it increases unit costs. This is especially clear in laptops - there are far too many laptops with too little to distinguish them and too many nonsense minor issues. As such, I think we need a new streamlined and harmonized lineup of all laptops:

  • Cheapest Possible Technically Functional Laptop
  • Mediocre Office and Home Laptop (to be issued to most office workers and people who want to edit spreadsheets or emails and such)
  • CEO Laptop (reasonably fast, expensive, big battery for CEO activities)
  • Programmer Laptop (ThinkPad-like focused on CPU performance and reasonable portability)
  • Gamer Laptop (16" Legion-like with middling battery life and decently high-powered CPU/GPU)
  • Gamer Laptop (Big) (17"-18" desktop replacement)
  • Technician Laptop (smallish thick and rugged laptop with many ports)
  • Multimedia Laptop (Mediocre Office and Home Laptop with a nicer display and better graphics)

There would also be a version number updated whenever new components are available, of course. There can perhaps be two or three variants of each (with the same chassis, board, etc but different components) with different pricing, but no more.

Anyone optimistic about society adapting sanely to AGI should look at the uptake of IPv6.

Why do all three of the reasonably okay AI music tools (Udio, Suno, Riffusion) have fairly similar artifacts? Except for, I think, older versions of Udio, they all sound consistently off in some way I don't know enough music theory to explain, particularly in metal vocals and/or complex instrumentals. Do they all use the same autoencoders or something?

Street-Fighting Mathematics is not actually related to street fighting, but you should read it if you like estimating things. There is much power in being approximately right very fast, and it contains many clever tricks which are not immediately obvious but are very powerful. My favourite part so far is this exercise - you can uniquely (up to a dimensionless constant) identify this formula just from some ideas about what it should contain and a small linear algebra problem!
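
A minimal sketch of that kind of calculation, using the textbook pendulum period as a stand-in (an assumption; it is not necessarily the exercise referred to above). Guess that the period depends only on length, mass and gravitational acceleration, write it as a product of powers of those, and matching dimensions turns into a small linear system for the exponents. The function name `solve_exponents` and the choice of base dimensions (mass, length, time) are purely illustrative.

```python
from fractions import Fraction

def solve_exponents(columns, target):
    """Solve sum_j x_j * columns[j] == target exactly by Gauss-Jordan elimination.

    columns[j] is the dimension vector of quantity j, target is the dimension
    vector of the result. Assumes a square system with a unique solution.
    """
    n = len(target)
    # Augmented matrix: one row per base dimension, one column per quantity.
    rows = [[Fraction(col[i]) for col in columns] + [Fraction(target[i])]
            for i in range(n)]
    for c in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(c, n) if rows[r][c] != 0)
        rows[c], rows[pivot] = rows[pivot], rows[c]
        rows[c] = [v / rows[c][c] for v in rows[c]]
        # Clear this column from every other row.
        for r in range(n):
            if r != c:
                factor = rows[r][c]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[c])]
    return [row[-1] for row in rows]

# Dimension vectors are (mass, length, time) exponents.
length = (0, 1, 0)
mass = (1, 0, 0)
gravity = (0, 1, -2)   # acceleration: length / time^2
period = (0, 0, 1)

# period = C * length^a * mass^b * gravity^c for some dimensionless C.
exponents = solve_exponents([length, mass, gravity], period)
print(exponents)
```

The solution is a = 1/2, b = 0, c = -1/2, i.e. the period scales as sqrt(length / gravity) and does not depend on mass. Dimensional analysis cannot supply the constant C (2π for small oscillations).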

People are claiming (I don't know much RL) that DeepSeek-R1's training process is very simple (based on the paper: https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf) - a boring standardish (for LLMs) RL algorithm optimizing for reward on some ground-truth-verifiable tasks (they don't say which). So why did o1 not happen until late 2024 (public release) or late 2023 (rumours of Q*)? "Do RL on useful tasks" is a very obvious idea. I think the relevant algorithms are older than that.

The paper says that they tried applying it to smaller models and it didn't work nearly as well, so "base models were bad then" is a plausible explanation, but it's clearly not true - GPT-4-base is probably a generally better (if costlier) model than 4o, which o1 is based on (could be distillation from a secret bigger one though); and LLaMA-3.1-405B used a somewhat similar post-training process and is about as good a base model, but is not competitive with o1 or R1. So I don't think it's that.

What's going on here? The process is simple-sounding but filled with pitfalls DeepSeek don't mention? What has changed between 2022/23 and now which means we have at least three decent long-CoT reasoning models around?

Religion has progressed, historically, from:

  • there is a very large quantity of widely dispersed gods and you don't know about the vast majority of them
  • there are quite a few gods, but a bounded number
  • there is exactly one god
  • there are exactly zero gods

By extrapolation, we can conclude that the next step is that humanity has negative one god, i.e. is in theological debt and must build a god to continue. This is where the EY-style "aligned singleton" came from. But people are now moving toward "we need everyone to have pocket gods" because they are insane, in line with the pattern. The next step is of course "we need to build gods and put them in everything".

Experiments

Various web projects I have put together over many years. Made with at least four different JS frameworks. Some of them are bad.

A game about... apioforms... by Heavpoot.
Collect Arbitrary Points and achievements by doing things on this website! See how many you have! Do nothing with them because you can't! This is the final form of gamification.
Automatic score keeper, designed for handling Monopoly money.
Colorizes the Alphabet, using highly advanced colorizational algorithms.
The Limitless Grid screensaver (kind of) implemented in a somewhat laggy pixel shader.
An unfinished attempt to replicate an Apple screensaver.
Survive as long as possible against emus and other wildlife. Contributed by Aidan.
Fly an ominous flying square around above some ground! Includes special relativity!
A somewhat unperformant generator for pleasant watercolor-y "fractalart" images. Ported from a Haskell implementation by "TomSmeets".
My fork of GUIHacker. Possibly the only version actually on the web right now since the original website is down.
Obligatory (John Conway's) Game of Life implementation.
It is pitch black (if you ignore all of the lighting). You are likely to be eaten by Heavpoot's terrible writing skills, and/or lacerated/shot/[REDACTED]. Vaguely inspired by the SCP Foundation.
Generates ideas. Terribly. Don't do them. These are not good ideas.
The exciting multiplayer game of incrementing and decrementing! No cheating.
Outdoing all other websites with INFINITE PAGES!
Tells you how late Joe's homework is.
Lorem Ipsum (Latin-like placeholder text), eternally. Somehow people have left comments at the bottom anyway.
Instead of wasting time thinking of the best political opinion to hold, simply pick them pseudorandomly per day with this tool.
A Reverse Polish Notation (check Wikipedia) calculator, version 2. Buggy and kind of unreliable. This updated version implements advanced features such as subtraction.
Reverse Polish Notation calculator, version 3 - with inbuilt docs, arbitrary-size rational numbers, utterly broken float/rational conversion and quite possibly Turing-completeness.
Reverse Polish Notation calculator, version 4 - increasingly esoteric and incomprehensible. Contributed by Aidan.
Apply custom CSS to most pages on here.
Your favourite* tic-tac-toe game in 3 dimensions, transplanted onto the main website via a slightly horrifically manual process! Technically this game is solved and always leads to player 1 winning with optimal play, but the AI is not good enough to do that without more compute!
More dimensions. More confusion. Somewhat worse performance. 4D Tic-Tac-Toe.
A basic implementation of the WFC procedural generation algorithm.
Type websocket URLs in the top bar and hit enter; type messages in the bottom bar, and also hit enter. Probably useful for some weirdly designed websocket services.
Dice-rolling webapp. Not very useful pending me writing a good parser.
Unholy horrors moved from the depths of my projects directory to your browser. Theoretically, this is a calculator. Good luck using it.

Get updates to the blog (not experiments) in your favourite RSS reader using the RSS feed.

View some of my projects at my git hosting.

Other blogs

View list
2025-03-03 / ServeTheHome
In our Lenovo ThinkCentre M75q Tiny Gen5 review, we see how this 1L-class PC offers an improved AMD Ryzen SoC, and we got 128GB of RAM working.
To celebrate the launch of the upcoming Mage Errant Illustrated Omnibus Edition Kickstarter, the first three books in the series are free until March 5th! Source
It’s happening.
2025-03-03 / The Eldraeverse
Just been making some adjustments to the Imperial Military Service Table of Ranks today (specifically, adjusting the enlisted ranks to correct some seniorities and fix the missing E-9 grade), so here's the current version, screenshotted from my...
Nissan courting Tesla, Europe’s Starlink competitor, Chinese semiconductor progress, ways to use existing interconnection capacity, and more.
2025-03-01 / Chips and Cheese
Zen 5 is AMD's first core to use full-width AVX-512 datapaths.
2025-02-28 / rtl-sdr.com
Over on GitHub, Alejandro Martín has recently released his open-source 'rtl-sdr-analyzer' software, which is an RTL-SDR-based signal analyzer and automatic jamming detector. The software is based on Python and connects to the RTL-SDR via an rtl_tcp...
