Writings
Extending llm-wiki with event-driven doc management
After trying a set of LLM-based and 'claw' personal doc management approaches, I've settled on a pragmatic and practical setup for ingesting…
Ralph-looping a math paper to (attempt to) win a pub argument
A while back a friend and I played a variant of Texas hold'em. In short: four players, cooperative, no betting. Each street (pre-flop, flop…
If only I had more time, my agent would publish a smaller PR.
"I have only made this letter longer because I have not had the time to make it shorter." Blaise Pascal (background on the quote) One thing…
Single failed payment immediately removed legacy Spotify plan, requests to upgrade
I've had Spotify Premium continuously for 10 years. My earliest playlist is from 2016, my latest from this week. For the past year+ in the…
Granular personal notes access for limiting OpenClaw blast-radius
I want an always-on agent that can read a subset of my notes and manage tasks. I don't want it anywhere near the rest of my vault or…
Pragmatic Notes on Running Dangerous AI Coding Agents in Cloud VMs
Running coding agents with free rein is very powerful for a certain class of tasks, especially ones that require little human supervision…
ChatGPT Atlas doesn't have time for me: fails at well-scoped repetition
In short: Atlas performs individual browsing steps correctly, but it breaks down when asked to repeat the same well-scoped action across…
Telemetry Redaction with Presidio: A Showcase
I've been working on telemetry redaction using Microsoft Presidio, and recently contributed a sample to the Presidio repository. This post…
Masking PII in Logs and Traces: Manual vs Automated
I’ve recently been experimenting with PII masking in observability pipelines using Presidio. When comparing the approaches, three automated…
Building personalized focus apps in minutes.
Coding with AI agents creates this peculiar problem: you get these short 10-30 second waits while the model thinks, and it's just long…
Tracking AI Assistant Contributions Using Git Trailers and Git Hooks
It can be challenging to pinpoint the 'concrete' impact AI assistants have on daily development work. For example, during a workshop I was…
Dynamically Routing Traces to Customer-Specific App Insights
When building a SaaS offering publishing platform that helps customers publish Azure Marketplace offers, we faced an interesting…
Learnings from ingesting millions of technical pages for RAG on Azure.
Context overview: This document outlines insights of an engagement…
Showcase: Azure AI Hybrid Search unexpected results gotcha
This document describes a gotcha in Azure AI Search hybrid queries where…
Encoding hidden prompt in LLMs as potential attack vector.
The recent publication on LLM "sleeper agents" prompted me to re-explore influencing LLMs to pursue alternative objectives. In this case…
What does a statement like “AI will take my job” look like in practice?
On two different occasions and coincidentally, I have been…
GPT-4 CLI with persistence in 10 lines of code.
A short one: I needed a GPT-4 CLI interface (that's a RAS Syndrome), but most options seemed quite cluttered, and I like code-golf. Here an…
Voice record daily thoughts, redact with GPT4, and save to Apple Notes using Shortcuts.
The why: Recording my daily thoughts and notes is a challenge I return to every few months, and so far, I hadn't figured out a system which…
Evaluating RAG/LLMs in highly technical settings using synthetic QA generation
In short: The RAG pattern for LLMs can be evaluated using QA pairs. Creating a "golden" dataset is expensive, but an auto-generated "silver…
On automating unit tests with LLMs.
Dear colleagues, I have a confession to make: I have been delegating some of my unit tests to my Jr. engineer, Gary-Pete Truman. First off…
78% MNIST accuracy using GZIP in under 10 lines of code.
GZIP Addendum after hitting the HN frontpage: MNIST is a straightforward dataset, and higher accuracies are possible with various methods…
From Concept to practice: Learnings from LLMs for Enterprise Production – Part 0
Disclaimer: I am an employee at Microsoft ISE. However, the views and opinions are my own. We have recently engaged in an architecture…
One approach to achieving self-governing AI today
Note: I will not discuss the alignment issue or responsible AI / ethics. A self-governing AI is capable of solving any challenge in the…
What defines great software: solving my problem in under 5 minutes with Tailscale
TL;DR: I had been avoiding properly setting up remote networking to my homeserver. Tailscale solved my problem in under 5 minutes. Today, at…