Pragmatic Notes on Running Dangerous AI Coding Agents in Cloud VMs

Running coding agents with free rein is very powerful for a certain class of tasks, especially ones that require little human supervision, or where you want to close (or disconnect) your laptop, walk away, and come back to results.

Recently there have been several HN discussions about safely running Claude Code or Copilot CLI agents, such as "Yolobox – Run AI coding agents with full sudo without nuking home dir" and "Running Claude Code dangerously". These posts detail the potential dangers and show how to run these agents more safely. The approaches are reasonable, but I find them lacking in a few respects.

In particular, I want strong isolation, long running agent tasks, and minimal cognitive overhead. I also really value being able to close my laptop, walk away, and get notified on my phone when things are done. I do not mind paying for a cloud VM.

There are many valid ways to solve this problem. This post describes mine. It covers running multiple coding agents concurrently in a cloud VM, how I handle access and repos, and how I keep notifications simple.

Setup

I generated some Terraform to spin up an Azure VM with a cloud-init.yml for setting up common tools/environments I use. Claude can generate a decent starting point for this quite easily, given your particular environment.

Managing access

For secure access, I use Tailscale. Note: I'm not paid by them, but it is easily my favorite piece of infrastructure software!

A cloud-init script installs Tailscale on first boot and automatically joins the VM to my tailnet. SSH access is enabled using Tailscale SSH. Once the VM is up, it appears on my private network with a stable hostname via MagicDNS. No SSH key management, no exposed ports.

Excerpt from cloud-init.yml:

runcmd:
  - apt clean
  - apt update
  - curl -fsSL https://tailscale.com/install.sh | sh
  - sleep 10
  - tailscale up --authkey=${tailscale_auth_key} --ssh --hostname=devbox

I can now run:

ssh devuser@devbox

or connect using VS Code Remote SSH:

https://code.visualstudio.com/docs/remote/ssh
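
To make the VM show up by name in the VS Code Remote SSH host list, a host entry in the local SSH config can help. The following is a minimal sketch, not part of my actual setup: the function name add_devbox_host is hypothetical, and it assumes the devbox hostname and devuser account used above.

```shell
# Append a "devbox" host entry to an SSH config file (default: ~/.ssh/config).
# add_devbox_host is a hypothetical helper name used for illustration only.
add_devbox_host() {
  local config="${1:-$HOME/.ssh/config}"
  mkdir -p "$(dirname "$config")"
  touch "$config"

  # Skip if an entry already exists, so re-running is harmless.
  if grep -qE '^Host[[:space:]]+devbox$' "$config"; then
    return 0
  fi

  # HostName relies on MagicDNS resolving "devbox" inside the tailnet.
  cat >> "$config" <<'EOF'

Host devbox
    HostName devbox
    User devuser
EOF
}
```

After running add_devbox_host once, a plain "ssh devbox" should behave like "ssh devuser@devbox".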

Managing repos

Most of the time I prefer tight, step by step control over code generation, working locally in VS Code with Copilot. For longer running or experimental tasks, I instead let an agent work remotely on a branch inside the VM, and pull the results once I am satisfied.

While this is arguably git basics, it works well for me, and I think it is worth sharing how to set up a VM as a remote:

On the cloud VM:

mkdir ~/myrepo.git
cd ~/myrepo.git
git init --bare

On the local machine, from the repo directory:

git remote add devbox ssh://devuser@devbox/~/myrepo.git
git push devbox mybranch

Then, on the cloud VM, clone the bare repo, check out the branch, do the work, commit, and push back to the bare repo:

cd ~
git clone ./myrepo.git

# still on the cloud VM: do the work on mybranch
cd ~/myrepo
git checkout mybranch

# agent edits files, runs tools, commits changes
git status
git add -A   # also stage new files; "commit -a" alone skips untracked ones
git commit -m "agent: complete task"

# Push the updated branch back to the bare repo
git push origin mybranch

Finally, locally, you can get the changes:

# On your local machine
git fetch devbox
git checkout mybranch
git pull devbox mybranch
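
If you want to try the round trip before involving a VM, the whole flow can be rehearsed on one machine. This is a rough sketch under a few assumptions: local directories stand in for the ssh://devuser@devbox remote, the function name demo_round_trip and the file names are made up for the demo, and a throwaway git identity is configured so the commits succeed.

```shell
# Rehearse the bare-repo round trip in a temp directory.
# Local paths stand in for ssh://devuser@devbox/~/myrepo.git.
demo_round_trip() (
  set -euo pipefail
  root="$(mktemp -d)"

  # "Cloud VM": the bare repo that acts as the remote.
  git init --quiet --bare "$root/myrepo.git"

  # "Local machine": a repo with one commit on mybranch, pushed to the remote.
  git init --quiet "$root/local"
  git -C "$root/local" config user.name "Demo"
  git -C "$root/local" config user.email "demo@example.com"
  git -C "$root/local" checkout --quiet -b mybranch
  echo "hello" > "$root/local/README"
  git -C "$root/local" add README
  git -C "$root/local" commit --quiet -m "initial commit"
  git -C "$root/local" remote add devbox "$root/myrepo.git"
  git -C "$root/local" push --quiet devbox mybranch

  # "Cloud VM" again: working clone where the agent does its work.
  # stderr is silenced because git warns that the bare repo's HEAD
  # points at a branch that does not exist yet.
  git clone --quiet "$root/myrepo.git" "$root/work" 2>/dev/null
  git -C "$root/work" config user.name "Agent"
  git -C "$root/work" config user.email "agent@example.com"
  git -C "$root/work" checkout --quiet mybranch
  echo "agent change" > "$root/work/agent.txt"
  git -C "$root/work" add -A
  git -C "$root/work" commit --quiet -m "agent: complete task"
  git -C "$root/work" push --quiet origin mybranch

  # "Local machine": pull the agent's commit and print its subject line.
  git -C "$root/local" pull --quiet devbox mybranch
  git -C "$root/local" log -1 --format=%s

  rm -rf "$root"
)
```

Calling demo_round_trip should print the subject of the agent's commit, which confirms the change travelled from the working clone through the bare repo and back to the local checkout.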

Managing persistent sessions

I use tmux to manage long running sessions. This lets agents keep running after I disconnect, and makes it easy to juggle multiple concurrent sessions. If you are not familiar with tmux, it is worth learning!
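
For readers new to tmux, here is a small sketch of how one detached session per agent could be started. The helper name start_agent_session and the session name are illustrative assumptions, and the command you pass in is whatever agent CLI you actually run.

```shell
# Start a detached tmux session that runs one command in a given directory.
# start_agent_session is a hypothetical helper, not a tmux built-in.
start_agent_session() {
  local name="$1" dir="$2" cmd="$3"
  # -d: do not attach, -s: session name, -c: starting directory
  tmux new-session -d -s "$name" -c "$dir" "$cmd"
}

# Typical follow-up commands:
#   tmux ls                      # list running sessions
#   tmux attach -t agent-auth    # reattach; detach again with Ctrl-b d
#   tmux kill-session -t agent-auth
```

For example, start_agent_session agent-auth ~/myrepo 'your-agent-command' would start the agent in the working clone. Note that the session ends when the command exits, so start a plain shell session instead if you want the scrollback to stick around.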

Managing agent to human communication

For notifications, I use https://ntfy.sh.

It is free, extremely simple, and works over plain HTTP POST. I have the iOS app installed, so I can walk away from my laptop and still get notified when work completes. In the agent instructions, I explicitly tell my agents to make a POST request once their work is done.

Example of a notification:

curl -X POST https://ntfy.sh/my-topic \
  -d "Agent finished refactoring auth flow on branch mybranch"

That is it. No SDKs, no auth setup required for basic usage. The notification shows up immediately on my phone/browser.
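
The same idea can be wrapped in a small helper so that any long running command reports success or failure on its own. This is only a sketch: notify, run_and_notify, and the NTFY_TOPIC_URL variable are names I made up for illustration, and my-topic is the placeholder topic from the example above. The Title header is a standard ntfy publishing header.

```shell
# Placeholder topic URL; override NTFY_TOPIC_URL with your own topic.
NTFY_TOPIC_URL="${NTFY_TOPIC_URL:-https://ntfy.sh/my-topic}"

# Send one notification: $1 is the title, $2 is the message body.
notify() {
  # -f: fail on HTTP errors, -sS: quiet but still show errors, -d: POST body
  curl -fsS -H "Title: $1" -d "$2" "$NTFY_TOPIC_URL" > /dev/null
}

# Run a command and report how it ended: $1 is a label, the rest is the command.
run_and_notify() {
  local label="$1"
  shift
  if "$@"; then
    notify "$label finished" "Command succeeded: $*"
  else
    local status=$?
    notify "$label failed" "Command exited with status $status: $*"
    return "$status"
  fi
}
```

Usage would look like run_and_notify "test suite" make test. Since basic usage has no auth, anyone who knows the topic name can read or post to it on the public server, so a long, hard-to-guess topic name is a sensible default.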

Notes

Some of what I am doing here overlaps with task delegation features in tools like Copilot CLI. I still prefer this setup because it gives me full control over isolation, repos, and long running workflows across multiple projects.

Before this, I had a simple .devcontainer setup. I would copy it into a repo, open it in VS Code, and run agents inside the container with tools like Copilot CLI preinstalled. That was my original "yolo box", but it has since been replaced by the VM based setup described above.

If there is interest, I can publish a repo with the Terraform, cloud-init scripts, Makefile, etc., and the old .devcontainer setup.