<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Zevs</title>
        <link>https://zevs.gg/</link>
        <description>Zevs' Blog</description>
        <lastBuildDate>Fri, 06 Mar 2026 09:32:29 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>Zevs</title>
            <url>https://zevs.gg/avatar.png</url>
            <link>https://zevs.gg/</link>
        </image>
        <copyright>CC BY-NC-SA 4.0 2025 © Zevs</copyright>
        <atom:link href="https://zevs.gg/feed.xml" rel="self" type="application/rss+xml"/>
        <item>
            <title><![CDATA[Building SMSCode: A Production Rust Service from Scratch]]></title>
            <link>https://zevs.gg/posts/building-smscode</link>
            <guid isPermaLink="true">https://zevs.gg/posts/building-smscode</guid>
            <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[How I built a virtual number platform handling thousands of concurrent orders with Axum, PostgreSQL, Redis, and a thin Astro frontend — the architecture, the patterns, and the hard-won decisions.]]></description>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<p><a href="https://smscode.gg">SMSCode</a> is a virtual number platform — users purchase temporary phone numbers to receive SMS verification codes. Think of it as a marketplace that connects users who need phone numbers with SMS providers who supply them, handling payments, order lifecycle, and real-time delivery in between.</p>
<p>This isn't a toy project or a demo. It's a production system with real money, real users, and real SMS providers that can fail in creative ways. Building it taught me more about Rust in production than any tutorial or book ever could.</p>
<p>This post is a deep dive into how it works — the architecture decisions, the patterns that survived production, and the ones that didn't.</p>
<h2>The Migration Story</h2>
<p>SMSCode started on <a href="https://directus.io">Directus</a>, a Node.js headless CMS. For the early stage, Directus was perfect — it gave me a database, an admin panel, and a REST API with zero custom code. I could focus on building the product instead of building infrastructure.</p>
<p>But as the platform grew past a few thousand users, the cracks showed:</p>
<ul>
<li><strong>Query performance</strong> — Directus abstracts SQL behind a generic query builder. As relationships got complex, generated queries became inefficient and hard to optimize.</li>
<li><strong>Business logic limitations</strong> — order lifecycle management, atomic balance operations, and provider integration needed logic that didn't fit Directus's hook system.</li>
<li><strong>Memory usage</strong> — the Node.js process was consuming 300-400MB at rest, spiking higher under load.</li>
<li><strong>Concurrency</strong> — background tasks (order expiration, provider reconciliation) competed with API requests on the same event loop.</li>
</ul>
<p>I needed a custom backend. The question was: <strong>what do I build it with?</strong></p>
<p>I considered staying in the Node.js ecosystem — Hono or Fastify would have been the safe choice. But the workload profile of SMSCode is unusual: it's not just request-response. There are 10 background tasks running on configurable intervals, real-time SSE streams, webhook processing, and atomic financial operations. All running 24/7 with zero tolerance for memory leaks or GC pauses.</p>
<p>Rust with <a href="https://github.com/tokio-rs/axum">Axum</a> was the answer. Axum gave me the web framework, <a href="https://tokio.rs">Tokio</a> gave me the async runtime for background tasks and SSE, and Rust's type system gave me guarantees that the financial logic would be correct.</p>
<p>The migration took about three weeks. Memory usage dropped from ~400MB to ~12MB. Tail latency dropped by 10x. And the codebase became easier to reason about, not harder — because the type system caught entire categories of bugs at compile time. You can see the result live at <a href="https://smscode.gg">smscode.gg</a>.</p>
<h2>Architecture Overview</h2>
<p>The system has three main components:</p>
<pre><code>┌──────────────────┐     ┌──────────────────┐
│   Astro (Web)    │     │  Admin SSR       │
│   SSR pages      │     │  TanStack Start  │
│   Port 4321      │     │  Port 3001       │
└────────┬─────────┘     └────────┬─────────┘
         │ axumFetch()            │ axumFetch()
         │ (internal auth)        │ (admin auth)
         ▼                        ▼
┌──────────────────────────────────────────────┐
│              Axum API (Rust)                  │
│              Port 3000                        │
│                                               │
│  /internal/*  ── Web frontend calls          │
│  /v1/*        ── Public REST API             │
│  /webhooks/*  ── SMS + payment callbacks     │
│  + 10 background tasks (Tokio)               │
└──────┬──────────┬──────────┬─────────────────┘
       │          │          │
       ▼          ▼          ▼
  PostgreSQL  Redis × 3   SMS Providers
  (17 tables) (session,   Payment Gateways
              cache,
              ratelimit)
</code></pre>
<p><strong>Astro</strong> is a thin proxy — it handles SSR rendering and session cookies but delegates <strong>all</strong> business logic to Axum via internal HTTP calls. This is a deliberate choice: Axum is the single source of truth. The web frontend (what you see at <a href="https://smscode.gg">smscode.gg</a>) can be rebuilt or replaced without touching business logic.</p>
<p><strong>The admin panel</strong> is a separate TanStack Start (React SSR) app. It communicates with Axum over the internal network, never exposed publicly. Different deployment cadence, different auth model, clean separation.</p>
<p><strong>Axum</strong> handles everything else: API routes, webhooks, background tasks, metrics, and real-time event streaming. A single Rust binary.</p>
<h2>Application State</h2>
<p>One of the first things you design in an Axum app is your <code>AppState</code> — the shared state injected into every request handler. Getting this wrong means refactoring hundreds of handlers later.</p>
<pre><code class="language-rust">#[derive(Clone)]
pub struct AppState {
    inner: Arc&lt;AppStateInner&gt;,
}

struct AppStateInner {
    pool: PgPool,
    redis_session: ConnectionManager,
    redis_cache: ConnectionManager,
    redis_ratelimit: ConnectionManager,
    config: Config,
    providers: ProviderRegistry,
    payment_client: reqwest::Client,
    webhook_client: reqwest::Client,
    email_client: reqwest::Client,
    metrics: Metrics,
    prom_handle: PrometheusHandle,
    push_service: Option&lt;WebPushService&gt;,
}
</code></pre>
<p>A few things to note:</p>
<p><strong>Three separate Redis connections</strong> — session, cache, and rate limit data live on three different Redis instances. This means I can flush the cache or rate limit data without killing user sessions. During development, I accidentally flushed Redis once. With a single instance, that would have logged out every user. With isolation, it was just a cache miss.</p>
<p><strong>Separate HTTP clients</strong> — payment gateway calls get a 10-second timeout. Webhook dispatch gets a 3-second timeout with no redirect following. SMS provider calls go through a proxy with their own circuit breaker. Each client is tuned for its specific use case.</p>
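<p>A sketch of that per-purpose tuning, assuming the timeout values described above (the builder calls are standard <code>reqwest</code> API; the function name is mine):</p>
<pre><code class="language-rust">use std::time::Duration;

// Each client is tuned for its caller: payment gateways may be slow,
// while webhook dispatch must fail fast and never follow redirects.
fn build_clients() -&gt; reqwest::Result&lt;(reqwest::Client, reqwest::Client)&gt; {
    let payment = reqwest::Client::builder()
        .timeout(Duration::from_secs(10))
        .build()?;
    let webhook = reqwest::Client::builder()
        .timeout(Duration::from_secs(3))
        .redirect(reqwest::redirect::Policy::none())
        .build()?;
    Ok((payment, webhook))
}
</code></pre>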
<p><strong><code>Arc&lt;AppStateInner&gt;</code> pattern</strong> — <code>AppState</code> is a thin <code>Clone</code>-able wrapper around an <code>Arc</code>. Cloning the state for each handler is just incrementing a reference count — zero allocation. The inner struct holds the actual data, which is shared across all handlers and background tasks.</p>
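<p>A minimal sketch of the wrapper (the field here is a stand-in for the real ones); a <code>Deref</code> impl is one common way to keep handler code reading <code>state.config</code> instead of <code>state.inner.config</code>:</p>
<pre><code class="language-rust">use std::{ops::Deref, sync::Arc};

// Illustrative stand-in for the real inner struct.
pub struct AppStateInner {
    pub config: String,
}

#[derive(Clone)]
pub struct AppState {
    inner: Arc&lt;AppStateInner&gt;,
}

// Deref coercion: &amp;AppState can be used anywhere &amp;AppStateInner is expected.
impl Deref for AppState {
    type Target = AppStateInner;
    fn deref(&amp;self) -&gt; &amp;AppStateInner {
        &amp;self.inner
    }
}

fn main() {
    let state = AppState {
        inner: Arc::new(AppStateInner { config: &quot;prod&quot;.into() }),
    };
    let for_handler = state.clone(); // refcount bump only, no allocation
    assert_eq!(Arc::strong_count(&amp;state.inner), 2);
    assert_eq!(for_handler.config, &quot;prod&quot;);
}
</code></pre>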
<p><strong>Optional services</strong> — <code>push_service: Option&lt;WebPushService&gt;</code> is <code>None</code> when VAPID keys aren't configured. The app gracefully degrades instead of panicking on startup. Same pattern for payment gateways and Google OAuth.</p>
<h2>Error Handling</h2>
<p>Every Axum handler returns <code>Result&lt;impl IntoResponse, AppError&gt;</code>. The <code>AppError</code> enum covers every failure mode:</p>
<pre><code class="language-rust">#[derive(Debug, thiserror::Error)]
pub enum AppError {
    #[error(&quot;Authentication required&quot;)]
    Unauthorized,

    #[error(&quot;Insufficient balance&quot;)]
    InsufficientBalance,

    #[error(&quot;{0}&quot;)]
    NotFound(String),

    #[error(&quot;{0}&quot;)]
    Validation(String),

    #[error(&quot;Too many requests&quot;)]
    RateLimit,

    #[error(&quot;Too many requests&quot;)]
    RateLimitWithRetry(i64),

    #[error(&quot;Provider error: {0}&quot;)]
    Provider(String),

    #[error(transparent)]
    Internal(#[from] anyhow::Error),

    #[error(&quot;Database error&quot;)]
    Database(#[from] sqlx::Error),
}
</code></pre>
<p>The <code>IntoResponse</code> implementation maps each variant to the correct HTTP status code and a consistent JSON envelope:</p>
<pre><code class="language-json">{
  &quot;success&quot;: false,
  &quot;error&quot;: {
    &quot;code&quot;: &quot;INSUFFICIENT_BALANCE&quot;,
    &quot;message&quot;: &quot;Not enough balance&quot;
  }
}
</code></pre>
<p>Two important details:</p>
<p><strong>Internal errors are opaque</strong> — <code>AppError::Internal</code> and <code>AppError::Database</code> both return a generic &quot;Internal server error&quot; to the client while logging the full error with <code>tracing::error!</code>. Users never see stack traces or database error messages.</p>
<p><strong>Rate limit with retry</strong> — <code>RateLimitWithRetry(i64)</code> adds a <code>Retry-After</code> header to the response, telling clients exactly when they can retry. This is critical for the public API where automated clients need to back off gracefully.</p>
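<p>Condensed as a pure function, the variant-to-response mapping looks roughly like this — the exact status codes are my illustration, not taken from the real implementation:</p>
<pre><code class="language-rust">enum AppError {
    Unauthorized,
    InsufficientBalance,
    NotFound(String),
    Validation(String),
    RateLimit,
    RateLimitWithRetry(i64),
    Provider(String),
    Internal,
    Database,
}

// Map each variant to (HTTP status, stable machine-readable code).
// Internal variants collapse to an opaque 500; details go to tracing only.
fn status_and_code(e: &amp;AppError) -&gt; (u16, &amp;'static str) {
    use AppError::*;
    match e {
        Unauthorized =&gt; (401, &quot;UNAUTHORIZED&quot;),
        InsufficientBalance =&gt; (402, &quot;INSUFFICIENT_BALANCE&quot;),
        Validation(_) =&gt; (400, &quot;VALIDATION_ERROR&quot;),
        NotFound(_) =&gt; (404, &quot;NOT_FOUND&quot;),
        // RateLimitWithRetry additionally sets a Retry-After header.
        RateLimit | RateLimitWithRetry(_) =&gt; (429, &quot;RATE_LIMITED&quot;),
        Provider(_) =&gt; (502, &quot;PROVIDER_ERROR&quot;),
        Internal | Database =&gt; (500, &quot;INTERNAL_ERROR&quot;),
    }
}

fn main() {
    assert_eq!(status_and_code(&amp;AppError::RateLimitWithRetry(30)).0, 429);
    assert_eq!(status_and_code(&amp;AppError::Database).1, &quot;INTERNAL_ERROR&quot;);
}
</code></pre>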
<p>The <code>?</code> operator makes error propagation clean across the entire callstack:</p>
<pre><code class="language-rust">async fn create_order(
    State(state): State&lt;AppState&gt;,
    auth: AuthUser,
    Json(input): Json&lt;CreateOrderInput&gt;,
) -&gt; Result&lt;Json&lt;OrderResponse&gt;, AppError&gt; {
    let product = get_product(&amp;state, input.product_id)
        .await?  // → AppError::Database on query failure
        .ok_or_else(|| AppError::NotFound(&quot;Product not found&quot;.into()))?;

    check_cancel_limit(&amp;state, auth.user_id)
        .await?;  // → AppError::RateLimit if exceeded

    let balance = atomic_debit(&amp;state, auth.user_id, product.price)
        .await?;  // → AppError::InsufficientBalance

    let number = state.providers()
        .get_number(&amp;params)
        .await
        .map_err(|e| AppError::Provider(e.to_string()))?;

    // ... create order record
    Ok(Json(order.into()))
}
</code></pre>
<p>Every <code>?</code> either propagates to the correct HTTP error or converts automatically via <code>#[from]</code>. No try/catch chains, no forgotten error handling.</p>
<h2>Atomic Balance Operations</h2>
<p>Financial operations are the most critical part of the system. A user's balance must always be consistent with their transaction ledger. Getting this wrong means users lose money or get free credit.</p>
<p>The core pattern is <strong>debit-first, refund-on-failure</strong>:</p>
<pre><code>1. BEGIN TRANSACTION
   a. UPDATE users SET balance = balance - $price
      WHERE id = $user_id AND balance &gt;= $price
      RETURNING balance
   b. INSERT transaction (ORDER_DEBIT, -$price, balance_after)
   c. COMMIT

2. Call SMS provider (external API — OUTSIDE transaction)

3. On success: INSERT order record
4. On failure: BEGIN → credit balance back + INSERT ORDER_REFUND → COMMIT
</code></pre>
<p>The debit happens <strong>before</strong> the provider call. Why? Because the provider call can take seconds. If we deducted after, a user could spend the same balance twice by firing two concurrent requests.</p>
<p>The SQL itself is a single atomic statement with a CAS (compare-and-swap) guard:</p>
<pre><code class="language-sql">UPDATE users
SET balance = balance - $1
WHERE id = $2 AND balance &gt;= $1
RETURNING balance
</code></pre>
<p>If the balance is insufficient, zero rows are returned, and we map that to <code>AppError::InsufficientBalance</code>. No race condition possible — PostgreSQL's row-level locking guarantees atomicity.</p>
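<p>The same compare-and-swap discipline can be seen in miniature with an in-process atomic — illustrative only, since the real guard lives in PostgreSQL:</p>
<pre><code class="language-rust">use std::sync::atomic::{AtomicI64, Ordering};

// Debit `price` only if the current balance covers it; mirrors the SQL guard
// `WHERE balance &gt;= $price ... RETURNING balance`.
fn try_debit(balance: &amp;AtomicI64, price: i64) -&gt; Result&lt;i64, ()&gt; {
    let mut current = balance.load(Ordering::Acquire);
    loop {
        if current &lt; price {
            return Err(()); // maps to AppError::InsufficientBalance
        }
        match balance.compare_exchange(
            current,
            current - price,
            Ordering::AcqRel,
            Ordering::Acquire,
        ) {
            Ok(_) =&gt; return Ok(current - price), // new balance, like RETURNING
            Err(actual) =&gt; current = actual,     // raced with another debit; retry
        }
    }
}

fn main() {
    let balance = AtomicI64::new(100);
    assert_eq!(try_debit(&amp;balance, 60), Ok(40));
    assert_eq!(try_debit(&amp;balance, 60), Err(())); // only 40 left
}
</code></pre>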
<p>To prevent integer overflow on refunds (balance is <code>BIGINT</code>):</p>
<pre><code class="language-sql">UPDATE users
SET balance = balance + $1
WHERE id = $2 AND balance &lt;= 9223372036854775807 - $1
RETURNING balance
</code></pre>
<p>A background task (<code>reconcile_balances</code>) runs every 30 minutes, comparing each user's balance against the sum of their transaction ledger. Any mismatch is logged and alerted. In months of production, it has never found one — but I sleep better knowing it's checking.</p>
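<p>The reconciliation check itself reduces to a signed sum — a toy version, assuming debits are stored as negative ledger amounts:</p>
<pre><code class="language-rust">// The balance must equal the signed sum of the user's ledger entries.
// Returns the drift on mismatch so it can be logged and alerted.
fn reconcile(balance: i64, ledger: &amp;[i64]) -&gt; Result&lt;(), i64&gt; {
    let expected: i64 = ledger.iter().sum();
    if balance == expected {
        Ok(())
    } else {
        Err(balance - expected)
    }
}

fn main() {
    // deposit 1000, order debit 250, refund 250, order debit 300
    let ledger = [1000, -250, 250, -300];
    assert_eq!(reconcile(700, &amp;ledger), Ok(()));
    assert_eq!(reconcile(650, &amp;ledger), Err(-50)); // drift would be alerted
}
</code></pre>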
<h2>Background Tasks with Leader Election</h2>
<p>SMSCode runs 10 background tasks on configurable intervals. These handle everything from expiring old orders to syncing the product catalog from SMS providers.</p>
<p>The challenge: I want to run multiple Axum instances for high availability, but background tasks should only run on <strong>one</strong> instance at a time. The solution is Redis-based leader election using <code>SETNX</code>:</p>
<pre><code class="language-rust">async fn acquire_leader_lock(
    redis: &amp;ConnectionManager,
    task: &amp;str,
    instance_id: &amp;str,
    ttl_secs: u64,
) -&gt; bool {
    let key = format!(&quot;task_lock:{task}&quot;);
    let result: RedisResult&lt;Option&lt;bool&gt;&gt; = redis::cmd(&quot;SET&quot;)
        .arg(&amp;key)
        .arg(instance_id)
        .arg(&quot;NX&quot;)        // Only set if not exists
        .arg(&quot;EX&quot;)        // Expire after TTL
        .arg(ttl_secs)
        .query_async(&amp;mut redis.clone())
        .await;

    matches!(result, Ok(Some(true)))
}
</code></pre>
<p>Each task tick: try to acquire the lock. If another instance already holds it, skip. If we get it, run the task, then release the lock with a Lua CAS script (only delete if we still own it):</p>
<pre><code class="language-lua">if redis.call(&quot;GET&quot;, KEYS[1]) == ARGV[1] then
    return redis.call(&quot;DEL&quot;, KEYS[1])
end
return 0
</code></pre>
<p>The lock TTL is <code>max(interval × 2, 30s)</code> — a safety net so that if an instance crashes mid-task, the lock expires and another instance picks up. Normal runs release immediately.</p>
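<p>The TTL rule is small enough to show in full:</p>
<pre><code class="language-rust">use std::time::Duration;

// Lock TTL: cover a full interval plus slack, but never under 30s, so a
// crashed leader's lock always expires and another instance can take over.
fn lock_ttl(interval: Duration) -&gt; Duration {
    (interval * 2).max(Duration::from_secs(30))
}

fn main() {
    assert_eq!(lock_ttl(Duration::from_secs(10)), Duration::from_secs(30));
    assert_eq!(lock_ttl(Duration::from_secs(60)), Duration::from_secs(120));
}
</code></pre>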
<p>The task loop itself is shutdown-aware using Tokio's <code>CancellationToken</code>:</p>
<pre><code class="language-rust">loop {
    tokio::select! {
        _ = shutdown.cancelled() =&gt; break,
        _ = interval.tick() =&gt; {
            if !acquire_leader_lock(...).await { continue; }

            tokio::select! {
                _ = shutdown.cancelled() =&gt; {
                    release_leader_lock(...).await;
                    break;
                }
                result = func(state.clone()) =&gt; {
                    // log and record metrics
                    release_leader_lock(...).await;
                }
            }
        }
    }
}
</code></pre>
<p>The nested <code>select!</code> ensures that even a long-running task is interrupted promptly on SIGTERM — critical for zero-downtime deployments.</p>
<h2>Provider Integration with Circuit Breaker</h2>
<p>SMS providers are external APIs that fail in exciting ways — timeouts, rate limits, malformed responses, or just going down entirely. A failing provider shouldn't take down the entire platform.</p>
<p>Each provider implements the <code>ProviderClient</code> trait:</p>
<pre><code class="language-rust">#[async_trait]
pub trait ProviderClient: Send + Sync {
    fn code(&amp;self) -&gt; &amp;str;
    fn circuit_state(&amp;self) -&gt; (u32, bool);

    async fn get_number(&amp;self, params: &amp;GetNumberParams)
        -&gt; Result&lt;GetNumberResult, ProviderError&gt;;
    async fn cancel_activation(&amp;self, id: &amp;str)
        -&gt; Result&lt;bool, ProviderError&gt;;
    async fn get_status(&amp;self, id: &amp;str)
        -&gt; Result&lt;StatusResult, ProviderError&gt;;
    // ... more methods
}
</code></pre>
<p>The <code>Send + Sync</code> bounds are required because providers are stored as <code>Arc&lt;dyn ProviderClient&gt;</code> in the <code>ProviderRegistry</code> and shared across all Tokio tasks. This is where Rust's type system shines — the compiler enforces that provider implementations are thread-safe.</p>
<p>Each provider has a built-in circuit breaker. After N consecutive failures, the circuit opens and subsequent calls fail fast without hitting the external API. After a cooldown period, the circuit enters a half-open state and allows one probe request. If it succeeds, the circuit closes and normal operation resumes.</p>
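<p>A stripped-down sketch of that state machine — the threshold and cooldown here are illustrative, and a production version would also limit half-open probes to one in flight:</p>
<pre><code class="language-rust">use std::time::{Duration, Instant};

enum Circuit {
    Closed,
    Open(Instant), // when the circuit opened
    HalfOpen,
}

struct Breaker {
    state: Circuit,
    failures: u32,
    threshold: u32,
    cooldown: Duration,
}

impl Breaker {
    // Should we hit the external API at all?
    fn allow(&amp;mut self) -&gt; bool {
        match self.state {
            Circuit::Closed | Circuit::HalfOpen =&gt; true,
            Circuit::Open(opened_at) =&gt; {
                if opened_at.elapsed() &gt;= self.cooldown {
                    self.state = Circuit::HalfOpen; // allow a probe request
                    true
                } else {
                    false // fail fast, skip the external call
                }
            }
        }
    }

    // Record the outcome of a provider call.
    fn record(&amp;mut self, success: bool) {
        if success {
            self.failures = 0;
            self.state = Circuit::Closed;
        } else {
            self.failures += 1;
            if self.failures &gt;= self.threshold {
                self.state = Circuit::Open(Instant::now());
            }
        }
    }
}

fn main() {
    let mut b = Breaker {
        state: Circuit::Closed,
        failures: 0,
        threshold: 3,
        cooldown: Duration::from_secs(60),
    };
    for _ in 0..3 {
        assert!(b.allow());
        b.record(false);
    }
    assert!(!b.allow()); // circuit open: fail fast without the API call
}
</code></pre>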
<p>The <code>ProviderRegistry</code> loads provider configs from the database with a 5-minute cache, dispatching to the correct client based on the <code>protocol</code> column (e.g., <code>sms_activate</code>, <code>sms_bower</code>). Adding a new provider protocol means implementing the trait — the registry handles discovery and caching automatically.</p>
<h2>Real-Time with Server-Sent Events</h2>
<p>When a user purchases a number and waits for the SMS code, they need to know the moment it arrives. Polling every few seconds works but wastes resources. SSE gives us push-based updates with minimal overhead.</p>
<p>The architecture:</p>
<pre><code>Browser EventSource
    → Astro /api/orders/stream (session check)
        → Axum /internal/user/orders/stream (Redis pub/sub)
                ↑
    Redis PUBLISH ← webhook handler (OTP received)
                  ← background task (order expired)
                  ← order action (created, canceled)
</code></pre>
<p>When an SMS webhook arrives with an OTP code, the handler updates the database and publishes an event to the user's Redis channel:</p>
<pre><code class="language-rust">// Publish order event to Redis pub/sub
publish_order_event(&amp;state, user_id, OrderEvent {
    order_id,
    status: &quot;OTP_RECEIVED&quot;.into(),
    phone_number: Some(phone),
    otp_code: Some(code),
    otp_message: Some(message),
}).await;
</code></pre>
<p>The SSE handler subscribes to the user's channel and streams events as they arrive. The Astro layer validates the session cookie and proxies the stream, adding auth without exposing the internal Axum endpoint.</p>
<p><strong>Fallback</strong>: If the SSE connection drops (tab backgrounded, network hiccup), the frontend falls back to polling. On reconnect, it polls once immediately to catch any events missed during the gap — because Redis pub/sub is fire-and-forget, events during a disconnection are lost.</p>
<p>This works across multiple Axum replicas because Redis pub/sub broadcasts to all subscribers. A webhook hitting replica A publishes to Redis, and the SSE connection on replica B receives it instantly.</p>
<h2>Four Auth Layers</h2>
<p>Different consumers need different auth mechanisms:</p>
<table>
<thead>
<tr>
<th>Layer</th>
<th>Mechanism</th>
<th>Used by</th>
</tr>
</thead>
<tbody>
<tr>
<td>Internal</td>
<td>Shared secret + user identity header</td>
<td>Astro → Axum</td>
</tr>
<tr>
<td>Admin</td>
<td>Session header over internal network</td>
<td>Admin panel → Axum</td>
</tr>
<tr>
<td>Public API</td>
<td><code>Authorization: Bearer &lt;token&gt;</code></td>
<td><a href="https://smscode.gg">External API consumers</a></td>
</tr>
<tr>
<td>Webhook</td>
<td>Signature verification</td>
<td>SMS providers, payment gateways</td>
</tr>
</tbody>
</table>
<p>The internal auth uses a shared secret between Astro and Axum — only valid over the internal network (never exposed to the internet). Astro reads the session cookie, decrypts it (AES-256-GCM), and forwards the user identity in a separate header.</p>
<p>The admin auth uses a separate session store in Redis. Admin sessions carry permissions (<code>role</code>, <code>permissions</code> array), checked by the <code>dashboard_auth</code> middleware before any handler runs.</p>
<p>All key comparisons use constant-time comparison to prevent timing attacks. Password hashing uses argon2id via <code>spawn_blocking</code> — heavy CPU work runs on Tokio's blocking thread pool, keeping the async runtime responsive.</p>
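<p>For illustration, a constant-time comparison reduces to an XOR fold over every byte — in practice you would more likely reach for a vetted crate such as <code>subtle</code>:</p>
<pre><code class="language-rust">// Examine every byte regardless of where the first mismatch occurs,
// so the comparison's duration leaks nothing about the secret.
fn ct_eq(a: &amp;[u8], b: &amp;[u8]) -&gt; bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y; // accumulate differences without branching
    }
    diff == 0
}

fn main() {
    assert!(ct_eq(b&quot;secret-key&quot;, b&quot;secret-key&quot;));
    assert!(!ct_eq(b&quot;secret-key&quot;, b&quot;secret-kez&quot;));
}
</code></pre>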
<h2>Monitoring</h2>
<p>You can't run a financial platform blind. SMSCode exports metrics at two levels:</p>
<p><strong>In-memory metrics</strong> — a custom <code>Metrics</code> struct using <code>DashMap</code> and ring buffers tracks per-endpoint latency (p50/p95/p99), provider call performance, cache hit rates, balance operations, and background task health. Exposed via <code>/internal/metrics</code> as JSON for the admin dashboard.</p>
<p><strong>Prometheus metrics</strong> — standard counters and histograms (<code>http_requests_total</code>, <code>http_request_duration_seconds</code>, <code>provider_api_calls_total</code>, etc.) scraped by Prometheus every 15 seconds. Grafana dashboards visualize trends and AlertManager sends notifications to Telegram when thresholds are breached.</p>
<p>The monitoring stack runs on a separate LXC container from the application — if Axum crashes, monitoring stays up and alerts fire.</p>
<h2>Infrastructure</h2>
<p>Production runs on a single EPYC server with Proxmox VE, using LXC containers (not Docker):</p>
<table>
<thead>
<tr>
<th>Container</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td>App</td>
<td>Axum API + Astro SSR + Admin panel</td>
</tr>
<tr>
<td>Database</td>
<td>PostgreSQL 17</td>
</tr>
<tr>
<td>Cache</td>
<td>Redis × 3 (session / cache / ratelimit)</td>
</tr>
<tr>
<td>Gateway</td>
<td>Traefik + Cloudflare Tunnel</td>
</tr>
<tr>
<td>Monitoring</td>
<td>Grafana + Prometheus</td>
</tr>
</tbody>
</table>
<p>LXC over Docker because: near-native performance, lower resource overhead, and systemd service management instead of Docker's restart policies. Each service (<code>smscode-api</code>, <code>smscode-web</code>, <code>smscode-admin</code>) is a systemd unit with proper dependency ordering and env file loading.</p>
<p>Traffic flows through Cloudflare (DDoS protection, TLS termination) → Cloudflare Tunnel → Traefik (routing) → the appropriate service. The only ports exposed to the internet are behind Cloudflare — the actual server IP is hidden.</p>
<p>VLAN segmentation isolates production from infrastructure. The monitoring container can scrape metrics from the app container, but not the other way around.</p>
<h2>Deployment</h2>
<p>Deployments are automated via a single script:</p>
<pre><code class="language-bash">./scripts/deploy.sh api    # Build + deploy Axum
./scripts/deploy.sh web    # Build + deploy Astro
./scripts/deploy.sh admin  # Build + deploy Admin
./scripts/deploy.sh all    # All three
</code></pre>
<p>The script handles building (cross-compilation for Rust, <code>bun run build</code> for the frontends), uploading artifacts to the server, creating backups, restarting systemd services, and running health checks. If the health check fails, it rolls back automatically.</p>
<p>For the Rust binary, I build with <code>--release</code> on my local machine and <code>scp</code> the binary to the server. The binary is ~15MB and starts in milliseconds. Compare that to deploying a Node.js app with <code>node_modules</code> — it's a refreshingly simple deployment model.</p>
<h2>Lessons from Production</h2>
<p>After running this system in production, a few hard-won lessons:</p>
<p><strong>Debit first, refund on failure.</strong> Never rely on a successful external API call before touching balances. Provider APIs are unreliable. Your database isn't. Debit the balance, call the provider, and refund if it fails. The alternative — reserving balance or deducting after — creates race conditions that are nearly impossible to reproduce in testing but trivial to trigger in production.</p>
<p><strong>Three Redis instances, not one.</strong> The first time I needed to flush the rate limit store during a configuration change, I was glad sessions and cache were on separate instances. Isolation costs almost nothing and saves you from cascading failures.</p>
<p><strong>Circuit breakers are essential.</strong> A failing SMS provider without a circuit breaker will eat your timeout budget across every request. With a circuit breaker, failures are detected in milliseconds after the circuit opens, and the system gracefully routes to other providers.</p>
<p><strong>SSE with polling fallback.</strong> Pure SSE is fragile — browsers throttle background tabs, mobile connections drop, proxies have idle timeouts. Always have a polling fallback that activates seamlessly. Users don't care about the transport mechanism; they care that the OTP appears when it arrives.</p>
<p><strong>Reconciliation tasks are non-negotiable.</strong> For anything financial, run periodic reconciliation. Our <code>reconcile_balances</code> task has never found a discrepancy — which tells me the atomic operations are correct. But the moment I remove it is the moment a bug will slip through.</p>
<p><strong>Rust's compile times are the real cost.</strong> The initial build of the workspace takes ~2 minutes. Incremental rebuilds are 10-15 seconds. Coming from the instant feedback loop of Node.js, this requires an adjustment in workflow — I batch changes and think more carefully before compiling. But in exchange, when it compiles, it usually works.</p>
<h2>What's Next</h2>
<p>The platform is stable and handling production traffic at <a href="https://smscode.gg">smscode.gg</a>. Upcoming work includes:</p>
<ul>
<li><strong>Horizontal scaling</strong> — the leader-elected task system already supports multiple replicas. Next step is load balancing across multiple Axum instances behind Traefik.</li>
<li><strong>More providers</strong> — the <code>ProviderClient</code> trait makes adding new SMS providers straightforward. Each new protocol is a separate module implementing the same trait.</li>
<li><strong>GraphQL for the admin</strong> — the admin panel currently makes many small REST calls. A GraphQL layer would let it fetch exactly the data it needs in a single request.</li>
</ul>
<p>Building a production Rust service from scratch was a significant investment — but seeing <a href="https://smscode.gg">SMSCode</a> handle thousands of concurrent operations at 12MB of memory with sub-millisecond latency made every fight with the borrow checker worth it.</p>
<p>You can find my <a href="/uses">full tech stack here</a>.</p>
<p>Thanks for reading!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[DNS, Cloudflare & Zero Trust Networking]]></title>
            <link>https://zevs.gg/posts/dns-cloudflare-zero-trust</link>
            <guid isPermaLink="true">https://zevs.gg/posts/dns-cloudflare-zero-trust</guid>
            <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[A practical guide to managing DNS with Cloudflare, securing services with Zero Trust, and building a network where nothing is trusted by default.]]></description>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<p>In my <a href="/posts/self-hosted-infrastructure">self-hosted infrastructure post</a>, I mentioned Cloudflare sitting in front of my services. What I didn't cover was how much work Cloudflare actually does beyond basic proxying — and how a zero trust model changed the way I think about network security entirely.</p>
<p>This post goes deeper into DNS management, Cloudflare's security stack, and the zero trust architecture that ties everything together.</p>
<h2>DNS Fundamentals (That People Get Wrong)</h2>
<p>Before diving into Cloudflare specifics, it's worth revisiting how DNS actually works — because most developers treat it like magic and then get surprised when things break.</p>
<p>DNS is a distributed, hierarchical database. When someone visits <code>app.example.com</code>, the query travels through multiple layers: root nameservers, TLD nameservers (<code>.com</code>), authoritative nameservers (yours or your provider's), and finally returns an IP address. Each layer caches the result based on TTL (time-to-live) values.</p>
<p>The most common mistake I see is setting TTLs too high during development. If you set a TTL of 86400 (24 hours) and then need to change a DNS record, you're waiting up to a full day for the change to propagate globally. During active development or migration, I keep TTLs at 300 seconds (5 minutes). Once everything is stable, I increase them.</p>
<pre><code class="language-bash"># Check current DNS resolution and TTL
dig app.example.com +short
dig app.example.com +noall +answer

# Check propagation across different nameservers
dig @8.8.8.8 app.example.com    # Google DNS
dig @1.1.1.1 app.example.com    # Cloudflare DNS
dig @9.9.9.9 app.example.com    # Quad9
</code></pre>
<p>The other common mistake: not understanding the difference between record types. <code>A</code> records point to IPv4 addresses, <code>AAAA</code> to IPv6, <code>CNAME</code> is an alias to another domain (but can't be used at the zone apex), and <code>MX</code> handles email routing. Cloudflare adds proxy functionality on top — when a record is &quot;proxied&quot; (orange cloud), traffic goes through Cloudflare's network instead of directly to your server.</p>
<h2>Cloudflare DNS Management</h2>
<p>I manage all my domains through Cloudflare. The free tier alone gives you authoritative DNS with Anycast routing, DDoS protection, and a CDN. For a self-hosted setup, that's an incredible amount of value for zero cost.</p>
<p>My DNS setup follows a consistent pattern across all domains:</p>
<pre><code># Zone: example.com
# Root domain and www → main web server (proxied)
example.com        A      → Cloudflare Proxy → Origin Server
www                CNAME  → example.com (proxied)

# Subdomains for services (proxied)
app                A      → Cloudflare Proxy → Origin Server
api                A      → Cloudflare Proxy → Origin Server
status             A      → Cloudflare Proxy → Origin Server

# Internal services (DNS-only, accessed via Tailscale)
grafana            A      → Internal IP (DNS only, gray cloud)
proxmox            A      → Internal IP (DNS only, gray cloud)

# Email (MX records, never proxied)
@                  MX     → mail provider
@                  TXT    → SPF record
_dmarc             TXT    → DMARC policy
</code></pre>
<p>The key decision for each record is whether to proxy it through Cloudflare (orange cloud) or not (gray cloud). Proxied records get DDoS protection, caching, WAF, and hide your origin IP. DNS-only records expose the real IP. The rule is simple: <strong>proxy anything public, DNS-only for internal services</strong>.</p>
<p>For internal services, I point DNS records to Tailscale IPs. These are only resolvable and reachable from within the Tailscale network. Even if someone discovers the DNS record, the IP is useless without being on the mesh.</p>
<h2>Automating DNS with Terraform</h2>
<p>Managing DNS records through the Cloudflare dashboard works for a few records. Once you have dozens across multiple domains, it becomes error-prone. I use <a href="https://www.terraform.io">Terraform</a> with the Cloudflare provider to manage everything as code.</p>
<pre><code class="language-hcl"># providers.tf
terraform {
  required_providers {
    cloudflare = {
      source  = &quot;cloudflare/cloudflare&quot;
      version = &quot;~&gt; 4.0&quot;
    }
  }
}

provider &quot;cloudflare&quot; {
  api_token = var.cloudflare_api_token
}
</code></pre>
<p>Each domain gets its own module:</p>
<pre><code class="language-hcl"># modules/example-com/main.tf
resource &quot;cloudflare_zone&quot; &quot;main&quot; {
  account_id = var.account_id
  zone       = &quot;example.com&quot;
}

resource &quot;cloudflare_record&quot; &quot;root&quot; {
  zone_id = cloudflare_zone.main.id
  name    = &quot;@&quot;
  content = var.origin_ip
  type    = &quot;A&quot;
  proxied = true
  ttl     = 1  # Auto TTL when proxied
}

resource &quot;cloudflare_record&quot; &quot;app&quot; {
  zone_id = cloudflare_zone.main.id
  name    = &quot;app&quot;
  content = var.origin_ip
  type    = &quot;A&quot;
  proxied = true
  ttl     = 1
}

resource &quot;cloudflare_record&quot; &quot;grafana&quot; {
  zone_id = cloudflare_zone.main.id
  name    = &quot;grafana&quot;
  content = var.tailscale_ip
  type    = &quot;A&quot;
  proxied = false  # Internal only
  ttl     = 300
}
</code></pre>
<p>Now DNS changes go through version control. I can review changes in a PR before applying them, and if something goes wrong, I roll back by reverting the commit and running <code>terraform apply</code> again. No more clicking around in dashboards wondering who changed what and when.</p>
<pre><code class="language-bash"># Preview changes before applying
terraform plan

# Apply changes
terraform apply

# Oops, roll back
git revert HEAD &amp;&amp; terraform apply
</code></pre>
<h2>Cloudflare Tunnel: No Open Ports</h2>
<p>This is the feature that changed my setup the most. <a href="https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/">Cloudflare Tunnel</a> (formerly Argo Tunnel) creates an outbound-only connection from your server to Cloudflare's network. Traffic flows from visitors → Cloudflare → tunnel → your server. Your server never accepts inbound connections from the internet.</p>
<p>This means <strong>zero open ports on your firewall</strong>. No port 80, no port 443, nothing. The attack surface drops to essentially zero because there's nothing to scan or probe.</p>
<p>Setting up a tunnel:</p>
<pre><code class="language-bash"># Install cloudflared
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared
chmod +x /usr/local/bin/cloudflared

# Authenticate with Cloudflare
cloudflared tunnel login

# Create a tunnel
cloudflared tunnel create my-server

# This generates a credentials file at:
# ~/.cloudflared/&lt;tunnel-id&gt;.json
</code></pre>
<p>The tunnel configuration maps public hostnames to internal services:</p>
<pre><code class="language-yaml"># ~/.cloudflared/config.yml
tunnel: my-server
credentials-file: /root/.cloudflared/&lt;tunnel-id&gt;.json

ingress:
  - hostname: app.example.com
    service: http://localhost:8080
  - hostname: api.example.com
    service: http://localhost:3000
  - hostname: status.example.com
    service: http://localhost:9090
  # Catch-all rule (required)
  - service: http_status:404
</code></pre>
<p>Each hostname maps to a local service. Cloudflare handles TLS termination, so the tunnel connects to your services over plain HTTP internally. Traefik becomes optional for public services — though I still use it for internal routing and middleware.</p>
<p>I run <code>cloudflared</code> as a systemd service so it starts automatically and reconnects on failure:</p>
<pre><code class="language-bash">cloudflared service install
systemctl enable cloudflared
systemctl start cloudflared
</code></pre>
<p>The DNS records for tunneled services use <code>CNAME</code> pointing to <code>&lt;tunnel-id&gt;.cfargotunnel.com</code>. Terraform handles this too:</p>
<pre><code class="language-hcl">resource &quot;cloudflare_record&quot; &quot;app_tunnel&quot; {
  zone_id = cloudflare_zone.main.id
  name    = &quot;app&quot;
  content = &quot;${var.tunnel_id}.cfargotunnel.com&quot;
  type    = &quot;CNAME&quot;
  proxied = true
}
</code></pre>
<h2>Zero Trust: Trust Nothing, Verify Everything</h2>
<p>Traditional network security works like a castle with a moat — everything outside the perimeter is untrusted, everything inside is trusted. The problem is obvious: once an attacker gets past the perimeter (phishing, compromised credentials, supply chain attack), they have access to everything.</p>
<p>Zero trust flips this model. <strong>No user, device, or network is inherently trusted.</strong> Every request must be authenticated and authorized, regardless of where it comes from. Even if you're on the &quot;internal&quot; network, you still prove who you are before accessing anything.</p>
<p>My zero trust stack has three layers:</p>
<h3>Layer 1: Cloudflare Access</h3>
<p><a href="https://developers.cloudflare.com/cloudflare-one/policies/access/">Cloudflare Access</a> puts an authentication layer in front of any web application. Instead of exposing a login page to the internet and hoping your app's auth is bulletproof, Cloudflare handles authentication before the request ever reaches your server.</p>
<p>I protect internal dashboards like this:</p>
<pre><code class="language-hcl"># Cloudflare Access application
resource &quot;cloudflare_access_application&quot; &quot;grafana&quot; {
  zone_id          = cloudflare_zone.main.id
  name             = &quot;Grafana&quot;
  domain           = &quot;grafana.example.com&quot;
  session_duration = &quot;24h&quot;
}

# Access policy: only allow specific emails
resource &quot;cloudflare_access_policy&quot; &quot;grafana_policy&quot; {
  application_id = cloudflare_access_application.grafana.id
  zone_id        = cloudflare_zone.main.id
  name           = &quot;Allow admins&quot;
  precedence     = 1
  decision       = &quot;allow&quot;

  include {
    email = [&quot;admin@example.com&quot;]
  }
}
</code></pre>
<p>When someone visits <code>grafana.example.com</code>, Cloudflare intercepts the request and presents a login screen. They can authenticate via email OTP, Google, GitHub, or any other configured identity provider. Only after successful authentication does Cloudflare forward the request to the origin. The application itself never sees unauthenticated traffic.</p>
<p>This is particularly powerful for services that have weak or no built-in auth. I have some older tools that only support basic HTTP auth — putting Cloudflare Access in front of them gives me proper authentication without modifying the application.</p>
<h3>Layer 2: Tailscale for Machine-to-Machine</h3>
<p>While Cloudflare Access handles browser-based access, <a href="https://tailscale.com">Tailscale</a> handles machine-to-machine and SSH communication. Every device in my network has a Tailscale client installed, and they communicate over WireGuard-encrypted tunnels.</p>
<p>Tailscale ACLs (Access Control Lists) define who can reach what:</p>
<pre><code class="language-json">{
  &quot;acls&quot;: [
    {
      &quot;action&quot;: &quot;accept&quot;,
      &quot;src&quot;: [&quot;group:admins&quot;],
      &quot;dst&quot;: [&quot;tag:server:*&quot;]
    },
    {
      &quot;action&quot;: &quot;accept&quot;,
      &quot;src&quot;: [&quot;tag:monitoring&quot;],
      &quot;dst&quot;: [&quot;tag:server:9100&quot;]
    },
    {
      &quot;action&quot;: &quot;accept&quot;,
      &quot;src&quot;: [&quot;tag:server&quot;],
      &quot;dst&quot;: [&quot;tag:server:*&quot;]
    }
  ],
  &quot;groups&quot;: {
    &quot;group:admins&quot;: [&quot;user@example.com&quot;]
  },
  &quot;tagOwners&quot;: {
    &quot;tag:server&quot;: [&quot;group:admins&quot;],
    &quot;tag:monitoring&quot;: [&quot;group:admins&quot;]
  }
}
</code></pre>
<p>This ACL says: admins can reach any port on servers, the monitoring stack can only reach port 9100 (node_exporter) on servers, and servers can talk to each other. Everything else is denied by default.</p>
<p>The key insight is that these ACLs work at the network level. Even if a service has no authentication, it's only reachable by authorized devices. Defense in depth — the network enforces what the application might not.</p>
<h3>Layer 3: Application-Level Auth</h3>
<p>The innermost layer is application-level authentication. Even with Cloudflare Access and Tailscale ACLs, each service still has its own auth. If Cloudflare Access were somehow bypassed (configuration error, new network path), the application still requires valid credentials.</p>
<p>For services behind Cloudflare Access, I validate the <code>Cf-Access-Jwt-Assertion</code> header to ensure the request actually came through Access and wasn't somehow injected:</p>
<pre><code class="language-python">import json

import jwt
import requests

# Cloudflare publishes its signing keys as a JWK set at this endpoint
CERTS_URL = &quot;https://example.cloudflareaccess.com/cdn-cgi/access/certs&quot;

def verify_cf_access_token(request):
    token = request.headers.get(&quot;Cf-Access-Jwt-Assertion&quot;)
    if not token:
        return False

    keys = requests.get(CERTS_URL).json()[&quot;keys&quot;]
    for key in keys:
        try:
            jwt.decode(
                token,
                key=jwt.algorithms.RSAAlgorithm.from_jwk(json.dumps(key)),
                algorithms=[&quot;RS256&quot;],
                audience=&quot;&lt;your-audience-tag&gt;&quot;,
            )
            return True
        except jwt.InvalidTokenError:
            continue
    return False
</code></pre>
<p>Three layers, three independent auth mechanisms. An attacker would need to bypass Cloudflare Access, be on the Tailscale network, <em>and</em> have valid application credentials. Each layer reduces the blast radius if another layer fails.</p>
<h2>WAF &amp; Security Rules</h2>
<p>Cloudflare's <a href="https://developers.cloudflare.com/waf/">Web Application Firewall</a> runs on every proxied request. The managed rulesets catch common attacks — SQL injection, XSS, path traversal — without any configuration. But the real power is in custom rules.</p>
<p>I use custom WAF rules for patterns specific to my services:</p>
<pre><code># Block requests with suspicious user agents
(http.user_agent contains &quot;sqlmap&quot;) or
(http.user_agent contains &quot;nikto&quot;) or
(http.user_agent contains &quot;nmap&quot;) → Block

# Rate limit API endpoints
(http.request.uri.path matches &quot;^/api/&quot;) → Rate limit: 100 req/min per IP

# Block access to sensitive paths
(http.request.uri.path contains &quot;/.env&quot;) or
(http.request.uri.path contains &quot;/wp-admin&quot;) or
(http.request.uri.path contains &quot;/.git&quot;) → Block

# Country-based restrictions for admin paths
(http.request.uri.path contains &quot;/admin&quot;) and
(not ip.geoip.country in {&quot;ID&quot;}) → Block
</code></pre>
<p>The country-based rule is practical, not security theater. I know that all legitimate admin access comes from Indonesia. Blocking other countries for admin paths eliminates a massive amount of automated scanning noise. It's not a security boundary — it's a noise reduction filter.</p>
<h2>Caching Strategy</h2>
<p>Cloudflare's CDN caches static assets at edge locations worldwide. For a self-hosted setup, this means your server handles less traffic and users get faster responses regardless of their location.</p>
<p>My caching strategy:</p>
<pre><code># Page Rules (or Cache Rules in the new UI)

# Static assets: cache aggressively
*.example.com/assets/*     → Cache Everything, Edge TTL: 1 month
*.example.com/images/*     → Cache Everything, Edge TTL: 1 month
*.example.com/fonts/*      → Cache Everything, Edge TTL: 1 year

# API responses: never cache
api.example.com/*          → Bypass Cache

# HTML pages: short cache with revalidation
app.example.com/*          → Edge TTL: 1 hour, Browser TTL: 5 min
</code></pre>
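<p>These rules can be codified in Terraform as well. A sketch using the provider's <code>cloudflare_page_rule</code> resource (the target pattern and TTL here are illustrative):</p>

```hcl
# Hypothetical rule mirroring the assets policy above
resource "cloudflare_page_rule" "assets" {
  zone_id  = cloudflare_zone.main.id
  target   = "*.example.com/assets/*"
  priority = 1

  actions {
    cache_level    = "cache_everything"
    edge_cache_ttl = 2592000 # 1 month, in seconds
  }
}
```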
<p>For my static sites (like this blog), I use the <code>Cache Everything</code> page rule with long edge TTLs. The site is rebuilt and deployed on each push, and I purge the Cloudflare cache as part of the deployment pipeline:</p>
<pre><code class="language-bash"># Purge entire cache after deployment
curl -X POST &quot;https://api.cloudflare.com/client/v4/zones/&lt;zone-id&gt;/purge_cache&quot; \
  -H &quot;Authorization: Bearer &lt;api-token&gt;&quot; \
  -H &quot;Content-Type: application/json&quot; \
  --data '{&quot;purge_everything&quot;: true}'
</code></pre>
<h2>Putting It All Together</h2>
<p>Here's how traffic flows through the stack for a public service:</p>
<pre><code>User → Cloudflare DNS (Anycast)
     → Cloudflare Edge (WAF, DDoS, Cache)
     → Cloudflare Tunnel (encrypted, outbound-only)
     → Origin Server
     → Traefik (internal routing)
     → Docker Container
</code></pre>
<p>For an internal service accessed by an admin:</p>
<pre><code>Admin → Cloudflare DNS
      → Cloudflare Access (authentication)
      → Cloudflare Tunnel
      → Origin Server
      → Service (validates CF Access JWT)
</code></pre>
<p>For SSH and machine-to-machine:</p>
<pre><code>Admin Device → Tailscale (WireGuard mesh)
             → ACL Check
             → Target Server
</code></pre>
<p>No single layer is responsible for security. Each layer adds defense, and the failure of any one layer doesn't compromise the whole system. That's the core principle of zero trust — assume every layer will eventually fail, and design accordingly.</p>
<h2>What I'd Do Differently</h2>
<p>If I were starting over:</p>
<ul>
<li><strong>Cloudflare Tunnel from day one.</strong> I spent months with open ports and Cloudflare proxy before discovering Tunnel. The reduction in attack surface is dramatic and the setup is simpler than managing firewall rules.</li>
<li><strong>Terraform from the start.</strong> Manual DNS management through dashboards doesn't scale. Once you have more than one domain, it's worth the initial investment to codify everything.</li>
<li><strong>Tighter Tailscale ACLs.</strong> I started with a permissive <code>&quot;*&quot;</code> rule and tightened over time. Starting tight and loosening is safer than starting loose and tightening.</li>
</ul>
<h2>Wrap Up</h2>
<p>The combination of Cloudflare (DNS, Tunnel, Access, WAF) and Tailscale (mesh VPN, ACLs) gives you an enterprise-grade zero trust architecture for the cost of a few hours of setup. Most of Cloudflare's features are free, and Tailscale's free tier covers up to 100 devices.</p>
<p>The old model of &quot;everything behind the firewall is safe&quot; doesn't work anymore — it probably never did. Zero trust isn't just a buzzword. It's a fundamentally better way to think about network security: verify everything, trust nothing, and design for the assumption that every layer will eventually be compromised.</p>
<p>If you're running self-hosted services with open ports and no authentication layer beyond the application itself, start with Cloudflare Tunnel. It's the single highest-impact change you can make — zero open ports, automatic TLS, and DDoS protection in about 15 minutes of setup.</p>
<p>Thanks for reading!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[From Node to Rust: A Practical Journey]]></title>
            <link>https://zevs.gg/posts/from-node-to-rust</link>
            <guid isPermaLink="true">https://zevs.gg/posts/from-node-to-rust</guid>
            <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[What I learned moving from Node.js and TypeScript to Rust — the mental shifts, the practical differences, and when each one makes sense.]]></description>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<p>I've been writing JavaScript and TypeScript for years. It's the language I think in, the ecosystem I know inside out. Node.js is what I reach for by default — fast to prototype, massive ecosystem, and good enough for most things.</p>
<p>But &quot;good enough&quot; has a ceiling. When I started building services that needed to handle tens of thousands of concurrent connections, process data with minimal latency, and run for months without memory creeping up — I started hitting that ceiling. That's when I picked up Rust.</p>
<p>This isn't a &quot;Rust is better than Node&quot; post. It's a practical walkthrough of what the transition actually looks like — the mental shifts, the code patterns, and the real tradeoffs — from someone who still writes TypeScript daily.</p>
<h2>Why Rust?</h2>
<p>The honest answer: <strong>performance and reliability.</strong></p>
<p>Node.js is single-threaded by design. Yes, we have worker threads and the event loop handles I/O concurrently, but CPU-bound work blocks the main thread. For I/O-heavy web services, Node is excellent. For anything that involves heavy computation, data processing, or needs predictable low-latency responses — you feel the limits.</p>
<p>Rust gave me:</p>
<ul>
<li><strong>No garbage collector</strong> — memory is freed deterministically, no GC pauses, no memory creep over long-running processes.</li>
<li><strong>True parallelism</strong> — real threads with zero-cost abstractions for concurrency.</li>
<li><strong>Compile-time guarantees</strong> — if it compiles, an entire class of bugs (null references, data races, use-after-free) simply cannot exist.</li>
<li><strong>Predictable performance</strong> — no JIT warmup, no deoptimization surprises. The performance you measure is the performance you get.</li>
</ul>
<p>That said, Rust has a real learning curve. The first few weeks were humbling.</p>
<h2>The Ownership Mental Model</h2>
<p>The single biggest shift coming from JavaScript to Rust is <strong>ownership</strong>. In JavaScript, you don't think about who &quot;owns&quot; a value. Everything is garbage collected. You create objects, pass them around, and the runtime figures out when to clean them up.</p>
<p>In Rust, every value has exactly one owner. When the owner goes out of scope, the value is dropped. If you want to give a value to someone else, you <strong>move</strong> it — and the original variable becomes invalid.</p>
<pre><code class="language-rust">fn main() {
    let name = String::from(&quot;hello&quot;);
    let greeting = name; // `name` is moved to `greeting`

    println!(&quot;{}&quot;, name); // ERROR: value used after move
}
</code></pre>
<p>In JavaScript, this would just work — both variables point to the same string, and the GC handles cleanup. In Rust, this is a compile-time error.</p>
<p>The first time you hit this, it feels like the compiler is fighting you. But there's a reason: Rust is preventing you from having two variables that could both try to free the same memory, or from reading data that another part of your code is modifying.</p>
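<p>When you genuinely need two independent values, the fix is an explicit <code>clone()</code>. A minimal sketch:</p>

```rust
fn main() {
    let name = String::from("hello");
    let greeting = name.clone(); // deep copy; `name` stays valid

    // Both variables now own separate heap allocations
    println!("{name}");     // prints "hello"
    println!("{greeting}"); // prints "hello"
    assert_eq!(name, greeting);
}
```

The clone is explicit on purpose: allocations never happen behind your back, which is part of why Rust's performance is so predictable.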
<h3>Borrowing: The Solution</h3>
<p>Instead of moving values everywhere, Rust lets you <strong>borrow</strong> them — either as a shared reference (<code>&amp;T</code>, read-only, multiple allowed) or a mutable reference (<code>&amp;mut T</code>, exclusive, only one at a time).</p>
<pre><code class="language-rust">fn greet(name: &amp;str) {
    println!(&quot;Hello, {name}!&quot;);
}

fn main() {
    let name = String::from(&quot;Zevs&quot;);
    greet(&amp;name);       // borrow `name` — we still own it
    println!(&quot;{name}&quot;); // still valid, we only lent it out
}
</code></pre>
<p>The mental model that clicked for me: <strong>think of values like physical objects.</strong> You can hand someone a book to read (shared borrow), or hand it to them to write in (mutable borrow), but you can't let two people write in it at the same time. And if you give the book away (move), you don't have it anymore.</p>
<p>Coming from TypeScript where everything is a reference and mutation is unrestricted, this feels limiting at first. After a while, it starts feeling like a superpower — the compiler enforces discipline that you'd otherwise need extensive testing and code review to maintain.</p>
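<p>The borrow rules in action, in a sketch that compiles because the borrows never overlap:</p>

```rust
fn main() {
    let mut book = String::from("notes");

    // Any number of shared borrows may coexist...
    let r1 = &book;
    let r2 = &book;
    println!("{r1} / {r2}"); // prints "notes / notes"
    // ...but they must all end before a mutable borrow begins.

    let w = &mut book; // exclusive access
    w.push_str(": revised");
    println!("{book}"); // prints "notes: revised"
}
```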
<h2>Comparing HTTP Servers</h2>
<p>Let's look at something concrete: building an HTTP server. This is where most Node developers start with Rust.</p>
<h3>Express / Hono (TypeScript)</h3>
<pre><code class="language-typescript">import { Hono } from 'hono'

interface User {
  id: number
  name: string
  email: string
}

const users: User[] = [
  { id: 1, name: 'Zevs', email: 'zevs@example.com' },
]

const app = new Hono()

app.get('/users', (c) =&gt; {
  return c.json(users)
})

app.get('/users/:id', (c) =&gt; {
  const id = Number(c.req.param('id'))
  const user = users.find(u =&gt; u.id === id)
  if (!user) {
    return c.json({ error: 'User not found' }, 404)
  }
  return c.json(user)
})

app.post('/users', async (c) =&gt; {
  const body = await c.req.json&lt;Omit&lt;User, 'id'&gt;&gt;()
  const user: User = {
    id: users.length + 1,
    ...body,
  }
  users.push(user)
  return c.json(user, 201)
})

export default app
</code></pre>
<p>Straightforward. Parse params, find data, return JSON. The TypeScript types give us editor support but no runtime guarantees — <code>c.req.json&lt;Omit&lt;User, 'id'&gt;&gt;()</code> doesn't actually validate the body at runtime.</p>
<h3>Axum (Rust)</h3>
<pre><code class="language-rust">use axum::{
    extract::{Path, State, Json},
    http::StatusCode,
    response::IntoResponse,
    routing::{get, post},
    Router,
};
use serde::{Deserialize, Serialize};
use std::sync::{Arc, Mutex};

#[derive(Clone, Serialize, Deserialize)]
struct User {
    id: u32,
    name: String,
    email: String,
}

#[derive(Deserialize)]
struct CreateUser {
    name: String,
    email: String,
}

type AppState = Arc&lt;Mutex&lt;Vec&lt;User&gt;&gt;&gt;;

async fn list_users(
    State(users): State&lt;AppState&gt;,
) -&gt; Json&lt;Vec&lt;User&gt;&gt; {
    let users = users.lock().unwrap();
    Json(users.clone())
}

async fn get_user(
    State(users): State&lt;AppState&gt;,
    Path(id): Path&lt;u32&gt;,
) -&gt; impl IntoResponse {
    let users = users.lock().unwrap();
    match users.iter().find(|u| u.id == id) {
        Some(user) =&gt; Ok(Json(user.clone())),
        None =&gt; Err(StatusCode::NOT_FOUND),
    }
}

async fn create_user(
    State(users): State&lt;AppState&gt;,
    Json(input): Json&lt;CreateUser&gt;,
) -&gt; impl IntoResponse {
    let mut users = users.lock().unwrap();
    let user = User {
        id: users.len() as u32 + 1,
        name: input.name,
        email: input.email,
    };
    users.push(user.clone());
    (StatusCode::CREATED, Json(user))
}

#[tokio::main]
async fn main() {
    let state: AppState = Arc::new(Mutex::new(vec![
        User {
            id: 1,
            name: &quot;Zevs&quot;.into(),
            email: &quot;zevs@example.com&quot;.into(),
        },
    ]));

    let app = Router::new()
        .route(&quot;/users&quot;, get(list_users).post(create_user))
        .route(&quot;/users/{id}&quot;, get(get_user))
        .with_state(state);

    let listener = tokio::net::TcpListener::bind(&quot;0.0.0.0:3000&quot;)
        .await
        .unwrap();
    axum::serve(listener, app).await.unwrap();
}
</code></pre>
<p>More verbose, yes. But look at what you get for free:</p>
<ul>
<li><strong>Thread-safe state</strong> — <code>Arc&lt;Mutex&lt;Vec&lt;User&gt;&gt;&gt;</code> is enforced by the compiler. You literally cannot share mutable state between handlers without proving it's safe.</li>
<li><strong>Deserialization with validation</strong> — <code>Json&lt;CreateUser&gt;</code> automatically deserializes <em>and</em> validates the request body. If a field is missing, Axum returns a 422 before your handler even runs.</li>
<li><strong>Type-safe path extraction</strong> — <code>Path(id): Path&lt;u32&gt;</code> extracts and parses the path parameter. If someone sends <code>/users/abc</code>, it returns a 400 automatically.</li>
<li><strong>Exhaustive pattern matching</strong> — the <code>match</code> in <code>get_user</code> forces you to handle both the found and not-found cases. You can't forget.</li>
</ul>
<p>The Rust version is longer, but it handles edge cases that the TypeScript version silently ignores.</p>
<h2>Error Handling: try/catch vs Result</h2>
<p>In JavaScript, errors are thrown and (hopefully) caught:</p>
<pre><code class="language-typescript">async function fetchUser(id: number): Promise&lt;User&gt; {
  const res = await fetch(`/api/users/${id}`)
  if (!res.ok) {
    throw new Error(`HTTP ${res.status}`)
  }
  const data = await res.json()
  return data as User
}

// Caller must remember to try/catch
try {
  const user = await fetchUser(1)
  console.log(user.name)
}
catch (err) {
  console.error('Failed:', err)
}
</code></pre>
<p>The problem: nothing forces the caller to handle the error. Forget the <code>try/catch</code>, and the error propagates silently until it crashes your process or gets swallowed by a generic handler.</p>
<p>In Rust, errors are <strong>values</strong>, not exceptions:</p>
<pre><code class="language-rust">use reqwest;
use serde::Deserialize;

#[derive(Deserialize)]
struct User {
    name: String,
}

async fn fetch_user(id: u32) -&gt; Result&lt;User, reqwest::Error&gt; {
    let user = reqwest::get(format!(&quot;http://api/users/{id}&quot;))
        .await?
        .error_for_status()?
        .json::&lt;User&gt;()
        .await?;
    Ok(user)
}

// Caller MUST handle the Result — the compiler enforces it
async fn main_logic() {
    match fetch_user(1).await {
        Ok(user) =&gt; println!(&quot;{}&quot;, user.name),
        Err(e) =&gt; eprintln!(&quot;Failed: {e}&quot;),
    }
}
</code></pre>
<p>The <code>?</code> operator is Rust's equivalent of <code>await</code> for errors — it propagates the error to the caller if it fails, or unwraps the success value if it succeeds. But unlike exceptions, the return type <code>Result&lt;User, reqwest::Error&gt;</code> makes it explicit that this function can fail. The compiler won't let you ignore it.</p>
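<p>The same pattern works with any <code>Result</code>-returning call. A dependency-free sketch using <code>str::parse</code>, which returns <code>Result&lt;i32, ParseIntError&gt;</code>:</p>

```rust
use std::num::ParseIntError;

// `?` bubbles the ParseIntError up to the caller on failure
fn double(input: &str) -> Result<i32, ParseIntError> {
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}

fn main() {
    assert_eq!(double("21"), Ok(42));
    assert!(double("not a number").is_err());
}
```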
<p>This was one of the things that immediately made me a better programmer. In TypeScript, I had to rely on discipline and code review to ensure errors were handled. In Rust, the type system does it for me.</p>
<h3>Custom Error Types</h3>
<p>In real applications, you usually define your own error types. With the <code>thiserror</code> crate, this is clean:</p>
<pre><code class="language-rust">use thiserror::Error;

#[derive(Error, Debug)]
enum AppError {
    #[error(&quot;user {0} not found&quot;)]
    NotFound(u32),

    #[error(&quot;database error: {0}&quot;)]
    Database(#[from] sqlx::Error),

    #[error(&quot;request failed: {0}&quot;)]
    Request(#[from] reqwest::Error),

    #[error(&quot;unauthorized&quot;)]
    Unauthorized,
}
</code></pre>
<p>Every error variant is typed, carries context, and converts automatically from library errors via <code>#[from]</code>. Pattern matching on these errors gives you fine-grained control:</p>
<pre><code class="language-rust">match do_something().await {
    Ok(result) =&gt; handle_success(result),
    Err(AppError::NotFound(id)) =&gt; return not_found(id),
    Err(AppError::Unauthorized) =&gt; return redirect_to_login(),
    Err(e) =&gt; {
        tracing::error!(&quot;unexpected error: {e}&quot;);
        return internal_error();
    }
}
</code></pre>
<p>Compare this to JavaScript where <code>catch (err)</code> gives you an <code>unknown</code> type and you're left guessing what went wrong with <code>instanceof</code> checks.</p>
<h2>Async: Promises vs Futures</h2>
<p>Both Node and Rust use <code>async/await</code>, but the underlying mechanics are fundamentally different.</p>
<p>In JavaScript, Promises are <strong>eager</strong> — they start executing immediately when created:</p>
<pre><code class="language-typescript">const promise = fetchUser(1) // Already running!
// ... do other stuff ...
const user = await promise // Wait for it to finish
</code></pre>
<p>In Rust, Futures are <strong>lazy</strong> — they do nothing until polled:</p>
<pre><code class="language-rust">let future = fetch_user(1); // Nothing happens yet
// ... do other stuff ...
let user = future.await;    // NOW it starts executing
</code></pre>
<p>This laziness is actually an advantage. It means you can compose futures without accidentally triggering side effects. You build up a computation graph, and nothing runs until you explicitly drive it.</p>
<h3>Concurrency Patterns</h3>
<p>Running tasks concurrently in JavaScript:</p>
<pre><code class="language-typescript">// Run in parallel
const [users, posts] = await Promise.all([
  fetchUsers(),
  fetchPosts(),
])

// Race — first one wins
const result = await Promise.race([
  fetchFromPrimary(),
  fetchFromFallback(),
])
</code></pre>
<p>In Rust with Tokio:</p>
<pre><code class="language-rust">// Run in parallel
let (users, posts) = tokio::join!(
    fetch_users(),
    fetch_posts(),
);

// Race — first one wins
tokio::select! {
    result = fetch_from_primary() =&gt; handle(result),
    result = fetch_from_fallback() =&gt; handle(result),
}
</code></pre>
<p>Similar syntax, but Rust's <code>tokio::select!</code> is more powerful — it cancels the losing future automatically, while JavaScript's <code>Promise.race</code> leaves the other promise running in the background.</p>
<p>Where Rust really shines is CPU-bound concurrency. In Node, you'd need worker threads:</p>
<pre><code class="language-typescript">// Node.js — worker threads for CPU work (cumbersome)
const { Worker } = require('node:worker_threads')

const worker = new Worker('./heavy-computation.js', {
  workerData: input,
})
worker.on('message', (result) =&gt; { /* ... */ })
</code></pre>
<p>In Rust, you just spawn a task on a thread pool:</p>
<pre><code class="language-rust">// Rust — spawn blocking work on a thread pool
let result = tokio::task::spawn_blocking(move || {
    heavy_computation(input)
}).await?;
</code></pre>
<p>Or use <code>rayon</code> for data parallelism that automatically scales across all CPU cores:</p>
<pre><code class="language-rust">use rayon::prelude::*;

// Process millions of items across all cores
let results: Vec&lt;Output&gt; = inputs
    .par_iter()
    .map(|item| process(item))
    .collect();
</code></pre>
<p>There's no JavaScript equivalent that's this simple and this fast.</p>
<h2>The Type System</h2>
<p>TypeScript's type system is <strong>structural</strong> and exists only at compile time — it's erased at runtime. Rust's type system is <strong>nominal</strong> and exists at every level, from compile time to the actual memory layout.</p>
<h3>Enums: Rust's Secret Weapon</h3>
<p>TypeScript has union types. Rust has <strong>algebraic data types</strong> (enums with data). This is, hands down, the feature I miss most when I go back to TypeScript.</p>
<pre><code class="language-typescript">// TypeScript: discriminated union
type ApiResponse =
  | { status: 'success', data: User }
  | { status: 'error', message: string }
  | { status: 'loading' }

function handle(response: ApiResponse) {
  switch (response.status) {
    case 'success':
      console.log(response.data.name)
      break
    case 'error':
      console.error(response.message)
      break
    case 'loading':
      console.log('Loading...')
      break
  }
}
</code></pre>
<pre><code class="language-rust">// Rust: enum with data
enum ApiResponse {
    Success { data: User },
    Error { message: String },
    Loading,
}

fn handle(response: ApiResponse) {
    match response {
        ApiResponse::Success { data } =&gt; println!(&quot;{}&quot;, data.name),
        ApiResponse::Error { message } =&gt; eprintln!(&quot;{message}&quot;),
        ApiResponse::Loading =&gt; println!(&quot;Loading...&quot;),
    }
}
</code></pre>
<p>They look similar. But here's the difference: if you add a new variant to the Rust enum, the compiler immediately flags every <code>match</code> that doesn't handle it. In TypeScript, <code>switch</code> on a discriminated union won't warn you about missing cases by default (you need the <code>@typescript-eslint/switch-exhaustiveness-check</code> lint rule or a <code>never</code>-typed default case, and even then it's opt-in).</p>
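<p>A common TypeScript workaround is a <code>never</code>-typed default case. The <code>assertNever</code> helper in this sketch is a convention, not a built-in:</p>

```typescript
type Status =
  | { status: 'success', name: string }
  | { status: 'error', message: string }
  | { status: 'loading' }

// `x: never` only type-checks if every variant was handled above
function assertNever(x: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(x)}`)
}

function label(response: Status): string {
  switch (response.status) {
    case 'success':
      return `ok: ${response.name}`
    case 'error':
      return `error: ${response.message}`
    case 'loading':
      return 'loading'
    default:
      // Adding a fourth variant makes `response` non-never here,
      // so the compiler flags this line
      return assertNever(response)
  }
}

console.log(label({ status: 'loading' })) // prints "loading"
```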
<p>Rust enums are also how <code>Option&lt;T&gt;</code> and <code>Result&lt;T, E&gt;</code> work — they're not special language features, they're just enums:</p>
<pre><code class="language-rust">enum Option&lt;T&gt; {
    Some(T),
    None,
}

enum Result&lt;T, E&gt; {
    Ok(T),
    Err(E),
}
</code></pre>
<p>This means <code>null</code> and errors are handled through the same pattern matching system as everything else. No special syntax, no special rules. It's elegant.</p>
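<p>A quick sketch of what that uniformity buys you: one <code>match</code> handles both the missing value and the failed parse. The <code>parse_port</code> function and its default port are invented for illustration:</p>

```rust
use std::num::ParseIntError;

// Option and Result are ordinary enums, so "no value" and "failed"
// flow through the same pattern matching as any other data.
fn parse_port(input: Option<&str>) -> Result<u16, ParseIntError> {
    match input {
        Some(s) => s.parse::<u16>(), // a parse error propagates to the caller
        None => Ok(8080),            // default when no value was given
    }
}

fn main() {
    assert_eq!(parse_port(Some("3000")).unwrap(), 3000);
    assert_eq!(parse_port(None).unwrap(), 8080);
    assert!(parse_port(Some("not-a-port")).is_err());
}
```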
<h3>Traits vs Interfaces</h3>
<p>TypeScript interfaces define a shape. Rust traits define behavior.</p>
<pre><code class="language-typescript">// TypeScript: interface
interface Serializable {
  serialize: () =&gt; string
}

class User implements Serializable {
  serialize(): string {
    return JSON.stringify(this)
  }
}
</code></pre>
<pre><code class="language-rust">// Rust: trait
trait Serializable {
    fn serialize(&amp;self) -&gt; String;
}

impl Serializable for User {
    fn serialize(&amp;self) -&gt; String {
        serde_json::to_string(self).unwrap()
    }
}
</code></pre>
<p>The key difference: in Rust, you can implement traits for types you don't own. I can implement <code>Serializable</code> for <code>String</code> or <code>Vec&lt;u8&gt;</code> or any third-party type — the orphan rule only requires that either the trait or the type be defined in your crate. In TypeScript, you can't retroactively make a class implement an interface without modifying it.</p>
<p>This extensibility is what makes Rust's ecosystem so composable. The <code>serde</code> crate (Rust's de facto serialization framework) works by implementing <code>Serialize</code> and <code>Deserialize</code> traits — and you can add support for any data format or any type without modifying either.</p>
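<p>As a minimal sketch of that extensibility (the trait name is made up, not part of serde), here's a local trait implemented for two standard-library types neither of which I own:</p>

```rust
trait Describe {
    fn describe(&self) -> String;
}

// Local trait, foreign type: permitted by the orphan rule.
impl Describe for Vec<u8> {
    fn describe(&self) -> String {
        format!("{} bytes", self.len())
    }
}

impl Describe for String {
    fn describe(&self) -> String {
        format!("string of length {}", self.len())
    }
}

fn main() {
    assert_eq!(vec![1u8, 2, 3].describe(), "3 bytes");
    assert_eq!(String::from("hi").describe(), "string of length 2");
}
```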
<h2>The Ecosystem: npm vs Cargo</h2>
<table>
<thead>
<tr>
<th></th>
<th>npm</th>
<th>Cargo</th>
</tr>
</thead>
<tbody>
<tr>
<td>Package registry</td>
<td><a href="https://www.npmjs.com">npmjs.com</a></td>
<td><a href="https://crates.io">crates.io</a></td>
</tr>
<tr>
<td>Lock file</td>
<td><code>package-lock.json</code> / <code>pnpm-lock.yaml</code></td>
<td><code>Cargo.lock</code></td>
</tr>
<tr>
<td>Monorepo support</td>
<td>workspaces</td>
<td>workspaces</td>
</tr>
<tr>
<td>Build scripts</td>
<td><code>package.json</code> scripts</td>
<td><code>build.rs</code></td>
</tr>
<tr>
<td>Testing</td>
<td>External (vitest, jest)</td>
<td>Built-in (<code>cargo test</code>)</td>
</tr>
<tr>
<td>Formatting</td>
<td>External (prettier)</td>
<td>Built-in (<code>cargo fmt</code>)</td>
</tr>
<tr>
<td>Linting</td>
<td>External (eslint)</td>
<td>Built-in (<code>cargo clippy</code>)</td>
</tr>
<tr>
<td>Docs</td>
<td>External (typedoc)</td>
<td>Built-in (<code>cargo doc</code>)</td>
</tr>
</tbody>
</table>
<p>Cargo is batteries-included in a way that npm isn't. Testing, formatting, linting, documentation generation — all built into the toolchain. No arguing about which test runner to use, no configuring formatters, no installing extra dev dependencies for basic workflows.</p>
<p>The crates I use most frequently:</p>
<ul>
<li><strong>tokio</strong> — async runtime, the foundation everything else builds on</li>
<li><strong>axum</strong> — HTTP framework by the Tokio team</li>
<li><strong>serde</strong> + <strong>serde_json</strong> — serialization/deserialization</li>
<li><strong>sqlx</strong> — async database driver with compile-time SQL checking</li>
<li><strong>reqwest</strong> — HTTP client</li>
<li><strong>tracing</strong> — structured logging and diagnostics</li>
<li><strong>thiserror</strong> / <strong>anyhow</strong> — error handling</li>
<li><strong>clap</strong> — CLI argument parsing</li>
</ul>
<p>The ecosystem is smaller than npm, but the quality bar is noticeably higher. Fewer packages to choose from, but less decision fatigue and fewer abandoned dependencies.</p>
<h2>Performance: Real Numbers</h2>
<p>I benchmarked the same minimal API (JSON serialization, one database query, one response) in both stacks on identical hardware:</p>
<table>
<thead>
<tr>
<th></th>
<th>Hono (Bun)</th>
<th>Axum (Rust)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Requests/sec</td>
<td>~48,000</td>
<td>~210,000</td>
</tr>
<tr>
<td>p99 latency</td>
<td>4.2ms</td>
<td>0.3ms</td>
</tr>
<tr>
<td>Memory usage</td>
<td>~85MB</td>
<td>~8MB</td>
</tr>
<tr>
<td>Cold start</td>
<td>~120ms</td>
<td>~3ms</td>
</tr>
</tbody>
</table>
<p>Rust isn't just faster — it's a different class of performance. The 10x lower memory usage means I can run more services on the same hardware. The sub-millisecond p99 means tail latency stops being a problem, and the near-instant cold start makes serverless deployments feel immediate.</p>
<p>But these numbers only matter for specific workloads. For a blog, a CRUD app, or a prototype — the difference is irrelevant. Node is fast enough, and you ship weeks sooner.</p>
<h2>When I Use Which</h2>
<p>After working with both for a while, I've developed a simple decision framework:</p>
<p><strong>I use Node.js / TypeScript when:</strong></p>
<ul>
<li>Rapid prototyping and MVPs — nothing beats the iteration speed.</li>
<li>Frontend + full-stack web apps — Nuxt, Astro, and the ecosystem are unmatched.</li>
<li>Scripting and tooling — quick scripts, CLIs, build tools.</li>
<li>The team is JavaScript-heavy — hiring and onboarding matter.</li>
</ul>
<p><strong>I use Rust when:</strong></p>
<ul>
<li>High-throughput backend services — APIs handling thousands of RPS.</li>
<li>Long-running processes — daemons, workers, queue consumers where memory stability matters.</li>
<li>Data processing pipelines — parsing, transforming, aggregating large datasets.</li>
<li>Systems-level work — anything touching the network stack, file systems, or needing fine-grained control.</li>
<li>Performance-critical paths — the hot loop that everything else depends on.</li>
</ul>
<p>In practice, my projects are often <strong>both</strong>. A TypeScript frontend with Nuxt, calling a Rust API service via Axum. Or a Node.js orchestration layer that delegates heavy lifting to Rust microservices. They complement each other well.</p>
<h2>What I Wish I Knew Earlier</h2>
<p>A few things that would have saved me weeks of frustration:</p>
<p><strong>Don't fight the borrow checker.</strong> When the compiler rejects your code, it's usually pointing at a design issue, not a syntax issue. Instead of adding <code>.clone()</code> everywhere to make it compile, step back and think about who should own the data. The borrow checker is teaching you better architecture.</p>
<p><strong>Start with <code>String</code>, not <code>&amp;str</code>.</strong> Coming from JavaScript, you'll want to use owned types (<code>String</code>, <code>Vec&lt;T&gt;</code>, <code>HashMap</code>) everywhere at first. That's fine. You can optimize to borrowed types (<code>&amp;str</code>, <code>&amp;[T]</code>) later when you understand lifetimes better. Premature optimization of lifetimes is the root of all borrow checker frustration.</p>
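<p>A small illustration of the owned-first approach (both functions are hypothetical): write the <code>String</code> version first, then switch the signature to <code>&amp;str</code> once it compiles and you want to drop the clones. The call sites barely change:</p>

```rust
// Owned-first version: takes String, no lifetimes to think about.
fn shout_owned(name: String) -> String {
    format!("{}!", name.to_uppercase())
}

// Borrowed refactor: same behavior, but the caller keeps ownership.
fn shout_borrowed(name: &str) -> String {
    format!("{}!", name.to_uppercase())
}

fn main() {
    let name = String::from("zevs");
    assert_eq!(shout_owned(name.clone()), "ZEVS!");
    assert_eq!(shout_borrowed(&name), "ZEVS!");
    assert_eq!(name, "zevs"); // `name` is still usable after the borrow
}
```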
<p><strong>Use <code>anyhow</code> for applications, <code>thiserror</code> for libraries.</strong> <code>anyhow::Result</code> gives you ergonomic error handling without defining error types upfront — perfect for application code. <code>thiserror</code> gives you structured, typed errors — perfect for library code where callers need to match on error variants.</p>
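<p>To make the distinction concrete, here's roughly what a <code>thiserror</code>-style error type amounts to, written by hand so the sketch has no dependencies (the error type, variants, and messages are invented for illustration):</p>

```rust
use std::fmt;

// Hand-written equivalent of what #[derive(thiserror::Error)] generates:
// a plain enum plus Display and Error impls callers can match on.
#[derive(Debug)]
enum FetchError {
    NotFound(String),
    Timeout(u64),
}

impl fmt::Display for FetchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FetchError::NotFound(id) => write!(f, "resource {id} not found"),
            FetchError::Timeout(ms) => write!(f, "timed out after {ms}ms"),
        }
    }
}

impl std::error::Error for FetchError {}

fn fetch(id: &str) -> Result<String, FetchError> {
    if id == "42" {
        Ok("data".into())
    } else {
        Err(FetchError::NotFound(id.into()))
    }
}

fn main() {
    assert!(fetch("42").is_ok());
    assert_eq!(fetch("7").unwrap_err().to_string(), "resource 7 not found");
}
```

<p>In application code, <code>anyhow::Result</code> would absorb this error (and any other) without the boilerplate; in a library, the typed variants are exactly what your callers want to match on.</p>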
<p><strong>Read the <code>clippy</code> lints.</strong> <code>cargo clippy</code> isn't just a linter — it's a Rust teacher. Every suggestion comes with an explanation of why the alternative is better. Running <code>clippy</code> on my early Rust code was humbling but incredibly educational.</p>
<p><strong>The Rust community is genuinely helpful.</strong> The <a href="https://users.rust-lang.org/">Rust Users forum</a>, the subreddit, and Discord are some of the most welcoming programming communities I've encountered. Don't hesitate to ask questions — everyone remembers fighting the borrow checker for the first time.</p>
<h2>Conclusion</h2>
<p>Learning Rust made me a better TypeScript developer. The concepts of ownership, explicit error handling, and thinking about data lifetimes — these are transferable ideas that improved how I write code in any language. I now think more carefully about who owns data, where mutations happen, and what errors a function can produce, even when writing JavaScript.</p>
<p>Rust isn't a replacement for Node.js in my workflow. It's an addition. A powerful tool for the problems where Node struggles — and there are more of those problems than I initially thought.</p>
<p>If you're a Node.js developer curious about Rust, my advice: start with a small project. Rewrite a CLI tool you already have, or build a simple API with Axum. The first week will be painful. By the third week, you'll start to feel the compiler working <em>with</em> you instead of against you. And by the first month, you'll wonder how you ever wrote concurrent code without the borrow checker watching your back.</p>
<p>You can find my <a href="/uses">full tech stack here</a>.</p>
<p>Thanks for reading!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[Reverse Engineering Android Apps with Frida]]></title>
            <link>https://zevs.gg/posts/reverse-engineering-android-frida</link>
            <guid isPermaLink="true">https://zevs.gg/posts/reverse-engineering-android-frida</guid>
            <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[A practical guide to Android app reverse engineering using Frida, mitmproxy, and Jadx — how to intercept traffic, hook functions at runtime, and understand what apps are really doing under the hood.]]></description>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<p>Every app on your phone is making network requests, processing data, and executing logic that you can't see. As developers, understanding what's happening behind the scenes isn't just curiosity — it's a critical skill for security research, debugging, and building better software.</p>
<p>This post covers my approach to Android reverse engineering using tools I use daily: <a href="https://frida.re">Frida</a> for runtime instrumentation, <a href="https://mitmproxy.org">mitmproxy</a> for traffic interception, and <a href="https://github.com/skylot/jadx">Jadx</a> for static analysis. Everything here is for educational purposes — understanding how apps work so you can build more secure ones.</p>
<h2>The Toolkit</h2>
<p>Before diving in, here's the stack:</p>
<table>
<thead>
<tr>
<th>Tool</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://frida.re">Frida</a></td>
<td>Dynamic instrumentation — hook functions at runtime</td>
</tr>
<tr>
<td><a href="https://mitmproxy.org">mitmproxy</a></td>
<td>Intercept and inspect HTTPS traffic</td>
</tr>
<tr>
<td><a href="https://github.com/skylot/jadx">Jadx</a></td>
<td>Decompile APKs to readable Java/Kotlin source</td>
</tr>
<tr>
<td><a href="https://github.com/sensepost/objection">Objection</a></td>
<td>Runtime mobile exploration, built on Frida</td>
</tr>
<tr>
<td><a href="https://developer.android.com/tools/adb">ADB</a></td>
<td>Android Debug Bridge — communicate with devices</td>
</tr>
<tr>
<td><a href="https://www.genymotion.com">Genymotion</a></td>
<td>Android emulator optimized for testing</td>
</tr>
</tbody>
</table>
<p>These tools complement each other. Jadx gives you the static picture — what the code looks like. Frida gives you the dynamic picture — what the code actually does at runtime. mitmproxy shows you what's going over the wire.</p>
<h2>Static Analysis with Jadx</h2>
<p>Every RE session starts with static analysis. Before you run anything, understand the codebase.</p>
<h3>Decompiling the APK</h3>
<p>Pull the APK from a device or download it, then open it in Jadx:</p>
<pre><code class="language-bash"># Pull APK from connected device
adb shell pm path com.example.app
adb pull /data/app/.../base.apk ./target.apk

# Open in Jadx GUI
jadx-gui target.apk
</code></pre>
<p>Jadx decompiles DEX bytecode back to Java source. It's not perfect — obfuscated apps produce mangled class names like <code>a.b.c.d</code> — but it's usually readable enough to understand the architecture.</p>
<h3>What to Look For</h3>
<p>When I open an APK in Jadx, I focus on a few things:</p>
<p><strong>1. Network layer</strong> — Find the HTTP client configuration. Most apps use OkHttp or Retrofit. Search for <code>OkHttpClient</code>, <code>Interceptor</code>, or <code>Retrofit.Builder</code>:</p>
<pre><code class="language-java">// Common pattern: custom OkHttp interceptor adding auth headers
public class AuthInterceptor implements Interceptor {
    @Override
    public Response intercept(Chain chain) {
        Request original = chain.request();
        Request request = original.newBuilder()
            .header(&quot;Authorization&quot;, &quot;Bearer &quot; + getToken())
            .header(&quot;X-Device-Id&quot;, getDeviceId())
            .build();
        return chain.proceed(request);
    }
}
</code></pre>
<p>This tells you what headers the app sends, how auth works, and what custom interceptors modify requests.</p>
<p><strong>2. Certificate pinning</strong> — Search for <code>CertificatePinner</code>, <code>TrustManager</code>, or <code>SSL</code>. If the app pins certificates, you'll need to bypass this before mitmproxy can intercept traffic.</p>
<p><strong>3. Encryption/signing</strong> — Look for request signing logic. Many apps sign API requests with HMAC or similar schemes. Search for <code>Mac.getInstance</code>, <code>MessageDigest</code>, <code>Cipher</code>, or <code>Signature</code>:</p>
<pre><code class="language-java">// Request signing pattern
public String signRequest(String path, String body, long timestamp) {
    String payload = path + body + timestamp;
    Mac mac = Mac.getInstance(&quot;HmacSHA256&quot;);
    mac.init(new SecretKeySpec(SECRET_KEY, &quot;HmacSHA256&quot;));
    byte[] hash = mac.doFinal(payload.getBytes());
    return Base64.encodeToString(hash, Base64.NO_WRAP);
}
</code></pre>
<p>Understanding the signing scheme is critical — without it, modified requests will be rejected by the server.</p>
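<p>Once the scheme is recovered, you can reproduce signatures outside the app and verify them against captured traffic. A sketch using only Python's standard library, with a made-up key and request (stand-ins, not values from any real app):</p>

```python
import base64
import hashlib
import hmac

# Hypothetical key recovered from the app -- a placeholder, not a real secret.
SECRET_KEY = b"example-secret"

def sign_request(path: str, body: str, timestamp: int) -> str:
    """Mirror of the Java scheme above: HMAC-SHA256 over path + body + timestamp."""
    payload = f"{path}{body}{timestamp}".encode()
    digest = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()

sig = sign_request("/v2/feed", '{"page":1}', 1700000000)
print(sig)  # 44-character base64 string (32-byte HMAC digest)
```

<p>If your reimplementation produces the same signature as a request captured in mitmproxy, you've confirmed the scheme and can craft valid requests yourself.</p>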
<p><strong>4. Obfuscation</strong> — Check <code>proguard-rules.pro</code> or look for patterns indicating obfuscation tools (ProGuard, R8, DexGuard). Heavy obfuscation means you'll rely more on dynamic analysis with Frida.</p>
<h2>Traffic Interception with mitmproxy</h2>
<p>Once you understand the app's network layer from static analysis, it's time to see real traffic.</p>
<h3>Setup</h3>
<p>mitmproxy acts as a proxy between the device and the internet. All HTTPS traffic flows through it, and you can inspect, modify, or replay requests.</p>
<pre><code class="language-bash"># Start mitmproxy on your machine
mitmproxy --listen-port 8080

# Or use the web interface
mitmweb --listen-port 8080
</code></pre>
<p>Configure the Android device to use your machine as a proxy:</p>
<pre><code class="language-bash"># Set proxy on connected device via ADB
adb shell settings put global http_proxy $(hostname -I | awk '{print $1}'):8080

# Install mitmproxy CA certificate on device
# Download from http://mitm.it on the device browser
</code></pre>
<p>For apps targeting Android 7+, user-installed CA certificates aren't trusted by default. You need to install the cert as a system CA, which requires root:</p>
<pre><code class="language-bash"># Convert mitmproxy cert to Android system format
openssl x509 -inform PEM -subject_hash_old \
  -in ~/.mitmproxy/mitmproxy-ca-cert.pem | head -1
# Output: c8750f0d (example hash)

cp ~/.mitmproxy/mitmproxy-ca-cert.pem c8750f0d.0

# Push to device system cert store (requires root)
adb root
adb remount
adb push c8750f0d.0 /system/etc/security/cacerts/
adb shell chmod 644 /system/etc/security/cacerts/c8750f0d.0
adb reboot
</code></pre>
<h3>What You See</h3>
<p>With mitmproxy running, every API call the app makes is visible:</p>
<pre><code>POST https://api.example.com/v2/feed
    Authorization: Bearer eyJhbGciOiJIUzI1NiJ9...
    X-Device-Id: a1b2c3d4-e5f6-7890
    X-Request-Sign: Kx8mNpQ2r1...
    Content-Type: application/json

    {&quot;page&quot;: 1, &quot;limit&quot;: 20, &quot;filter&quot;: &quot;trending&quot;}
</code></pre>
<p>You can see the exact headers, request body, response, timing, and status codes. This is invaluable for understanding API contracts that aren't documented.</p>
<h3>Scripting with mitmproxy</h3>
<p>mitmproxy supports Python scripts for automating analysis. For example, logging all API endpoints and their parameters:</p>
<pre><code class="language-python"># log_endpoints.py
from mitmproxy import http

def response(flow: http.HTTPFlow):
    if &quot;api.example.com&quot; in flow.request.pretty_host:
        print(f&quot;{flow.request.method} {flow.request.path}&quot;)
        print(f&quot;  Status: {flow.response.status_code}&quot;)
        print(f&quot;  Request size: {len(flow.request.content)} bytes&quot;)
        print(f&quot;  Response size: {len(flow.response.content)} bytes&quot;)
        print()
</code></pre>
<pre><code class="language-bash">mitmproxy -s log_endpoints.py
</code></pre>
<p>You can also modify requests on the fly — change parameters, swap tokens, or inject headers to test how the server responds.</p>
<h2>Dynamic Instrumentation with Frida</h2>
<p>This is where the real magic happens. Frida lets you inject JavaScript into a running process, hook any function, read and modify arguments, and change return values — all without modifying the APK.</p>
<h3>Setup</h3>
<p>Install Frida on your machine and push the Frida server to the device:</p>
<pre><code class="language-bash"># Install Frida CLI tools
pip install frida-tools

# Download frida-server for your device architecture
# Push to device
adb push frida-server-16.x.x-android-arm64 /data/local/tmp/frida-server
adb shell chmod 755 /data/local/tmp/frida-server

# Start frida-server (requires root)
adb shell su -c '/data/local/tmp/frida-server &amp;'
</code></pre>
<p>Verify it's running:</p>
<pre><code class="language-bash">frida-ps -U  # List processes on USB device
</code></pre>
<h3>Bypassing SSL Pinning</h3>
<p>The first thing you'll want to do is bypass certificate pinning so mitmproxy can intercept traffic. This is the most common Frida use case:</p>
<pre><code class="language-javascript">// ssl_bypass.js
Java.perform(() =&gt; {
  // Bypass OkHttp CertificatePinner
  const CertificatePinner = Java.use('okhttp3.CertificatePinner')
  CertificatePinner.check.overload(
    'java.lang.String',
    'java.util.List'
  ).implementation = function (hostname, peerCertificates) {
    console.log(`[*] Bypassing pin for: ${hostname}`)
    // Do nothing — skip the pin check
  }

  // Bypass custom TrustManager
  const TrustManagerImpl = Java.use('com.android.org.conscrypt.TrustManagerImpl')
  TrustManagerImpl.verifyChain.implementation = function () {
    console.log('[*] Bypassing TrustManager verification')
    return arguments[0] // Return the unverified chain
  }
})
</code></pre>
<pre><code class="language-bash">frida -U -l ssl_bypass.js -f com.example.app
</code></pre>
<p>The <code>-f</code> flag spawns the app fresh with Frida attached from the start, which is important for intercepting early network calls.</p>
<h3>Hooking Functions</h3>
<p>The real power of Frida is hooking arbitrary functions. Say you found an interesting method in Jadx:</p>
<pre><code class="language-java">// Found in Jadx: com.example.app.crypto.RequestSigner
public class RequestSigner {
    public String sign(String path, String body, long timestamp) {
        // ... signing logic
    }
}
</code></pre>
<p>You can hook it to see exactly what's being signed:</p>
<pre><code class="language-javascript">// hook_signer.js
Java.perform(() =&gt; {
  const Signer = Java.use('com.example.app.crypto.RequestSigner')

  Signer.sign.implementation = function (path, body, timestamp) {
    console.log('[*] sign() called')
    console.log(`    path: ${path}`)
    console.log(`    body: ${body}`)
    console.log(`    timestamp: ${timestamp}`)

    // Call the original method
    const result = this.sign(path, body, timestamp)
    console.log(`    signature: ${result}`)

    return result
  }
})
</code></pre>
<p>Every time the app signs a request, you see the inputs and output in your terminal. This is how you reverse-engineer proprietary signing schemes.</p>
<h3>Modifying Behavior</h3>
<p>Frida can also change how functions behave:</p>
<pre><code class="language-javascript">// Force a function to always return true
Java.perform(() =&gt; {
  const Auth = Java.use('com.example.app.auth.AuthManager')

  Auth.isLoggedIn.implementation = function () {
    console.log('[*] isLoggedIn() → forcing true')
    return true
  }

  // Change a parameter before it reaches the original function
  Auth.validateToken.implementation = function (token) {
    console.log(`[*] Original token: ${token}`)
    // Call original with modified token
    return this.validateToken('modified_token_here')
  }
})
</code></pre>
<h3>Tracing and Discovery</h3>
<p>When you don't know which function to hook, Frida can trace entire classes:</p>
<pre><code class="language-javascript">// trace_class.js — log every method call on a class
Java.perform(() =&gt; {
  const target = Java.use('com.example.app.api.ApiClient')
  const methods = target.class.getDeclaredMethods()

  methods.forEach((method) =&gt; {
    const name = method.getName()
    const overloads = target[name].overloads

    overloads.forEach((overload) =&gt; {
      overload.implementation = function () {
        const args = Array.from(arguments).map((a) =&gt; {
          return a ? a.toString() : 'null'
        })
        console.log(`[*] ${name}(${args.join(', ')})`)
        return this[name].apply(this, arguments) // eslint-disable-line prefer-spread
      }
    })
  })
})
</code></pre>
<p>This is like setting a breakpoint on every method in a class — you see the call sequence, arguments, and can narrow down to the specific function you care about.</p>
<h2>Objection: Frida Made Easy</h2>
<p><a href="https://github.com/sensepost/objection">Objection</a> is a toolkit built on top of Frida that provides common RE tasks as simple commands:</p>
<pre><code class="language-bash"># Start objection on a running app
objection -g com.example.app explore
</code></pre>
<p>Once inside the Objection REPL:</p>
<pre><code class="language-bash"># Disable SSL pinning (one command)
android sslpinning disable

# List all activities
android hooking list activities

# List methods in a class
android hooking list class_methods com.example.app.api.ApiClient

# Hook a method and watch calls
android hooking watch class_method com.example.app.api.ApiClient.sendRequest

# Dump the keystore
android keystore list

# Search for classes containing &quot;Crypto&quot;
android hooking search classes Crypto

# Get the current activity
android hooking get_current_activity
</code></pre>
<p>Objection is perfect for exploration. When you know what you're looking for, write custom Frida scripts. When you're exploring, use Objection.</p>
<h2>uiautomator2: UI Automation</h2>
<p>Sometimes you need to automate the app's UI — navigate screens, tap buttons, scroll feeds — while your Frida hooks and mitmproxy capture data.</p>
<p><a href="https://github.com/openatx/uiautomator2">uiautomator2</a> is a Python library for Android UI automation:</p>
<pre><code class="language-python">import uiautomator2 as u2

# Connect to device
d = u2.connect()

# Launch app
d.app_start('com.example.app')

# Wait for element and tap
d(text=&quot;Sign In&quot;).click()

# Type text
d(resourceId=&quot;com.example.app:id/email&quot;).set_text(&quot;test@example.com&quot;)

# Scroll
d.swipe_ext(&quot;up&quot;, scale=0.8)

# Screenshot
d.screenshot(&quot;screen.png&quot;)

# Get element info
info = d(resourceId=&quot;com.example.app:id/balance&quot;).info
print(f&quot;Balance text: {info['text']}&quot;)
</code></pre>
<p>Combined with Frida hooks, this creates a powerful pipeline: automate user flows in the UI while capturing every API call and function invocation happening underneath.</p>
<h2>A Complete RE Workflow</h2>
<p>Here's how I approach reverse engineering an Android app from scratch:</p>
<h3>Phase 1: Reconnaissance</h3>
<pre><code class="language-bash"># Pull the APK
adb pull $(adb shell pm path com.example.app | cut -d: -f2) target.apk

# Decompile with Jadx
jadx-gui target.apk
</code></pre>
<p>In Jadx, I map out:</p>
<ul>
<li>The network layer (HTTP client, interceptors, base URLs)</li>
<li>Authentication mechanism (token storage, refresh logic)</li>
<li>Request signing (if any)</li>
<li>Certificate pinning implementation</li>
<li>Interesting business logic classes</li>
</ul>
<h3>Phase 2: Traffic Analysis</h3>
<pre><code class="language-bash"># Start mitmproxy
mitmweb --listen-port 8080

# Configure device proxy
adb shell settings put global http_proxy &lt;host_ip&gt;:8080

# If the app uses SSL pinning, bypass with Frida first
frida -U -l ssl_bypass.js -f com.example.app
</code></pre>
<p>Browse the app normally and observe the traffic in mitmproxy. Document:</p>
<ul>
<li>All API endpoints and their request/response format</li>
<li>Authentication headers and how tokens are structured</li>
<li>Any request signing or encryption patterns</li>
<li>Rate limiting behavior</li>
</ul>
<h3>Phase 3: Deep Dive with Frida</h3>
<p>Based on what I found in Phase 1 and 2, I write targeted Frida hooks:</p>
<pre><code class="language-javascript">Java.perform(() =&gt; {
  // Hook the request signing function found in Jadx
  const Signer = Java.use('com.example.app.crypto.RequestSigner')
  Signer.sign.implementation = function (path, body, ts) {
    console.log(`[sign] ${path}`)
    console.log(`  body: ${body.substring(0, 100)}`)
    console.log(`  ts: ${ts}`)
    const result = this.sign(path, body, ts)
    console.log(`  sig: ${result}`)
    return result
  }

  // Hook SharedPreferences to see what's being stored
  const SP = Java.use('android.app.SharedPreferencesImpl$EditorImpl')
  SP.putString.implementation = function (key, value) {
    console.log(`[prefs] ${key} = ${value}`)
    return this.putString(key, value)
  }
})
</code></pre>
<h3>Phase 4: Documentation</h3>
<p>This is the most important step that most people skip. Document everything:</p>
<ul>
<li>API endpoint map with request/response schemas</li>
<li>Authentication flow diagram</li>
<li>Request signing algorithm</li>
<li>Interesting findings and potential vulnerabilities</li>
</ul>
<p>Without documentation, you'll forget everything within a week and have to redo the analysis.</p>
<h2>Security Lessons</h2>
<p>Reverse engineering apps taught me more about building secure software than any security course. Here are patterns I see repeatedly:</p>
<p><strong>Client-side validation is not validation.</strong> Every check that happens on the client can be bypassed with Frida. Price checks, quantity limits, feature flags — if the server doesn't enforce it, it doesn't exist.</p>
<p><strong>Hardcoded secrets are always found.</strong> API keys, signing secrets, encryption keys embedded in the APK — Frida can dump them at runtime even if they're obfuscated in the binary. Use server-side key management.</p>
<p><strong>Certificate pinning is a speed bump, not a wall.</strong> It takes roughly 30 seconds to bypass standard certificate pinning with Frida or Objection. It's still worth implementing — it raises the bar — but don't rely on it as your sole security measure.</p>
<p><strong>Obfuscation slows analysis, it doesn't prevent it.</strong> ProGuard renames classes, but the logic is the same. DexGuard adds more layers, but Frida hooks at the JVM level, bypassing bytecode obfuscation entirely. Focus your security budget on server-side hardening.</p>
<p><strong>The real security boundary is the server.</strong> Every request from the client should be treated as potentially malicious. Validate everything server-side: authentication, authorization, input validation, rate limiting, business logic constraints. The client is untrusted territory.</p>
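<p>A sketch of what that looks like server-side, with an invented catalog and limits: the client may claim any total it likes, but the server recomputes it from its own data before creating the order:</p>

```python
# Server-side source of truth -- hypothetical SKUs and prices in cents.
CATALOG = {"sku-1": 499, "sku-2": 1299}

def create_order(sku: str, quantity: int, client_total: int) -> dict:
    """Validate everything; the client-sent total is never trusted."""
    if sku not in CATALOG:
        raise ValueError("unknown sku")
    if not 1 <= quantity <= 10:
        raise ValueError("quantity out of range")
    # A Frida user can send any client_total; it never decides the price.
    server_total = CATALOG[sku] * quantity
    return {"sku": sku, "quantity": quantity, "total": server_total}
```

<p>With this shape, hooking the app to tamper with prices or limits accomplishes nothing — the server's answer is the only one that counts.</p>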
<h2>Ethics and Legality</h2>
<p>A note on responsible use. The techniques in this post are powerful — and with power comes responsibility.</p>
<p><strong>Authorized use only.</strong> Only reverse engineer apps you own, have permission to test, or are participating in a legitimate bug bounty program for. Unauthorized access to computer systems is illegal in most jurisdictions.</p>
<p><strong>Responsible disclosure.</strong> If you find vulnerabilities, report them to the developer through their security contact or bug bounty program. Don't exploit them, don't publish exploit code, and give the developer reasonable time to fix issues before any public disclosure.</p>
<p><strong>Educational context.</strong> Understanding how apps work makes you a better developer and a better security engineer. The goal is to build more secure software — not to break existing ones.</p>
<h2>Wrap Up</h2>
<p>Reverse engineering is a skill that compounds. The more apps you analyze, the faster you recognize patterns — the same OkHttp interceptor setup, the same JWT structure, the same HMAC signing scheme. What takes hours at first becomes minutes with experience.</p>
<p>The tools in this post — Frida, mitmproxy, Jadx, Objection — cover 95% of what you'll need for Android RE. The remaining 5% is native code analysis (IDA Pro, Ghidra) for apps with C/C++ libraries, which is a topic for another post.</p>
<p>If you're a developer who has never looked at their own app from the attacker's perspective, I'd encourage you to try. Install mitmproxy, set up Frida, and see what your app looks like from the outside. You might be surprised.</p>
<p>You can find my <a href="/uses">full tech stack here</a>.</p>
<p>Thanks for reading!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[Building a Self-Hosted Infrastructure]]></title>
            <link>https://zevs.gg/posts/self-hosted-infrastructure</link>
            <guid isPermaLink="true">https://zevs.gg/posts/self-hosted-infrastructure</guid>
            <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[How I built my self-hosted infrastructure with Proxmox, Docker, Traefik, and friends — and why I stopped relying on cloud providers for everything.]]></description>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<p>There's a moment every developer hits — you're paying $50/month for a handful of small VPS instances, your services are scattered across three different cloud providers, and you realize you have no idea where half your stuff is running. That was me two years ago. Today, everything runs on hardware I own, in a setup I fully control.</p>
<p>This post walks through how I built my self-hosted infrastructure from the ground up — the decisions, the stack, and the lessons I picked up along the way.</p>
<h2>Why Self-Host?</h2>
<p>The obvious answer is cost. Running a few services on cloud VPS instances is cheap, but once you start stacking databases, queues, monitoring, storage, and multiple apps — costs add up fast. A single bare-metal server can replace several cloud instances for a fraction of the recurring cost.</p>
<p>But cost isn't the real reason. <strong>Control</strong> is. When you self-host, you own the data, you own the network, and you decide the rules. No vendor lock-in, no surprise pricing changes, no arbitrary rate limits. If something breaks, it's on you — but at least you can actually fix it.</p>
<p>There's also the learning aspect. Managing your own infrastructure teaches you things that no tutorial or managed service ever will. DNS propagation, firewall rules, disk I/O bottlenecks, certificate renewal failures at 3 AM — these are the experiences that make you a better engineer.</p>
<h2>The Hardware Layer</h2>
<p>Everything starts with <a href="https://www.proxmox.com">Proxmox</a>, an open-source virtualization platform built on top of Debian. It gives you a clean web UI for managing virtual machines and containers, with support for clustering, live migration, and backups out of the box.</p>
<p>I run Proxmox on a dedicated server with enough RAM and cores to comfortably host 20+ services. The storage backend is <a href="https://openzfs.org">ZFS</a> — a filesystem that handles compression, snapshots, and data integrity verification natively. ZFS snapshots are incredibly useful for rollbacks. Before any risky upgrade, I snapshot the dataset, and if things go wrong, I can roll back in seconds.</p>
<pre><code class="language-bash"># Create a snapshot before upgrading
zfs snapshot rpool/data/myservice@pre-upgrade

# Something went wrong? Roll back instantly
zfs rollback rpool/data/myservice@pre-upgrade
</code></pre>
<p>For lightweight services, I use <a href="https://linuxcontainers.org">LXC containers</a> instead of full VMs. LXC gives you near-native performance with process-level isolation — perfect for running databases, reverse proxies, or any service that doesn't need a full kernel. The resource overhead is minimal compared to a VM.</p>
<p>My general rule: <strong>LXC for infrastructure services, Docker for application workloads.</strong> This keeps things clean and separable.</p>
<h2>Container Orchestration</h2>
<p>Most of my application workloads run in <a href="https://www.docker.com">Docker</a> containers, orchestrated with <a href="https://docs.docker.com/compose/">Docker Compose</a>. For the scale I operate at, Compose hits the sweet spot — declarative, version-controlled, and simple enough that I can understand exactly what's running without consulting a dashboard.</p>
<p>A typical service looks like this:</p>
<pre><code class="language-yaml"># docker-compose.yml
services:
  app:
    image: ghcr.io/my-org/my-app:latest
    restart: unless-stopped
    environment:
      DATABASE_URL: postgres://user:pass@db:5432/app
      REDIS_URL: redis://redis:6379
    labels:
      - traefik.enable=true
      - traefik.http.routers.app.rule=Host(`app.example.com`)
      - traefik.http.routers.app.tls.certresolver=cloudflare
    networks:
      - traefik
      - internal

  db:
    image: postgres:16-alpine
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    networks:
      - internal

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    networks:
      - internal

volumes:
  pgdata:

networks:
  traefik:
    external: true
  internal:
</code></pre>
<p>The pattern is consistent across all services: the app connects to an external Traefik network for ingress, while databases and caches live on an internal network that's not exposed. Labels on the app container tell Traefik how to route traffic — no separate config files to maintain.</p>
<p>I also use <a href="https://kubernetes.io">Kubernetes</a> for workloads that need horizontal scaling or more sophisticated scheduling. But honestly, for most self-hosted scenarios, Docker Compose is more than enough. K8s introduces significant operational complexity that's only worth it when you genuinely need it.</p>
<h2>Reverse Proxy &amp; SSL</h2>
<p><a href="https://traefik.io">Traefik</a> is the centerpiece of my ingress layer. It automatically discovers services via Docker labels, handles TLS termination, and manages certificate renewal — all without manual intervention.</p>
<p>The Traefik configuration is minimal:</p>
<pre><code class="language-yaml"># traefik.yml
entryPoints:
  web:
    address: ':80'
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: ':443'

certificatesResolvers:
  cloudflare:
    acme:
      email: admin@example.com
      storage: /letsencrypt/acme.json
      dnsChallenge:
        provider: cloudflare

providers:
  docker:
    exposedByDefault: false
    network: traefik

api:
  dashboard: true
</code></pre>
<p>Every new service I deploy gets automatic HTTPS with zero extra configuration. I just add the Traefik labels to the Docker Compose file, and Traefik picks it up within seconds. The DNS challenge via <a href="https://www.cloudflare.com">Cloudflare</a> means I can issue wildcard certificates and don't need to expose port 80 for HTTP challenges.</p>
<p>Cloudflare also sits in front as a CDN and DDoS protection layer. DNS records point to Cloudflare, which proxies traffic to my server. This keeps my actual server IP hidden and adds an extra layer of caching for static assets.</p>
<h2>Networking</h2>
<p>This is where things get interesting. My network setup has evolved significantly over time.</p>
<p>At the edge, I run <a href="https://www.pfsense.org">pfSense</a> as my primary firewall. It handles VLAN segmentation, NAT, and firewall rules. I separate my network into multiple VLANs — management, servers, IoT devices, and guest traffic are all isolated from each other. A compromised IoT device shouldn't be able to reach my server VLAN, and guest WiFi shouldn't see anything on my internal network.</p>
<p><a href="https://ui.com">UniFi</a> access points and switches handle the physical layer. The UniFi controller runs as an LXC container on Proxmox, managing all network hardware from a single interface. Say what you will about Ubiquiti's pricing, but the management experience is hard to beat for a home/small office setup.</p>
<p>For remote access, <a href="https://tailscale.com">Tailscale</a> is a game-changer. It creates a WireGuard-based mesh VPN that connects all my devices — laptops, phones, servers — into a single private network, regardless of where they physically are. No port forwarding, no dynamic DNS, no VPN server to maintain.</p>
<pre><code class="language-bash"># Access my home server from anywhere
ssh user@server  # Just works, over Tailscale

# Access internal services without exposing them publicly
curl http://grafana:3000  # Only accessible via Tailscale network
</code></pre>
<p>Services that don't need to be public — like Grafana, Proxmox UI, or internal admin panels — are only accessible through Tailscale. This drastically reduces the attack surface. The only ports exposed to the internet are 80 and 443, both behind Cloudflare.</p>
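<p>Access control lives in Tailscale's policy file (HuJSON, so comments are allowed). A sketch of the idea — the tags, groups, and email address here are illustrative, not my real policy:</p>
<pre><code class="language-json">{
  // Only my own devices may reach tagged servers
  &quot;acls&quot;: [
    { &quot;action&quot;: &quot;accept&quot;, &quot;src&quot;: [&quot;group:admins&quot;], &quot;dst&quot;: [&quot;tag:server:*&quot;] }
  ],
  &quot;groups&quot;: {
    &quot;group:admins&quot;: [&quot;admin@example.com&quot;]
  },
  &quot;tagOwners&quot;: {
    &quot;tag:server&quot;: [&quot;group:admins&quot;]
  }
}
</code></pre>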
<h2>Monitoring &amp; Observability</h2>
<p>Running your own infrastructure without monitoring is like driving at night with the headlights off. <a href="https://prometheus.io">Prometheus</a> scrapes metrics from every service, and <a href="https://grafana.com">Grafana</a> turns those metrics into dashboards I can actually understand.</p>
<p>Every Docker host runs <code>node_exporter</code> and <code>cadvisor</code> for system and container metrics. Application services expose custom Prometheus endpoints where relevant. Prometheus collects everything and stores it with configurable retention.</p>
<pre><code class="language-yaml"># prometheus.yml
scrape_configs:
  - job_name: node
    static_configs:
      - targets:
          - 'node-exporter:9100'

  - job_name: cadvisor
    static_configs:
      - targets:
          - 'cadvisor:8080'

  - job_name: traefik
    static_configs:
      - targets:
          - 'traefik:8080'
</code></pre>
<p>I have Grafana dashboards for CPU/memory/disk usage, container health, network throughput, and Traefik request rates. AlertManager sends notifications to Telegram when something goes wrong — disk usage above 85%, a container restarting in a loop, or Traefik returning too many 5xx errors.</p>
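<p>The alerts themselves are ordinary Prometheus alerting rules. A sketch of the disk-usage one — the threshold and label matchers are illustrative:</p>
<pre><code class="language-yaml"># alert-rules.yml — illustrative, adjust matchers to your exporters
groups:
  - name: host
    rules:
      - alert: DiskUsageHigh
        expr: |
          (1 - node_filesystem_avail_bytes{fstype!~&quot;tmpfs|overlay&quot;}
             / node_filesystem_size_bytes) * 100 &gt; 85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: 'Disk usage above 85% on {{ $labels.instance }}'
</code></pre>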
<p>The monitoring stack itself runs on a separate LXC container so it stays up even if the Docker host has issues. You don't want your monitoring to go down at the same time as the thing it's monitoring.</p>
<h2>Backup Strategy</h2>
<p>The one thing I've learned the hard way: <strong>backups that aren't tested are not backups.</strong></p>
<p>My strategy is layered:</p>
<ol>
<li><strong>ZFS snapshots</strong> — automatic hourly snapshots with 7-day retention. Instant rollback for filesystem-level issues.</li>
<li><strong>Application-level backups</strong> — PostgreSQL <code>pg_dump</code> runs nightly via cron, compressed and stored on a separate ZFS dataset.</li>
<li><strong>Off-site replication</strong> — critical data is synced to <a href="https://min.io">MinIO</a> on a separate machine using <code>rclone</code>, and the most important stuff goes to an off-site location.</li>
</ol>
<pre><code class="language-bash"># Nightly database backup via cron
0 3 * * * pg_dump -Fc mydb &gt; /backups/mydb-$(date +\%Y\%m\%d).dump
# Sync to MinIO
0 4 * * * rclone sync /backups minio:backups --min-age 1h
</code></pre>
<p>I test restores quarterly. It's tedious, but the one time you need a backup and it doesn't work, you'll wish you had tested it.</p>
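<p>One detail the crontab above doesn't show: dumps accumulate forever unless you prune them. A sketch of the retention sweep I run — the guard refuses to delete anything unless a fresh dump exists, so a silently broken backup job can't slowly erase history:</p>
<pre><code class="language-bash">#!/usr/bin/env bash
# Keep 14 days of dumps; never prune unless a dump from the last day exists.
prune_dumps() {
  local dir="$1" days="$2"
  if find "$dir" -name '*.dump' -mtime -1 2>/dev/null | grep -q .; then
    find "$dir" -name '*.dump' -mtime +"$days" -delete
  fi
}

prune_dumps /backups 14
</code></pre>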
<h2>Provisioning with Ansible</h2>
<p>When I started, I configured everything manually — SSH in, install packages, edit config files. That works for one server. It doesn't work when you need to rebuild or replicate.</p>
<p><a href="https://www.ansible.com">Ansible</a> handles all server provisioning now. Every package, every config file, every cron job is defined in playbooks. If my server dies tomorrow, I can spin up a new Proxmox host and have everything running again by executing a single command.</p>
<pre><code class="language-bash">ansible-playbook -i inventory site.yml
</code></pre>
<p>The playbooks cover base system setup, Docker installation, Traefik configuration, monitoring stack deployment, firewall rules, and user management. It's not glamorous work, but it's the difference between a one-hour recovery and a two-day scramble.</p>
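<p>The structure is nothing exotic — one play, a handful of roles. A trimmed sketch (the role names are illustrative, not my exact repo layout):</p>
<pre><code class="language-yaml"># site.yml
- hosts: all
  become: true
  roles:
    - base        # packages, users, SSH hardening
    - docker      # engine and compose plugin
    - traefik     # ingress container, acme storage
    - monitoring  # node_exporter, cadvisor
    - firewall    # nftables rules
</code></pre>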
<h2>Lessons Learned</h2>
<p>After running this setup for a while, a few things stand out:</p>
<p><strong>Start simple.</strong> Don't build the perfect infrastructure on day one. Start with Docker Compose on a single server. Add complexity only when you hit real limitations, not imagined ones.</p>
<p><strong>Automate early.</strong> The second time you manually configure something, write an Ansible playbook for it. Your future self will thank you.</p>
<p><strong>Network segmentation matters.</strong> VLANs and firewall rules feel like overkill until you have a security incident. It's much easier to set up segmentation from the start than to retrofit it later.</p>
<p><strong>Monitor everything, alert selectively.</strong> Collect all the metrics you can, but only alert on things that require immediate action. Alert fatigue is real and dangerous.</p>
<p><strong>Document your setup.</strong> Not for others — for yourself in six months when you've forgotten why that one iptables rule exists. I keep a private wiki with network diagrams, service inventories, and runbooks for common operations.</p>
<h2>What's Next</h2>
<p>The infrastructure is never really &quot;done.&quot; I'm currently exploring moving more workloads to Kubernetes for better resource utilization and looking into GitOps workflows with Flux or ArgoCD for automated deployments. I'm also considering adding a secondary node for Proxmox clustering to enable live migration and high availability.</p>
<p>But for now, this setup handles everything I throw at it — multiple web apps, databases, queues, monitoring, and storage — all on hardware I own, running software I control. It's not perfect, but it's mine.</p>
<p>If you're thinking about self-hosting, my advice is simple: just start. Pick one service you're currently paying for in the cloud, spin up a cheap used server or a mini PC, and move it over. You'll learn more in a weekend than in a month of reading documentation.</p>
<p>Thanks for reading!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[My Terminal Setup in 2026]]></title>
            <link>https://zevs.gg/posts/terminal-setup-2026</link>
            <guid isPermaLink="true">https://zevs.gg/posts/terminal-setup-2026</guid>
            <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[A walkthrough of my terminal setup — Ghostty, Zsh, Starship, tmux, and the CLI tools that make my daily workflow fast and enjoyable.]]></description>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<p>I spend most of my working hours in the terminal. Over the years, my setup has gone through many iterations — from the default Bash on Ubuntu, to iTerm2 with Oh My Zsh, to a heavily customized Kitty setup. Each time, I thought I'd found the endgame. I never had.</p>
<p>This post documents my current terminal setup as of early 2026, running on Fedora 43. It's fast, minimal, and — most importantly — stays out of my way.</p>
<h2>Ghostty</h2>
<p><a href="https://ghostty.org">Ghostty</a> is my terminal emulator. Built by Mitchell Hashimoto (the creator of Vagrant, Terraform, and many others at HashiCorp), Ghostty is GPU-accelerated, written in Zig, and genuinely feels like it was designed by someone who uses a terminal 12 hours a day.</p>
<p>What sold me on Ghostty over alternatives like Kitty or Alacritty:</p>
<ul>
<li><strong>Native splits and tabs</strong> — no need for tmux just to have panes (though I still use tmux for sessions).</li>
<li><strong>Fast</strong> — rendering is buttery smooth even with large scrollback.</li>
<li><strong>Sane defaults</strong> — most things work well out of the box. The config file is minimal.</li>
<li><strong>Native GTK on Linux</strong> — it feels like a proper GNOME app, not an Electron wrapper.</li>
</ul>
<p>My config is straightforward:</p>
<pre><code class="language-ini"># Font
font-family = &quot;JetBrainsMono Nerd Font&quot;
font-size = 12.5

# Theme
theme = &quot;TokyoNight&quot;
window-theme = &quot;ghostty&quot;

# Padding
window-padding-x = 12
window-padding-y = 8
window-padding-balance = true

# Shell
command = &quot;/usr/bin/zsh&quot;

# Cursor
cursor-style = &quot;block&quot;
cursor-style-blink = true

# Scrollback
scrollback-limit = 1000000

# Misc
mouse-hide-while-typing = true
copy-on-select = true
unfocused-split-opacity = 0.85
</code></pre>
<p>The TokyoNight theme runs consistently across Ghostty, tmux, and my editor — having a unified color scheme across tools removes a surprising amount of visual friction.</p>
<p>For keybindings, I use Ghostty's built-in splits alongside tmux. Ghostty splits (<code>Ctrl+Super+R/D</code>) for quick side-by-side work, tmux sessions for persistent workspaces that survive terminal restarts:</p>
<pre><code class="language-ini"># Splits
keybind = &quot;ctrl+super+r=new_split:right&quot;
keybind = &quot;ctrl+super+d=new_split:down&quot;
keybind = &quot;ctrl+super+z=toggle_split_zoom&quot;

# Split navigation
keybind = &quot;alt+shift+left=goto_split:left&quot;
keybind = &quot;alt+shift+right=goto_split:right&quot;
keybind = &quot;alt+shift+up=goto_split:up&quot;
keybind = &quot;alt+shift+down=goto_split:down&quot;

# Tab navigation
keybind = &quot;alt+1=goto_tab:1&quot;
keybind = &quot;alt+2=goto_tab:2&quot;
keybind = &quot;alt+3=goto_tab:3&quot;
</code></pre>
<h2>Zsh + Zinit</h2>
<p>I use <a href="https://www.zsh.org/">Zsh</a> as my shell, managed with <a href="https://github.com/zdharma-continuum/zinit">Zinit</a> as the plugin manager. I moved away from Oh My Zsh as a framework a while ago — it's great for getting started, but loading the full framework adds noticeable startup latency. With Zinit, I cherry-pick only the OMZ snippets I actually need.</p>
<p>My plugin setup:</p>
<pre><code class="language-zsh"># Plugin manager
source &quot;${ZINIT_HOME}/zinit.zsh&quot;

# Completions (load before compinit)
zinit light zsh-users/zsh-completions

# Oh My Zsh snippets — only what I use
zinit snippet OMZP::git
zinit snippet OMZP::sudo
zinit snippet OMZP::docker
zinit snippet OMZP::docker-compose
zinit snippet OMZP::node
zinit snippet OMZP::python
zinit snippet OMZP::command-not-found

# Initialize completion
autoload -Uz compinit &amp;&amp; compinit

# fzf-tab (must be after compinit)
zinit light Aloxaf/fzf-tab

# Autosuggestions, then syntax highlighting (order matters)
zinit light zsh-users/zsh-autosuggestions
zinit light zsh-users/zsh-syntax-highlighting
</code></pre>
<p>The three plugins that make the biggest difference in daily usage:</p>
<ul>
<li><strong>zsh-autosuggestions</strong> — suggests commands as you type based on history. Accept with right arrow. Once you're used to this, you can't go back.</li>
<li><strong>zsh-syntax-highlighting</strong> — colors valid commands green and invalid ones red <em>as you type</em>, before hitting enter. Catches typos instantly.</li>
<li><strong>fzf-tab</strong> — replaces the default tab completion with a fuzzy finder. Combined with fzf-preview, it shows directory contents while you're navigating.</li>
</ul>
<pre><code class="language-zsh"># fzf-tab: preview directories while completing
zstyle ':fzf-tab:complete:cd:*' fzf-preview 'ls --color $realpath'
zstyle ':fzf-tab:complete:__zoxide_z:*' fzf-preview 'ls --color $realpath'
</code></pre>
<p>History is configured to deduplicate and share across sessions:</p>
<pre><code class="language-zsh">HISTSIZE=5000
HISTFILE=&quot;$HOME/.zsh_history&quot;
SAVEHIST=$HISTSIZE
setopt appendhistory sharehistory
setopt hist_ignore_all_dups hist_save_no_dups
</code></pre>
<h2>Starship Prompt</h2>
<p><a href="https://starship.rs">Starship</a> is a cross-shell prompt written in Rust. It's fast (sub-millisecond rendering), configurable, and shows contextual information based on the current directory — git branch, language versions, Docker context, command duration.</p>
<p>My prompt format keeps the left side clean (directory + git) and pushes language/tool indicators to the right using a fill:</p>
<pre><code class="language-toml">format = &quot;&quot;&quot;
$directory\
$git_branch\
$git_status\
$fill\
$python$nodejs$golang$rust$docker_context\
$cmd_duration\
$line_break\
$character&quot;&quot;&quot;
add_newline = false
palette = 'nord'
</code></pre>
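<p>The <code>$fill</code> token pads the gap between the two sides with the fill module's symbol; I set it to a plain space so the gap stays invisible:</p>
<pre><code class="language-toml">[fill]
symbol = ' '
</code></pre>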
<p>I use the <a href="https://www.nordtheme.com/">Nord color palette</a> for the prompt, which pairs well with the TokyoNight terminal theme — both are cool-toned and low-contrast enough to not be distracting.</p>
<p>A few tweaks that matter:</p>
<pre><code class="language-toml">[directory]
truncation_length = 3
truncation_symbol = '…/'
truncate_to_repo = false

[cmd_duration]
min_time = 500
style = 'fg:gray'
</code></pre>
<p>The <code>cmd_duration</code> module shows elapsed time for any command that takes longer than 500ms. Useful for noticing when a build or test run took longer than expected.</p>
<h2>tmux</h2>
<p><a href="https://github.com/tmux/tmux">tmux</a> handles session persistence. If my terminal crashes or I close the window, my sessions survive. I can also detach from a session on my desktop and reattach from SSH on my laptop — the session state is preserved.</p>
<p>I rebind the prefix to <code>Ctrl+Space</code> instead of the default <code>Ctrl+B</code>:</p>
<pre><code class="language-bash">unbind C-b
set -g prefix C-Space
bind C-Space send-prefix
</code></pre>
<p>Pane navigation is bound to <code>Alt+Arrows</code> (no prefix needed), and new splits inherit the current working directory:</p>
<pre><code class="language-bash"># Pane navigation without prefix
bind -n M-Left  select-pane -L
bind -n M-Right select-pane -R
bind -n M-Up    select-pane -U
bind -n M-Down  select-pane -D

# Keep cwd for new panes/windows
bind '&quot;' split-window -v -c &quot;#{pane_current_path}&quot;
bind % split-window -h -c &quot;#{pane_current_path}&quot;
bind c new-window -c &quot;#{pane_current_path}&quot;
</code></pre>
<p>The <a href="https://github.com/janoamaral/tokyo-night-tmux">Tokyo Night tmux theme</a> keeps the status bar consistent with the rest of the setup. I disable the clock and git indicator to keep it minimal — Starship already handles those.</p>
<h2>CLI Tools</h2>
<p>These are the tools that replaced standard Unix utilities in my workflow:</p>
<h3>eza</h3>
<p><a href="https://github.com/eza-community/eza"><code>eza</code></a> replaces <code>ls</code>. Git-aware, icons, tree view, and color-coded output. I alias <code>ls</code> to it globally:</p>
<pre><code class="language-zsh">alias ls='eza --icons'
alias ll='eza -l --icons'
alias la='eza -la --icons'
alias lt='eza --tree --icons --level=2'
alias lg='eza -l --git --icons'
</code></pre>
<p>The <code>lg</code> alias is particularly useful — it shows git status per-file inline with the listing.</p>
<h3>bat</h3>
<p><a href="https://github.com/sharkdp/bat"><code>bat</code></a> replaces <code>cat</code>. Syntax highlighting, line numbers, git diff integration. Aliased to <code>cat</code> because there's zero reason to use plain <code>cat</code> for reading files.</p>
<pre><code class="language-zsh">alias cat='bat'
</code></pre>
<h3>fzf</h3>
<p><a href="https://github.com/junegunn/fzf"><code>fzf</code></a> is a fuzzy finder that integrates with everything — shell history (<code>Ctrl+R</code>), file search (<code>Ctrl+T</code>), directory navigation, and tab completion via fzf-tab. It's one of those tools that once you internalize the keybindings, you wonder how you ever lived without it.</p>
<h3>zoxide</h3>
<p><a href="https://github.com/ajeetdsouza/zoxide"><code>zoxide</code></a> replaces <code>cd</code>. It tracks the directories you visit and lets you jump to them with partial matches. Instead of typing <code>cd ~/workspace/projects/my-app</code>, I just type <code>cd my-app</code> and zoxide figures out the rest.</p>
<pre><code class="language-zsh"># Replaces cd with zoxide's smart jump
eval &quot;$(zoxide init --cmd cd zsh)&quot;
</code></pre>
<h3>uv</h3>
<p><a href="https://github.com/astral-sh/uv"><code>uv</code></a> is a Python package manager written in Rust. It's absurdly fast — installing packages feels instant compared to pip. I use it for all Python work now.</p>
<h2>The Full Picture</h2>
<p>Here's how everything fits together:</p>
<pre><code>Ghostty (GPU-accelerated terminal)
  └─ Zsh (shell)
       ├─ Zinit (plugin manager)
       │    ├─ zsh-autosuggestions
       │    ├─ zsh-syntax-highlighting
       │    └─ fzf-tab
       ├─ Starship (prompt)
       ├─ tmux (session management)
       └─ CLI tools
            ├─ eza (ls)
            ├─ bat (cat)
            ├─ fzf (fuzzy finder)
            ├─ zoxide (cd)
            └─ uv (python packages)
</code></pre>
<p>The entire setup initializes in under 100ms. Zsh startup is fast because Zinit lazy-loads most plugins, Starship renders the prompt in single-digit milliseconds, and Ghostty's GPU rendering means there's never a visible lag between keypress and output.</p>
<h2>What I'd Change</h2>
<p>No setup is perfect. A few things I'm still iterating on:</p>
<ul>
<li><strong>Neovim integration</strong> — I use VS Code as my primary editor, but I keep meaning to build out a proper Neovim config for remote editing over SSH where VS Code feels heavy.</li>
<li><strong>Dotfiles management</strong> — My configs are currently just files on disk. I should probably set up a proper dotfiles repo with symlink management via <code>stow</code> or <code>chezmoi</code>.</li>
<li><strong>Shell startup profiling</strong> — Zsh startup is fast <em>enough</em>, but I haven't profiled it in a while. There might be low-hanging fruit I'm missing.</li>
</ul>
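<p>On the dotfiles point — the core mechanism that tools like <code>stow</code> automate is just symlinking files from a repo into <code>$HOME</code>. A minimal sketch of that mechanism (paths are hypothetical):</p>
<pre><code class="language-bash"># Symlink one config file from a dotfiles repo into $HOME.
link_dotfile() {
  local repo="$1" name="$2"
  mkdir -p "$(dirname "$HOME/$name")"
  ln -sfn "$repo/$name" "$HOME/$name"
}

# Usage: link_dotfile ~/dotfiles .zshrc
</code></pre>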
<h2>Wrap Up</h2>
<p>A terminal setup is deeply personal — what works for me might feel wrong to you. The key is to invest time in tools you use every day. Small optimizations in your terminal workflow compound over thousands of hours of use. A 2-second save on a command you run 50 times a day is nearly 10 hours saved per year.</p>
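<p>Quick sanity check on that number:</p>
<pre><code class="language-bash"># 2 seconds saved, 50 runs a day, 365 days — converted to hours
awk 'BEGIN { printf "%.1f hours\n", 2 * 50 * 365 / 3600 }'
# 10.1 hours
</code></pre>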
<p>If you're still on the default terminal with default Bash, I'd encourage you to try just one thing from this post. Start with Starship or eza — they're drop-in replacements that require zero commitment. Once you see the difference, you'll naturally want to optimize more.</p>
<p>You can find my <a href="/uses">full tech stack here</a>.</p>
<p>Thanks for reading!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[Vue 3 SSG: How I Built This Website]]></title>
            <link>https://zevs.gg/posts/vue3-ssg-building-this-site</link>
            <guid isPermaLink="true">https://zevs.gg/posts/vue3-ssg-building-this-site</guid>
            <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[A deep dive into the tech stack behind this site — Vue 3, Vite, SSG, UnoCSS, Markdown as Vue components, auto-imports, and the build pipeline that ties it all together.]]></description>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<p>Every developer eventually builds a personal site. Most of us rebuild it every year or two, chasing the latest framework. I've been through Jekyll, Hugo, Gatsby, Next.js, and Nuxt. This iteration is the one that stuck — not because Vue 3 is objectively the best choice, but because the developer experience finally matches how I want to work.</p>
<p>This post is a walkthrough of the entire stack — the architecture, the build pipeline, and the patterns that make writing content feel effortless.</p>
<h2>Why Vue 3 + SSG</h2>
<p>Static site generation means the site is pre-rendered to HTML at build time. No server, no runtime, no database. The output is a folder of HTML files that can be deployed anywhere — Netlify, Cloudflare Pages, a simple Nginx server, or even a GitHub Pages repo.</p>
<p>Vue 3 might seem like overkill for a personal site, but the combination of Vue's component model with static generation gives me something that pure static site generators (Hugo, Eleventy) can't: <strong>interactive components embedded directly in markdown content</strong>. I can write a blog post in Markdown and drop in a Vue component for a live demo, an interactive chart, or a custom visualization — all without leaving the markdown file.</p>
<p>The framework powering this is <a href="https://github.com/antfu/vite-ssg">vite-ssg</a>, built by Anthony Fu. It takes a standard Vue 3 + Vite application and generates static HTML for every route at build time. The client-side JavaScript then hydrates the HTML, making it interactive. You get the SEO benefits of pre-rendered HTML with the interactivity of a single-page application.</p>
<pre><code class="language-ts">// src/main.ts
import NProgress from 'nprogress'
import { createPinia } from 'pinia'
import { ViteSSG } from 'vite-ssg'
import { setupRouterScroller } from 'vue-router-better-scroller'
import { routes } from 'vue-router/auto-routes'
import App from './App.vue'

export const createApp = ViteSSG(
  App,
  { routes },
  ({ router, app, isClient }) =&gt; {
    // Plugins, router guards, client-only setup
    app.use(createPinia())

    if (isClient) {
      setupRouterScroller(router, { behavior: 'auto' })

      router.beforeEach(() =&gt; NProgress.start())
      router.afterEach(() =&gt; NProgress.done())
    }
  },
)
</code></pre>
<p>The <code>isClient</code> guard ensures that browser-specific code (scroll behavior, progress bars) only runs in the browser, not during SSG pre-rendering. This pattern appears everywhere when you work with SSG — you have to be mindful that your code runs in two contexts: Node.js at build time and the browser at runtime.</p>
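<p>Switching a stock Vite app over is mostly a <code>package.json</code> change — <code>vite-ssg</code> ships its own build command that wraps Vite's:</p>
<pre><code class="language-json">{
  &quot;scripts&quot;: {
    &quot;dev&quot;: &quot;vite&quot;,
    &quot;build&quot;: &quot;vite-ssg build&quot;,
    &quot;preview&quot;: &quot;vite preview&quot;
  }
}
</code></pre>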
<h2>File-Based Routing</h2>
<p>Routes are generated automatically from the filesystem using <a href="https://github.com/posva/unplugin-vue-router">unplugin-vue-router</a>. Every <code>.vue</code> or <code>.md</code> file in the <code>pages/</code> directory becomes a route:</p>
<pre><code>pages/
├── index.md              → /
├── uses.md               → /uses
├── photos.vue            → /photos
├── bookmarks.vue         → /bookmarks
└── posts/
    ├── index.md          → /posts
    ├── building-smscode.md    → /posts/building-smscode
    ├── terminal-setup-2026.md → /posts/terminal-setup-2026
    └── from-node-to-rust.md   → /posts/from-node-to-rust
</code></pre>
<p>No router configuration to maintain. Create a file, get a route. Delete the file, the route disappears. The plugin generates type-safe route definitions, so <code>&lt;RouterLink to=&quot;/posts&quot;&gt;</code> gets compile-time validation.</p>
<p>The route generation also handles frontmatter. Each markdown file has YAML frontmatter at the top — title, date, description, duration. This metadata is read at build time using <a href="https://github.com/jonschlinkert/gray-matter">gray-matter</a> and attached to the route's meta:</p>
<pre><code class="language-ts">// vite.config.ts
import fs from 'node:fs'
import matter from 'gray-matter'

VueRouter({
  extensions: ['.vue', '.md'],
  routesFolder: 'pages',
  extendRoute(route) {
    const path = route.components.get('default')
    if (path?.endsWith('.md')) {
      const { data } = matter(fs.readFileSync(path, 'utf-8'))
      route.addToMeta({ frontmatter: data })
    }
  },
})
</code></pre>
<p>This is what powers the blog listing page. The <code>ListPosts</code> component reads route metadata from the router directly — no API calls, no file imports, just data that's already available in the route definitions:</p>
<pre><code class="language-ts">// ListPosts.vue
const router = useRouter() // auto-imported from vue-router

const routes: Post[] = router.getRoutes()
  .filter(i =&gt; i.path.startsWith('/posts')
    &amp;&amp; i.meta.frontmatter.date
    &amp;&amp; !i.meta.frontmatter.draft)
  .map(i =&gt; ({
    path: i.meta.frontmatter.redirect || i.path,
    title: i.meta.frontmatter.title,
    date: i.meta.frontmatter.date,
    lang: i.meta.frontmatter.lang,
    duration: i.meta.frontmatter.duration,
  }))
</code></pre>
<p>Posts are sorted by date, grouped by year, and the year headers render as large, semi-transparent text in the background — a subtle but effective visual hierarchy. Draft posts (marked with <code>draft: true</code> in frontmatter) are filtered out of the listing but still accessible via direct URL during development.</p>
<h2>Markdown as Vue Components</h2>
<p>This is the core trick that makes the whole setup work. <a href="https://github.com/unplugin/unplugin-vue-markdown">unplugin-vue-markdown</a> compiles Markdown files into Vue Single File Components at build time. This means every <code>.md</code> file is a Vue component — it can use other components, receive props, and has full access to Vue's reactivity system.</p>
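<p>Wiring this up takes two plugin entries in <code>vite.config.ts</code> — the Vue plugin has to be told to compile <code>.md</code> files as components, and the markdown plugin names the wrapper. A sketch with the markdown-it customization omitted:</p>
<pre><code class="language-ts">// vite.config.ts (sketch — markdown-it plugins omitted)
import Vue from '@vitejs/plugin-vue'
import Markdown from 'unplugin-vue-markdown/vite'

export default {
  plugins: [
    Vue({ include: [/\.vue$/, /\.md$/] }), // compile .md as SFCs too
    Markdown({ wrapperComponent: 'WrapperPost' }),
  ],
}
</code></pre>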
<pre><code class="language-md">---
title: My Blog Post
date: 2026-03-06T00:00:00Z
duration: 10min
---

This is regular markdown with **bold** and `code`.

But I can also use Vue components directly:

&lt;MyCustomChart :data=&quot;chartData&quot; /&gt;

Or use HTML with UnoCSS utilities:

&lt;div flex gap-4 items-center&gt;
  &lt;span text-2xl&gt;Interactive content in markdown!&lt;/span&gt;
&lt;/div&gt;
</code></pre>
<p>Each markdown file is wrapped in a <code>WrapperPost</code> component that provides the layout — title, date, duration, navigation links, and social sharing buttons. The wrapper reads the frontmatter and renders the chrome around the content:</p>
<pre><code class="language-vue">&lt;!-- WrapperPost.vue --&gt;
&lt;template&gt;
  &lt;div v-if=&quot;frontmatter.title&quot; class=&quot;prose m-auto mb-8&quot;&gt;
    &lt;h1 class=&quot;mb-0&quot;&gt;
      {{ frontmatter.title }}
    &lt;/h1&gt;
    &lt;p v-if=&quot;frontmatter.date&quot; class=&quot;opacity-50 !-mt-6&quot;&gt;
      {{ formatDate(frontmatter.date) }}
      &lt;span v-if=&quot;frontmatter.duration&quot;&gt;· {{ frontmatter.duration }}&lt;/span&gt;
    &lt;/p&gt;
  &lt;/div&gt;
  &lt;article&gt;
    &lt;slot /&gt;
  &lt;/article&gt;
&lt;/template&gt;
</code></pre>
<p>The markdown pipeline itself is heavily customized with <a href="https://github.com/markdown-it/markdown-it">markdown-it</a> plugins:</p>
<ul>
<li><strong>Shiki</strong> — Syntax highlighting with the Vitesse dark/light themes. Supports diff notation (<code>// [!code ++]</code>), line highlighting, and even TypeScript type annotations via TwoSlash.</li>
<li><strong>Anchor links</strong> — Every heading gets a permalink anchor for direct linking.</li>
<li><strong>Table of contents</strong> — The <code>[[toc]]</code> directive generates a floating TOC from headings.</li>
<li><strong>External links</strong> — All external links automatically get <code>target=&quot;_blank&quot;</code> and <code>rel=&quot;noopener&quot;</code>.</li>
<li><strong>Magic links</strong> — Brand names like {SMSCode} auto-link to their URLs with custom icons. This is configured in <code>vite.config.ts</code>:</li>
</ul>
<pre><code class="language-ts">md.use(MarkdownItMagicLink, {
  linksMap: {
    SMSCode: 'https://smscode.gg',
    UNDRCTRL: 'https://undrctrl.id',
    FYP: 'https://fyp.id',
  },
  imageOverrides: [
    ['https://smscode.gg', '/icons/smscode.svg'],
    ['https://undrctrl.id', '/icons/undrctrl.svg'],
  ],
})
</code></pre>
<ul>
<li><strong>GitHub Alerts</strong> — Markdown <code>&gt; [!NOTE]</code> and <code>&gt; [!WARNING]</code> blocks render as styled callouts.</li>
</ul>
<h2>UnoCSS: Utility-First, Zero Config</h2>
<p><a href="https://unocss.dev">UnoCSS</a> replaces Tailwind CSS. It's an on-demand atomic CSS engine — it only generates the CSS for utilities you actually use, and it's significantly faster than Tailwind's JIT compiler because it skips the PostCSS step entirely.</p>
<p>The key feature I rely on is <strong>attributify mode</strong>. Instead of stuffing all utility classes into a <code>class</code> attribute, you write them directly as HTML attributes:</p>
<pre><code class="language-html">&lt;!-- Traditional utility classes --&gt;
&lt;div class=&quot;flex gap-4 items-center opacity-50 mt-4&quot;&gt;&lt;/div&gt;

&lt;!-- Attributify mode --&gt;
&lt;div flex gap-4 items-center op50 mt-4&gt;&lt;/div&gt;
</code></pre>
<p>It reads cleaner and is easier to scan visually. The Vue compiler and UnoCSS work together to transform these attributes into actual CSS at build time.</p>
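<p>Attributify mode is enabled by a preset in the UnoCSS config. A minimal sketch (the exact preset list varies per project):</p>

```ts
// uno.config.ts — a minimal sketch
import { defineConfig, presetAttributify, presetUno } from 'unocss'

export default defineConfig({
  presets: [
    presetUno(), // the Tailwind-compatible utility set
    presetAttributify(), // enables `<div flex gap-4>` style attributes
  ],
})
```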
<p>The design itself is intentionally minimal. Dark mode is toggled via <a href="https://vueuse.org/core/useDark/">VueUse's <code>useDark</code></a> with class-based switching on the <code>&lt;html&gt;</code> element. The color palette is sparse — mostly grays with opacity variations. Content width is constrained by a <code>.prose</code> class that sets <code>max-width</code> and typographic defaults.</p>
<p>The CSS entry point is clean:</p>
<pre><code class="language-css">:root {
  --c-bg: #fff;
  --c-scrollbar: #eee;
}

html.dark {
  --c-bg: #050505;
  --c-scrollbar: #111;
}
</code></pre>
<p>A slide-enter animation plays when navigating between pages, giving each page load a subtle staggered reveal. Each child element in the content area enters with an increasing delay, creating a cascading effect that makes the page feel alive without being distracting.</p>
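<p>A sketch of how such a staggered reveal can be implemented in CSS (the class and custom property names here are illustrative, not necessarily the site's actual ones):</p>

```css
@keyframes slide-enter {
  from {
    opacity: 0;
    transform: translateY(10px);
  }
  to {
    opacity: 1;
    transform: translateY(0);
  }
}

/* Each direct child gets an increasing --enter-stage (set per element),
   so the animation delays cascade down the page. */
.slide-enter-content > * {
  animation: slide-enter 0.6s both;
  animation-delay: calc(var(--enter-stage, 0) * 60ms);
}
```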
<h2>Auto-Imports Everywhere</h2>
<p>One of the best DX features in this stack is that almost nothing needs to be explicitly imported. Three unplugin plugins handle this:</p>
<p><strong><a href="https://github.com/unplugin/unplugin-auto-import">unplugin-auto-import</a></strong> — Vue APIs (<code>ref</code>, <code>computed</code>, <code>watch</code>, <code>onMounted</code>), Vue Router APIs (<code>useRoute</code>, <code>useRouter</code>), and VueUse composables (<code>useDark</code>, <code>useEventListener</code>, <code>useLocalStorage</code>) are all available globally without import statements.</p>
<pre><code class="language-ts">// vite.config.ts
AutoImport({
  imports: ['vue', VueRouterAutoImports, '@vueuse/core'],
})
</code></pre>
<p><strong><a href="https://github.com/unplugin/unplugin-vue-components">unplugin-vue-components</a></strong> — Vue components in <code>src/components/</code> are auto-registered. If I create <code>MyWidget.vue</code>, I can use <code>&lt;MyWidget /&gt;</code> in any template or markdown file without importing it.</p>
<p><strong><a href="https://github.com/unplugin/unplugin-icons">unplugin-icons</a></strong> — Icons from the entire <a href="https://iconify.design">Iconify</a> collection (150,000+ icons) are available as components. No icon fonts, no SVG imports — just use an HTML attribute:</p>
<pre><code class="language-html">&lt;div i-ri-github-fill /&gt;
&lt;div i-ri-article-line /&gt;
&lt;div i-carbon-arrow-up-right /&gt;
</code></pre>
<p>The icon is resolved at build time, tree-shaken to only include the icons you actually use, and inlined as an SVG. Zero runtime cost.</p>
<p>The combined effect is significant. A typical <code>.vue</code> file in this project has zero import statements. Everything is available implicitly. This sounds like it would hurt readability, but in practice, the APIs are so well-known (Vue, VueUse) that explicit imports just add noise.</p>
<h2>OG Image Generation</h2>
<p>Every blog post automatically gets an Open Graph image for social media previews. The generation happens at build time using <a href="https://sharp.pixelplumbing.com">sharp</a> — an SVG template is filled with the post title and rendered to a PNG:</p>
<pre><code class="language-ts">// vite.config.ts — frontmatterPreprocess
const route = basename(id, '.md')
const path = `og/${route}.png`
generateOg(frontmatter.title, `public/${path}`)
frontmatter.image = `https://zevs.gg/${path}`
</code></pre>
<p>The generated image URL is set as the frontmatter <code>image</code>, which the head meta tags pick up automatically. When someone shares a post on Twitter, Telegram, or LinkedIn, they see a card with the post title — no manual image creation needed.</p>
<p>The SVG template uses a simple layout — title text on a solid background. The title is split into lines at word boundaries every 30 characters, then rendered with sharp:</p>
<pre><code class="language-ts">async function generateOg(title: string, output: string) {
  const lines = title.trim().split(/(.{0,30})(?:\s|$)/g).filter(Boolean)

  const svg = ogTemplate.replace(
    /\{\{([^}]+)\}\}/g,
    (_, name) =&gt; ({ line1: lines[0], line2: lines[1], line3: lines[2] })[name] || ''
  )

  await sharp(Buffer.from(svg))
    .resize(Math.round(1200 * 1.1), Math.round(630 * 1.1)) // sharp requires integer dimensions
    .png()
    .toFile(output)
}
</code></pre>
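<p>For intuition, the word-wrap split can be exercised on its own, since it's pure string logic:</p>

```typescript
// Same regex as in generateOg: chunks of up to 30 chars, broken at whitespace.
// The capture group makes split() return the chunks; filter(Boolean) drops
// the empty strings between matches.
function splitTitle(title: string): string[] {
  return title.trim().split(/(.{0,30})(?:\s|$)/g).filter(Boolean)
}

console.log(splitTitle('Building SMSCode: A Production Rust Service from Scratch'))
// → ['Building SMSCode: A Production', 'Rust Service from Scratch']
```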
<h2>RSS Feed</h2>
<p>The RSS feed is generated by a post-build script. It reads all markdown files, parses them with gray-matter and markdown-it, and outputs RSS, Atom, and JSON feeds:</p>
<pre><code class="language-ts">// scripts/rss.ts
const files = await fg('pages/posts/*.md')
const posts = await Promise.all(
  files.map(async (i) =&gt; {
    const { data, content } = matter(await fs.readFile(i, 'utf-8'))
    const html = markdown.render(content)
    return { ...data, content: html, date: new Date(data.date) }
  })
)

posts.sort((a, b) =&gt; +new Date(b.date) - +new Date(a.date))

const feed = new Feed(options)
posts.forEach(item =&gt; feed.addItem(item))

await fs.writeFile('./dist/feed.xml', feed.rss2())
await fs.writeFile('./dist/feed.atom', feed.atom1())
await fs.writeFile('./dist/feed.json', feed.json1())
</code></pre>
<p>The full build pipeline runs sequentially:</p>
<pre><code class="language-bash"># package.json build script
vite-ssg build          # 1. SSG pre-render all routes
tsx scripts/copy-fonts  # 2. Copy font files to dist
tsx scripts/rss.ts      # 3. Generate RSS/Atom/JSON feeds
cp _dist_redirects dist/_redirects  # 4. Copy redirect rules
</code></pre>
<h2>Deployment</h2>
<p>The site deploys to <a href="https://www.netlify.com">Netlify</a> on every push. The <code>netlify.toml</code> configuration is minimal:</p>
<pre><code class="language-toml">[build]
publish = &quot;dist&quot;
command = &quot;pnpm run build&quot;

[build.environment]
NODE_VERSION = &quot;22&quot;

[[headers]]
for = &quot;/assets/*&quot;

[headers.values]
Cache-Control = &quot;public, max-age=31536000, immutable&quot;
</code></pre>
<p>Static assets (JS, CSS, images) get year-long cache headers with <code>immutable</code> — since Vite hashes filenames, the content at any given URL never changes. HTML files use Netlify's default caching, which revalidates on each request.</p>
<p>The SPA fallback redirect ensures that client-side navigation works:</p>
<pre><code class="language-toml">[[redirects]]
from = &quot;/*&quot;
to = &quot;/index.html&quot;
status = 200
</code></pre>
<p>This catches routes that don't have pre-rendered HTML (unlikely with SSG, but good as a safety net) and serves the SPA shell instead.</p>
<h2>Code Quality</h2>
<p>Pre-commit hooks keep the codebase consistent:</p>
<pre><code class="language-json">{
  &quot;simple-git-hooks&quot;: {
    &quot;pre-commit&quot;: &quot;npx lint-staged&quot;
  },
  &quot;lint-staged&quot;: {
    &quot;*&quot;: &quot;eslint --fix&quot;
  }
}
</code></pre>
<p>Every staged file — JS, TS, Vue, Markdown, JSON — runs through ESLint with <a href="https://github.com/antfu/eslint-config">@antfu/eslint-config</a> before commit. This config includes formatting rules (replacing Prettier), Vue-specific rules, TypeScript checks, and UnoCSS attribute ordering. No separate Prettier config, no formatting debates — one tool handles everything.</p>
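<p>The ESLint side is a single default export in the flat config. A sketch (the option names follow @antfu/eslint-config's documented flags):</p>

```ts
// eslint.config.ts — a sketch of the flat config
import antfu from '@antfu/eslint-config'

export default antfu({
  formatters: true, // let ESLint handle formatting instead of Prettier
  unocss: true, // UnoCSS attribute ordering rules
})
```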
<h2>The Full Architecture</h2>
<p>Here's how everything connects:</p>
<pre><code>pages/*.md / pages/*.vue          (Content &amp; Routes)
    │
    ├─ unplugin-vue-router        (File-based routing)
    ├─ unplugin-vue-markdown      (Markdown → Vue SFC)
    │   ├─ gray-matter            (YAML frontmatter)
    │   ├─ Shiki                  (Syntax highlighting)
    │   ├─ markdown-it-anchor     (Heading permalinks)
    │   ├─ markdown-it-toc        (Table of contents)
    │   └─ markdown-it-magic-link (Brand auto-linking)
    │
    ├─ unplugin-auto-import       (Vue/VueUse auto-imports)
    ├─ unplugin-vue-components    (Component auto-registration)
    ├─ unplugin-icons             (Iconify → SVG components)
    ├─ UnoCSS                     (Atomic CSS, attributify)
    │
    └─ vite-ssg                   (Static site generation)
        ├─ sharp                  (OG image generation)
        ├─ feed                   (RSS/Atom/JSON feeds)
        └─ dist/                  → Netlify
</code></pre>
<p>The entire build takes about 15 seconds. The development server starts in under 2 seconds with full HMR — editing a markdown file reflects instantly in the browser.</p>
<h2>What I'd Change</h2>
<p><strong>Markdown limitations.</strong> The markdown-to-Vue compilation is powerful, but debugging errors inside markdown files is painful. If a Vue component inside markdown has a syntax error, the error message points to the compiled output, not the source. Better source maps would help.</p>
<p><strong>Image optimization.</strong> I don't have an automated image pipeline yet. Images are manually compressed before commit. An on-the-fly optimization step (like <code>vite-imagetools</code> or a custom sharp pipeline) would be worthwhile.</p>
<p><strong>Search.</strong> There's no search functionality. For a blog with a growing number of posts, this will eventually matter. A client-side search index built at SSG time (like FlexSearch) would fit the static architecture well.</p>
<h2>Wrap Up</h2>
<p>The core principle behind this stack is that <strong>content should be easy to create and hard to break</strong>. Writing a new blog post is: create a <code>.md</code> file, add frontmatter, write content, push. The build system handles routing, OG images, RSS feeds, syntax highlighting, and deployment automatically.</p>
<p>If you're building a personal site and value developer experience over simplicity, Vue 3 + Vite + SSG is a stack worth considering. The plugin ecosystem (all those <code>unplugin-*</code> packages) does most of the heavy lifting, and the result is a site that's fast to build, fast to load, and — most importantly — fast to write for.</p>
<p>Thanks for reading!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[Categorize Your Dependencies]]></title>
            <link>https://zevs.gg/posts/categorize-deps</link>
            <guid isPermaLink="true">https://zevs.gg/posts/categorize-deps</guid>
            <pubDate>Mon, 28 Apr 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[A better way to organize npm dependencies — beyond just dependencies and devDependencies.]]></description>
            <content:encoded><![CDATA[<p>When building a project, it's very likely that we will install third-party packages from npm to offload some tasks. On that topic, we know there are two major types of dependencies: <code>dependencies</code> (prod) and <code>devDependencies</code> (dev). In our <code>package.json</code>, it might look something like this:</p>
<pre><code class="language-json">{
  &quot;name&quot;: &quot;my-cool-vue-components&quot;,
  &quot;dependencies&quot;: {
    &quot;vue&quot;: &quot;^3.5.15&quot;
  },
  &quot;devDependencies&quot;: {
    &quot;eslint&quot;: &quot;^9.15.0&quot;
  }
}
</code></pre>
<p>The main difference is that <a href="https://github.com/npm/npm/blob/2e3776bf5676bc24fec6239a3420f377fe98acde/doc/files/package.json.md#devdependencies"><code>devDependencies</code></a> are only needed during the build or development phase, while <a href="https://github.com/npm/npm/blob/2e3776bf5676bc24fec6239a3420f377fe98acde/doc/files/package.json.md#dependencies"><code>dependencies</code></a> are required for the project to run. For example, <code>eslint</code> in the case above only lints our source code; it's no longer needed when we publish the project or deploy it to production.</p>
<p>The concept of <code>dependencies</code> and <code>devDependencies</code> was originally introduced for <strong>authoring Node.js libraries</strong> (those published to npm). When you install a package like <code>vite</code>, npm automatically installs its <code>dependencies</code> but not its <code>devDependencies</code>. This is because you are consuming <code>vite</code> as a dependency and don't need its development tools. So, even if <code>vite</code> uses <code>prettier</code> during its development, you won't be forced to install <code>prettier</code> when you only need <code>vite</code> in your project.</p>
<p>As the ecosystem has evolved, we can now build much more complex projects than ever before. We have meta-frameworks for building full-stack websites, bundlers for transpiling and bundling code and dependencies, and so on. Node.js became a lot more than just running JavaScript code and packages on the server side.</p>
<p>I'd roughly categorize projects into three types:</p>
<ol>
<li><strong>Apps</strong>: Websites, Electron apps, mobile apps, etc. Here, <code>package.json</code> primarily keeps track of dependency information, and the app itself is never published to npm.</li>
<li><strong>Libraries</strong>: Packages designed to be published to npm, then installed and consumed by other projects.</li>
<li><strong>Internal</strong>: Packages used within monorepos that are never published.</li>
</ol>
<p>Fundamentally, the distinction between <code>dependencies</code> and <code>devDependencies</code> <strong>only</strong> truly makes sense for libraries intended for publication on npm. However, due to different scenarios and usage patterns, their meaning has extended far beyond the original purpose.</p>
<p>Tools often <strong>overload</strong> the meaning of <code>dependencies</code> and <code>devDependencies</code> to fit various scenarios, aiming for sensible defaults and better Developer Experience.</p>
<p>For example, <a href="https://vite.dev/"><code>Vite</code></a> treats <code>dependencies</code> as &quot;client-side packages&quot; and automatically runs pre-optimization on them. Build tools like <a href="https://github.com/egoist/tsup"><code>tsup</code></a>, <a href="https://github.com/unjs/unbuild"><code>unbuild</code></a>, and <a href="https://github.com/rolldown/tsdown"><code>tsdown</code></a> treat <code>dependencies</code> as packages to be externalized during bundling, automatically inlining (bundling) anything not listed in <code>dependencies</code>.</p>
<p>While these conventions certainly simplify things in most cases, they also force <code>dependencies</code> and <code>devDependencies</code> to wear multiple hats, making it harder to grasp the purpose of each package.</p>
<p>If we see <code>vue</code> listed in <code>devDependencies</code>, it could mean several things:</p>
<ul>
<li>We are inlining/bundling it.</li>
<li>We are only referencing its types.</li>
<li>We use it solely for testing.</li>
<li>We have it to enable IDE IntelliSense.</li>
<li>Or something else entirely.</li>
</ul>
<p>Simply classifying packages as <code>dependencies</code> or <code>devDependencies</code> doesn't provide the full picture of that package's purpose without external documentation (also note that <code>package.json</code> doesn't support comments).</p>
<h3>Categorize Your Dependencies</h3>
<p>Let's forget about <code>dependencies</code> and <code>devDependencies</code> for a moment: how might we categorize our dependencies? Here are some rough ideas I could come up with:</p>
<ul>
<li><code important-text-lime>test</code>: Packages used for testing (e.g., <code>vitest</code>, <code>playwright</code>, <code>msw</code>).</li>
<li><code important-text-purple>lint</code>: Packages for linting/formatting (e.g., <code>eslint</code>, <code>knip</code>).</li>
<li><code important-text-cyan>build</code>: Packages used for building the project (e.g., <code>vite</code>, <code>rolldown</code>).</li>
<li><code important-text-orange>script</code>: Packages used for scripting tasks (e.g., <code>tsx</code>, <code>tinyglobby</code>, <code>cpx</code>).</li>
<li><code important-text-green>frontend</code>: Packages for frontend development (e.g., <code>vue</code>, <code>pinia</code>).</li>
<li><code important-text-yellow>backend</code>: Packages for the backend server.</li>
<li><code important-text-blue>types</code>: Packages for type checking and definitions.</li>
<li><code important-text-amber>inlined</code>: Packages that are included directly in the final bundle.</li>
<li><code important-text-red>prod</code>: Runtime production dependencies.</li>
<li>...</li>
</ul>
<p>Categorization might differ between projects. But the point is that <code>dependencies</code> and <code>devDependencies</code> lack the flexibility to capture this level of detail.</p>
<p>This had been bothering me for a while, though it never felt like a critical problem needing immediate resolution. That changed when pnpm introduced <a href="https://pnpm.io/catalogs">catalogs</a>, opening up possibilities for dependency categorization we never had before.</p>
<h3>PNPM Catalogs</h3>
<p><a href="https://pnpm.io/catalogs">PNPM Catalogs</a> is a feature allowing monorepo workspaces to share dependency versions across different packages via a centralized management location.</p>
<p>Basically, you add <code>catalog</code> or <code>catalogs</code> fields to your <code>pnpm-workspace.yaml</code> file and reference them using <code>catalog:&lt;name&gt;</code> in your <code>package.json</code>.</p>
<pre><code class="language-yaml"># pnpm-workspace.yaml
catalog:
  vue: ^3.5.15
  pinia: ^2.2.6
  cac: ^6.7.14
</code></pre>
<pre><code class="language-json">// package.json
{
  &quot;dependencies&quot;: {
    &quot;vue&quot;: &quot;catalog:&quot;,
    &quot;pinia&quot;: &quot;catalog:&quot;,
    &quot;cac&quot;: &quot;catalog:&quot;
  }
}
</code></pre>
<p>Or with <a href="https://pnpm.io/catalogs#named-catalogs"><strong>named catalogs</strong></a>:</p>
<pre><code class="language-yaml"># pnpm-workspace.yaml
catalogs:
  frontend:
    vue: ^3.5.15
    # We locked the version for some reason, etc.
    pinia: 2.2.6
  prod:
    cac: ^6.7.14
</code></pre>
<pre><code class="language-json">// package.json
{
  &quot;dependencies&quot;: {
    &quot;vue&quot;: &quot;catalog:frontend&quot;,
    &quot;pinia&quot;: &quot;catalog:frontend&quot;,
    &quot;cac&quot;: &quot;catalog:prod&quot;
  }
}
</code></pre>
<p>During installation and publishing, pnpm automatically resolves dependencies to the versions specified in the catalogs. While the feature was originally designed for managing version consistency across monorepos, I found that <a href="https://pnpm.io/catalogs#named-catalogs">Named Catalogs</a> are also a great way to categorize dependencies. As shown above, we can place <code>vue</code> and <code>cac</code> in different catalogs even though both appear in <code>dependencies</code>. This information makes version upgrades easier and helps when reviewing dependency changes.</p>
<blockquote>
<p>A nice bonus: you can use comments in <code>pnpm-workspace.yaml</code> to share additional context with your team.</p>
</blockquote>
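<p>Conceptually, the resolution pnpm performs is a simple lookup. A simplified sketch (not pnpm's actual code; the catalog data is inlined here instead of being read from <code>pnpm-workspace.yaml</code>):</p>

```typescript
// Catalog data as it would appear in pnpm-workspace.yaml, inlined for the sketch.
const catalogs: Record<string, Record<string, string>> = {
  frontend: { vue: '^3.5.15', pinia: '2.2.6' },
  prod: { cac: '^6.7.14' },
}

// Replace a "catalog:<name>" specifier with the concrete version range.
// A bare "catalog:" refers to the default catalog.
function resolveCatalogSpec(pkg: string, spec: string): string {
  if (!spec.startsWith('catalog:'))
    return spec // a normal semver range, leave untouched
  const name = spec.slice('catalog:'.length) || 'default'
  const version = catalogs[name]?.[pkg]
  if (!version)
    throw new Error(`No entry for ${pkg} in catalog "${name}"`)
  return version
}
```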
<h3>Tooling Support</h3>
<p>Given that catalogs are still quite new, this shift requires better tooling support. A significant pain point for me on this was losing the ability to see a dependency's version at a glance in <code>package.json</code> when using <code>catalog:&lt;name&gt;</code>.</p>
<p>To address this, I created a VS Code extension, <a href="https://marketplace.visualstudio.com/items?itemName=antfu.pnpm-catalog-lens">PNPM Catalog Lens</a>, which displays the resolved version inline within <code>package.json</code>.</p>
<p><img src="https://zevs.gg/images/pnpm-catalogs-vscode.png" alt="Screenshot of the extension PNPM Catalog Lens"></p>
<p>It also adds distinct colors to each named category for easier identification. This gives us the categorization and centralized version control without significantly impacting DX.</p>
<p>Since versions move to <code>pnpm-workspace.yaml</code>, CLI tools need some integration work to support this. So far, we've adapted the following tools:</p>
<ul>
<li><a href="https://github.com/antfu/taze"><code>taze</code></a>: Checks and bumps dependency versions, now supporting reading and updating versions from catalogs.</li>
<li><a href="https://github.com/antfu/pnpm-workspace-utils/tree/main/packages/eslint-plugin-pnpm"><code>eslint-plugin-pnpm</code></a>: Enforces using catalogs for all dependencies in <code>package.json</code>, with auto-fixes.
<ul>
<li>If you use <a href="https://github.com/antfu/eslint-config"><code>@antfu/eslint-config</code></a>, enable this by setting <code>pnpm: true</code>.</li>
</ul>
</li>
<li><a href="https://github.com/antfu/pnpm-workspace-utils/tree/main/packages/pnpm-workspace-yaml"><code>pnpm-workspace-yaml</code></a>: A utility library for reading and writing <code>pnpm-workspace.yaml</code> while preserving comments and formatting.</li>
<li><a href="https://github.com/antfu/node-modules-inspector"><code>node-modules-inspector</code></a>: Visualizes your <code>node_modules</code>, now labeling dependencies with their catalog name for a better overview of their origin.</li>
<li><a href="https://github.com/antfu/nip"><code>nip</code></a>: Interactive CLI to install packages into catalogs.</li>
</ul>
<h3>Looking into the Future</h3>
<p>Currently, I see the value of categorizing dependencies mainly in better communication and easier version-upgrade reviews. However, as this convention gains wider adoption and tooling support improves, we could integrate this information more deeply with our tools.</p>
<p>For example, in Vite, we could gain more explicit control over dependency optimization, decoupling it from the <code>dependencies</code> and <code>devDependencies</code> fields:</p>
<pre><code class="language-ts">// vite.config.ts
import { readWorkspaceYaml } from 'pnpm-workspace-yaml'
import { defineConfig } from 'vite'

const yaml = await readWorkspaceYaml('pnpm-workspace.yaml') // pseudo-API

export default defineConfig({
  optimizeDeps: {
    include: Object.keys(yaml.catalogs.frontend)
  }
})
</code></pre>
<p>Similarly, for <a href="https://github.com/unjs/unbuild"><code>unbuild</code></a>, we could explicitly control externalization and inlining without manually maintaining lists in multiple places:</p>
<pre><code class="language-ts">// build.config.ts
import { readWorkspaceYaml } from 'pnpm-workspace-yaml'
import { defineBuildConfig } from 'unbuild'

const yaml = await readWorkspaceYaml('pnpm-workspace.yaml')

export default defineBuildConfig({
  externals: Object.keys(yaml.catalogs.prod),
  rollup: {
    inlineDependencies: Object.keys(yaml.catalogs.inlined)
  }
})
</code></pre>
<p>For linting or bundling, we could enforce rules based on catalogs, such as throwing errors when attempting to import backend packages into frontend code, preventing accidental bundling mistakes.</p>
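<p>As a toy sketch of what such a rule could look like (the catalog entries and path conventions here are illustrative; a real rule would read <code>pnpm-workspace.yaml</code> and hook into the linter):</p>

```typescript
// Hypothetical check: frontend code must not import backend-catalog packages.
const catalogs = {
  frontend: { vue: '^3.5.15', pinia: '2.2.6' },
  backend: { fastify: '^5.0.0' }, // illustrative entry
}

function checkImport(file: string, specifier: string): string | null {
  // Reduce an import specifier like "vue/dist/x" or "@scope/pkg/sub" to its package name.
  const pkg = specifier.startsWith('@')
    ? specifier.split('/').slice(0, 2).join('/')
    : specifier.split('/')[0]
  if (file.startsWith('src/client/') && pkg in catalogs.backend)
    return `Cannot import backend package "${pkg}" from frontend file ${file}`
  return null // allowed
}
```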
<p>This categorization could also provide valuable context for vulnerability reports. Vulnerabilities in build tools might be less severe than those in dependencies shipped to production.</p>
<p>...and so on.</p>
<p>I've already started migrating many of my projects to use named catalogs (<a href="https://github.com/antfu/node-modules-inspector"><code>node-modules-inspector</code></a>, for example). Even outside monorepos, the ability to categorize dependencies is a compelling reason to adopt pnpm catalogs. I consider this an exploratory phase where we're still discovering best practices and improving tooling support.</p>
<p>So, that's why I'm writing this post: to invite you to consider this approach and try it out. We'd love to hear your thoughts and how you would utilize it. I look forward to seeing more patterns like this emerge, helping us build more maintainable projects with a better DX. Thanks for reading!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[Async, Sync, in Between]]></title>
            <link>https://zevs.gg/posts/async-sync-in-between</link>
            <guid isPermaLink="true">https://zevs.gg/posts/async-sync-in-between</guid>
            <pubDate>Mon, 03 Mar 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[The coloring problem in modern programming, and a proposal of a new approach]]></description>
            <content:encoded><![CDATA[<h2>The Coloring Problem</h2>
<p>In modern programming, the function coloring problem isn't new. Based on how functions execute: <code important-text-blue>synchronous</code> (blocking) and <code important-text-rose>asynchronous</code> (non-blocking), we often classify them into two &quot;colors&quot; for better distinction. The problem arises because you generally cannot mix and match these colors freely.</p>
<p>For instance, in JavaScript:</p>
<ul>
<li>An <span text-rose>async</span> function can call both <span text-blue>sync</span> and other <span text-rose>async</span> functions.</li>
<li>A <span text-blue>sync</span> function, however, cannot directly call an <span text-rose>async</span> function without changing its own color to async.</li>
</ul>
<p>This restriction forces developers to propagate the &quot;color&quot; throughout their codebase. If a function deep in your logic needs to become async, every caller up the chain is forced to become async as well, leading to a cascading effect (sometimes called &quot;async infection&quot;). This makes refactoring harder, increases complexity, and often leads to awkward workarounds, such as blocking the thread to wait on an async result, or maintaining duplicated sync and async APIs.</p>
<figure>
  <QuansyncGraph1 />
  <figcaption text-center>
If `loadFile()` needs to be async, then all its callers upstream need to change to async too.
  </figcaption>
</figure>
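<p>The cascade in the figure can be sketched in a few lines (a toy illustration; the function bodies are placeholders):</p>

```typescript
// Suppose loadFile starts out synchronous:
//   function loadFile(path: string): string { ... }
// The moment it needs to await something, its signature changes:
async function loadFile(path: string): Promise<string> {
  // imagine a network request or fs.promises call here
  return `contents of ${path}`
}

// ...and every caller up the chain is forced to change color too:
async function loadConfig(): Promise<string> {
  return await loadFile('config.json')
}

async function main(): Promise<void> {
  console.log(await loadConfig())
}

main()
```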
<p>We often discuss the async inflection problem, where a common solution is to make everything async since async functions can call both sync and async functions, while the reverse is not true. However, the coloring problem actually goes both ways, which seems to be less frequently discussed:</p>
<p>While an async function requires all the <strong>callers</strong> to be async, a sync function also requires all the <strong>dependencies</strong> to be sync.</p>
<figure>
  <QuansyncGraph2 />
  <figcaption text-center>
If `parse()` needs to be sync, then all its dependencies down the road need to be sync too.
  </figcaption>
</figure>
<p>At its core, it's the same problem with different perspectives. It depends on which part of the code you're focusing on and how difficult it is to change its &quot;color.&quot; If the function you're working on <strong>must be async</strong>, the burden shifts to the callers. Conversely, if it <strong>must be sync</strong>, you'll need all your dependencies to be sync or provide a synchronous entry point.</p>
<h3>Libraries in Practice</h3>
<p>For example, the widely used library <a href="https://github.com/sindresorhus/find-up"><code>find-up</code></a> provides two main APIs, <code important-text-rose>findUp</code> and <code important-text-blue>findUpSync</code>, to avoid dependents being trapped by the coloring problem. If you look into the code, you'll find that the package essentially <a href="https://github.com/sindresorhus/find-up/blob/b733bb70d3aa21b22fa011be8089110d467c317f/index.js#L51">duplicates the logic twice</a> to provide the two APIs. Going down, you see its dependency <a href="https://github.com/sindresorhus/locate-path"><code>locate-path</code></a> also <a href="https://github.com/sindresorhus/locate-path/blob/355a681456d79a8506de11120d56b6e34a0389b5/index.js#L49">duplicates the <code important-text-rose>locatePath</code> and <code important-text-blue>locatePathSync</code> logic</a>.</p>
<p>Say you want to build another library that uses <code>findUp</code>, like <code>readNearestPkg</code>. You would also have to write the logic twice, using <code important-text-rose>findUp</code> and <code important-text-blue>findUpSync</code> separately, to support both async and sync usage.</p>
<p>In these cases, even if our main logic does not come with its own &quot;colors,&quot; the whole dependency pipeline is forced to branch into two colors due to an optional async operation down the road (e.g., <code important-text-rose>fs.promises.stat</code> and <code important-text-blue>fs.statSync</code>).</p>
<figure>
  <QuansyncGraph3 />
  <figcaption text-center>
Basically, we would maintain two branches of code to support both sync and async, with only a few sync utils that can be shared.
  </figcaption>
</figure>
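<p>The duplication is easy to reproduce in miniature. A toy sketch (not <code>find-up</code>'s actual code): the same upward traversal written twice, once per color, differing only in which <code>fs</code> call they use:</p>

```typescript
import fs from 'node:fs'
import path from 'node:path'

// Async flavor: walk up from `dir` looking for a file named `name`.
async function findUpToy(name: string, dir: string = process.cwd()): Promise<string | undefined> {
  const file = path.join(dir, name)
  try {
    await fs.promises.stat(file)
    return file
  }
  catch {}
  const parent = path.dirname(dir)
  return parent === dir ? undefined : findUpToy(name, parent)
}

// Sync flavor: the exact same traversal, rewritten for the other color.
function findUpToySync(name: string, dir: string = process.cwd()): string | undefined {
  const file = path.join(dir, name)
  try {
    fs.statSync(file)
    return file
  }
  catch {}
  const parent = path.dirname(dir)
  return parent === dir ? undefined : findUpToySync(name, parent)
}
```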
<h3>Async Plugins</h3>
<p>Another case demonstrating the coloring problem is a plugin system with async hooks. For example, imagine we are building a Markdown-to-HTML compiler with plugin support. Say the parser and compiler logic are synchronous; we could expose a sync API like:</p>
<pre><code class="language-ts">export function markdownToHtml(markdown) {
  const ast = parse(markdown)
  // ...
  return render(ast)
}
</code></pre>
<p>To make our library extensible, we might allow plugins to register hooks at multiple stages throughout the process, for example:</p>
<pre><code class="language-ts">export interface Plugin {
  preprocess: (markdown: string) =&gt; string
  transform: (ast: AST) =&gt; AST
  postprocess: (html: string) =&gt; string
}

export function markdownToHtml(markdown, plugins) {
  for (const plugin of plugins) {
    markdown = plugin.preprocess(markdown) // [!code hl]
  }
  let ast = parse(markdown)
  for (const plugin of plugins) {
    ast = plugin.transform(ast) // [!code hl]
  }
  let html = render(ast)
  for (const plugin of plugins) {
    html = plugin.postprocess(html) // [!code hl]
  }
  return html
}
</code></pre>
<p>Great, now we have a plugin system. However, having <code>markdownToHtml</code> as a synchronous function essentially limits all plugin hooks to be synchronous as well. This limitation can be quite restrictive. For instance, consider a plugin for syntax highlighting. In many cases, the best results for syntax highlighting might require asynchronous operations, such as fetching additional resources or performing complex computations that are better suited for non-blocking execution.</p>
<p>To accommodate such scenarios, we need to allow <span text-rose>async hooks</span> in our plugin system. This means that our main function, <code>markdownToHtml</code>, as the caller of the plugin hooks, must also become async. We could implement it like this:</p>
<pre><code class="language-ts">// [!code word:Promise]
// [!code word:async]
// [!code word:await]
export interface Plugin {
  preprocess: (markdown: string) =&gt; string | Promise&lt;string&gt;
  transform: (ast: AST) =&gt; AST | Promise&lt;AST&gt;
  postprocess: (html: string) =&gt; string | Promise&lt;string&gt;
}

export async function markdownToHtml(markdown, plugins) { // [!code hl]
  for (const plugin of plugins) {
    markdown = await plugin.preprocess(markdown) // [!code hl]
  }
  let ast = parse(markdown)
  for (const plugin of plugins) {
    ast = await plugin.transform(ast) // [!code hl]
  }
  let html = render(ast)
  for (const plugin of plugins) {
    html = await plugin.postprocess(html) // [!code hl]
  }
  return html
}
</code></pre>
<p>While this maximizes the flexibility of the plugin system, it also <strong>forces</strong> all users to handle the process <span text-rose>asynchronously</span>, even in cases where all plugins are synchronous. This is the cost of accommodating the possibility that some operations &quot;<b important-text-purple>might be asynchronous</b>&quot;. To manage this, we often end up duplicating the logic to offer both sync and async APIs, restricting async plugins to the async version only.</p>
<p>Such duplications lead to increased maintenance efforts, potential inconsistencies, and larger bundle sizes, which are not ideal for maintainers or users.</p>
<p>Is there a better way to handle this?</p>
<h2>Introducing Quansync</h2>
<p>What if we could make our logic decoupled from the coloring problem and let the caller decide the color?</p>
<p>Trying to make the situation a bit better, {@sxzz} and I took inspiration from <a href="https://github.com/loganfsmyth/gensync"><code>gensync</code></a> by {@loganfsmyth} and made a package called <a href="https://github.com/antfu-collective/quansync"><code important-text-purple>quansync</code></a>. Taking it even further, we are dreaming of leveraging this to create a paradigm shift in the way we write libraries in the JavaScript ecosystem.</p>
<p>The name <code important-text-purple>Quansync</code> is borrowed from <a href="https://en.wikipedia.org/wiki/Quantum_mechanics">Quantum Mechanics</a>, where particles can exist in multiple states simultaneously, known as <em>superposition</em>, and only settle into a single state when observed <span op50>(try hovering over the atom below)</span>.</p>
<AsyncSyncQuantum />
<p>You can think of <code important-text-purple>quansync</code> as a new type of function that can be used as both <code important-text-blue>sync</code> and <code important-text-rose>async</code> depending on the context. In many cases, our logic can escape the async inflection problem, especially when designing shared logic with optional async hooks.</p>
<figure>
  <QuansyncGraph4 />
  <figcaption text-center>
Try hovering over either the async or sync side.<br>
Quansync functions are in purple, which can adapt to either sync or async.
  </figcaption>
</figure>
<h3>Usage Examples</h3>
<p>Quansync provides a single API with two overloads.</p>
<h4>Wrapper API</h4>
<p>Wrapper allows you to create a quansync function by providing a sync and an async implementation. For example:</p>
<pre><code class="language-ts">import fs from 'node:fs'
import { quansync } from 'quansync'

export const readFile = quansync({
  sync: filepath =&gt; fs.readFileSync(filepath),
  async: filepath =&gt; fs.promises.readFile(filepath),
})
</code></pre>
<pre><code class="language-ts">const content1 = readFile.sync('package.json')
const content2 = await readFile.async('package.json')

// The quansync function itself can behave like a normal async function
const content3 = await readFile('package.json')
</code></pre>
<h4>Generator API</h4>
<p>Generator is where the magic happens. It allows you to create a <code>quansync</code> function by using other <code>quansync</code> functions. For example:</p>
<pre><code class="language-ts">import fs from 'node:fs'
import { quansync } from 'quansync'

export const readFile = quansync({
  sync: filepath =&gt; fs.readFileSync(filepath),
  async: filepath =&gt; fs.promises.readFile(filepath),
})

// Create a quansync with `function*` and `yield*`
// [!code word:function*:1]
export const readJSON = quansync(function* (filepath) {
  // Call the quansync function directly
  // and use `yield*` to get the result.
  // Upon usage, it will auto select the implementation
  // [!code word:yield*:1]
  const content = yield* readFile(filepath)
  return JSON.parse(content)
})
</code></pre>
<pre><code class="language-ts">// fs.readFileSync will be used under the hood
const pkg1 = readJSON.sync('package.json')
// fs.promises.readFile will be used under the hood
const pkg2 = await readJSON.async('package.json')
</code></pre>
<h3>Build-time Macros</h3>
<p>If the <code>function*</code> and <code>yield*</code> syntax scares you a bit, {@sxzz} also made a build-time macro <a href="https://github.com/quansync-dev/unplugin-quansync"><code>unplugin-quansync</code></a> allowing you to write normal <code>async</code>/<code>await</code> syntax, and it will be transformed to the corresponding <code>yield*</code> syntax at build time.</p>
<pre><code class="language-ts">// [!code word:quansync/macro]
import { quansync } from 'quansync/macro'

// Use async/await syntax
// They will be transformed to `function*` and `yield*` at build time
export const readJSON = quansync(async (filepath) =&gt; {
  const content = await readFile(filepath)
  return JSON.parse(content)
})

// Expose the classical sync API
export const readJSONSync = readJSON.sync
</code></pre>
<p>Thanks to <a href="https://github.com/unjs/unplugin"><code>unplugin</code></a>, it can work in almost any build tool, like compiling with <code>unbuild</code> or testing with <code>vitest</code>. Please refer to <a href="https://github.com/quansync-dev/unplugin-quansync">the docs</a> for more detailed setup.</p>
<h2>How does it Work?</h2>
<p><a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Generator">Generators</a> in JavaScript are a powerful yet often underutilized feature. To define a generator function, you use the <code>function*</code> syntax (note that arrow functions do not support generators). Inside a generator function, you can use the <code>yield</code> keyword to pause execution and return a value. This effectively splits your logic into multiple &quot;chunks,&quot; allowing the caller to control when to execute the next chunk.</p>
<p>By leveraging this behavior, we can pause execution at each <code>yield</code> point. In an asynchronous context, we can wait for the async operation to complete before resuming execution. In a synchronous context, the next chunk runs immediately. This approach offloads the coloring problem to the caller, allowing them to decide whether the function should run synchronously or asynchronously.</p>
<p>In fact, during the early days of JavaScript, before the <code>async</code> and <code>await</code> keywords were widely adopted, Babel used generators and <code>yield</code> to polyfill async behavior. While this technique isn't new, we believe it has significant potential to improve how we handle the coloring problem, especially in library design.</p>
<h2>When not to Use?</h2>
<p>Frankly, I wish that most of the time you wouldn't even need to think about it. High-level tools should support async entry points for most cases, where choosing between <code>sync</code> and <code>async</code> is not a problem. However, there are still many cases where the surrounding context requires a specific color. In those cases, <code>quansync</code> could be a good fit for progressive and gradual adoption.</p>
<p>A Promise in JavaScript naturally schedules a <a href="https://javascript.info/event-loop">microtask</a> that delays execution by a tick. <code>yield</code> also introduces a certain overhead (<a href="https://github.com/quansync-dev/quansync#benchmark">around <code>~120ns</code> on an M1 Max</a>). In performance-sensitive scenarios, you might want to avoid using either <code>async</code> or <code>quansync</code>.</p>
<h2>Coloring Problem Revisited</h2>
<p>While <code>quansync</code> doesn't completely solve the coloring problem, it provides a new perspective that simplifies managing synchronous and asynchronous code. Quansync introduces a new <span text-purple>&quot;purple&quot;</span> color, blending the red and blue. Quansync functions still face the coloring problem, as wrapping a function to support both sync and async requires it to be a quansync function (or generator). However, the key advantage is that a quansync function can be <a href="https://en.wikipedia.org/wiki/Wave_function_collapse">&quot;collapsed&quot;</a> to either sync or async as needed. This allows your &quot;colorless&quot; logic to avoid the red and blue color inflection caused by some operations that might have a color.</p>
<h2>Conclusion</h2>
<p>This is a new approach to tackling the coloring problem we are still exploring. We will slowly roll out <code>quansync</code> in our libraries and see how it improves our experience and the ecosystem. We are also looking for feedback and contributions, so feel free to join us in the <a href="https://chat.antfu.me">Discord</a> or <a href="https://github.com/quansync-dev/quansync/discussions">GitHub Discussions</a> to share your thoughts.</p>
<ul>
<li>
<GitHubLink repo="quansync-dev/quansync" />
</li>
<li>
<GitHubLink repo="quansync-dev/unplugin-quansync" />
</li>
</ul>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[Move on to ESM-only]]></title>
            <link>https://zevs.gg/posts/move-on-to-esm-only</link>
            <guid isPermaLink="true">https://zevs.gg/posts/move-on-to-esm-only</guid>
            <pubDate>Wed, 05 Feb 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Let's move on to ESM-only]]></description>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<p>Three years ago, I wrote a post about <a href="/posts/publish-esm-and-cjs">shipping ESM &amp; CJS in a single package</a>, advocating for dual CJS/ESM formats to ease user migration and trying to make the best of both worlds. Back then, I didn't fully agree with <a href="https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c">aggressively shipping ESM-only</a>, as I considered the ecosystem wasn't ready, especially since the push was mostly from low-level libraries. Over time, as tools and the ecosystem have evolved, my perspective has gradually shifted towards adopting ESM-only.</p>
<p>As of 2025, a decade has passed since ESM was first introduced in 2015. Modern tools and libraries have increasingly adopted ESM as the primary module format. According to {@wooorm}'s <a href="https://github.com/wooorm/npm-esm-vs-cjs">script</a>, the share of packages on npm shipping ESM was <strong>7.8%</strong> in 2021, and by the end of 2024 it had reached <a href="https://github.com/wooorm/npm-esm-vs-cjs"><strong>25.8%</strong></a>. Although a significant portion of packages still use CJS, the trend clearly shows a steady shift towards ESM.</p>
<figure>
  <img src="https://zevs.gg/images/npm-esm-vs-cjs-2024.svg" dark:filter-invert />
  <figcaption text-center>ESM adoption over time, generated by the <code>npm-esm-vs-cjs</code> script. Last updated at 2024-11-27</figcaption>
</figure>
<p>Here in this post, I'd like to share my thoughts on the current state of the ecosystem and why I believe it's time to move on to ESM-only.</p>
<h2>The Tooling is Ready</h2>
<h3>Modern Tools</h3>
<p>With the rise of <a href="https://vite.dev">Vite</a> as a popular modern frontend build tool, many meta-frameworks like <a href="https://nuxtjs.org">Nuxt</a>, <a href="https://kit.svelte.dev">SvelteKit</a>, <a href="https://astro.build">Astro</a>, <a href="https://solidstart.dev">SolidStart</a>, <a href="https://remix.run">Remix</a>, <a href="https://storybook.js.org">Storybook</a>, <a href="https://redwoodjs.com">Redwood</a>, and many others are nowadays built on top of Vite, <strong>treating ESM as a first-class citizen</strong>.</p>
<p>As a complement, we also have the testing library <a href="https://vitest.dev">Vitest</a>, which was designed for ESM from day one, with powerful module mocking capabilities and efficient fine-grained caching support.</p>
<p>CLI tools like <a href="https://github.com/privatenumber/tsx"><code>tsx</code></a> and <a href="https://github.com/unjs/jiti"><code>jiti</code></a> offer a seamless experience for running TypeScript and ESM code without requiring additional configuration. This simplifies the development process and reduces the overhead associated with setting up a project to use ESM.</p>
<p>Other tools are catching up as well: <a href="https://eslint.org">ESLint</a>, in its recent v9.0, introduced a new flat config system that enables native ESM support with <code>eslint.config.mjs</code>, even in CJS projects.</p>
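<p>For instance, a minimal flat config in native ESM might look like this (a hypothetical example):</p>

```js
// eslint.config.mjs — loaded as native ESM, even from a CJS project
export default [
  {
    rules: {
      'no-unused-vars': 'warn',
    },
  },
]
```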
<h3>Top-Down &amp; Bottom-Up</h3>
<p>Back in 2021, when {@sindresorhus} first started migrating all his packages, for example <code>find-up</code> and <code>execa</code>, to ESM-only, it was a bold move. I consider this a <strong>bottom-up</strong> approach, as those packages are rather low-level and many of their dependents were not ready for ESM yet. I was worried that this would force those dependents to stay on old versions of the packages, which might fragment the ecosystem. (As of today, I actually appreciate that the move brought us quite a lot of high-quality ESM packages, even though the process wasn't super smooth.)</p>
<p>It's way easier for an ESM or Dual formats package to depend on CJS packages, but not the other way around. In terms of smooth adoption, I believe the <strong>top-down</strong> approach is more effective in pushing the ecosystem forward. With the support of high-level frameworks and tools from top-down, it's no longer a significant obstacle to use ESM-only packages. The remaining challenges in terms of ESM adoption primarily lie with package authors needing to migrate and ship their code in ESM format.</p>
<h3>Requiring ESM in Node.js</h3>
<p>The <a href="https://joyeecheung.github.io/blog/2024/03/18/require-esm-in-node-js/">capability to <code>require()</code> ESM modules</a> in Node.js, <a href="https://github.com/nodejs/node/pull/51977">initiated</a> by {@joyeecheung}, marks an <strong>incredible milestone</strong>. This feature allows packages to be published as ESM-only while still being consumable by CJS codebases with minimal modifications. It helps avoid the <a href="/posts/async-sync-in-between">async infection</a> (also known as <a href="https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/">Red Functions</a>) introduced by dynamic <code>import()</code> ESM, which can be pretty hard, if not impossible in some cases, to migrate and adapt.</p>
<p>This feature was recently <a href="https://github.com/nodejs/node/pull/55085">unflagged</a> and <a href="https://github.com/nodejs/node/pull/55217">backported to Node.js v22</a> (<a href="https://github.com/nodejs/node/pull/56927">and soon v20</a>), which means it should be available to many developers already. Considering the <a href="#top-down--bottom-up">top-down or bottom-up</a> metaphor, this feature actually makes it possible to start the ESM migration from the <strong>middle out</strong>, as it allows import chains like <code>ESM → CJS → ESM → CJS</code> to work seamlessly.</p>
<p>To solve the interop issue between CJS and ESM in this case, <a href="https://nodejs.org/api/modules.html#loading-ecmascript-modules-using-require">Node.js also introduced</a> a new <code>export { Foo as 'module.exports' }</code> syntax in ESM to export CJS-compatible exports (by <a href="https://github.com/nodejs/node/pull/54563">this PR</a>). This allows package authors to publish ESM-only packages while still supporting CJS consumers, without even introducing breaking changes (except for changing the required Node.js version).</p>
<p>For more details on the progress and discussions around this feature, follow <a href="https://github.com/nodejs/node/issues/52697">this issue</a>.</p>
<h2>The Troubles with Dual Formats</h2>
<p>While dual CJS/ESM packages have been a quite helpful transition mechanism, they come with their own set of challenges. Maintaining two separate formats can be cumbersome and error-prone, especially when dealing with complex codebases. Here are some of the issues that arise when maintaining dual formats:</p>
<h3>Interop Issues</h3>
<p>Fundamentally, CJS and ESM are different module systems with distinct design philosophies. Although Node.js has made it possible to import CJS modules in ESM, dynamically import ESM in CJS, and even <code>require()</code> ESM modules, there are still many tricky cases that can lead to interop issues.</p>
<p>One key difference is that CJS typically uses a single <code>module.exports</code> object, while ESM supports both default and named exports. When authoring code in ESM and transpiling to CJS, handling exports can be particularly challenging, especially when the exported value is a non-object, such as a function or a class. Additionally, to make the types correct, we also need to introduce further complications with <code>.d.mts</code> and <code>.d.cts</code> declaration files. And so on...</p>
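<p>To give a sense of the overhead, a dual-format package typically ends up with a conditional <code>exports</code> map like this (a hypothetical example following Node.js' conditional exports conventions):</p>

```json
{
  "name": "my-lib",
  "type": "module",
  "exports": {
    ".": {
      "import": {
        "types": "./dist/index.d.mts",
        "default": "./dist/index.mjs"
      },
      "require": {
        "types": "./dist/index.d.cts",
        "default": "./dist/index.cjs"
      }
    }
  }
}
```

<p>Every entry point is duplicated four ways, and each pair of artifacts must be kept in sync.</p>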
<p>While trying to explain this problem in more depth, I realized that I actually wish you never needed to be bothered by it at all. It's frankly too complicated and frustrating, even for users of packages, let alone the package authors who have to worry about it. This is one of the reasons I advocate for the entire ecosystem to transition to ESM: to leave these problems behind and spare everyone this unnecessary hassle.</p>
<h3>Dependency Resolution</h3>
<p>When a package has both CJS and ESM formats, the resolution of dependencies can become convoluted. For example, if a package depends on another package that only ships ESM, the consumer must ensure that the ESM version is used. This can lead to version conflicts and dependency resolution issues, especially when dealing with transitive dependencies.</p>
<p>Also, for packages that are designed to be used as singletons, this might introduce multiple copies of the same package and cause unexpected behavior.</p>
<h3>Package Size</h3>
<p>Shipping dual formats essentially doubles the package size, as both CJS and ESM bundles need to be included. While a few extra kilobytes might not seem significant for a single package, the overhead can quickly add up in projects with hundreds of dependencies, leading to the infamous node_modules bloat. Therefore, package authors should keep an eye on their package size. Moving to ESM-only is a way to optimize it, especially if the package doesn't have strong requirements on CJS.</p>
<h2>When Should We Move to ESM-only?</h2>
<p>This post does not intend to diminish the value of dual-format publishing. Instead, I want to encourage evaluating the current state of the ecosystem and the potential benefits of transitioning to ESM-only.</p>
<p>There are several factors to consider when deciding whether to move to ESM-only:</p>
<h3>New Packages</h3>
<p>I strongly recommend that <strong>all new packages</strong> be released as ESM-only, as there are no legacy dependencies to consider. New adopters are likely already using a modern, ESM-ready stack, so being ESM-only should not affect adoption. Additionally, maintaining a single module system simplifies development, reduces maintenance overhead, and ensures that your package benefits from future ecosystem advancements.</p>
<h3>Browser-targeted Packages</h3>
<p>If a package is primarily targeted for the browser, it makes total sense to ship ESM-only. In most cases, browser packages go through a bundler, where ESM provides significant advantages in static analysis and tree-shaking. This leads to smaller and more optimized bundles, which would also improve loading performance and reduce bandwidth consumption for end users.</p>
<h3>Standalone CLI</h3>
<p>For a standalone CLI tool, it makes no difference to end users whether it's ESM or CJS. However, using ESM enables your dependencies to also be ESM, facilitating the ecosystem's transition to ESM with a <a href="#top-down--bottom-up">top-down approach</a>.</p>
<h3>Node.js Support</h3>
<p>If a package is targeting the evergreen Node.js versions, it's a good time to consider ESM-only, especially with the recent <a href="#requiring-esm-in-nodejs"><code>require(ESM)</code> support</a>.</p>
<h3>Know Your Consumers</h3>
<p>If a package already has users, it's essential to understand those dependents' status and requirements. For example, take an ESLint plugin or utility that requires ESLint v9: since ESLint v9's new config system supports ESM natively even in CJS projects, there is no blocker for it to be ESM-only.</p>
<p>Definitely, there are different factors to consider for different projects. But in general, I believe the ecosystem is ready for more packages to move to ESM-only, and it's a good time to evaluate the benefits and potential challenges of transitioning.</p>
<h2>How Far Are We?</h2>
<p>The transition to ESM is a gradual process that requires collaboration and effort from the entire ecosystem, and I believe we are on a good track moving forward.</p>
<p>To improve the transparency and visibility of ESM adoption, I recently built a visual tool called <a href="https://github.com/antfu/node-modules-inspector">Node Modules Inspector</a> for analyzing your project's dependencies. It provides insights into the ESM adoption status of your dependencies and helps identify potential issues when migrating to ESM.</p>
<p>Here are some screenshots of the tool to give you a quick impression:</p>
<figure>
  <img src="/images/node-modules-inspector-1.png" scale-110 />
  <figcaption text-center>Node Modules Inspector - Overview</figcaption>
</figure>
<figure>
  <img src="/images/node-modules-inspector-2.png" scale-110 />
  <figcaption text-center>Node Modules Inspector - Dependency Graph</figcaption>
</figure>
<figure>
  <img src="/images/node-modules-inspector-3.png" scale-110 />
  <figcaption text-center>Node Modules Inspector - Reports like ESM Adoptions and Duplicated Packages</figcaption>
</figure>
<p>This tool is still in its early stages, but I hope it will be a valuable resource for package authors and maintainers to track the ESM adoption progress of their dependencies and make informed decisions about transitioning to ESM-only.</p>
<p>To learn more about how to use it and inspect your projects, check the repository <GitHubLink repo="antfu/node-modules-inspector" />.</p>
<h2>Moving Forward</h2>
<p>I am planning to gradually transition the packages I maintain to ESM-only and take a closer look at the dependencies we rely on. We also have plenty of exciting ideas for the Node Modules Inspector, aiming to provide more useful insights and help find the best path forward.</p>
<p>I look forward to a more portable, resilient, and optimized JavaScript/TypeScript ecosystem.</p>
<p>I hope this post has shed some light on the benefits of moving to ESM-only and the current state of the ecosystem. If you have any thoughts or questions, feel free to reach out using the links below. Thank you for reading!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
        <item>
            <title><![CDATA[Epoch Semantic Versioning]]></title>
            <link>https://zevs.gg/posts/epoch-semver</link>
            <guid isPermaLink="true">https://zevs.gg/posts/epoch-semver</guid>
            <pubDate>Tue, 07 Jan 2025 12:00:00 GMT</pubDate>
            <description><![CDATA[Proposal for an extended Semantic Versioning called Epoch SemVer to provide more granular versioning information to users.]]></description>
<content:encoded><![CDATA[<p>If you've been following my work in open source, you might have noticed that I have a tendency to stick with zero-major versioning, like <code>v0.x.x</code>. For instance, as of writing this post, the latest version of UnoCSS is <a href="https://github.com/unocss/unocss/releases/tag/v0.65.3"><code>v0.65.3</code></a>, Slidev is <a href="https://github.com/slidevjs/slidev/releases/tag/v0.50.0"><code>v0.50.0</code></a>, and <code>unplugin-vue-components</code> is <a href="https://github.com/unplugin/unplugin-vue-components/releases/tag/v0.28.0"><code>v0.28.0</code></a>. Other projects, such as React Native, on <a href="https://github.com/facebook/react-native/releases/tag/v0.76.5"><code>v0.76.5</code></a>, and sharp, on <a href="https://github.com/lovell/sharp/releases/tag/v0.33.5"><code>v0.33.5</code></a>, also follow this pattern.</p>
<p>People often assume that a zero-major version indicates that the software is not ready for production. However, all of the projects mentioned here are quite stable and production-ready, used by millions of projects.</p>
<p><strong>Why?</strong> - I bet that's your question reading this.</p>
<h2>Versioning</h2>
<p>Version numbers act as snapshots of our codebase, helping us communicate changes effectively. For instance, we can say &quot;it works in v1.3.2, but not in v1.3.3, there might be a regression.&quot; This makes it easier for maintainers to locate bugs by comparing the differences between these versions. A version is essentially a marker, a seal of the codebase at a specific point in time.</p>
<p>However, code is complex, and every change involves trade-offs. Describing how a change affects the code can be tricky even with natural language. A version number alone can't capture all the nuances of a release. That's why we have changelogs, release notes, and commit messages to provide more context.</p>
<p>I see versioning as a way to communicate changes to users — a <strong>contract</strong> between the library maintainers and the users to ensure compatibility and stability during upgrades. As a user, you can't always tell what's changed between <code>v2.3.4</code> and <code>v2.3.5</code> without checking the changelog. But by looking at the numbers, you can infer that it's a patch release meant to fix bugs, which <strong>should</strong> be safe to upgrade. This ability to understand changes just by looking at the version number is possible because both the library maintainer and the users agree on the versioning scheme.</p>
<p>Since versioning is only a contract, and could be interpreted differently to each specific project, you shouldn't blindly trust it. It serves as an indication to help you decide when to take a closer look at the changelog and be cautious about upgrading. But it's not a guarantee that everything will work as expected, as every change might introduce behavior changes, whether it's intended or not.</p>
<h2>Semantic Versioning</h2>
<p>In the JavaScript ecosystem, especially for packages published on npm, we follow a convention known as <a href="https://semver.org/">Semantic Versioning</a>, or SemVer for short. A SemVer version number consists of three parts: <code>MAJOR.MINOR.PATCH</code>. The rules are straightforward:</p>
<ul>
<li><span font-bold font-mono text-amber>MAJOR</span>: Increment when you make incompatible API changes.</li>
<li><span font-bold font-mono text-lime>MINOR</span>: Increment when you add functionality in a backwards-compatible manner.</li>
<li><span font-bold font-mono text-blue>PATCH</span>: Increment when you make backwards-compatible bug fixes.</li>
</ul>
<p>Package managers we use, like <code>npm</code>, <code>pnpm</code>, and <code>yarn</code>, all operate under the assumption that every package on npm adheres to SemVer. When you or a package specifies a dependency with a version range, such as <code>^1.2.3</code>, it indicates that you are comfortable with upgrading to any version that shares the same major version (<code>1.x.x</code>). In these scenarios, package managers will automatically determine the best version to install based on what is most suitable for your specific project.</p>
<p>This convention works well technically. If a package releases a new major version <code>v2.0.0</code>, your package manager won't install it if your specified range is <code>^1.2.3</code>. This prevents unexpected breaking changes from affecting your project until you manually update the version range.</p>
<p>However, humans perceive numbers on a logarithmic scale. We tend to see <code>v2.0</code> to <code>v3.0</code> as a huge, groundbreaking change, while <code>v125.0</code> to <code>v126.0</code> seems a lot more trivial, even though both indicate incompatible API changes in SemVer. This perception can make maintainers hesitant to bump the major version for minor breaking changes, leading to the accumulation of many breaking changes in a single major release, making upgrades harder for users. Conversely, with something like <code>v125.0</code>, it becomes difficult to convey the significance of a major change, as the jump to <code>v126.0</code> appears minor.</p>
<blockquote>
<p>{@TkDodo|Dominik Dorfmeister} had <a href="https://tkdodo.eu/blog/react-query-api-design-lessons-learned">a great talk about API Design</a>, which mentions an interesting inequality describing this: <a href="https://tkdodo.eu/blog/react-query-api-design-lessons-learned?page=30">&quot;Breaking Changes !== Marketing Event&quot;</a></p>
</blockquote>
<h2>Progressive</h2>
<p>I am a strong believer in the principle of progressiveness. Rather than making a giant leap to a significantly higher stage all at once, progressiveness allows users to adopt changes gradually at their own pace. It provides opportunities to pause and assess, making it easier to understand the impact of each change.</p>
<figure text-center>
  <img src="https://zevs.gg/images/epoch-semver-progressive-1.png" alt="Progressive as Stairs" border="~ base rounded-xl">
  <figcaption>Progressive as Stairs - a screenshot of my talk <a italic font-serif href="/talks#the-progressive-path" target="_blank">The Progressive Path</a></figcaption>
</figure>
<p>I believe we should apply the same principle to versioning. Instead of treating a major version as a massive overhaul, we can break it down into smaller, more manageable updates. For example, rather than releasing <code>v2.0.0</code> with 10 breaking changes from <code>v1.x</code>, we could distribute these changes across several smaller major releases. This way, we might release <code>v2.0</code> with 2 breaking changes, followed by <code>v3.0</code> with 1 breaking change, and so on. This approach makes it easier for users to adopt changes gradually and reduces the risk of overwhelming them with too many changes at once.</p>
<figure text-center>
  <img src="/images/epoch-semver-progressive-2.png" alt="Progressive on Breaking Changes" border="~ base rounded-xl">
  <figcaption>Progressive on Breaking Changes - a screenshot of my talk <a italic font-serif href="/talks#the-progressive-path" target="_blank">The Progressive Path</a></figcaption>
</figure>
<h2>Leading Zero Major Versioning</h2>
<p>The reason I've stuck with <code>v0.x.x</code> is my own unconventional approach to versioning. I prefer to introduce necessary and minor breaking changes early on, making upgrades easier, without causing alarm that typically comes with major version jumps like <code>v2</code> to <code>v3</code>. Some changes might be &quot;technically&quot; breaking but don't impact 99.9% of users in practice. (Breaking changes are relative. Even a bug fix can be breaking for those relying on the previous behavior, but that's another topic for discussion :P).</p>
<p>There's a special rule in SemVer that states <strong>when the leading major version is <code>0</code>, every minor version bump is considered breaking</strong>. I am kind of <strong>abusing</strong> that rule to work around the limitation of SemVer. With zero-major versioning, we are effectively abandoning the first number, merging <code>MINOR</code> and <code>PATCH</code> into a single number (thanks to <a href="https://x.com/ssalbdivad">David Blass</a> for pointing <a href="https://x.com/ssalbdivad/status/1876614090623431116">this</a> out):</p>
<div py4>
  <code important="text-xl text-gray"><span line-through>ZERO</span>.<span font-bold text-amber>MAJOR</span>.{<span font-bold text-lime>MINOR</span> + <span font-bold text-blue>PATCH</span>}</code>
</div>
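<p>To see that special rule in action, here is a minimal JavaScript sketch of npm-style caret-range semantics (the function name and the three-element array format are my own, for illustration). Under a zero major, the caret pins the minor number, so every minor bump falls out of range, i.e. behaves like a major release:</p>

```javascript
// Caret-range semantics, as npm's semver defines them:
//   ^1.2.3 allows >=1.2.3 <2.0.0  (minor + patch updates)
//   ^0.2.3 allows >=0.2.3 <0.3.0  (patch updates only)
//   ^0.0.3 allows >=0.0.3 <0.0.4  (nothing but that exact version)
function caretAllows([maj, min, pat], [cMaj, cMin, cPat]) {
  // The candidate must be at least the base version...
  const notLower =
    cMaj > maj ||
    (cMaj === maj && (cMin > min || (cMin === min && cPat >= pat)));
  if (!notLower) return false;
  // ...and the first non-zero component of the base is pinned.
  if (maj > 0) return cMaj === maj;
  if (min > 0) return cMaj === 0 && cMin === min;
  return cMaj === 0 && cMin === 0 && cPat === pat;
}

// With a non-zero major, a minor bump is still in range...
console.log(caretAllows([1, 2, 3], [1, 3, 0])); // true
// ...but with a zero major, the same bump is treated as breaking.
console.log(caretAllows([0, 65, 3], [0, 66, 0])); // false
console.log(caretAllows([0, 65, 3], [0, 65, 4])); // true
```

This is exactly why the zero-major scheme works: tooling already treats the second number as the &quot;real&quot; major.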
<blockquote>
<p>Of course, zero-major versioning is not the only way to be progressive. Tools like <a href="https://nodejs.org/en">Node.js</a>, <a href="https://vite.dev/">Vite</a>, and <a href="https://vitest.dev/">Vitest</a> roll out major versions at consistent intervals, each with a minimal set of breaking changes that are easy to adopt. That requires a lot of effort and extra attention. Kudos to them!</p>
</blockquote>
<p>I have to admit that sticking to <strong>zero-major versioning isn't the best practice.</strong> While I aimed for more granular versioning to improve communication, using zero-major versioning has actually limited the ability to convey changes effectively. In reality, I've been wasting a valuable part of the versioning scheme due to my peculiar insistence.</p>
<p>Thus, here, I am proposing a change.</p>
<h2>Epoch Semantic Versioning</h2>
<p><a href="https://x.com/antfu7/status/1679184417930059777">In an ideal world, I would wish for SemVer to have 4 numbers: <code>EPOCH.MAJOR.MINOR.PATCH</code></a>. The <code>EPOCH</code> version would be for big announcements, while <code>MAJOR</code> would be for technically incompatible API changes that might not be significant. This would give us a more granular way to communicate changes. Similarly, there is <a href="https://github.com/romversioning/romver">Romantic Versioning, which proposes <code>HUMAN.MAJOR.MINOR</code></a>. The creator of SemVer, <a href="https://tom.preston-werner.com/">Tom Preston-Werner</a>, also <a href="https://tom.preston-werner.com/2022/05/23/major-version-numbers-are-not-sacred">raised similar concerns and solutions in this blog post</a> (thanks to <a href="https://x.com/sebastienlorber">Sébastien Lorber</a> for pointing <a href="https://x.com/sebastienlorber/status/1879127128530460856">this</a> out).</p>
<p>But, of course, it's too late for the entire ecosystem to adopt a new versioning scheme.</p>
<p>If we can't change SemVer, maybe we can at least extend it. I am proposing a new versioning scheme called <strong>🗿 Epoch Semantic Versioning</strong>, or Epoch SemVer for short. It builds on the <code>MAJOR.MINOR.PATCH</code> structure, extending the first number to be a combination of <code>EPOCH</code> and <code>MAJOR</code>. To keep them distinct, <code>EPOCH</code> occupies the thousands place, which gives <code>MAJOR</code> a range from 0 to 999. This way, it follows exactly the same rules as SemVer <strong>without requiring any existing tools to change, while providing more granular information to users</strong>.</p>
<blockquote>
<p>The name &quot;Epoch&quot; is inspired by <a href="https://manpages.debian.org/stretch/dpkg-dev/deb-version.5.en.html">Debian's versioning scheme</a>.</p>
</blockquote>
<p>The format is as follows:</p>
<div py4>
  <code important="text-xl text-gray">{<span font-bold text-violet>EPOCH</span> * 1000 + <span font-bold text-amber>MAJOR</span>}.<span font-bold text-lime>MINOR</span>.<span font-bold text-blue>PATCH</span></code>
</div>
<ul>
<li><span font-bold font-mono text-violet>EPOCH</span>: Increment when you make significant or groundbreaking changes.</li>
<li><span font-bold font-mono text-amber>MAJOR</span>: Increment when you make minor incompatible API changes.</li>
<li><span font-bold font-mono text-lime>MINOR</span>: Increment when you add functionality in a backwards-compatible manner.</li>
<li><span font-bold font-mono text-blue>PATCH</span>: Increment when you make backwards-compatible bug fixes.</li>
</ul>
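<p>The encoding can be sketched in a few lines of JavaScript (the helper names are mine, and the <code>1000</code> multiplier follows the formula above):</p>

```javascript
const EPOCH_MULTIPLIER = 1000;

// Pack EPOCH and MAJOR into the leading SemVer number.
function toEpochSemver(epoch, major, minor, patch) {
  return `${epoch * EPOCH_MULTIPLIER + major}.${minor}.${patch}`;
}

// Split a version string back into its four components.
function fromEpochSemver(version) {
  const [first, minor, patch] = version.split(".").map(Number);
  return {
    epoch: Math.floor(first / EPOCH_MULTIPLIER),
    major: first % EPOCH_MULTIPLIER,
    minor,
    patch,
  };
}

console.log(toEpochSemver(0, 66, 0, 0)); // "66.0.0"
console.log(toEpochSemver(1, 0, 0, 0)); // "1000.0.0"
console.log(fromEpochSemver("2005.3.1")); // { epoch: 2, major: 5, minor: 3, patch: 1 }
```

Existing tools never see the split; to them, <code>2005.3.1</code> is just an ordinary SemVer version with a large major number.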
<blockquote>
<p>I previously proposed an EPOCH multiplier of <code>100</code>, but according to community feedback, <code>1000</code> is the more popular choice, as it gives more room for the <code>MAJOR</code> version and makes the numbers easier to tell apart. The multiplier is not a strict rule; feel free to adjust it based on your needs.</p>
</blockquote>
<p>For example, UnoCSS would transition from <code>v0.65.3</code> to <code>v65.3.0</code> (with <code>EPOCH</code> being <code>0</code>). Following SemVer, a patch release would become <code>v65.3.1</code>, and a feature release would be <code>v65.4.0</code>. If we introduced minor incompatible changes affecting an edge case, we could bump it to <code>v66.0.0</code> to alert users of potential impacts. In the event of a significant overhaul to the core, we could jump directly to <code>v1000.0.0</code> to signal a new era and make a big announcement. I'd suggest assigning a code name to each non-zero <code>EPOCH</code> to make it more memorable and easier to refer to. This approach gives maintainers more flexibility to communicate the scale of changes to users effectively.</p>
<blockquote>
<p>[!TIP]<br>
We shouldn't need to bump <code>EPOCH</code> often. It's mostly useful for high-level, end-user-facing libraries or frameworks. Low-level libraries might <strong>never</strong> need to bump <code>EPOCH</code> at all (<code>ZERO-EPOCH</code> is essentially the same as SemVer).</p>
</blockquote>
<p>Of course, I'm not suggesting that everyone should adopt this approach. It's simply an idea to work around the existing system, and only for those packages with this need. It will be interesting to see how it performs in practice.</p>
<h2>Moving Forward</h2>
<p>I plan to adopt Epoch Semantic Versioning in my projects, including UnoCSS, Slidev, and all the plugins I maintain, and ultimately abandon zero-major versioning for stable packages. I hope this new versioning approach will help communicate changes more effectively and provide users with better context when upgrading.</p>
<p>I'd love to hear your thoughts and feedback on this idea. Feel free to share your comments using the links below!</p>
]]></content:encoded>
            <author>hi@zevs.gg (Zevs)</author>
        </item>
    </channel>
</rss>