DragonRster`s Void
Guestbook Architecture

No database. No backend framework. Just a txt file and a CGI script.


Everything on this site is hand-written static HTML -- except the guestbook. So how do new messages appear in the sidebar?

The short answer: they don't appear in real time. Each message is written to a plain text file, and the next time the site rebuilds, it gets baked into the static HTML.

The Pipeline


  [User fills form]  ->  [CGI script receives]  ->  [Write to guestbook.txt]
         |
  [Trigger rebuild]  ->  [build.ps1 reads data]  ->  [Inject into sidebar HTML]
         |
  [Deploy new pages]  ->  [User sees the message]
                

The guestbook form submits to /cgi-bin/guestbook.py -- a Python CGI script that parses POST data, sanitizes input, and appends a pipe-delimited line to data/guestbook.txt.[1]
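The append step at the end of that pipeline can be sketched roughly like this. It is an illustration, not the actual guestbook.py: append_record is a hypothetical helper, the fields are assumed to be sanitized already, and the timestamp format and the 0/1 values for show_ip are guesses.

```python
import os
import tempfile

def append_record(path, fields):
    """Append one pipe-delimited record to the guestbook file (hypothetical helper)."""
    line = '|'.join(fields)
    # Append mode creates the file if it is missing and never rewrites older lines.
    with open(path, 'a', encoding='utf-8', newline='\n') as f:
        f.write(line + '\n')
    return line

# Demo against a throwaway file rather than the real data/guestbook.txt.
demo_path = os.path.join(tempfile.mkdtemp(), 'guestbook.txt')
append_record(demo_path, ['Ann', '', 'Hello', 'hidden', '2026-04-28 22:19:55', '0'])
append_record(demo_path, ['Bob', 'bob@example.com', 'Hi there', '203.0.113.7', '2026-04-28 22:20:10', '1'])

with open(demo_path, encoding='utf-8') as f:
    stored = f.read().splitlines()
```

Because the file is only ever appended to, a failed request can at worst leave one bad line at the end, which is easy to spot and delete by hand.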

The Form

Just a humble HTML 3.2 form in sidebar-left.html:


<form action="/cgi-bin/guestbook.py" method="post">
    Name:    <input type="text" name="name">
    Email:   <input type="text" name="email">
    Content: <textarea name="content"></textarea>
    [x] Show IP  <input type="checkbox" name="show_ip">
    <input type="submit" value="Submit">
</form>
                

Four fields: name, email (optional -- if given, it is rendered as a mailto link), content, and a show_ip checkbox. No captcha. No CSRF token. Pure 90s trust.
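One detail worth seeing in code: a browser leaves an unchecked checkbox out of the request body entirely, and parse_qs drops blank values by default, so both show_ip and an empty email can simply be missing. A small sketch of reading the four fields with that in mind (read_fields is a hypothetical name, not taken from the real script):

```python
from urllib.parse import parse_qs

def read_fields(body):
    """Flatten a URL-encoded form body into the four guestbook fields."""
    form = parse_qs(body)  # blank values are dropped by default
    return {
        'name': form.get('name', [''])[0],
        'email': form.get('email', [''])[0],
        'content': form.get('content', [''])[0],
        # An unchecked checkbox is absent from the body, so test for presence.
        'show_ip': 'show_ip' in form,
    }

checked = read_fields('name=Ann&email=&content=Hello+world&show_ip=on')
unchecked = read_fields('name=Bob&email=bob%40example.com&content=Hi')
```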

CGI Input Handling


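import os
import sys
from urllib.parse import parse_qs

def parse_form_data():
    """Return the submitted fields as a flat dict (first value per key)."""
    method = os.environ.get('REQUEST_METHOD', 'GET')
    if method == 'GET':
        # GET: fields arrive URL-encoded in QUERY_STRING
        qs = os.environ.get('QUERY_STRING', '')
        form_data = parse_qs(qs)
    else:
        # POST: read CONTENT_LENGTH characters from stdin (the variable may be unset or empty)
        length = int(os.environ.get('CONTENT_LENGTH') or 0)
        body = sys.stdin.read(length)
        form_data = parse_qs(body)
    return {k: v[0] for k, v in form_data.items()}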
                

CGI is refreshingly primitive: GET reads from QUERY_STRING, POST reads from stdin. After parsing, input goes through sanitization:


import html

def sanitize(text):
    # Newlines would split one message across several records
    text = text.replace('\n', ' ').replace('\r', ' ')
    # | is the field delimiter
    text = text.replace('|', ' ')
    return html.escape(text)
                

Replace | with a space because it's the data delimiter. Replace newlines the same way so a message can't inject extra records. HTML-escape everything to block XSS.
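To see what that buys, here is the same function run against two hostile inputs. The inputs are made up for the demo: one tries to smuggle a forged record in behind a newline, the other tries to plant a script tag.

```python
import html

def sanitize(text):
    text = text.replace('\n', ' ').replace('\r', ' ')
    text = text.replace('|', ' ')
    return html.escape(text)

# A forged record hidden behind a newline collapses into one harmless line.
forged = sanitize('nice site\nEve||spam|1.2.3.4|2020-01-01|1')

# Markup is escaped, so it shows up as text instead of running.
script = sanitize('<script>alert(1)</script>')
```

Since the output never contains | or a line break, one submission can only ever produce one field of one record.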

IP Detection

The server sits behind a reverse proxy, so REMOTE_ADDR only shows the proxy IP. web_server.py extracts the real client IP from proxy headers and injects it into the CGI environment:


cf_ip   = headers.get('CF-Connecting-IP')      # Cloudflare
xff     = headers.get('X-Forwarded-For')        # Standard proxy
real_ip = headers.get('X-Real-IP')              # Nginx

if cf_ip:
    real_client_ip = cf_ip
elif xff:
    real_client_ip = xff.split(',')[0].strip()
elif real_ip:
    real_client_ip = real_ip
else:
    real_client_ip = client_ip                  # Direct connection

os.environ["REAL_CLIENT_IP"] = real_client_ip
                

Priority: Cloudflare > X-Forwarded-For > X-Real-IP > direct. If the user checks "Show IP", the address appears in gray below their message. Otherwise it's stored as "hidden".
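The same priority order, wrapped in a function so it can be tried without a running server. This is a sketch: resolve_client_ip and stored_ip are hypothetical names, and headers is assumed to be a plain dict with exactly these key spellings (a real header object is usually case-insensitive).

```python
def resolve_client_ip(headers, client_ip):
    """Pick the client address: Cloudflare > X-Forwarded-For > X-Real-IP > direct."""
    cf_ip = headers.get('CF-Connecting-IP')
    xff = headers.get('X-Forwarded-For')
    real_ip = headers.get('X-Real-IP')
    if cf_ip:
        return cf_ip
    if xff:
        # X-Forwarded-For can list several hops; the first entry is the client.
        return xff.split(',')[0].strip()
    if real_ip:
        return real_ip
    return client_ip

def stored_ip(ip, show_ip):
    """Keep the address only when the visitor opted in (hypothetical helper)."""
    return ip if show_ip else 'hidden'

via_proxy = resolve_client_ip(
    {'X-Forwarded-For': '198.51.100.4, 10.0.0.1', 'X-Real-IP': '10.0.0.1'},
    '10.0.0.1',
)
via_cloudflare = resolve_client_ip(
    {'CF-Connecting-IP': '203.0.113.7', 'X-Forwarded-For': '198.51.100.4'},
    '10.0.0.1',
)
direct = resolve_client_ip({}, '192.0.2.9')
```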

Storage: TXT as Database

All messages live in data/guestbook.txt, one per line, pipe-delimited:


name|email|content|ip|time|show_ip    <- current format (6 fields)
name|content|ip|time|show_ip          <- legacy format (5 fields)
                

Backup is cp. Migration is scp. Manual editing is Notepad. The file has accumulated 30+ messages since 2020 across server migrations and blog rebuilds. A txt file never goes obsolete.
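Reading the file back means telling the two layouts apart by field count. A minimal Python sketch of that idea (parse_line is a hypothetical helper, and the sample timestamps are invented since the real time format isn't shown here):

```python
def parse_line(line):
    """Turn one guestbook line into a dict, or None if the field count is unknown."""
    parts = line.rstrip('\r\n').split('|')
    if len(parts) == 6:
        # Current format: name|email|content|ip|time|show_ip
        name, email, content, ip, when, show_ip = parts
    elif len(parts) == 5:
        # Legacy format has no email column.
        name, content, ip, when, show_ip = parts
        email = ''
    else:
        return None
    return {'name': name, 'email': email, 'content': content,
            'ip': ip, 'time': when, 'show_ip': show_ip}

current = parse_line('Ann|ann@example.com|Hello|hidden|2026-04-28 22:19:55|0\n')
legacy = parse_line('Bob|Hi from 2020|203.0.113.7|2020-05-01 10:00:00|1\n')
broken = parse_line('just some text')
```

This only works because sanitization guarantees that no field contains a | of its own.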

Build-Time Injection

During each rebuild, build.ps1 does the following:


# Read guestbook.txt
$lines = Get-Content $guestbookFile -Encoding UTF8

# Take last 20, reverse (newest first)
$lastLines = $lines | Select-Object -Last 20
[array]::Reverse($lastLines)

# Generate HTML for each message
foreach ($line in $lastLines) {
    # Split by |, detect format by field count
    # Build: name(mailto) + content + IP(optional) + time
}

# Inject at placeholder
$sidebarLeft = $sidebarLeft -replace "<!-- GUESTBOOK_MESSAGES -->", $messagesHtml
                

The messages are compiled directly into the HTML at the <!-- GUESTBOOK_MESSAGES --> placeholder. Email becomes a mailto link. IP display is opt-in. Everything goes in a 123x260px scrollable container.
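For readers who don't speak PowerShell, here is roughly the same step in Python. It is not a port of build.ps1: the markup in render_message is invented for the demo, and it assumes the stored text was already HTML-escaped at write time and that a hidden address is stored as the literal string "hidden".

```python
PLACEHOLDER = '<!-- GUESTBOOK_MESSAGES -->'

def render_message(fields):
    """Build the HTML for one message (illustrative markup only)."""
    name, email, content, ip, when, show_ip = fields
    who = f'<a href="mailto:{email}">{name}</a>' if email else name
    parts = [f'<b>{who}</b>: {content}']
    if ip != 'hidden':
        parts.append(f'<font color="gray">{ip}</font>')
    parts.append(when)
    return '<br>'.join(parts)

def inject(sidebar_html, lines, limit=20):
    """Render the newest messages and drop them in at the placeholder."""
    recent = lines[-limit:][::-1]  # last `limit` lines, newest first
    rendered = []
    for line in recent:
        fields = line.split('|')
        if len(fields) == 5:
            fields.insert(1, '')  # legacy line: add an empty email column
        if len(fields) != 6:
            continue  # skip anything that matches neither format
        rendered.append(render_message(fields))
    return sidebar_html.replace(PLACEHOLDER, '\n'.join(rendered))

page = inject(
    '<div><!-- GUESTBOOK_MESSAGES --></div>',
    ['A|hi|hidden|t1|0', 'B|b@x.org|yo|1.2.3.4|t2|1'],
)
```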

Defenses

  • Input sanitization: strip | and newlines to prevent format injection
  • HTML escaping: html.escape() on all user input
  • Opt-in IP: users choose whether to show their IP
  • Directory protection: /data/, /scripts/, /src/ return 403
  • robots.txt: crawlers blocked from /cgi-bin/ and /data/
  • No rate limiting, no captcha. If someone spams, the worst case is deleting a few lines from a txt file and rebuilding.
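The directory protection item can be pictured as a small path check. This is a guess at the idea, not code from web_server.py: is_forbidden and PROTECTED are hypothetical names, and the path is assumed to be percent-decoded already.

```python
import posixpath

PROTECTED = ('/data/', '/scripts/', '/src/')

def is_forbidden(url_path):
    """Return True when a request path falls inside a protected directory."""
    # Normalise first so /public/../data/guestbook.txt cannot slip through.
    clean = posixpath.normpath('/' + url_path.lstrip('/'))
    return any(clean == p.rstrip('/') or clean.startswith(p) for p in PROTECTED)

blocked = is_forbidden('/data/guestbook.txt')
sneaky = is_forbidden('/public/../data/guestbook.txt')
allowed = is_forbidden('/cgi-bin/guestbook.py')
lookalike = is_forbidden('/database.html')
```

A server using a check like this would answer 403 whenever it returns True, before touching the filesystem.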

Why Do It This Way?

Honestly? Laziness.

No MySQL to install. No Redis to configure. No Express backend to maintain. No login system. A txt file + CGI + static build -- three pieces, under 200 lines of code, and the guestbook runs just fine.

The tradeoff: messages aren't real-time. They appear on the next rebuild. But since a personal blog rebuilds once a day anyway, that's good enough. Data portability and simplicity are the real wins here.


[1] CGI (Common Gateway Interface), introduced in 1993 by Rob McCool at NCSA, was the earliest standard for dynamic web content. Each request forks a new process -- slow by modern standards, but perfectly adequate for low-traffic sites with zero framework dependencies.

    DRAGONRSTER
    CC BY-NC-SA
    © 2004-2026 DragonRster • Made with HTML • This site supports IE5.5+
    Best viewed at 1024x768 • Last updated 2026-04-28 22:19:55