Guestbook Architecture
No database. No backend framework. Just a txt file and a CGI script.
|
Everything on this site is hand-written static HTML -- except the guestbook. So how do new messages appear in the sidebar?
The short answer: they don't appear in real time. Each message is written to a plain text file, and the next time the site rebuilds, it gets baked into the static HTML.
|
The Pipeline
[User fills form] -> [CGI script receives] -> [Write to guestbook.txt]
|
[Trigger rebuild] -> [build.ps1 reads data] -> [Inject into sidebar HTML]
|
[Deploy new pages] -> [User sees the message]
The guestbook form submits to /cgi-bin/guestbook.py -- a Python CGI script that parses POST data, sanitizes input, and appends a pipe-delimited line to data/guestbook.txt.[1]
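The last step of that script, appending one record, might look roughly like the sketch below. The field order follows the 6-field format used by this site; the function name `append_message`, the timestamp format, and the "1"/"0" encoding of show_ip are assumptions made for illustration, not the actual script.

```python
from datetime import datetime

def append_message(path, name, email, content, ip, show_ip):
    """Append one 6-field, pipe-delimited record to the guestbook file.

    All text fields are expected to be sanitized already (no | or newlines).
    """
    # Assumed timestamp format; the real script may use a different one.
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    # Assumed encoding of the checkbox: "1" when checked, "0" otherwise.
    fields = [name, email, content, ip, stamp, "1" if show_ip else "0"]
    line = "|".join(fields)
    # Append mode leaves every earlier message untouched.
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
    return line
```

Because the file is only ever appended to, an interrupted request can at worst leave one damaged line at the end, which is easy to spot and delete by hand.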
|
The Form
Just a humble HTML 3.2 form in sidebar-left.html:
<form action="/cgi-bin/guestbook.py" method="post">
Name: <input type="text" name="name">
Email: <input type="text" name="email">
Content: <textarea name="content"></textarea>
[x] Show IP <input type="checkbox" name="show_ip">
<input type="submit" value="Submit">
</form>
Four fields: name, email (optional -- when provided, it is rendered as a mailto link), content, and a show_ip checkbox. No captcha. No CSRF token. Pure 90s trust.
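To make the optional fields concrete, here is a small sketch of what such a form body looks like once it is URL-encoded and parsed again. The sample values are made up. The behavior shown is that of Python's standard `urllib.parse` module: blank values are dropped by default, and a checkbox with no value attribute submits "on" only when checked.

```python
from urllib.parse import parse_qs, urlencode

# Body a browser might send when email is left blank and the box is checked.
body = urlencode({
    "name": "Ann",
    "email": "",
    "content": "Hello | world",
    "show_ip": "on",
})

# Same "first value wins" flattening as in a typical CGI handler.
fields = {k: v[0] for k, v in parse_qs(body).items()}

# parse_qs drops blank values by default, so "email" is missing entirely.
email = fields.get("email", "")
# An unchecked checkbox is not submitted at all, so presence means checked.
show_ip = "show_ip" in fields
```

The practical consequence is that the script has to read optional fields with a default rather than index the dictionary directly.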
|
CGI Input Handling
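import os
import sys
from urllib.parse import parse_qs

def parse_form_data():
    method = os.environ.get('REQUEST_METHOD', 'GET')
    if method == 'GET':
        qs = os.environ.get('QUERY_STRING', '')
        form_data = parse_qs(qs)
    else:
        # CONTENT_LENGTH counts bytes and may be empty, so read raw bytes
        length = int(os.environ.get('CONTENT_LENGTH') or 0)
        body = sys.stdin.buffer.read(length).decode('utf-8')
        form_data = parse_qs(body)
    # Keep only the first value of each field
    return {k: v[0] for k, v in form_data.items()}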
CGI is refreshingly primitive: GET reads from QUERY_STRING, POST reads from stdin. After parsing, input goes through sanitization:
import html

def sanitize(text):
    # Newlines would break the one-message-per-line format
    text = text.replace('\n', ' ').replace('\r', ' ')
    # | is the field delimiter
    text = text.replace('|', ' ')
    return html.escape(text)
Replace | with a space because it's the data delimiter. Flatten newlines to spaces to prevent format injection. HTML-escape everything to block XSS.
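A quick check of what that does to a hostile input, as a sketch. The function is repeated so the snippet runs on its own, and the sample string is invented.

```python
import html

def sanitize(text):
    text = text.replace('\n', ' ').replace('\r', ' ')
    text = text.replace('|', ' ')
    return html.escape(text)

# A script tag plus an attempt to smuggle in a second record via a newline.
raw = "<script>alert(1)</script>\nMallory|x|y"
clean = sanitize(raw)

print(clean)
# &lt;script&gt;alert(1)&lt;/script&gt; Mallory x y
```

Escaping last is safe here because html.escape() only ever emits entity text, so it cannot reintroduce a pipe or a newline.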
|
IP Detection
The server sits behind a reverse proxy, so REMOTE_ADDR only shows the proxy IP. web_server.py extracts the real client IP from proxy headers and injects it into the CGI environment:
cf_ip = headers.get('CF-Connecting-IP')   # Cloudflare
xff = headers.get('X-Forwarded-For')      # Standard proxy
real_ip = headers.get('X-Real-IP')        # Nginx
if cf_ip:
    real_client_ip = cf_ip
elif xff:
    real_client_ip = xff.split(',')[0].strip()
elif real_ip:
    real_client_ip = real_ip
else:
    real_client_ip = client_ip            # Direct connection
os.environ["REAL_CLIENT_IP"] = real_client_ip
Priority: Cloudflare > X-Forwarded-For > X-Real-IP > direct. If the user checks "Show IP", the address appears in gray below their message. Otherwise it's stored as "hidden".
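The same priority order can be written as a small pure function, which makes it easy to test without a running server. This is a sketch: `resolve_client_ip` and `stored_ip` are hypothetical names, and `headers` is assumed to be a plain dict with exactly these key spellings.

```python
def resolve_client_ip(headers, client_ip):
    """Return the client address: Cloudflare > X-Forwarded-For > X-Real-IP > direct."""
    cf_ip = headers.get("CF-Connecting-IP")
    xff = headers.get("X-Forwarded-For")
    real_ip = headers.get("X-Real-IP")
    if cf_ip:
        return cf_ip.strip()
    if xff:
        # X-Forwarded-For may list several hops; the first entry is the client.
        return xff.split(",")[0].strip()
    if real_ip:
        return real_ip.strip()
    return client_ip

def stored_ip(real_client_ip, show_ip):
    """Hypothetical helper: keep the address only if the user opted in."""
    return real_client_ip if show_ip else "hidden"

# Example with documentation-range addresses.
print(resolve_client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.2"}, "10.0.0.1"))
# 203.0.113.7
```

These headers are only as trustworthy as the proxy that sets them, so this order makes sense when the server is reachable only through that proxy.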
|
Storage: TXT as Database
All messages live in data/guestbook.txt, one per line, pipe-delimited:
name|email|content|ip|time|show_ip <- current format (6 fields)
name|content|ip|time|show_ip <- legacy format (5 fields)
Backup is cp. Migration is scp. Manual editing is Notepad. The file has accumulated 30+ messages since 2020 across server migrations and blog rebuilds. A txt file never goes obsolete.
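Reading both formats back takes only a field count. The sketch below is in Python rather than the PowerShell the build actually uses; `parse_line` and the dictionary keys are illustrative names, and the sample records are invented.

```python
def parse_line(line):
    """Parse one guestbook record; return None for anything malformed."""
    parts = line.rstrip("\r\n").split("|")
    if len(parts) == 6:
        # Current format: name|email|content|ip|time|show_ip
        name, email, content, ip, when, show_ip = parts
    elif len(parts) == 5:
        # Legacy format has no email field.
        name, content, ip, when, show_ip = parts
        email = ""
    else:
        return None
    return {
        "name": name,
        "email": email,
        "content": content,
        "ip": ip,
        "time": when,
        "show_ip": show_ip,
    }

new = parse_line("Ann|ann@example.com|Hello|hidden|2024-05-01 12:00:00|0\n")
old = parse_line("Bob|Hi there|203.0.113.7|2020-03-02 09:30:00|1\n")
```

This only works because sanitization guarantees that no field can contain a pipe, so the field count is a reliable format marker.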
|
Build-Time Injection
During each rebuild, build.ps1 does the following:
# Read guestbook.txt
$lines = Get-Content $guestbookFile -Encoding UTF8
# Take last 20, reverse (newest first); @() keeps a single line as an array
$lastLines = @($lines | Select-Object -Last 20)
[array]::Reverse($lastLines)
# Generate HTML for each message
foreach ($line in $lastLines) {
    # Split by |, detect format by field count
    # Build: name(mailto) + content + IP(optional) + time
}
# Inject at placeholder
$sidebarLeft = $sidebarLeft -replace "<!-- GUESTBOOK_MESSAGES -->", $messagesHtml
The messages are compiled directly into the HTML at the <!-- GUESTBOOK_MESSAGES --> placeholder. Email becomes a mailto link. IP display is opt-in. Everything goes in a 123x260px scrollable container.
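For readers who don't use PowerShell, the same build step can be sketched in Python. This is an illustration of the idea, not a port of build.ps1: the function names and the exact HTML markup are assumptions, and the IP is shown whenever the stored value is not "hidden".

```python
PLACEHOLDER = "<!-- GUESTBOOK_MESSAGES -->"

def render_message(line):
    """Turn one record into an HTML snippet (hypothetical markup)."""
    parts = line.rstrip("\r\n").split("|")
    if len(parts) == 6:
        name, email, content, ip, when, _show_ip = parts
    elif len(parts) == 5:
        name, content, ip, when, _show_ip = parts
        email = ""
    else:
        return ""
    # Fields were HTML-escaped when written, so they are inserted as-is.
    who = f'<a href="mailto:{email}">{name}</a>' if email else name
    out = f"<p>{who}: {content}"
    if ip != "hidden":
        out += f'<br><font color="gray">{ip}</font>'
    return out + f"<br><small>{when}</small></p>"

def inject(template, lines, limit=20):
    """Render the newest `limit` messages, newest first, into the template."""
    recent = [l for l in lines if l.strip()][-limit:][::-1]
    messages_html = "\n".join(render_message(l) for l in recent)
    # Plain string replacement: no regex, so message text is never interpreted.
    return template.replace(PLACEHOLDER, messages_html)
```

Using a literal string replace instead of a regex substitution avoids surprises if a message happens to contain characters that a regex engine treats specially.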
|
Defenses
Input sanitization: strip | and newlines to prevent format injection
HTML escaping: html.escape() on all user input
Opt-in IP: users choose whether to show their IP
Directory protection: /data/, /scripts/, /src/ return 403
robots.txt: crawlers blocked from /cgi-bin/ and /data/
No rate limiting, no captcha. If someone spams, the worst case is deleting a few lines from a txt file and rebuilding.
|
Why Do It This Way?
Honestly? Laziness.
No MySQL to install. No Redis to configure. No Express backend to maintain. No login system. A txt file + CGI + static build -- three pieces, under 200 lines of code, and the guestbook runs just fine.
The tradeoff: messages aren't real-time. They appear on the next rebuild. But since a personal blog rebuilds once a day anyway, that's good enough. Data portability and simplicity are the real wins here.
|
[1] CGI (Common Gateway Interface), introduced in 1993 by Rob McCool at NCSA, was the earliest standard for dynamic web content. Each request forks a new process -- slow by modern standards, but perfectly adequate for low-traffic sites with zero framework dependencies.