Traditional Jekyll builds require complete site regeneration for content updates, causing delays in publishing. By implementing real-time synchronization between GitHub and Cloudflare, you can achieve near-instant content updates while maintaining Jekyll's static architecture. This guide explores an event-driven system that uses GitHub webhooks, Ruby automation scripts, and Cloudflare Workers to synchronize content changes instantly across the global CDN, enabling dynamic content capabilities for static Jekyll sites.
The real-time synchronization architecture connects GitHub's content repository with Cloudflare's edge network through event-driven workflows, processing content changes as they occur.
GitHub webhooks detect content changes, a Ruby web application processes and transforms the affected files, and Cloudflare Workers manage edge storage and delivery. Because each update touches only the files that changed, the system avoids full rebuilds and can propagate updates across the global CDN in under a second.
# Sync Architecture Flow:
# 1. Content Change → GitHub Repository
# 2. GitHub Webhook → Ruby Webhook Handler
# 3. Content Processing:
# - Parse changed files
# - Extract front matter and content
# - Transform to edge-optimized format
# 4. Cloudflare Integration:
# - Update KV store with new content
# - Invalidate edge cache for changed paths
# - Update R2 storage for assets
# 5. Edge Propagation:
# - Workers serve updated content immediately
# - Automatic cache invalidation
# - Global CDN distribution
# Components:
# - GitHub Webhook → triggers on push events
# - Ruby Sinatra App → processes webhooks
# - Content Transformer → converts Markdown to edge format
# - Cloudflare KV → stores processed content
# - Cloudflare Workers → serves dynamic static content
GitHub webhooks provide instant notifications of repository changes. A Ruby web application processes these webhooks, extracts changed content, and initiates the synchronization process.
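For orientation, the push-event payload fields this handler relies on look roughly like this (abridged, with illustrative values; see GitHub's webhook documentation for the full schema):
# Abridged push-event payload
{
  "repository" => { "full_name" => "your-org/your-site" },
  "commits" => [
    {
      "id" => "a1b2c3d...",
      "message" => "Update post",
      "added" => [],
      "modified" => ["_posts/2024-01-15-hello-edge.md"],
      "removed" => []
    }
  ]
}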
Here's a Sinatra-based webhook handler that verifies signatures and dispatches push events:
# webhook_handler.rb
require 'sinatra'
require 'json'
require 'octokit'
require 'yaml'
require 'digest'
require 'openssl'   # HMAC signature verification
require 'base64'    # decoding GitHub file contents
require 'net/http'  # calling the Cloudflare sync endpoint
require 'time'      # Time#iso8601
class WebhookHandler < Sinatra::Base
  set :github_secret, ENV['GITHUB_WEBHOOK_SECRET']
  set :logging, true # enables the `logger` helper used below
  post '/webhook/github' do
    # Read the body exactly once; rereading a consumed rack input returns an empty string
    raw_body = request.body.read
    # Verify webhook signature before trusting the payload
    verify_signature!(raw_body)
    # Parse webhook payload
    payload = JSON.parse(raw_body)
    event_type = request.env['HTTP_X_GITHUB_EVENT']
case event_type
when 'push'
handle_push_event(payload)
when 'pull_request'
handle_pull_request_event(payload)
else
status 200
return "Event type #{event_type} not handled"
end
status 202
"Webhook processed successfully"
end
private
  def verify_signature!(raw_body)
    signature = 'sha256=' + OpenSSL::HMAC.hexdigest(
      OpenSSL::Digest.new('sha256'),
      settings.github_secret,
      raw_body
    )
unless Rack::Utils.secure_compare(signature, request.env['HTTP_X_HUB_SIGNATURE_256'].to_s)
halt 401, "Invalid signature"
end
end
def handle_push_event(payload)
repository = payload['repository']['full_name']
commits = payload['commits']
# Process each commit in the push
commits.each do |commit|
process_commit_changes(repository, commit)
end
# Trigger sync process
trigger_content_sync(repository, commits)
end
def process_commit_changes(repository, commit)
added_files = commit['added']
modified_files = commit['modified']
removed_files = commit['removed']
# Initialize GitHub client
client = Octokit::Client.new(access_token: ENV['GITHUB_ACCESS_TOKEN'])
# Process added and modified files
(added_files + modified_files).each do |file_path|
if content_file?(file_path)
content = client.contents(repository, path: file_path, ref: commit['id'])
process_content_file(file_path, content, :created_or_updated)
elsif data_file?(file_path)
data = client.contents(repository, path: file_path, ref: commit['id'])
process_data_file(file_path, data, :created_or_updated)
end
end
# Process removed files
removed_files.each do |file_path|
if content_file?(file_path)
process_content_file(file_path, nil, :deleted)
elsif data_file?(file_path)
process_data_file(file_path, nil, :deleted)
end
end
end
def content_file?(file_path)
file_path.start_with?('_posts/', '_pages/', '_docs/') && file_path.end_with?('.md')
end
def data_file?(file_path)
file_path.start_with?('_data/') && (file_path.end_with?('.yml', '.yaml', '.json'))
end
def process_content_file(file_path, content, action)
case action
when :created_or_updated
# Decode base64 content
raw_content = Base64.decode64(content['content'])
# Parse front matter and content
if raw_content =~ /^---\s*\n(.*?)\n---\s*\n(.*)/m
front_matter = YAML.safe_load($1, permitted_classes: [Date, Time]) # Jekyll front matter commonly contains dates
content_body = $2
# Transform for edge storage
edge_content = transform_content_for_edge(file_path, front_matter, content_body)
# Sync to Cloudflare
sync_to_cloudflare(file_path, edge_content)
end
when :deleted
# Remove from Cloudflare
delete_from_cloudflare(file_path)
end
end
def trigger_content_sync(repository, commits)
# Prepare sync payload
sync_payload = {
repository: repository,
commits: commits.map { |c| { id: c['id'], message: c['message'] } },
timestamp: Time.now.iso8601
}
# Trigger Cloudflare Worker sync
uri = URI.parse("https://sync.yourdomain.com/api/sync")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
request = Net::HTTP::Post.new(uri.path)
request['Authorization'] = "Bearer #{ENV['CLOUDFLARE_API_TOKEN']}"
request['Content-Type'] = 'application/json'
request.body = sync_payload.to_json
response = http.request(request)
unless response.is_a?(Net::HTTPSuccess)
logger.error "Failed to trigger sync: #{response.body}"
end
end
end
# Start the webhook handler
WebhookHandler.run! if __FILE__ == $0
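Before any of this fires, GitHub needs to know about the endpoint. A one-time registration sketch using Octokit (the URL, repository name, and token scope are placeholders; the token needs hook-administration rights):
# register_webhook.rb — one-time setup
require 'octokit'

client = Octokit::Client.new(access_token: ENV['GITHUB_ACCESS_TOKEN'])
client.create_hook(
  'your-org/your-site',
  'web',
  { url: 'https://hooks.yourdomain.com/webhook/github',
    content_type: 'json',
    secret: ENV['GITHUB_WEBHOOK_SECRET'] },
  { events: ['push'], active: true }
)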
Content processing transforms Jekyll content into edge-optimized formats and computes per-file content hashes so only genuine changes are synchronized. The Ruby scripts below handle parsing, transformation, and URL-path mapping.
# content_processor.rb
require 'yaml'
require 'json'
require 'digest'
require 'nokogiri'
class ContentProcessor
def initialize
@transformers = {
markdown: MarkdownTransformer.new,
data: DataTransformer.new,
assets: AssetTransformer.new
}
end
def process_content(file_path, raw_content, action)
case File.extname(file_path)
when '.md'
process_markdown_content(file_path, raw_content, action)
when '.yml', '.yaml', '.json'
process_data_content(file_path, raw_content, action)
else
process_asset_content(file_path, raw_content, action)
end
end
def process_markdown_content(file_path, raw_content, action)
# Parse front matter and content
front_matter, content_body = extract_front_matter(raw_content)
# Generate content hash for change detection
content_hash = generate_content_hash(front_matter, content_body)
# Transform content for edge delivery
edge_content = @transformers[:markdown].transform(
file_path: file_path,
front_matter: front_matter,
content: content_body,
action: action
)
{
type: 'content',
path: generate_content_path(file_path),
content: edge_content,
hash: content_hash,
metadata: {
title: front_matter['title'],
date: front_matter['date'],
tags: front_matter['tags'] || []
}
}
end
def process_data_content(file_path, raw_content, action)
data = case File.extname(file_path)
when '.json'
JSON.parse(raw_content)
else
  YAML.safe_load(raw_content, permitted_classes: [Date, Time])
end
edge_data = @transformers[:data].transform(
file_path: file_path,
data: data,
action: action
)
{
type: 'data',
path: generate_data_path(file_path),
content: edge_data,
hash: generate_content_hash(data.to_json)
}
end
def extract_front_matter(raw_content)
if raw_content =~ /^---\s*\n(.*?)\n---\s*\n(.*)/m
front_matter = YAML.safe_load($1, permitted_classes: [Date, Time])
content_body = $2
[front_matter, content_body]
else
[{}, raw_content]
end
end
def generate_content_path(file_path)
# Convert Jekyll paths to URL paths
case file_path
when /^_posts\/(.+)\.md$/
date_part = $1[0..9] # Extract date from filename
slug_part = $1[11..-1] # Extract slug
"/#{date_part.gsub('-', '/')}/#{slug_part}/"
when /^_pages\/(.+)\.md$/
"/#{$1.gsub('_', '/')}/"
else
"/#{file_path.gsub('_', '/').gsub(/\.md$/, '')}/"
end
end
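# The hash helpers referenced above aren't shown in the original listing;
# minimal sketches follow. `generate_content_hash` accepts any mix of
# strings and hashes so both call sites above work.
def generate_content_hash(*parts)
  Digest::SHA256.hexdigest(parts.map(&:to_s).join("\x1F"))
end

def generate_data_path(file_path)
  # _data/navigation.yml -> /data/navigation/ (assumed URL scheme)
  "/#{file_path.sub(/^_/, '').sub(/\.(ya?ml|json)$/, '')}/"
end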
end
class MarkdownTransformer
def transform(file_path:, front_matter:, content:, action:)
# Convert Markdown to HTML
html_content = convert_markdown_to_html(content)
# Apply content enhancements
enhanced_content = enhance_content(html_content, front_matter)
# Generate edge-optimized structure
{
html: enhanced_content,
front_matter: front_matter,
metadata: generate_metadata(front_matter, content),
generated_at: Time.now.iso8601
}
end
def convert_markdown_to_html(markdown)
# Use commonmarker or kramdown for conversion
require 'commonmarker'
CommonMarker.render_html(markdown, :DEFAULT)
end
def enhance_content(html, front_matter)
  doc = Nokogiri::HTML(html)
  # Add heading anchors (CommonMarker output has no ids, so generate one from the text)
  doc.css('h1, h2, h3, h4, h5, h6').each do |heading|
    heading['id'] ||= heading.text.strip.downcase.gsub(/[^a-z0-9]+/, '-')
    anchor = doc.create_element('a', '#', class: 'heading-anchor')
    anchor['href'] = "##{heading['id']}"
    heading.add_next_sibling(anchor)
  end
# Optimize images for edge delivery
doc.css('img').each do |img|
src = img['src']
if src && !src.start_with?('http')
img['src'] = optimize_image_url(src)
img['loading'] = 'lazy'
end
end
doc.to_html
end
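# `generate_metadata` and `optimize_image_url` are referenced above but not
# shown in the original listing; minimal sketches under stated assumptions.
def generate_metadata(front_matter, content)
  words = content.split.size
  {
    word_count: words,
    reading_time_minutes: (words / 200.0).ceil, # ~200 wpm assumption
    description: front_matter['description']
  }
end

def optimize_image_url(src)
  # Assumes Cloudflare Image Resizing is enabled on the zone
  path = src.start_with?('/') ? src : "/#{src}"
  "/cdn-cgi/image/format=auto,quality=85#{path}"
end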
end
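With those helpers in place, a minimal end-to-end transformation looks like this (the file path and front matter are illustrative):
transformer = MarkdownTransformer.new
result = transformer.transform(
  file_path: '_posts/2024-01-15-hello-edge.md',
  front_matter: { 'title' => 'Hello Edge' },
  content: "# Hello\n\nServed from the edge.",
  action: :created_or_updated
)
puts result[:html] # anchored headings, lazy-loaded images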
Cloudflare Workers manage the edge storage and delivery of synchronized content. The Workers handle content routing, caching, and dynamic assembly from edge storage.
// workers/sync-handler.js
export default {
async fetch(request, env, ctx) {
const url = new URL(request.url)
// API endpoint for content synchronization
if (url.pathname.startsWith('/api/sync')) {
return handleSyncAPI(request, env, ctx)
}
// Content delivery endpoint
return handleContentDelivery(request, env, ctx)
}
}
async function handleSyncAPI(request, env, ctx) {
  if (request.method !== 'POST') {
    return new Response('Method not allowed', { status: 405 })
  }
  // Reject unauthenticated callers; the Ruby handler sends a bearer token.
  // (SYNC_API_TOKEN is an assumed secret binding holding the same value.)
  const auth = request.headers.get('Authorization') || ''
  if (auth !== `Bearer ${env.SYNC_API_TOKEN}`) {
    return new Response('Unauthorized', { status: 401 })
  }
try {
const payload = await request.json()
// Process sync payload
await processSyncPayload(payload, env, ctx)
return new Response(JSON.stringify({ status: 'success' }), {
headers: { 'Content-Type': 'application/json' }
})
} catch (error) {
return new Response(JSON.stringify({ error: error.message }), {
status: 500,
headers: { 'Content-Type': 'application/json' }
})
}
}
async function processSyncPayload(payload, env, ctx) {
const { repository, commits, timestamp } = payload
// Store sync metadata
await env.SYNC_KV.put('last_sync', JSON.stringify({
repository,
timestamp,
commit_count: commits.length
}))
// Process each commit asynchronously
ctx.waitUntil(processCommits(commits, env))
}
async function processCommits(commits, env) {
for (const commit of commits) {
// Fetch commit details from GitHub API
const commitDetails = await fetchCommitDetails(commit.id)
// Process changed files
for (const file of commitDetails.files) {
await processFileChange(file, env)
}
}
}
async function handleContentDelivery(request, env, ctx) {
const url = new URL(request.url)
const pathname = url.pathname
// Try to fetch from edge cache first
const cachedContent = await env.CONTENT_KV.get(pathname)
if (cachedContent) {
const content = JSON.parse(cachedContent)
return new Response(content.html, {
headers: {
'Content-Type': 'text/html; charset=utf-8',
'X-Content-Source': 'edge-cache',
'Cache-Control': 'public, max-age=300' // 5 minutes
}
})
}
// Fallback to Jekyll static site
return fetch(request)
}
// Durable Object for the content management API (constructor(state, env) is the Durable Object signature)
export class ContentManager {
constructor(state, env) {
this.state = state
this.env = env
}
async fetch(request) {
const url = new URL(request.url)
switch (url.pathname) {
case '/content/update':
return this.handleContentUpdate(request)
case '/content/delete':
return this.handleContentDelete(request)
case '/content/list':
return this.handleContentList(request)
default:
return new Response('Not found', { status: 404 })
}
}
async handleContentUpdate(request) {
const { path, content, hash } = await request.json()
// Check if content has actually changed
const existing = await this.env.CONTENT_KV.get(path)
if (existing) {
const existingContent = JSON.parse(existing)
if (existingContent.hash === hash) {
return new Response(JSON.stringify({ status: 'unchanged' }))
}
}
// Store updated content along with its hash so future delta checks work
await this.env.CONTENT_KV.put(path, JSON.stringify({ ...content, hash }))
// Invalidate edge cache
await this.invalidateCache(path)
return new Response(JSON.stringify({ status: 'updated' }))
}
async invalidateCache(path) {
// Invalidate Cloudflare cache for the path
const purgeUrl = `https://api.cloudflare.com/client/v4/zones/${this.env.CLOUDFLARE_ZONE_ID}/purge_cache`
await fetch(purgeUrl, {
method: 'POST',
headers: {
'Authorization': `Bearer ${this.env.CLOUDFLARE_API_TOKEN}`,
'Content-Type': 'application/json'
},
body: JSON.stringify({
  // The purge API expects fully qualified URLs, not bare paths
  files: [`https://yourdomain.com${path}`]
})
})
}
}
Ruby automation scripts handle the complex content transformation and synchronization logic, ensuring content is properly formatted for edge delivery.
# sync_orchestrator.rb
require 'net/http'
require 'json'
require 'yaml'
require 'octokit'
require 'base64'
require 'time'
require_relative 'content_processor'
class SyncOrchestrator
def initialize(cloudflare_api_token, github_access_token)
@cloudflare_api_token = cloudflare_api_token
@github_access_token = github_access_token
@processor = ContentProcessor.new
end
def sync_repository(repository, branch = 'main')
# Get latest commits
commits = fetch_recent_commits(repository, branch)
# Process each commit
commits.each do |commit|
sync_commit(repository, commit)
end
# Trigger edge cache warm-up
warm_edge_cache(repository)
end
def sync_commit(repository, commit)
# Get commit details with file changes
commit_details = fetch_commit_details(repository, commit['sha'])
# Process changed files
commit_details['files'].each do |file|
sync_file_change(repository, file, commit['sha'])
end
end
def sync_file_change(repository, file, commit_sha)
case file['status']
when 'added', 'modified'
content = fetch_file_content(repository, file['filename'], commit_sha)
processed_content = @processor.process_content(
file['filename'],
content,
file['status'].to_sym
)
update_edge_content(processed_content)
when 'removed'
delete_edge_content(file['filename'])
end
end
def update_edge_content(processed_content)
# Send to Cloudflare Workers
uri = URI.parse('https://yourdomain.com/content/update') # route handled by the ContentManager Worker above
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
request = Net::HTTP::Post.new(uri.path)
request['Authorization'] = "Bearer #{@cloudflare_api_token}"
request['Content-Type'] = 'application/json'
request.body = processed_content.to_json
response = http.request(request)
unless response.is_a?(Net::HTTPSuccess)
raise "Failed to update edge content: #{response.body}"
end
end
def fetch_file_content(repository, file_path, ref)
client = Octokit::Client.new(access_token: @github_access_token)
content = client.contents(repository, path: file_path, ref: ref)
Base64.decode64(content['content'])
end
end
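Assuming the fetch_recent_commits and warm_edge_cache helpers referenced above are implemented, a one-off sync is then just:
orchestrator = SyncOrchestrator.new(
  ENV['CLOUDFLARE_API_TOKEN'],
  ENV['GITHUB_ACCESS_TOKEN']
)
orchestrator.sync_repository('your-org/your-site', 'main')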
# Continuous sync service
class ContinuousSyncService
def initialize(repository, poll_interval = 30)
@repository = repository
@poll_interval = poll_interval
@last_sync_time = Time.now # only pick up commits pushed after startup
@running = false
end
def start
@running = true
@sync_thread = Thread.new { run_sync_loop }
end
def stop
@running = false
@sync_thread&.join
end
private
def run_sync_loop
while @running
begin
check_for_updates
sleep @poll_interval
rescue => e
log "Sync error: #{e.message}"
sleep @poll_interval * 2 # Back off on error
end
end
end
def check_for_updates
client = Octokit::Client.new(access_token: ENV['GITHUB_ACCESS_TOKEN'])
commits = client.commits(@repository, since: @last_sync_time.iso8601) # GitHub expects an ISO8601 timestamp
if commits.any?
log "Found #{commits.size} new commits, starting sync..."
orchestrator = SyncOrchestrator.new(
ENV['CLOUDFLARE_API_TOKEN'],
ENV['GITHUB_ACCESS_TOKEN']
)
commits.reverse_each do |commit| # process oldest first
  orchestrator.sync_commit(@repository, commit)
end
@last_sync_time = Time.now
log "Sync completed successfully"
end
end
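# `log` is called above but not defined in the original listing;
# a minimal sketch writing timestamped lines to stdout.
def log(message)
  puts "[#{Time.now.utc.iso8601}] #{message}"
end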
end
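The polling service complements webhooks when deliveries are missed; starting it is straightforward (repository name and interval are placeholders):
service = ContinuousSyncService.new('your-org/your-site', 60) # poll every 60s
service.start
trap('INT') { service.stop; exit }
sleep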
Monitoring ensures the synchronization system operates reliably, while conflict resolution handles edge cases where content updates conflict or fail.
# sync_monitor.rb
require 'prometheus/client'
require 'json'
class SyncMonitor
def initialize
@registry = Prometheus::Client.registry
# Define metrics
@sync_operations = @registry.counter(
:jekyll_sync_operations_total,
docstring: 'Total number of sync operations',
labels: [:operation, :status]
)
@sync_duration = @registry.histogram(
:jekyll_sync_duration_seconds,
docstring: 'Sync operation duration',
labels: [:operation]
)
@content_updates = @registry.counter(
:jekyll_content_updates_total,
docstring: 'Total content updates processed',
labels: [:type, :status]
)
@last_successful_sync = @registry.gauge(
:jekyll_last_successful_sync_timestamp,
docstring: 'Timestamp of last successful sync'
)
end
def track_sync_operation(operation, &block)
start_time = Time.now
begin
result = block.call
@sync_operations.increment(labels: { operation: operation, status: 'success' })
@sync_duration.observe(Time.now - start_time, labels: { operation: operation })
if operation == 'full_sync'
@last_successful_sync.set(Time.now.to_i)
end
result
rescue => e
@sync_operations.increment(labels: { operation: operation, status: 'error' })
raise e
end
end
def track_content_update(content_type, status)
@content_updates.increment(labels: { type: content_type, status: status })
end
def generate_report
  {
    metrics: {
      sync_operations: @sync_operations.values,  # per-label-set counts
      sync_durations: @sync_duration.values,     # per-label-set histograms
      content_updates: @content_updates.values
    },
    health: calculate_health_status # helper left to your alerting policy
  }
end
end
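Wrapping sync operations with the monitor keeps instrumentation out of the orchestration code; for example (a sketch reusing the classes above):
monitor = SyncMonitor.new
orchestrator = SyncOrchestrator.new(
  ENV['CLOUDFLARE_API_TOKEN'],
  ENV['GITHUB_ACCESS_TOKEN']
)
monitor.track_sync_operation('full_sync') do
  orchestrator.sync_repository('your-org/your-site')
end
monitor.track_content_update('markdown', 'success')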
# Conflict resolution service
class ConflictResolver
def initialize(cloudflare_api_token, github_access_token)
@cloudflare_api_token = cloudflare_api_token
@github_access_token = github_access_token
end
def resolve_conflicts(repository)
# Detect synchronization conflicts
conflicts = detect_conflicts(repository)
conflicts.each do |conflict|
resolve_single_conflict(conflict)
end
end
def detect_conflicts(repository)
conflicts = []
# Compare GitHub content with edge content
edge_content = fetch_edge_content_list
github_content = fetch_github_content_list(repository)
# Find mismatches
(edge_content.keys + github_content.keys).uniq.each do |path|
edge_hash = edge_content[path]
github_hash = github_content[path]
if edge_hash && github_hash && edge_hash != github_hash
conflicts << {
path: path,
edge_hash: edge_hash,
github_hash: github_hash,
type: 'content_mismatch'
}
elsif edge_hash && !github_hash
conflicts << { path: path, type: 'orphaned_edge_content' }
elsif !edge_hash && github_hash
conflicts << { path: path, type: 'missing_edge_content' }
end
end
conflicts
end
def resolve_single_conflict(conflict)
case conflict[:type]
when 'content_mismatch'
# Use GitHub as source of truth
sync_content_from_github(conflict[:path])
when 'orphaned_edge_content'
# Remove orphaned content
delete_edge_content(conflict[:path])
when 'missing_edge_content'
# Sync missing content
sync_content_from_github(conflict[:path])
end
log "Resolved conflict: #{conflict[:path]} (#{conflict[:type]})"
end
end
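Conflict resolution works well as a periodic reconciliation pass alongside the webhook flow; a minimal scheduler sketch (interval and repository are placeholders):
resolver = ConflictResolver.new(
  ENV['CLOUDFLARE_API_TOKEN'],
  ENV['GITHUB_ACCESS_TOKEN']
)
loop do
  resolver.resolve_conflicts('your-org/your-site')
  sleep 3600 # reconcile hourly; webhooks cover the real-time path
end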
This real-time content synchronization system transforms Jekyll from a purely static generator into a dynamic content platform with instant updates. By leveraging GitHub's webhook system, Ruby's processing capabilities, and Cloudflare's edge network, you achieve the performance benefits of static sites with the dynamism of traditional CMS platforms.