Dynamic Cacheing Nginx Reverse Proxy For Pacman
You set up a dynamic cacheing reverse proxy and then you put the ip address or hostname for that server in /etc/pacman.d/mirrorlist
on your client machines.
Of course if you want to you can set this up and run it in an Nspawn Container. The ArchWiki Page for pacman tips mostly spells out what to do, but I want to document the exact steps I would take.
As for how you would run this on a server with other virtual hosts? Who cares? That is what is so brilliant about using using an nspawn container, in that it behaves like just another computer on the lan with it's own ip address. But it only does one thing, and that's all you have to configure it for.
I see no reason to use nginx-mainline instead of stable.
pacman -S nginx
The suggested configuration in the Arch Wiki
is to create a directory /srv/http/pacman-cache
,
and that seems to work well enough
mkdir /srv/http/pacman-cache
# and then change it's ownershipt
chown http:http /srv/http/pacman-cache
nginx configuration
and then it references an nginx.conf in this gist, but that is not a complete nginx.conf and so here is a method to get that working as of July 2017 with a fresh install of nginx.
You can start with a default /etc/nginx/nginx.conf
,
and add the line include sites-enabled/*;
at the end of the http section.
# /etc/nginx/nginx.conf
#user html;
worker_processes 1;
#error_log logs/error.log;
#error_log logs/error.log notice;
#error_log logs/error.log info;
#pid logs/nginx.pid;
events {
worker_connections 1024;
}
http {
include mime.types;
default_type application/octet-stream;
#log_format main '$remote_addr - $remote_user [$time_local] "$request" '
# '$status $body_bytes_sent "$http_referer" '
# '"$http_user_agent" "$http_x_forwarded_for"';
#access_log logs/access.log main;
sendfile on;
#tcp_nopush on;
#keepalive_timeout 0;
keepalive_timeout 65;
#gzip on;
server {
listen 80;
server_name localhost;
#charset koi8-r;
#access_log logs/host.access.log main;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
}
#error_page 404 /404.html;
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root /usr/share/nginx/html;
}
# proxy the PHP scripts to Apache listening on 127.0.0.1:80
#
#location ~ \.php$ {
# proxy_pass http://127.0.0.1;
#}
# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
#
#location ~ \.php$ {
# root html;
# fastcgi_pass 127.0.0.1:9000;
# fastcgi_index index.php;
# fastcgi_param SCRIPT_FILENAME /scripts$fastcgi_script_name;
# include fastcgi_params;
#}
# deny access to .htaccess files, if Apache's document root
# concurs with nginx's one
#
#location ~ /\.ht {
# deny all;
#}
}
# another virtual host using mix of IP-, name-, and port-based configuration
#
#server {
# listen 8000;
# listen somename:8080;
# server_name somename alias another.alias;
# location / {
# root html;
# index index.html index.htm;
# }
#}
# HTTPS server
#
#server {
# listen 443 ssl;
# server_name localhost;
# ssl_certificate cert.pem;
# ssl_certificate_key cert.key;
# ssl_session_cache shared:SSL:1m;
# ssl_session_timeout 5m;
# ssl_ciphers HIGH:!aNULL:!MD5;
# ssl_prefer_server_ciphers on;
# location / {
# root html;
# index index.html index.htm;
# }
#}
include sites-enabled/*;
}
And then create the directory /etc/nginx/sites-enabled
mkdir /etc/nginx/sites-enabled
And then create /etc/nginx/sites-enabled/proxy_cache.conf
,
which is mostly a
copy-and-paste from this gist.
Notice the server_name. This has to match the entry in
/etc/pacman.d/mirrorlist
on the client machines you are
updating from. If you can use the hostname, great. But if you
have to assign static ip addresses and explicitly write the local
ip address instead, then that should match what you write in your mirrorlist.
And of course your mirrorlist entry on the client machine, has to preserve the directory scheme.
# /etc/pacman.d/mirrorlist
Server = http://<hostname or ip address>:<port if not 80>/archlinux/$repo/os/$arch
# /etc/nginx/sites-enabled/proxy_cache.conf
# nginx may need to resolve domain names at run time
resolver 8.8.8.8 8.8.4.4;
# Pacman Cache
server
{
listen 80;
server_name <hostname or ip address>; # has to match the entry in mirrorlist on client machine.
root /srv/http/pacman-cache;
autoindex on;
# Requests for package db and signature files should redirect upstream without caching
# Well that's the default anyway.
# But what if you're spinning up a lot of nspawn containers, don't want to waste all that bandwidth?
# I choose to instead run a systemd timer that deletes the *db files once every 15 minutes
location ~ \.(db|sig)$ {
try_files $uri @pkg_mirror;
# proxy_pass http://mirrors$request_uri;
}
# Requests for actual packages should be served directly from cache if available.
# If not available, retrieve and save the package from an upstream mirror.
location ~ \.tar\.xz$ {
try_files $uri @pkg_mirror;
}
# Retrieve package from upstream mirrors and cache for future requests
location @pkg_mirror {
proxy_store on;
proxy_redirect off;
proxy_store_access user:rw group:rw all:r;
proxy_next_upstream error timeout http_404;
proxy_pass http://mirrors$request_uri;
}
}
# Upstream Arch Linux Mirrors
# - Configure as many backend mirrors as you want in the blocks below
# - Servers are used in a round-robin fashion by nginx
# - Add "backup" if you want to only use the mirror upon failure of the other mirrors
# - Separate "server" configurations are required for each upstream mirror so we can set the "Host" header appropriately
upstream mirrors {
server localhost:8001;
server localhost:8002; # backup
server localhost:8003; # backup
}
# Arch Mirror 1 Proxy Configuration
server
{
listen 8001;
server_name localhost;
location / {
proxy_pass http://mirrors.kernel.org$request_uri;
proxy_set_header Host mirrors.kernel.org;
}
}
# Arch Mirror 2 Proxy Configuration
server
{
listen 8002;
server_name localhost;
location / {
proxy_pass http://mirrors.ocf.berkeley.edu$request_uri;
proxy_set_header Host mirrors.ocf.berkeley.edu;
}
}
# Arch Mirror 3 Proxy Configuration
server
{
listen 8003;
server_name localhost;
location / {
proxy_pass http://mirrors.cat.pdx.edu$request_uri;
proxy_set_header Host mirrors.cat.pdx.edu;
}
}
systemd service that cleans the proxy cache
don't enable the service, enable the timer
systemctl enable/start /etc/systemd/system/proxy_cache_clean.timer
Keeps the 2 most recent versions of each package using paccache command.
# /etc/systemd/system/proxy_cache_clean.service
[Unit]
Description=Clean The pacman proxy cache
[Service]
Type=oneshot
ExecStart=/usr/bin/find /srv/http/pacman-cache/ -type d -exec /usr/bin/paccache -v -r -k 2 -c {} \;
StandardOutput=syslog
StandardError=syslog
systemd timer for the systemd service that cleans the proxy cache
# /etc/systemd/system/proxy_cache_clean.timer
[Unit]
Description=Timer for clean The pacman proxy cache
[Timer]
OnBootSec=20min
OnUnitActiveSec=100h
Unit=proxy_cache_clean.service
[Install]
WantedBy=timers.target
systemd service that deletes the pacman database files from the proxy cache
don't enable the service, enable the timer
systemctl enable/start /etc/systemd/system/proxy_cache_database_clean.timer
You won't need this if you don't cache the database files. But if you do cache the database files, then you'll just be stuck with old database files, unless you periodically delete them. But I'm not sure about all this, will keep an eye on things.
# /etc/systemd/system/proxy_cache_database_clean.service
[Unit]
Description=Clean The pacman proxy cache database
[Service]
Type=oneshot
ExecStart=/bin/bash -c "for f in $(find /srv -name *db) ; do rm $f; done"
StandardOutput=syslog
StandardError=syslog
systemd timer for the systemd service that deletes the pacman database files from the proxy cache
# /etc/systemd/system/proxy_cache_database_clean.timer
[Unit]
Description=Timer for clean The pacman proxy cache database
[Timer]
OnBootSec=10min
OnUnitActiveSec=15min
Unit=proxy_cache_database_clean.service
[Install]
WantedBy=timers.target
If you prefer cron because the server is actually an ubuntu:16.04 LXD container
Make sure single quote in the command here.
#!/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
5,20,35,50 * * * * /bin/bash -c 'for f in $(find /var/www/html/pacman-cache -name *db) ; do rm $f; done'