SuperGeekery: A blog probably of interest only to nerds by John F Morton.


Analytics a different way. Plausible Analytics on Laravel Forge with Traefik and Docker.

Out into the desert

Near­ly every­one uses Google Ana­lyt­ics. It’s the indus­try stan­dard for a rea­son. It’s an extreme­ly pow­er­ful tool, but legal and pri­va­cy issues may con­cern you. Want to try a dif­fer­ent route? Come along with me.

TLDR: Install a self-host­ed instance of Plau­si­ble Ana­lyt­ics in a Dock­er con­tain­er with rout­ing by Trae­fik Proxy. Lar­avel Forge is used to pro­vi­sion the serv­er and deploy code.

Where we’re headed

We’re going to use Lar­avel Forge to cre­ate an inex­pen­sive serv­er. We’ll set up the DNS to point some domain names to the server’s IP address. Next, we’ll install Dock­er and a reverse proxy called Trae­fik (some­times called Trae­fik Proxy). Then, based on the instal­la­tion doc­u­men­ta­tion, we’ll install Plau­si­ble Ana­lyt­ics using a tweaked ver­sion of their pre-fab Dock­er image. Final­ly, we’ll dis­cuss back­ing up and restor­ing your Plau­si­ble Ana­lyt­ics data. As an option­al addi­tion­al step, we’ll set up a sta­t­ic web­site as an exam­ple of how to get more mileage from our Trae­fik instal­la­tion. Ready? Let’s go.

Acknowledgements

There are two posts that I want to rec­om­mend that helped me out as I worked on this project. (There are addi­tion­al links at the end of this post.)

Ben is a talented developer and also a friend. His post compelled me to finally move off of Google Analytics, a project that’s been on my to-do list for ages. The post by Patricio on Code is something I found online when researching how to do this. This helpful post taught me the basics of getting Traefik installed. I suggest reading both posts. The Traefik repo used in this post is a fork of the repo referenced in Patricio’s post.

Github repos we’ll use

There are two repos­i­to­ries I’ve cre­at­ed that I use in my own Plau­si­ble Ana­lyt­ics set­up. Both are forks of oth­er repos. The third repo I’ve includ­ed is one I cre­at­ed dur­ing my debug­ging process to demon­strate how to host a sta­t­ic web­site using Trae­fik.

Each repo contains documentation. I hope you’ll continue reading along with my post 😃, but you can probably get your own Plausible Analytics installation working by going through the repos.

  1. https://github.com/johnfmorton/traefik-for-laravel-forge
  2. https://github.com/johnfmorton/plausible-with-traefik-update-for-laravel-forge
  3. https://github.com/johnfmorton/example-static-docker-website-for-traefik

Why do this?

For site usage data, Google Ana­lyt­ics is near­ly always the solu­tion a client requests. It’s pow­er­ful, but there are unknowns about its com­pli­ance with Europe’s GDPR laws. To cite two exam­ples, see Stop using Google Ana­lyt­ics, warns Sweden’s pri­va­cy watch­dog, as it issues over $1M in fines and GA4 Legal In Europe Fol­low­ing New Data Pri­va­cy Frame­work. Which is cor­rect? Both? Nei­ther? I don’t know. I’m not a lawyer.

The confusion around various privacy laws

If you don’t do busi­ness in Europe, there are also US-based laws around pri­va­cy. Here are a few I found with a cur­so­ry inter­net search.

What if you don’t do busi­ness in the EU or US?

That’s, well… a lot.

Regarding my client’s needs, OneTrust was the solution that fit them best. OneTrust gives the end user control over data collection. OneTrust and companies like it (Drata, Truyo, Cookiebot, and others, none of which I’ve used) provide toolkits to categorize cookies and tracking scripts so that end users can block certain categories of tracking. These are paid services.

Their ser­vices allow sites to com­ply with GDPR and oth­er pri­va­cy rules around the globe and still col­lect user data. One side effect is that we end up with the annoy­ing cook­ie con­sent pop­up that plagues the web today. Ugh.

Not for sale

I don’t want to wor­ry about GDPR com­pli­ance for my per­son­al projects. I also don’t feel com­pelled to share the data my site gen­er­ates with data bro­kers. I sim­ply want basic traf­fic data to see if any­thing I write inter­ests oth­ers.

After years of using Google Ana­lyt­ics, I’ve recent­ly switched to Plau­si­ble Ana­lyt­ics. My friend Ben’s arti­cle, Replac­ing Google Ana­lyt­ics with Self-host­ed Ana­lyt­ics, gave me the push I need­ed to devote the time to get it set up the way I want­ed. That’s what the bulk of this post is about.

The data collected

What sort of data can you expect from Plau­si­ble Ana­lyt­ics?

  • unique vis­i­tors
  • total vis­its (ses­sions)
  • total pageviews
  • views per vis­it
  • bounce rate
  • vis­it dura­tion
  • most vis­it­ed pages
  • cus­tom events.

I’ve made my dash­board pub­lic, so check it out at https://analytics.jmx.dev/supergeekery.com. I don’t use cus­tom events, so you will not see that data in my dash­board. Don’t wor­ry about expos­ing your ana­lyt­ic data to the world if you use Plau­si­ble. Pub­licly shar­ing your ana­lyt­ics data is not the default. I’m shar­ing this to help you under­stand what data to expect if you fol­low this path.


Supergeekery plausible screenshot

A screenshot from this blog's Plausible Analytics dashboard on August 1, 2023.


To self-host or not self-host. That is the question.

Google Ana­lyt­ics is easy. Google gives you a tag to include on your site, and you don’t have to wor­ry about the ser­vice behind the scenes col­lect­ing the data. You can do the same thing with Plau­si­ble. Plau­si­ble pro­vides ana­lyt­ics as a ser­vice for a rea­son­able fee.

Why would you pay for Plausible when Google Analytics is free? Regarding the cost of Google Analytics, I think of the service as “free-to-use,” but you’re providing data to Google, which has value to the company. Data is currency. Their powerful free-to-use service, Google Analytics, is what you’re getting in exchange for that data.

If you’d like to read more on the top­ic from the cre­ators of Plau­si­ble Ana­lyt­ics, read What makes Plau­si­ble a great Google Ana­lyt­ics alter­na­tive on their web­site.

If you want the eas­i­est and most fool-proof way to use Plau­si­ble Ana­lyt­ics, avoid self-host­ing and sign up for a Plau­si­ble host­ed plan. At the time I write this, it starts at about $9/​month.

If you want to self-host, your month­ly fees will be the costs of the serv­er, but you’ll be man­ag­ing that serv­er your­self.

Since I already manage several servers, self-hosting is the path I’ve chosen. For cost-comparison purposes, the cheapest Digital Ocean server I can configure to host Plausible Analytics costs about $5/month plus tax.

As I men­tioned ear­li­er, I read How to Deploy Dock­er Appli­ca­tions with Lar­avel Forge by Patri­cio on Code. It paved the way for my ini­tial work on get­ting Trae­fik installed. I ran into issues when adding Plau­si­ble Ana­lyt­ics to the mix, but I sug­gest you check out that post. Although the Trae­fik repo we’ll use has been mod­i­fied, the Trae­fik instal­la­tion process I describe below is inspired by that post.

Self-hosting? You need a server.

I use Lar­avel Forge to pro­vi­sion and man­age the servers I use. I host quite a few Craft CMS sites this way, includ­ing this blog you’re read­ing now. Since it’s the tool I already use and pay for, that’s how I am set­ting up my Plau­si­ble Ana­lyt­ics serv­er.

In the Lar­avel Forge con­trol pan­el, I cre­at­ed a brand new Dig­i­tal Ocean serv­er. I chose the cheap­est and small­est option avail­able. When select­ing the serv­er type, I set this up as a Work­er Serv­er, which only has PHP pre-installed. Make note of the IP address you were giv­en for the serv­er.


Set up your DNS

You could set up your DNS lat­er, but we now have the server’s IP address, so I sug­gest set­ting up DNS at this stage. This will allow time for the DNS records to prop­a­gate while the rest of the work hap­pens.

I set up three A records, one for each domain I wanted to use.

  • analytics.jmx.dev
  • traffic.jmx.dev (Yes, I used “traffic” instead of “traefik” for reasons.)
  • hello-world.jmx.dev

Each points to the IP address Lar­avel Forge pro­vid­ed in the pre­vi­ous step. In Ben’s arti­cle, he sug­gest­ed using Cloud­flare as a proxy for the Plau­si­ble Ana­lyt­ics domain. Cloud­flare will cache your script and reduce the load on your serv­er.
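If you want to know when the records have propagated, here is a minimal sketch of a check you could run from your own machine. It assumes `dig` is installed, and the IP address and domain names are placeholders, not my real values. Keep in mind that if you proxy a domain through Cloudflare, the lookup will return a Cloudflare address rather than your server's, so a mismatch there is expected.

```shell
# Hypothetical values: replace with your server's IP address and your own domains.
SERVER_IP="203.0.113.10"
DOMAINS="analytics.example.com traefik.example.com hello-world.example.com"

# Report whether one domain's A record matches the expected IP address.
# Arguments: domain, expected IP, resolved IP (empty if the lookup returned nothing).
report_record() {
    if [ "$3" = "$2" ]; then
        echo "$1 -> $3 (ok)"
    else
        echo "$1 -> ${3:-no answer} (expected $2)"
        return 1
    fi
}

# Look up each domain with dig and compare the answer with SERVER_IP.
check_dns() {
    dns_status=0
    for domain in $DOMAINS; do
        resolved="$(dig +short A "$domain" | tail -n 1)"
        report_record "$domain" "$SERVER_IP" "$resolved" || dns_status=1
    done
    return "$dns_status"
}
```

Run `check_dns` after saving your records. It returns a non-zero status until every domain resolves to the expected address.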


Your server will need Docker.

This min­i­mal serv­er does not come with Dock­er pre-installed.

Patri­cio’s post includ­ed a recipe to install Dock­er in Lar­avel Forge. Although you could install Dock­er via the com­mand line, han­dling the instal­la­tion using a recipe allows you to reuse code snip­pets on future servers eas­i­ly. You will prob­a­bly want Dock­er again at some point, right?

You can ref­er­ence the orig­i­nal recipe or use my ver­sion, which includes a slight mod­i­fi­ca­tion, an addi­tion­al sudo com­mand, to the recipe.

Add this script as a Recipe in Lar­avel Forge and run it on your serv­er. Now you’ve got Dock­er on your serv­er. You can con­firm that Dock­er is installed suc­cess­ful­ly by log­ging into the serv­er and enter­ing docker --version on the com­mand line.


Set up Traefik Proxy

In the server’s dash­board in Lar­avel Forge, open the New Site sec­tion. Fill in the Direc­to­ry Name field. I named mine traefik. Select Sta­t­ic HTML as the Project Type and click the Add Site but­ton. After a brief wait, your emp­ty site will be cre­at­ed.

In the App sec­tion of your new site, install Trae­fik Proxy using a Git repo. You can use my repos­i­to­ry from GitHub or fork it and make a cus­tomized ver­sion.

Enter the repos­i­to­ry name, johnfmorton/traefik-for-laravel-forge, and choose the main branch.

Uncheck the Install Com­pos­er Depen­den­cies check­box because we are not installing a PHP app. Click Install Repos­i­to­ry.

Set up your environment variables

The repo’s code is now on the serv­er, but no app has been installed yet.

We need to set some envi­ron­ment vari­ables first. Click the Envi­ron­ment nav item and set the fol­low­ing with your own val­ues.

TRAEFIK_DASHBOARD_HOST=traefik.example.com
[email protected]
REDIRECT_IP_ADDRESS_TO_URL=https://example.com

The REDIRECT_IP_ADDRESS_TO_URL is only used to redi­rect site traf­fic that may come to your IP address direct­ly. I redi­rect that traf­fic to my blog.

Update your Traefik deployment script

Click the App nav item. Update your Deploy Script to the fol­low­ing.

Note that the first line reflects the direc­to­ry name I chose above, traefik. If you choose a dif­fer­ent direc­to­ry name, you’ll use that instead. Also, check the option to make your envi­ron­ment vari­ables avail­able to the deploy script so that the TRAEFIK_DASHBOARD_HOST vari­able is replaced with your dash­board URL. As you can tell from the length of this post, I like ver­bose things, which is also reflect­ed in the echo state­ments I include in my deploy­ment script.

cd /home/forge/traefik

# We assume Docker has already been installed using the "Install Docker and Docker-Compose" recipe.
# The "Make .env variables available to deploy script" must be checked in Laravel Forge

# Confirm Docker is installed and available
if ! command -v docker &> /dev/null; then
    echo "Docker is not installed. Please install Docker and try again."
    exit 1
fi

echo "Deploying Traefik at ${TRAEFIK_DASHBOARD_HOST}"
echo "commit @${FORGE_DEPLOY_COMMIT} -- ${FORGE_DEPLOY_MESSAGE}"

if [[ $FORGE_MANUAL_DEPLOY -eq 1 ]]; then
    echo "This deploy was triggered manually."
fi

git pull origin $FORGE_SITE_BRANCH

# Create the 'proxy' network, if it doesn't already exist
echo "Creating the proxy network if it does not already exist"
docker network ls | grep proxy || docker network create proxy

# Use the docker-compose.prod.yml file to spin up the service
echo "Docker up with docker-compose.prod.yml."
docker-compose -f docker-compose.prod.yml up -d --remove-orphans

echo "Deployment complete."
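One detail worth knowing about the deploy script: the `docker network ls | grep proxy` check matches any line that contains the word "proxy", so a network named something like `my-proxy-test` would also satisfy it. If that ever bites you, a stricter variant is sketched here. It relies on `docker network inspect`, which fails when no network with exactly that name exists. The function name `ensure_network` is my own invention, not something from the repo.

```shell
# Create the named Docker network only when no network with exactly that name exists.
# Argument: network name (defaults to "proxy", the name used by the deploy script).
ensure_network() {
    network_name="${1:-proxy}"
    if docker network inspect "$network_name" >/dev/null 2>&1; then
        echo "Network '$network_name' already exists."
    else
        echo "Creating network '$network_name'."
        docker network create "$network_name" >/dev/null
    fi
}
```

In the deploy script, you could call `ensure_network proxy` in place of the `grep` line.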

Protecting your Traefik proxy from prying eyes

We want to keep our proxy from being avail­able to the pub­lic at large. We do that with a piece of mid­dle­ware for Trae­fik called Basi­cAuth. The docker-compose.yml file includ­ed the mid­dle­ware for basic auth.

## AUTH
- traefik.http.middlewares.auth.basicauth.usersfile=/auth/traefik.auth

It will look in a usersFile called traefik.auth for the valid username and password combinations. The original Traefik repo contained a handy script to make setting this up easy. Use the following command, which you’ll run from the command line within your server’s traefik directory. Replace the username and password with your own values.

./Taskfile auth username password

You can see the name and pass­word com­bi­na­tion cre­at­ed by enter­ing cat traefik.auth in your ter­mi­nal.

If you don’t want to use the script, you can use htpass­wd to cre­ate the name/​password com­bi­na­tion and man­u­al­ly update the traefik.auth file.
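For the manual route, here is a rough sketch of generating one `username:hash` line. It uses `htpasswd` with bcrypt when that tool is installed and otherwise falls back to `openssl passwd -apr1`. Both hash formats are ones Traefik's BasicAuth middleware accepts. The function name `make_auth_line` is hypothetical, and I'm assuming the file lives where the Taskfile script writes it.

```shell
# Print one "username:hash" line suitable for a Traefik usersFile.
# Uses htpasswd (bcrypt) when available; otherwise falls back to
# openssl's apr1 (MD5-based) hash.
# Arguments: username, password.
make_auth_line() {
    auth_user="$1"
    auth_pass="$2"
    if command -v htpasswd >/dev/null 2>&1; then
        # -n prints to stdout, -b takes the password from the command line, -B selects bcrypt
        htpasswd -nbB "$auth_user" "$auth_pass" | head -n 1
    else
        printf '%s:%s\n' "$auth_user" "$(openssl passwd -apr1 "$auth_pass")"
    fi
}
```

You would then append the result to the file, for example `make_auth_line username password >> traefik.auth`. Passing a password as an argument leaves it in your shell history, so clear that afterward if it matters to you.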

You should now be able to nav­i­gate to the domain name you set up. You should be prompt­ed for the user­name and pass­word you set for basic authen­ti­ca­tion. Once you log in, you should see the Trae­fik dash­board.


Traefik dashboard

The Traefik dashboard


Set up Plausible Analytics

We will repeat a sim­i­lar process set­ting up the Plau­si­ble ser­vice, so I’ll jump through these steps more quick­ly.

Set up anoth­er new site in your Lar­avel Forge dash­board for the serv­er. I called mine analytics. Install Plau­si­ble using my repo or a ver­sion you have forked.

Configure the Plausible environmental variables

There are quite a few envi­ron­men­tal vari­ables. I sug­gest copy­ing the entire con­tents of my example.env into your Envi­ron­ments file. Some of them you can leave as is, but here are the ones you will need to cus­tomize. The URL set­tings are finicky about the pro­to­col.

  • URL_FOR_TRAEFIK: This does not include the protocol. For example, mine is set to analytics.jmx.dev.
  • BASE_URL: This does include the protocol. For example, mine is set to https://analytics.jmx.dev.
  • SECRET_KEY_BASE: This must be 64 characters long. The Plausible docs suggest using openssl. You can also try the GRC Ultra High Security Password Generator.
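As a small sketch of the openssl approach: 48 random bytes encode to exactly 64 base64 characters, which gives you a value of the right length. The second helper derives the BASE_URL value from the bare host name, so the two URL settings can't drift apart. Both function names are hypothetical helpers of mine, not part of Plausible or the repo.

```shell
# Generate a 64-character secret: 48 random bytes become 64 base64 characters.
generate_secret_key_base() {
    openssl rand -base64 48 | tr -d '\n'
}

# Build the BASE_URL value (with protocol) from the URL_FOR_TRAEFIK value (without protocol).
# Any protocol prefix or trailing slash in the argument is stripped first.
base_url_for() {
    url_host="${1#http://}"
    url_host="${url_host#https://}"
    printf 'https://%s\n' "${url_host%/}"
}
```

For example, `base_url_for analytics.example.com` prints `https://analytics.example.com`, and `generate_secret_key_base` prints a fresh value to paste into SECRET_KEY_BASE.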

One note about the DISABLE_REGISTRATION variable. I have it set to true, but I initially launched this with it set to false so that I could register my account. I then set it to true and redeployed my app.

You will also need to cus­tomize the vari­ables for trans­ac­tion­al email, but you can fig­ure that out pret­ty eas­i­ly. You can have Plau­si­ble send you email reports of your data. It’s also used to reset your pass­word for the Plau­si­ble ser­vice.

Update your Plausible deployment script

Just like we cus­tomized the deploy­ment script for Trae­fik, we cus­tomize the deploy script for Plau­si­ble.

cd /home/forge/analytics

# We assume Docker has already been installed using the "Install Docker and Docker-Compose" recipe.
# The "Make .env variables available to deploy script" must be checked in Laravel Forge

# Confirm Docker is installed and available
if ! command -v docker &> /dev/null; then
    echo "Docker is not installed. Please install Docker and try again."
    exit 1
fi

# APP_NAME and BASE_URL come from the .env file, so the "Make .env variables available to deploy script" option must be checked

echo "Deploying: ${APP_NAME} at ${BASE_URL}"
echo "commit @${FORGE_DEPLOY_COMMIT} -- ${FORGE_DEPLOY_MESSAGE}"

if [[ $FORGE_MANUAL_DEPLOY -eq 1 ]]; then
    echo "This deploy was triggered manually."
fi

git pull origin $FORGE_SITE_BRANCH

echo "Docker up with docker-compose.yml combined with reverse-proxy/traefik/docker-compose.traefik.yml"
docker-compose -f docker-compose.yml -f reverse-proxy/traefik/docker-compose.traefik.yml up -d --remove-orphans

echo "Deploy complete."

Deploy your Plausible app

Now you can click the Deploy Now button in the Laravel Forge dashboard. After deployment completes, visit the URL you set up for your analytics. I’ve noticed a slight delay with the site loading when I push a new Plausible version to my server. You may see a “Bad Gateway” message initially while Traefik is configuring its reverse proxy. If so, give it a few seconds to complete its configuration. I’d say this should last no more than 15 to 25 seconds.
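If you'd rather not sit there refreshing the browser, a polling loop along these lines can wait out the startup delay for you. This is only a sketch: it assumes `curl` is available, treats any 2xx or 3xx response as "up", and the function name and defaults are my own choices.

```shell
# Poll a URL until it returns a 2xx or 3xx status, or give up after a number of attempts.
# Arguments: URL, max attempts (default 10), seconds between attempts (default 5).
wait_for_site() {
    site_url="$1"
    max_attempts="${2:-10}"
    delay_seconds="${3:-5}"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # -w '%{http_code}' prints only the HTTP status; a failed connection yields 000
        code="$(curl -s -o /dev/null -w '%{http_code}' "$site_url" || true)"
        case "$code" in
            2??|3??)
                echo "Up after $attempt attempt(s): HTTP $code"
                return 0
                ;;
        esac
        echo "Attempt $attempt: HTTP $code, retrying in ${delay_seconds}s"
        sleep "$delay_seconds"
        attempt=$((attempt + 1))
    done
    echo "Still not responding after $max_attempts attempts."
    return 1
}
```

With the defaults, `wait_for_site https://analytics.example.com` waits up to about 50 seconds, which comfortably covers the delay I've seen.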

If every­thing goes accord­ing to plan, Plau­si­ble Ana­lyt­ics is run­ning on your serv­er, and you can reach it via your cho­sen URL. As for set­ting up Plau­si­ble to track your site, con­sult the offi­cial doc­u­men­ta­tion. It’s easy.


Back up your analytics data

After you set up your site to collect data, we must discuss backups. Since we’re self-hosting, this is our problem to solve, not Plausible’s. Are you sure you don’t want to simply give them $9/month yet? 😆

Four scripts for backups

There are four scripts in the repo we will use to back up and restore data.

  1. backup-postgres.sh
  2. backup-clickhouse-data.sh
  3. restore-postgres.sh
  4. restore-clickhouse-data.sh

Plau­si­ble uses two data­bas­es. The first is a Post­gres data­base to store infor­ma­tion like your user infor­ma­tion and the sites you want to track. The sec­ond is a Click­house data­base with each site’s usage data. You need to back up both data­bas­es.

Environmental variables for backup scripts

Sev­er­al envi­ron­men­tal vari­ables are shown in the example.env relat­ed to the back­up scripts. You can leave all of them as their default val­ues or cus­tomize them to suit your needs.

  • LOCAL_BACKUP_PATH
  • LOCAL_POSTGRES_PATH
  • LOCAL_CLICKHOUSE_PATH
  • LOCAL_BACKUP_RETENTION_DAYS — this defaults to 7 if it is not set
  • POSTGRES_CONTAINER_NAME
  • CLICKHOUSE_CONTAINER_NAME
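To give a feel for what a retention setting like LOCAL_BACKUP_RETENTION_DAYS does, here is a minimal sketch of pruning old backup files with `find`. The variable names mirror the ones in example.env, but the pruning logic and the `/backups` fallback are assumptions on my part, not a copy of what the repo's scripts actually do.

```shell
# Delete backup files older than the retention window and print what was removed.
# Argument: directory to prune (defaults to LOCAL_BACKUP_PATH, then /backups).
# LOCAL_BACKUP_RETENTION_DAYS falls back to 7 when it is not set.
prune_old_backups() {
    backup_dir="${1:-${LOCAL_BACKUP_PATH:-/backups}}"
    retention_days="${LOCAL_BACKUP_RETENTION_DAYS:-7}"
    find "$backup_dir" -type f -mtime +"$retention_days" -print -delete
}
```

Calling `prune_old_backups` with no argument works on the default backup directory. Pass a path to try it safely on a scratch directory first.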

In the Lar­avel Forge dash­board, I set up my sched­uled jobs, as shown in the screen­shot below. I run backup-postgres.sh at mid­night and backup-clickhouse-data.sh one minute lat­er. Be sure they run as root, or they will exit with­out com­plet­ing.


Forge cron jobs plausible backups

The Plausible cron jobs should run as the root user.


If you set this up to run as I have, you’ll keep one week’s worth of back­up data in the /backups direc­to­ry on your serv­er. I sug­gest sync­ing this direc­to­ry to an off­site serv­er, like an S3 buck­et, every night. I don’t have that in this repo as of now, but I will like­ly update it to include it.
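Until that lands in the repo, an offsite sync could look roughly like this. It's a sketch that assumes the AWS CLI is installed and configured with credentials on the server. The bucket name and the S3_BACKUP_URI variable are placeholders I made up for illustration.

```shell
# Copy the local backup directory to an S3 bucket with the AWS CLI.
# LOCAL_BACKUP_PATH mirrors the variable in example.env; S3_BACKUP_URI is hypothetical.
sync_backups_offsite() {
    local_path="${LOCAL_BACKUP_PATH:-/backups}"
    s3_uri="${S3_BACKUP_URI:-s3://example-bucket/plausible-backups}"
    if ! command -v aws >/dev/null 2>&1; then
        echo "The aws CLI is not installed." >&2
        return 1
    fi
    # No --delete flag: files pruned locally stay in the bucket until you remove them there.
    aws s3 sync "$local_path" "$s3_uri"
}
```

You could schedule `sync_backups_offsite` a few minutes after the two backup jobs so it picks up the fresh files.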

Restoring your Plausible Analytics data

If you’re mov­ing servers or recov­er­ing from a serv­er emer­gency, you need some way to restore your backed-up data. That’s where restore-postgres.sh and restore-clickhouse-data.sh come in.

restore-postgres.sh backup-postgres-data.zip
restore-clickhouse-data.sh backup-clickhouse-event-data.zip

Try to restore data some­time before you need it. Your future self will thank you.


Use Traefik to host a static site

You’ve made it this far and might be ready to get on with your life. But wait, there’s more.

You have a work­ing Trae­fik proxy that can do more than man­age traf­fic for your ana­lyt­ics. The third repo I’ve pre­pared will show you how to host a basic sta­t­ic web­site from inside a Dock­er con­tain­er. This is just to show you how easy it is to do once you have Trae­fik set up. Pay atten­tion to the labels in the dock­er con­fig file. For exam­ple, you could make an API for your­self and host that.

Make a new site and deploy it

You know how to do this already. Let’s dive in. Make a new site in the Laravel Forge dashboard. I called mine hello-world. Install my johnfmorton/example-static-docker-website-for-traefik repo.

You have only one envi­ron­men­tal vari­able to set. We dis­cussed this in the DNS set­up at the begin­ning. I set mine to hello-world.jmx.dev.

SITE_URL=example.com

The deploy script needs to be updat­ed with the dock­er com­mand.

cd /home/forge/hello-world

# We assume Docker has already been installed using the "Install Docker and Docker-Compose" recipe.
# The "Make .env variables available to deploy script" must be checked in Laravel Forge

# Confirm Docker is installed and available
if ! command -v docker &> /dev/null; then
    echo "Docker is not installed. Please install Docker and try again."
    exit 1
fi

git pull origin $FORGE_SITE_BRANCH

if [ -f artisan ]; then
    $FORGE_PHP artisan migrate --force
fi

docker-compose -p basic-web-site -f docker-compose.yml up -d --remove-orphans

Then click “Deploy Now” and visit your URL.

Presto, you’ve got another site routed with Traefik. Did you notice your Let’s Encrypt certificate was automatically created? Check out my version here: https://hello-world.jmx.dev/

Check out the docker-compose.yml to see how sim­ple this was to cre­ate.

The end… or just the beginning?

Whoa, part­ner. That was a long ride. Did you make it? Let me know how it went. Ping me on social media. I keep the con­tact page updat­ed with how to track me down. Hap­py trails.


Riding into sunset

Let the credits roll

I read a lot of posts as I worked through this. Here are links referenced during development.

If you don’t use Lar­avel Forge, Dig­i­tal Ocean has a cou­ple of links about using Trae­fik and Plau­si­ble on a serv­er you set up.