A blog probably of interest only to nerds by John Morton.

10Aug2015

Tuto­r­i­al: Using Gulp with html­process and concatenate

I’ve been teaching myself Gulp recently. I found the basics easy to pick up. Although I didn’t consider myself a Grunt expert, I’d taught myself Grunt previously, and that helped me with Gulp.

Before we go further, check out the GitHub repository that contains the finished example project we’ll go over here: https://github.com/johnfmorton/using-gulp-htmlprocess-example

One thing I have seen in both the Grunt and Gulp workflows I’ve borrowed from other people’s repositories is a script block that looks like the following.

<!-- build:js allscripts.min.js -->
  <script src="script1.js"></script>
  <script src="script2.js"></script>
  <script src="script3.js"></script>
<!--/build--> 

This is a real­ly cool block of code.

During development, loading this HTML file pulls three separate script files into your page. These individual script files help break up your code into more manageable chunks.

The comment tags surrounding the 3 script tags give a hint as to what’s going to happen when using the Grunt or Gulp workflow. You would type in something like gulp processfiles and your workflow would take the HTML file, do various types of processing on it, and output a set of production-ready files.

In the process, those three separate script includes would be concatenated and minified into a single script file, and the HTML itself would be altered to include only the minified file. It’s magic.
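To make that transformation concrete, here is a rough sketch in plain JavaScript of what happens to the markup. This is not how gulp-processhtml is implemented. The function name replaceBuildBlock and its regex are my own illustration, and it only handles the build:js form shown above.

```javascript
// Illustration only: mimics the effect of a build:js block on the HTML.
// replaceBuildBlock is a hypothetical helper, not part of any plugin.
function replaceBuildBlock(html) {
    // Capture the target file name, then lazily match everything
    // up to the closing build comment.
    var block = /<!--\s*build:js\s+(\S+)\s*-->[\s\S]*?<!--\s*\/build\s*-->/g;
    return html.replace(block, function (match, target) {
        return '<script src="' + target + '"></script>';
    });
}

var devHtml = [
    '<!-- build:js allscripts.min.js -->',
    '  <script src="script1.js"></script>',
    '  <script src="script2.js"></script>',
    '  <script src="script3.js"></script>',
    '<!--/build-->'
].join('\n');

console.log(replaceBuildBlock(devHtml));
// prints: <script src="allscripts.min.js"></script>
```

Notice that only the HTML changes here. Nothing has built allscripts.min.js yet, which is the gap the rest of this post fills.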

I had seen this set up in workflows other people had created, but recently I’ve been working on my own custom workflow for a project I am doing. I wanted a similar concatenate/minify process in my own files. The trouble was I didn’t know how to do this after reading the documentation for the various Gulp plug-ins I was using. Sometimes I find the documentation a little dry.

Luck­i­ly, I’ve got it work­ing now so I thought I would doc­u­ment what I did in case it helps oth­ers get this work­ing. By oth­ers, I’m includ­ing myself, because I assume I will stum­ble upon this post at a lat­er date when I’m search­ing for this solu­tion again.

First of all, I’m only referencing Gulp as I go through this, but this post can be applied to Grunt fairly easily because Grunt, like Gulp, is just JavaScript.

So, given that this is going to be a Gulp workflow, we have a number of plug-ins to add. Each Gulp plug-in is used for a discrete piece of functionality; it is kept intentionally single-focused. That is why you sometimes have what might seem like a large number of plug-ins when you want to accomplish fairly simple tasks.

In ter­mi­nal, start by mak­ing your pack­age file.

npm init

You can just use the defaults or mod­i­fy them as you see fit. I end up with a package.json file that looks like this.

{
  "name": "gulp-htmlprocessing-example",
  "version": "0.0.1",
  "description": "Sample project for Gulp HTML Processing",
  "main": "gulpfile.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "John Morton",
  "license": "ISC"
} 

Now install the gulp module:

npm install gulp --save-dev

Then install gulp-processhtml:

npm install gulp-processhtml --save-dev

Then install gulp-con­cat:

npm install gulp-concat --save-dev

Last­ly, we’ll install del, which is short for delete. It’s not a Gulp plu­g­in, but a more basic node mod­ule. (See the Gulp doc for specifics on using del here.)

npm install del --save-dev

Now your package.json file should basi­cal­ly look like this:

{
  "name": "gulp-htmlprocessing-example",
  "version": "0.0.1",
  "description": "Sample project for Gulp HTML Processing",
  "main": "gulpfile.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "John Morton",
  "license": "ISC",
  "devDependencies": {
    "del": "^1.2.0",
    "gulp": "^3.9.0",
    "gulp-concat": "^2.6.0",
    "gulp-processhtml": "^1.1.0"
  }
} 

You will now also have a new directory called node_modules. It will contain the things we have just installed.

To make this test project work, we need a very basic HTML page built. It’s part of the GitHub repository that I’ve created for this post, but here’s the basic structure of the files we need:

dev
├── index.html
├── script1.js
├── script2.js
└── script3.js

At a min­i­mum, the index.html file will need to have the script tag men­tioned at the begin­ning of this arti­cle. The script#.js files can each have a console.log mes­sage in them.

Now it is time to cre­ate your gulpfile.js. You may do this from with­in the ter­mi­nal by typ­ing touch gulpfile.js and hit­ting enter. Open this file up in your text editor.

Let’s get all the mod­ules we’ve installed stored in vari­ables at the top of our gulpfile.js.

// dependencies
var gulp = require('gulp');
var processhtml = require('gulp-processhtml');
var concat = require('gulp-concat');
var del = require('del'); 

Now, let’s write our first Gulp task and name it “processhtml”.

gulp.task('processhtml', function() {
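    // remove existing replacementlist.txt & dist folder if they exist;
    // del.sync blocks until the deletion has finished, so it cannot
    // race with the write to 'dist' below
    del.sync([
        'replacementlist.txt',
        'dist'
    ]);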

    /* options for processhtml */
    var options = {
        list: "replacementlist.txt"
    };

    return gulp.src('dev/index.html')
        .pipe(processhtml(options))
        .pipe(gulp.dest('dist'));
}); 

Now, let’s test what we’ve got so far.

gulp processhtml

You should see that a new dist folder was created and that it contains a single file, index.html. Its contents should be identical to the index.html file in your ‘dev’ folder except in the script area. The lines that imported the 3 individual script files in your original index.html have been replaced with a single line that imports only one script file, allscripts.min.js. That’s cool, but there is no allscripts.min.js there yet.

We need to get a list of the files that were replaced. In the processhtml call we made, we passed in an option to generate just such a list. It’s stored in the file replacementlist.txt, which will be created in the same directory as your gulpfile.js. You should see it there now. Take a look at the contents of this file. It looks something like this:

/Users/username/Documents/myproject/dev/index.html:script1.js
/Users/username/Documents/myproject/dev/index.html:script2.js
/Users/username/Documents/myproject/dev/index.html:script3.js

The files that were replaced are listed on the right-hand side of each line. Preceding the name of each file is a : and, before that, the full path to the file that generated the replacement. This path information would be useful if you were processing a bunch of files in a single operation. We’re dealing with a single file here, index.html, so we don’t need to worry about that.
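If you ever do process several HTML files at once, you could split each line into its two halves. Here is a minimal sketch, assuming the list format shown above. The name parseReplacementList is made up for this example, and splitting on the last colon is my own choice, so that a path which itself contains a colon (a Windows drive letter, for example) doesn’t confuse things.

```javascript
// Hypothetical helper: turn the text of replacementlist.txt into
// an array of { source, replaced } objects.
function parseReplacementList(text) {
    return text.split(/\r?\n/)
        // skip blank lines or anything without a colon
        .filter(function (line) { return line.indexOf(':') !== -1; })
        .map(function (line) {
            // split on the LAST colon: path on the left, file on the right
            var i = line.lastIndexOf(':');
            return { source: line.slice(0, i), replaced: line.slice(i + 1) };
        });
}

var sample =
    '/Users/username/Documents/myproject/dev/index.html:script1.js\n' +
    '/Users/username/Documents/myproject/dev/index.html:script2.js';

var entries = parseReplacementList(sample);
console.log(entries[0].replaced); // script1.js
console.log(entries[1].source);   // /Users/username/Documents/myproject/dev/index.html
```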

We need to take this list and pick out the file names we need to concatenate. This is a job for a regex. Our regex will go through each line and find the colon and the remaining characters in each line.

The regex for this is :.+. Regex can be tricky to under­stand so let’s go through this piece by piece.

  1. The : finds the colon character.
  2. The . match­es any sin­gle char­ac­ter except a new line char­ac­ter after the colon we just found.
  3. The + causes the previous selection, the period that is matching any single character, to be matched repeatedly as many times as possible and for the selection to be as large as possible. That means it will keep matching characters until it gets to the new line character.
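You can try the regex on its own in Node before wiring it into Gulp. The sample string below is just a shortened stand-in for the replacement list, with made-up paths.

```javascript
// Two lines in the same "path:file" shape as replacementlist.txt
var sampleList =
    '/project/dev/index.html:script1.js\n' +
    '/project/dev/index.html:script2.js';

// Without any flags, match() stops at the first hit. The match starts
// at the colon and ends at the line break, because . does not match \n.
var firstMatch = sampleList.match(/:.+/);

console.log(firstMatch[0]); // :script1.js
```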

Now we need to use this regex in a javascript match func­tion to go through the list of files, and map each new file name into an array and store it in a vari­able we’ll call files. Also, dur­ing that map process, we’ll replace the : with the path, dev/, where these local files are stored with­in our project.

var files = fileList.match(/:.+/ig).map(function(matched) {
    // replace ':' with 'dev/'
    return matched.replace(/:/, 'dev/');
}); 

If you look at the regex again, you might wonder how we got from :.+ to /:.+/ig. The / characters simply indicate the start and end of the regex. The i means ‘ignore case’. In other words, it will treat caps and lowercase letters the same. The g means ‘global’: find every match in the text rather than stopping at the first one. (See Mozilla’s article on regex flags.)
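A quick way to see what the g flag buys you is to run the same pattern with and without it. The strings here are placeholders I made up. Since this particular pattern contains no letters, the i flag makes no difference to the result, so I’ve left it off in this sketch.

```javascript
var listText = 'a.html:one.js\na.html:two.js';

// No g flag: only the first match comes back
var withoutG = listText.match(/:.+/);

// With the g flag: an array holding every match in the text
var withG = listText.match(/:.+/g);

console.log(withoutG[0]); // :one.js
console.log(withG);       // [ ':one.js', ':two.js' ]
```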

Now we’ve got an array of the files we want to concatenate in a variable called files. We can pass this variable into gulp.src and pipe the result through concat to concatenate these files together in the ‘dist’ folder.

return gulp.src(files)
  .pipe(concat('allscripts.min.js')) // must match the name in the build comment
  .pipe(gulp.dest('./dist/')); 

Putting it all together, here’s what the concat task looks like:

gulp.task('concat', ['processhtml'], function() {
    // Try to read the replacementlist.txt file.
    try {
        var fileList = require('fs').readFileSync('replacementlist.txt', 'utf8');
        // remove the  replacementlist.txt because we're done with it
        del([
            'replacementlist.txt'
        ])
    } catch (e) {
        // If there was an error, it's probably  because the file wasn't there.
        console.error(e);
        // stop running this function
        return;
    }
    // we match a regex against the 'fileList' and map
    // the results back to an array called 'files'
    var files = fileList.match(/:.+/ig).map(function(matched) {
        // for each matched item (ie each line)
        // replace the ':' with 'dev/'
        return matched.replace(/:/, 'dev/');
    });

    console.log("Files to be replaced:", files);

    return gulp.src(files)
        .pipe(concat('allscripts.min.js')) // must match the name in the build comment
        .pipe(gulp.dest('./dist/'));
}); 

You’ll see I added a few extra things in here that I haven’t men­tioned so far. First, the first line (gulp.task('concat', ['processhtml'], function() {) includes an extra para­me­ter for the processhtml task. This is a way of telling Gulp that the con­cat task is depen­dent on the processhtml task to have been run.

I also included another del statement in there to get rid of the replacementlist.txt file we used to temporarily store the list of files we wanted to concatenate. You’re an adult, after all, and cleaning up after yourself is important.
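One edge case the task doesn’t cover: if the list ever comes back without any matches, match returns null and the .map call throws. Here is a small sketch of a guard. The helper name extractFiles and its baseDir parameter are hypothetical, just a way to pull the matching step out so it can be tried on its own.

```javascript
// Hypothetical helper: pull the file names out of the replacement list
// text and prefix them with the folder they live in.
function extractFiles(fileList, baseDir) {
    // match() returns null when nothing matches, so fall back to []
    var matches = fileList.match(/:.+/g) || [];
    return matches.map(function (matched) {
        // swap the leading ':' for the folder path
        return matched.replace(':', baseDir);
    });
}

console.log(extractFiles('/p/dev/index.html:script1.js', 'dev/')); // [ 'dev/script1.js' ]
console.log(extractFiles('', 'dev/'));                             // []
```

With a guard like this, you can check for an empty array and skip the concatenation step instead of crashing partway through the task.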

To wrap all of this up in a bow, let’s make the default Gulp task run these tasks in order.

gulp.task('default', ['processhtml','concat']); 

With this line in your gulpfile.js, you can sim­ply type in gulp at the com­mand line at the root of your project and it will do all these tasks for you.

Obviously this is not a complete project. I’m simply trying to isolate the creation of a single task.

Where to go from here.

You would also want to actually minify the concatenated JavaScript file in your distribution folder. Also, for simple ease of development, you’d probably want to set up a server to serve the files in your development folder as you work on them. Then, you’d probably want to set up a watch task that monitors the files in your development folder and triggers a live reload of your browser window as you work on them.