Updating Static Sites Using Gatsby.js and Node.js in the Back End

In this series, we’ll build a prototype for a static website that displays the most recent releases for well-known GitHub repositories. It will do this by producing simple static HTML pages that are updated daily. Frameworks for building static sites are great for this, and we’ll be using Gatsby.js, one of the most popular options.

Gatsby offers several serverless methods for gathering data for your front end, such as Headless CMS platforms and Gatsby source plugins. However, to have complete control over both the front and back ends, we’ll set up a back end to hold fundamental details about GitHub repositories and their most recent releases.

Additionally, I’ll go over a number of techniques that will enable you to manually update your application every day, or whenever a particular event takes place.

Our front-end application will run on Netlify, while our back-end application will be hosted on Heroku with a free plan. The back end will sleep periodically: “The dyno manager will automatically awaken the web dyno to execute the web process type when someone accesses the app.” Thus, we can use AWS Lambda and AWS CloudWatch to rouse it. As of this writing, this is the most affordable way to keep a prototype online continuously.

Our Example Node Static Website: What to Anticipate

In the interest of clarity, this article won’t go into detail about authentication, validation, scalability, or other broad subjects. The project’s coding will be as straightforward as possible. More crucial are the tools’ proper use and the project’s organization.

We will create and deploy our back-end application in this first section of the series. We will create and deploy our front-end application in the second section, along with a trigger for daily builds.

The Back End in Node.js

Node.js will be used to create the back-end application (not a requirement, but chosen for simplicity), and all communication will be done via REST APIs. We won’t be gathering data from the front end for this project. (Take a look at Gatsby Forms if you’re interested in doing that.)

We’ll start by creating a straightforward REST API back end that exposes the CRUD operations for our MongoDB repository collection. We will then set up a cron job that uses the GitHub API v4 (GraphQL) to update the documents in this collection. After that, we’ll upload all of this to the Heroku cloud. Finally, after our cron job has finished, we’ll start a rebuild of the front end.

The Front End in Gatsby.js

We will concentrate on putting the [createPages API](https://www.gatsbyjs.org/docs/creating-and-modifying-pages/) into practice in the second article. We’ll retrieve every repository from the back end and produce a single home page that lists every repository, along with a page for each repository record that is returned. After that, we’ll publish our front end to Netlify.

AWS Lambda and AWS CloudWatch

This step is optional if your application doesn’t go to sleep. If it does, you must ensure that your back end is up and running while the repositories are being updated. As a solution, you can set up a cron schedule on AWS CloudWatch 10 minutes before your daily update and link it to your AWS Lambda GET method as a trigger. Accessing the back-end application will awaken the Heroku instance. There will be more details at the conclusion of the second article.

The architecture we’ll be utilizing is as follows:

Architecture diagram showing AWS Lambda & CloudWatch pinging the Node.js back end, which gets daily updates by consuming the GitHub API and then builds the Gatsby-based front end, which consumes back end APIs to update its static pages and deploys to Netlify. The back end also deploys to Heroku with a free plan.

Assumptions

I presume that this article’s target audience is knowledgeable in the following fields:

  • HTML
  • CSS
  • JavaScript
  • REST APIs
  • MongoDB
  • Git
  • Node.js

It’s also beneficial if you are familiar with:

  • Express.js
  • Mongoose
  • GitHub API v4 (GraphQL)
  • Heroku, AWS, or another cloud provider
  • React

Let’s get started building the back end right away. We’ll divide it into two steps. Preparing REST API endpoints and linking them to our repository collection is the first step. Implementing a cron job that uses the GitHub API and updates the collection is the second.

Building the Back End of the Node.js Static Site Generator, Step 1: A Basic REST API

We’ll use Mongoose to handle our MongoDB connection and Express for our web application framework. You might be able to skip to Step 2 if you are already familiar with Express and Mongoose.

(If you need a refresher on Express, the official Express starter guide can help; if Mongoose is new to you, the official Mongoose starter guide should be helpful.)

Project Structure

Our project’s structure of files and folders will be straightforward:

A folder listing of the project root, showing config, controller, model, and node_modules folders, plus a few standard root files like index.js and package.json. The files of the first three folders follow the naming convention of repeating the folder name in each filename within a given folder.

In greater detail:

  • Environment variable configuration file: env.config.js
  • For mapping rest endpoints: routes.config.js
  • Methods for working with our repository model are located in repository.controller.js
  • MongoDB schema for the repository and CRUD operations: repository.model.js
  • Initializer class: index.js
  • Project properties and dependencies are listed in package.json

Implementation

After adding these dependencies to package.json, execute npm install (or yarn, if you have Yarn installed):

{
  // ...
  "dependencies": {
    "body-parser": "1.7.0",
    "express": "^4.8.7",
    "moment": "^2.17.1",
    "moment-timezone": "^0.5.13",
    "mongoose": "^5.1.1",
    "node-uuid": "^1.4.8",
    "sync-request": "^4.0.2"
  }
  // ...
}

Our env.config.js file currently only contains the port, environment (dev or prod), and mongoDbUri properties:

module.exports = {
  "port": process.env.PORT || 3000,
  "environment": "dev",
  "mongoDbUri": process.env.MONGODB_URI || "mongodb://localhost/github-consumer"
};

Request mappings are stored in routes.config.js, which will invoke the appropriate method in our controller:

const RepositoryController = require('../controller/repository.controller');

exports.routesConfig = function(app) {

  app.post('/repositories', [
    RepositoryController.insert
  ]);

  app.get('/repositories', [
    RepositoryController.list
  ]);

  app.get('/repositories/:id', [
    RepositoryController.findById
  ]);

  app.patch('/repositories/:id', [
    RepositoryController.patchById
  ]);

  app.delete('/repositories/:id', [
    RepositoryController.deleteById
  ]);
};

The file repository.controller.js represents our service layer. It is in charge of calling the appropriate function in our repository model:

const RepositoryModel = require('../model/repository.model');

exports.insert = (req, res) => {
  RepositoryModel.create(req.body)
    .then((result) => {
      res.status(201).send({
        id: result._id
      });
    });
};

exports.findById = (req, res) => {
  RepositoryModel.findById(req.params.id)
    .then((result) => {
      res.status(200).send(result);
    });
};

exports.list = (req, res) => {
  RepositoryModel.list()
    .then((result) => {
      res.status(200).send(result);
    })
};

exports.patchById = (req, res) => {
  RepositoryModel.patchById(req.params.id, req.body)
    .then(() => {
      res.status(204).send({});
    });
};

exports.deleteById = (req, res) => {
  RepositoryModel.deleteById(req.params.id, req.body)
    .then(() => {
      res.status(204).send({});
    });
};

MongoDB connectivity and CRUD operations for the repository model are handled by repository.model.js. The model’s fields are:

  • owner: The repository’s owner (organization or user)
  • name: The repository’s name
  • createdAt: The last release’s creation date
  • resourcePath: The last release’s path
  • tagName: The last release’s tag
  • releaseDescription: The release notes
  • homepageUrl: The project’s homepage URL
  • repositoryDescription: The repository’s description
  • avatarUrl: The project owner’s avatar URL
const Mongoose = require('mongoose');
const Config = require('../config/env.config');

const MONGODB_URI = Config.mongoDbUri;

Mongoose.connect(MONGODB_URI, {
  useNewUrlParser: true
});

const Schema = Mongoose.Schema;

const repositorySchema = new Schema({
  owner: String,
  name: String,
  createdAt: String,
  resourcePath: String,
  tagName: String,
  releaseDescription: String,
  homepageUrl: String,
  repositoryDescription: String,
  avatarUrl: String
});

repositorySchema.virtual('id').get(function() {
  return this._id.toHexString();
});

// Ensure virtual fields are serialised.
repositorySchema.set('toJSON', {
  virtuals: true
});

repositorySchema.findById = function(cb) {
  return this.model('repository').find({
    id: this.id
  }, cb);
};

const Repository = Mongoose.model('repository', repositorySchema);

exports.findById = (id) => {
  return Repository.findById(id)
    .then((result) => {
      if (result) {
        result = result.toJSON();
        delete result._id;
        delete result.__v;
        return result;
      }
    });
};

exports.create = (repositoryData) => {
  const repository = new Repository(repositoryData);
  return repository.save();
};

exports.list = () => {
  return new Promise((resolve, reject) => {
    Repository.find()
      .exec(function(err, repositories) {
        if (err) {
          reject(err);
        } else {
          resolve(repositories);
        }
      })
  });
};

exports.patchById = (id, repositoryData) => {
  return new Promise((resolve, reject) => {
    Repository.findById(id, function(err, repository) {
      if (err) return reject(err);
      for (let i in repositoryData) {
        repository[i] = repositoryData[i];
      }
      repository.save(function(err, updatedRepository) {
        if (err) return reject(err);
        resolve(updatedRepository);
      });
    });
  })
};

exports.deleteById = (id) => {
  return new Promise((resolve, reject) => {
    Repository.deleteOne({
      _id: id
    }, (err) => {
      if (err) {
        reject(err);
      } else {
        resolve();
      }
    });
  });
};

exports.findByOwnerAndName = (owner, name) => {
  return Repository.find({
    owner: owner,
    name: name
  });
};

Following our initial commit, this is what we have: A MongoDB connection and our REST operations.

Using the following command, we can launch our application:

node index.js

Testing

Send requests to localhost:3000 for testing (using e.g. Postman or cURL):

Insert a Repository (Required Fields Only)

Post: http://localhost:3000/repositories

Body:

{
  "owner" : "facebook",
  "name" :  "react"
}

Get Repositories

Get: http://localhost:3000/repositories

Get by ID

Get: http://localhost:3000/repositories/:id

Patch by ID

Patch: http://localhost:3000/repositories/:id

Body:

{
  "owner" : "facebook",
  "name" :  "facebook-android-sdk"
}

It’s time to automate updates now that that’s working.

Building the Back End of the Node.js Static Site Generator, Step 2: Using a Cron Job to Update Repository Releases

In this section, we’ll set up a straightforward cron job (which will launch at UTC midnight) to update the GitHub repositories that we’ve added to our database. In our example above, we only included the owner and name parameters; however, these two elements are sufficient for us to retrieve general details about a specific repository.

We must use the GitHub API in order to update our data. It is advisable to be familiar with GraphQL and v4 of the GitHub API for this section.

Additionally, we must create a GitHub access token. These are the bare minimum scopes required:

The GitHub token scopes we need are repo:status, repo_deployment, public_repo, read:org, and read:user.

That will produce a token that we can use to communicate with GitHub and transmit requests.

Let’s now return to our code.

Two new dependencies exist in package.json:

  • "axios": "^0.18.0" is an HTTP client, allowing us to communicate with the GitHub API.
  • "cron": "^1.7.0" is a job scheduler for cron.

After adding dependencies, as usual, run npm install or yarn.

In env.config.js, we’ll also require two new properties:

  • "githubEndpoint": "https://api.github.com/graphql"
  • "githubAccessToken": process.env.GITHUB_ACCESS_TOKEN (you’ll need to configure the GITHUB_ACCESS_TOKEN environment variable with your personal access token)

Make a new file named cron.controller.js in the controller directory. At the scheduled times, it will simply call the updateRepositories method from repository.controller.js:

const RepositoryController = require('../controller/repository.controller');
const CronJob = require('cron').CronJob;

function updateDaily() {
  RepositoryController.updateRepositories();
}

exports.startCronJobs = function () {
  // '0 0 * * *' fires updateDaily() at midnight every day; the last argument sets the time zone to UTC.
  new CronJob('0 0 * * *', function () {updateDaily()}, null, true, 'UTC');
};

The final adjustments for this section will be made to repository.controller.js. We’ll build it to update all repositories at once for the sake of clarity. However, you might go over the resource limitations of GitHub’s API if you have a lot of repositories. You’ll need to adjust this to run in smaller batches over time if that’s the case.
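If you do hit rate limits, a batched version could be sketched like this; chunk and updateInBatches are hypothetical helpers, not part of the project as written:

```javascript
// Hypothetical sketch: update repositories in small batches with a pause
// between batches, to stay under GitHub's API rate limits.
function chunk(array, size) {
  const batches = [];
  for (let i = 0; i < array.length; i += size) {
    batches.push(array.slice(i, i + size));
  }
  return batches;
}

async function updateInBatches(repositories, batchSize, delayMs, updateFn) {
  for (const batch of chunk(repositories, batchSize)) {
    // Update one batch in parallel, then wait before the next one.
    await Promise.all(batch.map(updateFn));
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```

Here, updateFn would be the same getLatestRelease function we define below; batchSize and delayMs would need tuning against GitHub’s documented limits.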

The update functionality’s all-at-once implementation will resemble this:

// Added at the top of repository.controller.js:
const Axios = require('axios');
const Config = require('../config/env.config');

const GITHUB_API_URL = Config.githubEndpoint;
const GITHUB_ACCESS_TOKEN = Config.githubAccessToken;

async function asyncUpdate() {

  await RepositoryModel.list().then((array) => {
    const promises = array.map(getLatestRelease);

    return Promise.all(promises);
  });
}

exports.updateRepositories = async function update() {
  console.log('GitHub Repositories Update Started');

  await asyncUpdate().then(() => {
    console.log('GitHub Repositories Update Finished');
  });
};

Finally, we’ll call the endpoint to update the repository model.

The getLatestRelease function will generate a GraphQL query before contacting the GitHub API. Then, the updateDatabase function will handle the request’s response.

async function updateDatabase(responseData, owner, name) {

  let createdAt = '';
  let resourcePath = '';
  let tagName = '';
  let releaseDescription = '';
  let homepageUrl = '';
  let repositoryDescription = '';
  let avatarUrl = '';

  if (responseData.repository && responseData.repository.releases && responseData.repository.releases.nodes[0]) {

    createdAt = responseData.repository.releases.nodes[0].createdAt;
    resourcePath = responseData.repository.releases.nodes[0].resourcePath;
    tagName = responseData.repository.releases.nodes[0].tagName;
    releaseDescription = responseData.repository.releases.nodes[0].description;
    homepageUrl = responseData.repository.homepageUrl;
    repositoryDescription = responseData.repository.description;

    if (responseData.organization && responseData.organization.avatarUrl) {
      avatarUrl = responseData.organization.avatarUrl;
    } else if (responseData.user && responseData.user.avatarUrl) {
      avatarUrl = responseData.user.avatarUrl;
    }

    const repositoryData = {
      owner: owner,
      name: name,
      createdAt: createdAt,
      resourcePath: resourcePath,
      tagName: tagName,
      releaseDescription: releaseDescription,
      homepageUrl: homepageUrl,
      repositoryDescription: repositoryDescription,
      avatarUrl: avatarUrl
    };

    await RepositoryModel.findByOwnerAndName(owner, name)
      .then((oldGitHubRelease) => {
        if (!oldGitHubRelease[0]) {
          RepositoryModel.create(repositoryData);
        } else {
          RepositoryModel.patchById(oldGitHubRelease[0].id, repositoryData);
        }
        console.log(`Updated latest release: http://github.com${repositoryData.resourcePath}`);
      });
  }
}

async function getLatestRelease(repository) {

  const owner = repository.owner;
  const name = repository.name;

  console.log(`Getting latest release for: http://github.com/${owner}/${name}`);

  const query = `
         query {
           organization(login: "${owner}") {
               avatarUrl
           }
           user(login: "${owner}") {
               avatarUrl
           }
           repository(owner: "${owner}", name: "${name}") {
               homepageUrl
               description
               releases(first: 1, orderBy: {field: CREATED_AT, direction: DESC}) {
                   nodes {
                       createdAt
                       resourcePath
                       tagName
                       description
                   }
               }
           }
         }`;

  const jsonQuery = JSON.stringify({
    query
  });

  const headers = {
    'User-Agent': 'Release Tracker',
    'Authorization': `Bearer ${GITHUB_ACCESS_TOKEN}`
  };

  await Axios.post(GITHUB_API_URL, jsonQuery, {
    headers: headers
  }).then((response) => {
    return updateDatabase(response.data.data, owner, name);
  });
}

We will have finished implementing a cron scheduler to get daily updates from our GitHub repositories following our second commit.

The back end is almost finished. However, we’ll cover the final step in the following article because it should be completed after the front end has been implemented.

Deploying the Node Static Site Generator Back End to Heroku

We will deploy our application to Heroku in this stage, so you’ll need to set up an account with them if you don’t already have one. If we link our Heroku account to GitHub, continuous deployment will be much simpler. Consequently, I’m hosting my project on GitHub.

Add a new app from the dashboard after logging into your Heroku account:

Choosing "Create new app" from the New menu in the Heroku dashboard.

Give it a distinct name:

Naming your app in Heroku.

You’ll be taken to a deployment area. Choose GitHub as your deployment strategy, then select “Connect” after finding your repository.

Linking your new GitHub repo to your Heroku app.

You can enable automatic deployments to make things simpler. Whenever you commit to your GitHub repository, it will deploy:

Enabling automatic deploys in Heroku.

We must now include MongoDB as a resource. Click “Find more add-ons” under the Resources tab. (Personally, I use mLab MongoDB.)

Adding a MongoDB resource to your Heroku app.

Install it after entering the name of your app in the “App to provision to” field:

The mLab MongoDB add-on provision page in Heroku.

Finally, at the root of our project, we need to create a file called Procfile that lists the commands the app will run when Heroku launches.

Our Procfile is straightforward:

web: node index.js

Make the file, and then commit it. When you push the commit, Heroku will automatically deploy your application, which will be accessible at https://[YOUR_UNIQUE_APP_NAME].herokuapp.com/.

We can send the same requests we sent to localhost earlier to check that it’s working.

Node.js, Express, MongoDB, Cron, and Heroku: We’re Halfway There!

This is what our repo will look like after our third commit.

We’ve established the Node.js/Express-based REST API on our back end, the updater that uses the GitHub API, and a cron job to start it. In order to later provide data for our static web content generator, we have deployed our back end using Heroku and a hook for continuous integration. You are now prepared for the second section, where we will finish the program and implement the front end.

Licensed under CC BY-NC-SA 4.0