
Published 20th April 2022

Supporting Expiring OAuth Access Tokens for GitLab

GitLab added an option to OAuth integrations to have your access tokens expire after two hours, and are deprecating support for non-expiring tokens in their May 2022 release. As I have talked about GitLab OAuth integration before, I’m writing this as an update to tell you how you might want to handle token rotation for your GitLab integration, whether you are creating a new application or updating your existing one.

What is an expiring OAuth access token?

As I touched on previously, after creating an OAuth Application on GitLab, you can send your users to GitLab’s OAuth endpoint, where they are prompted to sign into their account and authorise your application. This redirects them to your website with a code in the query parameters, and you can exchange that code with GitLab to receive an access token. When you make the request, you will receive a response like this:

{
 "access_token": "de6780bc506a0446309bd9362820ba8aed28aa506c71eedbe1c5c4f9dd350e54",
 "token_type": "bearer",
 "expires_in": 7200,
 "refresh_token": "8257e65c97202ed1726cf9571600918f3bffb2544b26e00a61df9897668c33a1",
 "created_at": 1607635748
}

This access_token can then be passed as a bearer token in the Authorization header when making HTTP requests to the GitLab API, or used to authenticate via git. Previously, this access token would not expire unless you made a request to revoke it. This posed a possible security issue: if the token was leaked or obtained by a malicious actor, they would be able to authenticate as the user, and the stolen token would remain valid until it was manually revoked by either you or the user. They could have access for a very long time without you knowing!

To solve this issue, GitLab’s access tokens now expire after two hours. This means if a token is stolen, the malicious actor would have a very limited window to use the token. Whilst a lot of damage can still be done in the space of two hours, it helps to minimise the damage from some ways a token might get leaked, for example an access token being pushed to a public repository or a database dump being leaked.
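As a small illustration of how the expiry fields fit together, the sketch below works out when a token like the one above stops being valid. It assumes created_at is a Unix timestamp in seconds and expires_in is a lifetime in seconds, which matches the example response. The helper names getExpiryDate and isExpired, and the 30 second safety margin, are made up for this example.

```javascript
// Token response shaped like the example above (token values shortened)
const tokenResponse = {
  access_token: 'de6780bc...',
  token_type: 'bearer',
  expires_in: 7200,
  refresh_token: '8257e65c...',
  created_at: 1607635748,
};

// created_at and expires_in are assumed to be in seconds; JavaScript dates use milliseconds
const getExpiryDate = ({ created_at, expires_in }) =>
  new Date((created_at + expires_in) * 1000);

// Treat the token as expired slightly early, so a request doesn't start
// with a token that expires while the request is in flight
const isExpired = (response, now = new Date(), safetyMarginSeconds = 30) =>
  now.getTime() >= getExpiryDate(response).getTime() - safetyMarginSeconds * 1000;

console.log(getExpiryDate(tokenResponse).toISOString());
console.log(isExpired(tokenResponse));
```

Since the example token was created in 2020, isExpired reports it as expired when run today.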

How do you handle expiring tokens?

Once a token has expired, your API requests will fail and you will be prompted by GitLab to generate a new token. To do this, you must make another request to GitLab’s OAuth endpoint. Much like the initial link, you must provide your application’s Client ID and Client Secret, but instead of passing the linking code, you will pass in the user’s refresh token. This will invalidate both the existing access token (if it is still valid) and the refresh token you just used, and return a new access token and refresh token. The access token will be valid for another two hours. You will need to store the new refresh token, as this token will be used the next time you request a new token.

import fetch from 'node-fetch';

const CLIENT_ID = process.env.GITLAB_CLIENT_ID;
const CLIENT_SECRET = process.env.GITLAB_CLIENT_SECRET;

export const refreshGitlabOAuthToken = async ({ refreshToken }) => {
  const formBody = {
    client_id: CLIENT_ID,
    client_secret: CLIENT_SECRET,
    refresh_token: refreshToken,
    grant_type: 'refresh_token',
  };

  const body = new URLSearchParams(formBody).toString();

  const options = {
    method: 'POST',
    headers: { 'content-type': 'application/x-www-form-urlencoded' },
    body,
  };

  const url = 'https://gitlab.example.com/oauth/token';

  const response = await fetch(url, options);

  if (!response.ok) {
    const message = await response.text();
    throw new Error(`Failed to refresh token. Status ${response.status}. Message: ${message}`)
  }

  const { access_token: accessToken, refresh_token: newRefreshToken, expires_in: expiresIn } = await response.json();

  return {
    accessToken,
    refreshToken: newRefreshToken,
    expiresIn,
  };
}

By requiring the application’s Client ID and Client Secret, tokens are more secure. Even if a user’s refresh token is stolen by a malicious actor, that refresh token cannot be exchanged for a new token unless they also have access to your application’s secrets. This means the malicious actor would have to compromise your entire application to authenticate as a user, rather than compromising a single user’s token.

Storing access tokens in Redis

When you are making API requests to GitLab, you want to try and minimise the amount of overhead to make your requests as fast as possible. One way of doing this is by caching the access token using Redis, which has a fast read speed making it ideal for this situation. After making a request to generate a GitLab access token, we can store it in Redis with a time to live slightly shorter than the expires_in time returned by the refresh response. Then, whenever we want to make a request, we can check the Redis cache for the access token. If it exists in the cache, we know it is (probably) valid, and if it doesn’t exist, we know we need to generate a new token.

import { refreshGitlabOAuthToken } from './refresh-token';
import { encryptToken, decryptToken } from './encryption-utils';

// This needs to be deterministic and use the user's GitLab ID in case multiple users have linked the same GitLab account
const getCacheTokenName = (gitlabId) => `gl-cache-${gitlabId}`;

export const getGitlabTokenWithCache = async ({ userId, RedisClient, db, ignoreCache }) => {
  const userObject = await db.getCollection('users').findOne({ _id: userId });
  const { gitlabId, refreshToken: oldEncryptedRefreshToken } = userObject;

  const cacheTokenName = getCacheTokenName(gitlabId);

  // Check the cache if we haven't explicitly chosen to ignore it.
  const cachedToken = ignoreCache ? null : await RedisClient.get(cacheTokenName);

  if (cachedToken) {
    return decryptToken(cachedToken);
  }

  const oldDecryptedRefreshToken = decryptToken(oldEncryptedRefreshToken);

  const { accessToken, refreshToken, expiresIn } = await refreshGitlabOAuthToken({ refreshToken: oldDecryptedRefreshToken });

  // If the refresh request returned an expiry time, we should expire a little before that.
  // If it doesn't, set an expiry time anyway for security.
  const expiryTime = expiresIn ? expiresIn - 30 : 6000;

  // Encrypt the tokens for storage
  const encryptedAccessToken = encryptToken(accessToken);
  const encryptedRefreshToken = encryptToken(refreshToken);

  // Set the cache
  await RedisClient.set(cacheTokenName, encryptedAccessToken, 'EX', expiryTime);

  // Update the refresh token in the database
  await db.getCollection('users').updateMany({ gitlabId }, { $set: { refreshToken: encryptedRefreshToken }});

  return accessToken;
}

In the above example, we first fetch the user’s details from a database. Here we are using MongoDB to store user data, but this will work with any database. When performing the initial OAuth link, you should make sure to store the user’s refresh token as well as their GitLab user ID which you can fetch via the GitLab API. Then, we check whether there is an access token in the Redis cache for that GitLab user, and return the cached token if it’s available. If it’s not available, we call the refreshGitlabOAuthToken function from before and cache the result in Redis. Make sure not to store tokens as plaintext, especially the refresh token as that does not expire until used. When generating a key for the cache, you should use a function that is deterministic so it always returns the same result, and it should use the user’s GitLab user ID. Access tokens and refresh tokens are linked to a specific GitLab user, so if you want to allow multiple users of your application to link the same GitLab account, you need to make sure to update all the tokens correctly.
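The encryption-utils module imported above isn't shown in this post. As a rough sketch only, it could look something like the following, using AES-256-GCM from Node's built-in crypto module. The TOKEN_ENCRYPTION_KEY environment variable, the random fallback key and the dot-separated storage format are all assumptions made for this example. It is written as a standalone CommonJS script, so swap require for import and add exports to match the modules above.

```javascript
const nodeCrypto = require('node:crypto');

// Hypothetical key source: a 32-byte key, hex-encoded in an environment variable.
// The random fallback only exists so this sketch runs standalone; a real application
// needs a stable key, otherwise stored tokens can't be decrypted after a restart.
const KEY = process.env.TOKEN_ENCRYPTION_KEY
  ? Buffer.from(process.env.TOKEN_ENCRYPTION_KEY, 'hex')
  : nodeCrypto.randomBytes(32);

const encryptToken = (plaintext) => {
  // Use a fresh 96-bit nonce for every encryption
  const iv = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv('aes-256-gcm', KEY, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const authTag = cipher.getAuthTag();

  // Store the nonce, auth tag and ciphertext together as one string
  return [iv, authTag, ciphertext].map((part) => part.toString('base64')).join('.');
};

const decryptToken = (stored) => {
  const [iv, authTag, ciphertext] = stored
    .split('.')
    .map((part) => Buffer.from(part, 'base64'));

  const decipher = nodeCrypto.createDecipheriv('aes-256-gcm', KEY, iv);
  decipher.setAuthTag(authTag);

  // final() throws if the stored value has been tampered with
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
};
```

An authenticated mode such as GCM is used here so that a modified value in the database or cache is rejected rather than silently decrypted into garbage.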

Implementing a lock to prevent multiple refreshes at once

One issue you might experience with the above is the situation where multiple refresh requests come in quick succession. For example, if you make three requests at the same time, and there is no token in the cache, all three requests will try to generate a new access token with the same refresh token. Only the first request will succeed, because the refresh token is invalidated as soon as it is used. To prevent this, we can implement a lock - when a request wants to refresh the token, it tries to take the lock first. If the lock is available, it takes the lock and performs the refresh as normal, releasing the lock afterwards. If the lock is not available, it waits until the lock is released and takes the new cached value.

There are many ways to implement locks and many existing packages that will do it for you. Redis recommends using the Redlock algorithm. Here, we’re just implementing a simple lock primitive using the Redis SETNX command - there are lots of improvements you can make here.

// If lock has been held for longer than this, force open the lock
const expireLockTime = 120000;

const getLockName = (gitlabId) => `gl-lock-${gitlabId}`;

const timeout = async (ms) =>
  await new Promise((resolve) => {
    setTimeout(resolve, ms);
  });

// Performs exponential backoff
const backoff = async (f, options) => {
  let { initialWait, maxRetries } = options || {};
  initialWait = initialWait || 1000;
  maxRetries = maxRetries || 4;

  let currentWait = initialWait;
  let currentRetries = 0;

  while (true) {
    try {
      return await f();
    } catch (e) {
      if (currentRetries >= maxRetries) {
        throw e;
      }
      await timeout(currentWait + Math.floor(Math.random() * 300));
      currentWait *= 2;
      currentRetries += 1;
    }
  }
};


export const getRefreshLock = async ({ gitlabId, RedisClient }) => {
  const lockKeyName = getLockName(gitlabId);
  const lockTime = await RedisClient.get(lockKeyName);
  const currentTime = Date.now();

  let receivedLockImmediately = true;
  
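  // If the existing lock is older than expireLockTime, assume its holder failed and try to take it over
  if (lockTime && (currentTime - Number(lockTime)) > expireLockTime) {
    // SET with the GET option (Redis 6.2+) writes our timestamp and returns the previous value
    const lockTime2 = await RedisClient.set(lockKeyName, currentTime, 'GET');

    if (!lockTime2) {
      return { receivedLockImmediately };
    }

    if ((currentTime - Number(lockTime2)) > expireLockTime) {
      return { receivedLockImmediately };
    }
  }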

  await backoff(async () => {
    const receivedLock = await RedisClient.setnx(lockKeyName, Date.now());

    if (!receivedLock) {
      receivedLockImmediately = false;

      throw new Error('Failed to refresh token - lock was not released in time.');
    }
  });

  return { receivedLockImmediately };
}

export const releaseRefreshLock = async ({ gitlabId, RedisClient }) => {
  const lockKeyName = getLockName(gitlabId);

  // DEL removes the lock key so the next SETNX can succeed
  await RedisClient.del(lockKeyName);
}

In the above, we create a simple lock utility, which tries to receive the lock if it is available and waits with an exponential backoff if the lock isn’t available. It also has some simple handling to obtain the lock if it has been locked for too long, in case something goes wrong with the handling somewhere else. We return whether the lock was immediately received, and we use this handling in the refresh function. If the lock was received immediately, we can refresh the token, and if the thread had to wait for the lock, we take the new cached value.

import { refreshGitlabOAuthToken } from './refresh-token';
import { encryptToken, decryptToken } from './encryption-utils';
import { getRefreshLock, releaseRefreshLock } from './lock-utils';

// This needs to be deterministic and use the user's GitLab ID in case multiple users have linked the same GitLab account
const getCacheTokenName = (gitlabId) => `gl-cache-${gitlabId}`;

export const getGitlabTokenWithCache = async ({ userId, RedisClient, db, ignoreCache }) => {
  const userObject = await db.getCollection('users').findOne({ _id: userId });
  const { gitlabId, refreshToken: oldEncryptedRefreshToken } = userObject;

  const cacheTokenName = getCacheTokenName(gitlabId);

  // Check the cache if we haven't explicitly chosen to ignore it.
  const cachedToken = ignoreCache ? null : await RedisClient.get(cacheTokenName);

  if (cachedToken) {
    return decryptToken(cachedToken);
  }

  const { receivedLockImmediately } = await getRefreshLock({ gitlabId, RedisClient });

  try {
    if (!receivedLockImmediately) {
      const cachedToken2 = await RedisClient.get(cacheTokenName);

      if (cachedToken2) {
        // The finally block below releases the lock, so we don't release it here as well
        return decryptToken(cachedToken2);
      }
    }

    const oldDecryptedRefreshToken = decryptToken(oldEncryptedRefreshToken);

    const { accessToken, refreshToken, expiresIn } = await refreshGitlabOAuthToken({ refreshToken: oldDecryptedRefreshToken });

    // If the refresh request returned an expiry time, we should expire a little before that.
    // If it doesn't, set an expiry time anyway for security.
    const expiryTime = expiresIn ? expiresIn - 30 : 6000;

    // Encrypt the tokens for storage
    const encryptedAccessToken = encryptToken(accessToken);
    const encryptedRefreshToken = encryptToken(refreshToken);

    // Set the cache
    await RedisClient.set(cacheTokenName, encryptedAccessToken, 'EX', expiryTime);

    // Update the refresh token in the database
    await db.getCollection('users').updateMany({ gitlabId }, { $set: { refreshToken: encryptedRefreshToken }});

    return accessToken;
  } finally {
    await releaseRefreshLock({ gitlabId, RedisClient });
  }
}

We can add in the lock utility to the getGitlabTokenWithCache function as seen above. Now, if multiple requests all come in at the same time, the first request can refresh the token and subsequent requests can receive the access token cached by the first request.
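To see the effect of the lock without needing Redis or GitLab, here is a small simulation under simplifying assumptions: the cache is a Map, the lock is a boolean flag polled every 10ms rather than with exponential backoff, and fakeRefresh stands in for refreshGitlabOAuthToken. None of these names come from a real library.

```javascript
// In-memory stand-ins for Redis and GitLab, so the sketch runs without any services
const cache = new Map();
let lockHeld = false;
let refreshCount = 0;

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Pretend refresh: takes 50ms and returns a different token each time it is called
const fakeRefresh = async () => {
  refreshCount += 1;
  await sleep(50);
  return `access-token-${refreshCount}`;
};

// Take the lock if it is free, otherwise poll until it is released.
// Returns whether the lock was received on the first attempt.
const acquireLock = async () => {
  let receivedLockImmediately = true;
  while (lockHeld) {
    receivedLockImmediately = false;
    await sleep(10);
  }
  lockHeld = true;
  return { receivedLockImmediately };
};

const releaseLock = () => {
  lockHeld = false;
};

const getToken = async (cacheKey) => {
  if (cache.has(cacheKey)) {
    return cache.get(cacheKey);
  }

  const { receivedLockImmediately } = await acquireLock();

  try {
    // A request that had to wait checks the cache again before refreshing
    if (!receivedLockImmediately && cache.has(cacheKey)) {
      return cache.get(cacheKey);
    }

    const token = await fakeRefresh();
    cache.set(cacheKey, token);
    return token;
  } finally {
    releaseLock();
  }
};

// Three requests arrive at the same time for the same GitLab user
const simulate = async () => {
  const tokens = await Promise.all([
    getToken('gl-cache-42'),
    getToken('gl-cache-42'),
    getToken('gl-cache-42'),
  ]);
  return { tokens, refreshCount };
};

simulate().then((result) => console.log(result));
```

Running this, all three calls should resolve to the same token and fakeRefresh should only be called once. Without the lock, each of the three calls would have attempted its own refresh.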

Error handling for invalid access tokens

Whilst the system we have implemented should handle things correctly the vast majority of the time, it’s useful for us to add in some handling just in case something goes wrong and we end up with a revoked access token in the cache. When you make an API request to GitLab using a token that has been revoked, it will return the following error message:

{"error":"invalid_token","error_description":"Token was revoked. You have to re-authorize from the user."}

Knowing this, we can wrap our API calls with a try catch block, and in the catch we can check whether the GitLab error code is invalid_token. If it is, we can try the request again using the ignoreCache flag to generate a fresh token.

import { getGitlabTokenWithCache } from './get-token-with-cache';
import { RedisClient, db } from './addons';

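export const gitlabAPIRequestWrapper = async ({ userId }, func) => {
  // getGitlabTokenWithCache resolves to the access token string itself
  const accessToken = await getGitlabTokenWithCache({ userId, RedisClient, db });

  // The try/catch has to live inside the returned function, so that errors
  // thrown when func actually runs are caught
  return async (args) => {
    try {
      return await func({ ...args, accessToken });
    } catch (e) {
      if (e?.message?.includes('invalid_token')) {
        // Skip the cache to force a refresh, then retry the request once
        const freshAccessToken = await getGitlabTokenWithCache({ userId, RedisClient, db, ignoreCache: true });

        return func({ ...args, accessToken: freshAccessToken });
      }

      // Anything other than a revoked token is passed on to the caller
      throw e;
    }
  };
};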
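For the wrapper to spot a revoked token, the function passed in as func needs to throw an error whose message contains GitLab's error response. Here is a hedged sketch of what such a function might look like, fetching the current user from the GitLab API. The name getCurrentGitlabUser and the fetchImpl parameter are made up for this example; fetchImpl only exists so the function can be tried out without a network call, and it falls back to the global fetch available in recent versions of Node.

```javascript
// Hypothetical API call suitable for passing to gitlabAPIRequestWrapper as func
const getCurrentGitlabUser = async ({ accessToken, fetchImpl = fetch }) => {
  const response = await fetchImpl('https://gitlab.example.com/api/v4/user', {
    // The access token is sent as a bearer token in the Authorization header
    headers: { Authorization: `Bearer ${accessToken}` },
  });

  if (!response.ok) {
    // Include the raw response body so callers can look for GitLab's error code, e.g. invalid_token
    const message = await response.text();
    throw new Error(`GitLab API request failed. Status ${response.status}. Message: ${message}`);
  }

  return response.json();
};
```

You could then wrap it with something like const getUser = await gitlabAPIRequestWrapper({ userId }, getCurrentGitlabUser) and call getUser({}) wherever you need the user's details.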

Implementing refresh token support for your GitLab OAuth application will help keep things safe and secure. The same handling can be used for other OAuth providers that support token rotation. However, each provider has its own differences, so you should look into their documentation before implementing this. For example, Bitbucket uses refresh tokens that don’t expire after use, meaning you don’t have to store the new refresh token in the database every time.

Thank you for reading and I hope you found this useful. If you want more information about version control providers and OAuth integration, you can check out my previous posts on GitHub, GitLab and Bitbucket. If you have any questions you can contact me at [first name] @northflank.com.
