Using UUIDs with Gorm and Postgres

In this post I would like to document my adventures of trying to set up a small Go project using

  • a Postgres database
  • with UUID primary keys for my models,
  • gorm as an ORM, and
  • goose as a migration tool.

Disclaimer: I don’t have vast experience in this topic, so take these tips with a pinch of salt.

# Contents

# UUIDs

UUID stands for universally unique identifier (the term GUID refers to the same thing, typically used in the Microsoft world). I wanted to use UUID for multiple reasons.

# Advantages of the UUIDs

  • They are unique, which is really handy when it comes to an identifier… They are unique across all tables, all your databases, all your systems. They are practically unique in the universe - theoretically, there could be a collision, however, to get a grasp of how unlikely that is, have a look at this great post: Are UUIDs really unique?. But in short:

    “A sample of 3.26 x 10^16 UUIDs has a 99.99% chance of not having any duplicates.”

  • They are marginally more secure (not that they should ever be exposed publicly, but still, they practically cannot be guessed.)
  • You can create them in the application code (no need to reach out to the DB) if you want to.

# Disatvantages of UUIDs

  • Storage: UUIDs are 128-bit values as opposed to either 32 or 62 bit integers. I can imagine this being a significant drawback in some cases.
  • Performance: on the same note as above, these are larger values which can adversly affect performance
  • Less human readable, thus less convenient to use them during debugging
  • No natural ordering, again not great for debugging and for certain types of pagination techniques (this one on the wordpress blog cannot be used, for instance).

# Postgres UUIDs

In Postgres, you can generate UUIDs with the uuid_generate_v4() function.

As an interesting aside, the Postgres documentation mentions that the first version of this function used the MAC address of the host machine, plus a timestamp, which meant it could reveal the identity of the machine and the time the particular record was generated, which is not great from a security perspective.

Another interesting thing is that if you want reproducible UUIDs without any randomness in the generation process, you can use the v3 of this function with a specified “namespace” (any UUID) and a name. The UUID generated for the same namespace with the same name will always be the same:

postgres=# SELECT
postgres-# uuid_generate_v3('488a05ae-c002-3e52-97a6-a987a1380cba', 'name');
           uuid_generate_v3           
--------------------------------------
 29f04949-f50b-37a9-a751-f8f27dafb950
(1 row)


postgres=# SELECT
postgres-# uuid_generate_v3('488a05ae-c002-3e52-97a6-a987a1380cba', 'name');
           uuid_generate_v3           
--------------------------------------
 29f04949-f50b-37a9-a751-f8f27dafb950
(1 row)

So it is best to generate them with the 4th generation family of these functions. Only the 4th generation is truly random (I assume “truly pseudo-random”, unless they solved the random number generation problem since I’ve last checked).

# GORM

Gorm is an ORM library for Go, as in, it handles the object-relational mapping problem of converting rows in your SQL database to objects in your object-oriented application code. I’m not very experienced with Gorm, but so far it definitely seems like an awesome tool which greatly facilitates development.

One could say that you might not need any ORM for your Go project, for instance, if your app has very simple models and relationships, or if performance and throughput are critical for your service. In these cases, using a raw database driver and the sql package might be best (http://go-database-sql.org/index.html).

To define an ORM model, you can use the gorm.Model struct embedded in your struct:

type User struct {
  gorm.Model
  Username string
  PwHash   string
}

The gorm.Model part in the struct definition add ID, CreatedAt, UpdatedAt and DeletedAt fields to your model (have a look at the source here). However, if you want to use UUIDs, you have to specify the type of the column ID for gorm, by manually typing it into your struct definition. This means you have to remove the gorm.Model embedding line, as that already contains the ID part. This has bitten me and I have a good hour or two debugging my application before I realised this. In this case, the User struct might look like this with the customisation of the ID field:

import (
  "time"

  "github.com/google/uuid"
  "github.com/jinzhu/gorm"
)

type User struct {
  ID       uuid.UUID `gorm:"primary_key; unique; 
                      type:uuid; column:id; 
                      default:uuid_generate_v4()"`
  Username string
  PwHash   string
  CreatedAt time.Time
  UpdatedAt time.Time
  DeletedAt *time.Time
}

In your docker-compose, you can define a Postgres database like this:

version: '3'

services:
    db: &db
      image: frodenas/postgresql
      ports:
        - "5432:5432"
      environment:
        POSTGRES_USERNAME: username
        POSTGRES_PASSWORD: password
        POSTGRES_DBNAME: postgres
        POSTGRES_EXTENSIONS: uuid-ossp

      container_name: db

Note that I am using a Postgres Docker image from frodenas on Dockerhub, because it’s trivial to install an extension with it - just specify it as an environment variable called POSTGRES_EXTENSIONS in the compose file, and it will be installed when you first build the image.

However, you could use the latest official Postgres image at dockerhub.com/postgres as well, and create a tiny SQL script with the contents

create extension if not exists "uuid-ossp";

in a file in the /docker-entrypoint-initdb.d directory to achieve the same results.

# Migrating the database with Goose

Creating the users table with the appropriate schema can be done manually inside the docker container. However, for developer sanity, you might want to use a migration tool to automate this. There is a nifty tool called goose for that - it generates the skeleton sql files that you use for up- and down migrations, and it runs them according to your needs.

To create an empty migration in the migrations directory, you can run the following:

$ goose -dir migrations postgres "user=username password=password dbname=postgres sslmode=disable" create add_users_table sql
2019/04/07 20:02:00 Created new file: 20190407200200_add_users.go

To check the status of your DB, run goose status:

$ goose -dir migrations postgres "user=username password=password dbname=postgres sslmode=disable" status
2019/04/07 20:08:46     Applied At                  Migration
2019/04/07 20:08:46     =======================================
2019/04/07 20:08:46     Pending                  -- 20190407200701_add_users.sql

In the newly created file, you can define your table schema:

-- +goose Up
-- SQL in this section is executed when the migration is applied.
CREATE TABLE IF NOT EXISTS users(
  id uuid PRIMARY KEY NOT NULL UNIQUE,
  username text,
  pw_hash text,
  created_at timestamp,
  updated_at timestamp,
  deleted_at timestamp
);

-- +goose Down
-- SQL in this section is executed when the migration is rolled back.
DROP TABLE IF EXISTS users;

Finally, to actually run the migrations, run goose up:

$ goose -dir migrations/ postgres "user=calendar password=calendar dbname=postgres sslmode=disable" up
2019/04/07 20:11:45 OK    20190407200701_add_users.sql
2019/04/07 20:11:45 goose: no migrations to run. current version: 20190407200701

And if you jump on to the docker container running Postgres, you should see the tables that goose created:

$ docker exec -it $(docker ps | grep postgres | awk '{print $1}')  psql postgres username
psql (9.6.2)
Type "help" for help.

postgres=# \d
                   List of relations
 Schema |          Name           |   Type   |  Owner   
--------+-------------------------+----------+----------
 public | goose_db_version        | table    | username
 public | goose_db_version_id_seq | sequence | username
 public | users                   | table    | username

The goose table is used by the migration tool to determine the current state of the DB. Its schema looks like this:

postgres=# \d goose_db_version
 id         | integer                     | not null default nextval('goose_db_version_id_seq'::regclass)
 version_id | bigint                      | not null
 is_applied | boolean                     | not null
 tstamp     | timestamp without time zone | default now()

I think that’s it, kind folks! This wasn’t too much fun perhaps, but maybe it helps some random people to set up UUIDs with Gorm + Goose + Postgres.

# Sources

Written on April 7, 2019

If you notice anything wrong with this post (factual error, rude tone, bad grammar, typo, etc.), and you feel like giving feedback, please do so by contacting me at samubalogh@gmail.com. Thank you!