adrianhesketh.com

Rotating AWS RDS Secrets with AWS Secrets Manager

AWS Secrets Manager allows teams to securely store the secrets (connection strings, API keys etc.) that their applications need to function. This offers a tangible improvement to the security of your applications when compared to using unencrypted environment variables to store credentials. Environment variables are easy to read (e.g. by using the env command within the shell), and their values are often visible in logs or within Web consoles such as the Lambda console. This makes environment variables unsuitable for storing important secrets.

AWS Secrets Manager can be configured to “rotate” secrets, that is, to update secrets on a schedule. This is a great feature - engineers often see secrets at first creation, or seed systems with initial passwords (for example when setting up a database platform) but best practice is that engineers don’t then continue to have access to systems after initial setup. If the engineers know the password because they set it, then the lack of access is not guaranteed.

The use of automated password rotation allows engineers to be able to honestly state that they cannot access systems because they do not know the passwords. Access to the secrets held in Secrets Manager is via the usual IAM restrictions, and is logged by CloudTrail, leaving an audit trail of access to secrets.

Application developers need to update applications to use the AWS Secrets Manager API to access secrets. There is a small monetary cost associated with storage of the secrets ($0.40 per secret per month at time of writing), and a cost of the API calls used to access secrets ($0.05 per 10,000 API calls at time of writing).

Given that applications must make an API call to access a secret, there is also a latency cost. The combination of latency and cost makes it a good idea to cache the results of a call to Secrets Manager.
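To put rough numbers on the cost side of that argument, here is a small sketch using the prices quoted above. The traffic figure (100 requests per second, one Secrets Manager call per request) and the assumption that calls are billed pro rata are illustrative only, and monthlyCost is a hypothetical helper rather than part of any AWS SDK.

```go
package main

import "fmt"

// Prices quoted in this post at the time of writing; check current AWS pricing.
const (
	storagePerSecretPerMonth = 0.40 // USD per secret per month
	pricePer10kCalls         = 0.05 // USD per 10,000 API calls
)

// monthlyCost estimates the monthly bill in USD for a number of secrets
// and API calls, assuming that calls are billed pro rata.
func monthlyCost(secrets int, apiCalls int64) float64 {
	return float64(secrets)*storagePerSecretPerMonth + float64(apiCalls)/10000*pricePer10kCalls
}

func main() {
	const secondsPerMonth = 30 * 24 * 60 * 60
	uncached := int64(100 * secondsPerMonth) // one API call per request at 100 requests per second
	cached := int64(30)                      // one API call per day with a 24 hour cache
	fmt.Printf("uncached: $%.2f\n", monthlyCost(1, uncached))
	fmt.Printf("cached:   $%.2f\n", monthlyCost(1, cached))
}
```

Under those assumptions a single uncached secret costs roughly $1,296 a month in API calls, while a cached one costs little more than the $0.40 storage fee.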

However, the sample code presented by Secrets Manager doesn’t include examples of caching the secret in a way which also ensures that multiple threads trying to collect the secret at the same time make only a single API call between them, rather than each thread making its own API call.

This post shows how to add a few extra layers to the basic capability offered by Secrets Manager:

  1. Collecting the secret string from Secrets Manager.
  2. Caching the secret.
  3. Converting an RDS secret to a Go DSN (connection string) used to access MySQL.
  4. Handling when the RDS password has changed since the secret was last retrieved or cached.

Collecting the secret string from Secrets Manager.

First we can take the sample code offered by AWS Secrets Manager and simplify it slightly, to retrieve a string from Secrets Manager without hard coding the region or secret name.

The github.com/a-h/go-sql-driver-rds-credentials/store/sm package exports a single utility function called DefaultRetrieve which uses the AWS SDK to retrieve a string secret from Secrets Manager. Unlike the sample code, it works out the region from the secret’s ARN, which is passed into the function as the name.

// DefaultRetrieve retrieves data from AWS Secrets Manager.
func DefaultRetrieve(name string) (secret string, err error) {
	cfg := aws.NewConfig()
	if region, ok := getRegionFromARN(name); ok {
		cfg = cfg.WithRegion(region)
	}
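	// session.New is deprecated; NewSession reports configuration errors.
	sess, err := session.NewSession(cfg)
	if err != nil {
		return
	}
	svc := secretsmanager.New(sess)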
	input := &secretsmanager.GetSecretValueInput{
		SecretId:     aws.String(name),
		VersionStage: aws.String("AWSCURRENT"),
	}
	var result *secretsmanager.GetSecretValueOutput
	result, err = svc.GetSecretValue(input)
	if err != nil {
		return
	}
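	// SecretString is nil for binary secrets; StringValue returns "" instead of panicking.
	secret = aws.StringValue(result.SecretString)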
	return
}

func getRegionFromARN(arn string) (region string, ok bool) {
	// arn:partition:service:region:account-id:resource
	split := strings.Split(arn, ":")
	if len(split) < 4 {
		return
	}
	region = split[3]
	ok = true
	return
}
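Note that getRegionFromARN reports ok for any string with at least four colon-separated parts, even when the region part is empty. If that matters in your setup, a stricter check is one option. The sketch below is a hypothetical variant (regionFromARN is not part of the package) that only accepts input which starts with "arn" and has a non-empty region.

```go
package main

import (
	"fmt"
	"strings"
)

// regionFromARN is a stricter, hypothetical variant of getRegionFromARN.
// It only reports ok when the input looks like an ARN and names a region.
func regionFromARN(arn string) (region string, ok bool) {
	// arn:partition:service:region:account-id:resource
	parts := strings.SplitN(arn, ":", 6)
	if len(parts) < 6 || parts[0] != "arn" || parts[3] == "" {
		return "", false
	}
	return parts[3], true
}

func main() {
	inputs := []string{
		"arn:aws:secretsmanager:eu-west-2:123456789012:secret:example-AbCdEf",
		"my-friendly-secret-name",
		"arn:aws:iam::123456789012:role/example", // ARN without a region
	}
	for _, in := range inputs {
		region, ok := regionFromARN(in)
		fmt.Printf("%q -> region=%q ok=%v\n", in, region, ok)
	}
}
```

When ok is false, the caller can leave the region unset so that the SDK falls back to its usual region configuration, as DefaultRetrieve already does.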

Caching the secret.

Building on top of the secret retrieval function, we can create a type which caches calls to Secrets Manager. We could just use the sample code provided by AWS to load the secret at application startup, but once the secret was rotated the application would be left holding a stale value, with no way to recover other than a restart to force it to load the secret again.

This type allows for the case that the secret has been updated / rotated since the secret was retrieved from Secrets Manager by offering a Get(force bool) function which forces the secret to be repopulated. We can then build another layer on top of this mechanism which lets us use secrets with the Go MySQL database driver.

First, we must define the type and the fields it must contain: Name, the name of the secret (an ARN); CacheFor, the duration that the secret should be cached for; LastRefreshed, the time the secret was last retrieved from Secrets Manager; and Value, the string value of the secret.

The type also contains a number of private fields. m is a lock used to prevent multiple goroutines (similar to threads) from attempting to retrieve the secret at the same time. retrieve is the function used to retrieve secrets. It defaults to AWS Secrets Manager, while mock implementations are substituted in tests to check behaviour under error conditions. Finally, callsMade keeps track of how many API calls to Secrets Manager have been made.

// Secret store, backed by AWS Secrets Manager.
type Secret struct {
	Name          string
	CacheFor      time.Duration
	LastRefreshed time.Time
	m             *sync.Mutex
	retrieve      func(name string) (secret string, err error)
	Value         string
	callsMade     int
}

The Secret type’s Get method locks the mutex and releases it when the function exits using the defer feature of Go. The method then retrieves the secret from the underlying Secrets Manager if the cache has expired, or the force parameter is set to true.

// Get the secret, optionally forcing a refresh.
func (s *Secret) Get(force bool) (secret string, err error) {
	s.m.Lock()
	defer s.m.Unlock()
	if force || time.Now().UTC().After(s.LastRefreshed.Add(s.CacheFor)) {
		secret, err = s.retrieve(s.Name)
		if err != nil {
			return
		}
		s.callsMade++
		s.Value = secret
		s.LastRefreshed = time.Now().UTC()
	}
	return s.Value, nil
}

// CallsMade to the underlying secret API.
func (s *Secret) CallsMade() int {
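	// Take the lock so that the counter isn't read while Get is updating it.
	s.m.Lock()
	defer s.m.Unlock()
	return s.callsMade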
}
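To see the effect of the mutex, the following self-contained sketch starts many goroutines that all ask for the secret at once and counts how often the backing function runs. The cachedSecret type is a cut-down stand-in for the Secret type above, and the retrieve functions are fakes with simulated latency and failure, so none of these names come from the package itself.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// cachedSecret is a cut-down stand-in for the Secret type, for illustration only.
type cachedSecret struct {
	m             sync.Mutex
	cacheFor      time.Duration
	lastRefreshed time.Time
	value         string
	callsMade     int
	retrieve      func() (string, error)
}

// Get returns the cached value, refreshing it when forced or expired.
func (s *cachedSecret) Get(force bool) (string, error) {
	s.m.Lock()
	defer s.m.Unlock()
	if force || time.Now().After(s.lastRefreshed.Add(s.cacheFor)) {
		v, err := s.retrieve()
		if err != nil {
			return "", err
		}
		s.callsMade++
		s.value = v
		s.lastRefreshed = time.Now()
	}
	return s.value, nil
}

// concurrentCalls starts n goroutines that all call Get(false) and reports
// how many times the fake retrieve function was actually invoked.
func concurrentCalls(n int) int {
	s := &cachedSecret{
		cacheFor: time.Hour,
		retrieve: func() (string, error) {
			time.Sleep(10 * time.Millisecond) // simulate API latency
			return "s3cret", nil
		},
	}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Get(false)
		}()
	}
	wg.Wait()
	return s.callsMade
}

// failingGet shows the error path: a failed retrieval returns the error
// and leaves the cache empty.
func failingGet() (string, error) {
	s := &cachedSecret{
		cacheFor: time.Hour,
		retrieve: func() (string, error) {
			return "", errors.New("simulated Secrets Manager failure")
		},
	}
	return s.Get(false)
}

func main() {
	fmt.Println("retrievals for 50 goroutines:", concurrentCalls(50))
	_, err := failingGet()
	fmt.Println("error from failing store:", err)
}
```

The first goroutine to take the lock performs the retrieval. The others block until it finishes and then find a fresh cache, so only one call reaches the backing function.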

In line with Go’s “batteries included” style, the type has sensible defaults defined, meaning that you only have to pass in the ARN of the secret to be able to create a thread-safe secret retrieval cache.

// New creates a new store.
func New(name string) *Secret {
	return &Secret{
		Name:          name,
		CacheFor:      defaultCacheDuration,
		LastRefreshed: time.Time{},
		m:             &sync.Mutex{},
		retrieve:      sm.DefaultRetrieve,
	}
}

const defaultCacheDuration = time.Hour * 24

With this in place, we’ve got a simple way to cache and retrieve a secret.

s := store.New("your_secret_arn")
secret, err := s.Get(false)

However, an RDS secret is actually a JSON document, so to support RDS, we’ll need to do more.

Converting an RDS secret to a Go DSN (connection string) used to access MySQL.

An RDS secret is a JSON document which contains details about the database engine, host, port and other details along with the password:

{
  "username": "user",
  "password": "pwd",
  "engine": "mysql",
  "host": "host_name",
  "port": 3306,
  "dbClusterIdentifier": "dbcid"
}

However, a Go MySQL connection string looks quite different:

user:password@tcp(host:port)/dbname?parseTime=true&multiStatements=true&collation=utf8mb4_unicode_ci

It’s then clear that we need some code to map between the two formats. Let’s start with a type which adds extra functionality to the Secret type we just created to carry out this mapping. The Secret is held in the child field of the struct. We also need the config field in order to store additional MySQL configuration properties which aren’t part of the RDS Secret JSON document, but are important to the operation of the system - for example, the collation setting which allows for the use of utf8mb4_unicode_ci (supports extended Unicode characters such as emojis and characters commonly used in Polish names etc.).

As with the other type, a mutex prevents multiple goroutines from modifying the data at the same time. The dsn field stores the connection string. The previous field is an optimisation: it holds the last JSON document that was decoded, so that an unchanged secret doesn’t need to be decoded again.

// RDS store, backed by AWS Secrets Manager.
type RDS struct {
	child    *Secret
	config   *mysql.Config
	previous string
	m        *sync.Mutex
	dsn      string
}

To support the use of MySQL RDS, our code will need to convert the JSON document into a MySQL connection string. In Go, the simplest way to do this is to create a type and unmarshal into it. First, let’s create the rdsSecret type to allow us to convert the JSON document.

type rdsSecret struct {
	Username            string `json:"username"`
	Password            string `json:"password"`
	Engine              string `json:"engine"`
	Host                string `json:"host"`
	Port                int    `json:"port"`
	DbClusterIdentifier string `json:"dbClusterIdentifier"`
}
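As a quick illustration of the mapping, the sketch below unmarshals the example document into that type and builds a DSN by hand using only the standard library. dsnFromSecret is a hypothetical helper for this example: it ignores extra parameters and does no escaping, which is why building the DSN through the driver's own configuration type is the more robust route.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rdsSecret mirrors the fields of the RDS secret JSON document.
type rdsSecret struct {
	Username            string `json:"username"`
	Password            string `json:"password"`
	Engine              string `json:"engine"`
	Host                string `json:"host"`
	Port                int    `json:"port"`
	DbClusterIdentifier string `json:"dbClusterIdentifier"`
}

// dsnFromSecret decodes the secret JSON and formats a bare MySQL DSN.
// It is a simplified, hypothetical helper: no parameters and no escaping.
func dsnFromSecret(secretJSON, dbName string) (string, error) {
	var r rdsSecret
	if err := json.Unmarshal([]byte(secretJSON), &r); err != nil {
		return "", err
	}
	return fmt.Sprintf("%s:%s@tcp(%s:%d)/%s", r.Username, r.Password, r.Host, r.Port, dbName), nil
}

func main() {
	const doc = `{
  "username": "user",
  "password": "pwd",
  "engine": "mysql",
  "host": "host_name",
  "port": 3306,
  "dbClusterIdentifier": "dbcid"
}`
	dsn, err := dsnFromSecret(doc, "databaseName")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dsn)
}
```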

Next, we can implement the secret retrieval method of the RDS secret. This method retrieves the Secret from the cache and, if it has changed, creates a new MySQL DSN by combining the secret details and the MySQL configuration.

// Get the secret, optionally forcing a refresh.
func (s *RDS) Get(force bool) (secret string, err error) {
	// Hold the lock for the whole method so that previous and dsn are
	// never read while another goroutine is updating them.
	s.m.Lock()
	defer s.m.Unlock()
	j, err := s.child.Get(force)
	if err != nil {
		return
	}
	if j == s.previous {
		// Don't bother unmarshalling from JSON if nothing has changed.
		return s.dsn, nil
	}
	var r rdsSecret
	err = json.Unmarshal([]byte(j), &r)
	if err != nil {
		return
	}
	s.previous = j
	// It's changed, so update the cached dsn.
	s.config.User = r.Username
	s.config.Passwd = r.Password
	s.config.Net = "tcp"
	s.config.Addr = r.Host + ":" + strconv.Itoa(r.Port)
	s.dsn = s.config.FormatDSN()
	return s.dsn, nil
}

// CallsMade to the underlying secret API.
func (s *RDS) CallsMade() int {
	return s.child.CallsMade()
}

Finally, a NewRDS function allows the RDS type to be created with all required fields populated.

// NewRDS creates a new RDS store from the name of the secret, the database name
// and any extra DSN parameters. The DSN it produces has the form:
// user:password@tcp(host:port)/dbname?parseTime=true&multiStatements=true&collation=utf8mb4_unicode_ci
func NewRDS(name, dbName string, params map[string]string) *RDS {
	conf := mysql.NewConfig()
	conf.DBName = dbName
	conf.Params = params
	return &RDS{
		child:  New(name),
		config: conf,
		m:      &sync.Mutex{},
	}
}

Now, we can retrieve RDS connection strings with a couple of lines of code:

s := store.NewRDS("your_secret_arn", "databaseName", map[string]string{
	"parseTime":       "true",
	"multiStatements": "true",
	"collation":       "utf8mb4_unicode_ci",
})
dsn, err := s.Get(false)

Handling when the RDS password has changed since the secret was last retrieved or cached.

There is a cost associated with initiating a database connection, as the client initiates a TCP connection and authenticates with the server. To reduce the impact of this cost, most programming languages implement connection pooling where a number of connections are maintained with the database server, and are reused to reduce the setup and tear down cost of each connection.

With connection pooling, when a request is made to “Open” a connection, if the connection pool is empty or all current connections are being used (and the pool is not full), a new connection will be made to the database server. If all of the connections are being used and the connection pool is full, then the request to Open a new connection will wait until a connection becomes available.

This makes it hard to predict when a connection will actually be made to the database. The sql.Open function in Go accepts a dsn (connection string), but there’s usually no way to determine when it will actually be used to make a database connection.

If we open the database handle at the start of the program, as per best practice with Go (http://go-database-sql.org/accessing.html), the database password may have been rotated by the time Go actually needs to make a connection. Connections which have already been made aren’t a concern because the authentication has already taken place, but new connections would fail if the database user’s password has changed.

This is an operations nightmare - an occasional database failure which may only appear when the website is under load, and which goes away when the application is restarted. So we need to work out a way to make sure that Go always uses the correct database password when it’s opening new connections.

The answer to this problem is to use sql.OpenDB which takes an implementation of the driver.Connector interface. This interface includes a Connect(context.Context) (Conn, error) method which actually creates database connections.
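To illustrate why this helps, the sketch below uses a fake connector to show that sql.OpenDB does not connect immediately, and that Connect is only called once the pool needs a connection. fakeConn, fakeDriver and countingConnector are hypothetical types written for this example using only the standard library, so no real database is involved.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync/atomic"
)

// fakeConn is a do-nothing connection, just enough to satisfy driver.Conn.
type fakeConn struct{}

func (fakeConn) Prepare(query string) (driver.Stmt, error) {
	return nil, errors.New("not implemented")
}
func (fakeConn) Close() error              { return nil }
func (fakeConn) Begin() (driver.Tx, error) { return nil, errors.New("not implemented") }

// fakeDriver exists only so that the connector can return a driver.Driver.
type fakeDriver struct{}

func (fakeDriver) Open(name string) (driver.Conn, error) { return fakeConn{}, nil }

// countingConnector counts how many times the pool asks for a new connection.
type countingConnector struct{ connects int64 }

func (c *countingConnector) Connect(ctx context.Context) (driver.Conn, error) {
	atomic.AddInt64(&c.connects, 1)
	return fakeConn{}, nil
}

func (c *countingConnector) Driver() driver.Driver { return fakeDriver{} }

// connectsAfterOpenAndPing reports the connection count straight after
// sql.OpenDB, and again after the first Ping.
func connectsAfterOpenAndPing() (afterOpen, afterPing int64, err error) {
	c := &countingConnector{}
	db := sql.OpenDB(c)
	defer db.Close()
	afterOpen = atomic.LoadInt64(&c.connects)
	if err = db.Ping(); err != nil {
		return
	}
	afterPing = atomic.LoadInt64(&c.connects)
	return
}

func main() {
	afterOpen, afterPing, err := connectsAfterOpenAndPing()
	fmt.Println("connections after OpenDB:", afterOpen)
	fmt.Println("connections after Ping:  ", afterPing, err)
}
```

Because the pool calls Connect each time it needs a new connection, the connector is a natural place to look up the current credentials.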

So, let’s create a Secrets Manager-aware implementation of the driver.Connector interface. It can use the RDS-aware Secret Store we created in the previous section to retrieve secrets.

// Connector to MySQL.
type Connector struct {
	store *store.RDS
	m     *sync.Mutex
}

To implement the driver.Connector interface, we need to implement the Connect method to provide an open connection. Given the risk of a database credential becoming stale as it is rotated, this code first tries the cached credentials. If the connection fails with Error 1045 (Message: Access denied for user %s@%s (using password: %s)), it forces a refresh of the credentials and tries once more, returning the error if the second attempt also fails.

// Connect implements driver.Connector interface.
// Connect returns a connection to the database.
func (c *Connector) Connect(ctx context.Context) (conn driver.Conn, err error) {
	c.m.Lock()
	defer c.m.Unlock()
	creds, err := c.store.Get(false)
	if err != nil {
		return
	}
	conn, err = c.Driver().Open(creds)
	if err != nil && strings.Contains(err.Error(), "Error 1045") {
		creds, err = c.store.Get(true)
		if err != nil {
			return
		}
		conn, err = c.Driver().Open(creds)
	}
	return
}
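The retry logic can be exercised without a database. The sketch below is a standalone illustration under stated assumptions: serverError is a hypothetical stand-in for the driver's error type, and openWithRefresh takes fake get and open functions in place of the store and the driver. It matches on the error number with errors.As rather than on the error text. The real MySQL driver exposes a MySQLError type with a Number field that could be matched in a similar way, but check that against the driver version you use.

```go
package main

import (
	"errors"
	"fmt"
)

// serverError is a hypothetical stand-in for the driver's error type.
type serverError struct {
	Number  uint16
	Message string
}

func (e *serverError) Error() string {
	return fmt.Sprintf("Error %d: %s", e.Number, e.Message)
}

const accessDenied = 1045

// openWithRefresh tries the cached credentials first. On an access denied
// error it forces one refresh and retries once, returning any other error as-is.
func openWithRefresh(get func(force bool) (string, error), open func(dsn string) error) error {
	dsn, err := get(false)
	if err != nil {
		return err
	}
	err = open(dsn)
	var se *serverError
	if errors.As(err, &se) && se.Number == accessDenied {
		if dsn, err = get(true); err != nil {
			return err
		}
		err = open(dsn)
	}
	return err
}

// simulateRotation models a cache holding a stale password and reports
// how many open attempts were needed.
func simulateRotation() (attempts int, err error) {
	current := "old-password" // cached value, stale after rotation
	get := func(force bool) (string, error) {
		if force {
			current = "new-password"
		}
		return current, nil
	}
	open := func(dsn string) error {
		attempts++
		if dsn != "new-password" {
			return &serverError{Number: accessDenied, Message: "Access denied for user"}
		}
		return nil
	}
	err = openWithRefresh(get, open)
	return
}

func main() {
	attempts, err := simulateRotation()
	fmt.Println("attempts:", attempts, "error:", err)
}
```

Retrying only once keeps a genuinely wrong password from causing an endless loop of calls to Secrets Manager.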

The driver.Connector interface also includes a Driver method which returns the type of driver used for this connection.

// Driver implements driver.Connector interface.
// Driver returns mysql.MySQLDriver{}.
func (c *Connector) Driver() driver.Driver {
	return mysql.MySQLDriver{}
}

Just like the other layers of this solution, a convenience function brings it all together.

// New connector.
func New(st *store.RDS) *Connector {
	return &Connector{
		store: st,
		m:     &sync.Mutex{},
	}
}

Bringing it all together.

With the building blocks in place, we’re now able to make database connections using AWS Secrets Manager and support background secrets rotation.

package main

import (
	"database/sql"
	"fmt"
	"os"

	"github.com/a-h/go-sql-driver-rds-credentials/connector"
	"github.com/a-h/go-sql-driver-rds-credentials/store"
)

func main() {
	s := store.NewRDS("your_secret_arn", "databaseName", map[string]string{
		"parseTime":       "true",
		"multiStatements": "true",
		"collation":       "utf8mb4_unicode_ci",
	})
	c := connector.New(s)
	db := sql.OpenDB(c)
	err := db.Ping()
	if err != nil {
		fmt.Println("error:", err)
		os.Exit(1)
	}
	fmt.Println("OK")
}

Summary

AWS Secrets Manager is an effective building block which allows secrets to be automatically rotated, but supporting caching and stale credentials can require extra work which is specific to your platform and use case.

If you’re using Aurora MySQL and the Go Programming Language, you can use the ready-made https://github.com/a-h/go-sql-driver-rds-credentials package with AWS Secrets Manager to support background rotation of secrets.