Real-World Azure SQL DB: Unexpected Database Maximum Size Limit

Reviewed by: Kun Cheng, Sanjay Mishra, Denzil Ribeiro, Arvind Shyamsundar, Mike Weiner, and Murshed Zaman

The Problem: A Production Outage

A customer using Azure SQL Database recently brought an interesting problem to our attention. Unexpectedly, their production workload started failing with the following error message: “The database 'ProdDb' has reached its size quota. Partition or delete data, drop indexes, or consult the documentation for possible resolutions.” The database was in a Premium elastic pool, where the documented maximum size limit for each database is 500 GB. But when they checked the size of the database shown in the Azure Portal, it was only 10 GB, and the portal was showing that all available database space has been used. Naturally, they were wondering why the database was out of space even when they were not near the maximum database size limit for their premium elastic pool.

Explanation

One of the established capacity limits of each Azure SQL DB database is its size. The maximum size limit is determined by the service objective (a.k.a. performance tier, or service tier) of the database, as documented in resource limit documentation. To determine the size limit, or size quota, that is set for a particular database, the following statement can be used, in the context of the target database:

 SELECT DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes');

When a new database is created, by default its size quota is set to the maximum allowed for the service objective. However, it is possible to set the limit to a lower value, either when creating the database, or later. For example, the following statement limits the size of an existing database named DB1 to 1 GB:

 ALTER DATABASE DB1 MODIFY (MAXSIZE = 1 GB);

Customers can use this ability to allow scaling down to a lower service objective, when otherwise scaling down wouldn’t be possible because the database is too large.

While this capability is useful for some customers, the fact that the actual size quota for the database may be different from the maximum size quota for the selected service objective can be unexpected, particularly for customers who are used to working with the traditional SQL Server, where there is no explicit size quota at the database level. Exceeding the unexpectedly low database size quota will prevent new space allocations within the database, which can be a serious problem for many types of applications.

In this context, there is one particular scenario that we would like to call out. Specifically, when a database with a size quota explicitly lowered from the default is scaled up to a higher service objective, its size quota remains unchanged. For an administrator expecting the maximum size quota for the new service objective to be in effect after the scaling operation completes, this may be an unpleasant surprise.

Let’s walk through an example. First, let’s create an S2 database without specifying an explicit database size quota:

 CREATE DATABASE DB1 (SERVICE_OBJECTIVE = 'S2');

Once the database is created, we can query its current size quota, and see that it is set to the expected maximum for S2:

 SELECT DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes');
-- Result: 268435456000 == 256,000 MB

(There is a minor inconsistency here that you might have noticed: the default quota is actually 256 thousands of megabytes, not 256 GB.)

Now let’s lower the quota and query it again:

 ALTER DATABASE DB1 MODIFY (MAXSIZE = 10 GB);
SELECT DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes');
-- 10737418240 == 10 GB

We see that the quota has been lowered to 10 GB as expected. Now, let’s scale the database up to P1:

 ALTER DATABASE DB1 MODIFY (SERVICE_OBJECTIVE = 'P1');

Note that scaling operations are asynchronous, so the ALTER DATABASE command will complete quickly, while the actual change can take much longer. To determine if the scaling operation on the DB1 database has completed, query the sys.dm_operation_status DMV in the context of the master database.

 SELECT operation, state_desc, percent_complete, start_time, last_modify_time
FROM sys.dm_operation_status
WHERE resource_type_desc = 'Database'
      AND
      major_resource_id = 'DB1'
ORDER BY start_time;

/*
operation state_desc percent_complete start_time last_modify_time
--------------------------- ---------------- ----------------------- -----------------------
CREATE DATABASE COMPLETED 100 2016-09-02 15:11:28.243 2016-09-02 15:12:09.933
ALTER DATABASE COMPLETED 100 2016-09-02 15:16:49.807 2016-09-02 15:16:50.700
ALTER DATABASE COMPLETED 100 2016-09-02 15:23:26.623 2016-09-02 15:25:24.837
*/

This shows all recent operations for the DB1 database. We see that the last ALTER DATABASE command has completed. Now we can query the size quota again (in the context of the DB1 database):

 SELECT DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes');
-- 10737418240 == 10 GB

We see that even though the maximum size limit for a P1 database is 500 GB, the quota is still set to 10 GB.

Conclusion

It is important to know that in Azure SQL DB databases, an explicit database size quota always exists. This quota can be lower than the maximum (and default) quota for a given service objective. While for some customers this may be intentional, most would prefer the maximum quota to be in effect, particularly after scaling the database up.

We recommend that customers:

1. Proactively check the current size quota for your databases, to make sure it is set as expected. To do this, the following statement can be used in the context of the target database:

 SELECT DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes');

2. When scaling up to the service objective with a larger maximum size quota, explicitly change the quota to match the maximum by using the ALTER DATABASE ... MODIFY (MAXSIZE = ...) command as shown above (unless a lower quota is desired to guarantee being able to scale down in the future). The change is executed in an online manner.

This is what the customer we mentioned in the beginning of this article did in order to resolve their application outage, and to proactively prevent a reoccurrence of the same problem.