API standards

Abstract

The TDA standards surrounding API design and implementation are based heavily on similar standards published by the UK Government Digital Services. They are intended for those designing, implementing or integrating APIs.

For guidance which is appropriate for those planning, purchasing or co-ordinating services within UIS, the Technical Design Authority have a separate API guidance page.

This document was last updated in May 2022 and may be cited as version a630d6b.

Introduction

This page provides specific technical guidance for those implementing APIs within UIS or those providing technical evaluation of a third-party product. It presupposes some familiarity with technical terms related to APIs and the wider web ecosystem. Where possible, technical terms are linked to a page providing more information the first time they are mentioned.

The Technical Design Authority (TDA) mandates that new products being developed or deployed by UIS implement these standards or have a clearly demonstrable path on how they will be implemented as part of the transition from beta to live service.

Information in this page is based heavily on the UK Government Digital Services API standards which is re-used under the terms of the Open Government Licence v3.0. The Technical Design Authority release this document under like terms.

Publication and access

Publish your APIs over the internet by default. Contact the TDA if you feel that your API should not be published over public networks.

Info

Since the COVID-19 pandemic, it has become increasingly important that services are not needlessly corralled inside networks inaccessible to those working from home. Similarly, having an API be inaccessible from the public Internet may needlessly limit your customers' options for hosting their client applications.

Use RESTful

Follow the industry standard and where appropriate build APIs that are RESTful, which use HTTP verb requests to manipulate data.

When handling requests, you should use HTTP verbs for their specified purpose.

One of the advantages of REST is that it gives you a framework for communicating error states.

In some cases, it may not be applicable to build a REST API, for example, when you are building an API to stream data.
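As a minimal sketch of using HTTP verbs for their specified purpose and of reporting error states through status codes, the handler below serves a single resource type. The `USERS` store, the `handle` function and the error payload shape are hypothetical and are not part of any UIS service.

```python
from http import HTTPStatus

# Hypothetical in-memory store standing in for a real back end.
USERS = {"1": {"name": "Bob Person"}}


def handle(method, user_id, body=None):
    """Return (status, payload) for a request against /users/<user_id>."""
    if method == "GET":  # read without side effects
        if user_id not in USERS:
            return HTTPStatus.NOT_FOUND, {"error": "user not found"}
        return HTTPStatus.OK, USERS[user_id]
    if method == "PUT":  # create or replace the whole resource
        status = HTTPStatus.OK if user_id in USERS else HTTPStatus.CREATED
        USERS[user_id] = body
        return status, body
    if method == "DELETE":  # remove the resource
        if USERS.pop(user_id, None) is None:
            return HTTPStatus.NOT_FOUND, {"error": "user not found"}
        return HTTPStatus.NO_CONTENT, None
    # Any other verb is not supported by this sketch.
    return HTTPStatus.METHOD_NOT_ALLOWED, {"error": "method not allowed"}
```

A real service would normally rely on a web framework for routing; the point here is only that the verb selects the action and the status code carries the outcome.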

Use HTTPS

You should use HTTPS when creating APIs.

Adding HTTPS will secure connections to your API, preserve user privacy, ensure data integrity, and authenticate the server providing the API. It is not necessa

Secure APIs using Transport Layer Security (TLS) v1.2 or greater. Do not use Secure Sockets Layer (SSL) or TLS v1.0. UIS provide a web application to request TLS certificates; alternatively, services such as Let's Encrypt may be used.
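If your service terminates TLS in Python, the minimum protocol version can be pinned with the standard library `ssl` module. This is a sketch only: certificate loading is omitted, and many deployments terminate TLS at a load balancer or gateway instead.

```python
import ssl

# Server-side context; loading the certificate and key is omitted here.
context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
# Refuse handshakes that use SSL or any TLS version older than 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2
```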

Tip

Mozilla, the stewards of the Firefox web browser, maintain a page of recommended cipher suites to use with TLS.

Make sure potential API users can establish trust in your certificates. Make sure you have a robust process for timely certificate renewal and revocation.

If possible, automate certificate provision and renewal.

Your API may warrant linking your data together. You can make your API more programmatically accessible by returning URIs, and by using existing standards and specifications.

Use Uniform Resource Identifiers (URIs) to identify certain data:

{
    "name": "Bob Person",
    "company": "https://your.api.example.com/company/bobscompany"
}

When your API returns data in response to an HTTP call, you should use URIs in the payload to identify certain data. Where appropriate, you should use specifications that use hypermedia, including CURIEs, JSON-LD or HAL. This makes it easier to find those resources.
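A minimal sketch of building such a payload follows. The base URL is taken from the example above; the `company_uri` and `person_payload` helpers are hypothetical names used for illustration.

```python
import json
from urllib.parse import quote

# Base URL as used in the example payload; replace with your own.
BASE_URL = "https://your.api.example.com"


def company_uri(company_id):
    """Build the URI identifying a company resource."""
    return f"{BASE_URL}/company/{quote(company_id, safe='')}"


def person_payload(name, company_id):
    """Reference the company by URI instead of a bare identifier."""
    return {"name": name, "company": company_uri(company_id)}


print(json.dumps(person_payload("Bob Person", "bobscompany"), indent=4))
```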

Use JSON

Your first choice for all web APIs should be JSON where possible.

Only use another representation to build something in exceptional cases, like when you:

  • need to connect to a legacy system, for example, one that only uses XML.
  • will receive clear advantages from complying with a broadly adopted standard. For example, one protocol offered by Raven is SAML which is a widely-used identity API.

We recommend you should:

  • Create responses as a JSON object and not an array although JSON objects within the response can contain JSON arrays. Arrays can limit the ability to include metadata about results and limit the API’s ability to add additional top-level keys in the future.
  • Document your JSON object to ensure it is well described, and so that it is not treated as a sequential array.
  • Avoid unpredictable object keys such as those derived from data as this adds friction for clients.
  • Use a single grammar case for object keys. Choose under_score or camelCase and be consistent. If you have no preference, use camelCase.
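The recommendations above might be sketched as follows: the list of results is wrapped in a top-level object so that metadata can sit alongside it. The key names `results`, `totalResults` and `page` are assumptions chosen for illustration, not mandated names.

```python
import json


def envelope(items, total, page=1):
    """Wrap a list of results in a JSON object with room for metadata."""
    return {
        "results": list(items),  # the array lives inside the object
        "totalResults": total,   # camelCase keys throughout
        "page": page,
    }


body = json.dumps(envelope([{"name": "Bob Person"}], total=1))
```

Because the top level is an object, further keys can be added later without breaking clients that ignore keys they do not recognise.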

Use ISO 8601

The TDA mandates using the ISO 8601 standard to represent date and time in your payload response. This helps people read the time correctly.

Use a consistent date format. For dates, this looks like 2017-08-09. For dates and times, use the form 2017-08-09T13:58:07Z.

Use the UTC or "zulu" time zone for all dates and times. Note that UTC includes leap seconds.
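A minimal sketch of producing this format with the Python standard library is shown below; `to_iso_utc` is a hypothetical helper name. It converts a timezone-aware value to UTC before formatting and rejects naive values rather than guessing their zone.

```python
from datetime import date, datetime, timedelta, timezone


def to_iso_utc(moment):
    """Format a timezone-aware datetime as ISO 8601 in UTC."""
    if moment.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a time zone")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# 14:58:07 at UTC+1 is 13:58:07 in UTC.
local = datetime(2017, 8, 9, 14, 58, 7, tzinfo=timezone(timedelta(hours=1)))
print(to_iso_utc(local))
# Plain dates already serialise in the required form.
print(date(2017, 8, 9).isoformat())
```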

Use seconds for durations

Durations should be specified as seconds, including decimals if sub-second precision is required.

Be wary of ISO 8601 durations as they do not always map into fixed length durations. For example, P3Y6M4DT12H30M5S represents a duration of "three years, six months, four days, twelve hours, thirty minutes, and five seconds". Due to the calendar, leap seconds and leap years, the duration of a "year", "month", "day", "hour" and "minute" can vary.
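In Python, a fixed-length duration can be expressed in seconds with `timedelta.total_seconds()`. The `duration` key below is an illustrative name only.

```python
from datetime import timedelta


def duration_seconds(delta):
    """Express a timedelta in seconds, keeping sub-second precision."""
    return delta.total_seconds()


payload = {"duration": duration_seconds(timedelta(minutes=1, milliseconds=500))}
```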

Prefer WGS 84 to represent location

Prefer the World Geodetic System 1984 (WGS 84) standard to represent geographic location. You can also use other geographic coordinate systems if appropriate but ensure that the corresponding EPSG code is part of your documentation or, preferably, is included with the response payload.

You should use GeoJSON for the exchange of location information.
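A minimal sketch of a GeoJSON Point follows. Note that GeoJSON orders coordinates as longitude then latitude, which is the reverse of how locations are often spoken. The `geojson_point` helper and the coordinates are illustrative only.

```python
def geojson_point(latitude, longitude):
    """Build a GeoJSON Point; coordinates are [longitude, latitude]."""
    if not -90 <= latitude <= 90 or not -180 <= longitude <= 180:
        raise ValueError("coordinates out of range for WGS 84")
    return {"type": "Point", "coordinates": [longitude, latitude]}


# Illustrative coordinates only.
point = geojson_point(latitude=52.2, longitude=0.12)
```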

Use Unicode for encoding

The Unicode Transformation Format (UTF-8) standard is mandatory for use in government when encoding text or other textual representations of data.
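As a sketch, a JSON response body can be encoded explicitly as UTF-8 before it is sent. The `encode_body` helper is a hypothetical name; passing `ensure_ascii=False` keeps non-ASCII characters as themselves rather than as `\u` escapes.

```python
import json


def encode_body(payload):
    """Serialise a payload to JSON and encode it as UTF-8 bytes."""
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


body = encode_body({"name": "Zoë"})
```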

How to respond to data requests

Configure APIs to respond to explicit "requests" for data rather than "sending" or "pushing" data. This makes sure the API user only receives the information they require.

When responding, your API must answer the request fully and specifically. For example, an API should respond to the request "is this user married?" with a boolean. The answer should not return any more detail than is required and should rely on the client application to correctly interpret it.

For example:

{ "married": false }

Rather than returning an entire "Person" response:

{
  "person": {
    "name": "Alice Betterland",
    "dob": "1999-01-01",
    "married": false,
    "validFrom":"2011-04-03",
    "validTo":""
  }
}

Tip

Consider using a "fields" query parameter to allow clients to indicate which fields they want included in the response. This allows you to optimise your back end and avoid unnecessarily populating fields which are expensive to compute.
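One possible shape for such a parameter is a comma-separated list of field names, for example `?fields=name,married`. The sketch below assumes that format; `select_fields` is a hypothetical helper, and a real implementation would ideally avoid computing the unwanted fields in the first place rather than filtering them afterwards.

```python
def select_fields(record, fields_param):
    """Return only the keys named in a comma-separated "fields" parameter."""
    if not fields_param:
        return dict(record)  # no parameter given: return everything
    wanted = {name.strip() for name in fields_param.split(",") if name.strip()}
    return {key: value for key, value in record.items() if key in wanted}


person = {"name": "Alice Betterland", "dob": "1999-01-01", "married": False}
```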

Design data fields with user needs in mind

When designing your data fields, you should consider how the fields will meet user needs. Having a technical writer in your team can help you do this. You can also regularly test your documentation.

For example, if you need to collect personal information as part of your dataset, before deciding on your payload response, you may need to consider whether:

  • the design can cope with names from cultures which don’t have first and last names,
  • the abbreviation "DOB" makes sense or whether it’s better to spell out the field to date of birth, or
  • whether "DOB" makes sense when combined with "DOD" (date of death) or "DOJ" (date of joining).

You should also make sure you provide all the relevant options. For example, the "marriage" field is likely to have more than 2 states you wish to record: married, unmarried, divorced, widowed, estranged, annulled and so on.

Depending on what you decide, you may choose the following payload as a response:

{
  "people": [
    {
      "name": "Alice Wonderland",
      "dob": "1999-01-01",
      "married": true,
      "validFrom": "2010-03-12",
      "validTo": "2011-04-03"
    },
    {
      "name": "Alice Betterland",
      "dob": "1999-01-01",
      "married": false,
      "validFrom": "2011-04-03",
      "validTo": ""
    }
  ]
}

Let users download whole datasets in bulk

When providing a Data API, you should let users download whole datasets unless the datasets contain restricted information. This gives users:

  • the ability to analyse the dataset locally, and
  • support when performing a task requiring access to the whole dataset, for example, plotting a graph on school catchment areas in England.

Users should be able to index their local copy of data using their choice of database technology and then perform a query to meet their needs. This means that future API downtime won’t affect them because they already have all the data they need.

Using a record-by-record data API query to perform the same action would be suboptimal, both for the user and for the API. This is because:

  • rate limits would slow down access, or may even stop the whole dataset from downloading entirely, and
  • if the dataset is being updated while the record-by-record download is in progress, users may get inconsistent records.

If you allow a user to download an entire dataset, you should consider providing a way for them to keep it up to date. For example, you could live stream your data or notify them that new data is available so that API consumers know to download your API data periodically.

Encourage users to keep local dataset copies up to date

Don’t encourage users to keep large datasets up to date by re-downloading them because this approach is wasteful and impractical. Instead, let users download incremental lists of changes to a dataset. This allows them to keep their own local copy up to date and saves them having to re-download the whole dataset repeatedly.

There isn’t a recommended standard for this pattern, so users can try different approaches such as:

  • encoding data in Atom/RSS feeds,
  • using emergent patterns, such as event streams used by products such as Apache Kafka, or
  • making use of open data registries.
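Since no standard is recommended, the following is only one possible sketch of an incremental change list: each change carries an increasing sequence number, and a client asks for everything after the last number it has seen. The `CHANGES` log, the `op` values and both function names are hypothetical.

```python
# Hypothetical change log; "seq" increases with every change.
CHANGES = [
    {"seq": 1, "op": "create", "id": "a", "data": {"name": "Alice Wonderland"}},
    {"seq": 2, "op": "update", "id": "a", "data": {"name": "Alice Betterland"}},
    {"seq": 3, "op": "delete", "id": "b", "data": None},
]


def changes_since(seq):
    """Server side: return the changes a client has not yet seen."""
    return [change for change in CHANGES if change["seq"] > seq]


def apply_changes(local_copy, changes):
    """Client side: update a local copy and return the last sequence applied."""
    last = None
    for change in changes:
        if change["op"] == "delete":
            local_copy.pop(change["id"], None)
        else:
            local_copy[change["id"]] = change["data"]
        last = change["seq"]
    return last
```

A client that stores the returned sequence number can pass it to the next request and so never needs to download the whole dataset again.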

Use common bulk data formats

Make data available in CSV formats as well as JSON when you want to publish bulk data. This makes sure users can use a wide range of tools, including off-the-shelf software, to import and analyse this data.
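A minimal sketch of publishing the same flat records as both CSV and JSON with the Python standard library is shown below. The records are illustrative data and `to_csv` is a hypothetical helper; it assumes every record has the same keys.

```python
import csv
import io
import json

# Illustrative records only.
records = [
    {"name": "Alice Betterland", "dob": "1999-01-01"},
    {"name": "Bob Person", "dob": "1985-06-30"},
]


def to_csv(rows):
    """Serialise a list of flat dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(records)
json_text = json.dumps({"results": records})
```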

Keep a log of requests for personal data

If your API serves personal or sensitive data, you must log when the data is provided and to whom. This will help you meet your requirements under the General Data Protection Regulation (GDPR) and Data Protection Act (DPA), respond to data subject access requests, and detect fraud or misuse.
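As a sketch, each disclosure could be written to a dedicated audit logger as one structured record stating when the data was provided, to whom, and which fields were included. The logger name, the record keys and both function names are assumptions; the record deliberately holds field names rather than the personal data itself.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical logger name; route it to durable storage in a real service.
audit_log = logging.getLogger("api.audit")


def audit_record(client_id, subject_id, fields):
    """Describe one disclosure of personal data: what, to whom and when."""
    return {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "clientId": client_id,
        "subjectId": subject_id,
        "fields": sorted(fields),
    }


def log_disclosure(client_id, subject_id, fields):
    """Write the audit record as a single JSON line and return it."""
    record = audit_record(client_id, subject_id, fields)
    audit_log.info(json.dumps(record))
    return record
```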

Be careful with unauthenticated access

Use unauthenticated or "open access" APIs if you want to give unfettered access to your API and you do not need to identify your users, for example when providing open data. However, do bear in mind the risk of denial-of-service attacks.

Open access does not mean you are unable to throttle your API.
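One way to throttle without authentication is to count requests per client key, such as the source IP address, in fixed time windows. The class below is a minimal single-process sketch under that assumption; the class name and parameters are hypothetical, and a production service would more likely use the rate limiting offered by its gateway or proxy.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self.counts = {}    # client key -> (window start, requests seen)

    def allow(self, client_key):
        """Return True if the request may proceed, False if it is throttled."""
        now = self.clock()
        start, seen = self.counts.get(client_key, (now, 0))
        if now - start >= self.window:
            start, seen = now, 0  # the previous window has expired
        if seen >= self.limit:
            return False
        self.counts[client_key] = (start, seen + 1)
        return True
```

A throttled request would typically be answered with HTTP status 429 (Too Many Requests).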

Prefer authentication

Authentication is required when you want to identify clients for the purposes of:

  • rate limiting/throttling,
  • auditing,
  • billing, or
  • authorisation.

Your purpose will dictate the security requirements for your authentication solution.

Note

"Authentication" and "authorisation" are not the same thing. The Raven service documentation has a section explaining the difference.

The API gateway service provides a turn-key authentication solution which can be used for service-to-service APIs.

Provide application-level authorisation

Use application-level authorisation if you want to control which applications can access your API, but not which specific end users. This is suitable if you want to use rate limiting, auditing, or billing functionality.

The TDA recommends using OAuth 2.0, the open authorisation framework. Use the Client Credentials grant type to authenticate applications. This service gives each registered application an OAuth2 Bearer Token, which can be used to make API requests on the application’s own behalf.
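The sketch below builds, but does not send, a Client Credentials token request using HTTP Basic client authentication, and shows how an issued token would then be attached to API calls. The token endpoint URL and credentials are hypothetical; consult your authorisation server's documentation for the real endpoint and for any scopes it requires.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical token endpoint, for illustration only.
TOKEN_URL = "https://auth.example.com/oauth2/token"


def token_request(client_id, client_secret):
    """Build (but do not send) a Client Credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    data = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(TOKEN_URL, data=data, headers=headers, method="POST")


def bearer_header(access_token):
    """Header to attach to API requests once a token has been issued."""
    return {"Authorization": f"Bearer {access_token}"}
```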

The API gateway service provides a turn-key authentication solution which can be used for service-to-service APIs.

Use allow lists to limit API access

Use an allow list if you want your API to be permanently or temporarily private, for example, to run a private beta.

The API gateway service can mark an API product as needing manual approval for each application's use.

You should not add the IP addresses of the APIs you consume to your allow list. This is because APIs may be provided using Content Delivery Networks (CDNs) and scalable load balancers, which rely on flexible, rapid allocation of IP addresses and sharing. Instead of using an allow list, you should use an HTTPS egress proxy.

Follow good practice for tokens and permissions

You should:

  • Choose a suitable refresh frequency and expiry period for your user access tokens. Failure to refresh access tokens regularly can lead to vulnerabilities.
  • Allow your users to revoke authority.
  • Invalidate an access token yourselves and force a reissue if there is a reason to suspect a token has been compromised.

The API gateway service provides a turn-key solution for this.

When possible:

  • Use time-based one-time passwords (TOTP) for extra security on APIs with application-level authorisation.
  • Use multi-factor authentication (MFA) and identity verification (IV) for extra security on APIs with user-level authorisation.
  • Ensure the tokens you provide have the narrowest permissions possible. Narrowing the permissions means there’s a much lower risk to your API if the tokens are lost by users or compromised.

Monitor APIs for unusual activity

Your API security is only as good as your day-to-day security processes.

Monitor APIs for unusual behaviour just like you’d closely monitor any website. Look for changes in IP addresses or users using APIs at unusual times of the day. Read the National Cyber Security Centre (NCSC) guidance to find out how to implement a monitoring strategy and the specifics of how to monitor the security status of networks and systems.

Nomenclature

All API naming in URLs (including the name of your API, namespaces and resources) should:

  • use nouns rather than verbs,
  • be short, simple and clearly understandable,
  • be human-guessable, avoiding technical or specialist terms where possible, and
  • use hyphens rather than underscores as word separators for multi-word names.

For example, you might host your API under https://api.apps.cam.ac.uk/your-api-name.
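Part of the naming convention above can be checked mechanically. The sketch below tests a single path segment for hyphen-separated words; restricting names to lower-case letters and digits is an added assumption, not something the list above mandates, and `is_valid_name` is a hypothetical helper.

```python
import re

# Lower-case words separated by single hyphens, e.g. "your-api-name".
NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_valid_name(segment):
    """Check one URL path segment against the hyphenated naming convention."""
    return bool(NAME_PATTERN.match(segment))
```

Whether a name is a noun, short, or human-guessable still needs human review.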

Sub-resources

Sub-resources must appear under the resource they relate to, but should go no more than three deep, for example: /resource/id/sub-resource/id/sub-sub-resource.

If you reach a third level of granularity (sub-sub-resource), you should review your resource construction to see if it is actually a combination of multiple first or second level resources.

Query arguments

You should use path parameters to identify a specific resource or resources. For example, /users/1.

You should only allow query strings to be used in GET requests for filtering the values returned from an individual resource, for example /users?state=active or /users?page=2.

You should never use query strings in GET requests for identification purposes, for example, avoid using the query string /users?id=1.

Query strings should not be used for defining the behaviour of your API, for example /users?action=getUser&id=1.
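To illustrate the split between path parameters and query strings, the sketch below separates a request URL into the path segments that identify the resource and the query parameters that filter it. `parse_request` is a hypothetical helper, and it keeps only the first value when a parameter is repeated.

```python
from urllib.parse import parse_qs, urlsplit


def parse_request(url):
    """Split a URL into resource path segments and filter parameters."""
    parts = urlsplit(url)
    # Path segments identify the resource, e.g. ["users", "1"].
    segments = [segment for segment in parts.path.split("/") if segment]
    # Query parameters only filter or page the results.
    filters = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return segments, filters
```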

API iteration and versioning

When iterating your API to add new or improved functionality, you should minimise disruption for your users so that they do not incur unnecessary costs.

To minimise disruption for users, you should:

  • Make backwards compatible changes where possible, and specify that parsers should ignore properties they don’t expect or understand. Doing this allows you to add fields to update functionality without requiring changes to the client application.
  • Make a new endpoint available for significant changes.
  • Provide notices for deprecated endpoints.
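On the client side, ignoring unexpected properties might look like the following sketch. The `User` class and its fields are hypothetical; the point is that a new property added by the API, such as `nickname` in the usage below, does not break an older client.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    dob: str

    @classmethod
    def from_json(cls, text):
        """Build a User, ignoring properties this client does not know."""
        payload = json.loads(text)
        known = {field.name for field in fields(cls)}
        return cls(**{key: value for key, value in payload.items() if key in known})


user = User.from_json('{"name": "Alice Betterland", "dob": "1999-01-01", "nickname": "Al"}')
```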

New endpoints do not always need to accompany new functionality if they still maintain backward compatibility.

Consider proactively versioning your endpoints to allow for cases where backwards compatibility is not feasible. For example: /v1/resource, /v2alpha1/resource.

Backwards incompatible changes

When you need to make a backwards incompatible change you should:

  • Increment a version number in the URL or the HTTP header. Start with /v1/ and increment with whole numbers.
  • Support both old and new endpoints in parallel for a suitable time period before discontinuing the old one.
  • Tell users of your API how to validate data. For example, let them know when a field is not going to be present so they can make sure their validation rules will treat that field as optional.
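A minimal sketch of reading a whole-number version from the URL while serving old and new versions in parallel is shown below. The set of supported versions and the `split_version` helper are assumptions for illustration; the pattern deliberately accepts only whole-number versions such as /v1/ and /v2/.

```python
import re

VERSION_PATTERN = re.compile(r"^/v(\d+)(/.*)?$")
# Assumed for illustration: v1 is still served alongside v2.
SUPPORTED_VERSIONS = {1, 2}


def split_version(path):
    """Return (major version, remaining path) for a versioned URL path."""
    match = VERSION_PATTERN.match(path)
    if match is None:
        raise ValueError(f"no version prefix in {path!r}")
    version = int(match.group(1))
    if version not in SUPPORTED_VERSIONS:
        raise LookupError(f"version {version} is not available")
    return version, match.group(2) or "/"
```

Removing a version from the supported set once its deprecation period has ended is then a one-line change.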

Sometimes you’ll need to make a larger change and simplify a complex object structure by folding data from multiple objects together. In this case, make a new object available at a new endpoint, for example:

Combine data about users and accounts from /v1/users/123 and /v1/accounts/123 to produce /v1/consolidatedAccount/123.

Set clear deprecation policies

Set clear API deprecation policies so that you’re not supporting old client applications forever.

State how long users have to upgrade, and how you’ll notify them of these deadlines.

Provide users with a test service

Your API consumers will want to test their application against your API before they go live. Provide them with a test service, sometimes referred to as a sandbox.

If you have a read-only API then you do not necessarily need to provide a test service.

If your API has complex or stateful behaviour, consider providing a test service that mimics the live service as much as possible, but bear in mind the cost of doing this.

If your API requires authorisation, for example using OAuth 2.0, you’ll need to include this in your test service or provide multiple levels of a test service.

To help you decide what to provide, do user research; ask your API consumers what a sufficient test service would look like.

Test your API’s compliance

You should provide your development team with the ability to test your API using sample test data, if applicable. Testing your API should not involve using production systems and production data.

Test your API’s performance and scalability

For highly cacheable open data access APIs, a well-configured Content Delivery Network (CDN) may provide sufficient scalability.

The API gateway service offers a caching layer for the APIs it hosts.

For APIs that don’t have those characteristics, you should set quota expectations for your users in terms of capacity and rate available. Start small, according to user needs, and respond to requests to increase capacity by making sure your API can meet the quotas you have set.

Make sure users can test your full API up to the quotas you have set.

Enforce the quotas you have set, even when you have excess capacity. This makes sure that your users will get a consistent experience when you don’t have excess capacity, and will design and build to handle your API quota.

The API gateway service offers a quota and rate-limiting feature.

As with user-facing services, you should test the capacity of your APIs in a representative environment to help make sure you can meet demand.

Where the API delivers personal or private information you, as the data controller, must provide sufficient timeouts on any cached information in your delivery network.

Document your API

In your documentation, you should include the following sections.

An overview with contextual information

Cover what the API does, who it might be used by and under what circumstances.

Business and data rules

Under what circumstances the data provided by this API will be made available.

Error scenarios

Preconditions and outcomes for error conditions including error codes and payloads returned.

Details of a test service

How to use it and how to simulate the various success and error scenarios.

Request and response parameters

Including meaning, data type and any other constraints. Give examples of valid values.

Authentication

Include information on how your API is authenticated and details of any rate-limiting or quotas.

Describe any authorisation rules. For example, when using OAuth 2.0, specify which scopes are required for this API.

Management

Document any recent or planned design changes along with version information.

Detail availability, latency, ownership and deprecation policies.

Discuss your approach to backwards compatibility.

Governance

Provide guidance on configuring the API to make sure any relevant governance frameworks, such as the Payment Card Industry Data Security Standard and the Health and Social Care Network, are followed.

Security information

Discuss any security implications from use or misuse of the API.

Cost

If applicable, detail any costs associated with the API which may be incurred directly by clients or indirectly by the service provider.