Machine Learning Service Specification Template

The first stage of the experiment lifecycle is to update the ML Services section of our documentation. This is a template for the articles in that section. Use it for new services. When updating existing services, check that the article still follows this template.

Begin from the first sub-heading below. The sub-headings are fixed and should be copied verbatim. The content under each sub-heading is placeholder content. Some of it is meant to prompt your writing, such as leading questions to answer. Some of it gives specific direction on what to mention.

Executive Summary

Give a one- or two-sentence summary of the service.
This might start with "The purpose of this service is to..." or "The capability this service provides is..."
When non-engineers in the organization are discussing this service, this is the "tagline" they should have in mind. Make it easy for anyone to understand.

All machine learning projects should be related to at least one specific OKR. List those OKRs here.
Ideally, this is a list of references to other documentation that holds descriptions of our OKRs. If that does not exist, repeat the OKR word for word here.

Value to the Growth Platform

Explain which features of the growth platform plan to use the service.
Explain how these features benefit from this service beyond what could be done without machine learning.

Service Level Agreements (SLAs)

Throughput

Set the typical (median) number of requests per second that the service should expect.
Set the peak load, expressed as the 95th percentile of requests per second, that the service should reasonably be able to handle.
Identify the general level of variance in the distribution of requests over a day and over a week:
- Will the service receive a consistent stream of requests throughout a typical day, or will the requests come in spikes?
- Will the service receive a regular number of requests each day, or will some days of the week or year receive significantly higher traffic than others?
Set the target uptime as a percentage.

Use this as a template:

Median requests per second: [X]
95th percentile requests per second: [X]
Variance in daily request distribution: [high | medium | low]
Variance in weekly request distribution: [high | medium | low]
Times of year with higher than normal expected traffic: [certain holiday | certain month | None]
Uptime: [X]%
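
If historical traffic is available, the median and 95th percentile figures can be estimated from request timestamps. The following is a minimal sketch, not a prescribed method: the function names and the input format (a list of arrival times in seconds) are assumptions made for illustration, and the 95th percentile uses the simple nearest-rank definition.

```python
import math
from collections import Counter
from statistics import median


def nearest_rank_percentile(values, pct):
    """Return the nearest-rank percentile (0 < pct <= 100) of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


def requests_per_second(timestamps):
    """Count requests in each one-second bucket from the first to the last request.

    `timestamps` is a hypothetical input: request arrival times in seconds.
    Seconds with no traffic are counted as 0 so quiet periods are not ignored.
    """
    counts = Counter(math.floor(t) for t in timestamps)
    first, last = min(counts), max(counts)
    return [counts.get(second, 0) for second in range(first, last + 1)]


def throughput_summary(timestamps):
    """Summarize traffic as the two headline numbers used in the template."""
    per_second = requests_per_second(timestamps)
    return {
        "median_rps": median(per_second),
        "p95_rps": nearest_rank_percentile(per_second, 95),
    }
```

Running the same summary separately per hour or per weekday is one way to judge whether the daily and weekly variance should be recorded as high, medium, or low.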

Latency

Set the ideal upper bound, in milliseconds, on the time to receive a response to a typical request.
Set a worst-case upper bound, in milliseconds, on the time to receive a response to a typical request.
Set a worst-case upper bound, in milliseconds, on the time to receive a response to a computationally intensive request (e.g. long-form text generation, or prediction for a client with high customer volume).

Use this as a template:

Ideal upper bound, typical request: [X]ms.
Worst-case upper bound, typical request: [X]ms.
Worst-case upper bound, intensive request: [X]ms.
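
To make the three bounds concrete, the sketch below shows one way they could be checked against an observed latency. The class name, field names, and the "ideal" / "acceptable" / "violation" labels are all hypothetical and chosen only for illustration; they are not part of any existing tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencySLA:
    """The three latency bounds from the template, in milliseconds."""

    ideal_typical_ms: float
    worst_typical_ms: float
    worst_intensive_ms: float

    def classify(self, latency_ms: float, intensive: bool = False) -> str:
        """Return "ideal", "acceptable", or "violation" for one observed latency."""
        if intensive:
            # Intensive requests only have a worst-case bound in the template.
            if latency_ms <= self.worst_intensive_ms:
                return "acceptable"
            return "violation"
        if latency_ms <= self.ideal_typical_ms:
            return "ideal"
        if latency_ms <= self.worst_typical_ms:
            return "acceptable"
        return "violation"
```

For example, with placeholder bounds `LatencySLA(100, 500, 2000)`, a typical request answered in 300 ms would be classified as acceptable, while one answered in 600 ms would be a violation.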

Schema

Write out the Instance, Parameters, and Prediction schema for the service.
Include any runtime restrictions as well as descriptions of the fields in the data models.
It should be possible to port this schema fairly directly to a schema.py file in an experiment.
If it is a new major version, overwrite the schema in this document.
Write it out according to the format given in the example below.

For example, take a service which could receive the following request and response:

Request: {"instances": [{"content": "Some content"}], "parameters": {}}

Response: {"predictions": [{"sentiment": "positive"}]}

The schema corresponding to this example would be represented by the following YAML-like structure in the service specification:

- Instance
    - ID: content
      Description: Text content from a review, website, or email.
      Data Type: String
      Runtime Restrictions:
        - Maximum length is 512 characters
- Parameters
    - None.
- Prediction
    - ID: sentiment
      Description: Text content can have a positive, negative, or neutral tone. The sentiment of a text is its overall tone.
      Data Type: Enum ("positive" | "neutral" | "negative")
    - ID: probabilities
      Description: The model predicts a probability distribution over the possible sentiment categories. These are the probabilities of each category.
      Data Type: Struct
        positive: float
        neutral: float
        negative: float
      Runtime Restrictions:
        - Each value must be between 0 and 1, inclusive.
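
As an illustration of how the structure above might be ported to a schema.py file, here is a minimal sketch using only the Python standard library. The conventions actually expected in an experiment's schema.py are not described in this template, so the use of dataclasses, the class names, and the validation style are assumptions; only the field names, types, and runtime restrictions come from the example above.

```python
from dataclasses import dataclass
from enum import Enum

# Runtime restriction from the example schema: content is at most 512 characters.
MAX_CONTENT_LENGTH = 512


class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"


@dataclass(frozen=True)
class Instance:
    """Text content from a review, website, or email."""

    content: str

    def __post_init__(self):
        if len(self.content) > MAX_CONTENT_LENGTH:
            raise ValueError(f"content exceeds {MAX_CONTENT_LENGTH} characters")


@dataclass(frozen=True)
class Parameters:
    """The example service takes no parameters."""


@dataclass(frozen=True)
class Probabilities:
    """Predicted probability of each sentiment category."""

    positive: float
    neutral: float
    negative: float

    def __post_init__(self):
        # Runtime restriction: each value must be between 0 and 1, inclusive.
        for name in ("positive", "neutral", "negative"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")


@dataclass(frozen=True)
class Prediction:
    """The overall tone of the text and the per-category probabilities."""

    sentiment: Sentiment
    probabilities: Probabilities
```

Keeping the runtime restrictions next to the fields they constrain, as in the `__post_init__` checks here, mirrors how the specification lists them under each ID.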

Feedback Mechanisms

Describe how the client and/or customer will interact with this service, via application features, to produce data that can be used to improve the service.
Describe how that data will be used to improve the service in the future.
Technical detail and rigour are not necessary here; the metrics chosen later in the lifecycle will be the specific implementation of the general ideas in this section.