Service Specification
Executive Summary
The content completion service is a machine learning service that automatically completes or extends a given piece of content. For example, when provided with the following input content:
"Mark’s Window Cleaners is the best window cleaning business in the entirety of Ontario."
The content completion service extends the paragraph to:
"Mark’s Window Cleaners is the best window cleaning business in the entirety of Ontario. If you're looking for the best window cleaning business in all of Ontario, look no further than Mark's Window Cleaners. We provide top-notch service at an affordable price, and our customer satisfaction is second to none. Contact us today to schedule a free consultation, and see for yourself why we're the best in the business."
Related Objectives and Key Results (OKRs)
N/A
Value to Growth Platform
Content completion allows the convert team to automatically generate content for a convert site, provided that some context is given. It is a foundational capability that will unlock new customer benefits in future applications such as automated messaging and review replies.
Service Level Agreements (SLAs)
Throughput
Median requests per day: 4
Variability in Daily Request Distribution: Medium
Variability in Weekly Request Distribution: Medium
Latency
Median: 4.6 seconds
95th Percentile: 26.9 seconds
99th Percentile: 56 seconds
Worst Case Latency: 59 seconds
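Because the worst-case latency approaches one minute, callers may want a client timeout above 59 seconds rather than a default of a few seconds. The following is a minimal Python sketch, not the service's client: `send` is a hypothetical callable standing in for whatever HTTP client issues the POST, and the 5-second margin is an illustrative assumption that is not part of the SLA.

```python
from typing import Any, Callable

# Latency figures copied from the SLA above (seconds).
MEDIAN_LATENCY_S = 4.6
P95_LATENCY_S = 26.9
P99_LATENCY_S = 56.0
WORST_CASE_LATENCY_S = 59.0


def client_timeout(margin_s: float = 5.0) -> float:
    """Return a timeout that covers the worst-case latency plus a margin.

    The margin is an assumption for illustration, not part of the SLA.
    """
    if margin_s < 0:
        raise ValueError("margin_s must be non-negative")
    return WORST_CASE_LATENCY_S + margin_s


def call_service(send: Callable[..., Any], payload: dict, margin_s: float = 5.0) -> Any:
    """Invoke a caller-supplied `send(payload, timeout=...)` function.

    `send` is hypothetical: it stands in for the code that performs the POST.
    """
    return send(payload, timeout=client_timeout(margin_s))
```

Deriving the timeout from the worst case rather than the median avoids cutting off the slow tail of requests, at the price of waiting longer before a genuinely failed call is detected.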
Schema
A POST request must be submitted to the API. The defined schema of this service is:
-Instance
- ID: prompt
Description: Input text for the generation.
Data Type: String
Runtime Restrictions:
- Must be longer than 10 characters
-Parameters
- ID: seed
Description: Set the random seed for generation.
Data Type: Integer
Runtime Restrictions:
- None
- ID: num_return_sequences
Description: Number of outputs to return.
Data Type: Integer
Runtime Restrictions:
- None
- ID: num_sentences
Description: Number of sentences in each output (local model only).
Data Type: Integer
Runtime Restrictions:
- None
- ID: max_length
Description: Set the maximum length of the generated text.
Data Type: Integer
Runtime Restrictions:
- None
- ID: min_length
Description: Set the minimum length of the generated text.
Data Type: Integer
Runtime Restrictions:
- None
- ID: num_beams_gpt2
Description: Number of beams for beam search. More beams generally improve the output but slow down generation.
Data Type: Integer
Runtime Restrictions:
- None
- ID: num_beam_groups_gpt2
Description: Number of beam groups. Applies only when do_sample is False; num_beams must be divisible by this value.
Data Type: Integer
Runtime Restrictions:
- None
- ID: no_repeat_ngram_size_gpt2
Description: Size of the n-grams that may not be repeated in the output. With the default of 2, no 2-gram appears more than once.
Data Type: Integer
Runtime Restrictions:
- None
- ID: do_sample_gpt2
Description: If true, sample the next token using the top-k and top-p settings.
Data Type: Boolean
Runtime Restrictions:
- None
- ID: top_k_gpt2
Description: Only the k most probable next tokens are kept, and the probability mass is redistributed among them. See https://huggingface.co/blog/decision-transformers.
Data Type: Integer
Runtime Restrictions:
- None
- ID: top_p_gpt2
Description: Samples from the smallest set of most probable tokens whose cumulative probability exceeds p.
Data Type: Float
Runtime Restrictions:
- None
- ID: num_beams_gpt3
Description: Number of beams for GPT-3. More beams generally improve the output but increase the GPT-3 cost.
Data Type: Integer
Runtime Restrictions:
- None
- ID: temperature
Description: Sampling temperature. Lower values make the outputs more deterministic and repetitive.
Data Type: Float
Runtime Restrictions:
- Must be between 0 and 1, inclusive.
- ID: gpt3_selection_probability
Description: Probability that a request calls the GPT-3 API rather than GPT-2.
Data Type: Float
Runtime Restrictions:
- Must be between 0 and 1, inclusive.
- ID: repetition_penalty_gpt3
Description: Penalize words that were already generated or belong to the context. See https://beta.openai.com/docs/api-reference/parameter-details.
Data Type: Float
Runtime Restrictions:
- Must be between -2 and 2, exclusive.
- ID: presence_penalty_gpt3
Description: Penalize new tokens if they are already in the text. See https://beta.openai.com/docs/api-reference/parameter-details.
Data Type: Float
Runtime Restrictions:
- Must be between -2 and 2, exclusive.
- ID: need_sentiment
Description: Whether to return individual sentiment scores for the generated content.
Data Type: Boolean
Runtime Restrictions:
- None
- ID: gpt3_model
Description: GPT-3 model to use. Select from text-davinci-002, text-curie-001, or text-ada-001.
Data Type: String
Runtime Restrictions:
- None
- ID: user_name
Description: Email address or name of the API caller, used to uniquely identify the request.
Data Type: String
Runtime Restrictions:
- None
-Prediction
- ID: generated_text
Description: The generated text.
Data Type: String
Runtime Restrictions:
- None
- ID: prediction_id
Description: A unique identifier for each prediction instance used for logging. The value maps to the table in BigQuery and can be ignored by users.
Data Type: String
Runtime Restrictions:
- None
- ID: sentiment
Description: Sentiment score indicating positive, neutral, or negative sentiment.
Data Type: Float
Runtime Restrictions:
- None
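The runtime restrictions listed in the schema can be checked on the client before a request is submitted. The Python sketch below is illustrative only: the top-level `instances` and `parameters` keys are an assumption inferred from the schema's section names and are not confirmed by this specification, and `validate_request` and `build_body` are hypothetical helper names.

```python
import json

# Parameters restricted to the closed interval 0 <= value <= 1.
UNIT_INTERVAL_PARAMS = ("temperature", "gpt3_selection_probability")
# Parameters restricted to the open interval -2 < value < 2.
PENALTY_PARAMS = ("repetition_penalty_gpt3", "presence_penalty_gpt3")


def validate_request(prompt: str, parameters: dict) -> list:
    """Return a list of violations of the documented runtime restrictions."""
    errors = []
    if len(prompt) <= 10:
        errors.append("prompt must be longer than 10 characters")
    for name in UNIT_INTERVAL_PARAMS:
        if name in parameters and not 0 <= parameters[name] <= 1:
            errors.append(f"{name} must be between 0 and 1, inclusive")
    for name in PENALTY_PARAMS:
        if name in parameters and not -2 < parameters[name] < 2:
            errors.append(f"{name} must be between -2 and 2, exclusive")
    return errors


def build_body(prompt: str, parameters: dict) -> str:
    """Serialize a request body; the top-level keys are assumed, not documented."""
    errors = validate_request(prompt, parameters)
    if errors:
        raise ValueError("; ".join(errors))
    return json.dumps({"instances": [{"prompt": prompt}], "parameters": parameters})


# Example: a valid request that also asks for sentiment scores.
body = build_body(
    "Mark's Window Cleaners is the best window cleaning business in Ontario.",
    {"temperature": 0.7, "num_return_sequences": 1, "need_sentiment": True},
)
```

Collecting every violation in one list, instead of raising on the first, lets a caller report all problems with a request at once.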
Feedback Mechanisms
The main feedback mechanism is direct communication with end users, who are currently limited to convert team members. Understanding the difficulties they encounter when using the content completion service helps identify the specific improvements that need to be made.
In addition, response data is collected whenever a request is made to the service, along with any changes that convert team members make to the generated text. This data can be used to fine-tune either of the two models in the future.
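One possible way to turn the collected edits into a signal is to measure how much of each generated text survived editing. The Python sketch below uses the standard library's `difflib` for this; the `retained_ratio` function and its interpretation are assumptions for illustration, since the service's actual logging and fine-tuning pipeline is not described here.

```python
from difflib import SequenceMatcher


def retained_ratio(generated: str, edited: str) -> float:
    """Word-level similarity in [0, 1] between generated and edited text.

    1.0 means the text was kept unchanged; values near 0 mean it was
    almost entirely rewritten.
    """
    matcher = SequenceMatcher(None, generated.split(), edited.split(), autojunk=False)
    return matcher.ratio()
```

Under this assumption, outputs with a high ratio could serve as positive examples, while heavily rewritten outputs point to prompts where the models need improvement.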