Service Specification
Executive Summary
The content completion service is a machine learning service that automatically completes or extends a given piece of content. For example, when provided with the following input content:
"Mark’s Window Cleaners is the best window cleaning business in the entirety of Ontario."
The content completion service extends the paragraph to:
"Mark’s Window Cleaners is the best window cleaning business in the entirety of Ontario. If you're looking for the best window cleaning business in all of Ontario, look no further than Mark's Window Cleaners. We provide top-notch service at an affordable price, and our customer satisfaction is second to none. Contact us today to schedule a free consultation, and see for yourself why we're the best in the business."
Related Objectives and Key Results (OKRs)
N/A
Value to Growth Platform
Content completion allows the convert team to automatically generate content for a convert site, provided that context is given. It is a foundational capability that will unlock new customer benefits in future applications such as automated messaging and review replies.
Service Level Agreements (SLAs)
Throughput
Median requests per day: 4
Variability in Daily Request Distribution: Medium
Variability in Weekly Request Distribution: Medium
Latency
Median: 4.6 seconds
95th Percentile: 26.9 seconds
99th Percentile: 56 seconds
Worst Case Latency: 59 seconds
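Because the tail latencies above sit far above the median, a caller may prefer a timeout based on the worst case rather than the median. The sketch below shows one hypothetical way to derive a client-side timeout from these figures. The `SLA_LATENCY_SECONDS` table only restates the numbers in this section; the `client_timeout` helper and its 10% safety margin are illustrative assumptions, not part of the service.

```python
# Latency figures restated from this section, in seconds.
SLA_LATENCY_SECONDS = {
    "median": 4.6,
    "p95": 26.9,
    "p99": 56.0,
    "worst_case": 59.0,
}


def client_timeout(level: str = "worst_case", margin: float = 0.10) -> float:
    """Return a timeout in seconds: the chosen SLA figure plus headroom.

    `margin` is a hypothetical fraction of extra headroom, not a documented value.
    """
    if level not in SLA_LATENCY_SECONDS:
        raise ValueError(f"unknown latency level: {level!r}")
    if margin < 0:
        raise ValueError("margin must be non-negative")
    return round(SLA_LATENCY_SECONDS[level] * (1 + margin), 1)


if __name__ == "__main__":
    # Print the suggested timeout for each published latency level.
    for level in SLA_LATENCY_SECONDS:
        print(level, client_timeout(level))
```

A timeout near the median would cut off a large share of requests, since the 95th percentile is nearly six times the median.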
Schema
Requests must be submitted to the API as POST requests. The service defines the following schema:
- Instance
  - ID: prompt
    Description: Input text for the generation.
    Data Type: String
    Runtime Restrictions: Must be greater than 10 characters.
- Parameters
  - ID: seed
    Description: Random seed for generation.
    Data Type: Integer
    Runtime Restrictions: None
  - ID: num_return_sequences
    Description: Number of outputs to return.
    Data Type: Integer
    Runtime Restrictions: None
  - ID: num_sentences
    Description: Number of sentences in one output (local model only).
    Data Type: Integer
    Runtime Restrictions: None
  - ID: max_length
    Description: Maximum length of the generated text.
    Data Type: Integer
    Runtime Restrictions: None
  - ID: min_length
    Description: Minimum length of the generated text.
    Data Type: Integer
    Runtime Restrictions: None
  - ID: num_beams_gpt2
    Description: Number of beams. More beams generally give better output but slow generation.
    Data Type: Integer
    Runtime Restrictions: None
  - ID: num_beam_groups_gpt2
    Description: Number of beam groups. Applies when do_sample_gpt2 is false; num_beams_gpt2 must be divisible by this value.
    Data Type: Integer
    Runtime Restrictions: None
  - ID: no_repeat_ngram_size_gpt2
    Description: Size of the n-grams that may not be repeated in the generated text (2-grams by default).
    Data Type: Integer
    Runtime Restrictions: None
  - ID: do_sample_gpt2
    Description: If turned on, sampling with top-k and top-p is used.
    Data Type: Boolean
    Runtime Restrictions: None
  - ID: top_k_gpt2
    Description: Only the k most likely next tokens are kept, and the probability mass is redistributed among them. See https://huggingface.co/blog/decision-transformers.
    Data Type: Integer
    Runtime Restrictions: None
  - ID: top_p_gpt2
    Description: Samples from the smallest set of tokens whose cumulative probability exceeds p.
    Data Type: Float
    Runtime Restrictions: None
  - ID: num_beams_gpt3
    Description: Number of beams for GPT-3. More beams generally give better output but increase cost.
    Data Type: Integer
    Runtime Restrictions: None
  - ID: temperature
    Description: Lower values make the outputs more deterministic and repetitive.
    Data Type: Float
    Runtime Restrictions: Must be between 0 and 1, inclusive.
  - ID: gpt3_selection_probability
    Description: Probability of calling the GPT-3 API rather than GPT-2.
    Data Type: Float
    Runtime Restrictions: Must be between 0 and 1, inclusive.
  - ID: repetition_penalty_gpt3
    Description: Penalize words that were already generated or belong to the context. See https://beta.openai.com/docs/api-reference/parameter-details.
    Data Type: Float
    Runtime Restrictions: Must be between -2 and 2, exclusive.
  - ID: presence_penalty_gpt3
    Description: Penalize new tokens if they are already in the text. See https://beta.openai.com/docs/api-reference/parameter-details.
    Data Type: Float
    Runtime Restrictions: Must be between -2 and 2, exclusive.
  - ID: need_sentiment
    Description: Whether to return individual sentiment scores for the generated content.
    Data Type: Boolean
    Runtime Restrictions: None
  - ID: gpt3_model
    Description: GPT-3 model to use. Select from text-davinci-002, text-curie-001, text-ada-001.
    Data Type: String
    Runtime Restrictions: None
  - ID: user_name
    Description: Email address or name of the API caller, used to uniquely identify the request.
    Data Type: String
    Runtime Restrictions: None
- Prediction
  - ID: generated_text
    Description: Generated text.
    Data Type: String
    Runtime Restrictions: None
  - ID: prediction_id
    Description: A unique identifier for each prediction instance, used for logging. The value maps to the table in BigQuery and can be ignored by users.
    Data Type: String
    Runtime Restrictions: None
  - ID: sentiment
    Description: Positive, neutral, or negative sentiment.
    Data Type: Float
    Runtime Restrictions: None
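A request body that satisfies the runtime restrictions above might be assembled as follows. This is a sketch only: the document gives neither the JSON envelope nor the endpoint URL, so the `instances`/`parameters` layout and the `build_request` helper are assumptions. Only the field names and the restrictions are taken from the schema.

```python
import json


def build_request(prompt: str, **parameters) -> str:
    """Check the documented runtime restrictions and return a JSON body."""
    if len(prompt) <= 10:
        raise ValueError("prompt must be greater than 10 characters")

    # Inclusive [0, 1] range for these two parameters.
    for name in ("temperature", "gpt3_selection_probability"):
        value = parameters.get(name)
        if value is not None and not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1, inclusive")

    # Exclusive (-2, 2) range for the GPT-3 penalties.
    for name in ("repetition_penalty_gpt3", "presence_penalty_gpt3"):
        value = parameters.get(name)
        if value is not None and not -2 < value < 2:
            raise ValueError(f"{name} must be between -2 and 2, exclusive")

    # Assumed envelope: the schema names an Instance and a Parameters section,
    # but the exact JSON layout is not given in this document.
    body = {"instances": [{"prompt": prompt}], "parameters": parameters}
    return json.dumps(body)


if __name__ == "__main__":
    print(
        build_request(
            "Mark's Window Cleaners is the best window cleaning business.",
            temperature=0.7,
            num_return_sequences=1,
            need_sentiment=True,
        )
    )
```

Validating on the client side avoids spending a slow round trip on a request the service would reject.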
Feedback Mechanisms
The main feedback mechanism is direct communication with the end users, who are currently limited to convert team members. Understanding the difficulties they encounter when using the content completion service lets us identify the specific improvements that need to be made.
In addition, response data is collected whenever a request is made to the service, along with any changes that convert team members make to the generated text. This data can be used to fine-tune either of the two models (GPT-2 or GPT-3) in the future.