
Creating an Endpoint in Generative AI

Create an endpoint for a custom or pretrained model on a hosting dedicated AI cluster in OCI Generative AI.
In the navigation bar of the Console, select a region with Generative AI, for example, US Midwest (Chicago) or UK South (London). See which models are offered in your region.
Open the navigation menu and click Analytics & AI. Under AI Services, click Generative AI.
Select the compartment that contains the custom model that you want to add an endpoint to.
Perform one of the following actions:
To create an endpoint for a custom model with the model name and version pre-populated:
Click Custom models.
Click the name of the custom model that you want to add an endpoint for.
Check the base model for the custom model so that you can match it to a cluster in the following steps. For example, cohere.command-r-plus.
Under Resources, click Endpoints.
Click Create endpoint.
To create an endpoint for any out-of-the-box pretrained or custom model:
Click Endpoints.
Click Create endpoint.
(Optional) Enter a name for the endpoint. Start the name with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be 1 to 255 characters. If you don't enter a name, the system generates a name that you can change later.
The generated name has the format generativeaiendpoint<timestamp>, for example, generativeaiendpoint20240531235319.
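The naming rule above can be checked before you type a name into the Console. The following is a minimal sketch, not part of the service: the helper names are hypothetical, "letters" is assumed to mean ASCII letters, and the timestamp layout (YYYYMMDDHHMMSS) is only inferred from the single example generativeaiendpoint20240531235319.

```python
import re
from datetime import datetime

# A letter or underscore first, then letters, digits, hyphens, or underscores.
# The {0,254} bound keeps the total length between 1 and 255 characters.
# Assumption: "letters" means ASCII letters only.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def default_endpoint_name(now: datetime) -> str:
    """Build a name shaped like the generated default.

    Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from the
    documented example; the service's actual format may differ.
    """
    return "generativeaiendpoint" + now.strftime("%Y%m%d%H%M%S")


if __name__ == "__main__":
    for candidate in ("my-endpoint_1", "_staging", "1endpoint", ""):
        print(repr(candidate), is_valid_endpoint_name(candidate))
    print(default_endpoint_name(datetime(2024, 5, 31, 23, 53, 19)))
```

A name such as 1endpoint is rejected because it starts with a digit, and an empty string is rejected because the minimum length is 1.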
(Optional) To moderate the model's generated responses, turn on the Content moderation toggle. This option is off by default. Learn about Content Moderation. You can also add this feature later when you edit the endpoint.
If the model isn't already selected, choose the model name and version that you want to add an endpoint for.
 Tip
If the model is in a different compartment than the current compartment, click Change compartment and choose the compartment that hosts the model. We recommend that you create the endpoint in the same compartment as the model.
If the custom model that you're looking for isn't listed, click Cancel. Then under Generative AI, click Custom models and ensure that the custom model is in an active state.
 Choose a hosting dedicated AI cluster by performing one of the following actions:
If you already have a cluster, choose a Dedicated AI cluster from the drop-down list. If you just created a cluster, wait for that cluster to become active. Ensure that the base model that's associated with this cluster matches the base model of the custom model.
To create a cluster, in the Dedicated AI cluster drop-down list, click Create new dedicated AI cluster and perform the following steps:
(Optional) Enter a name and description.
Choose a Base model that matches the base model of the model that you want to host.
Add 1 model replica to the endpoint. When you create a cluster, you need at least one unit for an endpoint. For an existing cluster, you can use that same unit to host new endpoints. Each instance hosts all the active endpoints. Going from 1 to 2 instances doubles the number of supported RPM for all active endpoints hosted on the cluster.
Read the commitment unit hours for the hosting dedicated AI cluster and select the checkbox to agree to the commitment.
Click Create and wait for the cluster to become active.
From the Dedicated AI cluster drop-down list, click the dedicated AI cluster that you created.
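The replica guidance in the cluster steps can be turned into a quick capacity estimate. This is a rough sketch under stated assumptions: the documentation only says that going from 1 to 2 instances doubles the supported RPM, so scaling linearly beyond 2 is an extrapolation, and the per-instance figure of 100 RPM is a made-up placeholder, not a published limit.

```python
def supported_rpm(rpm_per_instance: int, instances: int) -> int:
    """Estimate the RPM shared by all active endpoints on a cluster.

    Assumption: capacity scales linearly with the instance count. The
    source only confirms the 1 -> 2 case (doubling).
    """
    if instances < 1:
        raise ValueError("a hosting cluster needs at least one instance")
    if rpm_per_instance < 0:
        raise ValueError("rpm_per_instance must not be negative")
    return rpm_per_instance * instances


if __name__ == "__main__":
    baseline = 100  # hypothetical per-instance RPM, for illustration only
    for count in (1, 2):
        print(count, "instance(s):", supported_rpm(baseline, count), "RPM")
```

Because each instance hosts all the active endpoints, the estimate applies to the cluster as a whole rather than to any single endpoint.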
(Optional) Click Show advanced options and assign tags to the endpoint.
Click Create endpoint.
You're directed to the endpoint details page where you can track the state of the endpoint.
After the endpoint is active, click View in playground and start using the model from this endpoint.
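If you script against the service instead of watching the details page, the "wait until active" steps in this procedure amount to a polling loop. The sketch below is generic and makes no service calls: get_state is a hypothetical callable that you supply (for example, a wrapper around whatever SDK or CLI call returns the resource's state), and the state names ACTIVE, FAILED, and DELETED are assumptions about what that callable returns.

```python
import time
from typing import Callable


def wait_until_active(
    get_state: Callable[[], str],
    interval_s: float = 10.0,
    timeout_s: float = 1800.0,
) -> str:
    """Poll get_state() until it reports ACTIVE.

    get_state is a caller-supplied (hypothetical) function that returns
    the current state of the cluster or endpoint as a string.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state == "ACTIVE":
            return state
        # Assumed terminal states: the resource will never become active.
        if state in ("FAILED", "DELETED"):
            raise RuntimeError(f"resource entered terminal state {state}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {state} after {timeout_s} seconds")
        time.sleep(interval_s)


if __name__ == "__main__":
    # Simulated state sequence standing in for real service responses.
    simulated = iter(["CREATING", "CREATING", "ACTIVE"])
    print(wait_until_active(lambda: next(simulated), interval_s=0.0))
```

The same helper works for both the hosting dedicated AI cluster and the endpoint, because each must be active before the next step.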
