In a previous article, we mentioned that Llama 2 is available on Sagemaker. In this piece, we will walk you through deploying Llama 2 on AWS Sagemaker.
Deploying on AWS Sagemaker
To deploy the Llama-2-7B model on AWS Sagemaker, you need an AWS account with administrative rights. Begin by logging in and navigating to the Amazon Sagemaker console, preferably in the us-east-1 (N. Virginia) region.
Step-1: Check your Quotas
Not all resources in Amazon Sagemaker are automatically available, so it’s advisable to perform a preliminary check.
![](https://llama-2.ai/wp-content/uploads/2023/10/llama-on-aws-step-1-1024x456.webp)
Search for these service quotas in Sagemaker:
- Total domains
- Maximum number of Studio user profiles allowed per account
- ml.g5.2xlarge for endpoint usage
- Maximum number of running Studio apps allowed per account
If any of these services display a quota value of 0, you’ll need to submit a request for a quota increase. Monitor the status of your requests in the quota request history. Be aware that approvals can sometimes take up to 2 days.
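If you prefer to script this check, the Service Quotas API exposes the same values as the console. Below is a minimal sketch using boto3; it assumes your AWS credentials are configured, and the exact quota names may vary slightly by region and SDK version:

```python
# Quota names to verify before deploying (as listed above).
REQUIRED_QUOTAS = [
    "Total domains",
    "Maximum number of Studio user profiles allowed per account",
    "ml.g5.2xlarge for endpoint usage",
    "Maximum number of running Studio apps allowed per account",
]


def find_zero_quotas(quotas, required):
    """Return the names of required quotas whose applied value is 0.

    `quotas` is a list of dicts shaped like the Service Quotas API
    response items, e.g. {"QuotaName": ..., "Value": ...}.
    Quotas missing from the response are treated as 0.
    """
    values = {q["QuotaName"]: q.get("Value", 0) for q in quotas}
    return [name for name in required if values.get(name, 0) == 0]


def fetch_sagemaker_quotas(region="us-east-1"):
    """Pull all SageMaker quotas for the account (requires AWS credentials)."""
    import boto3  # deferred import so the helper above works without AWS set up

    client = boto3.client("service-quotas", region_name=region)
    quotas = []
    for page in client.get_paginator("list_service_quotas").paginate(
        ServiceCode="sagemaker"
    ):
        quotas.extend(page["Quotas"])
    return quotas


if __name__ == "__main__":
    for name in find_zero_quotas(fetch_sagemaker_quotas(), REQUIRED_QUOTAS):
        print(f"Quota increase needed: {name}")
```

Any quota the script flags is one you should raise an increase request for before continuing.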
![](https://llama-2.ai/wp-content/uploads/2023/10/step-2-1024x282.webp)
Step-2: Create Domain
Setting Up a Domain:
- Start by creating a domain, especially if this is your maiden voyage with Sagemaker.
- Opt for “Quick Setup”.
- Decide on a domain name.
- The user profile name can be left as is or modified to your preference.
- If you don’t already have one, you’ll need to establish a role.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-3-1024x484.webp)
Then choose “Any S3 bucket” and hit create.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-4-1024x470.webp)
Finalizing Domain Setup:
- Review your settings; they should match the described configuration.
- Click on “Submit” to finalize the domain creation.
- If you encounter any errors during this process, they are likely related to user permissions or VPC setup.
Launching and Deploying
- Once your domain and user profile are set up, proceed to launch the Sagemaker Studio.
- From there, you can deploy your model.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-5-1024x406.webp)
![](https://llama-2.ai/wp-content/uploads/2023/10/step-6-1024x466.webp)
Go to Jumpstart and search for Llama-2-7b-chat.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-7-1024x472.webp)
![](https://llama-2.ai/wp-content/uploads/2023/10/step-8.webp)
You can stick with the default settings. The ml.g5.2xlarge is the minimum instance needed to operate the Llama-2-7B model. Be aware that it’s priced at $1.515 per hour, summing up to $36.36 daily if it’s continuously active.
According to our calculations, keeping the endpoint running around the clock costs roughly $1,090 per month for the instance alone; with Studio, storage, and data-transfer charges on top, budget up to about $1.5K for the full Llama 2 AWS environment.
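These figures follow directly from the hourly rate:

```python
HOURLY_RATE = 1.515  # USD/hour for ml.g5.2xlarge on-demand (us-east-1; check current pricing)
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30

daily_cost = HOURLY_RATE * HOURS_PER_DAY        # cost of a continuously running endpoint per day
monthly_cost = daily_cost * DAYS_PER_MONTH      # instance cost alone, before storage/Studio charges

print(f"Daily:   ${daily_cost:.2f}")    # $36.36
print(f"Monthly: ${monthly_cost:.2f}")  # $1090.80
```

If you only need the endpoint intermittently, deleting it when idle and redeploying later cuts this cost dramatically.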
To set up the model as an endpoint, hit “Deploy”. Before moving forward, you’ll have to agree to the license terms.
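If you would rather deploy from a notebook than click through the console, the SageMaker Python SDK's JumpStart classes can do the same thing. A minimal sketch follows; the `MODEL_ID` below is our assumption of the JumpStart identifier for the 7B chat model (verify it in your Studio's JumpStart listing), and note that `accept_eula=True` agrees to Meta's license terms just like the console prompt does:

```python
# Assumes the `sagemaker` Python SDK is installed and AWS credentials are configured.
MODEL_ID = "meta-textgeneration-llama-2-7b-f"  # assumed JumpStart id for Llama-2-7b-chat
INSTANCE_TYPE = "ml.g5.2xlarge"  # smallest instance that runs the 7B model


def deploy_llama2_chat(model_id=MODEL_ID, instance_type=INSTANCE_TYPE):
    """Deploy the JumpStart model and return a Predictor for the new endpoint."""
    from sagemaker.jumpstart.model import JumpStartModel  # deferred import

    model = JumpStartModel(model_id=model_id)
    # accept_eula=True acknowledges the Llama 2 license, same as the console dialog.
    return model.deploy(instance_type=instance_type, accept_eula=True)


if __name__ == "__main__":
    predictor = deploy_llama2_chat()
    print("Endpoint name:", predictor.endpoint_name)
```

When you are finished experimenting, call `predictor.delete_endpoint()` (or delete the endpoint in the console) so the hourly charge stops.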
![](https://llama-2.ai/wp-content/uploads/2023/10/step-9.webp)
The deployment process might take a short while.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-11-1024x472.webp)
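Once the endpoint status shows InService, you can invoke it with the runtime API. The payload shape below (a list of role/content dialogs under `inputs`) is our assumption of the schema the JumpStart Llama 2 chat endpoints used at the time of writing; check the model's example notebook for the authoritative format:

```python
import json


def build_chat_payload(user_message, system_prompt=None,
                       max_new_tokens=256, temperature=0.6):
    """Build the JSON body for a Llama-2 chat endpoint (assumed schema)."""
    dialog = []
    if system_prompt:
        dialog.append({"role": "system", "content": system_prompt})
    dialog.append({"role": "user", "content": user_message})
    return {
        "inputs": [dialog],
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def invoke(endpoint_name, payload, region="us-east-1"):
    """Call the deployed endpoint (requires AWS credentials)."""
    import boto3  # deferred so the payload helper works without AWS set up

    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        # Llama 2 JumpStart endpoints require accepting the EULA per request.
        CustomAttributes="accept_eula=true",
        Body=json.dumps(payload),
    )
    return json.loads(response["Body"].read())


if __name__ == "__main__":
    # Replace with the endpoint name shown in the SageMaker console.
    result = invoke("YOUR-ENDPOINT-NAME",
                    build_chat_payload("What is Amazon SageMaker?"))
    print(result)
```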
Credits: Mudassir Aqeel Ahmed