In previous article we mentioned that Llama 2 is available on Sagemaker. In this piece we will guide you how to Deploy Llama 2 on AWS Sagemaker.
Deploying on AWS Sagemaker
To deploy the Llama-2–7B model on AWS Sagemaker, ensure you possess an AWS Account with administrative rights. Begin by logging in and navigating to the Amazon Sagemaker console, preferably in the us-east-1, N. Virginia region.
Step-1: Check your Quotas
Not all resources in Amazon Sagemaker are automatically available, so it’s advisable to perform a preliminary check.
Search for these service quotas in Sagemaker,
- Total domains
- Maximum number of Studio user profiles allowed per account
- ml.g5.2xlarge for endpoint usage
- Maximum number of running Studio apps allowed per account
If any of these services display a quota value of 0, you’ll need to submit a request for a quota increase. Monitor the status of your requests in the quota request history. Be aware that approvals can sometimes take up to 2 days.
Step-2: Create Domain
Setting Up a Domain:
- Start by creating a domain, especially if this is your maiden voyage with Sagemaker.
- Opt for “Quick Setup”.
- Decide on a domain name.
- The user profile name can be left as is or modified to your preference.
- If you don’t already have one, you’ll need to establish a role.
Then choose “Any S3 bucket” and hit create.
Finalizing Domain Setup:
- Review your settings; it should match the described configuration.
- Click on “Submit” to finalize the domain creation.
- If you encounter any errors during this process, they are likely related to user permissions or VPC setup.
Launching and Deploying
- Once your domain and user profile are set up, proceed to launch the Sagemaker Studio.
- From there, you can deploy your model.
Go to Jumpstart and search for Llama2–7b-chat.
You can stick with the default settings. The ml.g5.2xlarge is the minimum instance needed to operate the Llama2–7B model. Be aware that it’s priced at $1.515 per hour, summing up to $36.36 daily if it’s continuously active.
According to our calculations, LLama 2 AWS Environment Cost for one month should be about $1.5K.
To set up the model as an endpoint, hit “Deploy”. Before moving forward, you’ll have to agree to the license terms.
The deployment process might take a short while.
Credits: Mudassir Aqeel Ahmed
Read related articles: