In a previous article, we mentioned that Llama 2 is available on Sagemaker. In this piece, we will walk you through deploying Llama 2 on AWS Sagemaker.
Deploying on AWS Sagemaker
To deploy the Llama-2-7B model on AWS Sagemaker, you need an AWS account with administrative rights. Begin by logging in and navigating to the Amazon Sagemaker console, preferably in the us-east-1 (N. Virginia) region.
Step-1: Check your Quotas
Not all resources in Amazon Sagemaker are automatically available, so it’s advisable to perform a preliminary check.
![](https://llama-2.ai/wp-content/uploads/2023/10/llama-on-aws-step-1-1024x456.webp)
Search for these service quotas in Sagemaker:
- Total domains
- Maximum number of Studio user profiles allowed per account
- ml.g5.2xlarge for endpoint usage
- Maximum number of running Studio apps allowed per account
If any of these services display a quota value of 0, you’ll need to submit a request for a quota increase. Monitor the status of your requests in the quota request history. Be aware that approvals can sometimes take up to 2 days.
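If you prefer to script this check, the Service Quotas API exposes the same values as the console. Below is a minimal sketch using boto3; it assumes your AWS credentials are configured, and the exact quota names may vary slightly by region and SDK version:

```python
# Quota names to verify before deploying (as listed above).
REQUIRED_QUOTAS = [
    "Total domains",
    "Maximum number of Studio user profiles allowed per account",
    "ml.g5.2xlarge for endpoint usage",
    "Maximum number of running Studio apps allowed per account",
]


def find_zero_quotas(quotas, required):
    """Return the names of required quotas whose applied value is 0.

    `quotas` is a list of dicts shaped like the Service Quotas API
    response items, e.g. {"QuotaName": ..., "Value": ...}.
    Quotas missing from the response are treated as 0.
    """
    values = {q["QuotaName"]: q.get("Value", 0) for q in quotas}
    return [name for name in required if values.get(name, 0) == 0]


def fetch_sagemaker_quotas(region="us-east-1"):
    """Pull all SageMaker quotas for the account (requires AWS credentials)."""
    import boto3  # deferred import so the helper above works without AWS set up

    client = boto3.client("service-quotas", region_name=region)
    quotas = []
    for page in client.get_paginator("list_service_quotas").paginate(
        ServiceCode="sagemaker"
    ):
        quotas.extend(page["Quotas"])
    return quotas


if __name__ == "__main__":
    for name in find_zero_quotas(fetch_sagemaker_quotas(), REQUIRED_QUOTAS):
        print(f"Quota increase needed: {name}")
```

Any quota the script flags is one you should raise an increase request for before continuing.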
![](https://llama-2.ai/wp-content/uploads/2023/10/step-2-1024x282.webp)
Step-2: Create Domain
Setting Up a Domain:
- Start by creating a domain, especially if this is your maiden voyage with Sagemaker.
- Opt for “Quick Setup”.
- Decide on a domain name.
- The user profile name can be left as is or modified to your preference.
- If you don’t already have one, you’ll need to establish a role.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-3-1024x484.webp)
Then choose “Any S3 bucket” and hit create.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-4-1024x470.webp)
Finalizing Domain Setup:
- Review your settings; they should match the described configuration.
- Click on “Submit” to finalize the domain creation.
- If you encounter any errors during this process, they are likely related to user permissions or VPC setup.
Launching and Deploying
- Once your domain and user profile are set up, proceed to launch the Sagemaker Studio.
- From there, you can deploy your model.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-5-1024x406.webp)
![](https://llama-2.ai/wp-content/uploads/2023/10/step-6-1024x466.webp)
Go to Jumpstart and search for Llama-2-7b-chat.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-7-1024x472.webp)
![](https://llama-2.ai/wp-content/uploads/2023/10/step-8.webp)
You can stick with the default settings. The ml.g5.2xlarge is the minimum instance needed to operate the Llama-2-7B model. Be aware that it’s priced at $1.515 per hour, summing up to $36.36 daily if it’s continuously active.
According to our calculations, keeping the endpoint running around the clock costs roughly $1,090 per month for the instance alone; with Studio, storage, and data-transfer charges on top, budget up to about $1.5K for the full Llama 2 AWS environment.
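These figures follow directly from the hourly rate:

```python
HOURLY_RATE = 1.515  # USD/hour for ml.g5.2xlarge on-demand (us-east-1; check current pricing)
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30

daily_cost = HOURLY_RATE * HOURS_PER_DAY        # cost of a continuously running endpoint per day
monthly_cost = daily_cost * DAYS_PER_MONTH      # instance cost alone, before storage/Studio charges

print(f"Daily:   ${daily_cost:.2f}")    # $36.36
print(f"Monthly: ${monthly_cost:.2f}")  # $1090.80
```

If you only need the endpoint intermittently, deleting it when idle and redeploying later cuts this cost dramatically.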
To set up the model as an endpoint, hit “Deploy”. Before moving forward, you’ll have to agree to the license terms.
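If you would rather deploy from a notebook than click through the console, the SageMaker Python SDK's JumpStart classes can do the same thing. A minimal sketch follows; the `MODEL_ID` below is our assumption of the JumpStart identifier for the 7B chat model (verify it in your Studio's JumpStart listing), and note that `accept_eula=True` agrees to Meta's license terms just like the console prompt does:

```python
# Assumes the `sagemaker` Python SDK is installed and AWS credentials are configured.
MODEL_ID = "meta-textgeneration-llama-2-7b-f"  # assumed JumpStart id for Llama-2-7b-chat
INSTANCE_TYPE = "ml.g5.2xlarge"  # smallest instance that runs the 7B model


def deploy_llama2_chat(model_id=MODEL_ID, instance_type=INSTANCE_TYPE):
    """Deploy the JumpStart model and return a Predictor for the new endpoint."""
    from sagemaker.jumpstart.model import JumpStartModel  # deferred import

    model = JumpStartModel(model_id=model_id)
    # accept_eula=True acknowledges the Llama 2 license, same as the console dialog.
    return model.deploy(instance_type=instance_type, accept_eula=True)


if __name__ == "__main__":
    predictor = deploy_llama2_chat()
    print("Endpoint name:", predictor.endpoint_name)
```

When you are finished experimenting, call `predictor.delete_endpoint()` (or delete the endpoint in the console) so the hourly charge stops.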
![](https://llama-2.ai/wp-content/uploads/2023/10/step-9.webp)
The deployment process might take a short while.
![](https://llama-2.ai/wp-content/uploads/2023/10/step-11-1024x472.webp)
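Once the endpoint status shows InService, you can invoke it with the runtime API. The payload shape below (a list of role/content dialogs under `inputs`) is our assumption of the schema the JumpStart Llama 2 chat endpoints used at the time of writing; check the model's example notebook for the authoritative format:

```python
import json


def build_chat_payload(user_message, system_prompt=None,
                       max_new_tokens=256, temperature=0.6):
    """Build the JSON body for a Llama-2 chat endpoint (assumed schema)."""
    dialog = []
    if system_prompt:
        dialog.append({"role": "system", "content": system_prompt})
    dialog.append({"role": "user", "content": user_message})
    return {
        "inputs": [dialog],
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def invoke(endpoint_name, payload, region="us-east-1"):
    """Call the deployed endpoint (requires AWS credentials)."""
    import boto3  # deferred so the payload helper works without AWS set up

    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        # Llama 2 JumpStart endpoints require accepting the EULA per request.
        CustomAttributes="accept_eula=true",
        Body=json.dumps(payload),
    )
    return json.loads(response["Body"].read())


if __name__ == "__main__":
    # Replace with the endpoint name shown in the SageMaker console.
    result = invoke("YOUR-ENDPOINT-NAME",
                    build_chat_payload("What is Amazon SageMaker?"))
    print(result)
```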
Credits: Mudassir Aqeel Ahmed