Cloud setup
Core Module
Google Cloud Platform (GCP) is the cloud service provided by Google. The key concept, or selling point, of any cloud provider is the idea of near-infinite resources. Without the cloud it simply is not feasible to do many modern deep learning and machine learning tasks because they cannot be scaled locally.
The image below shows a subset of the services that Google Cloud Platform offers. The ones marked in red are the ones we are actually going to investigate in this course. If you finish the exercises early, I highly recommend that you dive deeper into the rest of the platform.
❔ Exercises
As the first step, we are going to get you set up with some Google Cloud credits.
-
Go to https://learn.inside.dtu.dk and find this course. Find the recent message, which should contain a download link and instructions on how to claim the $50 cloud credit. Please do not share the link anywhere, as there is a limited number of coupons. If you are not officially taking this course at DTU, Google gives $300 of cloud credits whenever you sign up with a new account. NOTE that you need to provide a credit card for this, so make sure to closely monitor your credit use so you do not end up spending more than the free credit.
-
Log in to the homepage of GCP. It should look like this:
-
Go to billing and make sure that your account is showing $50 of cloud credit. Make sure to also check out the Reports tab throughout the course. When you start using some of the cloud services, it will update with info about how much time you can use before your cloud credit runs out. Make sure that you monitor this page, as you will not be given another coupon.
-
One way to stay organized within GCP is to create projects. Create a new project called dtumlops. When you click create you should get a notification that the project is being created. The notification bell is a good way to keep track of how the processes you are running are doing throughout the course.
-
For setup we are going to install gcloud. gcloud is the command line interface for working with our Google Cloud account. Nearly everything that we can do through the web interface we can also do through the gcloud interface. Follow the official installation instructions for your specific OS.
-
After installation, try typing the following in a terminal:
gcloud -h
The command should show the help page. If not, something went wrong in the installation (you may need to restart after installing).
-
Now log in by typing
gcloud auth login
You should be sent to a web page where you link your cloud account to the gcloud interface. Afterwards, also run this command:
gcloud auth application-default login
If you at some point want to revoke this, you can type:
gcloud auth revoke
-
Next you will need to set the project that we just created. In your web browser under project info, you should be able to see the Project ID belonging to your dtumlops project. Copy this and type the following command in a terminal:
gcloud config set project <project-id>
You can also get the project info by running
gcloud projects describe <project-id>
-
Next, install the Google Cloud Python API with pip. Then make sure that the Python interface is also installed by importing the package in a Python terminal. This should work without any errors.
-
(Optional) If you are using VSCode you can also download the relevant extension called Cloud Code. After installing it you should see a small Cloud Code button in the action bar.
-
Finally, we need to activate a couple of developer APIs that are not activated by default. In a terminal write
gcloud services enable apigateway.googleapis.com
gcloud services enable servicemanagement.googleapis.com
gcloud services enable servicecontrol.googleapis.com
You can always check which services are enabled by typing
gcloud services list
After following these steps your laptop should hopefully be set up for using GCP locally. You are now ready to use their services, both locally on your laptop and in the cloud console.
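The local setup can also be sanity-checked from Python. The following is a minimal sketch, not part of the official instructions: the helper names check_local_setup and active_project are hypothetical, and it assumes that the Python API in question is the google-api-python-client package, which is imported as googleapiclient.

```python
import importlib.util
import shutil
import subprocess
from typing import Optional


def check_local_setup() -> dict:
    """Report whether the gcloud CLI and the Python client library can be found."""
    return {
        "gcloud_on_path": shutil.which("gcloud") is not None,
        # Assumption: the client library is google-api-python-client,
        # whose import name is `googleapiclient`.
        "python_client_importable": importlib.util.find_spec("googleapiclient") is not None,
    }


def active_project() -> Optional[str]:
    """Return the project set in the active gcloud configuration, if any."""
    gcloud = shutil.which("gcloud")
    if gcloud is None:
        return None
    result = subprocess.run(
        [gcloud, "config", "get-value", "project"],
        capture_output=True, text=True, check=False,
    )
    # An unset project gives empty standard output.
    return result.stdout.strip() or None


if __name__ == "__main__":
    print(check_local_setup())
    print("active project:", active_project())
```

If both checks report True and the active project matches the Project ID of dtumlops, the steps above most likely succeeded.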
IAM and Quotas
A big part of using the cloud in a bigger organization has to do with admin and quotas. Admin here in general refers
to the different roles that users of GCP can have, and quotas refer to the amount of resources that a given user has
access to. For example, one employee, let's say a data scientist, may only be granted access to certain GCP services
that have to do with the development and training of machine learning models, with X GPUs available to use, to make
sure that the employee does not spend too much money. Another employee, a DevOps engineer, probably does not need
access to the same services and not necessarily the same resources.
In this course, we are not going to focus too much on this aspect, but it is important to know that it exists. One
feature you are going to need for the project is how to share a project with other people. This is done through
the IAM (Identity and Access Management) page. Simply click the Grant Access button, search for the email of the
person you want to share the project with and give them either Viewer, Editor or Owner access, depending on what
you want them to be able to do. The figure below shows how to do this.
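Sharing can also be done from the command line with gcloud projects add-iam-policy-binding. The sketch below only builds that command, so it can be inspected before it is run. The helper name build_grant_command, the project ID and the email address are hypothetical placeholders, and the mapping assumes that the three access levels correspond to the basic roles roles/viewer, roles/editor and roles/owner.

```python
import shlex

# Assumed mapping from the access levels named above to basic IAM roles.
BASIC_ROLES = {
    "viewer": "roles/viewer",
    "editor": "roles/editor",
    "owner": "roles/owner",
}


def build_grant_command(project_id: str, email: str, access: str) -> list:
    """Build the gcloud command that grants `email` the given access level."""
    role = BASIC_ROLES[access.lower()]  # raises KeyError for unknown levels
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=user:{email}",
        f"--role={role}",
    ]


if __name__ == "__main__":
    # Hypothetical project ID and email, for illustration only.
    command = build_grant_command("my-project-id", "teammate@example.com", "Viewer")
    print(shlex.join(command))
```

Printing the command rather than executing it keeps the sketch safe to run; the printed line can be pasted into a terminal once the project ID and email are correct.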
What we are going to go through right now is how to increase the quota for how many GPUs you have available in your project. For any free account in GCP (or account using teaching credits), the default GPU quota is either 0 or 1 (the policy sometimes changes). In the exercises below we will try to increase it.
❔ Exercises
-
Start by enabling the Compute Engine service. Simply search for it in the top search bar. It should bring you to a page where you can enable the service (this may take some time). We are going to look more into this service in the next module.
-
Next go to the IAM & Admin page, again by searching for it in the top search bar. The remaining steps are illustrated in the figure below.
-
Go to the quotas page.
-
In the search field, search for GPUs (all regions) (this needs to match exactly, as the search field is case sensitive), such that you get the same quota as in the image.
-
In the limit column, you can see your current quota for the number of GPUs you can use. Additionally, to the right of the limit, you can see the current usage. This is worth checking if you are ever in doubt about whether a job is running on a GPU.
-
Click the quota and afterward the Edit quotas button.
-
In the pop-up window, increase your limit to either 1 or 2.
-
After sending your request, you can try clicking the Increase requests tab to see the status of your request.
-
If you ever run into errors when working in GCP that contain statements about quotas, you can always go to this
page to see what you are currently allowed to use and try to increase it. For example, when you get to training
machine learning models using Vertex AI in the next module, you will most likely need to ask for a quota increase
for that service as well.
Finally, we want to note that a quota increase is sometimes not allowed within 24 hours of creating an account. If your request gets rejected, we recommend waiting a day and trying again. If this still does not work, you may need to use their services some more to show that you are not a bot that wants to mine crypto.
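The current GPU quota can also be read outside the web page. This is a minimal sketch, assuming that gcloud compute project-info describe --format=json reports project-level quotas as a list of entries with metric, limit and usage fields, and that the quota shown as GPUs (all regions) appears there under the metric name GPUS_ALL_REGIONS. The helper names find_quota and describe_project are hypothetical.

```python
import json
import shutil
import subprocess
from typing import Optional


def find_quota(project_info: dict, metric: str) -> Optional[dict]:
    """Return the quota entry for `metric`, or None if it is not listed."""
    for quota in project_info.get("quotas", []):
        if quota.get("metric") == metric:
            return quota
    return None


def describe_project() -> dict:
    """Fetch project info (including quotas) for the active gcloud project."""
    gcloud = shutil.which("gcloud")
    if gcloud is None:
        raise RuntimeError("gcloud was not found on PATH")
    result = subprocess.run(
        [gcloud, "compute", "project-info", "describe", "--format=json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Sample data in the assumed shape, so the sketch runs without a cloud account.
    sample = {"quotas": [{"metric": "GPUS_ALL_REGIONS", "limit": 1.0, "usage": 0.0}]}
    gpu_quota = find_quota(sample, "GPUS_ALL_REGIONS")
    print(f"limit={gpu_quota['limit']}, usage={gpu_quota['usage']}")
```

With the CLI installed and a project set, find_quota(describe_project(), "GPUS_ALL_REGIONS") would return the live values instead of the sample.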
🧠 Knowledge check
-
What considerations should you take into account when choosing a GCP region for running a new application?
Solution
A series of factors may influence your choice of region, including:
- Service availability: not all services are available in all regions
- Resource availability: some regions have more GPUs available than others
- Reduced latency: if your application is running in the same region as your users, the latency will be lower
- Compliance: some countries have strict rules about where user info may be stored, e.g. the EU's GDPR restricts transferring personal data out of the EU
- Pricing: some regions may have different pricing than others
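Resource availability in particular can be checked before committing to a region. This is a minimal sketch, assuming that the JSON output of gcloud compute accelerator-types list --format=json is a list of entries with a name and a zone field, where the zone may be given as a full URL. The helper name regions_with_gpu and the sample entries are hypothetical and do not reflect real availability.

```python
from collections import defaultdict


def regions_with_gpu(accelerator_types: list, gpu_name: str) -> dict:
    """Map each region to the sorted zones that offer the accelerator `gpu_name`."""
    zones_by_region = defaultdict(set)
    for entry in accelerator_types:
        if entry.get("name") != gpu_name:
            continue
        # The zone may be a full URL; keep only the last path segment.
        zone = entry["zone"].rsplit("/", 1)[-1]
        # A zone name is the region name plus a letter suffix, e.g. europe-west4-a.
        region = zone.rsplit("-", 1)[0]
        zones_by_region[region].add(zone)
    return {region: sorted(zones) for region, zones in zones_by_region.items()}


if __name__ == "__main__":
    # Made-up sample entries in the assumed shape, not real availability data.
    sample = [
        {"name": "nvidia-tesla-t4", "zone": "europe-west4-a"},
        {"name": "nvidia-tesla-t4", "zone": "europe-west4-b"},
        {"name": "nvidia-tesla-v100", "zone": "us-central1-a"},
    ]
    print(regions_with_gpu(sample, "nvidia-tesla-t4"))
```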
-
The three major cloud providers offer largely the same services, but they are called something different depending on the provider. What are the corresponding names of these GCP services in AWS and Azure?
- Compute Engine
- Cloud storage
- Cloud functions
- Cloud run
- Cloud build
- Vertex AI
It is important to know these correspondences in order to navigate blog posts etc. about MLOps on the internet.
Solution
| GCP | AWS | Azure |
| --- | --- | --- |
| Compute Engine | Elastic Compute Cloud (EC2) | Virtual Machines |
| Cloud storage | Simple Storage Service (S3) | Blob Storage |
| Cloud functions | Lambda | Functions Serverless Compute |
| Cloud run | App Runner, Fargate, Lambda | Container Apps, Container Instances |
| Cloud build | CodeBuild | DevOps |
| Vertex AI | SageMaker | AI Platform |