Comparing Google Colab and Microsoft Azure in terms of GPU resources

Tanuj Sur
3 min read · May 29, 2021

Introduction:

Modern developments in AI and machine learning are pushing the boundaries of computational resources, or perhaps recent advancements in computational technology are inspiring researchers to expand the horizons of their research. It is difficult to say which of these drives the other, but one thing is certain: cloud technology is the future of AI-driven research and applications.

In this article we compare the execution time of a ResNet34 neural network architecture on Google Colab (free version) and Microsoft Azure for Students (free $100 credit edition).

Experiment:

For our experiment, we consider the paper Rethinking Bias-Variance Trade-off for Generalization of Neural Networks, published at ICML'20, and execute the code for its mainline experiment on ResNet34. However, due to time constraints and limited computational resources, we tweak a few aspects of the original setup (a short sketch of these settings follows the list):

  1. The number of epochs is 50 instead of the 500 used in the original paper.
  2. We build only 2 copies of each model instead of the 6 used in the original paper.
  3. The learning rate is decayed by a factor of 10 every 20 epochs instead of every 200 as in the original paper.
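For reference, here is a minimal PyTorch sketch of those three settings. It assumes a CIFAR-10-style setup and uses torchvision's standard ResNet34 rather than the paper's width-scaled variant; the optimizer, learning rate, and data loading are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision.models import resnet34

NUM_EPOCHS = 50   # 50 epochs instead of the paper's 500
NUM_COPIES = 2    # 2 independent copies instead of the paper's 6

def train_one_copy(train_loader, device="cuda"):
    # Standard torchvision ResNet34; 10 classes assumed for a CIFAR-10-style dataset.
    model = resnet34(num_classes=10).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
    # Decay the learning rate by a factor of 10 every 20 epochs.
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)

    for epoch in range(NUM_EPOCHS):
        model.train()
        for inputs, targets in train_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
        scheduler.step()
    return model

# Train the two independent copies of the model:
# copies = [train_one_copy(train_loader) for _ in range(NUM_COPIES)]
```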

Since we only want to compare the execution times of the two platforms, training the models for a smaller number of epochs is sufficient.

In a typical convolutional neural network, the width refers to the number of filters in the convolutional layers. You can refer to this article by Pablo Ruiz for more details on this topic.
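As a rough illustration, assuming the width parameter scales the channel counts of ResNet34's four stages proportionally (so that width 64 recovers the standard [64, 128, 256, 512] configuration), the per-stage filter counts could be derived like this:

```python
def stage_channels(width):
    """Channel counts for the four ResNet34 stages at a given width.

    Assumes the base configuration [64, 128, 256, 512] is scaled
    proportionally, so width=64 corresponds to the standard ResNet34.
    """
    return [width, 2 * width, 4 * width, 8 * width]

print(stage_channels(2))   # [2, 4, 8, 16]
print(stage_channels(64))  # [64, 128, 256, 512]
```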

In this paper, risk, bias and variance are compared across different widths of the network, ranging from 2 to 64. Hence we compile the results for each of the widths 2, 6, 10, 20, 40, 50 and 60 on both platforms and finally compare them.
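Since our goal is simply wall-clock comparison, the measurement loop can be as simple as the sketch below. Here train_copies is a hypothetical placeholder for training the two model copies at a given width (the routine sketched earlier, adapted to take a width argument).

```python
import time

widths = [2, 6, 10, 20, 40, 50, 60]
execution_times = {}

for w in widths:
    start = time.time()
    # train_copies(w) is a placeholder: train both model copies at width w.
    train_copies(w)
    execution_times[w] = time.time() - start

print(execution_times)  # wall-clock seconds per width on the current platform
```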

Resources available:

We use a Standard_NC6 VM on Microsoft Azure and the free GPU provided by Google Colab for this experiment. Both platforms provide an NVIDIA Tesla K80 GPU.
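On either platform you can confirm which GPU has been allocated from a notebook cell (PyTorch is assumed to be installed, as it is by default on Colab); `!nvidia-smi` gives the same information.

```python
import torch

if torch.cuda.is_available():
    # On both platforms in this experiment this printed a Tesla K80.
    print(torch.cuda.get_device_name(0))
else:
    print("No GPU allocated")
```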

Results:

We plot the time of execution for each of the two platforms with respect to the varying width of the network architecture.

Graph showing time of execution vs width of the network for the two platforms
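A minimal matplotlib sketch for producing such a plot, assuming the per-width timings from the two platforms have been collected into dictionaries like the one built in the timing loop above (colab_times and azure_times are hypothetical names, filled in from the runs logged on each platform):

```python
import matplotlib.pyplot as plt

# colab_times and azure_times map width -> execution time in seconds.
widths = sorted(colab_times)
plt.plot(widths, [colab_times[w] for w in widths], marker="o", label="Google Colab")
plt.plot(widths, [azure_times[w] for w in widths], marker="o", label="Azure Standard_NC6")
plt.xlabel("Network width")
plt.ylabel("Execution time (seconds)")
plt.legend()
plt.show()
```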

Conclusion:

Google Colab outperforms the Microsoft Azure student edition in terms of execution time for this code. However, Google Colab restricted us from using GPU resources after a certain period of time due to its limited-usage policy. On the other hand, one can use Microsoft Azure for as long as the $100 credit allows.

If we analyze the cost of the VM versus the time of execution: the Standard_NC6 VM on Microsoft Azure charges $1.55/hr, and this experiment took around 6 hours on that VM, so we spent roughly $10 ($1.55/hr × 6 hr ≈ $9.30) on this experiment alone. Google Colab Pro, by comparison, charges around $10 a month for its upgraded version and also provides occasional access to T4 and P100 GPUs.

We hope you have found this article useful and that it helps you decide which platform to use for executing such heavyweight CNN models.

Follow me on LinkedIn for more such articles.


Tanuj Sur

I am a master's student at Chennai Mathematical Institute. I am a deep learning researcher with an interest in Computer Vision and NLP.