HP Z8 G4 and ZBook Studio in the Kaggle competition

February 1st, 2021

Machine Learning Engineer

As a Z by HP Data Science Global Ambassador, Qishen Ha's content is sponsored and he was provided with HP products.

 

Hi, my name is Qishen Ha. I work for LINE Corp. as a Machine Learning Engineer and am currently the 11th-ranked Kaggle Grandmaster in the world. I mainly focus on computer vision problems such as image classification, semantic segmentation, and object detection.

 

I am very honored to be a Z by HP Data Science Global Ambassador and I am very grateful to Z by HP for giving me this opportunity and providing me with the HP Z8 G4 workstation and ZBook Studio. This has increased my competitiveness in Kaggle competitions.

 

I used the HP Z8 G4 workstation and ZBook Studio in the NFL 1st and Future - Impact Detection competition that ended last month (Jan. 2021), where our team finished in 3rd place. Now (Feb. 2021) I'm using them for another Kaggle competition: Cassava Leaf Disease Classification. So I'd like to talk about what it was like to use the HP Z8 G4 workstation and ZBook Studio in real Kaggle competitions.

Configuration introduction

I'll start with a brief overview of the specific configuration of the HP Z8 G4 workstation and ZBook Studio provided to me by Z by HP.

 

HP Z8 G4 workstation:

  • Dual NVIDIA Quadro RTX 6000 GPUs (2 x 24 GB)
  • Dual Intel Xeon Gold 6254 CPUs (2 x 18 cores)
  • Memory 96 GB
  • Storage 2 TB

 

ZBook Studio:

  • NVIDIA Quadro RTX 5000 GPU (16 GB)
  • Intel Core i9-10885H (8 cores)
  • Memory 32 GB
  • Storage 2 TB

 

My main area of interest is computer vision, so the most important thing I look for in a workstation is GPU performance, followed by CPU performance.

 

The importance of GPU performance needs no introduction, but some people may underestimate the need for CPU performance in computer vision tasks. Usually the CPU is responsible for reading and preprocessing the data in multiple threads and then handing it over to the GPU for training. If the images are high-resolution, or if many augmentation methods are used, the load on the CPU increases. Once the CPU reads and preprocesses data more slowly than the GPU trains on it, CPU performance becomes the bottleneck of the whole system. In my opinion, a workstation needs a minimum of 8 CPU cores per GPU to prevent this, and 16+ cores per GPU is preferable.
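A rough way to reason about this balance is to compare how long the CPU needs to prepare one batch with how long the GPU needs to train on it. The sketch below is only a back-of-the-envelope estimate: the function name `workers_needed` and all timing values are illustrative assumptions, not measurements from the HP Z8 G4.

```python
import math

def workers_needed(preprocess_s_per_image, batch_size, gpu_step_s):
    """Estimate how many CPU workers keep one GPU busy.

    Assumes each worker handles one image at a time and that
    preprocessing scales linearly across workers (an idealization).
    """
    cpu_seconds_per_batch = preprocess_s_per_image * batch_size
    return max(1, math.ceil(cpu_seconds_per_batch / gpu_step_s))

# Hypothetical numbers: 20 ms to load and augment one image,
# batches of 32 images, 100 ms per GPU training step.
print(workers_needed(0.02, 32, 0.1))
```

With these made-up numbers the estimate comes out at 7 workers for a single GPU, which is in line with the rule of thumb of at least 8 cores per GPU. Heavier augmentation or larger images raise the per-image time and push the estimate up.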

 

Large system memory (RAM) is crucial in tabular competitions, because the entire dataset usually has to be loaded into memory during feature engineering. Workstations for large-scale tabular competitions therefore often need hundreds of GB of RAM. Computer vision is much less demanding in this respect, and about 24 GB of memory per GPU is sufficient in most cases.
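To get a feel for why tabular data can need so much memory, the size of a dense numeric table can be approximated as rows times columns times bytes per value. This is a simplified sketch: the helper names, the 8 bytes per value (a 64-bit float), the 50% headroom factor, and the table dimensions are all assumptions chosen for illustration.

```python
def table_size_gb(n_rows, n_cols, bytes_per_value=8):
    """Approximate in-memory size of a dense numeric table in GB (10**9 bytes)."""
    return n_rows * n_cols * bytes_per_value / 1e9

def fits_in_ram(n_rows, n_cols, ram_gb, headroom=0.5):
    """True if the table uses at most `headroom` of RAM.

    The remaining RAM is left for the temporary copies that
    feature engineering tends to create (assumed, not measured).
    """
    return table_size_gb(n_rows, n_cols) <= ram_gb * headroom

# Hypothetical table: 500 million rows, 50 numeric columns.
print(table_size_gb(500_000_000, 50))           # size in GB
print(fits_in_ram(500_000_000, 50, ram_gb=96))  # does it fit in 96 GB?
```

A hypothetical table of this shape needs about 200 GB before any intermediate copies are made, so it would not fit in a 96 GB machine.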

 

As for laptops, for a long time I just used them as a tool to connect to servers (cloud instances, etc.). Laptop performance is usually positively correlated with size and weight, so powerful ones are necessarily bigger and heavier. As a result, using a lightweight laptop only to connect to a server, and debugging code and training models on the server itself, has become the mainstream practice today.

 

But when I got my hands on the ZBook Studio, I was amazed at how thin and light a laptop can be with this level of performance - I hope you haven't forgotten that this laptop has a Quadro RTX 5000 GPU (16 GB) inside. This means that the ZBook Studio is not only a tool for connecting to a server or workstation, but can also be used to debug deep learning code on its own.

How I use the HP Z8 G4 Workstation and ZBook Studio

A quick note: I set up a Jupyter notebook on my HP Z8 G4 workstation and then connected to it via ZBook Studio.

 

Jupyter notebook is a web application that makes it easy to write and debug code cell by cell through your browser. It’s so well-known that I believe many machine learning engineers have used it, or at least heard about it. I always wrote the code and trained models on cloud instances via Jupyter notebook when I was Kaggling last year before the partnership with Z by HP. So when I got the HP Z8 G4 workstation from Z by HP, the first thing I did was to configure it with a Jupyter Notebook.

 

When I connect to a Jupyter notebook on a cloud instance, the traffic has to go over the public internet, whereas connecting to my HP Z8 G4 only involves my LAN. So one obvious advantage of a workstation at home is that it does not depend on the external network environment and has very low latency. When I rent instances from vast.ai, I don't know where on Earth the instance I'm renting is. It could be in the US, it could be in China or Europe, it could even be in the Arctic. Usually, unless the instance is in Japan, I experience significant latency when I connect to it via Jupyter notebook. But connecting to my workstation at home over the LAN is almost indistinguishable from running Jupyter notebook directly on my laptop. I don't feel any latency.
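For readers who want a number rather than a feeling, one crude proxy for this latency is the time needed to open a TCP connection to the machine running Jupyter. The sketch below uses only the Python standard library. The function name is hypothetical, and connection time is only a rough stand-in for how responsive a notebook feels.

```python
import socket
import time

def tcp_connect_ms(host, port, timeout=3.0):
    """Return the time in milliseconds to open a TCP connection to host:port."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it immediately
    return (time.perf_counter() - start) * 1000.0

# Example call (hypothetical LAN address, Jupyter's default port 8888):
#   tcp_connect_ms("192.168.1.50", 8888)
```

Running it once against a workstation on the LAN and once against a remote cloud instance gives a simple side-by-side comparison.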

 

As for why I don't use the workstation directly and instead connect to it over the LAN, the reason is simple. Ubuntu's GUI takes up some GPU memory, usually 300-500 MB, which is not a huge amount. But once I start training a model on the GPU, the GUI becomes so laggy that I can hardly do anything else properly. So turning off Ubuntu's GUI and connecting to the workstation from a laptop not only solves the GUI lag problem during training, but also saves a few hundred MB of GPU memory.
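To check how much GPU memory is in use before and after disabling the GUI, `nvidia-smi` can be queried from Python. This is a minimal sketch that assumes the NVIDIA driver is installed and `nvidia-smi` is on the PATH. The helper names are hypothetical, and the sample string is illustrative, not a real reading from any machine.

```python
import subprocess

def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows in MiB (one row per GPU) into a list of tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(part.strip()) for part in line.split(","))
        rows.append((used, total))
    return rows

def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory usage (requires the NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(out)

# Illustrative output for a two-GPU machine, not a real measurement.
sample = "412, 24220\n3, 24220\n"
print(parse_gpu_memory(sample))
```

On a real machine, calling `query_gpu_memory()` with the GUI running and again with it stopped shows how much memory the desktop itself occupies.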

 

In my previous workflow, I had no way to debug new experimental code if all the GPUs in my rented cloud instance were fully loaded. That's no longer the case with the ZBook Studio, which not only has a 16 GB GPU but is also very thin and light. Now I can use the ZBook Studio to debug the code for a new experiment even when the GPUs on my workstation and cloud instance are fully loaded. Once the code runs on my ZBook Studio, I copy it to the workstation or instance, so I can start the next experiment as soon as the current one finishes. This effectively improves my GPU utilization.

 

Of course, it is possible to train models directly on the ZBook Studio as well, but I personally don't recommend it: laptops have limited heat dissipation, and running at high temperatures for long periods can shorten the laptop's lifespan.

Speed of the NVIDIA Quadro RTX 6000

I compared the speed of the NVIDIA Quadro RTX 6000 and the NVIDIA V100 while Kaggling in the NFL and Cassava competitions, by running some of the same experiments on a Quadro RTX 6000 in my HP Z8 G4 workstation and on a V100 rented from vast.ai. The CNN architectures used were EfficientNet and EfficientDet. In short, the Quadro RTX 6000 is about 90-100% as fast as the V100, with slight variations depending on the CNN architecture. So I'm also very happy with the speed of the HP Z8 G4's GPU.
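A relative speed of this kind can be computed from the wall-clock time each GPU needs for the same workload, for example one training epoch. The sketch below shows only the arithmetic. The function name is hypothetical and the timings are placeholders, not the results measured in the competitions.

```python
def relative_speed(seconds_gpu_a, seconds_gpu_b):
    """Speed of GPU A relative to GPU B on the same workload (1.0 means equal)."""
    return seconds_gpu_b / seconds_gpu_a

# Placeholder timings, NOT measured results:
# GPU A takes 110 s per epoch, GPU B takes 100 s per epoch.
print(f"{relative_speed(110.0, 100.0):.0%}")
```

A value below 100% means GPU A is slower than GPU B on that workload. Repeating the calculation per architecture shows how much the ratio varies between models.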

About Z by HP Data Science Software Stack

To help users get started with the workstation or ZBook more quickly, the Z by HP team can pre-install a set of software called the "Z by HP Data Science Software Stack", which includes CUDA, RAPIDS, TensorFlow, PyTorch, Docker, and dozens of other tools commonly used by data scientists. Of course, users can also install this software themselves.

