Google Data Engineer: Professional
1. Introduction
Theory, Practice and Tests
Lab: Setting Up A GCP Account
Lab: Using The Cloud Shell
2. Compute
About this section
Compute Options
Google Compute Engine (GCE)
Lab: Creating a VM Instance
More GCE
Lab: Editing a VM Instance
Lab: Creating a VM Instance Using The Command Line
Lab: Creating And Attaching A Persistent Disk
3. Google Container Engine – Kubernetes (GKE)
More GKE
Lab: Creating A Kubernetes Cluster And Deploying A WordPress Container
App Engine
Contrasting App Engine, Compute Engine and Container Engine
Lab: Deploy And Run An App Engine App
Compute
4. Storage
Storage Options
Quick Take
Cloud Storage
Lab: Working With Cloud Storage Buckets
Lab: Bucket And Object Permissions
Lab: Lifecycle Management On Buckets
Fix for AccessDeniedException: 403 Insufficient Permission
Lab: Running A Program On a VM Instance And Storing Results on Cloud Storage
5. Virtual Machines and Images
Live Migration
Machine Types and Billing
Sustained Use and Committed Use Discounts
Rightsizing Recommendations
RAM Disk
Images
Startup Scripts And Baked Images
6. VPCs and Interconnecting Networks
VPCs And Subnets
Global VPCs, Regional Subnets
IP Addresses
Lab: Working with Static IP Addresses
Routes
Firewall Rules
Lab: Working with Firewalls
Lab: Working with Auto Mode and Custom Mode Networks
Lab: Bastion Host
7. Cloud VPN
Lab: Working with Cloud VPN
Cloud Router
Lab: Using Cloud Routers for Dynamic Routing
Dedicated Interconnect, Direct and Carrier Peering
Shared VPCs
Lab: Shared VPCs
VPC Network Peering
Lab: VPC Peering
Cloud DNS And Legacy Networks
Networking
8. Managed Instance Groups and Load Balancing
Managed and Unmanaged Instance Groups
Types of Load Balancing
Overview of HTTP(S) Load Balancing
Forwarding Rules, Target Proxy and URL Maps
Backend Service and Backends
Load Distribution and Firewall Rules
Lab: HTTP(S) Load Balancing
Lab: Content Based Load Balancing
SSL Proxy and TCP Proxy Load Balancing
Lab: SSL Proxy Load Balancing
Network Load Balancing
Internal Load Balancing
Autoscalers
Lab: Autoscaling with Managed Instance Groups
9. Ops and Security
Stackdriver
Stackdriver Logging
Lab: Stackdriver Resource Monitoring
Lab: Stackdriver Error Reporting and Debugging
10. Cloud Deployment Manager
Lab: Using Deployment Manager
Lab: Deployment Manager and Stackdriver
11. Cloud Endpoints
Cloud IAM: User accounts, Service accounts, API Credentials
Cloud IAM: Roles, Identity-Aware Proxy, Best Practices
Lab: Cloud IAM
12. Data Protection
Operations and Security
13. Transfer Service
Lab: Migrating Data Using The Transfer Service
gcloud init
Lab: Cloud Storage Versioning, Directory Sync
14. Cloud SQL, Cloud Spanner ~ OLTP ~ RDBMS
Cloud SQL
Lab: Creating A Cloud SQL Instance
Lab: Running Commands On Cloud SQL Instance
Lab: Bulk Loading Data Into Cloud SQL Tables
15. Cloud Spanner
More Cloud Spanner
Lab: Working With Cloud Spanner
16. BigTable ~ HBase = Columnar Store
BigTable Intro
Columnar Store
Denormalised
Column Families
BigTable Performance
Getting the HBase Prompt
Lab: BigTable demo
17. Datastore ~ Document Database
Datastore
Lab: Datastore demo
18. BigQuery ~ Hive ~ OLAP
BigQuery Intro
BigQuery Advanced
Lab: Loading CSV Data Into BigQuery
Lab: Running Queries On BigQuery
Lab: Loading JSON Data With Nested Tables
Lab: Public Datasets In BigQuery
Lab: Using BigQuery Via The Command Line
Lab: Aggregations And Conditionals In Aggregations
Lab: Subqueries And Joins
Lab: Regular Expressions In Legacy SQL
Lab: Using The With Statement For SubQueries
19. Dataflow ~ Apache Beam
About this section
Dataflow Intro
Apache Beam
Lab: Running A Python Dataflow Program
Lab: Running A Java Dataflow Program
Lab: Implementing Word Count In Dataflow Java
Lab: Executing The Word Count Dataflow
Lab: Executing MapReduce In Dataflow In Python
Lab: Executing MapReduce In Dataflow In Java
20. Dataproc ~ Managed Hadoop
Dataproc
Lab: Creating And Managing A Dataproc Cluster
Lab: Creating A Firewall Rule To Access Dataproc
Lab: Running A PySpark Job On Dataproc
Lab: Running The PySpark REPL Shell And Pig Scripts On Dataproc
Lab: Submitting A Spark Jar To Dataproc
Lab: Working With Dataproc Using The gcloud CLI
21. Pub/Sub for Streaming
Pub/Sub
Lab: Working With Pub/Sub On The Command Line
Lab: Working With Pub/Sub Using The Web Console
Lab: Setting Up A Pub/Sub Publisher Using The Python Library
Lab: Setting Up A Pub/Sub Subscriber Using The Python Library
Lab: Publishing Streaming Data Into Pub/Sub
Lab: Reading Streaming Data From Pub/Sub And Writing To BigQuery
Lab: Executing A Pipeline To Read Streaming Data And Write To BigQuery
Lab: Pub/Sub Source BigQuery Sink
22. Datalab ~ Jupyter
Datalab
Lab: Creating And Working On A Datalab Instance
Lab: Importing And Exporting Data Using Datalab
Lab: Using The Charting API In Datalab
23. Composer ~ Airflow
What is a Directed Acyclic Graph (DAG)?
Apache Airflow architecture
Google Cloud Platform: Cloud Composer as managed Apache Airflow
Understanding Apache Airflow program structure
Lab 1: Create and submit an Apache Airflow DAG program
Lab 2: Using template functionality in an Apache Airflow program
Using Variables in Apache Airflow
Lab 3: Calling a Bash script in a different folder / on a different machine
24. Cloud Functions
Virtual Machines – Cloud Functions
What is Cloud Functions?
Architecture of Cloud Functions
Use cases of Cloud Functions
Cloud Functions Demo
25. Vision, Translate, NLP and Speech: Trained ML APIs
Lab: Taxicab Prediction – Setting up the dataset
Lab: Taxicab Prediction – Training and Running the model
Lab: The Vision, Translate, NLP and Speech APIs
Lab: The Vision API for Label and Landmark Detection
26. Additional topics in brief: prerequisites for this course
Appendix: Hadoop Ecosystem
Introducing the Hadoop Ecosystem
Hadoop
HDFS
MapReduce
YARN
Hive
Hive vs. RDBMS
HQL vs. SQL
OLAP in Hive
Windowing in Hive
Pig
Spark
Streams Intro
Microbatches
Window Types
Hadoop Ecosystem