Luz Plaja
Member since 2021
Silver League
3800 points
In this course you will get hands-on experience working through real-world challenges faced when building streaming data pipelines. The primary focus is on managing continuous, unbounded data with Google Cloud products.
Complete the intermediate Engineer Data for Predictive Modeling with BigQuery ML skill badge to demonstrate skills in the following: building data transformation pipelines to BigQuery using Dataprep by Trifacta; using Cloud Storage, Dataflow, and BigQuery to build extract, transform, and load (ETL) workflows; and building machine learning models using BigQuery ML.
Complete the introductory Prepare Data for ML APIs on Google Cloud skill badge to demonstrate skills in the following: cleaning data with Dataprep by Trifacta, running data pipelines in Dataflow, creating clusters and running Apache Spark jobs in Dataproc, and calling ML APIs including the Cloud Natural Language API, Google Cloud Speech-to-Text API, and Video Intelligence API.
In this intermediate course, you will learn to design, build, and optimize robust batch data pipelines on Google Cloud. Moving beyond fundamental data handling, you will explore large-scale data transformations and efficient workflow orchestration, essential for timely business intelligence and critical reporting. Get hands-on practice using Dataflow for Apache Beam and Serverless for Apache Spark (Dataproc Serverless) for implementation, and tackle crucial considerations for data quality, monitoring, and alerting to ensure pipeline reliability and operational excellence. A basic knowledge of data warehousing, ETL/ELT, SQL, Python, and Google Cloud concepts is recommended.
Complete the introductory Implementing Cloud Load Balancing for Compute Engine skill badge to demonstrate skills in the following: creating and deploying virtual machines in Compute Engine and configuring network and application load balancers.
While the traditional approaches of using data lakes and data warehouses can be effective, they have shortcomings, particularly in large enterprise environments. This course introduces the concept of a data lakehouse and the Google Cloud products used to create one. A lakehouse architecture uses open-standard data sources and combines the best features of data lakes and data warehouses, which addresses many of their shortcomings.
This course introduces the Google Cloud big data and machine learning products and services that support the data-to-AI lifecycle. It explores the processes, challenges, and benefits of building a big data pipeline and machine learning models with Vertex AI on Google Cloud.
This 1-week, accelerated course builds upon previous courses in the Data Engineering on Google Cloud Platform specialization. Through a combination of video lectures, demonstrations, and hands-on labs, you'll learn how to create and manage computing clusters to run Hadoop, Spark, Pig and/or Hive jobs on Google Cloud Platform. You will also learn how to access various cloud storage options from your compute clusters and integrate Google's machine learning capabilities into your analytics programs.