Data Engineering with Google Dataflow and Apache Beam

First steps to Extract, Transform and Load data using Apache Beam and Deploy Pipelines on Google Dataflow

This course introduces Apache Beam, one of the Apache Software Foundation's newer data pipeline development frameworks, and shows how it is gaining popularity alongside Google Dataflow. In summary, the course covers the following topics:

What you’ll learn

  • Apache Beam.
  • ETL.
  • Python.
  • Google Cloud.
  • Dataflow.
  • Google Cloud Storage.

Course Content

  • Apache Beam Concepts –> 3 lectures • 13min.
  • Apache Beam Main Functions –> 10 lectures • 41min.
  • Batch Dataflow Pipelines –> 8 lectures • 1hr 26min.

Requirements

  • Basic Python.
  • Free GCP Account.

In more detail, the course looks at Apache Beam from these angles:

 

1. How Apache Beam works internally

2. Its benefits

3. How to use it for development without a local installation, via Google Colab

4. Its main functions

5. How to configure the Apache Beam Python SDK locally

6. How to deploy a batch pipeline on Google Dataflow

 

This course is updated over time, and you will receive new material whenever possible.

It is important to remember that this course does not teach Python, but uses it. You should therefore be comfortable with Python basics: defining a function, creating objects and working with data types.
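As a rough self-check of that level, here is a minimal sketch in plain Python (no Apache Beam involved). The `Sale` class, the two helper functions and the sample lines are all hypothetical, invented only for illustration; they are not taken from the course material. If you can read it without trouble, you probably have the basics the course assumes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sale:
    """A simple record type: one parsed line of input (hypothetical)."""
    product: str
    amount: float


def parse_line(line: str) -> Sale:
    """Turn a comma-separated text line into a Sale object."""
    product, amount = line.split(",")
    return Sale(product=product.strip(), amount=float(amount))


def total_by_product(sales: List[Sale]) -> Dict[str, float]:
    """Sum the amounts for each product using a plain dictionary."""
    totals: Dict[str, float] = {}
    for sale in sales:
        totals[sale.product] = totals.get(sale.product, 0.0) + sale.amount
    return totals


# Hypothetical sample data, standing in for lines read from a file.
lines = ["apple, 2.50", "pear, 1.00", "apple, 4.00"]
sales = [parse_line(line) for line in lines]
totals = total_by_product(sales)
print(totals)
```

The sketch uses exactly the three skills listed above: a function definition, an object created from a class, and built-in data types (strings, floats, lists and dictionaries).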

Also, if you plan to follow section 4, which covers deploying a pipeline on Google Dataflow, you will need a free GCP account. Signing up is simple, but it requires a credit card!

 

___________________________________________________________________________________________________________

 

Requirements:

· Basic knowledge of Python

· Python 3.7 or greater installed locally (from section 4 onward)

· Free GCP account (from section 4 onward)

 

Schedule:

· Section 2 – Concepts

· Section 3 – Main Functions

· Section 4 – Apache Beam on Google Dataflow