Serverless Data Processing with Dataflow - Batch Analytics Pipelines with Dataflow (Python) Reviews

6768 reviews

Ferdie O. · Reviewed about 4 hours ago

Spent two hours getting the message that there were not sufficient workers/resources in the zone/region, but I was restricted from selecting another.

Ferdie O. · Reviewed about 6 hours ago

EZHUMALAI A. · Reviewed about 14 hours ago

Vignesh T. · Reviewed about 14 hours ago

The Dataflow jobs failed with "Startup of the worker pool in us-east1 failed to bring up any of the desired 1 workers. This is likely a quota issue or a Compute Engine stockout. The service will retry." There was also an SSL certificate error when accessing the GCS bucket, which I solved by running "gcloud auth application-default login", clicking the link, and pasting in the code from the link. Thirdly, part 2 of the lab required dill imports, which I installed with the following command: pip install apache-beam[dill]
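The dill problem described above could be caught before launching part 2 with a small pre-flight check. This is only a sketch: the helper name `missing_modules` is hypothetical and not part of the lab, and it only reports whether a module can be found in the active environment, not whether its version is suitable.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "dill" is the module the reviewer had to install via: pip install apache-beam[dill]
    missing = missing_modules(["dill"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Running this inside the lab's virtual environment before submitting the job gives a quick yes/no answer without waiting for the pipeline to fail.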

Sayed Fawad Ali S. · Reviewed 1 day ago

Spent two hours getting the message that there were not sufficient workers/resources in the zone/region, but I was restricted from selecting another.

Ferdie O. · Reviewed 2 days ago

Had many errors running the pipeline due to missing certificates. Fixed it by adding: export GCE_METADATA_MTLS_MODE=none
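For anyone launching the pipeline from a notebook cell rather than a shell, the same workaround might be applied in-process. This is a sketch under two assumptions that the review does not confirm: that setting the variable from Python has the same effect as the shell `export`, and that it is set before the first credentials request is made. The helper name `disable_metadata_mtls` is hypothetical.

```python
import os


def disable_metadata_mtls(env=None):
    """Set GCE_METADATA_MTLS_MODE=none in the given mapping (os.environ by default)."""
    target = os.environ if env is None else env
    # Value taken from the review above; assumed to be read when credentials are fetched.
    target["GCE_METADATA_MTLS_MODE"] = "none"
    return target


if __name__ == "__main__":
    # Call this before importing or running the pipeline module.
    disable_metadata_mtls()
    print(os.environ["GCE_METADATA_MTLS_MODE"])
```

Passing a plain dict instead of `os.environ` makes it possible to try the helper without touching the real process environment.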

Martin H. · Reviewed 4 days ago

vicente b. · Reviewed 5 days ago

The lab is very cool, but in all of the course labs I am hitting the problem that I cannot run Dataflow jobs in the specified regions and zones because no workers are available in those zones (even though I run the jobs multiple times). It completely ruins the learning experience, because I am not able to finish any lab even though they are very well prepared. Please add the option to switch to different regions and zones, or keep some compute quota for Qwiklabs.

Jan K. · Reviewed 5 days ago

Getting the ZONE_RESOURCE_POOL_EXHAUSTED error frequently.

Ashok K. · Reviewed 6 days ago

Gustavo L. · Reviewed 7 days ago

Allan L. · Reviewed 7 days ago

Harsh A. · Reviewed 8 days ago

Guilherme A. · Reviewed 13 days ago

Wayne F. · Reviewed 13 days ago

Luis Antonio C. · Reviewed 14 days ago

Luis Antonio C. · Reviewed 14 days ago

Already submitted feedback for this lab, with the issues that I've encountered.

Rafael D. · Reviewed 14 days ago

WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1017)'))': /computeMetadata/v1/instance/service-accounts/default/?recursive=true
Traceback (most recent call last):
  File "/home/jupyter/training-data-analyst/quests/dataflow_python/3_Batch_Analytics/solution/batch_user_traffic_pipeline.py", line 99, in <module>
    run()
  File "/home/jupyter/training-data-analyst/quests/dataflow_python/3_Batch_Analytics/solution/batch_user_traffic_pipeline.py", line 78, in run
    (p | 'ReadFromGCS' >> beam.io.ReadFromText(known_args.input_path)
  File "/home/jupyter/training-data-analyst/quests/dataflow_python/3_Batch_Analytics/lab/df-env/lib/python3.10/site-packages/apache_beam/io/textio.py", line 808, in __init__
    self._source = self._source_class(
  File "/home/jupyter/training-data-analyst/quests/dataflow_python/3_Batch_Analytics/lab/df-env/lib/python3.10/site-packages/apache_beam/io/textio.py", line 144, in __init__
    super().__init__(
  File "/home/jupyter/training-data-analyst/quests/dataflow_python/3_Batch_Analytics/lab/df-env/lib/python3.10/site-packages/apache_beam/io/filebasedsource.py", line 127, in __init__
    self._validate()
  File "/home/jupyter/training-data-analyst/quests/dataflow_python/3_Batch_Analytics/lab/df-env/lib/python3.10/site-packages/apache_beam/options/value_provider.py", line 193, in _f
    return fnc(self, *args, **kwargs)
  File "/home/jupyter/training-data-analyst/quests/dataflow_python/3_Batch_Analytics/lab/df-env/lib/python3.10/site-packages/apache_beam/io/filebasedsource.py", line 190, in _validate
    match_result = FileSystems.match([pattern], limits=[1])[0]
  File "/home/jupyter/training-data-analyst/quests/dataflow_python/3_Batch_Analytics/lab/df-env/lib/python3.10/site-packages/apache_beam/io/filesystems.py", line 240, in match
    return filesystem.match(patterns, limits)
  File "/home/jupyter/training-data-analyst/quests/dataflow_python/3_Batch_Analytics/lab/df-env/lib/python3.10/site-packages/apache_beam/io/filesystem.py", line 779, in match
    raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'gs://qwiklabs-gcp-04-497cbdcab972/events.json': RefreshError(TransportError("Failed to retrieve https://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Compute Engine Metadata server unavailable. Last exception: HTTPSConnectionPool(host='metadata.google.internal', port=443): Max retries exceeded with url: /computeMetadata/v1/instance/service-accounts/default/?recursive=true (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1017)')))"))}

Mykola O. · Reviewed 15 days ago

Mara Malina F. · Reviewed 15 days ago

Daniela L. · Reviewed 16 days ago

Gabriela C. · Reviewed 16 days ago

Luis Antonio C. · Reviewed 19 days ago

Leighton C. · Reviewed 19 days ago

There are some issues with running the Dataflow job. First, the google-auth library has to be downgraded with pip install "google-auth==2.43.0" because of this error: https://github.com/googleapis/google-cloud-python/issues/16090. Second, one has to specify the machine type and, in part B, additionally set the worker zone:
options.view_as(beam.options.pipeline_options.WorkerOptions).machine_type = "e2-standard-4"
options.view_as(beam.options.pipeline_options.WorkerOptions).worker_zone = "us-west1-c"
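The worker overrides from this review can be collected in one place, as a rough sketch. The helper names `worker_flags` and `apply_worker_overrides` are hypothetical, and the machine type and zone are the reviewer's values, which may not have capacity in every lab project. The Beam import is kept inside the function so the flag helper works even where Apache Beam is not installed.

```python
def worker_flags(machine_type="e2-standard-4", worker_zone="us-west1-c"):
    """Build command-line flags equivalent to the overrides quoted in the review."""
    return [
        f"--worker_machine_type={machine_type}",
        f"--worker_zone={worker_zone}",
    ]


def apply_worker_overrides(options, machine_type="e2-standard-4",
                           worker_zone="us-west1-c"):
    """Set the same two fields on an existing PipelineOptions object."""
    # Imported lazily; requires apache-beam to be installed.
    from apache_beam.options.pipeline_options import WorkerOptions

    worker = options.view_as(WorkerOptions)
    worker.machine_type = machine_type
    worker.worker_zone = worker_zone
    return options


if __name__ == "__main__":
    print(" ".join(worker_flags()))
```

The flag form suits pipelines launched from the command line, while `apply_worker_overrides` matches the in-code style the reviewer used.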

Przemyslaw S. · Reviewed 20 days ago

We do not ensure the published reviews originate from consumers who have purchased or used the products. Reviews are not verified by Google.