Google Cloud Dataproc

Company health

Employee growth: 69% increase in the last year
Web traffic: 2% decrease in the last quarter
Financing: July 2018 - $16M

Ratings

G2: 4.4/5 (20)
Glassdoor: 3.5/5 (2)

Google Cloud Dataproc description

Google Cloud Dataproc is a cloud-based service that makes it easier and cheaper for your company to analyze large amounts of data. It uses popular open-source tools like Apache Spark and Hadoop, but Google Cloud handles all the setup and management. This means your team can focus on getting insights from your data without worrying about the technical details. Dataproc is also integrated with other Google Cloud services for a complete data processing platform.


Who is Google Cloud Dataproc best for

Google Cloud Dataproc is a fully managed and scalable cloud-based service for processing and analyzing big data. Users praise its ease of use, cost-effectiveness with per-second billing, and seamless integration with other Google Cloud Platform (GCP) services. Some users have noted slow cluster creation times and occasional autoscaling delays. Dataproc is ideal for companies working with terabytes of data using open-source tools like Apache Spark and Hadoop.

  • Ideal for medium to large enterprises (101+ employees) seeking scalable data solutions.

  • Suitable for businesses across all industries needing robust data analysis capabilities.


Google Cloud Dataproc features

Supported

  • Automated logging, monitoring, and serverless deployment let users focus on data and analytics.

  • Users can run Dataproc on GKE, enabling job portability and isolation.

  • Security features include default at-rest encryption, OS Login, VPC Service Controls, and customer-managed encryption keys (CMEK).

  • Supports popular open-source tools such as Apache Hadoop, Spark, Flink, and Presto at scale.

  • Integrated with Vertex AI, BigQuery, and Dataplex.

  • Offers lower TCO than on-premises data lakes, with per-second pricing.

  • Supports serverless deployments and managed clusters on Google Compute Engine and Kubernetes.


Google Cloud Dataproc pricing

This commentary is based on 2 Google Cloud Dataproc reviews on G2.

Dataproc offers cost-effective big data processing with features like idle cluster deletion and autoscaling. While some users mention occasional autoscaling delays, the overall pricing sentiment is positive, especially due to the managed infrastructure reducing operational overhead.

See the Google Cloud Dataproc pricing page.

  • Google Cloud Dataproc has a free trial.

Dataproc on Compute Engine
$0.01 per vCPU per hour

Charges are calculated based on a rate of $0.01 per vCPU per hour.

Dataproc on GKE
$0.01 per vCPU per hour

Pricing mirrors that of Dataproc on Compute Engine, charged at $0.01 per vCPU per hour for virtual machines in Dataproc-created node pools.
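
To make the rate above concrete, here is a minimal Python sketch of the arithmetic. It assumes the $0.01 per vCPU per hour rate quoted on this page and prorates it by the second, in line with the per-second billing mentioned earlier. The function and constant names are hypothetical, and the estimate covers only the Dataproc fee, not the underlying Compute Engine, storage, or network charges.

```python
# Rate quoted on this page (USD per vCPU per hour); treat it as an assumption
# and check the official pricing page before relying on it.
DATAPROC_RATE_PER_VCPU_HOUR = 0.01


def dataproc_fee(vcpus: int, seconds: int,
                 rate: float = DATAPROC_RATE_PER_VCPU_HOUR) -> float:
    """Estimate the Dataproc service fee in USD.

    The fee is prorated by the second. Charges for the virtual machines,
    disks, and other Google Cloud services are not included.
    """
    if vcpus < 0 or seconds < 0:
        raise ValueError("vcpus and seconds must be non-negative")
    hours = seconds / 3600
    return vcpus * hours * rate


# Example: 6 nodes with 4 vCPUs each (24 vCPUs) running for 2 hours.
fee = dataproc_fee(vcpus=24, seconds=2 * 3600)
print(f"Estimated Dataproc fee: ${fee:.2f}")
```

Because the fee scales linearly with both vCPU count and runtime, deleting idle clusters reduces it directly.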


Google Cloud Dataproc alternatives

  • Amazon EMR
    Simplified big data processing in the cloud.
  • Cloudera Data Platform
    Hybrid data cloud platform for faster, simpler analytics.
  • Azure Data Lake Store
    Scalable, secure storage for big data analytics in the cloud.
  • Google Cloud Datalab
    Interactive data exploration, analysis, and visualization in the cloud.
  • Google Cloud Scheduler
    Schedules and automates cloud tasks reliably and easily.
  • Google Cloud Dataprep
    Visually prepare data for analysis, no coding needed. Cloud-based.

Google Cloud Dataproc FAQ

  • What is Google Cloud Dataproc and what does it do?

    Google Cloud Dataproc is a fully managed and scalable cloud-based service for processing and analyzing big data. It simplifies the use of open-source tools like Apache Spark and Hadoop, enabling users to focus on insights. It offers cost-effective, serverless options and integrates with other Google Cloud services.

  • How does Google Cloud Dataproc integrate with other tools?

    Google Cloud Dataproc integrates seamlessly with other Google Cloud services like Vertex AI, BigQuery, and Dataplex, providing a unified data platform. This integration simplifies data pipelines and enables advanced analytics workflows. It also supports open-source tools like Apache Spark and Hadoop.

  • What are the main competitors of Google Cloud Dataproc?

    Top competitors to Google Cloud Dataproc include Amazon EMR, Cloudera Data Platform, and Azure Data Lake Store. These alternatives offer similar big data processing capabilities in the cloud, with varying features and pricing models. For interactive data analysis and visualization, consider Google Cloud Datalab.

  • Is Google Cloud Dataproc legit?

    Yes, Google Cloud Dataproc is a legitimate service from Google Cloud. It's a safe and reliable platform for big data processing using open-source tools like Apache Spark and Hadoop. Google Cloud handles the infrastructure, so you can focus on data analysis.

  • How much does Google Cloud Dataproc cost?

    Google Cloud Dataproc's pricing is $0.01 per vCPU per hour for both Compute Engine and Google Kubernetes Engine deployments. This product offers a free trial but no free plan. Additional costs may apply for related Google Cloud services.

  • Is Google Cloud Dataproc customer service good?

    Customer reviews on Google Cloud Dataproc's support are mixed. While some users praise the helpfulness of the GCP support team, others point out that documentation could be improved and the user interface for IAM is not ideal.


Reviewed by

Michal Kaczor
CEO at Gralio

Michal has worked at startups for many years and writes about topics relating to software selection and IT management. As a former consultant for Bain, a business advisory company, he also knows how to understand the needs of any business and find solutions to its problems.

Tymon Terlikiewicz
CTO at Gralio

Tymon is a seasoned CTO who loves finding the perfect tools for any task. He recently headed up the tech department at Batmaid, a well-known Swiss company, where he managed about 60 software purchases, including CX, HR, Payroll, Marketing automation and various developer tools.
