14 Jun Technical Writing
Google BigQuery
Description and purpose
Google BigQuery is a serverless, cloud-based data warehouse and big data analytics web service for processing very large read-only data sets. It is a highly scalable enterprise data warehouse that helps data analysts be more productive through its performance. With BigQuery, you don’t have to worry about your infrastructure; you can simply use SQL to perform analysis and gain insights without a database administrator.
You can stream and analyze data by creating a logical data warehouse over managed columnar storage, object storage, and spreadsheets. It also has an in-memory BI Engine, which can be used to create dashboards and reports, and it offers a secure means of sharing queries, data sets, and insights within your organization. It also has the capability to build and operationalize ML solutions and to perform geospatial analysis.
Distinctive Features
· Flexible data ingestion
One can load datasets from Google Cloud Storage or stream data directly into BigQuery and perform real-time analysis.
· Ease of Collaboration
Big data sets can be saved, shared, and accessed using BigQuery, and permissions on each dataset can be customized.
· Fast and performant
Its columnar architecture handles nested and repeated fields highly efficiently, which saves both time and money.
· Strong partner ecosystem
Many leading tools, such as Informatica and Tableau, have partnered with BigQuery for loading, transforming, and visualizing data.
· Affordability
Loading and exporting data, as well as metadata operations, are free of charge. Users are charged based on what they store and the queries they run, and the first 1 TB of data processed each month is free.
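As a rough sketch of this pricing model, the function below estimates the on-demand cost of a query from the bytes it processes, with a 1 TB monthly free tier as noted above. The price per terabyte and the use of decimal TB are illustrative assumptions; consult current BigQuery pricing (which is quoted per TiB) for real numbers.

```python
def estimate_query_cost(bytes_processed, monthly_bytes_so_far=0,
                        free_tier_bytes=10**12, price_per_tb=5.0):
    """Estimate the on-demand cost of a BigQuery query (illustrative only).

    The first terabyte processed each month is free; price_per_tb is a
    placeholder, not the current published rate.
    """
    # Billing starts only once total monthly usage passes the free tier.
    billable_start = max(monthly_bytes_so_far, free_tier_bytes)
    billable_end = max(monthly_bytes_so_far + bytes_processed, free_tier_bytes)
    billable_bytes = billable_end - billable_start
    return billable_bytes / 10**12 * price_per_tb

# A 500 GB query with no prior usage this month stays inside the free tier.
print(estimate_query_cost(500 * 10**9))  # 0.0
# A 2 TB query after the free tier is exhausted is billed in full.
print(estimate_query_cost(2 * 10**12, monthly_bytes_so_far=10**12))  # 10.0
```

Because storage and queries are billed separately, a sketch like this only covers the query side of the bill.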
Use Cases
· Data Warehouse solution
BigQuery can replace the typical hardware setup of a traditional data warehouse. It can serve as the place to perform all analytical processing, subsequently reducing both the required hardware costs and the operating costs.
· Retailers
It can be used by retailers for forecasting their sales.
Cloud Dataflow
Description and purpose
Google Cloud Dataflow is a fully managed service for transforming and enriching data in stream (real-time) and batch (historical) modes with equal reliability and expressiveness. It allows you to build pipelines, monitor their execution, and transform and analyze data, all in the cloud. Cloud Dataflow also helps you gain actionable insights from data while reducing operational costs, without the hassle of deploying, maintaining, or scaling infrastructure.
It is basically a collection of SDKs for building batch or streaming parallelized data-processing pipelines.
Cloud Dataflow can be used for:
ETL
· Movement, filtering, enrichment, shaping
ANALYSIS
· Reduction, batch computation, continuous computation
ORCHESTRATION
· Composition, external orchestration, simulation
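Production Cloud Dataflow pipelines are written with the Apache Beam SDK; the plain-Python sketch below, using made-up records, only illustrates the kinds of stages named above: filtering, shaping, and reduction.

```python
from collections import defaultdict

# Hypothetical input records, standing in for a real source such as
# Pub/Sub (streaming) or Cloud Storage (batch).
events = [
    {"user": "a", "amount": 12.0, "valid": True},
    {"user": "b", "amount": -3.0, "valid": False},
    {"user": "a", "amount": 8.0, "valid": True},
]

# Filtering: drop invalid records.
valid = [e for e in events if e["valid"]]

# Shaping/enrichment: keep only the fields the analysis needs.
shaped = [{"user": e["user"], "amount": e["amount"]} for e in valid]

# Reduction: aggregate amounts per user (a batch computation).
totals = defaultdict(float)
for e in shaped:
    totals[e["user"]] += e["amount"]

print(dict(totals))  # {'a': 20.0}
```

In Beam the same stages would be expressed as `Filter`, `Map`, and `CombinePerKey` transforms, and the runner would parallelize them automatically.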
Distinctive Features
· Automated Resource Management
Cloud Dataflow can automate provisioning and management of processing resources to minimize latency and maximize utilization.
· Dynamic work Rebalancing
The lagging work is dynamically rebalanced by optimizing and automating work partitioning.
· Reliable and consistent exactly-once processing
It provides built-in support for fault-tolerant execution that is consistent and correct regardless of data size, cluster size, processing pattern, or pipeline complexity.
· Built on foundation for Machine learning
Cloud Dataflow is convenient to use as an integration point for bringing predictive analytics to fraud detection and real-time personalization, by adding TensorFlow-based Cloud Machine Learning models and APIs to data-processing pipelines.
Use Cases
· Point-of-Sale analysis and segmentation in the retail world
· Fraud detection in the financial industry
· IoT information in the healthcare and manufacturing industries
Cloud Dataproc
Description and purpose
Cloud Dataproc is a fast, easy-to-use, low-cost and fully managed service that lets you run the Apache Spark and Apache Hadoop ecosystem on Google Cloud Platform. Cloud Dataproc provisions big or small clusters rapidly, supports many popular job types, and is integrated with other Google Cloud Platform services, such as Cloud Storage and Stackdriver Logging, thus helping you reduce TCO.
Cloud Dataproc is a managed framework that runs on the Google Cloud Platform and ties together several popular tools for processing data, including Apache Hadoop, Spark, Hive, and Pig. It has a set of control and integration mechanisms that coordinate the lifecycle and management of clusters, and it is integrated with the YARN application manager to make managing and using your clusters easier.
Distinctive Features
· Automated Cluster Management
Its automated cluster management lets users concentrate on their data rather than on the cluster, since deployment, monitoring, and logging are handled automatically.
· Developer Tools
It provides multiple ways to manage a cluster, including an easy-to-use web UI, the Google Cloud SDK, RESTful APIs, and SSH access.
· Integrated
It has built-in integration with Cloud Storage, BigQuery, Bigtable, Stackdriver Logging, and Stackdriver Monitoring, providing a complete and robust data platform.
· Versioning
With image versioning, users can toggle between different versions of Apache Spark, Apache Hadoop, and other tools.
Use Cases
· Database separation
It can be used for database-separation tasks, which typically take a lot of time because of the volume of data involved. Dataproc can deal with these kinds of problems very efficiently and can save a lot of time.
· The use of clusters can help in predicting opportunities and determining future sales, thus increasing efficiency.
Cloud Pub/Sub
Description and purpose
Cloud Pub/Sub brings the flexibility and reliability of enterprise message-oriented middleware to the cloud. At the same time, Cloud Pub/Sub is a scalable, durable event ingestion and delivery system that serves as a foundation for modern stream analytics pipelines. It is a fully managed real-time messaging service that allows you to send and receive messages between independent applications. Users can leverage Cloud Pub/Sub’s flexibility to decouple systems and components hosted on Google Cloud Platform or elsewhere on the Internet.
Distinctive Features
· Open
Open APIs and client libraries in seven languages support cross-cloud and hybrid deployments.
· Exactly-once processing
Cloud Dataflow supports reliable, expressive, exactly-once processing of Cloud Pub/Sub streams.
· No provisioning, auto-everything
Cloud Pub/Sub does not have shards or partitions. Just set your quota, publish and consume.
· Compliance and security
Cloud Pub/Sub is a HIPAA-compliant service, offering fine-grained access controls and end-to-end encryption.
· Seek and replay
Rewind your backlog to any point in time or to a snapshot, giving you the ability to reprocess messages. Fast-forward to discard outdated data.
· Integrated
Take advantage of integrations with multiple services, such as Cloud Storage and Gmail update events, and with Cloud Functions for serverless event-driven computing.
Use Cases
· Workload migration and hybrid cloud features allow for easy access anywhere. Workloads are managed in a way that makes it easier for businesses to compile and sort files and data and to manage applications.
· Balancing workloads in network clusters
A large queue of tasks can be efficiently distributed among multiple workers, such as Google Compute Engine instances.
· Distributing event notifications
For example, a service that accepts user signups can send notifications whenever a new user registers, and downstream services can subscribe to receive notifications of the event.
· Logging to multiple systems
For example, a Google Compute Engine instance can write logs to the monitoring system, to a database for later querying, and so on.
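The real service is used through the google-cloud-pubsub client library; the minimal in-process sketch below, with made-up names, only illustrates the decoupling described in these use cases: a publisher emits a message to a topic without knowing which downstream subscribers exist, and each subscriber receives its own copy.

```python
class Topic:
    """A toy in-process stand-in for a Pub/Sub topic (not the real API)."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        # Each subscription gets its own copy of every future message.
        self.subscribers.append(callback)

    def publish(self, message):
        # Fan the message out to every subscriber, as Pub/Sub does for
        # each subscription attached to a topic.
        for callback in self.subscribers:
            callback(message)

# A signup service publishes events; two independent consumers react.
signups = Topic()
welcome_log, audit_log = [], []
signups.subscribe(lambda m: welcome_log.append(f"welcome {m}"))
signups.subscribe(lambda m: audit_log.append(f"signup: {m}"))

signups.publish("new_user_42")
print(welcome_log)  # ['welcome new_user_42']
print(audit_log)    # ['signup: new_user_42']
```

The point of the pattern is that the publisher and the consumers never reference each other directly, so either side can be replaced or scaled independently.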
Cloud Data Fusion
Description and purpose
Cloud Data Fusion is a fully managed, cloud-native data integration service that helps users efficiently build and manage ETL/ELT data pipelines. It provides a graphical interface to increase time efficiency and reduce complexity, and allows business users, developers, and data scientists to easily and reliably build scalable data integration solutions to cleanse, prepare, blend, transfer and transform data without having to wrestle with infrastructure.
Distinctive Features
· Code-free self-service
It removes bottlenecks by enabling nontechnical users through a code-free graphical interface that delivers point-and-click data integration.
· Collaborative data engineering
Data Fusion offers the ability to create an internal library of custom connections and transformations that can be validated, shared, and reused across an organization.
· GCP-native
Fully managed, GCP-native architecture unlocks the scalability, reliability, security, and privacy guarantees of Google Cloud.
· Integration metadata and lineage
Search integrated datasets by technical and business metadata. Track lineage for all integrated datasets at the dataset and field level.
· Hybrid enablement
Open source provides the flexibility and portability required to build standardized data integration solutions across hybrid and multi-cloud environments.
· Comprehensive integration toolkit
Built-in connectors to a variety of modern and legacy systems, code-free transformations, conditionals and pre/post processing, alerting and notifications, and error processing provide a comprehensive data integration experience.
Use Cases
· Modern, more secure cloud data lakes
Cloud Data Fusion helps users build scalable, distributed data lakes on GCP by migrating data from siloed on-premises platforms. Users can leverage the scale of the cloud to centralize data and, as a result, drive more value out of it. The self-service capabilities of Cloud Data Fusion increase process visibility and lower the overall cost of operational support.
· Unified analytics environment
Many users today want to establish a unified analytics environment across a myriad of expensive, on-premises data marts. Integrating data from all these sources using a wide range of disconnected tools and stop-gap measures creates data quality and security challenges. Cloud Data Fusion’s vast variety of connectors, visual interfaces, and abstractions centered around business logic helps in lowering TCO, promoting self-service and standardization, and reducing repetitive work.
Google Data Studio
Description and purpose
Google Data Studio is a fully managed visual analytics service that can help anyone in the organization to unlock insights from data through easy-to-create and interactive dashboards that inspire smarter business decision-making. When Data Studio is combined with BigQuery BI Engine, an in-memory analysis service, data exploration and visual interactivity reach sub-second speeds, over massive datasets.
Distinctive Features
· Connect ability
With Data Studio, users can easily report on data from a wide variety of sources, without programming. Users can connect to data sets such as:
· Google Marketing Platform products, including Google Ads, Analytics, Display & Video 360, Search Ads 360
· Google consumer products, such as Sheets, YouTube, and Search Console
· Databases, including BigQuery, MySQL, and PostgreSQL
· Flat files via CSV file upload and Google Cloud Storage
· Social media platforms such as Facebook, Reddit, and Twitter
· Sharing
Google Data Studio makes it easy to share insights with individuals, teams, or anyone else. You can invite anyone to view or edit your reports, and reports can be embedded in other pages such as Google Sites, blog posts, marketing articles, and annual reports. When you share a Data Studio file with another editor, you can work on it together in real time as a team.
Use Cases
· With this tool we can analyze and measure what happens on any website, which is essential when carrying out online marketing campaigns; measuring is necessary to improve any online process.
· As a company, it can provide many benefits, such as streamlining the process of creating web reports and capturing information when building useful and actionable dashboards.
Cloud Dataprep
Description and purpose
Google Cloud Dataprep, offered in collaboration with Trifacta, is an intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data for analysis. Because Cloud Dataprep is serverless and works at any scale, there is no infrastructure to deploy or manage. Your next ideal data transformation is suggested and predicted with each UI input, so you don’t have to write code. And with automatic schema, data type, possible join, and anomaly detection, you can skip time-consuming data profiling and focus on data analysis.
Distinctive Features
· Predictive transformation
It uses a proprietary inference algorithm to interpret the data transformation intent of a user’s data selection. It automatically generates a ranked set of suggestions and patterns for the selections to match.
· Parameterization
It can execute a recipe across multiple instances of identical datasets by parameterizing a variable to replace the parts of the file path that change with each refresh. This variable can be modified as needed at job runtime.
· Collaboration
It is useful in team environments to have multiple users work on the same assets or to create copies of good quality work to serve as templates for others. Cloud Dataprep enables users to collaborate on the same flow objects in real time or to create copies for others to use for independent work.
· Target matching
Define target schemas, through imported or created datasets, and assign to an existing recipe to systematize and speed up your wrangling efforts. Targets appear in the Transformer page and can be applied against the entire dataset or selected columns of the dataset you need to wrangle.
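The parameterization feature above can be pictured as a templated file path whose changing part is resolved at job runtime. The sketch below uses Python's standard string templating with an invented bucket and variable name; Dataprep's actual parameter syntax differs, so this only illustrates the idea.

```python
from string import Template

# The changing segment of the path (here, the run date) becomes a
# variable; everything else stays fixed across refreshes.
path_template = Template("gs://my-bucket/sales/$run_date/orders.csv")

def resolve_path(run_date):
    """Resolve the templated path for one run of the recipe."""
    return path_template.substitute(run_date=run_date)

print(resolve_path("2021-06-14"))
# gs://my-bucket/sales/2021-06-14/orders.csv
```

Each scheduled run supplies a different value for the variable, so one recipe can process every instance of an identically structured dataset.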
Use Cases
· This product solves many business problems. For example, it saves time by analyzing many issues itself, including data transformation, with the next transformation predicted from each UI input. It gives users the ability to access and prepare data from storage themselves, making data handling very easy.
· Preparing log data for analytics, with basic cleansing and parsing: instead of doing this manually in Google Sheets, Dataprep automates most of it in a few steps.
Cloud Bigtable
Description and purpose
Google Bigtable is a distributed, column-oriented data store created by Google Inc. to handle very large amounts of structured data associated with the company’s Internet search and Web services operations. Google Bigtable serves as the database for applications such as the Google App Engine Datastore, Google Personalized Search, Google Earth and Google Analytics.
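Bigtable's data model is often described as a sparse, sorted map from a row key to column families holding timestamped cells. The toy sketch below, with invented names and a webpage-style row key, only illustrates that shape; it is not the Cloud Bigtable client API.

```python
from collections import defaultdict

# row key -> column family -> (qualifier, timestamp) -> value
table = defaultdict(lambda: defaultdict(dict))

def put(row_key, family, qualifier, value, timestamp):
    """Write one versioned cell."""
    table[row_key][family][(qualifier, timestamp)] = value

def read_latest(row_key, family, qualifier):
    """Return the most recent version of a cell, or None."""
    cells = table[row_key][family]
    versions = [(ts, v) for (q, ts), v in cells.items() if q == qualifier]
    return max(versions)[1] if versions else None

# Two versions of the same cell; reads default to the newest one.
put("com.example/index.html", "contents", "html", "<html>v1</html>", 1)
put("com.example/index.html", "contents", "html", "<html>v2</html>", 2)
print(read_latest("com.example/index.html", "contents", "html"))
# <html>v2</html>
```

Keeping multiple timestamped versions per cell is what lets Bigtable serve both the latest value for low-latency apps and historical values for analytics.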
Distinctive Features
· Fast and performant
Cloud Bigtable can be used as a storage engine for large-scale, low-latency apps as well as for throughput-intensive data processing and analytics.
· Seamless scaling and replication
It can provision and scale to hundreds of petabytes and can smoothly handle millions of operations per second. Changes to the deployment configuration are very fast, reducing downtime during reconfiguration. Replication adds high availability for live serving apps and workload isolation for serving versus analytics.
· Simple and integrated
Cloud Bigtable integrates easily with popular big data tools like Hadoop, Cloud Dataflow, and Cloud Dataproc. Plus, Cloud Bigtable supports the open source industry standard HBase API, which makes it easy for your development teams to get started.
Use Cases
· It can be easily deployed and used by big companies to monitor data from any of their locations.
· It can be used as a storage engine for large-scale, low-latency applications as well as for throughput-intensive data processing and analytics.
Cloud Storage
Description and purpose
Cloud Storage is a unified object storage solution that allows worldwide storage and retrieval of any amount of data at any time. It can be used for a range of scenarios, including serving website content, storing data for archival and disaster recovery, and distributing large data objects to users via direct download. With Cloud Storage, one can easily access data instantly from any storage class, and it can also reduce data storage carbon emissions to zero.
Distinctive Features
· Designed for eleven 9’s of durability
Cloud Storage is designed for 99.999999999% durability. It stores data redundantly, with automatic checksums to ensure data integrity. With Multi-Regional Storage, your data is maintained in geographically distinct locations.
· Strongly consistent
When a write succeeds, the latest copy of the object is guaranteed to be returned by any GET, globally. This applies to PUTs of new or overwritten objects and to DELETEs.
· A single API for all storage classes
Cloud Storage’s consistent API, latency, and speed across storage classes simplify development integration and reduce code complexity. Implement Object Lifecycle Management to set a time to live (TTL) for objects, archive older versions of objects, or downgrade storage classes without compromising latency or accessibility. Set custom policies to transition data seamlessly from one storage class to the next, depending on your cost and availability needs at the time.
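A lifecycle policy like the one described above can be thought of as a set of age-based rules. The sketch below evaluates such rules locally; the thresholds, the delete cutoff, and the ordering of the transitions are illustrative assumptions, while Nearline and Coldline are real Cloud Storage class names. Actual policies are configured as lifecycle rules on the bucket, not in application code.

```python
def lifecycle_action(age_days, delete_after=365,
                     transitions=((30, "Nearline"), (90, "Coldline"))):
    """Decide what a simple age-based lifecycle policy would do to an object.

    transitions must be ordered from youngest to oldest threshold.
    """
    if age_days >= delete_after:
        return "Delete"
    action = "Standard"
    # The last threshold the object has passed wins.
    for threshold, storage_class in transitions:
        if age_days >= threshold:
            action = storage_class
    return action

print(lifecycle_action(10))   # Standard
print(lifecycle_action(45))   # Nearline
print(lifecycle_action(100))  # Coldline
print(lifecycle_action(400))  # Delete
```

Because downgraded classes trade lower storage cost for higher access cost, the thresholds in a real policy should reflect how often the data is actually read.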
Use Cases
Instead of having to send a giant email with all recipients entered by hand, we can just share a document in the cloud, and everyone has instant access. Google will even notify people by email that a new document has arrived in their cloud storage, which helps remind them.
Cloud Datalab
Description and purpose
Cloud Datalab is a powerful interactive tool created to explore, analyze, transform, and visualize data and build machine-learning models on Google Cloud Platform. It is an interactive notebook based on Jupyter, and it’s integrated with BigQuery and Cloud Machine Learning Engine to provide easy access to key data processing services. And with TensorFlow or Cloud Machine Learning Engine, you can easily turn data into deployed machine-learning models ready for prediction.
Distinctive Features
· Machine Learning
It supports TensorFlow-based deep ML models in addition to scikit-learn, and it scales training and prediction via specialized libraries for Cloud Machine Learning Engine.
· Multi-Language Support
Cloud Datalab currently supports Python, SQL, and JavaScript (for BigQuery user-defined functions).
· IPython Support
Datalab is based on Jupyter (formerly IPython), so you can use a large number of existing packages for statistics, machine learning, and more. Learn from published notebooks and swap tips with a vibrant IPython community.
· Pay-per-use Pricing
Only pay for the cloud resources you use: Google Compute Engine VMs, BigQuery, and any additional resources you decide to use, such as Cloud Storage.
Use Cases
· It can be used to connect data from various sources; one can relate the defect, quality, and cost aspects of a company’s products across different regions. Visualization in Datalab can also provide a better picture of the current state of affairs, helping companies identify where improvement is needed in their business.
