Azure Data Fundamentals DP-900: Exercises and Questions, Exams of Computer Science

A collection of exercises and questions for Azure Data Fundamentals (DP-900). It covers various aspects of Azure data services, including data storage, processing, and analytics. Useful for students preparing for the DP-900 certification exam, or for anyone seeking to deepen their understanding of Azure data solutions.

Typology: Exams

2024/2025

Available from 01/23/2025

Uploaded by Shantelle



Azure Data Fundamentals DP-900

Which service can you use to perpetually retrieve data from a Kafka queue, process the data, and write the data to Azure Data Lake? - ✔ ✔ Azure Stream Analytics

Which three services can be used to ingest data for stream processing? - ✔ ✔ Azure Data Lake Storage. Azure Event Hubs. Azure IoT Hub.

Apache Spark in Azure - ✔ ✔ A distributed processing framework, available through:

Azure Synapse Analytics

Azure Databricks

Azure HDInsight

Delta Lake - ✔ ✔ Delta Lake is an open-source storage layer that adds support for transactional consistency, schema enforcement, and other common data warehousing features to data lake storage. It also unifies storage for streaming and batch data, and can be used in Spark to define relational tables for both batch and stream processing. When used for stream processing, a Delta Lake table can serve as a streaming source for queries against real-time data, or as a sink to which a stream of data is written.

The Spark runtimes in Azure Synapse Analytics and Azure Databricks include support for Delta Lake.

Delta Lake combined with Spark Structured Streaming is a good solution when you need to abstract batch and stream processed data in a data lake behind a relational schema for SQL-based querying and analysis.

Stream processing technologies in Azure - ✔ ✔ Azure Stream Analytics: a PaaS solution used to define streaming jobs.

Spark Structured Streaming: an open-source library used to develop streaming solutions on Apache Spark-based services, including Azure Synapse Analytics, Azure Databricks, and Azure HDInsight.

Azure Data Explorer: a high-performance database and analytics service optimized for ingesting and querying batch or streaming data with a time-series element.
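What these streaming engines do can be made concrete with a toy tumbling-window aggregation. This is plain Python, not the Stream Analytics or Spark API; the function name and event shape are invented for illustration:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Group (timestamp, value) events into fixed, non-overlapping
    (tumbling) windows and count events per window -- the kind of
    aggregation a streaming job performs continuously over a live
    source such as Event Hubs or Kafka."""
    counts = defaultdict(int)
    for ts, _value in events:
        # Each event belongs to the window containing its timestamp.
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)

# Three events fall in the first 10-second window, one in the next.
events = [(1, "a"), (4, "b"), (9, "c"), (12, "d")]
print(tumbling_window_counts(events, 10))  # {0: 3, 10: 1}
```

A real streaming engine runs the same logic incrementally as events arrive, rather than over a finished list.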

Sources for stream processing (ingestion) - ✔ ✔ Azure Event Hubs

Azure IoT Hub

Azure Data Lake Store Gen2

Apache Kafka (used with Apache Spark)

Sinks (output) - ✔ ✔ Azure Event Hubs

Azure Data Lake Store Gen2 or Azure Blob Storage

Azure SQL Database, Azure Synapse Analytics, or Azure Databricks

Power BI

You need to aggregate and store multiple JSON files that contain records for sales transactions. The solution must minimize the development effort.

ORC - ✔ ✔ Optimized Row Columnar format. Organizes data into columns rather than rows.

Parquet - ✔ ✔ Columnar data format. Contains row groups.
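The difference between row-oriented storage and columnar formats like ORC and Parquet can be sketched in plain Python. The sample records are invented, and real readers (e.g. pyarrow) handle this layout on disk:

```python
# Row-oriented storage keeps whole records together; columnar formats
# like Parquet and ORC keep each column's values together, so a query
# that touches one column reads far less data.
rows = [
    {"id": 1, "region": "east", "amount": 10.0},
    {"id": 2, "region": "west", "amount": 25.0},
    {"id": 3, "region": "east", "amount": 5.0},
]

# Column-oriented view of the same data:
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Summing one column only needs that column's contiguous values.
total = sum(columns["amount"])
print(columns["region"])  # ['east', 'west', 'east']
print(total)              # 40.0
```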

Which type of Azure Storage is used for VHDs and is optimized for random read and write operations? - ✔ ✔ Page blob.

Page blobs are optimized for random access and used for VHDs. Append blobs cannot be updated. Block blobs are not used for VHDs.

Blobs - ✔ ✔ Binary large objects for massive amounts of unstructured data.

Block blobs - ✔ ✔ A set of blocks. Each block can vary in size up to 100 MB. A block blob can contain up to 50,000 blocks, giving a maximum size of over 4.7 TB. The block is the smallest amount of data that can be read or written as an individual unit. Block blobs are best used to store discrete, large, binary objects that change infrequently.

Page blobs - ✔ ✔ A page blob is organized as a collection of fixed-size 512-byte pages. Optimized to support random read and write operations; it can fetch and store data for a single page if necessary. A page blob can hold up to 8 TB of data. Azure uses page blobs to implement virtual disk storage for virtual machines.
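The size limits above check out with a little arithmetic (binary units are assumed here, i.e. 1 MB is treated as 1 MiB):

```python
MIB = 1024 ** 2
TIB = 1024 ** 4

# Block blob: up to 50,000 blocks of up to 100 MB each.
max_block_blob = 50_000 * 100 * MIB
print(round(max_block_blob / TIB, 2))  # 4.77 -> "over 4.7 TB"

# Page blob: fixed 512-byte pages, up to 8 TB in total.
pages_in_8_tb = 8 * TIB // 512
print(pages_in_8_tb)  # 17179869184 pages
```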

Append blobs - ✔ ✔ An append blob is a block blob optimized to support append operations. You can only add blocks to the end of an append blob; updating or deleting existing blocks isn't supported.

Hot tier - ✔ ✔ The default access tier for blob storage. Use it for blobs that are accessed frequently. The blob data is stored on high-performance media.

Cool tier - ✔ ✔ Lower performance and reduced storage charges compared to the hot tier. Use it for data accessed infrequently. A blob can start hot and move to cool.

Archive tier - ✔ ✔ Lowest storage cost but increased latency. It can take hours for data to become available, and blobs must be rehydrated before they can be read.

Which two storage solutions can be mounted in Azure Synapse Analytics and used to process large volumes of data? - ✔ ✔ Azure Blob Storage. Azure Data Lake Storage.

What are two characteristics of Azure Table storage? - ✔ ✔ Each RowKey value is unique within a table partition. Items in the same partition are stored in RowKey order.
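Those two Table storage characteristics can be sketched in plain Python. The `insert` helper and in-memory layout are invented for illustration; this is not an Azure SDK API:

```python
from bisect import insort

# Sketch of Azure Table storage semantics: entities are grouped by
# PartitionKey, RowKey must be unique within a partition, and entities
# in a partition are kept in RowKey order.
table = {}

def insert(partition_key, row_key, entity):
    rows = table.setdefault(partition_key, [])
    if any(rk == row_key for rk, _ in rows):
        raise ValueError("RowKey must be unique within the partition")
    insort(rows, (row_key, entity))  # list stays sorted by RowKey

insert("sales", "002", {"amount": 25})
insert("sales", "001", {"amount": 10})
print([rk for rk, _ in table["sales"]])  # ['001', '002']
```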

You need to replace an existing on-premises SMB shared folder with a cloud solution. Which storage option should you choose? - ✔ ✔ Azure Files. Azure Files allows you to create cloud-based network shares to make documents and other files available to multiple users.

Which Azure Cosmos DB API should you use for data in a graph structure? - ✔ ✔ Apache Gremlin. The Gremlin API is used for graph databases. The MongoDB API stores data in the BSON format. The Table API is used to retrieve key/value pairs. The Cassandra API is used to retrieve tabular data.

Azure Cosmos DB - ✔ ✔ A fully managed and serverless distributed database for applications of any size or scale, with support for both relational and non-relational (NoSQL) workloads.

Azure Cosmos DB for Table - ✔ ✔ Azure Cosmos DB for Table is used to work with data in key-value tables, similar to Azure Table Storage. It offers greater scalability and performance than Azure Table Storage.

Azure Cosmos DB for Apache Cassandra - ✔ ✔ Azure Cosmos DB for Apache Cassandra is compatible with Apache Cassandra, which is a popular open source database that uses a column-family storage structure. Column families are tables, similar to those in a relational database, with the exception that it's not mandatory for every row to have the same columns. Can be queried using Cassandra Query Language (CQL), which is syntactically similar to SQL.

Azure Cosmos DB for Apache Gremlin - ✔ ✔ Azure Cosmos DB for Apache Gremlin is used with data in a graph structure, in which entities are defined as vertices that form nodes in a connected graph. Nodes are connected by edges that represent relationships.

Which type of data store uses star schemas, fact tables, and dimension tables? - ✔ ✔ Data warehouses use fact and dimension tables in a star/snowflake schema. Relational databases do not use fact and dimension tables. Cubes are generated from a data warehouse but are not a data store themselves. Data lakes store files.
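A minimal star schema can be sketched with SQLite from the Python standard library: one fact table joined to a dimension table and aggregated, which is exactly the access pattern a data warehouse is optimized for. The table and column names are made up:

```python
import sqlite3

# A tiny star schema: fact_sales references dim_product by its key.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (product_id INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'bikes'), (2, 'helmets');
    INSERT INTO fact_sales VALUES (1, 100.0), (1, 50.0), (2, 20.0);
""")

# Aggregate the facts, sliced by a dimension attribute.
rows = con.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales f JOIN dim_product d USING (product_id)
    GROUP BY d.category ORDER BY d.category
""").fetchall()
print(rows)  # [('bikes', 150.0), ('helmets', 20.0)]
```

A snowflake schema differs only in that the dimension tables are themselves normalized into further tables.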

Which is the best type of database to use for an organizational chart? - ✔ ✔ Graph. Graph databases are the best option for hierarchical data. Azure SQL Database is the best option for create, read, update, and delete (CRUD) operations and uses the least amount of storage space, but this scenario does not require a database management system (DBMS). Object storage is the best option for file storage, not hierarchical databases. Table storage is not suited for hierarchical data.
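An org chart really is a graph: people are nodes (vertices) and reporting lines are edges, so finding everyone under a manager is a simple traversal. A plain-Python sketch with invented names:

```python
# Adjacency list: each manager maps to their direct reports (edges).
reports = {
    "CEO": ["VP Eng", "VP Sales"],
    "VP Eng": ["Dev Lead"],
    "VP Sales": [],
    "Dev Lead": [],
}

def all_reports(manager):
    """Everyone below a manager, direct and indirect (depth-first)."""
    found = []
    for person in reports.get(manager, []):
        found.append(person)
        found.extend(all_reports(person))
    return found

print(all_reports("CEO"))  # ['VP Eng', 'Dev Lead', 'VP Sales']
```

A graph database such as Cosmos DB for Apache Gremlin runs this kind of traversal natively, instead of via recursive joins.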

You design an application that needs to store data based on the following requirements:

Store historical data from multiple data sources

Load data on a scheduled basis

Use a denormalized star or snowflake schema

Which type of database should you use? - ✔ ✔ OLAP (online analytical processing).

OLAP databases are used for snowflake schemas with historical data.

Which type of database can be used for semi-structured data that will be processed by an Apache Spark pool in Azure Synapse Analytics? - ✔ ✔ Column-family. Column-family databases store semi-structured, tabular data comprising rows and columns. Azure Synapse Analytics Spark pools do not directly support graph or relational databases.

Which two attributes are characteristics of an analytical data workload? - ✔ ✔ Optimized for read operations; highly denormalized. (Denormalized data is consolidated into fewer tables so it can be read quickly.)

Which feature of transactional data processing guarantees that concurrent processes cannot see the data in an inconsistent state? - ✔ ✔ Isolation. Isolation in transactional data processing ensures that concurrent transactions cannot interfere with one another and must result in a consistent database state.
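Isolation can be demonstrated with SQLite from the Python standard library: a second connection never sees the first connection's uncommitted row. SQLite's behavior stands in here for the general ACID guarantee; this is not an Azure service:

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file: the reader cannot see the
# writer's uncommitted row, illustrating the Isolation guarantee.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE accounts (id INTEGER, balance REAL)")
writer.commit()

writer.execute("INSERT INTO accounts VALUES (1, 100.0)")  # transaction open
before = reader.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]

writer.commit()  # transaction committed, row now durable and visible
after = reader.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]

print(before, after)  # 0 1
```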

What should you include in the recommendation? - ✔ ✔ A stored procedure.

A stored procedure can encapsulate any type of business logic that can be reused in the application. A stored procedure can modify existing data as well as add new entries to tables. A stored procedure can be run from an application as well as from the server.

Which service is managed and serverless, avoids the use of Windows Server licenses, and allows for each workload to have its own instance of the service being used? - ✔ ✔ Azure SQL Database.

Azure SQL Database is a serverless platform as a service (PaaS) SQL instance. SQL Managed Instance is a PaaS service, but databases are maintained in the same SQL Managed Instance cluster. SQL Server on Azure Virtual Machines running Windows or Linux are not serverless options.

Which three open-source databases are available as platform as a service (PaaS) in Azure? - ✔ ✔ MariaDB, PostgreSQL, and MySQL, each offered as an Azure Database service (Azure Database for MariaDB, Azure Database for PostgreSQL, Azure Database for MySQL).

Which data service allows you to control the amount of RAM, change the I/O subsystem configuration, and add or remove CPUs? - ✔ ✔ SQL Server on Azure Virtual Machines.

Which open-source database has built-in support for temporal data? - ✔ ✔ MariaDB.

Which two services allow you to pre-process a large volume of data by using Scala? Each correct answer presents a complete solution. - ✔ ✔ Azure Databricks; a serverless Apache Spark pool in Azure Synapse Analytics.

What is a characteristic of batch processing? - ✔ ✔ Batch processing is used to execute complex analysis. It handles a large amount of data at a time, and its latency is usually measured in minutes or hours.
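The contrast with stream processing can be sketched in plain Python: a batch job processes the whole accumulated data set in one scheduled run, instead of handling each event as it arrives. The order records and function name are invented:

```python
# A day's worth of accumulated records, processed together in one pass.
daily_orders = [
    {"order_id": 1, "amount": 30.0},
    {"order_id": 2, "amount": 45.0},
    {"order_id": 3, "amount": 25.0},
]

def nightly_batch(orders):
    """One scheduled pass over the accumulated data set."""
    return {"orders": len(orders),
            "revenue": sum(o["amount"] for o in orders)}

print(nightly_batch(daily_orders))  # {'orders': 3, 'revenue': 100.0}
```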