Microsoft Certification
59 practice questions with correct answers and detailed explanations. Use this guide to review concepts before taking the practice exam.
The Microsoft Azure Data Fundamentals (DP-900) certification validates foundational knowledge of core data concepts and Azure data services. This study guide covers all 59 practice questions from our DP-900 practice test, complete with correct answers and explanations to help you understand each concept thoroughly.
Review each question and explanation below, then test yourself with the full interactive practice exam to measure your readiness.
What is the primary purpose of Azure Synapse Analytics?
Azure Synapse Analytics integrates big data and data warehouse analytics into a single platform, enabling organizations to analyze data at scale.
Which Azure service is best suited for storing large volumes of unstructured data such as videos and images?
Azure Blob Storage is designed specifically for storing massive amounts of unstructured data including documents, videos, images, and backups.
What type of database model does Azure Cosmos DB primarily support?
Azure Cosmos DB supports multiple data models including documents, key-value pairs, graphs, and column-family data, making it highly flexible for various application scenarios.
In the context of data classification, what does the term 'hot' tier refer to in Azure Storage?
The 'hot' tier in Azure Storage is optimized for frequently accessed data, providing the fastest retrieval speeds and highest access costs.
Which of the following statements about normalization in relational databases is correct?
Database normalization is a process that reduces data redundancy and improves data integrity by organizing data into logical, related tables with proper relationships.
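As a toy sketch in plain Python (hypothetical customer and order data), normalization replaces repeated customer details in a flat table with a separate customers table referenced by key:

```python
# Denormalized: customer details are repeated on every order row.
orders_flat = [
    {"order_id": 1, "customer": "Ada", "city": "Seattle", "total": 40},
    {"order_id": 2, "customer": "Ada", "city": "Seattle", "total": 15},
    {"order_id": 3, "customer": "Grace", "city": "Boston", "total": 99},
]

# Normalized: each customer is stored once; orders reference them by key.
customers = {}
orders = []
for row in orders_flat:
    # Reuse the surrogate key if this customer was already seen.
    key = next((k for k, v in customers.items()
                if v["customer"] == row["customer"]), None)
    if key is None:
        key = len(customers) + 1
        customers[key] = {"customer": row["customer"], "city": row["city"]}
    orders.append({"order_id": row["order_id"],
                   "customer_id": key,
                   "total": row["total"]})
```

Ada's name and city now exist in exactly one row, so a change to her record cannot leave stale copies behind.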
What is the primary advantage of using polyglot persistence in cloud architectures?
Polyglot persistence enables teams to choose different database technologies optimized for specific use cases, such as document stores for flexible schemas and relational databases for structured data.
In Azure Data Lake Storage Gen2, what feature enables fine-grained access control at the file and directory level?
Azure Data Lake Storage Gen2 uses Access Control Lists (ACLs) to provide fine-grained access control at the file and directory level, similar to POSIX permissions.
What is OLTP primarily used for in database systems?
OLTP (Online Transactional Processing) is optimized for handling frequent, small transactions such as sales orders, inventory updates, and user registrations in real-time.
Which Azure service provides serverless SQL query capabilities for querying data in Azure Storage?
Azure Synapse Analytics SQL on-demand (now called serverless SQL pool) allows you to query files in Azure Storage using T-SQL without provisioning dedicated compute resources.
In data modeling, what is a dimension table primarily used for?
Dimension tables in star schemas or data warehouses store descriptive information like dates, locations, products, and customers that provide context for analyzing facts.
What is the primary difference between Azure SQL Database and Azure SQL Managed Instance?
Azure SQL Managed Instance provides greater compatibility with SQL Server and instance-level features like cross-database queries and SQL Agent, whereas SQL Database is a fully managed PaaS offering.
Which characteristic defines a data warehouse?
A data warehouse is specifically designed to store integrated, historical data optimized for analytical queries, reporting, and business intelligence rather than transactional operations.
What does eventual consistency mean in the context of distributed databases like Azure Cosmos DB?
Eventual consistency is a model where updates to distributed replicas propagate over time, meaning brief inconsistencies may exist temporarily, but all replicas will converge to the same state.
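A toy simulation can make the idea concrete. This is not Cosmos DB's actual replication protocol, just a last-writer-wins sketch where a write lands on one replica first and a sync step propagates it:

```python
# Each replica stores key -> (value, version); the higher version wins on sync.
replicas = [{"x": (0, 0)} for _ in range(3)]

def write(i, key, value):
    # The write is applied to a single replica only.
    _, old_ver = replicas[i][key]
    replicas[i][key] = (value, old_ver + 1)

def sync():
    # Anti-entropy step: every replica adopts the highest-versioned value.
    for key in {k for r in replicas for k in r}:
        winner = max((r[key] for r in replicas if key in r),
                     key=lambda pair: pair[1])
        for r in replicas:
            r[key] = winner

write(0, "x", 42)
stale = [r["x"][0] for r in replicas]       # replicas briefly disagree
sync()
converged = [r["x"][0] for r in replicas]   # all replicas now agree on 42
```

Between the write and the sync, readers hitting different replicas see different values; after convergence they all see the same state.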
In the Azure data platform, what is the primary purpose of Azure Data Factory?
Azure Data Factory is a cloud-based data integration service that enables creating, scheduling, and orchestrating data pipelines for ETL/ELT operations across various sources and destinations.
Which of the following best describes the 'archive' tier in Azure Blob Storage?
The archive tier is the most cost-effective storage option for data accessed infrequently, designed for long-term retention with retrieval latencies of hours (compared to seconds for hot tier).
What is a key characteristic of a fact table in a dimensional data model?
Fact tables in dimensional models contain quantitative measures and metrics of business events, with foreign keys that link to dimension tables to provide context for analysis.
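A minimal star-schema sketch (hypothetical product and store data) shows the pattern: the fact table holds only measures plus foreign keys, and a report resolves those keys against the dimensions:

```python
# Dimension tables: descriptive context keyed by surrogate keys.
dim_product = {1: {"name": "Widget"}, 2: {"name": "Gadget"}}
dim_store = {10: {"city": "Seattle"}, 20: {"city": "Boston"}}

# Fact table: quantitative measures plus foreign keys into the dimensions.
fact_sales = [
    {"product_id": 1, "store_id": 10, "qty": 3, "amount": 30.0},
    {"product_id": 2, "store_id": 10, "qty": 1, "amount": 25.0},
    {"product_id": 1, "store_id": 20, "qty": 2, "amount": 20.0},
]

# Joining facts to dimensions turns raw measures into a readable report.
report = [
    (dim_product[f["product_id"]]["name"],
     dim_store[f["store_id"]]["city"],
     f["amount"])
    for f in fact_sales
]
```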
Which Azure service is specifically designed for batch processing of large volumes of data using Apache Spark?
Azure Synapse Spark pools provide managed Apache Spark clusters within Azure Synapse Analytics, enabling batch processing and data transformation of large datasets.
In relational database design, what does referential integrity ensure?
Referential integrity is a database constraint that ensures every foreign key value references an existing primary key value, maintaining valid relationships between tables.
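The constraint can be sketched as a validation check (hypothetical `orders`/`customers` tables): any foreign key value without a matching primary key is an orphan the database would reject:

```python
def orphaned_foreign_keys(child_rows, fk_column, parent_keys):
    """Return foreign key values that reference no existing parent key."""
    return [row[fk_column] for row in child_rows
            if row[fk_column] not in parent_keys]

customers = {1, 2, 3}                       # primary keys in the parent table
orders = [
    {"id": 100, "customer_id": 1},          # valid reference
    {"id": 101, "customer_id": 9},          # 9 has no matching customer
]

orphans = orphaned_foreign_keys(orders, "customer_id", customers)
```

A relational engine enforces this automatically: inserting order 101 (or deleting customer 1 while order 100 exists) would fail the constraint.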
What is the primary benefit of using partitioning in large tables?
Table partitioning divides large tables into smaller segments based on criteria like date ranges, enabling faster queries, more efficient maintenance, and better resource utilization.
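A rough sketch of monthly range partitioning (hypothetical sales rows): rows are routed into per-month buckets, so a query for one month touches only one bucket, the effect often called partition elimination:

```python
from datetime import date

rows = [
    {"sold_on": date(2024, 1, 5), "amount": 10},
    {"sold_on": date(2024, 1, 20), "amount": 5},
    {"sold_on": date(2024, 2, 3), "amount": 8},
]

# Route each row to a (year, month) partition.
partitions = {}
for row in rows:
    key = (row["sold_on"].year, row["sold_on"].month)
    partitions.setdefault(key, []).append(row)

# Only January's partition is scanned for a January query.
january_total = sum(r["amount"] for r in partitions[(2024, 1)])
```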
Which statement accurately describes the difference between structured and unstructured data?
Structured data conforms to a defined schema with organized columns and types (like databases), while unstructured data lacks predefined organization (like documents, images, videos).
In Azure Synapse Analytics, what is the purpose of a dedicated SQL pool?
Dedicated SQL pools in Azure Synapse provide provisioned compute resources optimized for data warehousing and running complex analytical queries on large datasets.
What is a primary advantage of using Azure Data Lake Storage for big data analytics?
Azure Data Lake Storage is optimized for storing massive amounts of diverse data formats with hierarchical organization and security features ideal for big data analytics and machine learning workloads.
Which of the following scenarios would benefit most from using a document database like Azure Cosmos DB?
Document databases like Azure Cosmos DB excel with flexible, semi-structured data like user profiles with varying attributes and provide global distribution with low latency.
What is the primary purpose of data classification in a data governance framework?
Data classification organizes data by sensitivity levels (public, internal, confidential) and compliance requirements, enabling organizations to apply appropriate security controls and handling procedures.
What is the primary purpose of a data warehouse?
A data warehouse is specifically designed to consolidate and integrate data from multiple operational sources to support analytical queries, reporting, and business intelligence rather than handle real-time transactions.
Which Azure service is best suited for serverless data analytics with SQL queries on files in Azure Storage?
Azure Synapse Analytics provides serverless SQL pools that allow querying data directly from files in Azure Storage without provisioning dedicated infrastructure.
What does OLTP primarily optimize for?
OLTP (Online Transaction Processing) systems are optimized for rapid, concurrent transactions on current operational data with requirements for ACID compliance and low latency.
In Azure, which type of database is optimized for handling large volumes of unstructured data such as documents, images, and videos?
Azure Blob Storage is designed to handle unstructured data such as documents, images, and videos at scale, and Azure Cosmos DB handles semi-structured data, while Azure SQL Database is optimized for structured relational data.
What is the main difference between structured and semi-structured data?
Structured data follows a rigid, predefined schema (like tables in SQL), while semi-structured data has some organizational structure through metadata or tags (like JSON or XML) but not a strict schema.
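The difference is easy to see with two hypothetical JSON user profiles: both are tagged and parseable (semi-structured), yet they need not share identical fields the way rows in a SQL table must:

```python
import json

# Semi-structured records carry their own field names but no rigid schema:
# the second profile has an attribute the first one lacks.
profiles = [
    json.loads('{"name": "Ada", "email": "ada@example.com"}'),
    json.loads('{"name": "Grace", "email": "g@example.com", "twitter": "@g"}'),
]

# Each record simply exposes whatever attributes it happens to have;
# a relational table would force both into the same fixed columns.
fields = [sorted(p.keys()) for p in profiles]
```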
Which Azure service is designed for ingesting, processing, and analyzing streaming data in real-time?
Azure Event Hubs ingests high-volume streaming data while Azure Stream Analytics processes and analyzes it in real-time, making them ideal for streaming scenarios.
What is a key characteristic of NoSQL databases like Azure Cosmos DB?
NoSQL databases like Cosmos DB offer flexible schemas, horizontal scalability, and are designed to handle large, distributed datasets without requiring complex JOIN operations.
Azure Data Factory is primarily used for which data operation?
Azure Data Factory is designed for orchestrating ETL pipelines that move and transform data from multiple sources to destinations at scheduled intervals.
Which normalization form ensures that all non-key attributes are functionally dependent only on the primary key?
Third Normal Form (3NF) requires that all non-key attributes depend only on the primary key, eliminating transitive dependencies that exist in lower normal forms.
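A hypothetical employee table illustrates the transitive dependency 3NF removes: `city` depends on `zip_code`, not on the `emp_id` primary key, so 3NF splits the zip-to-city mapping into its own table:

```python
# 2NF table with a transitive dependency: zip_code determines city.
employees_2nf = [
    {"emp_id": 1, "name": "Ada", "zip_code": "98101", "city": "Seattle"},
    {"emp_id": 2, "name": "Grace", "zip_code": "98101", "city": "Seattle"},
]

# 3NF decomposition: the zip -> city dependency becomes a separate table,
# and the employee table keeps only attributes dependent on emp_id.
zip_codes = {row["zip_code"]: row["city"] for row in employees_2nf}
employees_3nf = [{k: v for k, v in row.items() if k != "city"}
                 for row in employees_2nf]
```

Renaming the city for zip 98101 now means updating one row in `zip_codes` instead of every employee row, which is exactly the update anomaly 3NF prevents.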
What is the primary benefit of using Azure SQL Database managed backups?
Azure SQL Database automatically manages backups, creating point-in-time restore capabilities and reducing the operational overhead of manual backup management.
In the context of data analytics, what does 'denormalization' typically mean?
Denormalization deliberately adds redundant data to reduce the need for complex JOINs, improving query performance in analytical scenarios where read speed is prioritized over update efficiency.
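Continuing the sketch with hypothetical product and sales data, denormalization precomputes the join by copying the product name into every sales row:

```python
# Normalized source tables.
products = {1: "Widget", 2: "Gadget"}
sales = [{"product_id": 1, "amount": 30},
         {"product_id": 2, "amount": 25}]

# Denormalized, read-optimized copy: the product name is duplicated into
# each row so analytical reads avoid a join at query time.
sales_wide = [dict(s, product_name=products[s["product_id"]]) for s in sales]
```

The trade-off is visible in the data itself: a report now reads a single table, but renaming a product requires rewriting every duplicated copy of the name.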
Which Azure service provides a distributed data processing framework for big data analytics similar to Hadoop or Spark?
Azure HDInsight is a managed service that provides Apache Hadoop, Spark, Hive, and other distributed processing frameworks for big data analytics.
What is the main advantage of using Azure Blob Storage's hot, cool, and archive tiers?
Azure Blob Storage tiers allow cost optimization by assigning different access patterns to appropriate tiers—hot for frequent access, cool and archive for less frequent access at lower costs.
Which statement best describes the CAP Theorem in distributed databases?
The CAP Theorem states that distributed databases must make trade-offs between consistency, availability, and partition tolerance—no system can perfectly guarantee all three simultaneously.
What is the primary purpose of indexing in a relational database?
Indexes allow the database engine to quickly locate data without scanning every row, significantly improving query performance on frequently searched columns.
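The speed-up can be modeled with a plain dict standing in for an index on a hypothetical `email` column: a table scan inspects every row, while the index jumps straight to the match:

```python
# 10,000 rows with an email column we query frequently.
rows = [{"id": i, "email": f"user{i}@example.com"} for i in range(10_000)]

def scan(table, email):
    # Full table scan: O(n), touches every row.
    return [r for r in table if r["email"] == email]

# "Index" on email: built once, then O(1) lookups (a real B-tree index
# is O(log n), but the contrast with a scan is the same).
email_index = {r["email"]: r for r in rows}

hit_scan = scan(rows, "user9999@example.com")[0]
hit_index = email_index["user9999@example.com"]
```

Like a real index, the dict costs extra memory and must be maintained on every insert or update, which is why indexes are added selectively.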
In Azure Cosmos DB, what does the term 'Request Unit (RU)' represent?
Request Units (RUs) in Cosmos DB are a normalized measure of throughput cost, where every operation consumes a certain number of RUs based on its computational complexity.
Which type of database schema design would be most appropriate for a data warehouse serving analytical queries?
Data warehouses typically use star or snowflake schemas that balance query performance through denormalization while maintaining some organizational structure around dimensions and facts.
What is the primary difference between Azure SQL Database and SQL Server running on Azure Virtual Machines?
Azure SQL Database is a fully managed Platform-as-a-Service (PaaS) offering with automatic patching and maintenance, while SQL Server on Azure VMs (IaaS) requires manual administration by the user.
In data modeling, what is a 'grain' in the context of a fact table?
The grain of a fact table defines the level of detail for each row—for example, 'daily sales by product by store' has a finer grain than 'monthly sales by region'.
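Rolling hypothetical daily-per-store rows up to monthly-per-store totals shows how coarsening the grain discards detail, which is why the grain is fixed before a fact table is loaded:

```python
from collections import defaultdict

# Fine grain: one row per store per day.
daily = [
    {"day": "2024-01-01", "store": "A", "amount": 10},
    {"day": "2024-01-02", "store": "A", "amount": 5},
    {"day": "2024-01-01", "store": "B", "amount": 7},
]

# Coarser grain: one total per store per month. The daily breakdown
# is no longer recoverable from the rolled-up table.
monthly = defaultdict(float)
for row in daily:
    month = row["day"][:7]                  # "YYYY-MM"
    monthly[(month, row["store"])] += row["amount"]
```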
Which Azure service allows you to query data across multiple Azure data sources using a single SQL-like interface?
Azure Synapse Analytics provides a unified platform to query and analyze data across multiple sources including Data Lake, SQL databases, and other data stores with a single SQL interface.
What is the primary challenge of using polyglot persistence (multiple database technologies) in an organization?
While polyglot persistence allows using the right database for each use case, it increases operational overhead by requiring teams to maintain expertise in multiple technologies and manage data consistency.
Which of the following best describes the relationship between data lakes and data warehouses?
Data lakes store raw, diverse data in its native format for flexibility and future use, while data warehouses store cleaned, structured data optimized for specific analytical queries and business intelligence.
In Azure, what is the purpose of Azure Private Link for databases?
Azure Private Link establishes a private network connection between clients and Azure services, eliminating public internet exposure and improving security posture.
What does the principle of 'data sovereignty' require in cloud deployments?
Data sovereignty requires that data be stored and processed within specific geographic boundaries as mandated by laws like GDPR and various country regulations.
Which normal form eliminates the possibility of anomalies caused by non-key attributes being dependent on other non-key attributes?
Third Normal Form (3NF) removes transitive dependencies where non-key attributes depend on other non-key attributes, preventing update, insert, and delete anomalies.
Your organization needs to store large volumes of unstructured data such as videos, images, and documents. Which Azure service is best suited for this requirement?
Azure Blob Storage is optimized for storing massive amounts of unstructured data such as images, videos, and documents. While Data Lake Storage Gen2 is also suitable, Blob Storage is the primary service for general unstructured data.
You are designing a data warehouse solution for a retail company. The data warehouse needs to support complex analytical queries on historical sales data spanning five years. Which type of database would be most appropriate?
A relational data warehouse (like Azure Synapse Analytics) is specifically designed for complex analytical queries on large volumes of historical data and supports OLAP operations effectively.
What is the primary advantage of using Azure Cosmos DB for globally distributed applications?
Azure Cosmos DB is designed specifically for globally distributed applications, providing automatic multi-region replication and guaranteed low-latency access through its global distribution capabilities.
An organization wants to implement a data retention policy where data older than 7 years must be automatically deleted. Which Azure Storage feature enables this?
Lifecycle management policies in Azure Storage allow you to automatically transition blobs between access tiers or delete blobs based on age and other criteria, enabling automated data retention enforcement.
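A sketch of such a rule, built as a Python dict mirroring the Azure Storage lifecycle management policy JSON (field names and structure should be verified against the current Azure schema before use; 7 years is approximated here as 2,555 days):

```python
import json

# Illustrative lifecycle rule: delete block blobs not modified in ~7 years.
# Field names follow the Azure lifecycle policy JSON shape; treat them as
# an assumption and confirm against the current documentation.
retention_policy = {
    "rules": [
        {
            "enabled": True,
            "name": "delete-after-7-years",
            "type": "Lifecycle",
            "definition": {
                "filters": {"blobTypes": ["blockBlob"]},
                "actions": {
                    "baseBlob": {
                        "delete": {"daysAfterModificationGreaterThan": 2555}
                    }
                },
            },
        }
    ]
}

policy_json = json.dumps(retention_policy)
```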
Which of the following best describes the difference between OLTP and OLAP workloads?
OLTP (Online Transaction Processing) optimizes for many small, fast transactions typical of operational systems, while OLAP (Online Analytical Processing) handles complex analytical queries on large historical datasets.
You need to query data in an Azure Data Lake using standard SQL syntax without moving the data. Which service should you use?
Azure Synapse Analytics allows you to query data in a data lake using standard SQL (via Serverless SQL pool) without requiring data movement or complex transformations.
An organization processes millions of events per second from IoT devices. The data must be ingested, processed in real-time, and stored for later analysis. Which combination of Azure services is most suitable?
Event Hubs is designed for high-throughput event ingestion, Stream Analytics provides real-time processing of streaming data, and Blob Storage (or Data Lake Storage) efficiently stores the processed results for analysis.
What is the primary purpose of normalization in a relational database?
Normalization is a database design process that reduces data redundancy, minimizes anomalies, and ensures data integrity by organizing data into properly structured tables with appropriate relationships.
A company uses Azure SQL Database and experiences unpredictable performance during peak hours due to resource constraints. Which feature allows automatic scaling based on workload demand?
The serverless compute tier in Azure SQL Database includes autoscaling capabilities that automatically adjust compute resources based on workload demand, providing flexibility during peak hours.
Which Azure data service is specifically designed for time-series data and IoT analytics?
Azure Time Series Insights is purpose-built for collecting, processing, storing, and analyzing time-series data from IoT devices and sensors at scale with specialized capabilities for temporal analysis.
You've reviewed all 59 questions. Take the interactive practice exam to simulate the real test environment.