Oracle Certification

1Z0-133 — Data Warehousing Study Guide

61 practice questions with correct answers and detailed explanations. Use this guide to review concepts before taking the practice exam.


About the 1Z0-133 Exam

The Oracle Data Warehousing (1Z0-133) certification validates professional expertise in Oracle technologies. This study guide covers all 61 practice questions from our 1Z0-133 practice test, complete with correct answers and explanations to help you understand each concept thoroughly.

Review each question and explanation below, then test yourself with the full interactive practice exam to measure your readiness.

61 Practice Questions & Answers

Q1 Medium

Which of the following best describes the purpose of a slowly changing dimension (SCD)?

  • A To compress dimension tables for faster queries
  • B To store historical changes in dimensional attributes over time ✓ Correct
  • C To eliminate redundant fact table records
  • D To partition large fact tables by date
Explanation

SCDs are designed to track and maintain historical changes in dimension attributes, allowing you to analyze how dimension members have evolved. This enables accurate historical reporting and analysis.

Q2 Medium

In a data warehouse, what is a conformed dimension?

  • A A dimension that has been normalized to third normal form
  • B A dimension that conforms to star schema requirements only
  • C A dimension used solely for conforming to audit requirements
  • D A dimension table shared across multiple fact tables with consistent definitions ✓ Correct
Explanation

A conformed dimension is a shared dimension used across multiple fact tables in different subject areas, ensuring consistent definitions and enabling drill-across analysis across different data marts.

Q3 Medium

Which ETL operation is most critical when dealing with data quality issues in source systems?

  • A Fact table partitioning and indexing
  • B Dimension table joins
  • C Aggregation and rollup processing
  • D Data validation and cleansing ✓ Correct
Explanation

Data validation and cleansing are fundamental for addressing data quality issues before data enters the warehouse. Without proper validation, poor data quality cascades through all downstream analytics and reporting.

Q4 Easy

What is the primary advantage of implementing a staging area in a data warehouse architecture?

  • A It eliminates the need for backup and recovery procedures
  • B It allows source systems to query transformed data directly
  • C It provides a landing zone where data can be extracted, cleaned, and validated before loading into the warehouse ✓ Correct
  • D It reduces the number of dimensions required
Explanation

The staging area acts as a temporary holding zone where raw data from source systems can be extracted, cleaned, validated, and prepared before being loaded into the production data warehouse.

Q5 Easy

In Kimball dimensional modeling, what does the fact table typically contain?

  • A Foreign keys to dimensions and measurable numeric facts or metrics ✓ Correct
  • B Historical snapshots of all dimension changes
  • C Normalized business rules and validation logic
  • D Only dimension keys and descriptive attributes
Explanation

Fact tables contain foreign keys linking to dimension tables plus quantifiable metrics (measures) that represent business events or transactions. This structure enables efficient analysis across multiple dimensions.
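As a minimal sketch of this structure (table and column names are invented for illustration, and SQLite is used here as a stand-in for Oracle SQL), a fact table holds foreign keys plus additive measures, and analytical queries join out to the dimensions:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,   -- surrogate key
    product_name TEXT,
    category TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,      -- e.g. 20240115
    full_date TEXT,
    month INTEGER,
    year INTEGER
);
-- Fact table: foreign keys to dimensions plus numeric measures.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity INTEGER,       -- additive measure
    sales_amount REAL       -- additive measure
);
""")
con.execute("INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware')")
con.execute("INSERT INTO dim_date VALUES (20240115, '2024-01-15', 1, 2024)")
con.execute("INSERT INTO fact_sales VALUES (20240115, 1, 3, 29.97)")

# Analytical query: join facts to dimensions, aggregate the measures.
total = con.execute("""
    SELECT d.year, p.category, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_date d ON f.date_key = d.date_key
    JOIN dim_product p ON f.product_key = p.product_key
    GROUP BY d.year, p.category
""").fetchone()
```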

Q6 Medium

Which type of slowly changing dimension (SCD) overwrites the old attribute value with the new value?

  • A Type 1 - Overwrite ✓ Correct
  • B Type 0 - Fixed Dimension
  • C Type 3 - Add New Column
  • D Type 2 - Add New Row
Explanation

Type 1 SCD overwrites existing attribute values with new ones, losing historical information. It is the simplest approach when historical tracking is not required.
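A Type 1 change is simply an in-place update. A hypothetical example (SQLite syntax, invented table name) shows that the old value is gone and no extra row is created:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,  -- surrogate key
    customer_id TEXT,                  -- natural key from the source
    city TEXT)""")
con.execute("INSERT INTO dim_customer VALUES (1, 'C100', 'Boston')")

# Type 1: overwrite the attribute in place -- the prior value is lost.
con.execute("UPDATE dim_customer SET city = 'Denver' WHERE customer_id = 'C100'")

row = con.execute(
    "SELECT city FROM dim_customer WHERE customer_id = 'C100'").fetchone()
n_rows = con.execute("SELECT COUNT(*) FROM dim_customer").fetchone()[0]
```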

Q7 Hard

What is a bridge table in dimensional modeling, and when is it typically used?

  • A A temporary table used during the ETL process for staging data transformations
  • B A table used to resolve many-to-many relationships between dimensions and facts, allowing flexible attribute allocation ✓ Correct
  • C A table that stores hierarchical relationships between dimensions and is used for enforcing referential integrity
  • D A table that connects fact tables to conformed dimensions and is used in all data warehouse implementations
Explanation

Bridge tables are used to handle many-to-many relationships in dimensional models, allowing facts to be allocated across multiple dimension members (e.g., a single transaction belonging to multiple cost centers).
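A common bridge-table convention (sketched below with invented names; the allocation `weight` column summing to 1.0 per group is an assumption of this example, not a requirement of every design) lets one fact row fan out across several dimension members:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE fact_expense (expense_id INTEGER, group_key INTEGER, amount REAL);
-- Bridge: one group_key maps to many cost centers, each with an
-- allocation weight; weights within a group sum to 1.0.
CREATE TABLE bridge_cost_center (group_key INTEGER, cost_center TEXT, weight REAL);
""")
con.execute("INSERT INTO fact_expense VALUES (1, 10, 1000.0)")
con.executemany("INSERT INTO bridge_cost_center VALUES (?, ?, ?)",
                [(10, 'Marketing', 0.6), (10, 'Sales', 0.4)])

# Allocate the fact amount across cost centers through the bridge.
alloc = con.execute("""
    SELECT b.cost_center, SUM(f.amount * b.weight)
    FROM fact_expense f
    JOIN bridge_cost_center b ON f.group_key = b.group_key
    GROUP BY b.cost_center
    ORDER BY b.cost_center
""").fetchall()
```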

Q8 Hard

In a data warehouse, what is the purpose of the bus matrix?

  • A To document conformed dimensions and subject areas across the organization ✓ Correct
  • B To optimize query performance across distributed warehouse nodes
  • C To manage the physical indexing strategy for large fact tables
  • D To schedule ETL processes based on data dependencies
Explanation

The bus matrix is a planning tool that identifies which conformed dimensions apply to which subject areas (fact tables), helping ensure consistency and enabling integration across data marts.

Q9 Medium

Which of the following best describes a junk dimension?

  • A A dimension used only for temporary or experimental analysis
  • B Unused or deprecated dimensions that should be removed from the schema
  • C A dimension containing low-cardinality flags and indicators that are grouped together to reduce fact table width ✓ Correct
  • D A dimension with poor data quality that requires extensive cleaning
Explanation

A junk dimension is a convenient grouping of low-cardinality flags and indicators (like yes/no or true/false values) into a single dimension to reduce the number of foreign keys in the fact table.
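One way to build such a dimension (a sketch with made-up flag names) is to enumerate every combination of the low-cardinality flags up front, so the fact table carries a single `junk_key` instead of several flag columns:

```python
from itertools import product

# Three low-cardinality flags that would otherwise each need
# their own column (or tiny dimension) on the fact table.
flags = {
    "is_gift":      ["Y", "N"],
    "is_expedited": ["Y", "N"],
    "payment_type": ["Card", "Cash", "Wire"],
}

# Every possible combination becomes one junk-dimension row.
junk_dim = [
    dict(zip(flags, combo), junk_key=i + 1)
    for i, combo in enumerate(product(*flags.values()))
]

# 2 * 2 * 3 = 12 rows cover the whole flag space.
n_rows = len(junk_dim)
```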

Q10 Medium

What is the primary purpose of a degenerate dimension?

  • A To replace a conformed dimension with a simplified version
  • B To store the primary key of the source system transaction ✓ Correct
  • C To handle slowly changing dimension Type 4 scenarios
  • D To reference a dimension that no longer exists in the warehouse
Explanation

A degenerate dimension is a reference number (like order number or ticket ID) taken from the source transaction that is stored directly in the fact table without creating a separate dimension table.

Q11 Easy

Which ETL approach processes all data from the beginning of time on each run?

  • A Delta load
  • B Incremental load
  • C Full refresh ✓ Correct
  • D Snapshot load
Explanation

A full refresh reloads all data from the source system, regardless of what has already been loaded. While simple to implement, it is less efficient than incremental approaches for large datasets.

Q12 Hard

In data warehouse terminology, what is a factless fact table?

  • A A table created during ETL that has not yet been populated with data
  • B A dimension table that contains no foreign keys
  • C A corrupted fact table where all measure values are NULL
  • D A fact table that contains only dimension keys and no measures, used to record events or coverage scenarios ✓ Correct
Explanation

Factless fact tables record events or relationships between dimensions without numeric measures, such as student enrollments or insurance coverage periods. They enable powerful dimensional analysis of events.

Q13 Medium

What is the primary benefit of implementing surrogate keys in dimension tables?

  • A They automatically enforce data quality rules at the database level
  • B They improve query performance by reducing key size compared to natural keys
  • C They provide independence from source system changes and enable tracking of slowly changing dimensions ✓ Correct
  • D They eliminate the need for foreign key constraints in fact tables
Explanation

Surrogate keys (synthetic, system-generated keys) decouple the warehouse from source system changes and enable Type 2 SCD implementations where multiple rows can represent the same natural key entity.

Q14 Medium

Which of the following is NOT a typical responsibility of ETL processes in a data warehouse?

  • A Extracting data from multiple heterogeneous source systems
  • B Creating real-time user dashboards and visualizations ✓ Correct
  • C Transforming data to conform to the warehouse schema
  • D Loading data into the data warehouse and data marts
Explanation

ETL focuses on data extraction, transformation, and loading. Dashboard and visualization creation is typically a reporting or BI tool responsibility, not an ETL function.

Q15 Medium

What is the purpose of implementing a change data capture (CDC) mechanism in a data warehouse?

  • A To compress historical data that is no longer actively queried
  • B To capture only the rows that have changed since the last load, enabling efficient incremental updates ✓ Correct
  • C To create audit trails for all user query access to the warehouse
  • D To automatically generate aggregate tables from detailed data
Explanation

CDC identifies and captures only the data that has changed since the previous load, enabling efficient incremental ETL processes that reduce load time and resource consumption.
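Timestamp-based comparison is one common CDC technique (others include database log mining and triggers). A minimal sketch, with invented column names, of extracting only the delta since the last successful load:

```python
from datetime import datetime

# Source rows carrying a last-modified timestamp maintained by the source.
source_rows = [
    {"id": 1, "value": "A", "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "value": "B", "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "value": "C", "updated_at": datetime(2024, 1, 9)},
]

def extract_changes(rows, last_load_time):
    """Timestamp-based CDC: keep only rows changed since the last load."""
    return [r for r in rows if r["updated_at"] > last_load_time]

# Only rows touched after the previous successful load are extracted.
delta = extract_changes(source_rows, last_load_time=datetime(2024, 1, 3))
```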

Q16 Hard

In dimensional modeling, what is a role-playing dimension?

  • A A single physical dimension table referenced multiple times in a fact table with different logical meanings through different foreign keys ✓ Correct
  • B A dimension that changes roles based on different slowly changing dimension types
  • C A dimension that rotates its attributes quarterly based on business cycles
  • D A dimension created specifically for metadata and auditing purposes
Explanation

A role-playing dimension is a single dimension table referenced multiple times in a fact table (e.g., Order Date, Ship Date, and Delivery Date all referencing the same Date dimension with different foreign key names).
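In SQL this works by joining the one physical date table twice under different aliases, once per role. A hypothetical sketch (SQLite standing in for Oracle SQL):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT);
CREATE TABLE fact_order (
    order_id INTEGER,
    order_date_key INTEGER,   -- role: order date
    ship_date_key  INTEGER    -- role: ship date (same physical dimension)
);
""")
con.executemany("INSERT INTO dim_date VALUES (?, ?)",
                [(20240101, '2024-01-01'), (20240103, '2024-01-03')])
con.execute("INSERT INTO fact_order VALUES (1, 20240101, 20240103)")

# One dim_date table, joined twice under different aliases -- one per role.
row = con.execute("""
    SELECT o.order_id, od.full_date AS order_date, sd.full_date AS ship_date
    FROM fact_order o
    JOIN dim_date od ON o.order_date_key = od.date_key
    JOIN dim_date sd ON o.ship_date_key  = sd.date_key
""").fetchone()
```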

Q17 Medium

Which of the following best describes a snowflake schema?

  • A A schema where dimension tables are fully normalized into multiple related tables ✓ Correct
  • B A schema that automatically adjusts table structures based on query patterns
  • C A schema used exclusively for temporal or time-series data warehouses
  • D A schema designed for very small data warehouses with limited storage
Explanation

A snowflake schema normalizes dimension tables into multiple related tables, reducing redundancy but increasing query complexity compared to a star schema.

Q18 Hard

What is a semi-additive fact in a data warehouse?

  • A A measure that can be summed across some dimensions but not others ✓ Correct
  • B A measure that is derived from multiple source systems
  • C A measure that is only partially populated in the fact table
  • D A measure that changes values based on slowly changing dimension updates
Explanation

Semi-additive facts can be summed across some dimensions but not others. For example, account balance is additive by account but not additive across time; you cannot sum balances across months.
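The difference is easy to see with a few invented balance snapshots: summing across accounts at one point in time is valid, while summing the same balances across months double-counts money:

```python
# Month-end account balances: additive across accounts, not across time.
balances = [
    {"account": "A", "month": "2024-01", "balance": 100},
    {"account": "B", "month": "2024-01", "balance": 50},
    {"account": "A", "month": "2024-02", "balance": 120},
    {"account": "B", "month": "2024-02", "balance": 80},
]

# Valid: sum across accounts within a single point in time.
jan_total = sum(r["balance"] for r in balances if r["month"] == "2024-01")

# Invalid for balances: summing across months double-counts the money.
wrong_total = sum(r["balance"] for r in balances)

# Across time, a point-in-time or average value is used instead.
a_balances = [r["balance"] for r in balances if r["account"] == "A"]
avg_balance_a = sum(a_balances) / len(a_balances)
```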

Q19 Medium

Which approach is best for handling NULL values in dimension tables?

  • A Exclude rows with NULL values from the dimension entirely
  • B Replace NULLs with default values or a special 'Unknown' or 'N/A' dimension member ✓ Correct
  • C Convert NULLs to zero for numeric columns and empty strings for text columns
  • D Store NULLs as they are and handle them in reporting queries
Explanation

Creating explicit 'Unknown' or 'N/A' dimension members replaces NULLs, enabling cleaner aggregations and more intuitive reporting. This approach is preferable to storing actual NULLs in dimensions.
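A common convention (assumed here, not mandated anywhere) reserves a negative surrogate key such as -1 for the 'Unknown' member, and the ETL lookup routes both NULLs and unmatched values to it:

```python
UNKNOWN_KEY = -1  # surrogate key reserved for the 'Unknown' member

dim_region = {UNKNOWN_KEY: "Unknown", 1: "East", 2: "West"}
region_lookup = {"East": 1, "West": 2}

def resolve_region_key(source_value):
    """Map a source value to a dimension key; NULL or unmatched -> Unknown."""
    if source_value is None:
        return UNKNOWN_KEY
    return region_lookup.get(source_value, UNKNOWN_KEY)

# NULL and unrecognized values both land on the Unknown member,
# so no fact row carries a NULL foreign key.
keys = [resolve_region_key(v) for v in ["East", None, "Nowhere"]]
```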

Q20 Medium

What is the primary advantage of using aggregate tables in a data warehouse?

  • A They automatically enforce data quality standards across all tables
  • B They eliminate the need for indexing on large fact tables
  • C They reduce the number of dimensions required in the schema
  • D They pre-compute and store summarized data at various levels of granularity, enabling faster query response times ✓ Correct
Explanation

Aggregate tables pre-compute and store summary data (e.g., daily sales by product and region), allowing queries to access pre-computed aggregates instead of scanning detailed fact tables.
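A sketch of the idea (invented names, SQLite syntax): the summary is computed once at load time, and subsequent queries read the small aggregate instead of rescanning the detail fact table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE fact_sales (sale_date TEXT, product TEXT, amount REAL)")
con.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", [
    ("2024-01-01", "Widget", 10.0),
    ("2024-01-01", "Widget", 15.0),
    ("2024-01-02", "Gadget", 20.0),
])

# Pre-compute a daily summary once during the load window.
con.execute("""
    CREATE TABLE agg_daily_sales AS
    SELECT sale_date, product,
           SUM(amount) AS total_amount,
           COUNT(*)    AS n_sales
    FROM fact_sales
    GROUP BY sale_date, product
""")

# Reports now hit the compact aggregate, not the detail table.
rows = con.execute(
    "SELECT * FROM agg_daily_sales ORDER BY sale_date, product").fetchall()
```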

Q21 Medium

In a data warehouse, what does data lineage documentation help accomplish?

  • A It tracks the source, transformations, and destination of data to support impact analysis and auditing ✓ Correct
  • B It automatically optimizes query execution plans based on data relationships
  • C It reduces the storage space required for historical data
  • D It eliminates the need for change management procedures in production environments
Explanation

Data lineage documentation maps the journey of data from source systems through transformations to final warehouse tables, supporting impact analysis, regulatory compliance, and troubleshooting.

Q22 Medium

What is the purpose of implementing slowly changing dimension Type 3?

  • A To create separate dimension tables for each time period
  • B To overwrite old values with new values without maintaining history
  • C To add a new row for each change while maintaining a current flag
  • D To add new columns to store previous and current values of changing attributes ✓ Correct
Explanation

Type 3 SCD adds new columns (typically 'previous value' and 'current value') to store both the old and new attribute values, providing limited history without adding new rows.

Q23 Easy

Which of the following is a characteristic of a well-designed data warehouse?

  • A It minimizes the use of historical data to reduce storage costs
  • B It closely mirrors the operational structure of source systems to ensure data compatibility
  • C It is optimized for analytical query performance rather than transactional write performance ✓ Correct
  • D It requires real-time updates to all dimension and fact tables
Explanation

Data warehouses are designed for efficient analytical queries through denormalization, aggregation, and indexing strategies that differ from operational database optimization.

Q24 Medium

What is the primary purpose of implementing a metadata repository in a data warehouse?

  • A To document and manage definitions, lineage, and relationships of all warehouse objects ✓ Correct
  • B To encrypt sensitive data across all warehouse tables
  • C To automatically generate report designs based on warehouse schema
  • D To store backup copies of all warehouse data for disaster recovery
Explanation

A metadata repository documents warehouse objects, their definitions, data lineage, transformations, and relationships, serving as the data warehouse's 'data about data' for governance and impact analysis.

Q25 Hard

In a dimensional model, what is a conformed fact?

  • A A fact that uses only conformed dimensions that are shared across multiple subject areas or data marts ✓ Correct
  • B A fact value that appears in multiple fact tables with identical values
  • C A fact table that has been verified for data quality and is ready for production use
  • D A fact table that stores confirmed or validated transactions only, excluding uncertain records
Explanation

Conformed facts use conformed dimensions, enabling consistency and drill-across analysis across different subject areas. This supports integrated, enterprise-wide analytics.

Q26 Hard

Which technique is most appropriate for handling hierarchies in a dimension table?

  • A Create a separate fact table for each level of the hierarchy
  • B Store hierarchies only in the BI tool, not in the data warehouse
  • C Add parent-child relationship columns to the dimension, normalized dimension levels, or use a bridge table depending on the hierarchy type ✓ Correct
  • D Denormalize all hierarchy levels into a single flat dimension column
Explanation

Hierarchies can be handled through multiple approaches: parent-child columns for flexible hierarchies, normalized levels for fixed hierarchies, or bridge tables for complex many-to-many hierarchies.

Q27 Easy

What is the primary purpose of a data warehouse?

  • A To store operational data for daily transactions
  • B To replace operational databases in production environments
  • C To manage real-time customer interactions across multiple touchpoints
  • D To integrate, consolidate, and provide historical data for analytical decision-making ✓ Correct
Explanation

A data warehouse is specifically designed to consolidate data from multiple sources and provide historical information for business intelligence and analytical purposes, not for operational transaction processing.

Q28 Medium

Which of the following best describes the concept of slowly changing dimensions (SCD)?

  • A Dimensions whose attributes change infrequently and require specific handling strategies to maintain historical accuracy ✓ Correct
  • B Dimensions used exclusively in time-series data warehouses for tracking temporal changes
  • C Dimensions that change very slowly and require complete data replacement each time
  • D Dimensions that must be updated in real-time to reflect operational changes immediately
Explanation

Slowly Changing Dimensions are dimensions with attributes that change infrequently; strategies like Type 1 (overwrite), Type 2 (add new row), and Type 3 (add new column) handle these changes while preserving historical data.

Q29 Medium

What is a conformed dimension in a data warehouse?

  • A A dimension shared across multiple fact tables with the same structure and meaning in all contexts ✓ Correct
  • B A dimension that has been validated and cleansed according to data quality standards
  • C A dimension created specifically for compliance with regulatory requirements
  • D A dimension that conforms to industry-standard naming conventions and hierarchies
Explanation

Conformed dimensions are reusable dimension tables that maintain consistent structure and definition across multiple fact tables, ensuring consistency and enabling drill-across analysis.

Q30 Easy

In a star schema, what is the relationship between fact and dimension tables?

  • A Fact tables serve as lookup tables that reference multiple dimension tables for validation
  • B Fact and dimension tables are joined at runtime without predefined relationships
  • C Dimension tables are nested within fact tables to optimize query performance and reduce storage
  • D Fact tables contain detailed transaction data and connect to multiple dimension tables via foreign keys ✓ Correct
Explanation

In a star schema, fact tables contain measurable data and foreign keys that reference dimension tables, which contain descriptive attributes; this denormalized structure optimizes query performance.

Q31 Medium

Which ETL process component is responsible for identifying and handling data quality issues before loading into the warehouse?

  • A Load phase, where data is validated against the target schema
  • B Post-load phase, conducted after data has been populated into the warehouse
  • C Extract phase only
  • D Transform phase, where data validation, cleansing, and enrichment occur ✓ Correct
Explanation

The Transform phase is where data quality checks, cleansing, validation, and enrichment happen to ensure only high-quality data enters the warehouse.

Q32 Medium

What is the primary advantage of using a snowflake schema over a star schema?

  • A Snowflake schemas require fewer joins and are simpler for users to understand
  • B Snowflake schemas reduce data redundancy and storage requirements through normalization of dimension tables ✓ Correct
  • C Snowflake schemas eliminate the need for indexes on fact tables
  • D Snowflake schemas provide faster query performance due to complete denormalization
Explanation

Snowflake schemas normalize dimension tables into multiple related tables, reducing storage redundancy and update anomalies, though at the cost of more complex queries requiring additional joins.

Q33 Medium

Which of the following best describes the purpose of a staging area in data warehouse architecture?

  • A A performance optimization layer that caches frequently accessed dimension data
  • B An archival system for storing historical data that is no longer needed in the main warehouse
  • C A production database where all operational transactions are initially recorded
  • D A temporary storage location where extracted data is validated, cleansed, and prepared before loading into the warehouse ✓ Correct
Explanation

A staging area serves as an intermediate repository where data undergoes transformation, validation, and quality checks before being moved to the actual warehouse.

Q34 Easy

What is a fact table, and what does it typically contain?

  • A A table used exclusively for storing aggregated data and pre-calculated summaries
  • B A table containing quantitative measurements, foreign keys to dimensions, and timestamp information for business events ✓ Correct
  • C A normalized table structure that eliminates all redundancy in dimensional data
  • D A table containing descriptive attributes such as customer names, addresses, and product categories
Explanation

Fact tables contain measurable metrics (facts), keys to dimension tables, and temporal information related to business events; they form the central table in dimensional models.

Q35 Hard

In the context of data warehousing, what is a junk dimension?

  • A A temporary dimension used only during the ETL process and discarded afterward
  • B A dimension that contains historical data deemed obsolete and removed from regular queries
  • C A dimension containing low-quality data that should be removed from the warehouse
  • D A small dimension combining multiple low-cardinality, unrelated flags and indicators to avoid cluttering the fact table ✓ Correct
Explanation

A junk dimension consolidates multiple low-cardinality flags and indicators (like yes/no fields) into a single dimension table to keep the fact table clean and improve query performance.

Q36 Medium

What is the primary difference between a data warehouse and a data mart?

  • A A data warehouse uses star schemas while a data mart exclusively uses snowflake schemas
  • B A data warehouse is for tactical decisions while a data mart is for strategic decisions
  • C A data warehouse contains integrated data from multiple sources for the entire organization, while a data mart is a subset tailored to a specific department or business process ✓ Correct
  • D A data warehouse requires real-time updates while a data mart uses batch processing
Explanation

A data warehouse is enterprise-wide and integrates data from multiple operational sources, while a data mart is a focused subset serving specific business units or analytical needs.

Q37 Medium

Which of the following is a key characteristic of Type 2 Slowly Changing Dimensions?

  • A New rows are inserted for each change, with effective date and end date fields to track version history ✓ Correct
  • B The old dimension record is overwritten with new attribute values
  • C Historical changes are stored in a separate audit table outside the dimension
  • D A new column is added to store the previous attribute value alongside the current value
Explanation

Type 2 SCD maintains full history by adding new rows with surrogate keys, effective dates, and end dates, allowing queries to retrieve the version of a dimension that was active at any point in time.
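The expire-and-insert pattern can be sketched in a few lines (invented attribute names; real implementations also handle surrogate key generation and batch processing):

```python
from datetime import date

def apply_scd2(dim_rows, natural_key, new_attrs, change_date, next_key):
    """Type 2 change: expire the current row, then insert a new version."""
    for row in dim_rows:
        if row["customer_id"] == natural_key and row["is_current"]:
            row["end_date"] = change_date   # close out the old version
            row["is_current"] = False
    dim_rows.append({
        "surrogate_key": next_key,
        "customer_id": natural_key,         # natural key repeats across versions
        **new_attrs,
        "effective_date": change_date,
        "end_date": None,                   # open-ended current row
        "is_current": True,
    })

dim = [{"surrogate_key": 1, "customer_id": "C100", "city": "Boston",
        "effective_date": date(2023, 1, 1), "end_date": None, "is_current": True}]

apply_scd2(dim, "C100", {"city": "Denver"}, date(2024, 6, 1), next_key=2)
current = [r for r in dim if r["is_current"]]
```

After the change, both versions coexist: the old row is closed with an end date, and queries filtering on a point in time pick the version that was active then.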

Q38 Medium

What is the primary purpose of a surrogate key in data warehouse dimensional tables?

  • A To improve data compression and reduce overall storage requirements in the warehouse
  • B To replace the natural key and make the dimension table independent of operational systems
  • C To serve as the official business identifier and remain stable even if business attributes change
  • D To provide a system-generated unique identifier that enables efficient storage, join performance, and handling of slowly changing dimensions ✓ Correct
Explanation

Surrogate keys are artificial, system-generated identifiers that provide stable references in fact tables, enable efficient joins, and allow multiple versions of slowly changing dimensions without affecting fact table foreign keys.

Q39 Medium

In data warehouse design, what is denormalization, and why is it commonly used?

  • A Denormalization reduces storage space by eliminating redundant data through careful normalization
  • B Denormalization splits dimension tables into multiple normalized forms to improve data integrity
  • C Denormalization combines related data into fewer tables to reduce joins and improve query performance in read-heavy analytical environments ✓ Correct
  • D Denormalization is an outdated practice replaced by modern column-store database technologies
Explanation

Denormalization intentionally introduces redundancy in dimensional models to minimize join operations, reduce complexity, and accelerate analytical queries in data warehouses.

Q40 Easy

Which dimension table type is used to represent time and date attributes in a fact table?

  • A Historical dimension, used exclusively for tracking changes in other dimensions over time
  • B Event dimension, containing timestamp information for when transactions occurred
  • C Temporal dimension, which only stores the current date and updates it daily
  • D Calendar dimension, containing date, day of week, month, quarter, year, and business calendar attributes ✓ Correct
Explanation

A date or calendar dimension contains all relevant date/time attributes (fiscal periods, holidays, weekdays, etc.) and is connected to fact tables to enable time-based analysis.
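Calendar dimensions are usually generated once rather than loaded from a source. A minimal generator sketch (the chosen attributes and `YYYYMMDD` key format are common conventions, not requirements):

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def build_date_dim(start, n_days):
    """Generate calendar-dimension rows with common date attributes."""
    rows = []
    for i in range(n_days):
        d = start + timedelta(days=i)
        rows.append({
            "date_key": int(d.strftime("%Y%m%d")),  # e.g. 20240101
            "full_date": d.isoformat(),
            "day_of_week": DAY_NAMES[d.weekday()],
            "month": d.month,
            "quarter": (d.month - 1) // 3 + 1,
            "year": d.year,
            "is_weekend": d.weekday() >= 5,
        })
    return rows

dim_date = build_date_dim(date(2024, 1, 1), 3)
```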

Q41 Hard

What is a bridge table in dimensional modeling, and when is it used?

  • A A temporary table created during ETL to stage data before final transformation
  • B An intermediate table used to resolve many-to-many relationships between dimensions and fact tables ✓ Correct
  • C A table that directly connects fact tables to slowly changing dimensions to handle version management
  • D A table that combines multiple unrelated fact tables into a single denormalized structure
Explanation

A bridge table (or helper table) resolves many-to-many relationships between dimensions and facts, allowing proper aggregation when a single fact event is associated with multiple dimension members.

Q42 Medium

In the context of data warehouse architecture, what does the term 'granularity' refer to?

  • A The physical partitioning strategy applied to large dimension tables
  • B The quality and consistency of data values within the warehouse
  • C The frequency at which dimension tables are updated with new records
  • D The level of detail and lowest grain of data stored in a fact table ✓ Correct
Explanation

Granularity refers to the lowest level of detail in a fact table; choosing the correct grain is fundamental to dimensional design and affects what analyses are possible.

Q43 Medium

Which of the following best describes incremental loading in an ETL process?

  • A Loading data in multiple batches spread throughout the day to avoid server congestion
  • B Loading data by gradually increasing the volume to test system capacity before full deployment
  • C Loading only new and changed data since the last successful load, which improves efficiency and reduces load time ✓ Correct
  • D Loading all data from source systems regardless of previous load status
Explanation

Incremental loading captures and loads only changed, inserted, or deleted records since the previous run, reducing processing time and network bandwidth compared to full refreshes.

Q44 Hard

What is a degenerate dimension, and what is its purpose?

  • A A dimension with only one attribute that should be split into multiple small dimensions
  • B A dimension created from obsolete data that is retained only for historical analysis purposes
  • C A dimension attribute stored directly in the fact table rather than in a separate dimension table, used for transaction identifiers or reference numbers ✓ Correct
  • D A dimension that has deteriorated in data quality and should be removed from queries
Explanation

A degenerate dimension is a dimension attribute placed directly in a fact table to capture transaction-level identifiers without requiring a separate dimension table.

Q45 Hard

In data warehouse design, what is a role-playing dimension?

  • A A dimension that has multiple foreign keys in the fact table, each representing a different business role or context ✓ Correct
  • B A dimension that can be assigned different hierarchical structures depending on the query context
  • C A dimension used exclusively for audit trails and compliance tracking
  • D A dimension modified during the ETL process to simulate operational system behavior
Explanation

A role-playing dimension is reused in a fact table with multiple foreign keys, each representing a different business context (e.g., Order Date and Ship Date both referencing the same Date dimension with different keys).

Q46 Medium

What is data lineage in the context of data warehousing?

  • A The chronological sequence of changes applied to a dimension table over time
  • B The validation rules applied to data during the extract phase of ETL
  • C The process of tracking and visualizing the path of data from operational sources through transformation to warehouse tables ✓ Correct
  • D The hierarchy of dimension attributes organized from general to specific categories
Explanation

Data lineage documents the origin, transformations, and destination of data, providing visibility into how data flows through the warehouse and supporting impact analysis.

Q47 Medium

Which of the following is a primary benefit of implementing a data warehouse versus querying operational systems directly?

  • A Data warehouses reduce the total cost of ownership by consolidating all databases into one
  • B Data warehouses automatically eliminate the need for data quality management
  • C Data warehouses provide separated, optimized environments for analytical queries without impacting operational system performance ✓ Correct
  • D Data warehouses eliminate the need for backup and recovery procedures
Explanation

A data warehouse isolates analytical workloads from operational systems, preventing complex queries from degrading transaction processing performance and allowing query optimization specific to analytical needs.

Q48 Medium

In a dimensional model, what is a measure, and how does it differ from a dimension attribute?

  • A Measures and attributes are interchangeable terms referring to the same data elements
  • B A measure is a descriptive characteristic, while an attribute is a quantifiable fact that can be aggregated
  • C A measure is a quantifiable metric that can be aggregated in fact tables, while a dimension attribute is a descriptive characteristic that provides context ✓ Correct
  • D A measure is a temporal attribute, while a dimension attribute describes the business context of an event
Explanation

Measures (facts) are numeric values that can be summed, averaged, or aggregated (like sales revenue), while dimension attributes are descriptive characteristics that contextualize those measures (like product category).

Q49 Hard

What is the primary challenge addressed by implementing a master data management (MDM) system in conjunction with a data warehouse?

  • A Reducing the need for slowly changing dimension handling in the warehouse
  • B Eliminating the requirement for ETL processes in loading warehouse data
  • C Ensuring consistent, authoritative definitions of key business entities across the organization to prevent data inconsistencies ✓ Correct
  • D Automating the physical partitioning of large fact tables
Explanation

MDM establishes a single source of truth for critical business entities (like customers or products), ensuring consistency across systems and reducing data quality issues in the warehouse.

Q50 Medium

Which metric is used to measure the effectiveness of an ETL process in terms of data completeness?

  • A The compression ratio achieved during the load phase
  • B Query response time in milliseconds
  • C The average storage utilization across all dimension tables
  • D The percentage of expected records successfully loaded compared to total records in source systems ✓ Correct
Explanation

Data completeness is measured by comparing the count of records loaded to the count of records expected from the source, identifying any missing or rejected data.
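
The completeness check described above reduces to simple arithmetic on record counts. A minimal sketch (the function name and counts are illustrative, not part of any Oracle tooling):

```python
def completeness_pct(source_count: int, loaded_count: int) -> float:
    """Percentage of expected source records that reached the warehouse."""
    if source_count == 0:
        return 100.0  # nothing expected, nothing missing
    return 100.0 * loaded_count / source_count

# 9,850 of 10,000 source rows survived validation and loading
print(completeness_pct(10_000, 9_850))  # 98.5
```

In practice the two counts would come from a source-extract audit table and a post-load row count, and the result would be compared against an agreed completeness threshold.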

Q51 Hard

In a fact table, what is the difference between an additive, semi-additive, and non-additive measure?

  • A Additive measures relate to incremental loading, semi-additive to full loads, and non-additive to archival processes
  • B Additive measures use addition operators, semi-additive use subtraction, and non-additive use division
  • C Additive measures aggregate across all dimensions, semi-additive measures cannot be summed across time dimensions, and non-additive measures cannot be meaningfully aggregated at all ✓ Correct
  • D These terms refer to different normalization levels of fact tables rather than properties of measures
Explanation

Additive measures sum across all dimensions (e.g., revenue); semi-additive measures sum across some dimensions but not time (e.g., account balances); and non-additive measures cannot be meaningfully summed at all (e.g., ratios, percentages, or distinct counts).
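
The three behaviors can be demonstrated on a small set of monthly account snapshots; the rows and values below are illustrative:

```python
# Monthly account snapshots: deposits are additive, balance is semi-additive.
rows = [
    {"month": "Jan", "account": "A", "deposits": 100, "balance": 500},
    {"month": "Jan", "account": "B", "deposits": 200, "balance": 300},
    {"month": "Feb", "account": "A", "deposits": 150, "balance": 650},
    {"month": "Feb", "account": "B", "deposits": 50,  "balance": 350},
]

# Additive: deposits sum meaningfully across both accounts and months.
total_deposits = sum(r["deposits"] for r in rows)                      # 500

# Semi-additive: balances sum across accounts *within* one month...
jan_balance = sum(r["balance"] for r in rows if r["month"] == "Jan")   # 800
feb_balance = sum(r["balance"] for r in rows if r["month"] == "Feb")   # 1000

# ...but adding Jan and Feb balances (800 + 1000) would double-count the
# same money; across time you take the latest snapshot or an average.

# Non-additive: a ratio must be recomputed from its components,
# never summed from per-row ratios.
deposit_to_balance = total_deposits / (jan_balance + feb_balance)
```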

Q52 Medium

When designing a data warehouse schema, which normal form is typically NOT recommended for fact tables due to query performance considerations?

  • A Boyce-Codd Normal Form (BCNF) ✓ Correct
  • B Third Normal Form (3NF)
  • C Second Normal Form (2NF)
  • D First Normal Form (1NF)
Explanation

Data warehouses typically denormalize data for query performance, avoiding higher normal forms like BCNF. Fact tables often contain denormalized data with foreign keys to dimension tables rather than strictly adhering to BCNF constraints.

Q53 Medium

What is the primary purpose of implementing slowly changing dimensions (SCDs) in a data warehouse?

  • A To eliminate the need for surrogate keys in the warehouse schema
  • B To automatically partition dimension tables across multiple databases
  • C To reduce the physical storage space required for dimension tables
  • D To track historical changes in dimension attributes while maintaining data accuracy and allowing historical analysis ✓ Correct
Explanation

SCDs enable the data warehouse to maintain historical context by tracking how dimension attributes change over time. This allows analysts to perform accurate historical analysis and understand business evolution.
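
The most common variant of this tracking is SCD Type 2: expire the current dimension row and insert a new current version. A minimal sketch, assuming each row carries an effective date, expiry date, and current flag (all field names here are illustrative):

```python
from datetime import date

def scd2_update(dim_rows, natural_key, new_attrs, today):
    """Type 2 change: expire the current row, insert a new current version."""
    for row in dim_rows:
        if row["natural_key"] == natural_key and row["is_current"]:
            if all(row.get(k) == v for k, v in new_attrs.items()):
                return  # no attribute change, nothing to do
            row["is_current"] = False
            row["expiry_date"] = today
            break
    dim_rows.append({
        "surrogate_key": max((r["surrogate_key"] for r in dim_rows), default=0) + 1,
        "natural_key": natural_key,
        **new_attrs,
        "effective_date": today,
        "expiry_date": None,
        "is_current": True,
    })

dim = [{"surrogate_key": 1, "natural_key": "C42", "city": "Austin",
        "effective_date": date(2020, 1, 1), "expiry_date": None,
        "is_current": True}]
scd2_update(dim, "C42", {"city": "Denver"}, date(2024, 6, 1))
# dim now holds two versions: the expired Austin row and a current Denver row
```

Because each version gets its own surrogate key, existing fact rows keep pointing at the attribute values that were true when those facts occurred.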

Q54 Medium

In a star schema, a conformed dimension is one that:

  • A Is stored in third normal form to minimize update anomalies
  • B Contains only attributes relevant to a single specific fact table
  • C Has the same structure and content across multiple fact tables and data marts ✓ Correct
  • D Uses only numerical identifiers instead of descriptive attributes
Explanation

Conformed dimensions are shared across multiple fact tables and maintain consistency in structure and content, enabling drill-across analytics and ensuring semantic consistency across the warehouse.

Q55 Hard

Which ETL technique is most appropriate for handling late-arriving facts in a data warehouse?

  • A Delay the entire ETL load process until all expected data sources have provided their updates
  • B Reject all late-arriving facts and log them as errors for manual correction
  • C Automatically assign default or placeholder dimension keys for missing dimension references
  • D Implement a staging area that temporarily holds incomplete records until all dimensions are available ✓ Correct
Explanation

A staging area buffers late-arriving facts, allowing them to be matched with their corresponding dimensions once they become available, maintaining data quality without rejecting valid data.
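
The staging pattern can be sketched as a simple routing step: facts whose dimension key resolves are loaded, the rest are parked and re-driven later. Names and values below are illustrative:

```python
def load_facts(incoming, dim_lookup, staging, fact_table):
    """Route facts whose dimension key resolves; stage the rest for retry."""
    for fact in incoming:
        key = dim_lookup.get(fact["product_code"])
        if key is None:
            staging.append(fact)  # dimension row has not arrived yet
        else:
            fact_table.append({**fact, "product_key": key})

dim_lookup = {"P1": 101}  # P2's dimension row is late
staging, fact_table = [], []
load_facts([{"product_code": "P1", "qty": 3},
            {"product_code": "P2", "qty": 5}],
           dim_lookup, staging, fact_table)

# Later, P2's dimension row lands; re-drive the staged facts.
dim_lookup["P2"] = 102
retry, staging = staging, []
load_facts(retry, dim_lookup, staging, fact_table)
# fact_table now holds both facts; staging is empty
```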

Q56 Hard

What is a junk dimension and why is it used in data warehouse design?

  • A A low-cardinality dimension that combines multiple unrelated yes/no or flag attributes to reduce fact table grain and improve query performance ✓ Correct
  • B Any dimension that has not been properly normalized according to third normal form standards
  • C A dimension table containing obsolete or archived data from previous warehouse versions
  • D A dimension table that contains only primary keys without any descriptive attributes
Explanation

Junk dimensions consolidate multiple low-cardinality flag or indicator attributes (like yes/no fields) into a single dimension table, reducing the number of foreign keys in the fact table and improving performance.
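
The consolidation works because the cross product of a handful of flags is tiny. A sketch with three hypothetical order flags, whose 2 × 2 × 2 = 8 combinations replace three separate fact-table columns with one small foreign key:

```python
from itertools import product

# Build a junk dimension from the cross product of low-cardinality flags.
flags = {
    "is_gift": [False, True],
    "is_express": [False, True],
    "is_promo": [False, True],
}
junk_dim = {}  # (is_gift, is_express, is_promo) -> surrogate key
for key, combo in enumerate(product(*flags.values()), start=1):
    junk_dim[combo] = key  # 8 rows cover every possible flag combination

def junk_key(is_gift, is_express, is_promo):
    return junk_dim[(is_gift, is_express, is_promo)]

# A fact row stores one small foreign key instead of three flag columns.
fact = {"order_id": 1, "junk_key": junk_key(True, False, True)}
```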

Q57 Medium

In the context of data warehouse incremental loading, what does a full outer join in a merge process typically accomplish?

  • A It restricts the load process to only include records from the most recent source extraction
  • B It removes duplicate records from the source before loading begins
  • C It identifies rows that exist in either the source or target table and determines which records need to be inserted, updated, or deleted ✓ Correct
  • D It ensures that only records matching on all columns are considered for an update operation
Explanation

A full outer join returns all rows present in either dataset, so comparing source and target this way detects the inserts, updates, and deletes needed for efficient incremental loading in a single pass.
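
The same classification can be expressed in plain Python with set operations over business keys, which mirrors what the full outer join accomplishes in SQL (the sample data is illustrative):

```python
def classify_changes(source, target):
    """Full-outer-join-style comparison keyed on the business key.

    Returns (inserts, updates, deletes): keys only in source, keys in both
    whose values differ, and keys only in target.
    """
    src_keys, tgt_keys = set(source), set(target)
    inserts = src_keys - tgt_keys
    deletes = tgt_keys - src_keys
    updates = {k for k in src_keys & tgt_keys if source[k] != target[k]}
    return inserts, updates, deletes

source = {"A": 10, "B": 20, "C": 30}
target = {"B": 20, "C": 25, "D": 40}
ins, upd, dele = classify_changes(source, target)
# ins == {"A"}, upd == {"C"}, dele == {"D"}; "B" is unchanged
```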

Q58 Hard

Which approach best describes the role of a bridge table in a data warehouse?

  • A A temporary table used during the ETL process to stage data before it is moved to permanent warehouse tables
  • B A table that combines multiple source systems into a single unified data model representation
  • C A table that resolves many-to-many relationships between a fact table and dimension table by storing the Cartesian product of both ✓ Correct
  • D A lookup table that maintains the history of all ETL job executions and their status codes
Explanation

Bridge tables resolve many-to-many relationships by storing one row per valid pairing, often with a weighting factor to prevent double-counting, allowing proper dimensional analysis when a single fact relates to multiple dimension members.
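
A classic case is a joint bank account relating one fact row to several customers. A sketch of a bridge with weighting factors (accounts, names, and weights are illustrative):

```python
# One row per (account, customer) pairing; weights sum to 1.0 per account
# so allocated rollups don't double-count the balance.
bridge = [
    {"account": "ACC1", "customer": "Ann", "weight": 0.5},
    {"account": "ACC1", "customer": "Bob", "weight": 0.5},
    {"account": "ACC2", "customer": "Ann", "weight": 1.0},
]
facts = [{"account": "ACC1", "balance": 1000},
         {"account": "ACC2", "balance": 400}]

# Allocate each balance to customers through the bridge.
per_customer = {}
for f in facts:
    for b in bridge:
        if b["account"] == f["account"]:
            per_customer[b["customer"]] = (
                per_customer.get(b["customer"], 0)
                + f["balance"] * b["weight"])

# per_customer == {"Ann": 900.0, "Bob": 500.0}; the 1400 total is preserved
```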

Q59 Medium

What is the primary benefit of using a data warehouse appliance compared to a traditional relational database for analytical workloads?

  • A Appliances provide built-in data quality tools that completely eliminate the need for data validation
  • B Data warehouse appliances automatically generate BI reports without requiring any user configuration
  • C Specialized hardware and columnar storage in appliances optimize query performance for analytical queries on large datasets ✓ Correct
  • D Data warehouse appliances eliminate the need for data modeling and schema design
Explanation

Appliances combine specialized hardware with columnar storage and advanced indexing to dramatically accelerate analytical queries compared to traditional row-oriented relational databases.

Q60 Medium

In master data management (MDM) for data warehousing, what is the primary challenge addressed by implementing an MDM solution?

  • A Eliminating the need to store historical data in the warehouse by maintaining only current versions
  • B Automatically converting all data to a standardized format without requiring any transformation logic
  • C Creating a single, authoritative source of truth for core business entities across multiple systems and data marts ✓ Correct
  • D Reducing the number of fact tables required in the warehouse schema design
Explanation

MDM establishes a single authoritative version of critical master data (customers, products, etc.) across the enterprise, ensuring consistency and accuracy in analytical reporting.

Q61 Hard

Which of the following scenarios would most strongly justify using a snowflake schema instead of a star schema in a data warehouse?

  • A When the data warehouse supports only a single subject area with no cross-functional analysis requirements
  • B When dimension tables are extremely large, contain many hierarchical levels, or when minimizing redundancy and storage space is critical ✓ Correct
  • C When query performance on fact tables is the primary concern and storage capacity is unlimited
  • D When all dimension attributes are single-valued and have low cardinality across the entire dataset
Explanation

Snowflaking normalizes dimension tables to save storage and reduce redundancy, beneficial when dealing with large hierarchical dimensions; however, it increases join complexity and query overhead.

Ready to test your knowledge?

You've reviewed all 61 questions. Take the interactive practice exam to simulate the real test environment.

▶ Start Practice Exam — Free