Tuesday, October 21, 2014

MCQ for Data Warehouse and Mining

1. Which of the following is the most important when deciding on the data structure of a data mart?
(a) XML data exchange standards
(b) Data access tools to be used
(c) Metadata naming conventions
(d) Extract, Transform, and Load (ETL) tool to be used
(e) All (a), (b), (c) and (d) above.

2. The process of removing the deficiencies and loopholes in the data is called
(a) Aggregation of data
(b) Extracting of data
(c) Cleaning up of data.
(d) Loading of data
(e) Compression of data.

3. Which one manages both current and historic transactions?
(a) OLTP
(b) OLAP
(c) Spreadsheet
(d) XML
(e) All (a), (b), (c) and (d) above.

4. Which of the following is the collection of data objects that are similar to one another within the same group?
(a) Partitioning
(b) Grid
(c) Cluster
(d) Table
(e) Data source.

5. Which of the following employs data mining techniques to analyze the intent of a user query and provide additional generalized or associated information relevant to the query?
(a) Iceberg query method
(b) Data analyzer
(c) Intelligent query answering 
(d) DBA
(e) Query parser.

6. Which of the following processes includes data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation and knowledge presentation?
(a) KDD process
(b) ETL process
(c) KTL process
(d) MDX process
(e) None of the above.

7. At which level can we create dimensional models?
(a) Business requirements level
(b) Architecture models level
(c) Detailed models level
(d) Implementation level
(e) Testing level.

8. Which of the following is not related to dimension table attributes?
(a) Verbose
(b) Descriptive
(c) Equally unavailable
(d) Complete
(e) Indexed.

9. Data warehouse bus matrix is a combination of
(a) Dimensions and data marts
(b) Dimensions and facts
(c) Facts and data marts
(d) Dimensions and detailed facts
(e) All (a), (b), (c) and (d) above.

10. Which of the following is not a managing issue in the modeling process?
(a) Content of primary units column 
(b) Document each candidate data source
(c) Do regions report to zones
(d) Walk through business scenarios
(e) Ensure that the transaction edit flat is used for analysis.

11. Data modeling technique used for data marts is
(a) Dimensional modeling
(b) ER – model
(c) Extended ER – model
(d) Physical model
(e) Logical model.

12. A warehouse architect is trying to determine what data must be included in the warehouse. A meeting has been arranged with a business analyst to understand the data requirements, which of the following should be included in the agenda?
(a) Number of users
(b) Corporate objectives
(c) Database design
(d) Routine reporting
(e) Budget.

13. An OLAP tool provides for
(a) Multidimensional analysis
(b) Roll-up and drill-down
(c) Slicing and dicing
(d) Rotation
(e) Setting up only relations.

14. A synonym for data mining is
(a) Data warehouse
(b) Knowledge discovery in database
(c) ETL
(d) Business intelligence
(e) OLAP.

15. Which of the following statements is true?
(a) A fact table describes the transactions stored in a DWH
(b) A fact table describes the granularity of data held in a DWH
(c) The fact table of a data warehouse is the main store of descriptions of the transactions stored in a DWH
(d) The fact table of a data warehouse is the main store of all of the recorded transactions over time
(e) A fact table maintains the old records of the database.

16. The most common kind of queries in a data warehouse is
(a) Inside-out queries
(b) Outside-in queries
(c) Browse queries
(d) Range queries
(e) All (a), (b), (c) and (d) above.

17. Concept description is the basic form of the
(a) Predictive data mining
(b) Descriptive data mining
(c) Data warehouse
(d) Relational database
(e) Proactive data mining.

18. The apriori property means
(a) If a set cannot pass a test, all of its supersets will fail the same test as well
(b) To improve the efficiency of the level-wise generation of frequent item sets
(c) If a set can pass a test, all of its supersets will fail the same test as well
(d) To decrease the efficiency of the level-wise generation of frequent item sets
(e) All (a), (b), (c) and (d) above.

19. Which of the following forms the set of data created to support a specific short-lived business situation?
(a) Personal data marts
(b) Application models
(c) Downstream systems
(d) Disposable data marts
(e) Data mining models.

20. What is/are the different types of metadata?
I. Administrative.
II. Business.
III. Operational.
(a) Only (I) above
(b) Both (II) and (III) above
(c) Both (I) and (II) above
(d) Both (I) and (III) above
(e) All (I), (II) and (III) above.

21. Multiple Regression means
(a) Data are modeled using a straight line
(b) Data are modeled using a curve line
(c) Extension of linear regression involving only one predictor variable
(d) Extension of linear regression involving more than one predictor variable
(e) All (a), (b), (c) and (d) above.

22. Which of the following should not be considered for each dimension attribute?
(a) Attribute name
(b) Rapid changing dimension policy
(c) Attribute definition
(d) Sample data
(e) Cardinality.

23. A Business Intelligence system requires data from:
(a) Data warehouse
(b) Operational systems
(c) All possible sources within the organization and possibly from external sources
(d) Web servers
(e) Database servers.

24. Data mining application domains are
(a) Biomedical 
(b) DNA data analysis
(c) Financial data analysis
(d) Retail industry and telecommunication industry
(e) All (a), (b), (c) and (d) above.

25. The generalization of multidimensional attributes of a complex object class, performed by examining each attribute, generalizing each attribute to simple-value data and constructing a multidimensional data cube, is called
(a) Object cube
(b) Relational cube
(c) Transactional cube
(d) Tuple
(e) Attribute.

26. Which of the following describes a project that builds a data mart for a business process/department that is very critical for your organization?
(a) High risk high reward
(b) High risk low reward
(c) Low risk low reward
(d) Low risk high reward
(e) Involves high risks.

27. Which of the following tools will a business intelligence system have?
(a) OLAP tool
(b) Data mining tool
(c) Reporting tool
(d) Both (a) and (b) above
(e) (a), (b) and (c) above.

28. Which of the following is/are the Data mining tasks?
(a) Regression
(b) Classification
(c) Clustering
(d) Inference of association rules
(e) All (a), (b), (c) and (d) above.

29. In a data warehouse, if D1 and D2 are two conformed dimensions, then
(a) D1 may be an exact replica of D2
(b) D1 may be at a rolled up level of granularity compared to D2
(c) Columns of D1 may be a subset of D2 and vice versa
(d) Rows of D1 may be a subset of D2 and vice versa
(e) All (a), (b), (c) and (d) above.

30. Which of the following is not an ETL tool?
(a) Informatica
(b) Oracle warehouse builder
(c) Datastage
(d) Visual studio
(e) DT/studio.

Answers and Reasons


      
1. B The data access tools to be used are the most important consideration when deciding on the data structure of a data mart.

2. C The process of removing the deficiencies and loopholes in the data is called cleaning up of data.
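
As a rough illustration of answer 2, the sketch below cleans a handful of records in plain Python. The field names (`name`, `city`), the `"Unknown"` placeholder, and the cleaning rules themselves are assumptions made for this example; real cleaning rules depend on the source data.

```python
def clean_records(records):
    """Return cleaned copies of the input rows (hypothetical cleaning rules)."""
    cleaned = []
    seen = set()
    for row in records:
        # Normalize whitespace and letter case; treat None as an empty string.
        name = (row.get("name") or "").strip().title()
        city = (row.get("city") or "").strip().title()
        # Skip rows whose key field (name) is missing.
        if not name:
            continue
        # Replace a missing city with an explicit placeholder.
        if not city:
            city = "Unknown"
        # Drop rows that duplicate an earlier row after normalization.
        key = (name, city)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "city": city})
    return cleaned


raw = [
    {"name": "  alice ", "city": "pune"},
    {"name": "Alice", "city": "Pune "},
    {"name": "", "city": "Delhi"},
    {"name": "bob", "city": None},
]
result = clean_records(raw)
print(result)
```

Of the four raw rows, the second is dropped as a duplicate of the first and the third is dropped for lacking a name, leaving two cleaned rows.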

3. B Online Analytical Processing (OLAP) manages both current and historic transactions.
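
To make the idea of analysis over current and historic data a little more concrete, here is a minimal sketch of two common OLAP operations, roll-up and slice, over a tiny in-memory fact list. The rows, the column order, and the function names are all hypothetical; an actual OLAP server would work against a cube or a star schema rather than a Python list.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, sales_amount).
facts = [
    (2013, "Q1", "North", 100),
    (2013, "Q2", "North", 150),
    (2013, "Q1", "South", 80),
    (2014, "Q1", "North", 120),
    (2014, "Q1", "South", 90),
]


def roll_up_by_year(rows):
    """Roll-up: aggregate sales from quarter/region detail up to the year level."""
    totals = defaultdict(int)
    for year, _quarter, _region, amount in rows:
        totals[year] += amount
    return dict(totals)


def slice_by_region(rows, region):
    """Slice: fix one value of the region dimension and keep the matching rows."""
    return [row for row in rows if row[2] == region]


yearly = roll_up_by_year(facts)
north_yearly = roll_up_by_year(slice_by_region(facts, "North"))
print(yearly)
print(north_yearly)
```

The first result covers both the historic year (2013) and the more recent one (2014); the second applies the same roll-up to the North slice only.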

4. C Cluster is the collection of data objects that are similar to one another within the same group.

5. C Intelligent query answering employs data mining techniques to analyze the intent of a user query and provide additional generalized or associated information relevant to the query.

6. A The KDD process includes data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation, and knowledge presentation.

7. B Dimensional models can be created at Architecture models level.

8. C Equally unavailable is not related to dimension table attributes.

9. A Data warehouse bus matrix is a combination of Dimensions and data marts.

10. E Ensuring that the transaction edit flat is used for analysis is not a managing issue in the modeling process.

11. A Data modeling technique used for data marts is Dimensional modeling.
12. D Routine reporting should be included in the agenda.
13. C An OLAP tool provides for Slicing and dicing.
14. B A synonym for data mining is knowledge discovery in databases.
15. D The fact table of a data warehouse is the main store of all of the recorded transactions over time is the correct statement.
16. A The Most common kind of queries in a data warehouse is Inside-out queries.
17. B Concept description is the basic form of descriptive data mining.
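18. A The apriori property means that if a set cannot pass a test, all of its supersets will fail the same test as well. The property is used to improve the efficiency of the level-wise generation of frequent item sets.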
19. D Disposable data marts form the set of data created to support a specific short-lived business situation.
20. E The different types of metadata are Administrative, Business and Operational.
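
For question 18, the level-wise generation of frequent item sets can be sketched as follows. This is a simplified illustration, not a tuned Apriori implementation: the function name, the sample baskets, and the choice of an absolute count for `min_support` are assumptions made for the example. The prune step is where the apriori property applies: a candidate is discarded without counting if any of its subsets is already known to be infrequent.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search; min_support is an absolute count (an assumption here)."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support]
    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        frequent_k = set(current)
        # Join step: combine frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Prune step (apriori property): a candidate with any infrequent
        # k-subset cannot be frequent, so it is dropped before counting.
        candidates = [c for c in candidates
                      if all(frozenset(sub) in frequent_k
                             for sub in combinations(c, k))]
        current = [c for c in candidates if support(c) >= min_support]
        k += 1
    return result


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
found = frequent_itemsets(baskets, min_support=2)
for itemset, count in found.items():
    print(sorted(itemset), count)
```

In this data the pair {milk, butter} appears only once, so the three-item candidate {bread, milk, butter} is pruned without scanning the transactions again.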

21. D Multiple regression is an extension of linear regression involving more than one predictor variable.
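
A minimal sketch of multiple regression with two predictors, fitted by solving the normal equations in plain Python. The function name and the sample data are made up for illustration, and the code assumes the predictors are not perfectly collinear (otherwise the elimination step would divide by zero); a numerical library would normally be used instead.

```python
def fit_multiple_regression(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bp*xp via the normal equations."""
    # Prepend a constant 1.0 to each row so that b0 acts as the intercept.
    rows = [[1.0] + [float(v) for v in r] for r in X]
    n = len(rows[0])
    # Build the augmented system [A^T A | A^T y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * t for r, t in zip(rows, y))]
           for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Sample data generated from y = 1 + 2*x1 + 3*x2 (two predictor variables).
X = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
coefficients = fit_multiple_regression(X, y)
print([round(c, 6) for c in coefficients])
```

Because the sample data lie exactly on the plane, the fitted coefficients recover the intercept 1 and the slopes 2 and 3 up to floating-point rounding.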

22. B Rapid changing dimension policy should not be considered for each dimension attribute.

23. A A business intelligence system requires data from the data warehouse.

24. E Data mining application domains are biomedical, DNA data analysis, financial data analysis, and the retail and telecommunication industries.

25. A The generalization of multidimensional attributes of a complex object class, performed by examining each attribute, generalizing each attribute to simple-value data and constructing a multidimensional data cube, is called an object cube.

26. A Building a data mart for a business process/department that is very critical for your organization is a high risk, high reward project.

27. E A business intelligence system will have OLAP, data mining and reporting tools.

28. E Regression, classification, clustering and inference of association rules are all data mining tasks.

29. A In a data warehouse, if D1 and D2 are two conformed dimensions, then D1 may be an exact replica of D2.

30. D Visual Studio is not an ETL tool.
