Most Important Long Questions
-
Introduce the concept of data mining and cite two application areas. What are the different steps of a data mining task? Suppose that the data mining task is to cluster the following eight points (with (x, y) representing location) into 3 clusters. ... Use the k-means algorithm to determine the three clusters.
- WBUT Years: 2010, 2012, 2013, 2014, 2016, 2018
- Key Concepts Covered: Data Mining introduction, KDD process steps, K-means clustering algorithm (with numerical example).
- Source Page: 9 (Long Answer Type Questions, Q3)
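The numerical part of this question can be rehearsed with a small sketch of k-means (Lloyd's algorithm). The eight points are elided in the question above, so the `points` and `initial` centroids below are hypothetical placeholders, not the exam data; the function name `kmeans` is also just an illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+

def kmeans(points, centroids, max_iter=100):
    """Plain k-means on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                mx = sum(x for x, _ in members) / len(members)
                my = sum(y for _, y in members) / len(members)
                new_centroids.append((mx, my))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:     # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data: eight points and three initial centroids (assumed).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9), (1, 9), (2, 9)]
initial = [(1, 1), (8, 8), (1, 9)]
centroids, clusters = kmeans(points, initial)
for c, members in zip(centroids, clusters):
    print(c, members)
```

In a written answer the same two steps (assign, then recompute means) are repeated by hand until the cluster memberships stop changing.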
-
What is Clustering? Why is it required for data warehousing and data mining? What are the differences between partitioning clustering and Hierarchical clustering?
- WBUT Years: 2009, 2013, 2014, 2016, 2018
- Key Concepts Covered: Definition of clustering, its importance in DW/DM, comparison of major clustering paradigms (partitioning vs. hierarchical).
- Source Page: 26 (Long Answer Type Questions, Q1a) and 28 (Long Answer Type Questions, Q2a)
-
Define decision tree. What are the advantages and disadvantages of the decision tree approach over other approaches for data mining? Discuss briefly the tree construction principle.
- WBUT Years: 2009, 2010, 2013
- Key Concepts Covered: Decision tree definition, pros and cons, core algorithm/principles of tree construction.
- Source Page: 35 (Long Answer Type Questions, Q1a, b, c)
-
Define Data Warehouse and briefly describe its characteristics.
- WBUT Years: 2009, 2018
- Key Concepts Covered: Core definition of a data warehouse, and its defining characteristics (subject-oriented, integrated, time-variant, non-volatile).
- Source Page: 8 (Long Answer Type Questions, Q1)
-
In data warehouse technology, explain the ROLAP, MOLAP and HOLAP techniques of implementing a multidimensional view. OR, Compare HOLAP, ROLAP and MOLAP.
- WBUT Years: 2009, 2013
- Key Concepts Covered: Different OLAP server architectures, their working principles, advantages, and disadvantages.
- Source Page: 52 (Long Answer Type Questions, Q1)
-
What is tree pruning? What are the different tree pruning techniques? Evaluate Information Gain and Gain Ratio with a suitable example.
- WBUT Years: 2013, 2014
- Key Concepts Covered: Concept of decision tree pruning (to avoid overfitting), types of pruning (pre-pruning, post-pruning), and evaluation metrics like Information Gain and Gain Ratio.
- Source Page: 53 (Long Answer Type Questions, Q3a, b)
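For the "suitable example" part, a minimal sketch of the two measures may help: Information Gain is the class entropy minus the weighted entropy after splitting on an attribute, and Gain Ratio divides that gain by the split information (the entropy of the attribute's own value distribution). The helper names and the tiny `outlook`/`play` columns below are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain_and_ratio(values, labels):
    """Return (information gain, gain ratio) for one categorical attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class after splitting on the attribute.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio

# Hypothetical toy data: one attribute column and the class column.
outlook = ["sunny", "sunny", "rain", "rain"]
play = ["no", "no", "yes", "yes"]
gain, ratio = info_gain_and_ratio(outlook, play)
print(gain, ratio)
```

Here the attribute separates the classes perfectly, so the gain equals the full class entropy of 1 bit; an attribute whose groups each contain an even class mix would give a gain of 0.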
-
Introduce the concepts of Support, Confidence and Frequent Itemset, and then give a formal definition of an Association Rule. Generate all Frequent Itemsets from the following transaction data given minimum support = 0.3. Find the Association Rules from the above Frequent Itemsets at a minimum confidence of 50%.
- WBUT Years: 2010
- Key Concepts Covered: Fundamental definitions of association rule mining (Support, Confidence, Frequent Itemsets), and practical application of finding them (implicitly Apriori algorithm).
- Source Page: 31 (Long Answer Type Questions, Q3)
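A level-wise (Apriori-style) search can be sketched as follows. The transaction table from the question is not reproduced above, so the `transactions` list is a hypothetical stand-in; only the thresholds (minimum support 0.3, minimum confidence 50%) come from the question. Function names are illustrative.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise search: frequent k-itemsets generate (k+1)-candidates."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items]
    frequent = {}
    while level:
        current = {}
        for s in level:
            sup = support(s, transactions)
            if sup >= min_support:
                current[s] = sup
        frequent.update(current)
        # Join step: unions of two frequent k-itemsets that have k+1 items.
        candidates = {a | b for a, b in combinations(current, 2)
                      if len(a | b) == len(a) + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in current
                        for s in combinations(c, len(c) - 1))]
    return frequent

def association_rules(frequent, min_conf):
    """Rules lhs -> rhs with confidence = support(lhs u rhs) / support(lhs)."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[lhs]
                if conf >= min_conf:
                    rules.append((lhs, itemset - lhs, conf))
    return rules

# Hypothetical transactions (the exam's table is not shown in this post).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
frequent = frequent_itemsets(transactions, min_support=0.3)
rules = association_rules(frequent, min_conf=0.5)
for lhs, rhs, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

Looking up `frequent[lhs]` is safe because every subset of a frequent itemset is itself frequent (the Apriori property), so it was recorded at an earlier level.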
-
Discuss the principle of FP-tree Growth algorithm. OR, Discuss the different phases of FP-tree growth algorithm. OR, Define FP tree. Discuss the method of computing FP tree.
- WBUT Years: 2010, 2012, 2018
- Key Concepts Covered: Working principle of the FP-Growth algorithm, including FP-tree construction and pattern mining without candidate generation.
- Source Page: 40 (Long Answer Type Questions, Q3b)
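The construction phase of FP-Growth can be sketched as below: count item frequencies, drop infrequent items, sort each transaction by descending frequency, and insert it into a prefix tree whose shared prefixes accumulate counts. This covers tree building only, not the recursive mining of conditional pattern bases. The class `Node`, the function `build_fp_tree`, the tie-breaking rule (alphabetical) and the transactions are all assumptions made for illustration.

```python
from collections import Counter

class Node:
    """One FP-tree node: an item, a count, and child links."""
    def __init__(self, item, parent=None):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_count):
    # Pass 1: count how many transactions contain each item.
    counts = Counter(i for t in transactions for i in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    root = Node(None)
    header = {i: [] for i in frequent}  # item -> all nodes carrying it
    # Pass 2: insert each transaction, most frequent items first.
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1  # shared prefixes accumulate counts
            node = child
    return root, header

# Hypothetical transactions, minimum support count of 2 (assumed).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
root, header = build_fp_tree(transactions, min_count=2)
print({item: node.count for item, node in root.children.items()})
```

The `header` table plays the role of the node links in the textbook description: following it for an item reaches every branch that contains that item, which is where the mining phase would start.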
-
Explain Slicing, Dicing, Roll-up and Drill-down with a suitable example.
- WBUT Years: 2017
- Key Concepts Covered: Core OLAP operations for multidimensional data analysis, with examples.
- Source Page: 58 (Long Answer Type Questions, Q6)
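One possible "suitable example" uses a tiny sales cube. The fact rows, dimension names and the `aggregate` helper below are hypothetical; the sketch only shows what each operation does to the data, not how an OLAP server implements it.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, city, product, units sold).
DIMS = ("year", "quarter", "city", "product")
facts = [
    (2023, "Q1", "Kolkata", "pen", 10),
    (2023, "Q2", "Kolkata", "pen", 15),
    (2023, "Q1", "Delhi", "pen", 7),
    (2023, "Q1", "Delhi", "ink", 4),
    (2024, "Q1", "Kolkata", "ink", 5),
]

def aggregate(rows, dims):
    """Sum the measure (last column) grouped by the chosen dimensions."""
    idx = [DIMS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

# Slice: fix a single value on one dimension (year = 2023).
slice_2023 = [r for r in facts if r[0] == 2023]

# Dice: select a sub-cube with conditions on two or more dimensions.
dice = [r for r in facts if r[1] == "Q1" and r[3] == "pen"]

# Roll-up: climb the time hierarchy from (year, quarter) to year.
by_year = aggregate(facts, ("year",))

# Drill-down: the reverse, from year back down to (year, quarter).
by_quarter = aggregate(facts, ("year", "quarter"))

print(by_year)
print(by_quarter)
```

Roll-up and drill-down are the same aggregation at different levels of a dimension hierarchy, which is why a single `aggregate` helper serves both in this sketch.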
These questions cover the foundational concepts and the most frequently tested algorithms and techniques in Data Warehousing and Data Mining, based on WBUT previous years' question papers.