Learning Attribute Hierarchies from Data: Exploratory Approaches
In cognitive diagnostic assessment in education, multiple fine-grained attributes are measured simultaneously. Attribute hierarchies are considered important structural features of cognitive diagnostic models (CDMs) that provide useful information about the nature of attributes. Templin and Bradshaw (2014) first introduced a hierarchical diagnostic classification model (HDCM) that directly takes attribute hierarchies into account, so the HDCM is nested within more general CDMs. They also formulated an empirically driven hypothesis test to statistically test one hypothesized link (between two attributes) at a time. However, their likelihood ratio test statistic does not have a known reference distribution, which makes hypothesis testing cumbersome at scale. Instead, we studied two exploratory approaches that can learn attribute hierarchies directly from data, namely, the latent variable selection (LVS) approach and the regularized latent class modeling (RLCM) approach. An identification constraint was proposed for the LVS approach. Simulation results revealed that both approaches could successfully identify different types of attribute hierarchies when the underlying CDM is either the DINA model or the saturated log-linear CDM (LCDM). The LVS approach outperformed the RLCM approach, especially as the total number of attributes increased. An illustrative example using data from the Examination for the Certificate of Proficiency in English (ECPE) is provided. Possible analyses for longitudinal learning data are discussed, along with future directions.
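To make the notion of an attribute hierarchy concrete, the following minimal sketch (an illustration, not code from the study) enumerates which attribute profiles remain permissible once a hierarchy is imposed. Under a hierarchy, mastery of an attribute implies mastery of all its prerequisites, so the latent class space shrinks from all 2^K binary profiles to a strict subset; the `prereqs` encoding below is a hypothetical representation chosen for clarity.

```python
from itertools import product

def permissible_profiles(K, prereqs):
    """Enumerate binary attribute profiles consistent with a hierarchy.

    K       : total number of attributes.
    prereqs : dict mapping an attribute index to the set of indices of
              its prerequisite attributes (0-based).
    A profile (a_1, ..., a_K) is permissible if, whenever an attribute
    is mastered, all of its prerequisites are mastered as well.
    """
    keep = []
    for a in product([0, 1], repeat=K):
        # Attribute k is admissible if it is unmastered, or all its
        # prerequisites are mastered.
        if all(a[k] == 0 or all(a[p] == 1 for p in prereqs.get(k, ()))
               for k in range(K)):
            keep.append(a)
    return keep

# Linear hierarchy A1 -> A2 -> A3: A2 requires A1, A3 requires A2.
linear = {1: {0}, 2: {1}}
print(permissible_profiles(3, linear))
# Only 4 of the 2^3 = 8 profiles survive: (0,0,0), (1,0,0), (1,1,0), (1,1,1).
```

For a linear hierarchy over K attributes, only K + 1 profiles remain permissible, which is exactly the kind of structural restriction the exploratory LVS and RLCM approaches aim to recover from data.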