Estimation of Gaussian directed acyclic graphs using partial ordering information with applications to DREAM3 networks and dairy cattle data

Abstract

Estimating a directed acyclic graph (DAG) from observational data is a canonical learning problem that has attracted considerable interest in recent years. Research has focused mostly on two cases: when no information about the ordering of the nodes in the DAG is available, and when a domain-specific complete ordering of the nodes is available. In this paper, motivated by a recent application in dairy science, we develop a method for DAG estimation in the intermediate scenario, where a partition-based partial ordering of the nodes is known from domain-specific knowledge. We develop an efficient algorithm, coined Partition-DAG, that solves the posited problem. Through extensive simulations using the DREAM3 Yeast networks, we illustrate that Partition-DAG effectively incorporates the partial ordering information to improve both speed and accuracy. We then illustrate the usefulness of Partition-DAG by applying it to recently collected dairy cattle data and inferring relationships between various variables involved in dairy agroecosystems.

Publication
Annals of Applied Statistics