Unsupervised learning of finite mixture models in data mining applications. The extraction of useful information from very large databases is known as data mining. Its broad but vague goal is to find "interesting structure" in the data, which typically means partitioning the data into clusters. To this end, we consider the fast, efficient, and automatic learning of finite mixture models in huge data sets without any prior knowledge of their structure. This probabilistic approach to the discovery and validation of group structure in data mining applications will considerably enhance knowledge management and decision support in science, industry, and government.
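As an illustration only (the abstract does not fix a method), a standard way to learn a finite mixture automatically is to fit Gaussian mixtures by EM for several candidate component counts and keep the one with the lowest Bayesian Information Criterion. The sketch below does this with scikit-learn on synthetic data; all names and parameter choices in it are ours, not the project's.

```python
# Illustrative sketch (not the project's algorithm): fit Gaussian
# mixture models by EM and choose the number of clusters with BIC,
# a standard way to learn mixture structure with no prior knowledge.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data: two well-separated 2-D clusters.
X = np.vstack([rng.normal(0.0, 1.0, size=(500, 2)),
               rng.normal(6.0, 1.0, size=(500, 2))])

# Fit mixtures with k = 1..5 components and keep the model whose
# Bayesian Information Criterion (BIC) is lowest.
models = [GaussianMixture(n_components=k, random_state=0).fit(X)
          for k in range(1, 6)]
best = min(models, key=lambda m: m.bic(X))
print("components chosen by BIC:", best.n_components)
labels = best.predict(X)   # cluster assignment for each point
```

BIC penalizes extra components, so this selects the smallest mixture that explains the data well, which is one common way to make the "number of clusters" decision automatic.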
Coarse Grained Parallel Algorithms. Various fields of research face barriers created by problems that are computationally hard and/or require processing of large amounts of data. For example, some computational biochemistry methods on protein or gene sequences cannot be scaled up to the data sets required for human health research because of performance problems. Parallel computing enables new research by increasing the size of solvable problems. In addition to fundamental parallel computing research, this project studies parallel algorithms for structure-based drug design and protein-protein interaction prediction that will enable new biochemistry research, as well as parallel algorithms for data cubes that will help enable the next generation of very large data warehouses.
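A minimal sketch of the coarse-grained pattern, applied to a data-cube-style aggregation (the partitioning scheme and the GROUP BY example here are our illustrative choices, not the project's algorithms): each of p workers performs a purely local aggregation over its block of rows, and the partial results are merged in a single reduction round.

```python
# Hypothetical illustration of a coarse-grained parallel aggregation:
# partition rows among p workers, aggregate locally, merge once.
from collections import Counter
from multiprocessing import Pool

def local_aggregate(rows):
    """One round of purely local computation on a worker's block."""
    counts = Counter()
    for region, product, sales in rows:
        counts[(region, product)] += sales   # partial GROUP BY sum
    return counts

if __name__ == "__main__":
    rows = [("east", "a", 3), ("west", "a", 5),
            ("east", "b", 2), ("east", "a", 4)]
    p = 2                                    # number of processors
    blocks = [rows[i::p] for i in range(p)]  # round-robin partition
    with Pool(p) as pool:
        partials = pool.map(local_aggregate, blocks)
    cube = sum(partials, Counter())          # merge partial sums
    print(dict(cube))
```

The point of the coarse-grained model is visible even at this scale: communication happens in a constant number of rounds (here, one merge), so almost all time is spent in local computation.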
Efficient Pre-Processing of Hard Problems: New Approaches, Basic Theory and Applications. Computers store ever larger amounts of data about all aspects of human and industrial activity. However, they have not become significantly better at solving common problems in optimization and search. Traditional complexity theory indicates that many of these problems require algorithms that are very unlikely to exist. The Parameterized Complexity approach allows us to obtain very efficient algorithms for a large variety of problems, but the machinery required was diverse and complicated. This research will organize that machinery into a new approach that systematically finds good algorithms by applying simplifications around a parameter of the problem domain. As a result, efficient algorithms are obtained in many diverse areas.
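A classic concrete instance of parameter-driven pre-processing, offered here only as an illustration (the project's machinery is more general), is the Buss kernelization for k-Vertex-Cover: any vertex of degree larger than the remaining budget must be in every small cover, and once no such vertex is left, a yes-instance can have at most k² edges. The sketch below is our own, not the project's code.

```python
# Hedged illustration of kernelization (parameterized pre-processing):
# shrink a Vertex Cover instance (edges, k) around the parameter k.
def vc_kernel(edges, k):
    """Return (kernel_edges, k_left, forced), or None if no size-k cover."""
    edges = set(map(frozenset, edges))
    forced = set()                      # vertices every small cover must use
    changed = True
    while changed:
        changed = False
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        for v, d in degree.items():
            if d > k - len(forced):     # rule: high-degree vertex is forced
                forced.add(v)
                edges = {e for e in edges if v not in e}
                changed = True
                break
    k_left = k - len(forced)
    if k_left < 0 or len(edges) > k_left * k_left:
        return None                     # kernel too big: no size-k cover
    return edges, k_left, forced

# Example: a star with 5 leaves plus one extra edge, parameter k = 2.
# The center is forced, leaving a one-edge kernel with budget 1.
print(vc_kernel([(0, i) for i in range(1, 6)] + [(1, 2)], 2))
```

After the rules stabilize, the instance that remains has size bounded by a function of k alone, so even an exhaustive search on the kernel is efficient when k is small.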