Advances in computational methods and the exponential growth of computing power have made it feasible to use high-throughput calculations to populate large databases of material properties. These databases, which may be combined with available experimental data, can be queried to identify materials that are likely to be well-suited for particular applications or analyzed to identify important trends that facilitate the design of new materials.
There are many challenges associated with data-intensive materials research. For high-throughput calculations, it is necessary to generate a computational infrastructure that automates the execution of jobs and storage of results. The methods used must be able to calculate relevant property values both accurately enough to be of practical use and quickly enough to be used on a large number of materials. Once the data has been generated, it is necessary to create tools and interfaces that allow humans and machines to analyze the data. For some large data sets, it is necessary to develop data mining and machine learning algorithms for effective data analysis. These challenges drive our research in this emerging field.