Research Topics

The research plan develops domain-specific solutions to general data management problems. Our hypothesis is that by carefully limiting the scope of operations to the needs of climate-change research, we can devise and implement effective and efficient tools. We classify the research topics into four groups.

Data integration
The goal of this component is to provide an integrated logical view of datasets relevant to climate-change research, shielding the researcher from low-level details such as data formats, storage locations, database schemas, document formats, and varying nomenclature.
Data mining
Once we have an integrated view of data, we will develop a suite of tools that researchers can use to identify interesting patterns and features in the integrated data.
Provenance
When consulting an integrated view of data, it is important to know the source from which a displayed value or fact is derived. While this task poses few challenges when the integration is simple (e.g., the displayed value being the average of a set of values from different sources), it is much more complex when the integration uses more sophisticated operations (which are necessary for effective integration), such as extraction of numerical data from text, semantic mapping of terms, and schema transformations.
Workflows
Over their lifecycle, from the point of origin to the scientific discovery or other product that they enable, data typically pass through several steps that include both human and automated processing. While our earlier components are designed to ease and enhance the individual steps, the goal of this component is to develop methods for effectively managing the entire collection of steps (the workflow).
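
To make the simple case mentioned under Provenance concrete, the following is a minimal Python sketch of a displayed value that is the average of values from several sources and that carries the identifiers of those sources along with it. It is an illustration only, not the project's implementation: the names SourcedValue, from_source, and average, and the source identifiers and readings, are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SourcedValue:
    """A number paired with the identifiers of the sources it was derived from.

    Hypothetical type, used here only for illustration.
    """
    value: float
    sources: frozenset


def from_source(value, source_id):
    """Wrap a raw reading with the identifier of the single source it came from."""
    return SourcedValue(float(value), frozenset([source_id]))


def average(values):
    """Average the values and carry forward the union of their sources."""
    values = list(values)
    combined = frozenset().union(*(v.sources for v in values))
    return SourcedValue(mean(v.value for v in values), combined)


# Made-up readings from three made-up sources.
readings = [
    from_source(14.5, "station_a"),
    from_source(15.5, "station_b"),
    from_source(15.0, "satellite_c"),
]

result = average(readings)
print(result.value, sorted(result.sources))
```

Here the provenance of the result is just the union of the input source identifiers, which is why the simple case poses few challenges. Operations such as text extraction or semantic mapping would need to record considerably more than a set of identifiers.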

Our initial work has focused on the integration and interactive analysis of data, as described next.