THE REGNET & REGBASE PROJECT
- Overview
The Regnet & Regbase research has so far focused on developing tools and formalisms that make regulations more accessible to interest groups and the public. The approach is to
process regulations into a standardized, machine-comprehensible format that can easily be augmented with
new information. This foundation supports a range of tasks, from searching to reasoning.
- Taxonomy Building
The project is also focused on constructing taxonomic conceptual hierarchies for environmental regulations
using software from Semio Corporation. These concept hierarchies allow grouping of regulations by content,
rather than by the structure of the organizations that authored the regulations. Browsing the taxonomy
structure enables one to find regulations pertaining to a particular topic, regardless of the regulations'
origin.
- XML-Standardized Format
A standardized XML format has been developed for regulations, and documents in this format
can be validated against a DTD (Document Type Definition). The DTD was designed to be applicable to
all regulations, with a focus on federal, state, and local regulations.
Regulation parsers are being developed to transform a variety of plain-text regulation formats
into our standardized XML format. Parsers have already been built for federal and
Illinois state environmental regulations, as well as for federal and several
European accessibility standards. The next step is to complete a universal parser for environmental regulations.
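As a rough illustration of the plain-text-to-XML step described above, the following minimal sketch turns numbered section headings into XML elements. The heading pattern ("Section 1.1 Title"), the element names (`regulation`, `section`, `text`), and the sample text are all assumptions made for this example; they are not the project's actual DTD or parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical heading pattern, e.g. "Section 1.1 Purpose".
HEADING = re.compile(r"^Section\s+(?P<id>\d+(?:\.\d+)*)\s+(?P<title>.+)$")


def parse_regulation(text, name):
    """Convert plain-text provisions into a simple XML tree (illustrative only)."""
    root = ET.Element("regulation", {"name": name})
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        match = HEADING.match(line)
        if match:
            # A heading opens a new <section> element.
            current = ET.SubElement(
                root, "section",
                {"id": match.group("id"), "title": match.group("title")},
            )
        elif current is not None:
            # Any other line is body text of the most recent section.
            ET.SubElement(current, "text").text = line
    return root


# Invented sample input, not taken from any real regulation.
sample = """Section 1.1 Purpose
This part establishes minimum standards.
Section 1.2 Scope
These standards apply to all facilities.
"""

root = parse_regulation(sample, "example")
xml_string = ET.tostring(root, encoding="unicode")
```

A real parser would also need to handle nested subsections, lists, and tables, which differ from one regulation source to another.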
- Adding XML Meta-Data and Enabling Regulation Reasoning
An automated system has been built to add concepts and taxonomy categories to regulation
provisions. This additional meta-data is inserted in the XML representation
of the regulations, and can be used for tasks such as searching or determining context.
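To give a sense of how concept meta-data might be attached to provisions, here is a minimal sketch that uses simple phrase matching. The concept list, the category names, and the `concept` element are hypothetical; the project's actual system derives concepts and taxonomy categories by its own means.

```python
import xml.etree.ElementTree as ET

# Hypothetical phrase-to-category mapping, used only for illustration.
CONCEPTS = {
    "hazardous waste": "waste-management",
    "drinking water": "water-quality",
}


def annotate(root):
    """Add a <concept> child to every <section> whose text mentions a known phrase."""
    # Materialize the list first so the tree is not modified while iterating.
    for section in list(root.iter("section")):
        body = " ".join(section.itertext()).lower()
        for phrase, category in CONCEPTS.items():
            if phrase in body:
                ET.SubElement(
                    section, "concept", {"name": phrase, "category": category}
                )
    return root


doc = ET.fromstring(
    "<regulation>"
    '<section id="1"><text>Generators of hazardous waste must register.</text></section>'
    '<section id="2"><text>General provisions.</text></section>'
    "</regulation>"
)
annotate(doc)
```

Because the tags are stored inside the XML itself, a later search can filter sections by the `category` attribute without re-reading the provision text.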
Additionally, a system has been built to enable reasoning with environmental regulations. The system scans regulations
for logic sentences (pre-encoded into the XML) and allows the user to pose logic questions. A demo for
this system is available under the Search 40 CFR section.
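The general idea of answering a logic question from rules stored in the XML can be sketched with simple forward chaining over if-then rules. Everything specific here is an assumption for illustration: the `rule` element, its `if` and `then` attributes, and the predicate names are invented, and the project's reasoning system may work quite differently.

```python
import xml.etree.ElementTree as ET


def load_rules(root):
    """Read hypothetical <rule if="a b" then="c"/> elements into (premises, conclusion) pairs."""
    rules = []
    for rule in root.iter("rule"):
        premises = frozenset(rule.get("if", "").split())
        rules.append((premises, rule.get("then")))
    return rules


def entails(rules, facts, query):
    """Forward chaining: apply rules until no new fact appears, then check the query."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return query in known


# Invented rules, not drawn from any actual regulation.
doc = ET.fromstring(
    "<section>"
    '<rule if="generates_waste waste_is_hazardous" then="must_register"/>'
    '<rule if="must_register" then="must_file_report"/>'
    "</section>"
)
rules = load_rules(doc)
answer = entails(rules, {"generates_waste", "waste_is_hazardous"}, "must_file_report")
```

Here the user's question ("must a report be filed?") is answered by chaining the two rules from the stated facts.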
Finally, a parser is being developed to facilitate automated linking of references appearing within a regulation.
This meta-data will be added to the XML documents and can be used for searching and retrieving complete
information.
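One way to picture the reference-extraction step is a pattern match over provision text. The sketch below recognizes only citations of the form "40 CFR 261.3"; the regular expression and the returned dictionary layout are assumptions for this example, and real regulations use many more reference styles.

```python
import re

# Matches citations such as "40 CFR 261.3" or "40 CFR 262" (illustrative pattern only).
CFR_REF = re.compile(
    r"\b(?P<title>\d+)\s+CFR\s+(?P<part>\d+)(?:\.(?P<section>\d+))?"
)


def find_references(text):
    """Return one dict per CFR-style citation found in the text."""
    return [
        {
            "title": m.group("title"),
            "part": m.group("part"),
            "section": m.group("section"),  # None when only a part is cited
        }
        for m in CFR_REF.finditer(text)
    ]


refs = find_references("as defined in 40 CFR 261.3 and subject to 40 CFR 262.")
```

Each extracted reference could then be written into the XML as meta-data so that a search on one provision can also retrieve the provisions it cites.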
- Feature Extraction and Similarity Analysis
To enable a similarity analysis between different sources of accessibility
regulations, features are extracted and incorporated into the XML legal
corpus. Common features include concepts, definitions, as well as domain
knowledge such as measurements and effective dates. Features are tagged
as additional meta-data.
A similarity analysis is performed using a combination of content comparison
and structural analysis. Results are documented in several papers under
the Publications section. In addition, we applied our analysis core to
the domain of e-rulemaking, and a demo can be found under the
Presentations section.
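The combination of content comparison and structural analysis can be illustrated with a small sketch. It assumes a bag-of-words cosine measure for content and uses the similarity of the enclosing (parent) sections as the structural signal, blended by a weight `alpha`. These choices, including the value of `alpha`, are assumptions for illustration and are not the scoring described in the project's papers.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between the word-count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def combined_similarity(text_a, text_b, parent_a, parent_b, alpha=0.3):
    """Blend content similarity with the similarity of the parent sections."""
    content = cosine(text_a, text_b)
    structure = cosine(parent_a, parent_b)
    return (1 - alpha) * content + alpha * structure


# Invented provision texts from two hypothetical accessibility standards.
score = combined_similarity(
    "ramp slope", "ramp slope", "accessible routes", "doors", alpha=0.3
)
```

With `alpha` set to 0, the score reduces to pure content comparison; raising it gives more weight to where a provision sits in its document.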
Currently, we are working on extending the technology to analyze and compare
drinking water standards.