Getting from Research to Feature: CFR Definitions

"I read this great article on entity extraction for legal text - can you add that feature to our repository web site?" The proliferation of legal informatics research over the past few years has placed the promise of machine learning and semantic web technologies in the spotlight. Bridging the gap from research results to feature development, however, is not trivial.

This presentation will show the process of research, design, and development that LII is using for the CFR definition feature, which will allow readers of a regulatory text to see definitions for the defined terms within that text. We found that many users of our website (www.law.cornell.edu), including law students, find it difficult and time-consuming to locate the correct definitions of terms in regulations, if they are even aware that they need to. Our challenges include:

  • Using natural language processing and machine learning techniques to detect defined terms, their definitions, the scope of their definitions, and the occurrence of the terms in context.
  • Establishing quality guidelines.
  • Presenting definitions in a way that indicates complexity without bogging readers down in it.
  • Designing and developing a tool for collecting user feedback that can be used to improve accuracy.
  • Fostering effective collaboration between law students and engineers.
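
To make the first challenge concrete, here is a minimal sketch of detecting defined terms, their definitions, and their scope. It is not LII's implementation: it swaps in simple hand-written rules for the natural language processing and machine learning techniques named above, and the patterns, the Definition type, and the extract_definitions function are all hypothetical names chosen for illustration. It assumes definitions appear one per line in the form "Term means ..." and that a preceding phrase such as "As used in this part" sets the scope.

```python
import re
from typing import List, NamedTuple, Optional


class Definition(NamedTuple):
    term: str
    definition: str
    scope: Optional[str]  # e.g. "part" or "section"; None if no scope phrase was seen


# Assumed scope phrases; real regulatory text uses many more variants.
SCOPE_RE = re.compile(
    r"\b(?:for (?:the )?purposes of|as used in) this (section|subpart|part|chapter)\b",
    re.IGNORECASE,
)

# Assumed definition shape: optional "(a)" marker, capitalized term, a defining verb.
DEF_RE = re.compile(
    r"^\s*(?:\([a-z0-9]+\)\s*)?"
    r"(?P<term>[A-Z][\w\- ]*?)\s+(?:means|includes|refers to)\s+"
    r"(?P<definition>.+?)\.?\s*$"
)


def extract_definitions(lines: List[str]) -> List[Definition]:
    """Collect (term, definition, scope) triples from lines of regulatory text."""
    scope: Optional[str] = None
    found: List[Definition] = []
    for line in lines:
        scope_match = SCOPE_RE.search(line)
        if scope_match:
            # The most recent scope phrase applies to the definitions that follow.
            scope = scope_match.group(1).lower()
        def_match = DEF_RE.match(line)
        if def_match:
            found.append(
                Definition(
                    term=def_match.group("term").strip(),
                    definition=def_match.group("definition").strip(),
                    scope=scope,
                )
            )
    return found


sample = [
    "As used in this part:",
    "(a) Act means the Federal Food, Drug, and Cosmetic Act.",
    "(b) Small business means a business with fewer than 500 employees.",
]
for d in extract_definitions(sample):
    print(d.term, "|", d.scope, "|", d.definition)
```

Rules like these miss definitions that span several lines, are introduced by other phrasing, or are incorporated by reference, which is one reason quality guidelines and user feedback appear among the challenges.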

This presentation will go into detail on the technologies used but will also address issues of interest to managers of research-driven feature development projects.

 
