WEBVIGIL: SENTINEL SPECIFICATION AND USER-INTENT BASED CHANGE DETECTION FOR EXTENSIBLE MARKUP LANGUAGE (XML)
With the rapid growth of information on the web, there is a need for efficient, selective retrieval of information and notification of changes to it. Currently, users must retrieve pages manually (by pulling or polling) to check for changes of interest, which wastes human effort and incurs high cost. Hence, WebVigiL is designed as a general-purpose, active capability-based monitoring and notification system that handles the specification, management, and propagation of changes to unstructured and semi-structured documents based on user specification.
In this thesis, we present the semantics of a change specification language for specifying user policies for web page monitoring. We also present a design for efficient validation and storage of user specifications in a persistent repository. To support customized change detection based on user intent, we propose an algorithm for detecting changes to the contents of semi-structured documents. Though the approach taken is general, we explain the change detection in the context of XML documents.
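To make the idea of user-intent based change detection concrete, the following is a minimal illustrative sketch and not the algorithm proposed in this thesis. It assumes that user intent is expressed as a single keyword and that only the leading text of each XML element matters; the function names `keyword_texts` and `detect_keyword_changes` are hypothetical and were chosen for this example only.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def keyword_texts(xml_source: str, keyword: str) -> Counter:
    """Count the element texts that mention the keyword (case-insensitive).

    Assumption: only element.text is inspected; tail text and attributes
    are ignored to keep the sketch short.
    """
    root = ET.fromstring(xml_source)
    texts = Counter()
    for element in root.iter():
        text = (element.text or "").strip()
        if text and keyword.lower() in text.lower():
            texts[text] += 1
    return texts


def detect_keyword_changes(old_xml: str, new_xml: str, keyword: str) -> dict:
    """Report keyword-bearing texts that appear in only one of two versions."""
    old = keyword_texts(old_xml, keyword)
    new = keyword_texts(new_xml, keyword)
    return {
        # Counter subtraction keeps only positive counts.
        "inserted": sorted((new - old).elements()),
        "deleted": sorted((old - new).elements()),
    }


OLD = "<news><item>XML parser released</item><item>Weather is sunny</item></news>"
NEW = "<news><item>XML parser 2.0 released</item><item>Weather is rainy</item></news>"

changes = detect_keyword_changes(OLD, NEW, "xml")
print(changes)
```

In this example the change to the weather item is not reported, because it does not mention the keyword of interest; only changes relevant to the stated intent are surfaced.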