WEBVIGIL: ADAPTIVE FETCHING AND USER-PROFILE BASED CHANGE DETECTION OF HTML PAGES
Data on the web is constantly increasing, and users are often interested in specific changes to that data. Currently, to detect changes of interest, users must poll the pages themselves and check for the changes they care about. Efficient and effective change detection and notification is critical in the many environments where considerable resources are wasted on monitoring changes to the web manually. WebVigiL is a change monitoring system that efficiently monitors changes to pages on behalf of the user and notifies the user of those changes in a timely manner. It is a general-purpose, server-based information monitoring and notification system.
This thesis focuses on two problems: detecting changes of interest to users in HTML pages, and fetching pages of interest by using the change patterns of those pages. The CH-Diff algorithm is introduced for computing customized changes to an HTML page. An adaptive Best-Effort-Algorithm, based on the history of observed changes, is introduced for fetching.
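
To make the idea of history-based fetching concrete, the following is a minimal sketch of one way a fetch interval could be derived from observed change times. It is not the Best-Effort-Algorithm itself, whose details are not given here; the function name `next_fetch_interval`, the averaging rule, and all parameters (`default`, `min_interval`, `max_interval`, `window`) are assumptions made for illustration only.

```python
from statistics import mean


def next_fetch_interval(change_times, default=3600.0,
                        min_interval=60.0, max_interval=86400.0, window=5):
    """Estimate how many seconds to wait before fetching a page again.

    change_times: timestamps (in seconds) at which changes were observed.
    All parameters are hypothetical and chosen for illustration.
    """
    # With fewer than two observed changes there is no gap to learn from.
    if len(change_times) < 2:
        return default

    # Keep only the most recent changes so the estimate tracks current behavior.
    recent = sorted(change_times)[-(window + 1):]

    # Gaps between consecutive observed changes.
    gaps = [later - earlier for earlier, later in zip(recent, recent[1:])]

    # Use the average recent gap, clamped to a sane range.
    return max(min_interval, min(max_interval, mean(gaps)))
```

A page whose changes were observed 100 seconds apart would be fetched again after about 100 seconds under this sketch, while a page with no history falls back to the default interval. The clamping keeps a burst of rapid changes from driving the fetch rate arbitrarily high.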