Web Archive Collective (WAC) at Harding University
The Web Archive Collective (WAC) is a three-year NSF-sponsored project (1008492) whose goal is to address the barriers to accessing and sharing disparate archives of web data (e.g., archived web pages, Twitter updates, user-generated tags, and newsgroup postings). WAC is a joint venture with Hector Garcia-Molina and Andreas Paepcke at Stanford University, Michael L. Nelson at Old Dominion University, and Frank McCown at Harding University. See the Stanford WAC website for work performed at Stanford University and the Web Science and Digital Libraries Research Group blog to keep up to date with work performed at ODU.
Dr. McCown taught a course entitled Introduction to Web Science in Spring 2011. This was a survey course for junior and senior Computer Science majors that covered some of the fundamental concepts of Web Science: web architecture, web characterization and analysis, web archiving, Web 2.0, web search engines, analyses of social networks, collective intelligence, recommender systems, and clustering algorithms.
See the class web page for slides, homework problems, and project descriptions.
Dr. McCown presented a talk entitled Teaching an Introduction to Web Science Course to CS Undergraduates at the Teaching the Web with Web Science workshop at the Web Science 2012 conference (June 21, 2012 in Evanston, IL).
Harding Computer Science majors are hired as researchers each summer to work on WAC projects. Below is a summary of the research performed over the past few summers.
Vivens worked on a project that automatically locates replacements for music videos that are no longer accessible on YouTube.
For example, if someone posted a video to YouTube that was later taken down (because of copyright
violations or other reasons), Vivens's project would help find where the same or a similar video
is located on YouTube.
Date: Summer 2011
Richard examined the Mobile Web from a web archiving perspective.
He built a program that finds web pages designed for mobile devices and measures their similarity
to standard web pages. He also built a classifier that uses web page features to determine whether
a page was designed for a mobile device or a desktop.
Date: Summer 2012
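The summary above does not say which similarity measure was used. As an illustration only, one common way to compare the text of a mobile page with the text of a standard page is Jaccard similarity over word shingles. The function names and the shingle size below are assumptions made for this sketch, not details of Richard's program.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation ignored)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(text_a, text_b, k=3):
    """Jaccard similarity of the two shingle sets, a value between 0.0 and 1.0."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        return 1.0  # two empty texts are considered identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical page texts, already stripped of HTML markup.
    mobile = "the quick brown fox jumps"
    standard = "the quick brown fox sleeps"
    print(jaccard_similarity(mobile, standard))
```

A score near 1.0 would suggest that the mobile page carries largely the same text as the standard page, while a score near 0.0 would suggest substantially different content.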
Daniel continued the work performed by Vivens the year before. He designed a Firefox add-on
called Volitrax that automatically redirects the user to a new location of a music video on
YouTube if the video is removed. The add-on communicates with a web server that stores information
about music videos in Twitter, Tumblr, and Delicious, so the data is likely to survive long
after the application itself is gone.
Date: Summer 2012
Heather completed work on an iOS version of the Memento Browser.
The web browser uses the Memento protocol to allow users to see
archived versions of web pages in a seamless manner. The initial browser code was developed by
Dr. Steve Baber and completed by Heather in August 2011. It is
available for download
from iTunes, and you can download the source code from
Date: Summer 2012
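For readers unfamiliar with the Memento protocol (RFC 7089), a client asks a TimeGate for an archived copy by sending an Accept-Datetime request header, and the response carries Link headers whose rel values (such as "original" and "memento") identify related resources. The following is a minimal sketch of those two pieces, not the Memento Browser's actual code. The helper names and example URLs are hypothetical, and the parser assumes every Link parameter value is quoted.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# Matches one link-value: <uri> followed by zero or more ;name="value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def accept_datetime_header(when):
    """Build the Accept-Datetime request header for a timezone-aware datetime."""
    as_utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(as_utc, usegmt=True)}


def parse_memento_links(link_header):
    """Parse a Link header into dicts with the URI, rel values, and datetime."""
    links = []
    for uri, raw_params in LINK_RE.findall(link_header):
        params = dict(PARAM_RE.findall(raw_params))
        links.append({
            "uri": uri,
            "rel": params.get("rel", "").split(),  # rel may hold several values
            "datetime": params.get("datetime"),    # None when absent
        })
    return links


if __name__ == "__main__":
    print(accept_datetime_header(datetime(2012, 6, 29, tzinfo=timezone.utc)))
    # Hypothetical Link header, as a TimeGate or memento might return it.
    header = (
        '<http://example.com/>; rel="original", '
        '<http://archive.example.org/20010911203610/http://example.com/>; '
        'rel="first memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT"'
    )
    for link in parse_memento_links(header):
        print(link)
```

A regular expression is used instead of splitting on commas because the datetime parameter itself contains a comma.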
The WAC Summer Workshop was held at Stanford University on June 29-30, 2012. Twenty-one graduate and undergraduate students from a number of universities attended the workshop free of charge (thanks to NSF grant 1009916). The workshop featured speakers from Stanford, Los Alamos National Laboratory, Internet Archive, UC Berkeley School of Law, California Digital Library, Microsoft Research, and other research labs.