Scout Archives

Browse Resources

(2 classifications) (6 resources)

Data mining

Classification
Automation (6)
Research (2)

Resources
Audio Mining

Occasionally referred to as audio indexing, audio mining is a computerized task that involves processing an audio file, extracting the dialog to create a textual transcript, and searching the transcript for certain words or phrases. Considering the amount of audio content on the Internet and other sources, it is clear that audio mining is a growing technology. To get an idea of what audio...

https://scout.wisc.edu/report/nsdl/met/2002/1220
Web Robots and Web Mining

Manually indexing the World Wide Web is an impossible task, and it is a daunting challenge even for automated techniques. Web content mining is a general term used to describe these techniques, which are intended for information categorization and filtering. Web robots serve a variety of purposes, including indexing, and they can be useful or, in some cases, harmful. Web usage mining, on...

https://scout.wisc.edu/report/nsdl/met/2003/0328
Data Mining

Data Mining, also known as Knowledge Discovery in Databases, is a process used to extract implicit, previously unknown, but potentially useful information from raw data. This first website (1) provides a basic overview of Data Mining and some applications for the process. Common applications of data mining include fraud detection and marketing, but data mining has also been applied in...

https://scout.wisc.edu/report/nsdl/met/2005/0506
The Promise and Peril of Big Data [pdf]

Some data-crunchers and others are thrilled by the prospect of the growing amount of "big data." According to a recent report, the amount of digital content available on the Internet is approaching five hundred billion gigabytes. This 66-page report from the Aspen Institute asks some key questions about these developments, including "Does Big Data represent an evolution of knowledge, or is more...

https://www.aspeninstitute.org/publications/promise-peril-bi...
Machine Learning for Data Streams

Published in 2018 and now available as an open-access text, Machine Learning for Data Streams is a guide to "data stream mining and real-time analytics." The book is authored by a group of computer science experts: Albert Bifet (Télécom ParisTech, France), Ricard Gavaldà (Universitat Politècnica de Catalunya, Barcelona), Geoff Holmes (University of Waikato, Hamilton, New Zealand) and Bernhard...

https://mitpress.mit.edu/books/machine-learning-data-streams
Robots Reading Vogue

Robots Reading Vogue explores the digital humanities (DH) possibilities presented by data from Vogue magazine. Vogue offers a DH bonanza, as it has been "continuously published for over a century," and is "completely digitized," resulting in some six terabytes of data and thousands of covers and images. Several experiments are showcased on the website, including the Diana Vreeland Memo...

http://dh.library.yale.edu/projects/vogue/