P. Ayris, R. Davies, R. McLeod, R. Miao, H. Shenton, and P. Wheatley.
The life2 final project report. Final project report, LIFE Project, London,
 L. C. David Tarrant, Steve Hitchcock. Where the semantic web and web
2.0 meet format risk management: P2 registry. International Journal of
Digital Curation, 6(1):165–182, 2011.
 N. Dehak, R. Dehak, J. Glass, D. Reynolds, and P. Kenny.
Cosine similarity scoring without score normalization techniques. in
Proceedings of Odyssey 2010 - The Speaker and Language Recognition
Workshop (Odyssey 2010), pages 71–75, 2010.
 S. Gordea, A. Lindley, and R. Graf. Computing recommendations for
long term data accessibility basing on open knowledge and linked data.
Joint proceedings of the RecSys 2011 Workshops [email protected]
and UCERSTI 2, 811:51–58, November 2011.
 R. Graf and S. Gordea. Aggregating a knowledge base of file formats
from linked open data. Proceedings of the 9th International Conference
on Preservation of Digital Objects, poster:292–293, October 2012.
 R. Graf and S. Gordea. A risk analysis of file formats for preservation
planning. In Proceedings of the 10th International Conference on
Preservation of Digital Objects (iPres2013), pages 177–186, Lissabon,
Portugal, Sep 2013. Biblioteca Nacional de Portugal, Lisboa.
 R. Graf, S. Gordea, and H. Ryan. A model for format endangerment
analysis using fuzzy logic. In Proceedings of the 11th International
Conference on Digital Preservation (iPres2014), pages 160–168,
Melbourne, Australia, Oct 2014. State Library of Victoria, Melbourne.
 D. Heckerman. Bayesian networks for data mining. Data Mining and
Knowledge Discovery, 1(1):79–119, 1997.
 J. Hunter and S. Choudhury. Panic: an integrated approach to the
preservation of composite digital objects using semantic web services.
International Journal on Digital Libraries, 6, (2):174–183, September
 A. N. Jackson. Formats over time: Exploring uk web history.
Proceedings of the 9th International Conference on Preservation of
Digital Objects, pages 155–158, October 2012.
 A. Karnik, S. Goswami, and R. Guha. Detecting obfuscated viruses
using cosine similarity analysis. In Modelling Simulation, 2007. AMS
’07. First Asia International Conference on, pages 165–170, March
 G. W. Lawrence, W. R. Kehoe, O. Y. Rieger, W. H. Walters, and
A. R. Kenney. Risk management of digital information: A file format
investigation. june 2000.
 D. Pearson and C. Webb. Defining file format obsolescence: A risky
journey. The International Journal of Digital Curation, Vol 3, No
1:89–106, July 2008.
 H. Ryan. File format study. School of Information and Library Science,
University of North Carolina at Chapel Hill, 2, 2013.
 D. Tanner. Using statistics to make educational decisions. Library of
Congress Cataloging-in-Publication Data, pages 77–104, 2012.
 S. Vermaaten, B. Lavoie, and P. Caplan. Identifying threats to successful
digital preservation: the spot model rsik assessment. D-Lib Magazine,
18(9/10), September 2012.
 X. Wu, V. Kumar, J. Ross Quinlan, J. Ghosh, Q. Yang, H. Motoda,
G. McLachlan, A. Ng, B. Liu, P. Yu, Z.-H. Zhou, M. Steinbach, D. Hand,
and D. Steinberg. Top 10 algorithms in data mining. Knowledge and
Information Systems, 14(1):1–37, 2008.
 J. Ye. Cosine similarity measures for intuitionistic fuzzy sets and their
applications. Mathematical and Computer Modelling, 53(1?2):91 – 97,
 R. Zacharski. A Programmer’s Guide to Data Mining: The Ancient Art
of the Numerati. 2012.
 H. Zhang. The Optimality of Naive Bayes. In V. Barr and Z. Markov,
editors, FLAIRS Conference. AAAI Press, 2004.