Data Preprocessing
- M. Hernandez and S. Stolfo, Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem, Data Mining and Knowledge Discovery, 1998.
Presenter: Wyatt Pease
Date: 2/10/2015
- Wei Cheng, Xiaoming Jin, Jian-Tao Sun, Xuemin Lin, Xiang Zhang, and Wei Wang, Searching Dimension Incomplete Databases, IEEE Transactions on Knowledge and Data Engineering, 2013.
Presenter: Preston Tunnel Wilson
Date: 2/10/2015
Association Rules
- Chun-Nan Hsu and Craig A. Knoblock, Discovering Robust Knowledge from Databases that Change, Data Mining and Knowledge Discovery, Volume 2, Issue 1, 1998, 69-95.
Presenter: Daniel Morris
Date: 2/12/2015
- Xindong Wu, Chengqi Zhang, and Shichao Zhang, Efficient Mining of Both Positive and Negative Association Rules, ACM Transactions on Information Systems, 2004.
Presenter: None (will not be presented)
Date: 2/12/2015
- S.D. Lee, David Cheung and Ben Kao, Is Sampling Useful in Data Mining? A Case in the Maintenance of Discovered Association Rules, Data Mining and Knowledge Discovery, Volume 2, Issue 3, 1998, 233-262.
Presenter: Hong Xu
Date: 2/17/2015
- R. Agrawal and R. Srikant, Fast Algorithms for Mining Association Rules, Proceedings of the 20th VLDB Conference, Santiago, Chile, 1994.
Presenter: Evan Deere
Date: 2/17/2015
- R. Srikant and R. Agrawal, Mining Quantitative Association Rules in Large Relational Tables, SIGMOD 1996.
Presenter: John Alar
Date: 2/19/2015
- D. Cheung, J. Han, V. Ng, and C.Y. Wong, Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique, ICDE, 1996.
Presenter: James Simpson
Date: 2/24/2015
- J. Han and Y. Fu, Mining Multiple-Level Association Rules in Large Databases, IEEE Transactions on Knowledge and Data Engineering, 1999.
Presenter: William Watkins
Date: 2/24/2015
- Eui-Hong (Sam) Han, George Karypis, and Vipin Kumar, Scalable Parallel Data Mining for Association Rules, IEEE Transactions on Knowledge and Data Engineering, 2000.
Presenter: William Kalescky
Date: 2/26/2015
Pattern Mining
- Mohammed J. Zaki, Efficiently Mining Frequent Trees in a Forest, KDD 2002.
Presenter: Haley Adams
Date: 2/26/2015
- Pedro Domingos and Geoff Hulten, Mining High-Speed Data Streams, KDD 2000.
Presenter: Alex Hofmann
Date: 3/3/2015
- X. Yan and J. Han, gSpan: Graph-Based Substructure Pattern Mining, ICDM 2002.
Presenter: Casey Means
Date: 3/17/2015
- Jiawei Han, Jian Pei, and Yiwen Yin, Mining Frequent Patterns without Candidate Generation, SIGMOD, 2000.
Presenter: Andrew Tackett
Date: 3/17/2015
- R. Agrawal and R. Srikant, Mining Sequential Patterns, Proc. of the Int'l Conference on Data Engineering (ICDE), Taipei, Taiwan, March 1995.
Presenter: Katie Wiener
Date: 3/19/2015
- Jian Pei, Jiawei Han, and Runying Mao, CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets, SIGMOD, 2000.
Presenter: Max Tilka
Date: 3/24/2015
Classification
- Pedro Domingos, MetaCost: A General Method for Making Classifiers Cost-Sensitive, KDD, 1999.
Presenter: Kristopher Baker
Date: 3/24/2015
- B. Abelson, K. Varshney, and J. Sun, Targeting Direct Cash Transfers to the Extremely Poor, KDD, 2014.
Presenter: Alex Wang
Date: 3/26/2015
Clustering
- George Karypis, Eui-Hong (Sam) Han, and Vipin Kumar, CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling, IEEE Computer, 1999.
Presenter: David Thomas
Date: 3/26/2015
- Hastie, T. and Tibshirani, R., Discriminant Adaptive Nearest Neighbor Classification, IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI), 1996.
Presenter: Joel Michelson
Date: 3/31/2015
- Jinze Liu, Qi Zhang, Wei Wang, Leonard McMillan, and Jan Prins, Clustering Pair-wise Dissimilarity Data into Partially Ordered Sets, KDD, 2006.
Presenter: Lucas Grim
Date: 3/31/2015
- S. Arya, D. Mount, N. Netanyahu, R. Silverman, and A. Wu, An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions, J. ACM 45, 6 (November 1998), 891-923.
Presenter: Wyatt Gale
Date: 4/7/2015
Neural Networks
- Zheng Zhang, Jun Li, C.N. Manikopoulos, Jay Jorgenson, and Jose Ucles, HIDE: a Hierarchical Network Intrusion Detection System Using Statistical Preprocessing and Neural Network Classification, IEEE Workshop on Information Assurance and Security, 2001.
Presenter: Sumner Magruder
Date: 4/7/2015
Boosting/Bagging
- Y. Freund and R. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences, 55(1): 119-139, 1997.
Presenter: Matthew Jackoski
Date: 4/9/2015
- R. Schapire and Y. Singer, Improved Boosting Algorithms Using Confidence-rated Predictions, Machine Learning, 37(3):297-336, 1999.
Presenter: Khang Nguyen
Date: 4/9/2015
Big Data
- Brin, S. and Page, L. The anatomy of a large-scale hypertextual Web search engine. In Proceedings of the Seventh international Conference on World Wide Web (WWW-7), 1998.
Presenter: Joshua Ladd
Date: 4/14/2015
- R. Kosala and H. Blockeel, Web Mining Research: A Survey, SIGKDD Explorations, June 2000. Volume 2, Issue 1.
Presenter: Catherine Grace Jernigan
Date: 4/14/2015
- Roberto J. Bayardo Jr., Efficiently Mining Long Patterns from Databases, SIGMOD, 1998.
Presenter: Connor Jerow
Date: 4/16/2015
Applications
- Tom Fawcett and Foster Provost, Data Mining for Adaptive Fraud Detection, Data Mining and Knowledge Discovery, 1997.
Presenter: Corrie Moore
Date: 4/16/2015
- J. Dorre, P. Gerstl, and R. Seiffert, Text Mining: Finding Nuggets in Mountains of Textual Data, KDD, 1999.
Presenter: Farah Sharis
Date: 4/21/2015
- J. Yang, J. McAuley, and J. Leskovec, Community Detection in Networks with Node Attributes, IEEE International Conference on Data Mining (ICDM), 2013.
Presenter: Yuanshuo Li
Date: 4/21/2015
- G. Simon, H. Xiong, E. Eilertson, and V. Kumar, Scan Detection: A Data Mining Approach, SIAM International Conference on Data Mining, 2006.
Presenter: Morgan McCullough
Date: 4/23/2015
- J. Liu, C. Aggarwal, and J. Han, On Integrating Network and Community Discovery, WSDM, 2015.
Presenter: Trevor Tamura
Date: 4/23/2015