CURRICULUM VITAE

KAMAL MAHMOOD ALI
kamal3@yahoo.com

OBJECTIVES: Applied research & development in IE, data & web analytics, machine learning, text-mining, NLP, QA, Active Learning, Ensemble classification.

ACADEMIC QUALIFICATIONS: PhD (UC Irvine), MS (UC Irvine; GPA 3.96), GRE/CS: 95%, GRE/Math: 99%, BS (Sydney University).

WORK STATUS: US Citizen.

* Research & Applied research - Yahoo, TiVo, IBM, UCI, Intuit
* Text Mining - Document clustering (Vividence), collocations (Yahoo)
* Search engine - Rank & index quality evaluation (Yahoo)
* Recommender systems - group lead at TiVo
* Data mining modeling - Yahoo, Vividence, IBM
* Web Analytics - TiVo, Yahoo
* Active learning - research and system at Stanford
* Consulting - while at IBM research
* Corporate training - while at IBM research

HONORS:
* Invited to write book on Data & Machine Learning for Astrophysicists.
* Invited to chair KDD 2009 Industry Track - Paris, France.
* Invited speaker: 2007 Google Tech Talks, 2006 IIIA, various others.
* 25 published papers.
* 150 citations for journal paper on committee machines, ensembles.
* 3rd out of 40,000 students in Mathematics in Aust. High School Grad Exam.
* Horner Award (Mathematics), Sydney University.
* Prize in National Australian Mathematics Competition.
* 13th out of 40,000 students overall in Australian Grad Exam.
* Awarded National Science Foundation grant for $300K for 3 years, 1993.
* Member of various NASA External Artificial Intelligence committees, 1997+.
* Member of various Machine-Learning, Data-Mining conference review committees, 93-03.
* Reviewer for National (US) Science Foundation proposals, $300K project.
* Won award at IBM; only granted to top 0.5%.
* Thesis fellowship (1995) (only 20 granted at UCI).
* Chair, International Conference on Machine Learning session, 1997.
* Regents Fellowship, UCI.

CONFERENCE PUBLICATIONS:
Konik T., Ali K., Shapiro D., Li N. and Stracuzzi D. Improving Structural Knowledge Transfer with Parametric Adaptation, 2010 FLAIRS, FL.
Ali K., Leung K., Konik T., Choi D. and Shapiro D. Knowledge-Directed Theory Revision. 2009 ILP: Inductive Logic Programming, Leuven, Belgium.
Li N., Stracuzzi D., Cleveland G., Konik T., Shapiro D., Molineaux M., Aha D. and Ali K. Constructing Game Agents from Video of Human Behavior, 2009, AIIDE Stanford, CA.
Ali K. and Scarr M. Modeling distribution of clicks for Web search. 2007 - Accepted as full paper to WWW 2007. 1/7 papers accepted.
Ali K. and Pan J. Reproducible Bernoulli Cluster Sampling for Analytics Applications at the Petabyte Scale. 2007 - In progress.
Ali K., Scarr M. and Pan J. Pushing sampling below join in query trees. 2007 - In progress.
Ali K. and Chang C. 2006. On the relationship between click-rate and relevance for search engines. In Proceedings of Data-Mining and Information Engineering 2006.
Ali K., Juan Y. and Chang C. 2005. Exploring Cost-effective Approaches to Human Evaluation of Search Engine Relevance. In Advances in Information Retrieval. LNCS 3408. Springer-Verlag.
Ali K. and Van Stam W. 2004. TiVo: Making Show Recommendations using a Distributed Collaborative Filtering Architecture. In Proceedings of the Tenth International Conference on Knowledge Discovery in Databases. AAAI Press.
Ali K. and Ketchpel S. 2003. Golden Path Analyzer: Using Divide-and-Conquer to Cluster Web Clickstreams. In Proceedings of the Ninth International Conference on Knowledge Discovery in Databases. AAAI Press.
Ali K., Manganaris S. and Srikant R. (1997). Partial Classification using Association Rules. In Proceedings of the Third International Conference on Knowledge Discovery in Databases. AAAI Press.
Ali K., Brunk C. and Pazzani M. (1994). On Learning Multiple Descriptions of a Concept. In Proceedings of the Sixth International Conference on Tools with Artificial Intelligence. New Orleans, LA: IEEE Press.
Pazzani M., Merz C., Murphy P., Ali K., Hume T. and Brunk C. (1994). Reducing Misclassification Costs.
In Machine Learning: Proceedings of the Eleventh International Conference. New Brunswick, NJ. Morgan Kaufmann.
Ali K. and Pazzani M. (1993). HYDRA: A Noise-tolerant Relational Concept Learning Algorithm. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence. Chambery, France.
.. older papers.

JOURNAL ARTICLES:
Ali K. (2005). (Invited). A Framework for Human Evaluation of Search Engine Relevance, Special Issue on the 27th European Conference on Information Retrieval.
Ali K. and Pazzani M. (1995). Error Reduction through Learning Multiple Descriptions. Machine Learning Journal. 150 citations.
Ali K. and Pazzani M. (1995). (Invited for submission) HYDRA-MM: Learning Multiple Descriptions to Improve Classification Accuracy. Special Issue of International Journal on AI Tools 4,1&2. World Scientific.
Ali K., Lister R. and Horsfall C. (1989). Towards Knowledge-Based Identification of Mineral Mixtures from Reflectance Spectra. In Knowledge-Based Systems 2,1. Butterworth Scientific.

BOOK CHAPTERS:
Ali K. and Pazzani M. (1995). Learning Multiple Relational Rule-based Models. In Fisher, D. and Lenz H. Learning from Data: Artificial Intelligence and Statistics, Vol. 5. Springer-Verlag.
Ali K. and Pazzani M. (1992). Reducing the small disjuncts problem by learning probabilistic concept descriptions. In Petsche, T., Hanson, S.J. & Shavlik, J. (Eds), Computational Learning Theory and Natural Learning Systems, Vol. 3. Cambridge, Massachusetts. MIT Press.
Hirschberg D., Pazzani M. and Ali K. (1991). Average Case Analysis of k-CNF and k-DNF Learning Algorithms. In Hanson, S.J., Petsche, T., Kearns, M., & Rivest, R.L. (Eds), Computational Learning Theory and Natural Learning Systems, Vol. 2. Cambridge, Massachusetts. MIT Press.

WORKSHOP PUBLICATIONS:
Li N., Stracuzzi D., Cleveland G., Langley P., Konik T., Shapiro D., Ali K., Molineaux M. and Aha D. Learning Hierarchical Skills for Game Agents from Video of Human Behavior.
2009 IJCAI Workshop.
Ali K., Langley P., Maloof M., Sage S. and Binford T. (1998). Improving Rooftop Detection with Interactive Visual Learning. In Proceedings of the 1998 Image Understanding Workshop (pp 479-492). Monterey, CA.
Pazzani M., Murphy P., Ali K. and Schulenberg D. (1994). Trading off coverage for accuracy in forecasts: Applications to clinical data analysis. In Proceedings of the 1994 AAAI Symposium on AI in Medicine (pp 106-110). Stanford, CA.

PROFESSIONAL WORK EXPERIENCE:

DATA-MINING CONSULTANT, Self 2006 -
* TiVo, AnswerLab, Elder Research, others.
* Consulting, data-analysis, engineering for recommenders, targeted advertising, medical model building.

DATA-MINING ARCHITECT, Yahoo 2002 - 2006
* Text Mining - Finding collocations for search engine indexing.
* Data Mining Architect for Recommendation systems for shopping.
* Sampling Project: Led team of engineers to apply it to 16 datamarts and applications in Yahoo.
* Comparative search-engine quality analysis project. Built dashboard to analyze differences between Yahoo and Google. Influential in launch of Yahoo using Inktomi technology.
* Yahoo Search Quality dashboard production system. Ran daily for 4 years. Used by executives to make key decisions.

PRINCIPAL RESEARCH SCIENTIST at Vividence. 2000 - 2002
* Built 3 research-quality prototypes incorporated into product:
  1. Golden Path Analyzer: clustering clickstreams using divide and conquer.
  2. Comment/text clustering: developed novel improvement on density-based clustering. Helped cut down human interpretation time 10 fold (C++).
  3. Automatic Correlation finder: used "bubbling-up" to bring surprising correlations to analyst.
  Products were amongst the top sales drivers in 3 releases.
* Strategic planning and vision for analytics at Vividence.
* Consulted with analysts and BUs to field prototypes.

LEAD ENGINEER at TiVo.
1999 - 2000
* Recommendation System: Led team to build TiVo's "Thumbs-up/Thumbs-down" system using Naive-Bayes and Collaborative filtering. Used by 3M+ TiVo viewers (C++).
* Tech lead for system to anonymously mine users' remote control behavior - used for Nielsen.

SENIOR RESEARCH SCIENTIST at Stanford Univ. (Comp Learn Lab) 1998 - 1999
* Automated Satellite Image Analysis: research into methods for detecting rooftops from images, built on top of SRI's CME Pattern Recognition Environment (Java, Lisp). Used Active-learning to reduce human labelling time; used boosting, bagging over different feature sets to improve ROC performance.
* Used Matlab for diagnosing cavitation for Rockwell project.

SENIOR DATA MINING ANALYST at IBM Almaden 1995 - 1998
* Research into using Apriori algorithm for classification.
* Ranked #1 out of 15 Data-Mining PhDs (incl Stanford, MIT) in Global Business Intelligence Group (GBIS).
* Won awards (1997, 1998) (top 0.5%) for outstanding work and initiative.
* Built models using Radial-Basis mixtures, EM, logistic and neural regression to predict churn, upsell, acquisition vulnerability for banks, retailers and insurance.
* Used IBM's Intelligent Miner, SAS's EM, DB2, DIAMOND.
* Helped sell several contracts and licenses worth hundreds of thousands of dollars each as tech lead on sales calls.
* Organized initiative to manage GBIS customer success stories and publicize them on the Web.
* Organized initiative to prioritize customer requirements for IBM's Data-Mining tools.

INVITED TALKS: Google Tech Talks 2007, IIIA 2006, Data Mining Marketing 2003, Stanford University Machine Learning Series 1996, SIGMOD recommender workshop 1998, Supercomputing conference, Stockholm, 1997, Interval Research 1996.

REVIEWER POSITIONS (sample): NSF: 2003, 2004, WebKDD 04, KDD 03, NSF 1996, NASA 1998, MLJ 95-97, ICML 97, PAMI 96, KDD 1996, TAI 1996, 1997.

PATENTS:
* Efficient and Reproducible Visitor Targeting Based on Propagation of Cookie Information - Yahoo.
* Sampling for SQL Aggregate Queries - Yahoo.
* Survey/panelist comment text-mining - Vividence.
* Path Analysis - Vividence.
* Distributed collaborative filtering - TiVo.

SKILLS:
* Funding:
  - Acquired funding at UCI ($300K) and IBM ($300K) from own initiative.
* Leadership:
  - Tech lead on several data-mining projects: Yahoo, TiVo, IBM.
  - Organized initiative to publish customer success stories at IBM.
  - Kick-started initiative at TiVo that enriched viewer logs, enabling TiVo to raise revenue from selling audience behavior reports.
  - Tech lead on IBM data-mining classes, TiVo research partnerships.
  - Tech lead on pre-sales visits for IBM DM software and services.
* Management:
  - Managed sampling team at Yahoo.
  - Managed initiative at TiVo to gain revenue through reports based on clickstream viewer data.
  - Managed design, user-requirements and user-interface design for IBM data-mining software system in Java and C++.
* Customer interaction:
  - Collected and prioritized customer requirements for software applications.
  - Won several customers for IBM.
* Project planning:
  - Skilled at converging between data-mining algorithm possibilities and customer data and business needs.
* Initiative:
  - Changed the direction of the IBM DM software from "application enabler" to a series of industry- and task-specific applications. Initiated project to systematically collect and manage customer requirements.
* Public Speaking:
  - One of the top choices for public speaking assignments in IBM's data-mining organization.
* Initiative:
  - Organized classes for prospective customers at IBM that won several contracts with Canadian firm, US banks and overseas business partners.
* Teaching:
  - Tech lead and main teacher for classes for prospective customers for IBM DM tools.
  - Taught undergrad programming languages classes at UCI.
* Programming:
  - C++ (10 yrs); Perl (11 yrs); R, S, S-plus (2 yrs); Java (2 yrs); ASP/COM (1 yr); Visual Basic (1 yr); C (9 yrs); LISP (8 yrs); SAS (2 yrs); SQL (1 yr); Unix shell scripts (15 yrs).

REFERENCES:
Mehran Sahami, Google Research, sahami@google.com
Michael Mahoney, Yahoo Research, mahoney@yahoo-inc.com
Michael Pazzani, Rutgers, pazzani@rutgers.edu
Wray Buntine, Helsinki Inst., buntine@hiit.fi
Byron Dom, Yahoo, bdom@yahoo-inc.com