Accomplishments

With data being collected and stored electronically at an unprecedented rate in almost all fields of human endeavor, the efficient and responsible extraction of useful information from data has become a crucial scientific challenge and a critical economic need. In the early 1990s, Rakesh and his team began devising algorithms for asking open-ended queries, eventually authoring a 1993 paper on association rule discovery that later became the foundational paper for the field of knowledge discovery and data mining. Association rule discovery is a data mining approach for finding unexpected patterns in large data sets. Rakesh is fondly referred to as the father of data mining because of this seminal work and the other fundamental data mining concepts and technologies he devised. Notably, the ACM SIG on Knowledge Discovery and Data Mining gave its inaugural Innovations Award, for outstanding technical contributions to the field, to Rakesh. Four of his papers on data mining have received “test-of-time” awards: two from SIGMOD and one each from VLDB and ICDE.

It is rare for a researcher’s work to create not only a product but a whole new industry. IBM’s data mining product, Intelligent Miner, grew directly out of Rakesh’s research, and its introduction, together with associated services, created a new category of software and services. His research has been incorporated into many other commercial products, including DB2 Mining Extender, DB2 OLAP Server, WebSphere Commerce Server, and the Microsoft Bing search engine, as well as many research prototypes and applications.

Subsequently, Rakesh and his team pioneered key concepts in data privacy, including Hippocratic Database, Privacy-Preserving Data Mining, and Sovereign Information Sharing. In a series of papers, the Hippocratic database work laid out the principles, architecture, and technologies for a database system that includes responsibility for the privacy of data as a founding tenet. The privacy-preserving data mining work introduced techniques for building accurate decision models without accessing precise information in individual data records. The sovereign information sharing work enables autonomous entities to apply database operations across private databases in such a way that no information apart from the result is revealed. These bodies of work have gained increasing importance with the emergence of cloud computing and well-publicized cases of inadvertent and intentional misuse of data collections. In recognition of this work, SIGMOD 2014 selected one of his data privacy papers to receive the test-of-time award.

Recently, Rakesh and colleagues have been extending data mining and applying it in novel ways to enhance electronic textbooks and online education. This body of work has already provided technologies for algorithmically identifying deficient sections in a textbook, augmenting textbooks with rich content in multiple formats mined from the Web, and forming study teams with the goal of maximizing overall learning. Given how critical high-quality education is to success in modern society, this body of work is destined to gain increasing importance.