Computing Science and Statistics Symposium on the Interface with the theme of "Mining and Modeling Massive Data Sets In Science, Engineering, and Business"

==Symposium Summary==

The 1997 Symposium on the Interface of Computing Science and Statistics will be held May 14-17, 1997 at the Houston Medical Center Holiday Inn Hotel in Houston, Texas. The Symposium is being organized around the theme of ``Mining and Modeling Massive Data Sets In Science, Engineering, and Business,'' with a sub-theme of the environment and quantitative environmental science. The conference is sponsored by the Interface Foundation of North America, a non-profit educational corporation. The Statistics Department at Rice University is hosting the meeting with David W. Scott as chair.

==General Information==

This document describes support for the 1997 Interface Symposium. The Symposium is a unique opportunity for professionals in statistics, computer science, and various applications areas to interact on issues at the interface of these disciplines. It fills a critical gap between the activities of our large professional societies. The theme for the 1997 meeting will be ``Mining and Modeling Massive Data Sets In Science, Engineering, and Business,'' with sub-themes of multimedia education and quantitative environmental science. We are working toward attracting eminent leaders from computer science, statistics, engineering, numerical analysis, computational biochemistry, and business. At this intermediate stage of planning, the program committee reports a strong response from prospective speakers to the invited paper program. The combination of data mining with statistical thinking is attractive to primary as well as secondary sources. Given the enormous investments in the data collection phase as well as the potential payoff for meaningful analyses and data exploration, we expect a diverse and enthusiastic audience. The preliminary program includes the following: Jerry Friedman, keynote address.
The invited program includes some thirty sessions on topics including virtual reality, marketing applications, multimedia education, numerical methods, pattern recognition, visualization, mapping, environmental statistics, wavelets, dimension reduction, computational biochemistry, Bayesian methods, networks and clusters of workstations, virtual departments, and information retrieval for massive data sets.

A partial list of invited speakers includes Russell Almond (ETS), Mike Berry (UT), Adrian Bowman (Glasgow), Syd Burrus (Rice), Dan Carr (GMU), Sid Chib (UW-SL), Bill Cleveland (ATT), Di Cook (ISU), Dennis Cox (Rice), Jan DeLeeuw (UCLA), Bill Eddy (CMU), John Elder (Consultant), Kathy Ensor (Rice), David Findley (Census), Peter Guttorp (UW), Trevor Hastie (Stanford), Lasse Holmstrom (Finland), Samuel Kaski (Finland), Jon Kettenring (Bellcore), Vicki Lancaster (KSU), Al Liebetrau (Battelle), Mike Locke (BD), David Madigan (ETS), David Marchette (NSWC), Mark Marson (DOD), Cleve Moler (Matlab), Marlene Mueller (Berlin), Bala Narasimhan (Stanford), Guy Nason (Bristol), Sallie McNulty (KSU), Michael O'Connell (BD), Wayne Oldford (UW), Art Owen (Stanford), Jan Pedersen (Verity), George Phillips (Rice), Wendy Poston (NSWC), Carey Priebe (JHU), Brian Ripley (Oxford), Peter Rousseeuw (Belgium), John Sall (SAS), Bill Sallas (Sandoz), Michael Schimek (Austria), Juergen Symanzik (ISU), Bill Symes (Rice), Terry Therneau (Mayo), Ed Wegman (GMU), Sandy Weisberg (UM), Jerry Whittaker (USDA), Leland Wilkinson (SPSS), and Russell Wolfinger (BD). Our aim is to bring a number of new faces to the Interface Symposium and thereby enrich both computational statistics and the applications areas of massive data sets.

==The Interface Foundation of North America, Inc.==

The Interface series has grown almost by accident. An informally organized Board of Governors, consisting of all past program chairs, has existed since the Fourth Symposium.
Thus the Board of Governors has grown by one with each passing year. With this structure, the collective planning experience could be passed on to new program chairs. However, the sole administrative responsibility of the Board has been to choose the next Program Chair. Once this has been done, the total responsibility for program, publication, finances, advertising, and local arrangements has been in the hands of the newly elected Program Chair. As the scale of the Symposia increased, this became a burdensome responsibility. In addition, because there had been no corporate entity underpinning the Symposium series, all funding for each Symposium has been funneled through the university or corporate host, with essentially no mechanism for passing on seed money to subsequent Program Chairs. In several cases, this has meant that Program Chairs have had to take out loans to provide the initial financing for their Symposia. In all cases, contracts signed with the hotels have been the personal liability of the Program Chair. More significantly for the funding agencies, this lack of corporate underpinning has meant that the Symposium series could not be self-sustaining, since there was no legal entity to perpetuate the funding. Finally, the unique interdisciplinary character of the Interface series had been threatened as more discipline-oriented societies offered, in essence, to take over the Interface series. The Board of Governors felt that this would be a serious threat to the integrity of the series as a true interdisciplinary forum.

For all of these reasons, in 1986 the Board of Governors established a committee chaired by Lynne Billard and including Bill Eddy, Bill Kennedy, Jim Gentle, and Ed Wegman to investigate the possibility of incorporating as a non-profit educational foundation. Ed Wegman spearheaded this investigation and proposed a set of bylaws at the Nineteenth Interface Symposium held in Philadelphia.
The Board of Governors voted unanimously to incorporate, and the Interface Foundation of North America, Inc. was established as a Virginia corporation in late August, 1987.

==The Interface '97 Program==

For many years advances in statistics, and particularly in statistical computation, have been driven by the general demands of industry. An early example is Student's t-test, invented by an industrial statistician to solve a quality control problem. However, the explosion of on-line resources and of affordable computing power has dramatically increased expectations of what computational and statistical scientists can provide. Data warehousing is a new trend in industry which is intended to provide information support to all segments of a business. Government is moving to provide on-line access to many of its databases. These databases are approaching terabyte size. This explosive growth is being matched in many academic research labs and libraries, among others. The problems of efficiently and effectively searching and modeling such massive data sets are the focus of the 1997 Interface Symposium.

Many traditional statistical and computational tools have been brought to bear on these types of problems. Innovative visualization methods, where possible, often shed light on underlying structure. Novel computational paradigms, such as neural nets, Bayesian methods, and artificial intelligence tools, are required even to begin to understand and model the data. There is a wealth of ad-hoc techniques developed by both statisticians and computational scientists. An active discussion between subject matter specialists and statistical and computational practitioners is sure to lead to a fruitful interchange. Our intention is to foster such a dialog via the Interface conference.

A conference on massive data sets was hosted by the National Academy of Sciences July 7-8, 1995.
Jon Kettenring and Daryl Pregibon chaired the largely applications-oriented meeting of scientists and users. Government participants came from NASA, DOD, EPA, Bureau of Justice Statistics, NSF, NSA, and NSWC. We hope the papers at the Interface will provide an initial indication of how their problems may be solved, or of directions that will lead to solutions. We also plan to involve professionals active in environmental modeling and visualization, and will have multiple sessions emphasizing the environment and the interplay between computer science and statistics related to massive data sets generated by study of the environment. Peter Guttorp is organizing one session on collaborative research technologies that underlie the organization and collection of massive data sets.

==Participation by Other Societies==

In the past, several professional societies have acted as cooperating societies. For Interface '97, we have invited the Institute of Mathematical Statistics, the American Statistical Association, the International Association for Statistical Computing, the Society for Industrial and Applied Mathematics, and the Operations Research Society of America to be cooperating societies and jointly sponsor this meeting.