Center for Information Systems Integration and Evolution

Dr. Larry Kerschberg, Director
Dr. Hassan Gomaa, Associate Director
School of Information Technology and Engineering
George Mason University
Fairfax, Virginia 22030-4444

1. Introduction

The Center for Information Systems Integration and Evolution (CISIE) has been established within the School of Information Technology and Engineering at George Mason University to focus on the issues and barriers in the integration and evolution of information systems. In particular, CISIE will focus on the important research and development issues confronting the nation in areas such as business process re-engineering; data, information and knowledge interchange; information integration in large-scale systems; and information systems evolution.

We are witnessing the right-sizing and re-engineering of organizations to better address customer needs and to meet business imperatives for inserting information technology solutions. New architectures are being investigated which bring the computing resource to customers so as to capture information at its source, and to share it as a corporate resource through information repositories. Moreover, there is a need to address national challenges such as health care, advanced manufacturing technology, the national information infrastructure and crisis management, all of which require interdisciplinary approaches that bring together domain experts as well as information, computer and software scientists and engineers to create new information system paradigms. Large-scale systems such as the Human Genome Project, the Earth Observing System Data and Information System (EOSDIS), and the Defense Information Systems Agency's efforts to integrate and evolve existing information systems all require the integration and interchange of information from heterogeneous autonomous data systems.

1.1. Personal and Corporate Information Management

The integration of computing and communications has implications for the way people work, interact and play. Increasingly, knowledge workers can use collaboration technologies to work at distant locations on the same project, edit the same document simultaneously, and interchange information via electronic mail. Laptop computers allow a traveler to keep abreast of developments at headquarters via facsimile, electronic mail, remote access to corporate servers, and on-line database searches. The advent of the personal digital assistant (PDA) indicates that remote, mobile computing and communications will become commonplace.

Clearly, the means to capture and manipulate information via portable, hand-held computers has far-reaching implications for both individuals and enterprises. There will be a need to foster individual computing while also allowing the enterprise to establish policies and standards to ensure the integrity, confidentiality, consistency and validity of its information resources. These resources include data stored in databases, software stored in software repositories, and corporate knowledge found in application programs, data dictionaries, corporate policies and procedures, and in the minds of corporate experts.

1.2. Enterprise Engineering

In the last few years, our investment in information technology has begun to pay off not only through increased worker productivity, but also through the judicious application of such technology in the automation of corporate processes and work flow. Enterprises are beginning to use information technology to allow vertically integrated, stove-pipe systems to exchange information to support enterprise-wide applications. Modern information technology is breaking down the organizational barriers created by these stove-pipe systems. Departmental local area networks allow functional units to organize large-scale applications in support of their information processing needs.
Data and knowledge interchange standards and protocols allow the diverse functional units to interact with the corporate information systems by accessing and manipulating data locally and then uploading the data to augment the corporate databases. The trends toward right-sizing and re-engineering the corporation are a direct result of the impact of information technology: its ability to gather and process information at its point of origin and then move that information throughout the organization. Further, enterprises are realizing that data can be separated from the processes manipulating that data. This has two benefits: 1) the processes can be re-engineered to reflect the most appropriate match of enterprise resources to the capture and manipulation of the information, and 2) data can be viewed as a corporate resource to be managed by modern tools such as data dictionaries and database management systems.

1.3. Information Integration in Large-Scale Systems

Several large-scale engineering and scientific information systems, such as the Human Genome Project, the Earth Observing System Data and Information System (EOSDIS), the Defense Information Systems Agency's efforts to integrate existing information systems, and the Intelligence Community's Open Source Database System, all require the integration and interchange of information from heterogeneous data systems and information repositories.
These systems share several characteristics and requirements:

1) Support collections of heterogeneous computer systems with diverse hardware, software, file systems and database systems.
2) Support "legacy" applications constructed as stand-alone systems using software tools that are no longer supported by vendors.
3) Support and use standard commercial-off-the-shelf (COTS) software systems such as User Interface Management Systems, Database Management Systems, Computer Assisted Software and System Engineering (CASE) Tools, etc.
4) Integrate information from multiple sources, even though this information may be heterogeneous in nature, that is, of different data types such as text, formatted records, temporal data, sequence data, spatial data, analog video and audio data, and digital imagery.
5) Support the specification, computation, versioning and visualization of derived data, called data products, which may be based on multiply typed data, simulation models, complex computational models, knowledge-based reasoning methods, etc.
6) Provide collaborative environments where researchers, planners and practitioners can evolve a system by using modular, domain-specific tools that allow a new system component to be specified, constructed, tested and integrated into the existing system.

The above trends and requirements indicate that information integration, sharing, interchange and evolution will be key concepts that will drive the technological and policy considerations for future systems during this decade and beyond. New concepts, tools, techniques and policies will be needed to support the goal of seamless integration of legacy systems and modern applications. Further, the emerging National Information Infrastructure will present new application areas such as Knowledge Servers and Digital Libraries.
For this reason, we propose the Center for Information Systems Integration and Evolution to focus on these issues and to serve as a national resource for R&D activities.

1.4. System Evolution

As our systems become increasingly complex, they must be designed to evolve and adapt to changing conditions within their environment. Thus, we need to endow these systems with knowledge which will allow them to adapt. This will entail research into new methodologies, methods and tools for the specification, design and implementation of these evolutionary systems. Further, they should be able to learn from their behavior and environmental changes. Automated techniques are needed for systems to discover new knowledge, plan for its incorporation into the system knowledge base, and effect the evolutionary changes required for it to adapt to changing situations.

2. Goals of CISIE

The Center has a wide-ranging and ambitious agenda in research, development and education. The goal is to serve as a national and international resource to academe, industry and government. Our unique location near the nation's capital provides George Mason University with access to many regional software and systems companies, as well as government laboratories.

The primary purpose of the Center will be to perform leading-edge research and development. We wish to create an environment in which industry and government will look to the Center for advanced doctoral and MS students, for meaningful partnerships for creating "pre-commercialization" proof-of-concept prototypes, and for advanced conceptual and technological solutions to systems-integration-related problems. CISIE will also focus on innovative mechanisms for technology insertion, technology transition and technology transfer.

The educational component of the Center will focus on the publication of technical reports, research review briefings, and short courses in related topics.
It is hoped that the research topics addressed by the Center will also find their way into regular academic courses.

2.1. Research Agenda

The Center has established a long-range strategic research and development agenda to address fundamental issues that pertain to systems integration and evolution. The interdisciplinary nature of the Center will provide the crucial knowledge and expertise required to solve large-scale science and technology problems, many of them serving "dual-use" purposes for civilian as well as governmental and military applications.

2.1.1. Integration Science

    Concepts for Structural, Semantic and Behavioral Heterogeneity
    Interoperability of Computer-Based Information Systems
    Knowledge-Based Information Integration and Interchange
    Distributed Collaborative Planning
    Intelligent Agents for Electronic Commerce
    Knowledge Repositories

2.1.2. Software Systems Engineering

    Domain Modeling Methods and CASE Tools for Software Reuse and Evolution
    Large-Scale Reuse of System-Level Components
    Client/Server Domain Models for Information Commerce
    Software Performance Engineering
    User Interface Design for Integrated Systems
    Software Test and Tools
    Reuse of Software Artifacts
    Software Repositories

2.1.3. Large-Scale Information Systems Integration

    Earth Observing System Data and Information System (EOSDIS)
    Open Source Literature Analysis for the Intelligence Community
    Knowledge Repositories for the Information Superhighway
    Information Systems Integration for the Defense Information Systems Agency

2.1.4. Architectural Issues in Information Systems Integration

    Degrees of Interoperability
    Mediation Architectures
    Federation Architectures
    Multidatabase Architectures
    Support for Massive Digital Databases and Knowledge Bases

2.1.5. Knowledge-Based Information Systems Integration and Evolution

    Intelligent Access to Information in Heterogeneous Systems
    Interoperability of Meta-data and Information Repositories
    Knowledge Discovery from Multiple Databases
    Evolutionary Architectures for Structural and Functional Evolution
    Distributed Planning and Cooperative Problem Solving in Heterogeneous Systems

2.1.6. National Information Infrastructure

    Seamless Integration of Information Superhighway Resources
    Distributed Query Processing over Heterogeneous Systems
    Information Security and Network Reliability
    Protection of Intellectual Property Rights
    Concept of Universal Service
    Digital Libraries
    Life-Long Learning

2.1.7. Policy and Security for Integrated Systems

    Multi-level Security for National Information Services
    Internet Intellectual Property Guidelines
    Software Standards
    Security Policies for Interoperable Systems
    Security Models for Systems Integration
    Network Security Models

2.2. Development Agenda

Development activities will revolve around research laboratories whose projects will implement the concepts and tools proposed in their research. The particular R&D projects will depend on the funding available.

2.2.1. Laboratory for Evolutionary Software Architectures

This laboratory already exists and is headed by Dr. Gomaa. The major project underway is the Knowledge-Based Software Engineering Environment funded by CIT/VCOE. Other laboratories will be formed as funding becomes available.

2.2.2. Knowledge Rover Laboratory for Logistics

This is a new laboratory that will be formed based on funding from an ARPA-Defense Logistics Agency contract awarded to American Management Systems (AMS), with CISIE and MIT as university team members. The contract calls for the development of an architecture for intelligent software agents and an active Data Warehouse for electronic commerce in the logistics area.
AMS, GMU and MIT will have internetworked laboratories linked to the client, the Defense Personnel Support Center in Philadelphia.

2.3. Recent CISIE Activities

Highlights of the recent activities of Center faculty are listed below:

1) ARPA Research in the Intelligent Integration of Information - work continues on the federated client/server architecture with its service-oriented approach to handling semantic heterogeneity among databases, temporal mediation services and the harmonization of conflicting answers to a query. Drs. Kerschberg, Gomaa, Jajodia, Motro and Wang are the participating faculty, and Dr. Jong Pil Yoon was a Research Associate on the grant. He has since returned to Korea as an Assistant Professor at Sookmyung Women's University in Seoul.

2) Independent Architectural Study of Earth Observing System Data and Information System (EOSDIS) - an interdisciplinary effort by Earth scientists and information and computer scientists to develop an architecture for EOSDIS. Drs. Kerschberg, Gomaa and Menascé were the CISIE members on this interdisciplinary project. Dr. Yoon served as Research Associate. The overall GMU-led consortium had 27 persons from four institutions. The final report and presentation were well received by NASA and Hughes.

3) PROGEN, Process Model Generation - here we continue the development of domain modeling tools and techniques to specify a domain model of a family of process models and to generate members of the family by selecting the features relevant to each member. Drs. Kerschberg and Gomaa are PI and Co-PI, respectively, on this project. Dr. Yoon served as Research Associate.

4) Knowledge Discovery in Databases - this project is concerned with the development of a large-scale multi-type reasoning system, called INLEN, for extracting knowledge from databases. The system assists a user in discovering general patterns or trends, meaningful relationships, and conceptual or numerical regularities or anomalies in large databases.
The volume of information in a database is often too vast for a data analyst to be able to detect such patterns or regularities. INLEN integrates symbolic learning and statistical techniques with database and knowledge base technologies. It provides a user with "knowledge generation operators" (KGOs) for discovering rules characterizing sets of data, generating meaningful conceptual classifications, detecting similarities and formulating explanations for the rules, generating rules and equations characterizing data, selecting and/or generating new relevant variables or representative examples, and testing the discovered rules on new data. Drs. Kerschberg and Michalski are working with several graduate students (Ken Kaufman, Jim Ribeiro, Ibrahim Imam and Eric Bloedorn) on developing aspects of the system. Jim Ribeiro is investigating knowledge discovery in multiple databases.

5) Linear Constraint Databases - this is new work in conjunction with Brown University and focuses on extensions and refinements of Dr. Brodsky's research. We note with pleasure that Dr. Paris Kanellakis, a senior researcher at Brown University, invited Dr. Brodsky to participate in this research project. This attests to the relevant, leading-edge research being performed by Dr. Brodsky.

6) NSF Research Initiation Awards - Drs. Brodsky and Wang, both CISIE members, were awarded RIAs from the Expert Systems and Databases Program within NSF. There were a total of eight awards, and GMU was the only university to receive two.

The faculty have a number of grants related to the general theme of Information System Integration and Evolution. We believe the Center will help to establish GMU as a leader in this field. At present the Center personnel are funded by ARPA, ONR, and Siemens, and proposals have been submitted to NSF and to the Software Productivity Consortium/CIT.
We plan to supplement our current grants by soliciting sponsored research funding from major agencies such as NSF, ARPA, ORD, NASA, NSA, etc.

3. CISIE Organizational Structure

The Center Director is Dr. Larry Kerschberg, Chair of the Information and Software Systems Engineering (ISSE) Department in SITE. The Associate Director is Dr. Hassan Gomaa, Professor of Information and Software Systems Engineering in ISSE, and Director of the Laboratory for Evolutionary Software Architectures. Our Research Administrator is Mrs. Robbie Gossner, who currently serves in that role in the ISSE Department and is supported by a number of grants associated with CISIE as well as the Center for Secure Information Systems. Figure 1 presents the evolving organizational structure for the Center.

Figure 1: Organizational Chart: Center for Information Systems Integration and Evolution

The current faculty members of the Center are listed in Section 4 below. In addition to the management team and its core faculty, the Center will have a Steering Committee composed of senior Center members, and will also establish an Industrial and Governmental Advisory Panel consisting of leaders from private industry as well as government agencies.

The External Partners to the Center will be invited from both private and public corporations as well as government agencies. Those we intend to contact include: Center for Innovative Technology, Software Productivity Consortium, Virginia Center of Excellence in Software Reuse and Technology Transfer, Northrop-Grumman Corporation, Hughes Applied Information Systems, MITRE, and Institute for Defense Analyses. The government agencies to be contacted include: ARPA, Naval Research Lab, NIST, NASA, US Geological Survey, and the Office of Research and Development of the Central Intelligence Agency.

4. Academic Members

The initial core faculty of CISIE are listed below.
The membership is quite interdisciplinary, and we have already shown that interdisciplinary research can be performed successfully, as demonstrated by the cooperative nature of the six-month EOSDIS effort for Hughes Applied Information Systems. The Center faculty are listed by academic unit.

Information and Software Systems Engineering
    Ammann, Paul E.: Software Testing, Fault Tolerance, Secure Databases, Access Control Models, Formal Methods
    Baum, Richard: Object-Oriented Programming and Methods
    Brodsky, Alexander: Linear Constraint Databases, Spatio-Temporal Information Systems, Geographic Information Systems
    Gomaa, Hassan: Analysis and Design of Real-time Systems, Domain Modeling
    Jajodia, Sushil: Information Systems Security, Temporal Databases
    Kerschberg, Larry: Expert Databases, Software Engineering, Knowledge Discovery and Evolution
    Motro, Ami: Intelligent Databases, Integration of Diverse Information, Uncertainty Management in Information Systems
    Offutt, A. Jefferson: Automatic Test Data Generation, Real-time Software Reliability
    Sanden, Bo: Software Design, System Software and Concurrent Software
    Sandhu, Ravi S.: Information Security Models, Policies and Mechanisms
    Sibley, Edgar H.: Information Systems Policy, Organizational Informatics
    Wang, Xiaoyang: Database Query Languages, Temporal Databases
Computer Science
    Menascé, Daniel A.: Performance Evaluation of Computer Systems, Software Performance Engineering, Distributed and Parallel Systems
Institute for Computational Sciences and Informatics
    Kafatos, Menas: Computational Science, Astrophysics and EOSDIS
    Michaels, George: Computational Biology, the Human Genome and EOSDIS
School of Business Administration
    Chen, Minder: Information Engineering, Collaboration Environments, and CASE Tools
    Dutta, Amitava: Intelligent Network Management, Network Optimization
Program in Organizational Learning
    Cox, Brad: Object-Oriented Software, Electronic Commerce, Robust Market Mechanisms for Electronic Goods, Software Entrepreneurship

5. External Partners

The External Partners of the Center will participate in various activities designed to foster technology interchange, insertion and transfer. This will be done by dissemination of research results through technical reports, research reviews, teaming on selected grant proposals and contracts, etc. The Center will serve as a focal point for research and development, and will allow industry and government to build effective mechanisms to take R&D concepts to prototypes and on to commercialization.

The Center will invite the organizations listed below to become External Partners. Note that we have already made contact with several potential industrial partners, particularly Northrop-Grumman and SPC/VCOE. The SPC/VCOE has indicated that its joint research program with GMU is of strategic importance.

Industrial Partners
    Northrop-Grumman Corporation
    Interactive Development Environments
    American Management Systems (AMS)
    BDM International
    The Analytical Sciences Corporation (TASC)
    Hughes Applied Information Systems
    Software Productivity Consortium (SPC)
    Virginia Center of Excellence in Software Reuse and Technology Transfer (VCOE)

CISIE will also invite selected governmental agencies and FFRDCs to participate in its research and development program. They include:

Government Partners
    NIST
    NASA
    Department of Energy
    Department of Transportation (DOT)
    US Geological Survey
    Virginia's Center for Innovative Technology