Publications and the usability of scientific data
Peter N. Schweitzer

1. Who am I
    Office of the Chief Geologist, Information Resources Group
    Coordinate scientific data management divisionwide (4 years)
    Previous work:
        Data management in GD Global Change Program (5 years)
        Software developer, Microway Inc. (2 years)
    Science background, Ph.D. Oceanography (micropaleontology)

2. Purpose of talk:
    Describe and interpret changes I have seen in the interaction of publication processes and scientific data management during the past 10 years, emphasizing USGS.

    Main points:

    (a) Rising technological capability of both scientists and the public has increased the usefulness of digital scientific data and diverted the interest of USGS research staff from the printed publication towards the release of operable digital data products.

        "Operable digital data products" means data that people can process using software that understands the scientific content of the information, not merely graphics.

        Print is not going away, but print is no longer the only game in town.

    (b) The use of digital data products by the public requires changes in the way we do our work; these changes are partly organizational, partly philosophical, and are nowhere completely effective.

        Positions in science teams explicitly designated as data managers have had their roles expanded to subsume more
            Documentation directed towards outside users
            Design and deployment of data access systems
            Direct user support

        Positions in the publications groups have been explicitly redirected towards digital processes, and also include some of these same roles.

        Emerging needs now widely recognized as crucial tend to appear between the boundaries of existing organizational purviews:
            Reengineering of data catalog systems
            Enhancement of data documentation
                increasing consistency
                widening audience
            Support of data users

        These functions stress existing personnel in data management and publication roles.
    (c) Institutional resistance to these changes stems in part from the diversity of functional roles that USGS plays, each of which is supported by a rather different, but valid, philosophy of interaction with the public.

        Historically, USGS has had 3 different ways of dealing with the users of our information:

        (1) Basic researcher
            Anticipate questions nobody has asked yet
            Limited contact with users, mostly core professionals
            Credibility established through publication

        (2) Consultancy
            Address specific questions from clients
            Intimate interaction with client organizations
            Data not generally available to the public
            Credibility established through clientele

        (3) Data warehouse
            Questions answered are longstanding and well known
            Data transfer (order processing) support
            Redirects customers to consultants for help
            Credibility established by quality of data, which is governed by procedures

        These represent a continuum along which the types and roles of the data and information users vary.

        The increasing usefulness of digital data to the public and to non-core professionals strains these roles because operable digital data have different properties than printed publications or client-specific services.
            Data products permit (and require) rapid revision
            Data improve as people use them if you're paying attention to the users' experiences and feedback
            Changes in users' tools tend to change real or perceived needs in data structure and format

        Concepts and methods from the software industry can usefully inform our handling of digital data products
            As with software, review of data cannot detect all errors and usability problems
            Early release of products with promise of technical support and working feedback mechanisms
            Flexible, multi-tier technical support
            Teachers and learners vs wise hermits
                Organizations that rely on people who know what nobody else knows tend to be brittle, weak, and vulnerable to staff turnover.
                A more robust philosophy is for people to strive to know what everybody else knows. The result is a community in which everyone is a teacher and everyone is a learner as well. This effect is most obvious in the field of desktop technical support.
            Reputation for responsiveness counts, but takes time
            Explicit versioning of products, increasing the awareness of changes among end-users
            End-users become partners in the review process and, consequently, in subsequent development efforts

        (4) Software developer
            Questions arise in feedback from users
            Work with users to understand their needs
            Modifies data, develops tools for users
            Credibility established by usability of data and quality of technical support

(end)