How To Respond

The SPARC 2019 Roadmap for Action (INSERT LINK) accompanying the SPARC 2019 Landscape Analysis (INSERT LINK) outlined a three-pronged approach to addressing issues stemming from the increasing pursuit by commercial vendors of data and data analytics in the academic community. Of particular concern is that the academic community is fragmented at many levels, hindering its capacity to respond to challenges posed by the deployment of approaches and technologies that the academic community is unprepared to manage. At the base of SPARC’s recommendations were three critical ideas that supported the suggested way forward:

1. Base actions and plans on principles.

It is vital to identify a structured set of principles that represent a foundation and a compass for action. SPARC has identified principles used across many organizations when dealing with artificial intelligence,¹ complemented them, and given examples of how these principles can be translated into actual contractual clauses (Exhibits 5 and 6). A robust debate around these principles will make the list even more useful.

2. Increase coordination, alignment and—where possible—cooperation within the knowledge community.

The academic community is fragmented into a variety of institutions that struggle to operate as a coherent group. Academic institutions operate in different countries and regions, balance research and teaching differently, and often compete for funding, for faculty, and for students. Compounding this fragmentation is a long tradition of independence and self-management of schools and departments within large universities, as well as a distinctive separation between academic institutions and learned societies.

A significant change to this culture of decentralized decision making is unlikely. However, important opportunities exist for both closer cooperation and community realignment. It is certainly difficult to ask schools and departments that are historically independent to abandon that position, but these entities will likely support coordination if it helps them. Better procurement processes, help and support in the management of data sets and flows, and updated policies that focus on the best way to support authorized uses of data (as opposed to merely protecting from unauthorized access, as is often the case today) could prove popular even in highly decentralized institutions. Similarly, in recent years learned societies have operated at arm’s length from (and sometimes in conflict with) academic institutions.

3. Support structured processes involving as many stakeholders as possible to define responses.

Because of fragmentation, each institution will likely define its own approach and develop actions and programs to fit its specific culture and priorities. Many of these decisions will require the in-depth understanding of specific legal, ethical, business, computer science, and economics issues and solutions. It is just as important to bring in the voices of all stakeholders, including those in communities that may be least protected from the impact of the actions and programs.

Specific Actions

The SPARC 2019 Roadmap for Action (INSERT LINK) that accompanied the original SPARC 2019 Landscape Analysis (INSERT LINK) also proposed several specific actions to manage the impact of data analytics and AI on academic institutions and their communities. The proposed actions (Exhibit 7) fall into three categories: Risk Mitigation, Strategic Choices, and Community Actions.

Though this diagram is not intended to be prescriptive, the algorithms describe some examples of potential steps to promote open infrastructure, and the metrics describe how to measure the success of those efforts.

Metrics (what to measure) should be clearly differentiated from algorithms (how to measure). Metrics should be determined only by the academic community, while algorithms can come from a variety of sources, although their use must be subject to the principles we outlined earlier.

In addition to the actions proposed in 2019 and 2020, there are three additional courses of action to consider this year: organizational change, cooperation, and community-controlled infrastructure.

Introduce necessary organizational change. Institutions have sometimes treated data and infrastructure as an afterthought, but one lesson of 2020 is that understanding what data and information are collected, by whom, for what purposes, and with what protocols, is a necessity. It is also critical to decide when and how the information should be used, what principles and codes of ethics should apply, and what control the academic community should exercise over the process. This points to the growing need for institutions to establish key roles: chief data officer and university (or college) ethicist.

The chief data officer (CDO) role is becoming a familiar one. For example, Title II of the Open Government Data Act² (passed into law on December 31, 2018 and signed on January 14, 2019) requires every federal agency to appoint a chief data officer and defines a list of strategic tasks. Educause (the largest community of technology, academic, and campus leaders advancing education through IT) has a higher education chief data officer working group. In many institutions, this role may need to be upgraded: the position’s objective should include managing the strategic uses of data and the principles that should be adhered to when using data tools in addition to the operational ones, while collaborating with IT departments on IT issues. Many CDOs today are instead focused on more technical activities around data collection, preservation, and security.

The proper role of the CDO is not managing data (which should continue to be the responsibility of individual offices and departments), but rather developing strategies, policies, procedures, and guidelines, as well as transferring best practices. Most current institutional data policies are almost exclusively oriented towards security, focusing primarily on limiting the risks of unauthorized access to different types of data, and they do not address the strategic uses of data and the non-negotiable principles which should be required for authorized data access. It is also critical that CDOs work with their peers to identify—and demand that vendors correct—the systemic biases that characterize virtually all algorithms.

The role of university/college ethicist is a newer concept. Several issues posed by the adoption of data analytics have no single, straightforward answer that suits every campus. Is it acceptable for an academic institution to screen applications with software or to monitor the online behavior of applicants? Is it fair to use software to detect possible online cheating in spite of reports that it may disproportionately single out minorities, women, and some people with disabilities? Is it acceptable to use software that predicts possible violent behavior of students and staff and take action for acts that have not yet been committed? Should research data in relevant disciplines be aggressively harvested for commercial purposes to make up for falling income from other sources, or should data be made open for society at large to profit? Institutions will legitimately come to diverging conclusions.

The role of a university/college ethicist is to lead and facilitate the institution’s response to these and the many additional issues that data analytics and AI will pose. Ideally, answering these questions would be based on collecting inputs from disciplines such as ethics, law, economics, and computer science and then consulting representatives of all the categories of people affected, such as faculty, staff, students, and administration. However, this process must be properly organized and managed, and having an individual explicitly tasked with leading this process is crucial. In addition, a university/college ethicist would be available to consult with anyone seeking ethical advice when studying an academic initiative. This role should be distinct both from that of an ombudsperson (who is generally charged with collecting and investigating allegations of misadministration or violation of rights and codes and who may also be alerted to violations of the law) and from that of a compliance officer (who is primarily tasked with reducing legal risks for an academic institution).

The role of chief ethicist is a novel concept, but it is not a completely unknown one. In the corporate world, by 2019, several companies had identified individuals to steer corporate decisions in accordance with values.³ Some pioneering examples are also found in the academic community (e.g., Penn State and UC San Diego⁴). However, a review of several current academic searches for these positions indicates that many academic institutions view this role as overlapping with the management of compliance, narrowing the role to support of legal and audit activities and reducing its visibility.

In a prescient article published in 2003, John B. Bennet argues that ethics officers in a diminished role could do more harm than good⁵ by conveying to the faculty and staff that they do not have to exercise personal ethical judgment (since someone else is now in charge) and by failing to influence institutions if they do not have the ear of governing boards. Bennet’s article underscores the need for these appointments to be made at a senior level, be focused on ethics rather than compliance and auditing, and have regular and unfiltered communications with governance boards.

Pursue cooperation. Individual academic institutions are generally not able to negotiate from a position of strength with publishers that have access to much more information, such as the prices paid by comparable institutions, or how many different customers the publisher may have in a country or institution for its different products.

Investing in community-controlled infrastructure is the most obvious next step, but not the only one possible. Academic institutions could work with learned societies, for example, to lure them away from their dependence on subscription revenues from publishers. Many societies must be wondering whether Elsevier’s decision to sign a memorandum of understanding with the University of California that has no “reading” revenues and includes society journals in their OA scheme should alarm them. Their leaders, regardless of existing short-term agreements with publishers, must be wondering whether, over time, they can count on Open Access revenues to replace what they earned through their share of subscription revenues. This event presents a unique opportunity to launch programs aimed at aligning the interests and capabilities of academic institutions and societies (and perhaps some publishers).

Other opportunities for collaboration include, for example, pursuing advocacy on specific themes of common interest (such as surveillance and the sale of data to third parties), supporting litigation and antitrust actions, funding and developing open educational resources, and lobbying for student protections against inclusive access and for digital circulation rights.

Invest in community-controlled infrastructure. Corporations move fast, often much faster than academic institutions. Since the November SPARC 2019 Roadmap for Action (INSERT LINK), the pandemic has understandably set back plans for community investment in infrastructure. However, commercial players have continued to advance their plans for leveraging data analytics and further entrenching themselves in critical academic processes. Senior leaders of academic institutions still have an opportunity to mobilize the financial resources and talent necessary to develop community-owned infrastructures that support open and equitable dissemination and preservation of research communications and the attached metadata, and that also allow those metadata to be analyzed to help senior decision makers manage their institutions by their own priorities.

Considering the benefit to the community, the resources required to fund such a project may be a wise investment. Building a fully functioning research dissemination and data analytics company may require an investment of less than $40–50 million, but this money must be raised, and that leads to questions of whether this is best accomplished by partnerships between the academic community and the private sector, between the academic community and NGOs, or between the academic community and governments. In turn, this requires understanding if there is an opportunity to build and operate a sustainable community-owned infrastructure, how it should be funded, and whether the intellectual and knowledge output of academic institutions should generate financial resources to fund this infrastructure. The launch of Invest in Open Infrastructure (IOI) provides appropriate coordination for the academic community to develop a full community-controlled infrastructure. Alternatively, leaders from research institutions around the world should commit to building this infrastructure, with the support of funding bodies, if necessary. This leadership group would commit to designing the infrastructure to further the interests of the global academic community, and not just those of wealthy countries or institutions.

The choice between open and closed data and knowledge has implications along a spectrum of issues extending beyond funding academic knowledge infrastructure. For example, open data raises national security and economic competitiveness issues, as well as questions about academic freedom, academic priorities, and even the fundamental goals of academic institutions. Launching a structured process to analyze these implications appears to be a critical step that leaders of academic institutions need to take sooner rather than later.

4. Learn from the pandemic.

In addition to the initiatives SPARC identified as part of the SPARC 2019 Roadmap for Action (INSERT LINK), the subsequent 18 months demonstrated both the value of science and knowledge and the necessity of fostering open science practices. These two broad societal themes must be pursued, both because the pandemic has demonstrated the need to ensure equitable access to progress in health care practices and because other looming challenges such as climate change and the loss of biodiversity require major advancements in research.

Foster equitable open science practices. Open science cannot be equitable if research is inequitably focused on the most privileged members of society. The weight accorded to leading journals because of their impact factors (IF) has given these journals the incentive to operate a covert science policy: publishers and editors have incentives to maintain or raise their IF, and this leads them to prioritize publishing articles that are likely to be widely cited. This means they will prefer to publish articles in areas that are “fashionable” and of wide interest, and this focus of the leading publishers in turn affects funding and the priorities of funding bodies (Exhibit 8). Unfashionable disciplines and approaches (like those affecting rare diseases or people in disadvantaged communities) are structurally disadvantaged by these dynamics.

SPARC has been aware of these issues for a long time because of its global work, but vast groups within the academic community have not yet focused on them. The academic community must acknowledge that it cannot be held hostage to impact factors, not just because of their limitations in assessing researchers when making key tenure and promotion decisions, but also because they foster fundamental inequities.

When the Declaration on Research Assessment (DORA) was released in 2012, the distortions caused by the IF were well understood,⁶ yet little changed over the next eight years. In 2019 the National Academies of Sciences launched a roundtable to define new incentives for open science that align tenure and promotion decisions to virtuous behavior. It is vital that the academic community support this process and translate the roundtable’s recommendations into policies and actions.

Raise societal investments in knowledge as a critical priority. The academic and research community has achieved major accomplishments since the onset of the pandemic in early 2020, leading to the fastest identification of a new virus and development of vaccines, treatments, and protocols ever seen. At the same time, most academic institutions were able to provide continuity in teaching, learning, and research services. More broadly, knowledge has helped mitigate the risks of a pandemic that some expected but few were properly equipped to tackle. The academic community should build on this success to demand much more support from the rest of society.

Politicians and regulators increasingly recognize that open knowledge dissemination accelerates progress. As part of its infrastructure plan, the Biden administration initially proposed an ambitious increase ($250 billion over several years) in the government’s research and innovation investment. Though this seems like a large amount, it is less than 10% of the total additional infrastructure spending the administration initially proposed, and it remains to be seen what will happen as the plan goes through Congress.

In this report, we have pointed out how the past year has seen more deals that led to even more concentration and loss of diversity, and that ultimately further eroded the academic community’s control over its destiny. We have also highlighted some positive signs: a large merger failed, Invest in Open Infrastructure was launched as a concerted effort to build a community-owned infrastructure, and some legislative progress has been made.

Much remains to be done, but we see many emerging signals that the academic community understands that regaining control over its content, its data, and its infrastructure is vital to achieving its objectives and staying true to its values. We look forward to continuing our efforts to support the knowledge community as it regains control over these critical elements.

About the authors

Claudio Aspesi

Scholarly Publishing and Academic Resources Coalition