How Open are Commercial Scientific Software Packages?

A revised version of this post has been published as a Viewpoint in the Journal of Physical Chemistry Letters, DOI: 10.1021/acs.jpclett.5b02609.

Most scientific research nowadays relies on some kind of software. This is particularly true in fields such as my own, quantum chemistry. Such software is used in applications to study various problems in chemistry, often in close connection with experiment, and it also serves as a platform for the development of new theoretical and computational methods.

In quantum chemistry, many program packages are available [1] that differ in functionality, usability (from easy-to-use by non-specialists to usable only by the person who wrote it), and computational efficiency. What basically all available codes have in common is that they have been developed with public (i.e., taxpayers’) money. Nevertheless, the terms under which they are made available differ very significantly: Some program packages are available under open-source licenses (meaning that anyone can “study, change, and distribute the software to anyone and for any purpose” [2]), others are owned by commercial companies who sell them to both academic groups and industry users for a small or large fee. Intermediate models (free, but not open source) also exist, such as closed-source software that is distributed for free by academic groups [3] or software for which the source code is available to academic users, but with license terms that prohibit changes or redistribution.

Pushing towards Open-Source in Science

Open-source scientific software offers a number of advantages for science as a whole [4]. The most important one is that publicly funded scientific software should be available for everyone to use and extend. This has led some funding agencies, in particular in the US, to require software developed under certain grants to be open source. Recently, Krylov et al. published a viewpoint in J. Phys. Chem. Lett. that criticizes such open-source mandates [5].

The piece is written by eminent scientists, whose work in quantum-chemical method and software development I admire. What all of them have in common is that, besides being professors at research universities, they are co-owners of companies selling quantum-chemical software packages [6]. I find many of the arguments put forward in this opinion piece flawed, I miss a consistent use of terminology (free as in speech vs. free as in beer), and I think that it is full of contradictory statements.

Perspective of the Method Developer

Here, I want to focus on one particular perspective: The one I have as a developer of new quantum-chemical methods. To develop, test, and finally use a new idea, it needs to be implemented in software. Usually, this requires using a lot of well-established tools, such as integral codes, basic methods developed many decades ago, and advanced numerical algorithms. All of these are a prerequisite for new developments, but not “interesting” by themselves anymore today. Even though all these tools are well-documented in the scientific literature, recreating them would be a major effort that cannot be repeated every time and by every research group – because both time and funding are limited resources, especially for young researchers with rather small groups such as myself.

Therefore, method developers in quantum chemistry need some existing program package as a “development platform”. Both open-source and commercial codes can offer such a platform. Open-source codes have the advantage that there is no barrier to access. Anyone can download the source code and start working on a new method. I have so far mostly contributed my developments to commercial codes. These also offer a lot of advantages: For successful codes, the revenue from selling licenses can be used by the companies owning them to employ software developers who maintain and document the code. These can further improve code contributed by academic groups in order to make it maintainable, efficient, and easily extendable. This can speed up new developments and improve the quality and efficiency of the resulting new software.

Commercial Codes as “Open Teamware”?

The authors of the opinion piece in Ref. [5] argue that there is no need for open-source development platforms because many commercial codes, such as Q-Chem [7] and others, operate under what they call an “open teamware” model. As they point out, many commercial codes have assembled rather large communities of academic developers.

However, I would argue that access to commercial codes as a development platform is not as open as the authors of Ref. [5] claim. First of all, it is subject to signing a developer agreement, the terms of which are dictated by the companies owning the source code and are drafted to protect their commercial interests. Usually, they include a transfer of intellectual property rights for the new developments to these companies as well as non-disclosure clauses concerning the source code and algorithms implemented in it. (Here and in the following, I am not talking about specific software packages, because the terms of developer agreements are usually covered by non-disclosure clauses themselves. Therefore, I either do not know the precise terms, or I am not allowed to reveal them.)

Often, such developer agreements require exclusivity, meaning that new source code cannot be contributed to different commercial packages. Sometimes, developers are even banned from using competing program packages [8]. Such requirements for exclusivity prevent scientific collaborations. I have encountered this on several occasions, when fellow scientists told me that they would love to collaborate, but that they cannot do so because we are contributing to competing packages. Thus, the commercial interests of software companies lead to a segregation of the scientific community based on affiliation with certain codes. Often, methods developed in one program package are reinvented in others because scientists cannot collaborate or use each other’s software.

Perpetuating Power Structures

The use of commercial codes as development platforms also puts the few scientists owning the corresponding companies into a gatekeeper position. It is up to them to decide who is allowed to contribute new ideas and developments. The policies of different companies may differ significantly. However, all of them will require developers to reveal novel research ideas to the scientists in these gatekeeper positions. These will in many cases be competing scientists, who might deny access because the ideas are opposed to their own “scientific beliefs” or because they might interfere with their own lines of research.

These mechanisms lead to perpetuating power structures that put very few individual scientists, the owners of commercial software packages, in control of most method development in our field. It should be pointed out here that many of the authors of Ref. [5] are not the original developers of the commercial codes they now own, but that they have inherited these from their academic teachers. Such decisions will certainly have been based on scientific achievements, but they have not been taken by the academic community as a whole through peer-review and funding panels, but by the few pioneers who developed the software infrastructure our whole field relies on today.

This contradicts the merit-based access to scientific resources that the authors of Ref. [5] so keenly advertise. The possibility to carry out new method developments should only be based on the quality of new ideas (as judged by grant reviewers and panels of funding agencies) and not on whether or not a scientist is part of a certain school. The “track record of productivity” [5] rewarded by funding agencies with grant money should have been established with competitive ideas, not because of access to a software infrastructure built by a researcher’s academic ancestors. (Again, let me point out that I admire the track record of all of the authors of Ref. [5] – but I think that the playing field has to be leveled for the next generation of scientists.)

Finally, I have to admit that, at least in part, the problems discussed above also exist for open-source program packages. Often, these codes are less well documented and maintained (because of the lack of revenue from selling licenses), with the consequence that the barrier to contributing to them might be significant. Often, it can only be overcome by collaborating with one of the lead authors of such open-source codes, which again puts these authors into a similar gatekeeper position. In addition, open-source code is often not immediately released to the public in order to maintain a competitive advantage over scientists that might want to improve or build upon new methods.

Possible Solutions

A first step towards a solution would be to remove the conflict of interest many scientists owning and running scientific software companies face. If these companies are run by businessmen instead of active scientists, then decisions to grant access to new external developers will be based on the possible merits for the (paying) users of the software packages and will not be influenced by fear of scientific competition. Some commercial codes use such a model [9]. At least, the policies underlying decisions whether or not to grant access to external developers should be made transparent.

Second, I believe that funding initiatives to create open-source packages and to sustain their maintenance are an important piece in creating truly open platforms for method development. Apparently, such initiatives are being implemented in the US, both through national laboratories and via funding agencies [5]. Such initiatives provide a means to level the playing field, by making funding available to open-source packages that commercial codes can obtain via their revenues from selling licenses to academic and industrial users. Such initiatives should, of course, not destroy commercial codes, but level the playing field. In fact, there are also funding opportunities that are exclusively available to commercial codes, such as technology grants. In Europe, many programs under the Horizon2020 framework encourage or require the involvement of small or medium enterprises, and some quantum chemistry software companies have been very successful in securing such grants [10].

Concerning funding for fundamental research, open-source mandates might indeed have severe consequences for commercial codes because they would cut them off from academic method development. This could be mitigated by requiring such codes – if they want to profit from public funding for basic research – to implement a truly open platform strategy that allows non-discriminatory access to the source code for interested developers. With strict open-source mandates, commercial codes would still have the possibility to create new developments in the form of modular libraries released under open-source licenses.

Conclusions

I have focused here on the perspective of the quantum-chemical method developer. Of course, there are other aspects of this discussion that are equally relevant, such as the perspective of the users and the global perspective of science as a whole. Related discussions on open access and open data policies are often mixed with those on open-source software, which I find detrimental because the players are very different ones (small software companies run by scientists in the case of open source vs. huge publishers with monopolies in the case of open access). In any case, I want to repeat that this blog post only records some of my personal thoughts and I welcome any comments and discussions.

Conflict-of-Interest Statement: I am a university professor in theoretical chemistry whose research depends on funding by public money – via government funding of our university and via funding agencies. Our research is also supported by industry grants from Volkswagen AG, Wolfsburg.
Most of my past method development has been contributed to the commercial software package ADF, owned by Scientific Computing and Modeling (SCM) B.V., Amsterdam, under a developer agreement. I also have access to the Turbomole program package under a developer agreement with Turbomole GmbH, Karlsruhe. I have no financial assets in SCM, Turbomole, or other scientific software companies, and I did not receive direct or indirect financial compensation for these contributions. I have also contributed to the Dirac and Dalton packages, which are free for academic users, but not open source (yet). Some software developed in my research group is – or will soon be – available under open-source licenses.

References

[1] https://en.wikipedia.org/wiki/List_of_quantum_chemistry_and_solid-state_physics_software
[2] https://en.wikipedia.org/wiki/Open-source_software
[3] see e.g., the ORCA code, http://www.cec.mpg.de/forschung/mts-forschungsprojekte/orca-prof-frank-neese-dr-frank-wennmohs.html
[4] J. D. Gezelter, “Open Source and Open Data Should Be Standard Practices”, J. Phys. Chem. Lett. 6, 1168−1169 (2015). DOI: 10.1021/acs.jpclett.5b00285
[5] A. I. Krylov, J. M. Herbert, F. Furche, M. Head-Gordon, P. J. Knowles, R. Lindh, F. R. Manby, P. Pulay, C.-K. Skylaris, H.-J. Werner, J. Phys. Chem. Lett. 6, 2751-2754 (2015). DOI: 10.1021/acs.jpclett.5b01258
[6] see the “conflict-of-interest statements” at the end of Ref. [5]
[7] http://www.q-chem.com/
[8] http://www.bannedbygaussian.org/
[9] http://www.scm.com/
[10] http://www.scm.com/EUprojects/

OpenAccess: A Young Scientist’s View

Even though I wanted this blog to be mainly about science, I had some discussions about (science) politics over the last few days on Twitter and Facebook. So I decided to share some of my thoughts and hope that the discussion will continue here.

I am working as an independent young chemist in the southern German state of Baden-Württemberg. In Germany, the states are responsible for the universities and Baden-Württemberg is currently planning a new university law. There are many points in the current proposal that are worth being criticized by scientists. I will not go into details here; that would be another post and I am not an expert on many of these aspects. However, instead of discussing these big issues, lots of criticism is focussed on one small provision concerning OpenAccess publication. Examples can be found in the newspaper FAZ [1] and in an article in the newspaper of the Hochschulverband – the largest association of German professors [2].

What is all the fuss about? The way science works is that I do stuff (science) and when I have found something more or less interesting I write an article about it and send it to a scientific journal. The editor there sends it to colleagues, who read it carefully (most of the time) and write some feedback to help the editor decide whether my article should be published in this journal. If the feedback is positive, the article is edited, formatted and published online. However, before publication I have to sign a form in which I hand over more or less all rights to my article to the publisher of the journal. Usually this happens already when submitting the article – meaning that without signing such a copyright transfer agreement nobody will ever look at or read my work.

Once my article is published online, you can only download it if you subscribed to this journal. These subscriptions are usually bought by the universities (for a lot of money). Depending on the terms in the copyright agreement, I am usually not allowed to make my article available in any other way, for example by posting it on my website. So if I do not like the terms, I could pick another publisher. However, the choice of journal is determined by many other considerations (readership of the journal, topic of the article, and – unfortunately – also the journal’s impact factor). This would be a topic for another blog post. Often, there is not much choice for a given article – in particular for young scientists that have to “play by the rules”. Finally, while there are slight differences, the copyright agreements of different publishers are in general very similar.

The government of Baden-Württemberg now wants to make it mandatory for scientists funded by the state that when publishing their results in a scientific journal, they retain the right to post their article on their own websites or in university repositories six months after the journal publication. To me, this sounds like a good thing. More colleagues will be able to download and read my articles, even if their universities do not have the money to subscribe to the journal in question. Hopefully, these additional readers will cite my article, which boosts my ego and is also good for my career. The more political argument for such a law is that if my research has been funded by the taxpayers’ money, the results produced with this money should also be available to the public.

It all sounds reasonable, right? So why is this policy criticized at all? The main argument I read is that it undermines the “freedom of science”. In fact, this is partly true: As a scientist, I cannot choose any journal anymore, but only those that agree to the terms dictated by the state of Baden-Württemberg. However, I think that this is a necessary step to give scientists back the freedom to make their work freely available.

In the end, publishers rely on scientists who write articles for their journals. If enough scientists demand a change of their copyright agreements, they will change their policy. But on my own, I have no power to demand such a change. I have to publish in the most suitable journals no matter whether I like their policy or not – otherwise I would damage my career. Therefore, a critical mass has to be reached somehow. This could be a coordinated effort by scientists – for instance, through their scientific societies. But these societies usually publish journals themselves and depend on the revenues from subscription fees.

This leaves those that fund science in the position to put pressure on publishers – and this is what Baden-Württemberg is trying to do now. In fact, it has already been shown that this approach works: The National Institutes of Health (NIH), as the biggest funder of research in the biomedical sciences, introduced a similar policy a few years ago. As a consequence, basically all publishers now offer a suitable option for scientists funded by the NIH. One could argue that Baden-Württemberg is too small to put pressure on big, international publishers and that eventually its scientists will suffer. However, Baden-Württemberg is not alone: The EU adopted a similar policy in its new research program Horizon2020, and federal German institutions like the Max Planck Society and the DFG are considering such rules as well. Hopefully, other German states will follow the example of Baden-Württemberg [3].

It should be mentioned that the requirements considered now in Baden-Württemberg (also known as “Green OpenAccess”) are pretty mild. Articles on the journal websites do not have to be freely available and journals can still sell subscriptions. And articles are posted on the scientists’ websites or university repositories only after six months. In fact, quite a few journals already comply with the proposed rules [4]. This is, for example, the case for Nature and Science and for many journals in physics-related fields.

Finally, it should be mentioned that some journals might charge the authors money if they want to retain the right to make their articles freely available, in order to make up for the journals’ loss in subscription revenues. Whether this is a valid argument for the mild requirements discussed here is another discussion, but if stricter OpenAccess rules are enforced, this will certainly be justified. In this case, it will be important that those who set these rules (i.e., the funders) also provide scientists with the necessary money for publishing costs.

To summarize: I believe that to make scientific results freely available, coordinated efforts, driven by funding agencies or enforced by law, are the only feasible way. This does not undermine the freedom of science, but eventually restores it.

[1] “Droht Wissenschaftlern der Zwang zum Selbstverlag?” FAZ, 5.2.2014. No link provided here because of a ridiculous German law called “Leistungsschutzrecht”.
[2] Jörg Michael Kastl, “Neue Steuerung” in Forschung und Lehre 12/2013, p. 996.
[3] The German federalism, in which each state is responsible for its own universities, is not really helpful for a coordinated effort here. So someone has to start.
[4] The SHERPA/RoMEO database provides an overview of the copyright policies of different scientific journals.