Battle of the Opens

I'm committed to a lot of different kinds of "open." This means that I can and do engage in tremendous acts of hair-splitting and pilpul with regard to them. "Gratis" versus "libre" open access? Free-speech versus free-beer software code? I'm your librarian; let's sit down and have that discussion.

Unfortunately, out there in the wild I find a tremendous amount of misunderstanding about various flavors of open, sometimes coming from otherwise perfectly respectable communications outlets. (Pro tip: If you're not completely sure you understand, please find someone to ask. A librarian is a good start!)

Make no mistake, getting these things wrong sometimes does serious harm. The open-access movement exhausts itself contending with the same old misunderstandings over and over again, and I'm sure we're not alone in that.

So here—free, gratis, libre, and open—is a brief, simplistic guide to several flavors of open, organized around the following questions:

  • What is the target of this movement? What is being made open? As compared to what?
  • What legal regimes are implicated?
  • How does openness happen? What are the major variants of open works of this type?

Onward. We'll start with:

Open source

What is being made open? Software, specifically its human-readable "source code." Software that is not open-source is usually distributed solely in non-human-readable "binary" form, and (as copyrighted expression) cannot legally be reverse-engineered or changed.

What legal regimes are implicated? Copyright, mostly, though patents sometimes rear their ugly heads. The legal tools are copyright licenses specific to source code, such as the GPL and the BSD license.

How does openness happen? Programmers place the source code they have written on the web, associating an open-source license with it. Other programmers are then able to read, use, and change the code. As open-source projects grow, they may have hundreds or thousands of programmers working on the code.

One of the two major ideological variants in the open-source world is the "free software" movement, which holds that opening source code is insufficient without ensuring that those who build upon open source code also make their code open (except when they are using it only privately). This movement produced the GPL. The "open-source software" movement holds that open code can and should be employed in proprietary, closed-source projects, and so tends to prefer licenses like the BSD license, which does not require open release of derivative code.
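For the code-minded, here is a minimal Python sketch of what "associating an open-source license with source code" can look like in practice, and of how the two camps' preferred licenses differ. It assumes the file carries an SPDX-License-Identifier comment, which is one common convention and by no means the only one. The function names and the two license groupings are my own, for illustration only, and are not a legal classification.

```python
import re

# Illustrative groupings only (an assumption of this sketch, not legal advice).
# The strings are SPDX short identifiers for a few GPL and BSD license versions.
COPYLEFT = {"GPL-2.0-only", "GPL-2.0-or-later", "GPL-3.0-only", "GPL-3.0-or-later"}
PERMISSIVE = {"BSD-2-Clause", "BSD-3-Clause"}

# Matches a comment such as "# SPDX-License-Identifier: BSD-3-Clause".
SPDX_LINE = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")


def find_license(source_text):
    """Return the first SPDX identifier declared in the text, or None."""
    match = SPDX_LINE.search(source_text)
    return match.group(1) if match else None


def describe(source_text):
    """Say, roughly, what the declared license asks of people who build on the code."""
    license_id = find_license(source_text)
    if license_id is None:
        # No SPDX tag found; the license may still be stated some other way.
        return "no SPDX tag found"
    if license_id in COPYLEFT:
        return "copyleft: distributed derivative code must also be open"
    if license_id in PERMISSIVE:
        return "permissive: derivative code may be closed-source"
    return "unrecognized license: " + license_id


gpl_file = "# SPDX-License-Identifier: GPL-3.0-or-later\nprint('hello')\n"
bsd_file = "# SPDX-License-Identifier: BSD-3-Clause\nprint('hello')\n"
print(describe(gpl_file))
print(describe(bsd_file))
```

The point of the sketch is only that the license travels with the code as plain text, and that the choice between the GPL family and the BSD family is what decides whether downstream code has to stay open.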

Open standards

What is being made open? Specifications for how to accomplish particular tasks or build particular (tangible or virtual) objects. Open standards cover everything from computer cables to metadata to the building blocks of websites.

What legal regimes are implicated? Our old friends copyright and patent. Open standards generally want to be implementable without treading on royalty-requiring copyrighted or patented intellectual property.

How does openness happen? Generally a "standards body" does the design and outreach work. This may be an ad-hoc collection of engineers (IETF), a group of interested commercial and/or nonprofit entities surrounding a particular trade or technical phenomenon (IDPF or W3C), or a national or international organization whose specific remit is standards (ISO, despite quibbles about having to buy their specifications' text).

Open access

What is being made open? The academic literature: specifically, the peer-reviewed journal literature which is not written for royalties or any other direct monetary reward to its authors. (While open-access advocates happily cheer for open access to books and other research media, the different money-flows in these areas mean they are not a focus of the movement.) Open-access literature is in opposition to literature which is not available to be read unless a subscription, per-article, or other fee is paid by the reader or the reader's proxy (e.g. a library).

What legal regimes are implicated? Copyright, again. Typical practice for the academic article is that its author(s) transfer their copyright in its entirety to the journal publisher, allowing the publisher to control reuse.

How does openness happen? In two basic ways. Yes, two! One is the soi-disant "gold road," in which authors publish in journals that make their contents available on the Web immediately upon publication without charging reader-side fees. The other is the "green road," in which authors reserve or are granted by the publisher sufficient rights in their article to make some version of it (usually not the final typeset, copy-edited publisher's version) available openly online.

Another division can be drawn between "gratis" open access, in which articles are available freely to be read but require explicit permission for most reuse, and "libre" open access, in which articles are clearly licensed up-front for reuse, often with a Creative Commons license.

Open educational resources

What is being made open? Many sorts of classroom materials, including syllabi, lecture audio/video, assignments, and instructional material such as self-contained web-based "learning objects."

What legal regimes are implicated? Copyright and the related work-for-hire doctrine, the latter because some educational institutions claim copyright in instructional materials created by instructors in the course of their regular job duties.

How does openness happen? Typically, through institution-based "courseware" programs or learning-object repositories. Some instructors share educational material through consumer web applications such as SlideShare.

The open-textbook movement is worth mentioning here. Though it is logically affiliated with the OER movement, in practice it bears more resemblance to the open-access movement.

Open (research) data

What is being made open? Data resulting from the research process, in a form less "cooked" than the graphs, tables, and charts in journal articles. ("Data" is a vague word, granted.) Ideally, sufficient description of the data and how they were obtained is included for the data to be verifiable and reusable.

What legal regimes are implicated? In some countries, copyright. For data from industry, trade-secret law.

How does openness happen? Researchers, with or without help from librarians and IT professionals, make their data open. Some journals and science funders are beginning to demand open data; others demand data-sustainability plans that align well with the open-data movement.

Open (government) data

What is being made open? Information gathered by governments in the course of business: geographical information, demographic information, research data gathered by government agencies, sometimes records.

What legal regimes are implicated? For pure data, none in the United States; data are not copyrightable. For other works, copyright, sometimes. Though works authored by (employees of) the US federal government are in the public domain, works authored by (employees of) other governments in the US can be copyrighted.

How does openness happen? Usually, the government in question releases the data online. There is considerable stir and excitement at present over "linked (open) data," which means data expressed in such a way as to be easily and usefully combined with data from other sources.
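A tiny Python sketch may make "easily and usefully combined" more concrete. It assumes two hypothetical datasets from different sources that identify the same places with the same URIs; the example.org identifiers, the property names, and every figure are invented for illustration. Real linked data typically uses RDF and shared vocabularies, which this sketch does not attempt.

```python
# Two invented datasets, published separately, that happen to share identifiers.
census = {
    "http://example.org/place/a": {"population": 1000},
    "http://example.org/place/b": {"population": 2500},
}
transit = {
    "http://example.org/place/a": {"bus_routes": 4},
    "http://example.org/place/c": {"bus_routes": 1},
}


def link(*datasets):
    """Merge the properties of every record that shares an identifier."""
    combined = {}
    for dataset in datasets:
        for identifier, properties in dataset.items():
            # setdefault creates an empty record the first time an identifier appears.
            combined.setdefault(identifier, {}).update(properties)
    return combined


linked = link(census, transit)
for identifier in sorted(linked):
    print(identifier, linked[identifier])
```

Because both sources use the same identifier for place "a", its population and bus-route count end up in one record with no manual matching. When identifiers are not shared, that matching has to be done by hand or by guesswork, which is the pain linked data tries to remove.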

Open notebook science

What is being made open? The process and progress of a particular research project, analogous to placing a lab notebook on the Web for public view.

What legal regimes are implicated? Copyright, insofar as making original expression available in tangible form (yes, the Internet counts as "tangible" for copyright purposes) immediately creates copyright in it. Patent, insofar as making a patentable invention available removes patentability (in the US), but also creates prior art such that subsequent patents can be challenged.

How does openness happen? At present, researchers employ whatever tools come to hand, from wikis to Google Docs to FriendFeed to GitHub, to document their research process on the Web as the research is happening. Some institutions are trying out "electronic lab notebooks" which could facilitate open notebook science if they are not kept behind firewalls, or if researchers have the option to move their workspaces into the open.

---

Any effort such as this will be nitpicked endlessly. That's what the comments are for, so go to it—but be warned, religious wars and diatribes will be ruthlessly deleted. Emacs and vi are both awful, I don't like Windows or Linux as a desktop environment, and progress in both the green and gold roads to OA makes me happy.

Edited to add: I extend many thanks to the commenters on this post, and have revised it in light of their comments. Any remaining errors or infelicities are of course mine. I strongly recommend supplementing this post with Sarah Glassmeyer's post on free law.

Someone else can nitpick. For me, I'll say this is a great, concise, useful summary, with stuff I didn't already know. Under other circumstances, I'd immediately adapt it for use in another arena... Anyway, great stuff.

To start the nitpicking: I've a quibble about the phrase "[s]oftware that is not open-source is usually copyrighted to its ... maker". That's perfectly true, of course, but (pointing this out for the benefit of the careless reader) copyright isn't just for proprietary software. The copyright to free and open source software generally is held by one or more contributors (or their employers, or their assignees) unless the author has placed the source code in the public domain. As you imply in the post, licenses like the GPL are effective because of copyright laws.

Regardless, I agree with Walt - a very useful summary.

Good point, Galen; I've rephrased it to what I hope will be less confusing. Thanks.

Very nice summary!

Just a couple of nitpicks on "open source" software: Non-open-source software is sometimes distributed as source code as well as binary (though this is not common). The defining feature of open source is the license under which the source code is distributed, which allows free distribution and modification of the code for any purpose (sometimes subject to certain non-discriminatory conditions).

Also, copyright by itself does not prohibit reverse-engineering of software. Such reverse engineering has been upheld by courts as fair use in appropriate circumstances (see, e.g. Sega v. Accolade). However, the licenses that accompany many non-open-source programs often include terms that prohibit reverse engineering. (I won't go now into the issues of enforceability of those provisions.)

Excellent article.

This comment by John Mark Ockerbloom is confused, however: "Also, copyright by itself does not prohibit reverse-engineering of software... However, the licenses that accompany many non-open-source programs often include terms that prohibit reverse engineering."

Firstly, it depends what you mean by reverse-engineering. Exploring how the code works is generally OK; using that knowledge to pirate it is not. Secondly, and more seriously, it's important to understand that a license is only there to permit you to do things that would otherwise violate copyright. If something doesn't violate copyright, then you don't need a license to do it.

By Michael Kay (not verified) on 16 Mar 2010

John, Sarah, thank you and I will be revising the post to take your comments into account.

Also, may I say, THAT WAS MICHAEL KAY COMMENTING ON MY BLOG. THE MICHAEL KAY. So not worthy.

This is great. Thanks for clarifying all of these terms. While it's implied in the descriptions that the terms overlap, it might be helpful to add that you can have multiple forms of open ____ present at the same time or as part of a single project, web site, etc. For example, you can look at an article and see that the work was created using open notebook science, the data is openly shared, the software used to process the data is/was open source, etc. Once one component is open, frequently there are others. The title might imply there is fighting for dominance, when in fact all the terms exist in harmony and complement each other.

Michael Kay is correct that "license" is imprecise; I should have said "end user license agreement" (or EULA). This is often abbreviated informally to "license", but it's intended to give the user obligations as well as permissions. In the US, some non-open-source-software is only offered with the stipulation of a EULA, and in many cases that EULA is written to forbid certain things that are otherwise permitted under copyright law, including reverse engineering.

And yes, I'm used to "reverse engineering" meaning analyzing a program to see how it works (as per the definition in http://www.chillingeffects.org/reverse/faq.cgi ) , and not a euphemism for piracy-related activities.

Thanks a lot. Very helpful and nice summary.

By aboubakry DIA (not verified) on 18 Mar 2010

I think of the phrase "open data" in a more general sense that goes beyond just research data: Open data consists of publicized, non-proprietary data structures, and the public APIs attached to them, that allow users to get access to all of their data from a data store, in its original form, without having to pay anyone for that privilege. Anyone who has had to do a migration from one proprietary ILS to another knows the pain of dealing with non-open data, even if, technically, it's your data you're migrating. If you don't know, and can't find out, how your data is stored, then it's not open data, and it's no longer YOUR data.

By Scott Prater (not verified) on 18 Mar 2010

Thanks, Scott -- you're absolutely right and I completely spaced on that.

All, I am going to revise this post in light of your very excellent comments. I do ask your forbearance for a little while, because it's being a very weird and weighty week and my brainspace is occupied with other matters.

Nice overview. Concerning ONS, making the components of an invention public immediately prevents the submission of international patents but you still have one year to file for a US patent.
