Title: Machine-readable debian/copyright
DEP: 5
State: ACCEPTED
Drivers: Steve Langasek <vorlon@debian.org>
Date: 2012-02-24
URL: http://dep.debian.net/deps/dep5 (this page)
Source:
http://wiki.debian.org/Proposals/CopyrightFormat?action=info
http://anonscm.debian.org/gitweb/?p=dbnpolicy/policy.git;a=blob;f=copyright-format/copyright-format.xml;hb=da0daadc5f20dacaf1ce6c4f98d23fed5eb4e65c
http://anonscm.debian.org/loggerhead/dep/dep5/trunk/annotate/head:/dep5/copyright-format.xml
License:
Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved.
Abstract:
Establish a standard, machine-readable format for debian/copyright
files within packages, to facilitate automated checking and
reporting of licenses for packages and sets of packages.
Current Version: This specification is now maintained as a standard that is part of the debian-policy package. Please refer to it for the most up to date version of this specification.
This is a proposal to make debian/copyright machine-interpretable. This file is one of the most important files in Debian packaging, yet there is currently no standard format defined for it and its contents vary tremendously across packages, making it difficult to automatically extract licensing information.
This is not a proposal to change the policy in the short term. In particular, nothing in this proposal supersedes or modifies any of the requirements specified in Debian Policy regarding the appropriate detail or granularity to use when documenting copyright and license status in debian/copyright.
The diversity of free software licenses means that Debian needs to care not only about the freeness of a given work, but also its license's compatibility with the other parts of Debian it uses.
The arrival of the GPL version 3, its incompatibility with version 2, and our inability to spot the software where the incompatibility might be problematic is one prominent occurrence of this limitation.
There are earlier precedents, also. One is the GPL/OpenSSL incompatibility. Apart from grepping debian/copyright, which is prone to numerous false positives (packaging under the GPL but software under another license) or negatives (GPL software but with an "OpenSSL special exception" dual licensing form), there is no reliable way to know which software in Debian might be problematic.
And there is more to come. There are issues with shipping GPLv2-only software with a CDDL operating system such as Nexenta. The GPL version 3 solves this issue, but not all GPL software can switch to it and we have no way to know how much of Debian should be stripped from such a system.
A user might want to have a way to avoid software with certain licenses they have a problem with, even if the licenses are DFSG-free. For example, the Affero GPL.
Many people have worked on this specification over the years. The following alphabetical list is incomplete; please suggest missing people: Russ Allbery, Ben Finney, Sam Hocevar, Steve Langasek, Charles Plessy, Noah Slater, Jonas Smedegaard, Lars Wirzenius.
The debian/copyright file must be machine-interpretable, yet human-readable, while communicating all mandated upstream information, copyright notices and licensing details.
The syntax of the file is the same as for other Debian control files, as specified in the Debian Policy Manual. See its section 5.1 for details. Extra fields can be added to any paragraph. No prefixing is necessary or desired, but please avoid names similar to standard ones so that mistakes are easier to catch. Future versions of the debian/copyright specification will attempt to avoid conflicting specifications for widely used extra fields.
The file consists of two or more paragraphs. At minimum, the file must include one header paragraph and one Files paragraph.
The value of each field is of one of the four types listed below. The definition for each field in this document indicates which type of value it takes.
A single-line value means that the whole value of a field must fit on a single line. For example, the Format field has a single-line value specifying the version of the machine-readable format that is used.
A whitespace-separated list means that the field value may be on one line or many, but values in the list are separated by one or more whitespace characters (including space, TAB, and newline). For example, the Files field has a list of filename patterns.
Another kind of list value has one value per line. For example, Copyright can list many copyright statements, one per line.
Formatted text fields use the same rules as the long description in a package's Description field, possibly also using the first line as a synopsis, like Description uses it for the short description. See Debian Policy's section 5.6.13, "Description", for details. For example, Disclaimer has no special first line, whereas License does.
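The four value types above can be illustrated with a small decoding sketch. It is not part of the specification: the function names are hypothetical, and the input is assumed to be the raw text after the field's colon, with continuation lines still carrying their leading space.

```python
from typing import List, Optional, Tuple


def single_line(value: str) -> str:
    """Single-line value: the whole value must sit on one line."""
    value = value.strip()
    if "\n" in value:
        raise ValueError("single-line value spans several lines")
    return value


def whitespace_list(value: str) -> List[str]:
    """Whitespace-separated list: split on spaces, TABs and newlines."""
    return value.split()


def line_based_list(value: str) -> List[str]:
    """Line-based list: one item per non-empty line."""
    return [line.strip() for line in value.splitlines() if line.strip()]


def formatted_text(value: str, synopsis: bool) -> Tuple[Optional[str], str]:
    """Formatted text: optional first-line synopsis, then the body.

    As in a Description field, a continuation line holding only "."
    stands for a blank line.
    """
    lines = value.split("\n")
    first = lines[0].strip() if synopsis else None
    body = []
    for line in (lines[1:] if synopsis else lines):
        if line.startswith(" "):
            line = line[1:]          # drop the continuation marker
        body.append("" if line.strip() == "." else line)
    return first, "\n".join(body)
```

For instance, `formatted_text(value, synopsis=True)` would be the decoder for License, and `formatted_text(value, synopsis=False)` the one for Disclaimer.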
There are three kinds of paragraphs. The first paragraph in the file is called the header paragraph. Every other paragraph is either a Files paragraph or a stand-alone License paragraph. This is similar to source and binary package paragraphs in debian/control files.
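As a rough illustration of the paragraph structure just described, the following sketch splits a file into paragraphs and labels each one. It is a simplification under stated assumptions: comment lines are not handled, field names are compared case-sensitively, and `parse_paragraphs` and `classify` are hypothetical names rather than an existing API.

```python
def parse_paragraphs(text):
    """Split control-file text into a list of {field name: raw value} dicts."""
    paragraphs, current, last = [], {}, None
    for line in text.splitlines():
        if not line.strip():                  # a blank line ends a paragraph
            if current:
                paragraphs.append(current)
            current, last = {}, None
        elif line[0] in " \t":                # continuation of the previous field
            if last is None:
                raise ValueError("continuation line before any field")
            current[last] += "\n" + line
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"not a field: {line!r}")
            last = name
            current[last] = value.strip()
    if current:
        paragraphs.append(current)
    return paragraphs


def classify(paragraphs):
    """Label each paragraph as 'header', 'files' or 'license'."""
    kinds = []
    for index, fields in enumerate(paragraphs):
        if index == 0:
            kinds.append("header")
        elif "Files" in fields:
            kinds.append("files")
        elif "License" in fields:
            kinds.append("license")
        else:
            raise ValueError(f"paragraph {index} has neither Files nor License")
    return kinds


SAMPLE = """\
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: SOFTware

Files: *
Copyright: 1975-2010 Ulla Upstream
License: GPL-2+

License: GPL-2+
 [LICENSE TEXT]
"""
```

Calling `classify(parse_paragraphs(SAMPLE))` labels the three paragraphs of the sample in order: header, Files, stand-alone License.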
The following fields may be present in a header paragraph.
Format: required.
Upstream-Name: optional.
Upstream-Contact: optional.
Source: optional.
Disclaimer: optional.
Comment: optional.
License: optional.
Copyright: optional.
The Copyright and License fields in the header paragraph may complement but do not replace the Files paragraphs. They can be used to summarise the contributions and redistribution terms for the whole package, for instance when a work combines a permissive and a copyleft license, or to document a compilation copyright and license. It is possible to use only License in the header paragraph, but Copyright alone makes no sense.
    Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
    Upstream-Name: SOFTware
    Upstream-Contact: John Doe <john.doe@example.com>
    Source: http://www.example.com/software/project
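The header rules above (Format is required, and a Copyright field without License makes no sense) could be checked along these lines. This is a minimal sketch: `check_header` and `extra_fields` are hypothetical helpers, and `fields` is assumed to be a dict mapping field names to their raw values.

```python
# Fields defined for the header paragraph by this specification.
HEADER_FIELDS = {
    "Format", "Upstream-Name", "Upstream-Contact", "Source",
    "Disclaimer", "Comment", "License", "Copyright",
}


def check_header(fields):
    """Return a list of problems found in a header paragraph."""
    problems = []
    if "Format" not in fields:
        problems.append("missing required Format field")
    if "Copyright" in fields and "License" not in fields:
        problems.append("Copyright without License in the header paragraph")
    return problems


def extra_fields(fields):
    """Return non-standard field names; these are allowed, so only report them."""
    return sorted(set(fields) - HEADER_FIELDS)
```

Extra fields are reported separately rather than treated as errors, because the format allows them in any paragraph.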
The declaration of copyright and license for files is done in one or more paragraphs. In the simplest case, a single paragraph can be used which applies to all files and lists all applicable copyrights and licenses.
The following fields may be present in a Files paragraph.
    Files: *
    Copyright: 1975-2010 Ulla Upstream
    License: GPL-2+

    Files: debian/*
    Copyright: 2010 Daniela Debianizer
    License: GPL-2+

    Files: debian/patches/fancy-feature
    Copyright: 2010 Daniela Debianizer
    License: GPL-3+

    Files: */*.1
    Copyright: 2010 Manuela Manpager
    License: GPL-2+
In this example, all files are copyright by the upstream and licensed under the GPL, version 2 or later, with three exceptions. All the Debian packaging files are copyright by the packager, and further one specific file providing a new feature is licensed differently. Finally, there are some manual pages added to the package, written by a third person.
Where a set of files are dual (tri, etc) licensed, or when the same license occurs multiple times, you can use a single-line License field and stand-alone License paragraphs to expand the license short names.
The following fields may be present in a stand-alone License paragraph.
The following fields are defined for use in debian/copyright.
Format
Single-line: URI of the format specification, such as: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/.
Upstream-Contact
Line-based list: the preferred address(es) to reach the upstream project. May be free-form text, but by convention will usually be written as a list of RFC5322 addresses or URIs.
Source
Formatted text, no synopsis: an explanation of where the upstream source came from. Typically this would be a URL, but it might be a free-form explanation. Debian Policy section 12.5 requires this information unless there are no upstream sources, which is mainly the case for native Debian packages. If the upstream source has been modified to remove non-free parts, that should be explained in this field.
Disclaimer
Formatted text, no synopsis: this field can be used in the case of non-free and contrib packages (see Debian Policy section 12.5).
Comment
Formatted text, no synopsis: this field can provide additional information. For example, it might quote an e-mail from upstream justifying why the license is acceptable to the main archive, or an explanation of how this version of the package has been forked from a version known to be DFSG-free, even though the current upstream version is not.
License
Formatted text, with synopsis. In the header paragraph, this field gives the license information for the package as a whole, which may be different or simplified from a combination of all the per-file license information. In a Files paragraph, this field gives the licensing terms for the files listed in the Files field for this paragraph. In a stand-alone License paragraph, it gives the licensing terms for those paragraphs which reference it.
First line: an abbreviated name for the license, or expression giving alternatives (see Short names section for a list of standard abbreviations). If there are licenses present in the package without a standard short name, an arbitrary short name may be assigned for these licenses. These arbitrary names are only guaranteed to be unique within a single copyright file.
Remaining lines: if left blank here, the file must include a stand-alone License paragraph matching each license short name listed on the first line. Otherwise, this field should either include the full text of the license(s) or include a pointer to the license file under /usr/share/common-licenses. This field should include all text needed in order to fulfill both Debian Policy's requirement for including a copy of the software's distribution license (12.5), and any license requirements to include warranty disclaimers or other notices with the binary package.
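The rule that a License field with blank remaining lines needs a matching stand-alone License paragraph lends itself to a simple cross-check. The sketch below assumes paragraphs are dicts of raw field values with the header first, and it splits the first line on "or", "and" and commas to obtain the individual short names; both the data layout and the helper names are assumptions for illustration.

```python
import re


def short_names(first_line):
    """Split a License first line into lowercased component names."""
    parts = re.split(r",?\s+(?:or|and)\s+", first_line.strip(), flags=re.IGNORECASE)
    return [part.strip().lower() for part in parts if part.strip()]


def missing_license_texts(paragraphs):
    """Return short names that are referenced but never expanded."""
    standalone, referenced = set(), set()
    for index, fields in enumerate(paragraphs):
        if "License" not in fields:
            continue
        first, _, rest = fields["License"].partition("\n")
        if index > 0 and "Files" not in fields:
            # Stand-alone License paragraph: it supplies the text for this name.
            standalone.add(first.strip().lower())
        elif not rest.strip():
            # Header or Files paragraph with no license text of its own.
            referenced.update(short_names(first))
    return sorted(referenced - standalone)
```

Names are lowercased before comparison because license short names are case-insensitive.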
Copyright
Formatted text, no synopsis: one or more free-form copyright statement(s). Any formatting is permitted; see the examples below for some ideas for how to structure the field to make it easier to read. In the header paragraph, this field gives the copyright information for the package as a whole, which may be different or simplified from a combination of all the per-file copyright information. In the Files paragraphs, it gives the copyright information that applies to the files matched by the Files pattern. If a work has no copyright holder (i.e., it is in the public domain), that information should be recorded here.
The Copyright field collects all relevant copyright notices for the files of this paragraph. Not all copyright notices may apply to every individual file, and years of publication for one copyright holder may be gathered together. For example, if file A has:

    Copyright 2008 John Smith
    Copyright 2009 Angela Watts

and file B has:

    Copyright 2010 Angela Watts

the Copyright field for a stanza covering both file A and file B need contain only:

    Copyright 2008 John Smith
    Copyright 2009, 2010 Angela Watts

The Copyright field may contain the original copyright statement copied exactly (including the word "Copyright"), or it can shorten the text, as long as it does not sacrifice information. Examples in this specification use both forms.
Files
Whitespace-separated list: list of patterns indicating files covered by the license and copyright specified in this paragraph.
Filename patterns in the Files field are specified using a simplified shell glob syntax. Patterns are separated by whitespace.
Only the wildcards * and ? apply; the former matches any number of characters (including none), the latter a single character. Both match a slash (/) and a leading dot.
Patterns match pathnames that start at the root of the source tree. Thus, "Makefile.in" matches only the file at the root of the tree, but "*/Makefile.in" matches at any depth.
The backslash (\) is used to remove the magic from the next character; see table below.
Multiple Files paragraphs are allowed. The last paragraph that matches a particular file applies to it.
Exclusions are done by having multiple Files paragraphs.
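The matching rules above can be sketched by translating each pattern into a regular expression and letting the last matching paragraph win. This is an illustration only: the function names are hypothetical, and because the escape table is not reproduced in this text, the sketch assumes that a backslash may only precede `*`, `?` or another backslash and rejects anything else.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Files pattern into a compiled regular expression."""
    parts = []
    chars = iter(pattern)
    for ch in chars:
        if ch == "\\":
            escaped = next(chars, None)
            if escaped not in ("*", "?", "\\"):   # assumed set of valid escapes
                raise ValueError(f"bad escape in pattern {pattern!r}")
            parts.append(re.escape(escaped))
        elif ch == "*":
            parts.append(".*")    # any run of characters, slashes included
        elif ch == "?":
            parts.append(".")     # exactly one character, slash included
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def paragraph_for(path, files_paragraphs):
    """Return the last Files paragraph matching path, or None."""
    chosen = None
    for fields in files_paragraphs:
        patterns = fields["Files"].split()
        if any(pattern_to_regex(p).fullmatch(path) for p in patterns):
            chosen = fields
    return chosen


EXAMPLE_PARAGRAPHS = [
    {"Files": "*", "License": "GPL-2+"},
    {"Files": "debian/*", "License": "GPL-2+"},
    {"Files": "debian/patches/fancy-feature", "License": "GPL-3+"},
    {"Files": "*/*.1", "License": "GPL-2+"},
]
```

With these paragraphs, `paragraph_for("debian/patches/fancy-feature", EXAMPLE_PARAGRAPHS)` selects the GPL-3+ paragraph even though two earlier paragraphs also match the path, which is how exclusions work in practice.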
Much of the value of a machine-parseable copyright file lies in being able to correlate the licenses of multiple pieces of software. To that end, this spec defines standard short names for a number of commonly used licenses, which can be used in the first line of a License field.
These short names have the specified meanings across all uses of this file format, and must not be used to refer to any other licenses. Parsers may thus rely on these short names referring to the same licenses wherever they occur, without needing to parse or compare the full license text.
From time to time, licenses may be added to or removed from the list of standard short names. Such changes in the list of short names will always be accompanied by changes to the recommended Format value. Implementers who are parsing copyright files should take care not to assume anything about the meaning of license short names for unknown Format versions.
Use of a standard short name does not override the Debian Policy requirement to include the full license text in debian/copyright, nor any requirements in the license of the work regarding reproduction of legal notices. This information must still be included in the License field, either in a stand-alone License paragraph or in the relevant files paragraph.
For licenses which have multiple versions in use, the version number is added, using a dash as a separator. If omitted, the lowest version number is implied. When the license grant permits using the terms of any later version of that license, the short name is finished with a plus sign. For SPDX compatibility, trailing dot-zeroes are considered to be equal to the plainer version (e.g., "2.0.0" is considered equal to "2.0" and "2").
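A comparison of versioned short names might normalise them first, as in the sketch below. The helper name `normalize_short_name` is hypothetical. The sketch strips the trailing plus sign and trailing dot-zeroes and lowercases the name, but it does not implement the "lowest version is implied" rule, which would need a table of known versions per license.

```python
import re


def normalize_short_name(name):
    """Return (base, version, or_later) for a license short name.

    version is None when the name carries no trailing version number.
    """
    name = name.strip().lower()           # short names are case-insensitive
    or_later = name.endswith("+")
    if or_later:
        name = name[:-1]
    # A version is a dash followed by dotted numbers, optionally ending in a
    # letter (as in "1.3c"), at the very end of the name.
    match = re.fullmatch(r"(.+?)-(\d+(?:\.\d+)*[a-z]?)", name)
    if not match:
        return name, None, or_later
    base, version = match.groups()
    parts = version.split(".")
    while len(parts) > 1 and parts[-1].strip("0") == "":
        parts.pop()                        # "2.0.0" -> "2.0" -> "2"
    return base, ".".join(parts), or_later
```

Names such as BSD-3-clause are left whole, since the digit there is part of the name rather than a trailing version.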
Currently, the full text of the licenses is only available in the SPDX Open Source License Registry.
Keyword | Meaning |
---|---|
public-domain | No license required for any purpose; the work is not subject to copyright in any jurisdiction. |
Apache | Apache license 1.0, 2.0. |
Artistic | Artistic license 1.0, 2.0. |
BSD-2-clause | Berkeley software distribution license, 2-clause version. |
BSD-3-clause | Berkeley software distribution license, 3-clause version. |
BSD-4-clause | Berkeley software distribution license, 4-clause version. |
ISC | Internet Software Consortium, sometimes also known as the OpenBSD License. |
CC-BY | Creative Commons Attribution license 1.0, 2.0, 2.5, 3.0. |
CC-BY-SA | Creative Commons Attribution Share Alike license 1.0, 2.0, 2.5, 3.0. |
CC-BY-ND | Creative Commons Attribution No Derivatives license 1.0, 2.0, 2.5, 3.0. |
CC-BY-NC | Creative Commons Attribution Non-Commercial license 1.0, 2.0, 2.5, 3.0. |
CC-BY-NC-SA | Creative Commons Attribution Non-Commercial Share Alike license 1.0, 2.0, 2.5, 3.0. |
CC-BY-NC-ND | Creative Commons Attribution Non-Commercial No Derivatives license 1.0, 2.0, 2.5, 3.0. |
CC0 | Creative Commons Zero 1.0 Universal. |
CDDL | Common Development and Distribution License 1.0. |
CPL | IBM Common Public License. |
EFL | The Eiffel Forum License 1.0, 2.0. |
Expat | The Expat license. |
GPL | GNU General Public License 1.0, 2.0, 3.0. |
LGPL | GNU Lesser General Public License 2.1, 3.0, or GNU Library General Public License 2.0. |
GFDL | GNU Free Documentation License 1.0, or 1.1. |
GFDL-NIV | GNU Free Documentation License, with no invariant sections. |
LPPL | LaTeX Project Public License 1.0, 1.1, 1.2, 1.3c. |
MPL | Mozilla Public License 1.1. |
Perl | Perl license (use "GPL-1+ or Artistic-1" instead). |
Python | Python license 2.0. |
QPL | Q Public License 1.0. |
W3C | W3C Software License. For more information, consult the W3C Intellectual Rights FAQ. |
Zlib | zlib/libpng license. |
Zope | Zope Public License 1.0, 1.1, 2.0, 2.1. |
There are many versions of the MIT license. Please use Expat instead, when it matches.
An exception or clarification to a license is signaled in plain text, by appending "with keywords exception" to the short name. This document provides a list of keywords that must be used when referring to the most frequent exceptions. When exceptions other than these are in effect that modify a common license by granting additional permissions, you may use an arbitrary keyword not taken from the below list of keywords. When a license differs from a common license because of added restrictions rather than because of added permissions, a distinct short name should be used instead of "with keywords exception".
Only one exception may be specified for each license within a given license specification. If more than one exception applies to a single license, an arbitrary short name must be used instead.
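Splitting a license term into its short name and its exception keywords could look like the sketch below. The function name is hypothetical, and the set of standard keywords is limited to the two exceptions this document goes on to describe (Font and OpenSSL).

```python
# Exception keywords described in this document, lowercased for comparison.
STANDARD_EXCEPTIONS = {"font", "openssl"}


def split_exception(term):
    """Split 'NAME with KEYWORDS exception' into (NAME, KEYWORDS or None)."""
    words = term.split()
    lowered = [word.lower() for word in words]
    if "with" not in lowered:
        if len(words) != 1:
            raise ValueError(f"short names may not contain spaces: {term!r}")
        return words[0], None
    well_formed = (
        len(words) >= 4
        and lowered[1] == "with"
        and lowered[-1] == "exception"
        and lowered.count("with") == 1      # only one exception per license
    )
    if not well_formed:
        raise ValueError(f"malformed exception clause: {term!r}")
    return words[0], " ".join(words[2:-1])


def is_standard_exception(keywords):
    """True if the keywords name one of the exceptions listed in this document."""
    return keywords is not None and keywords.lower() in STANDARD_EXCEPTIONS
```

A term with two exception clauses is rejected, which matches the rule that an arbitrary short name must be used when more than one exception applies.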
The GPL Font exception refers to the text added to the license notice of each file as specified at How does the GPL apply to fonts. The precise text corresponding to this exception is:
As a special exception, if you create a document which uses this font, and embed this font or unaltered portions of this font into the document, this font does not by itself cause the resulting document to be covered by the GNU General Public License. This exception does not however invalidate any other reasons why the document might be covered by the GNU General Public License. If you modify this font, you may extend this exception to your version of the font, but you are not obligated to do so. If you do not wish to do so, delete this exception statement from your version.
The GPL OpenSSL exception gives permission to link GPL-licensed code with the OpenSSL library, which contains GPL-incompatible clauses. For more information, see The OpenSSL License and The GPL by Mark McLoughlin and the message middleman software license conflicts with OpenSSL by Mark McLoughlin on the debian-legal mailing list. The text corresponding to this exception is:
In addition, as a special exception, the copyright holders give permission to link the code of portions of this program with the OpenSSL library under certain conditions as described in each individual source file, and distribute linked combinations including the two. You must obey the GNU General Public License in all respects for all of the code used other than OpenSSL. If you modify file(s) with this exception, you may extend this exception to your version of the file(s), but you are not obligated to do so. If you do not wish to do so, delete this exception statement from your version. If you delete this exception statement from all source files in the program, then also delete it here.
The License short name public-domain does not refer to a set of license terms. There are some works which are not subject to copyright in any jurisdiction and therefore no license is required for any purpose covered by copyright law. This short name is an explicit declaration that the associated files are "in the public domain".
Widespread misunderstanding about copyright in general, and the public domain in particular, results in the common assertion that a work is in the public domain when this is partly or wholly untrue for that work. The Wikipedia article on public domain is a useful reference for this subject.
When the License field in a paragraph has the short name public-domain, the remaining lines of the field must explain exactly what exemption the corresponding files for that paragraph have from default copyright restrictions.
License names are case-insensitive, and may not contain spaces.
In case of multi-licensing, the license short names are separated by "or" when the user can choose between different licenses, and by "and" when use of the work must simultaneously comply with the terms of multiple licenses.
For instance, this is a simple, "GPL version 2 or later" field:

    License: GPL-2+

This is a dual-licensed GPL/Artistic work such as Perl:

    License: GPL-1+ or Artistic

This is for a file that has both GPL and classic BSD code in it:

    License: GPL-2+ and BSD

For the most complex cases, the comma is used to disambiguate the priority of "or"s and "and"s: "and" has priority over "or", unless preceded by a comma. For instance:
A or B and C means A or (B and C).
A or B, and C means (A or B), and C.
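The precedence rule can be made concrete with a small parser that builds a nested tuple tree. This is a sketch rather than a reference implementation: the function names are hypothetical, operands are not validated, and it assumes that several comma-prefixed operators in one expression are applied left to right, a case the text does not spell out.

```python
import re


def parse_license_expression(text):
    """Return a str leaf or a nested (operator, left, right) tuple."""
    # Loosest binding: "and"/"or" preceded by a comma, applied left to right.
    pieces = re.split(r"\s*,\s*(and|or)\s+", text.strip(), flags=re.IGNORECASE)
    tree = _parse_or(pieces[0])
    for operator, operand in zip(pieces[1::2], pieces[2::2]):
        tree = (operator.lower(), tree, _parse_or(operand))
    return tree


def _parse_or(text):
    """Plain 'or' binds more loosely than plain 'and'."""
    operands = re.split(r"\s+or\s+", text, flags=re.IGNORECASE)
    return _chain("or", [_parse_and(operand) for operand in operands])


def _parse_and(text):
    """Plain 'and' binds tightest; its operands are license terms."""
    operands = re.split(r"\s+and\s+", text, flags=re.IGNORECASE)
    return _chain("and", [operand.strip() for operand in operands])


def _chain(operator, operands):
    """Fold a list of operands into a left-nested tree."""
    tree = operands[0]
    for operand in operands[1:]:
        tree = (operator, tree, operand)
    return tree
```

Here `parse_license_expression("A or B and C")` groups B and C together first, while `parse_license_expression("A or B, and C")` groups A and B first, mirroring the two cases above. A term such as "GPL-2+ with OpenSSL exception" stays a single leaf because it contains neither operator.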
This is for a file that has Perl code and classic BSD code in it:

    License: GPL-2+ or Artistic-2.0, and BSD

A GPL-2+ work with the OpenSSL exception is in effect a dual-licensed work that can be redistributed either under the GPL-2+, or under the GPL-2+ with the OpenSSL exception. It is thus expressed as GPL-2+ with OpenSSL exception:
    License: GPL-2+ with OpenSSL exception
     This program is free software; you can redistribute it
     and/or modify it under the terms of the GNU General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later
     version.
     .
     In addition, as a special exception, the author of this
     program gives permission to link the code of its
     release with the OpenSSL project's "OpenSSL" library (or
     with modified versions of it that use the same license as
     the "OpenSSL" library), and distribute the linked
     executables. You must obey the GNU General Public
     License in all respects for all of the code used other
     than "OpenSSL". If you modify this file, you may extend
     this exception to your version of the file, but you are
     not obligated to do so. If you do not wish to do so,
     delete this exception statement from your version.
     .
     This program is distributed in the hope that it will be
     useful, but WITHOUT ANY WARRANTY; without even the implied
     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
     PURPOSE. See the GNU General Public License for more
     details.
     .
     You should have received a copy of the GNU General Public
     License along with this package; if not, write to the Free
     Software Foundation, Inc., 51 Franklin St, Fifth Floor,
     Boston, MA 02110-1301 USA
     .
     On Debian systems, the full text of the GNU General Public
     License version 2 can be found in the file
     `/usr/share/common-licenses/GPL-2'.
SPDX is an attempt to standardize a format for communicating the components, licenses and copyrights associated with a software package. It and the machine-readable debian/copyright format attempt to be somewhat compatible. However, the two formats have different aims, and so the formats are different. The DEP5 wiki page will be used to track the differences.
Example 3. Simple
A possible debian/copyright file for the program "X Solitaire" distributed in the Debian source package xsol:
    Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
    Upstream-Name: X Solitaire
    Source: ftp://ftp.example.com/pub/games

    Files: *
    Copyright: Copyright 1998 John Doe <jdoe@example.com>
    License: GPL-2+
     This program is free software; you can redistribute it
     and/or modify it under the terms of the GNU General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later
     version.
     .
     This program is distributed in the hope that it will be
     useful, but WITHOUT ANY WARRANTY; without even the implied
     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
     PURPOSE. See the GNU General Public License for more
     details.
     .
     You should have received a copy of the GNU General Public
     License along with this package; if not, write to the Free
     Software Foundation, Inc., 51 Franklin St, Fifth Floor,
     Boston, MA 02110-1301 USA
     .
     On Debian systems, the full text of the GNU General Public
     License version 2 can be found in the file
     `/usr/share/common-licenses/GPL-2'.

    Files: debian/*
    Copyright: Copyright 1998 Jane Smith <jsmith@example.net>
    License: GPL-2+
     [LICENSE TEXT]
Example 4. Complex
A possible debian/copyright file for the program "Planet Venus", distributed in the Debian source package planet-venus:
    Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
    Upstream-Name: Planet Venus
    Upstream-Contact: John Doe <jdoe@example.com>
    Source: http://www.example.com/code/venus

    Files: *
    Copyright: 2008, John Doe <jdoe@example.com>
               2007, Jane Smith <jsmith@example.org>
               2007, Joe Average <joe@example.org>
               2007, J. Random User <jr@users.example.com>
    License: PSF-2
     [LICENSE TEXT]

    Files: debian/*
    Copyright: 2008, Dan Developer <dan@debian.example.com>
    License: permissive
     Copying and distribution of this package, with or without
     modification, are permitted in any medium without royalty
     provided the copyright notice and this notice are
     preserved.

    Files: debian/patches/theme-diveintomark.patch
    Copyright: 2008, Joe Hacker <hack@example.org>
    License: GPL-2+
     [LICENSE TEXT]

    Files: planet/vendor/compat_logging/*
    Copyright: 2002, Mark Smith <msmith@example.org>
    License: MIT
     [LICENSE TEXT]

    Files: planet/vendor/httplib2/*
    Copyright: 2006, John Brown <brown@example.org>
    License: MIT2
     Unspecified MIT style license.

    Files: planet/vendor/feedparser.py
    Copyright: 2007, Mike Smith <mike@example.org>
    License: PSF-2
     [LICENSE TEXT]

    Files: planet/vendor/htmltmpl.py
    Copyright: 2004, Thomas Brown <coder@example.org>
    License: GPL-2+
     This program is free software; you can redistribute it
     and/or modify it under the terms of the GNU General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later
     version.
     .
     This program is distributed in the hope that it will be
     useful, but WITHOUT ANY WARRANTY; without even the implied
     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
     PURPOSE. See the GNU General Public License for more
     details.
     .
     You should have received a copy of the GNU General Public
     License along with this package; if not, write to the Free
     Software Foundation, Inc., 51 Franklin St, Fifth Floor,
     Boston, MA 02110-1301 USA
     .
     On Debian systems, the full text of the GNU General Public
     License version 2 can be found in the file
     `/usr/share/common-licenses/GPL-2'.