Hello,
I quite agree with Drew DeVault that REUSE should try to adopt a few more headers and probably get a more formal specification; currently it is a bit too prose-like.
One thing I've been wanting for a while, especially when writing packages for large repos with legacy code, is something that would try to report as many licences and notices as reasonably possible, even with some false positives and a bit of manual work when a human-oriented format is detected. And I think this would need a formal specification so both devs and packagers can be sure that their licences can be found and respected. Of course there should be a limit on it; for example, the accepted formats could be described via ABNF.
From my point of view REUSE is a great thing, as it seems to be easier to maintain and adopt than the Debian Copyright File. But it doesn't really solve the problem of packagers wanting to know what the licence of something is; in fact it creates yet other ways for developers to specify licences, making it a bit harder to create a licence scanning tool.
For example one kind of format I would definitely add in such a tool is:
# Copyright 1999-2021 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2
And of course this one doesn't use SPDX and has the slightly annoying problem of not being explicit about whether it means "GPL-2.0-only" or "GPL-2.0-or-later".
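To make the idea concrete, here is a minimal sketch in Python of how such a tool might treat the two cases. It is only an illustration under stated assumptions: the names `Finding` and `scan_header` and both regular expressions are hypothetical, not part of REUSE or any existing scanner, and only this one legacy phrasing is recognised.

```python
import re
from typing import List, NamedTuple, Tuple


class Finding(NamedTuple):
    source: str                  # "spdx" or "legacy-gpl" (labels invented for this sketch)
    candidates: Tuple[str, ...]  # possible SPDX identifiers or expressions
    needs_review: bool           # True when a human has to pick one candidate


# Machine-readable tag, optionally followed by a closing C comment marker.
SPDX_RE = re.compile(
    r"SPDX-License-Identifier:\s*(?P<expr>.+?)\s*(?:\*/)?\s*$", re.MULTILINE
)

# The human-oriented phrasing from the Gentoo header (assumed pattern).
LEGACY_GPL_RE = re.compile(
    r"Distributed under the terms of the GNU General Public License v(?P<ver>\d+)"
)


def scan_header(text: str) -> List[Finding]:
    """Collect licence hints from a file header."""
    findings = []
    for match in SPDX_RE.finditer(text):
        # An SPDX tag is taken at face value.
        findings.append(Finding("spdx", (match.group("expr"),), False))
    for match in LEGACY_GPL_RE.finditer(text):
        ver = match.group("ver")
        # The phrasing does not say "only" or "or later", so report both
        # readings and leave the decision to a human.
        findings.append(
            Finding(
                "legacy-gpl",
                (f"GPL-{ver}.0-only", f"GPL-{ver}.0-or-later"),
                True,
            )
        )
    return findings


if __name__ == "__main__":
    header = (
        "# Copyright 1999-2021 Gentoo Authors\n"
        "# Distributed under the terms of the GNU General Public License v2\n"
    )
    for finding in scan_header(header):
        print(finding)
```

The point of the `needs_review` flag is that the ambiguous header is still reported rather than silently dropped or guessed at.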
By the way I would be interested in helping to create such a specification with reasonable coverage that I think could be verified thanks to the work already done within distributions.
Best Regards
Hi, I must say I am slightly surprised to read that the specification should be more formal. Actually, it was my feeling that it is well worded, short and precise. So I am wondering if we are looking at different parts of REUSE? I am referring to the specification version 3.0 at https://reuse.software/spec/
Using the REUSE linter tool, IMHO it is easy to write such headers and afterwards check that they are correct, or to use a self-crafted server-side Git hook (as we are doing in KDE). At the moment we have more than 50,000 REUSE-based license headers in KDE, and I doubt that more than a very few developers use the REUSE addheader option to create them; most write them by hand.

Yet, I see the point about the difference in hyphens and agree that one variant could be deprecated in the future so that the Copyright and License tags are consistent. Actually, using the "SPDX-*" prefixed tags globally in our code base simplifies tooling a lot, because those tags are not repeated anywhere else in source code comments, unlike strings such as "Copyright" or "License".
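As an illustration of why the prefixed tags are easy to check, here is a minimal Python sketch of the kind of test a hook might run. It is not KDE's actual hook and not the REUSE linter; the function name `missing_spdx_tags` and the choice of required tags are assumptions made for this example.

```python
import re
from typing import List

# Tags this sketch insists on (an assumption for the example, not a
# statement of what the REUSE specification requires).
REQUIRED_TAGS = ("SPDX-License-Identifier", "SPDX-FileCopyrightText")


def missing_spdx_tags(text: str) -> List[str]:
    """Return the required tags that have no 'Tag: value' line in text."""
    missing = []
    for tag in REQUIRED_TAGS:
        # Require a non-empty value on the same line as the tag.
        if not re.search(rf"{re.escape(tag)}:[ \t]*\S", text):
            missing.append(tag)
    return missing


if __name__ == "__main__":
    header = (
        "// SPDX-FileCopyrightText: 2021 Jane Doe <jane@example.org>\n"
        "// SPDX-License-Identifier: LGPL-2.1-or-later\n"
    )
    print(missing_spdx_tags(header))
```

Because the tag strings are unlikely to occur by accident, a plain text search like this gives few false positives, which is much harder to achieve when searching for a bare "Copyright" or "License".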
Regarding the "Distributed under the terms of ..." statement: to my understanding this is currently not covered by REUSE at all, because REUSE focuses only on inbound licenses and not (yet) on outbound licenses. It is important to distinguish the two clearly, because for outbound licenses it is not enough to respect all source code licenses; the licenses of linked third-party libraries and headers (e.g. shared or statically linked libraries, or header-only libraries) must be respected as well. That said, I would very much welcome a way to state those outbound licenses clearly. But for the single statement, I do not see why we would have to move away from the SPDX syntax. If you look, for example, at Yocto-based license statements, those declarations for outbound licenses use the same idea of AND and OR connections for cases where a binary artifact is distributed under alternative outbound licenses or where the requirements of several licenses must be fulfilled at the same time (e.g. "BSD-2-Clause AND LGPL-2.1").
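To illustrate the AND/OR idea, here is a minimal Python sketch that reads a flat expression into alternatives. It is neither the Yocto nor the SPDX reference implementation: the function name `parse_simple_expression` is hypothetical, parentheses are rejected, and a "WITH" exception is simply kept as part of its operand. It relies only on the SPDX rule that AND binds more tightly than OR.

```python
import re
from typing import List


def parse_simple_expression(expr: str) -> List[List[str]]:
    """Split a flat licence expression into OR-alternatives.

    Each inner list holds the licences that must all be satisfied together.
    """
    if "(" in expr or ")" in expr:
        raise ValueError("parentheses are not handled by this sketch")
    alternatives = []
    # OR has the lowest precedence, so split on it first.
    for alternative in re.split(r"\s+OR\s+", expr.strip()):
        # Inside one alternative, every AND operand applies at the same time.
        alternatives.append(re.split(r"\s+AND\s+", alternative))
    return alternatives


if __name__ == "__main__":
    print(parse_simple_expression("BSD-2-Clause AND LGPL-2.1"))
    print(parse_simple_expression("MIT OR GPL-2.0-only AND BSD-3-Clause"))
```

With this representation, a packager has to satisfy every licence in at least one inner list, which is the same reading whether the expression describes an inbound or an outbound licence.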
Best regards, Andreas
On 24. 12. 21 at 12:06, Haelwenn (lanodan) Monnier wrote:
> For example one kind of format I would definitely add in such a tool is:
Just out of curiosity what does your tool catch that existing tools such as ScanCode and FOSSology do not?
See e.g. research paper: https://oss.cs.fau.de/2019/08/07/final-thesis-a-comparison-study-of-open-sou...
> # Copyright 1999-2021 Gentoo Authors
> # Distributed under the terms of the GNU General Public License v2
> And of course this one doesn't use SPDX and has the slightly annoying problem of not being explicit about whether it means "GPL-2.0-only" or "GPL-2.0-or-later".
This can be a big problem.
There is a good reason why for the licensing information REUSE strictly requires the SPDX tag, while for the copyright statement it does not.
Also, while cleaning up the Linux kernel and moving it to SPDX tags, if memory serves me right, they identified hundreds of ways in which the GPL is referred to.
If we’re just talking about license texts of GPL-2.0, there are 20+ already identified versions: https://github.com/pombredanne/gpl-history
And this is why REUSE, and many others, so happily rely on the SPDX IDs for license UIDs and the SPDX License List for a list of canonical license texts. Since SPDX has become an ISO standard, I would suspect its use will grow, not dwindle.
(Sure, the syntax inconsistency caused by the hyphen in the legacy SPDX-License-Identifier tag is annoying, just as the “-only” and “-or-later” suffixes for the GPL family of licenses are, but that’s a minor gripe compared to what it fixes. And above all, that is a discussion to take at the SPDX level, not REUSE.)
When it comes to copyright statements, while copyright law typically prescribes what they should look like, in practice, since copyright is automatic, the statements are more akin to proof, or rather clues.
For example, if we were strict, a copyright statement that is in line with both US and Slovenian law (Slovenia being the EU jurisdiction I am most familiar with) would look like this:
© 2021 Matija Šuklje
…anything more is not required by law, anything less makes it non-compliant
But since some licenses (and copyright law) require you to _keep_ the copyright notices as they are, REUSE is much more flexible about what the statements should look like, as long as they are there.
more: https://matija.suklje.name/how-and-why-to-properly-write-copyright-statement... https://matija.suklje.name/how-and-why-to-properly-write-copyright-statement...
> By the way I would be interested in helping to create such a specification with reasonable coverage that I think could be verified thanks to the work already done within distributions.
By discussing here on this mailing list and on its GitHub issue tracker https://github.com/fsfe/reuse-docs/ (yes, I know, GH…) you help create such a spec :)
holiday cheers, Matija