Hi! I'm trying to adapt REUSE to my projects, but I find the per-file headers to be a bit verbose and uncomfortable to read or write. It seems that, on the spectrum from machine-readable to human-readable, these have landed too far to the former.
I am instead using a modified approach in my projects, which, for example, looks like so:
// License-Id: GPL-3.0-only
// Copyright: 2019 Drew DeVault sir@cmpwn.com
// Copyright: 2019 Haelwenn (lanodan) Monnier contact@hacktivis.me
// Copyright: 2019 Jeff Kaufman jeff.t.kaufman@gmail.com
// Copyright: 2019 Nate Dobbins nated@posteo.net
// Copyright: 2019 Noah Loomans noah@noahloomans.com
// Copyright: 2019 Philip K philip@warpmail.net
// Copyright: 2019 Simon Ser contact@emersion.fr
// Copyright: 2020 Drew DeVault sir@cmpwn.com
// Copyright: 2020 skuzzymiglet skuzzymiglet@gmail.com
// Copyright: 2021 Gianluca Arbezzano ciao@gianarb.it
// Copyright: 2021 sourque contact@sourque.com
Let's not get ahead of ourselves by prematurely over-namespacing everything. The current license headers read like an SPDX advertisement.
~ Drew DeVault [2021-12-21 14:30 +0100]:
Hi! I'm trying to adapt REUSE to my projects, but I find the per-file headers to be a bit verbose and uncomfortable to read or write. It seems that, on the spectrum from machine-readable to human-readable, these have landed too far to the former.
Compared with a compiled SPDX bill of material or some other rather esoteric best practices, I still find REUSE very human-friendly ;)
But yes, keeping the balance between human- and machine-readability is not trivial. I can well live with the solution REUSE found but may also be *a bit* biased.
I am instead using a modified approach in my projects, which, for example, looks like so:
// License-Id: GPL-3.0-only
// Copyright: 2019 Drew DeVault sir@cmpwn.com
[...]
Let's not get ahead of ourselves by prematurely over-namespacing everything. The current license headers read like an SPDX advertisement.
Well, you can do that, and it would fit the purpose of being human-readable, but it would be only partly machine-readable, and it would simply not be fully compatible with REUSE.
The "SPDX-License-Identifier" is a well-known tag, understood by most if not all compliance tools, and was therefore also a no-brainer for REUSE to adopt. The copyright lines are completely fine, and the SPDX-FileCopyrightText is just an alias. However, we chose to make it the default because it avoids false positives/negatives. In the tool, that's easy to change [^1].
Best, Max
[^1]: https://reuse.readthedocs.io/en/stable/usage.html#addheader
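The false-positive point can be illustrated with a minimal sketch. The two regular expressions below are assumptions made for illustration, not the patterns the reuse tool actually uses, and the sample file and its author are made up. The sketch shows how a bare "Copyright" matcher picks up ordinary prose, while a namespaced tag does not.

```python
import re

# Assumed patterns for illustration only; the real reuse tool may match differently.
BARE_COPYRIGHT = re.compile(r"Copyright\b[:\s]+(?P<text>.+)")
SPDX_TAG = re.compile(r"SPDX-FileCopyrightText:\s*(?P<text>.+)")


def find_notices(pattern, source):
    """Return the text captured by `pattern` on every line where it matches."""
    notices = []
    for line in source.splitlines():
        match = pattern.search(line)
        if match:
            notices.append(match.group("text").strip())
    return notices


# Hypothetical file: one real notice and one comment that merely mentions copyright.
SAMPLE = """\
// SPDX-FileCopyrightText: 2019 Jane Doe <jane@example.com>
// Copyright law differs between jurisdictions, so check locally.
func main() {}
"""

bare = find_notices(BARE_COPYRIGHT, SAMPLE)   # picks up the prose comment
tagged = find_notices(SPDX_TAG, SAMPLE)       # picks up only the tagged notice
```

A stricter bare pattern (for example, one that requires a colon or a year) would reduce such mismatches but not rule them out, which is the trade-off being described.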
On Tue Dec 21, 2021 at 4:43 PM CET, Max Mehl wrote:
Well, you can do that, and it would fit the purpose of being human-readable, but it would be only partly machine-readable, and it would simply not be fully compatible with REUSE.
The "SPDX-License-Identifier" is a well-known tag, understood by most if not all compliance tools, and was therefore also a no-brainer for REUSE to adopt. The copyright lines are completely fine, and the SPDX-FileCopyrightText is just an alias. However, we chose to make it the default because it avoids false positives/negatives. In the tool, that's easy to change [^1].
"Someone already does it this way, so let's do that" is certainly a good way to shortcut the bikeshedding phase. However, if it's not a *good* way to do something, then it should be changed. For one, just look at the inconsistency in the two headers you mentioned as an example: License-Identifier is skewer-case and FileCopyrightText is PascalCase. The design is noticeably worse even in this small example. I think that better headers ought to be chosen and supported, with the existing set only used for backwards compatibility.
The perspective of the REUSE developers is likely biased towards the needs of large projects or large businesses, many with hundreds of developers, and an existing culture of using tooling and automation to maintain their software. The values of this culture are not shared with small- to medium-sized projects.
For instance, REUSE wants me to install a CLI tool. The answer is "no". I intend to maintain these files by hand, so they need to be pleasant to read or write by hand. The introduction of a new tool into my workflow or dependency graph requires a large value-add and is subject to a lot of scrutiny, and the reuse tool does not pass. An annoying specification which is made easier to use thanks to a robot is still an annoying specification.
Perhaps it comes down to taste. If so, know that many developers are already bending over backwards to fight their distaste at the behaviors encouraged by this specification. A whole lot of developers hate the idea of filling their source files with huge license headers, taking on the bigger maintenance burden to keep copyright lines up to date, getting annoying emails from distributions who want them to change their software to suit a slightly different convention*, and so on. If the headers are more pleasant to use, then that's one less thing causing friction for maintainers.
You might think that I'm just being difficult, and maybe I am. But I am squarely in the target demographic of REUSE, and if you want me to overcome all of these papercuts to adopt it for my projects, then I expect an equal measure of good-will from REUSE to make these papercuts as few in number as possible. Does that make sense?
* Fun fact, this is why I'm here in the first place.
~ Drew DeVault [2021-12-21 16:59 +0100]:
On Tue Dec 21, 2021 at 4:43 PM CET, Max Mehl wrote:
The "SPDX-License-Identifier" is a well-known tag, understood by most if not all compliance tools, and was therefore also a no-brainer for REUSE to adopt. The copyright lines are completely fine, and the SPDX-FileCopyrightText is just an alias. However, we chose to make it the default because it avoids false positives/negatives. In the tool, that's easy to change [^1].
"Someone already does it this way, so let's do that" is certainly a good way to shortcut the bikeshedding phase. However, if it's not a *good* way to do something, then it should be changed. For one, just look at the inconsistency in the two headers you mentioned as an example: License-Identifier is skewer-case and FileCopyrightText is PascalCase. The design is noticeably worse even in this small example. I think that better headers ought to be chosen and supported, with the existing set only used for backwards compatibility.
We would have been able to define a new standard, but always remember XKCD #927 [^1] ;)
Jokes aside, some historical background: SPDX-License-Identifier is the first original tag AFAIK. Then, the project decided to change to CamelCase. I think they dislike this inconsistency as well, but recognised that changing the former would not make sense.
For REUSE, it's pragmatic to use a widely adopted standard that is used among communities and industry. Do we like the inconsistency? No. Would forking the standard because of that make sense? No, too.
The LICENSES folder and .license file on the other hand are pure REUSE inventions. Existing practices did not make sense or were too complicated for developers from our PoV, so we had to fill in the gap. That's pragmatic.
The perspective of the REUSE developers is likely biased towards the needs of large projects or large businesses, many with hundreds of developers, and an existing culture of using tooling and automation to maintain their software. The values of this culture are not shared with small- to medium-sized projects.
Absolutely not. I dare to say that we provide the best practices that a) fit 99% of the edge cases and b) are still reasonably easy to understand and implement for developers. We've also received positive feedback from smaller projects in the scope of the NGI project [^2]. The critique we received was rarely on inconsistencies but on other factors (like that GitHub and GitLab to this day do not "understand" REUSE's LICENSES folder, but that's another story).
That we always have a developer without any legal background or a lot of time in mind when working on the spec and tool should become obvious when having a look at the tutorial and FAQ [^3].
For instance, REUSE wants me to install a CLI tool. The answer is "no". I intend to maintain these files by hand, so they need to be pleasant to read or write by hand. The introduction of a new tool into my workflow or dependency graph requires a large value-add and is subject to a lot of scrutiny, and the reuse tool does not pass. An annoying specification which is made easier to use thanks to a robot is still an annoying specification.
Hey, no one forces you to adopt REUSE or to install the tool. It's a suggestion, a best practice you can choose to either use or disregard.
In the early days there was no tool, and we received the feedback that adding a linter and further commands to add headers (and more) would increase adoption. I'd say that was a good decision as keeping an overview manually over *all* files in a large repo and their different ways to label them via REUSE is almost impossible.
If you want to do this manually, go ahead. If you would like to confirm whether you did it right without using the tool, feel free to use the API.
Perhaps it comes down to taste. If so, know that many developers are already bending over backwards to fight their distaste at the behaviors encouraged by this specification. A whole lot of developers hate the idea of filling their source files with huge license headers, taking on the bigger maintenance burden to keep copyright lines up to date, getting annoying emails from distributions who want them to change their software to suit a slightly different convention*, and so on. If the headers are more pleasant to use, then that's one less thing causing friction for maintainers.
I am still not completely certain what you would want REUSE to change. Instead of "SPDX-License-Identifier" you would like to do "License-Id". We'll stick with the former because it's the de facto standard, and I cannot recognise any "bending over backwards" because of using this string and not the other. If we'd introduce aliases and breaking changes, that would confuse our core audience: developers.
Ad "huge license headers": two lines really are not "huge", compared with the license notices other best practices demand (often two paragraphs or more).
Ad "maintaining copyright lines", please see this FAQ item [^4].
Again, I know that making a project REUSE compliant is extra work, which we intend to make as simple as possible with extensive documentation and tooling. Our experience is that once this is achieved, it's relatively easy to maintain. No one forces a project to do this, but they should know that the alternative is licensing and copyright uncertainty, and perhaps incompatibilities or unknown proprietary components.
You might think that I'm just being difficult, and maybe I am. But I am squarely in the target demographic of REUSE, and if you want me to overcome all of these papercuts to adopt it for my projects, then I expect an equal measure of good-will from REUSE to make these papercuts as few in number as possible. Does that make sense?
To make it short: please sum up your concrete suggestions for REUSE spec changes? Here's what I got: changing the license identifier tag is not feasible as explained above. Using "traditional" copyright lines is already today supported. Using the tool is not mandatory as per the spec, just a strong recommendation.
Best, Max
[^1]: https://xkcd.com/927/
[^2]: https://fsfe.org/news/2021/news-20210504-01.html
[^3]: https://reuse.software/tutorial/ & https://reuse.software/faq/
[^4]: https://reuse.software/faq/#many-copyright-statements
On Wed Dec 22, 2021 at 10:08 AM CET, Max Mehl wrote:
We would have been able to define a new standard, but always remember XKCD #927 [^1] ;)
You already have defined a new standard, in case you hadn't noticed ;)
Jokes aside, some historical background: SPDX-License-Identifier is the first original tag AFAIK. Then, the project decided to change to CamelCase. I think they dislike this inconsistency as well, but recognised that changing the former would not make sense.
For REUSE, it's pragmatic to use a widely adopted standard that is used among communities and industry. Do we like the inconsistency? No. Would forking the standard because of that make sense? No, too.
The LICENSES folder and .license file on the other hand are pure REUSE inventions. Existing practices did not make sense or were too complicated for developers from our PoV, so we had to fill in the gap. That's pragmatic.
The introduction of these changes opens the floor for other changes, too. You should take a more pragmatic approach. This is an opportunity to improve upon the status quo.
Also, as mentioned before, the REUSE tool allows you to configure these:
https://reuse.readthedocs.io/en/stable/usage.html#addheader
In general, I find that configuration options are harmful as often as they are helpful. It is essential for the tool to choose strong defaults so that it does the right thing and provides leadership for the community on what the best approach is. Often when strong enough defaults are established, it no longer makes sense to make them customizable. That we have so many choices suggests that the defaults are not, in fact, strong enough.
Absolutely not. I dare to say that we provide the best practices that a) fit 99% of the edge cases and b) are still reasonably easy to understand and implement for developers. We've also received positive feedback from smaller projects in the scope of the NGI project [^2]. The critique we received was rarely on inconsistencies but on other factors (like that GitHub and GitLab to this day do not "understand" REUSE's LICENSES folder, but that's another story).
Aside: I would be pleased to add first-class REUSE support to SourceHut if I felt that some of my concerns (some mentioned in this thread, some not) were addressed.
That we always have a developer without any legal background or a lot of time in mind when working on the spec and tool should become obvious when having a look at the tutorial and FAQ [^3].
Right, but one thing which is probably consistent among all of your contributors is a passion for dealing with licenses properly. Most small projects have a passion for getting their work done and they want little to nothing to do with licensing.
Ironically, I *do* care quite a lot about licensing, but I am also fiercely pragmatic about it, which puts me somewhere in the middle.
Hey, no one forces you to adopt REUSE or to install the tool. It's a suggestion, a best practice you can choose to either use or disregard.
Right, but the question is whether or not the absence of the tool is a use-case you want to consider, and if REUSE should strive to be accommodating of users in this bucket.
In the early days there was no tool, and we received the feedback that adding a linter and further commands to add headers (and more) would increase adoption. I'd say that was a good decision as keeping an overview manually over *all* files in a large repo and their different ways to label them via REUSE is almost impossible.
Oh, sure, I'm not suggesting the tool is not useful. I'm suggesting that a design which requires a tool to be comfortable to use may not be a good design. Even if the design were improved I'm certain that the tool would remain very useful. But if you make it easier to meet the spec without the tool, then the spec will be adopted by more projects.
I am still not completely certain what you would want REUSE to change. Instead of "SPDX-License-Identifier" you would like to do "License-Id". We'll stick with the former because it's the de facto standard, and I cannot recognise any "bending over backwards" because of using this string and not the other. If we'd introduce aliases and breaking changes, that would confuse our core audience: developers.
It's already confusing. I don't want to read a bunch of machine headers in comments at the top of all of my files. It's ugly. License-Id: makes it read more like natural language while still being useful to machines. Code is written to be read, and should be as accommodating of human readers as it is of machine readers. It's important for code to be beautiful.
It's also not a breaking change, to be clear. You can support both styles.
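As a rough sketch of what "supporting both styles" could look like, a parser might map each spelling onto one canonical field. Note the assumptions: "License-Id" is only the short form proposed in this thread and is not part of the SPDX or REUSE specifications, and the function and table names here are hypothetical.

```python
import re

# Canonical field -> accepted spellings. "License-Id" is the hypothetical
# short form from this thread; it is NOT defined by SPDX or REUSE.
ALIASES = {
    "license": ("SPDX-License-Identifier", "License-Id"),
    "copyright": ("SPDX-FileCopyrightText", "Copyright"),
}

# Reverse lookup: spelling -> canonical field.
TAG_NAMES = {alias: field for field, aliases in ALIASES.items() for alias in aliases}

# Optional comment marker, then one of the known tags, a colon, and the value.
TAG_LINE = re.compile(
    r"^\s*(?://|#|\*|;)?\s*("
    + "|".join(re.escape(name) for name in TAG_NAMES)
    + r"):\s*(.+?)\s*$"
)


def parse_header(source):
    """Collect license and copyright values, whichever spelling is used."""
    found = {field: [] for field in ALIASES}
    for line in source.splitlines():
        match = TAG_LINE.match(line)
        if match:
            found[TAG_NAMES[match.group(1)]].append(match.group(2))
    return found


short_style = (
    "// License-Id: GPL-3.0-only\n"
    "// Copyright: 2019 Drew DeVault sir@cmpwn.com\n"
)
long_style = (
    "// SPDX-License-Identifier: GPL-3.0-only\n"
    "// SPDX-FileCopyrightText: 2019 Drew DeVault sir@cmpwn.com\n"
)
```

Both samples parse to the same result under this sketch. Other tools that only know the SPDX spelling would of course still miss the short form, which is the compatibility concern raised earlier in the thread.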
Ad "huge license headers": two lines really are not "huge", compared with the license notices other best practices demand (often two paragraphs or more).
To be clear, the license headers are a hurdle I'm willing to overcome. But they are a hurdle, and you should decide which hurdles are important and which are not, and eliminate the ones that aren't. This one is important. Others, less so. Every hurdle you add is another developer who won't be willing to jump it. Every hurdle you remove is another developer who will.
Ad "maintaining copyright lines", please see this FAQ item [^4].
REUSE provides a lot of solutions to problems, but it does not eliminate a lot of problems.
Again, I know that making a project REUSE compliant is extra work, which we intend to make as simple as possible with extensive documentation and tooling. Our experience is that once this is achieved, it's relatively easy to maintain. No one forces a project to do this, but they should know that the alternative is licensing and copyright uncertainty, and perhaps incompatibilities or unknown proprietary components.
Extensive documentation and tooling does NOT make it simpler. It makes it more work! Now you don't just have to solve the problem, but you also have to read a bunch of "extensive" docs and learn and evaluate new tools. If you want it to be simpler, *make it simpler*.
You're leaning too heavily on the mantra that "good licensing discipline" is important. To most people, it isn't, and honestly, they're right. FOSS projects below a certain scale have almost always gotten away with subpar licensing practices, because for the most part, it doesn't really matter for them. The value-add is not worth the overhead for a lot of projects. The only way to get them on board is to lower the overhead.
To make it short: please sum up your concrete suggestions for REUSE spec changes? Here's what I got: changing the license identifier tag is not feasible as explained above. Using "traditional" copyright lines is already today supported. Using the tool is not mandatory as per the spec, just a strong recommendation.
I will admit that I'm making somewhat vague, philosophical arguments. I can make specific suggestions, too, but I expect them to be shot down much like the license identifier suggestion if we cannot achieve philosophical alignment first.
Last note: I know this is an annoying thread, but I am appreciative of your patience with me. You've been very responsive and helpful so far, even if we haven't found common ground yet.
Hi,
I’ve just read the thread and for the sake of clarity and getting back on the original topic, I will try to summarise what the original issue seems to be about.
If I understand Drew’s original problem, it is that he finds the REUSE headers lean too far towards machine-readable, because they can be quite verbose and have to be in each file. (Drew, please correct me if I am misreading your initial e-mail.)
# Per file
While the REUSE Spec does emphasise a preference to keep the license and copyright notices in the files themselves, it only requires that each file has notices related to it.
As such, a valid option is to use a central file with globbing for certain or even all files. An improved version of this is being worked on in: https://github.com/fsfe/reuse-docs/issues/81
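To make the idea of a central file with globbing concrete, here is a minimal sketch. The rule table, its layout, and the `lookup` helper are hypothetical and are not the file format REUSE uses; the sketch only shows how glob patterns could assign licensing information to paths, with later rules overriding earlier ones.

```python
from fnmatch import fnmatchcase

# Hypothetical central table: (glob pattern, license, copyright holder).
# Later entries override earlier ones. This is NOT the REUSE file format.
RULES = [
    ("*", "GPL-3.0-only", "Contributors to ProjectX"),
    ("docs/*", "CC-BY-SA-4.0", "Contributors to ProjectX"),
]


def lookup(path, rules=RULES):
    """Return (license, holder) from the last rule matching `path`, or None."""
    result = None
    for pattern, license_id, holder in rules:
        # fnmatchcase's "*" also matches "/" so "*" covers every path.
        if fnmatchcase(path, pattern):
            result = (license_id, holder)
    return result
```

With these rules, `lookup("src/main.go")` falls through to the catch-all entry, while `lookup("docs/index.md")` is overridden by the more specific `docs/*` entry.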
# Many lines due to many copyright holders
Another thing that I can see in your example is that you list many copyright holders, each in their own line.
If that is a bother for you, you can always do the following (I’m keeping your syntax) and it would still be valid according to the REUSE Spec:
// Copyright: 2019 Contributors to ProjectX <$optional_url>
…where $optional_url is an (optional) URL to the project’s homepage or the online file/page that lists all the contributors, however you see fit to format that list.
for more see: https://matija.suklje.name/how-and-why-to-properly-write-copyright-statement...
# Year
You could even omit the year, if dealing with that is an additional hurdle, and it would still be REUSE compliant.
for why _I_ think the year is useful see: https://matija.suklje.name/how-and-why-to-properly-write-copyright-statement...
# Summary
As such, a valid REUSE header in your case could be:
```
// SPDX-License-Identifier: GPL-3.0-only
// Copyright: Contributors to ProjectX https://projetx.org/contributors
```
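To prepend it to a file (via a temporary file, because redirecting straight back into $file would truncate it before cat reads it):
`cat "$reuse_header" "$file" > "$file.tmp" && mv "$file.tmp" "$file"`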
Then even if someone later on (e.g. a packager) wants to figure out what the license situation is for your project all they need to do is:
`ls LICENSES/`
…to see the list of all the licenses.
and if they feel that is not enough for them, they can always:
`grep --recursive SPDX-License-Identifier`
… to see which files fall under which licenses.
No need for any fancy additional tooling :)
holiday cheers, Matija