Hello kind people,
I have created a draft of the tutorial. Going into the tutorial, I
prioritised three things:
- Length (i.e., keep it short)
- Clarity of instructions
- Tangibility/relatability
Things I explicitly avoided (but might need to be mentioned elsewhere):
- Explaining how copyright and licensing work
- Explaining why this is important
- Explaining how SPDX and DEP-5 work
- Explaining rationales for REUSE decisions
- Explaining all the possible options from the spec
- Explaining that some licences might not be compatible, and you may
occasionally need legal review from an expert
- Explaining that Free Software licences are the best licences
Specifically, I wanted to touch on these things going in:
- How to add a licence
- How to mark files as being under a certain licence
- How to deal with files that cannot have headers
- How to deal with large directories
And so I ended up with <http://carmenbianca.eu/en/reuse/>. If or when
you have time, please let me know how it could be improved, or whether
the general style of the tutorial is fitting at all. I am half-certain
it would be much better with nice illustrations, but I am not very good
at those.
With kindness,
Carmen
Hi all,
It was a pleasure to meet you at FOSDEM, and I hope you had safe trips
home.
In this e-mail I share the minutes of the meeting we had, though calling
them "minutes" might be overselling it. I'm summarising the points
raised. If I missed any, then my mind is fallible and my notes
incomplete. I aim to keep the e-mail short, but I'm notoriously bad at
that.
- It was raised repeatedly that adding headers to big projects is a
non-trivial task that very few people want to do manually. If this
can somehow be automated, and if REUSE can facilitate doing this
automatically, then it might increase the number of files with headers.
+ Important to note, however: This may result in false headers.
Automatically adding headers to old files skips manual legal
review. This is something to be cautious about, and to convey to
users of such a tool.
+ This would require one or more templates, which would be put in a
"[.]reuse" directory.
+ Provide links to tools that might already do this, e.g. Maven
plugin.
- The utility of the bill of materials (third step of REUSE) was
debated.
+ There was some misconception that this file could be used to declare
the copyright and licensing of a file like in debian/copyright.
This is not so. Rather, it is a list of ALL files in a project and
the copyright info that was discovered.
+ The bill of materials must ALWAYS be generated by a computer, and
possibly curated by hand.
+ As such, the bill of materials may not need to exist within the
repository. It is build output, which customarily never exists in
the codebase. But if one wanted to, they could keep it in the repo
and "bump" the file with every release.
+ Moreover, the bill of materials is not an actionable STEP for
developers (though more on the target audience of REUSE later).
Developers can run `reuse compile` to generate the BOM, but then
there is nothing they can do with the output. The output is for
legal teams. As such, being able to generate the BOM should be a
_goal_ rather than a step. That is, once you can generate a complete
BOM, you have succeeded in adopting REUSE.
- The specification is not a good tutorial. A new tutorial must be
created.
+ Link to existing tutorials.
+ One tutorial cannot cover the complexity of the spec unless the
tutorial has an interactive decision tree.
+ If the decision tree is not implemented, then a tutorial ideally
only covers _one_ path through the spec, and this should be clearly
marked, e.g., "this tutorial is one easy way to be REUSE compliant,
but it is not the only way".
+ The tutorial should be as short and sweet as possible, and have a
limited scope. People do not like reading long tutorials, so a long
tutorial would hamper adoption of REUSE.
- The specification is not a good specification.
+ It would be nice if the spec had identifiable bullet points that can
be referenced, instead of being a wall of text.
+ The spec contains an error regarding SPDX Exceptions, which is
approved for fixing.
+ The spec contains a lot of silly edge cases. Rewriting the spec
will fix this.
+ Matije and Carmen will co-operate on drafting a new spec.
- The recommendation that Git/VCS could be used to record copyright and
authorship information instead of recording this information in the
files themselves will be SCRAPPED. People _could_ do that if they
really wanted to, but it would be out of the scope of the REUSE spec.
- The tool is currently hosted on gitlab.com. The website/spec is
currently hosted on github.com. There are still some bits and pieces
on git.fsfe.org, possibly. It would be good if this were cleaned up.
+ Preference for GitHub and/or GitLab, because git.fsfe.org has a
non-trivial barrier for entry. GitHub probably has the lowest
barrier, but is the least in line with FSFE's ideals.
+ Consult Matthias about this. Gabriel?
- Programmers really like automation. The tool could do some extra
things:
+ Possibly lower the minimum Python version from 3.5 to 3.3/3.4 for teams that
are stuck on old Python versions. Python 2 is out of the question.
+ Provide a Docker container for people who really hate dealing with
Python.
+ `reuse init` to automatically set up some simple stuff.
+ Download licences from SPDX or elsewhere. Fill in the templates
automatically.
+ Set up the DEP-5 file.
+ Auto-generate headers (see above).
- People don't want to mark trivial or config files as having copyright,
and want to exclude them from the linter.
+ Excluding files from the linter is a can of worms best left closed.
+ Make a clear recommendation to license those files under CC0.
- Sites such as GitHub do not recognise the LICENSES directory. This
leads to an awkward situation where you have to put your "main"
licence in LICEN[CS]E, COPYRIGHT or COPYING for it to be recognised.
This is not ideal, because now LICENSES does not contain ALL
licences.
+ If you elect to put everything in the LICENSES directory, the site
will erroneously say that you have no licence at all.
+ This also messes with multi-licence projects.
+ Contact GitHub/GitLab/Bitbucket about collaboration.
- "debian/copyright" is not a great file path. It is incompatible with
projects that legitimately include that path, and it incorrectly
invokes a relation to the Debian project.
+ Allow the user to specify their own path.
+ Change the default to "[.]reuse/dep5" (or something similar).
- MIT and BSD are tricky licences. They include the copyright holder
within the licence---which means that no two such licences are
identical---and mandate an exact reproduction of the licence. REUSE
currently deals with this by recommending that the developer create a
separate licence for every copyright holder, but this is frankly very
ugly.
+ Solve this by not making any recommendations about this in the
spec. Instead, include an explanation of the problem in an FAQ.
- Speaking of an FAQ: Turn the flight rules into an FAQ.
- There was some confusion regarding the goal of REUSE and its target
audience, and I am not certain it is solved entirely. What follows is
a summary mixed with personal reflection.
+ The argued goals are (1.) making sure that copyright information can
be parsed by computers, (2.) making sure that developers know how to
license their stuff, (3.) making sure that lawyers can generate SPDX
files, and probably some more goals that have slipped my mind.
+ After some reflection, it appears to me that they are all perfectly
valid and tangential goals, but the fact that there was some
(slight) confusion and disagreement here re-ignites a pain point I
should have written down, but did not: REUSE does not have an
elevator pitch, and it desperately needs one. When I talk to a
developer and tell them about REUSE, I do not have a simple, quick
explanation at hand, even though I've been working on this for a
while. Any attempt at an elevator pitch quickly becomes convoluted:
"The REUSE Project presents a set of recommendations to developers
that they can implement so that their project has full coverage of
copyright and licence information, in such a way that a computer can
verify this". This elevator pitch is correct, but more than a
mouthful.
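
To make the points about the bill of materials and about missing headers more concrete, here is a minimal sketch in Python. It is not how the `reuse` tool works; it only illustrates the idea of listing ALL files together with whatever copyright and licence information can be discovered in them. The header patterns, the 30-line scan limit, and the names `scan_file`, `build_bom` and `missing_headers` are assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed header conventions for this sketch; the spec may accept other forms.
LICENSE_RE = re.compile(r"SPDX-License-Identifier:\s*(.+)")
COPYRIGHT_RE = re.compile(r"Copyright\s+(?:\(C\)|©)?\s*(.+)", re.IGNORECASE)


def scan_file(path, max_lines=30):
    """Return (copyrights, licenses) found in the first lines of one file."""
    copyrights, licenses = [], []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle):
            if line_number >= max_lines:
                break
            found = LICENSE_RE.search(line)
            if found:
                licenses.append(found.group(1).strip())
            found = COPYRIGHT_RE.search(line)
            if found:
                copyrights.append(found.group(1).strip())
    return copyrights, licenses


def build_bom(root):
    """List every file under root with the information discovered in it."""
    root = Path(root)
    bom = []
    for path in sorted(root.rglob("*")):
        # Skip directories and version-control metadata.
        if not path.is_file() or ".git" in path.parts:
            continue
        copyrights, licenses = scan_file(path)
        bom.append({
            "path": path.relative_to(root).as_posix(),
            "copyrights": copyrights,
            "licenses": licenses,
        })
    return bom


def missing_headers(bom):
    """Return the paths that lack a copyright notice or a licence tag."""
    return [
        entry["path"]
        for entry in bom
        if not entry["copyrights"] or not entry["licenses"]
    ]
```

Calling `missing_headers(build_bom("."))` would give the list of files that an automated header tool might want to touch. In line with the caution above, that list is only a starting point for review, not a substitute for it.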
I hope that this has covered most of the talking points (and action
points). Please correct the things I got wrong, and add things I
forgot. I am only human.
With kindest regards,
Carmen
Dear all,
I am Max, Programme Manager with the FSFE. I would love to become part
of the REUSE project and help push it forward. With this email I
would like to introduce myself to those who don't know me, and outline a
few parts of the project where I could be of help.
For the FSFE, I am coordinating a few projects, for example the "Public
Money? Public Code!" initiative [^1] or the EU project FOSS4SMEs which
will create learning materials about Free Software for people working in
SMEs [^2]. I am also supporting our technical teams, although I am no
technician by profession but a political scientist.
What fascinates me about REUSE is the intersection of individual
developers, the practical needs of companies, and the legal aspect of
licence compliance. I also like its mix of technical tools and
human-understandable guidelines.
I hope I can add some value when it comes to explaining technical things
to non-techies, boiling down complicated information to a few lines of
text, and creating symbiosis with other projects inside and outside of
the FSFE. At the same time, I am also keen on learning more about
practical licence compliance.
If you have any further questions about me, please do not hesitate to
get in touch with me. I am looking forward to working with you on this
exciting initiative!
Best,
Max
[^1]: https://publiccode.eu
[^2]: https://fsfe.org/activities/foss4smes/
--
Max Mehl - Programme Manager - Free Software Foundation Europe
Contact and information: https://fsfe.org/about/mehl | @mxmehl
Become a supporter of software freedom: https://fsfe.org/join
Hi All,
Jeff McAffer here. Gabriel connected me up with this list after a discussion at FOSDEM. I run a project called ClearlyDefined (https://clearlydefined.io) as part of the OSI which is aimed at clarifying, amongst other things, the license/compliance related data around open source components. The main goal is to reduce friction and increase engagement around open source. This is, I think, very much in line with the REUSE mission and there are likely some great intersection points.
The short version of the project operation is that we mechanically harvest as much data as we can from the packages and repos out there. This involves running ScanCode, FOSSology, Licensee, and some custom tools to find licenses, copyrights, source location, release date, ... The harvested data is summarized and aggregated to form a "definition" that contains the license-related info in a normalized form.
Since tools are not perfect and data is often missing, we enable people to curate the data by submitting pull requests to a repo on GitHub. These are reviewed and vetted like any other open source contribution, and eventually merged causing the definition to be updated.
From there we work with the upstream community to either learn where they keep the data we missed or add/correct the data in the project if it was missing/erroneous. That way, future releases of the project are, well, more clearly defined.
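
As a rough illustration of the harvest, summarise and curate flow described above, here is a minimal Python sketch. It is not ClearlyDefined's actual schema, API or merge logic: the data shapes, the "most frequently reported value wins" rule, and the names `summarise` and `apply_curation` are all assumptions made for this example.

```python
from collections import Counter


def summarise(harvested):
    """Merge per-tool results into one definition.

    `harvested` maps a tool name to the fields that tool reported.
    For each field, the most frequently reported value wins
    (an assumed rule, chosen only to keep the sketch small).
    """
    values_by_field = {}
    for tool_result in harvested.values():
        for field, value in tool_result.items():
            if value:  # skip fields a tool could not determine
                values_by_field.setdefault(field, []).append(value)
    return {
        field: Counter(values).most_common(1)[0][0]
        for field, values in values_by_field.items()
    }


def apply_curation(definition, curation):
    """Return a new definition where reviewed values override harvested ones."""
    return {**definition, **curation}


# Hypothetical harvested data for one component.
harvested = {
    "scancode": {"license": "MIT", "copyright": "2018 Jane Doe"},
    "licensee": {"license": "MIT"},
    "fossology": {"license": "BSD-3-Clause", "copyright": None},
}
definition = summarise(harvested)
curated = apply_curation(definition, {"license": "MIT OR Apache-2.0"})
```

Here `definition` holds the values the tools mostly agreed on, and `curated` shows a human-reviewed correction taking precedence without modifying the harvested summary.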
It would be great to look at ways we can collaborate to drive our shared goals.
Jeff