Hello,
I quite agree with Drew DeVault that REUSE should try to adopt a few more headers and probably get a more formal specification; currently it is a bit too prose-like.
One thing I've been wanting for a while, especially when writing packages for large repos with legacy code, is a tool that tries to report as many licences and notices as reasonably possible, even at the cost of some false positives and a bit of manual work when a human-written format is detected. I think this would need a formal specification so that both devs and packagers can be sure their licences can be found and respected. Of course there should be a limit on what it accepts; the formats could, for example, be described via ABNF.
From my point of view REUSE is a great thing, as it seems easier to maintain and adopt than the Debian copyright file. But it doesn't really solve the problem of packagers wanting to know what the licence of something is; in fact it creates yet more ways for developers to specify licences, making it a bit harder to create a licence scanning tool.
For example one kind of format I would definitely add in such a tool is:
# Copyright 1999-2021 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2
And of course this one doesn't use SPDX and has the slightly annoying problem of not being explicit about whether it means "GPL-2.0-only" or "GPL-2.0-or-later".
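To make the idea a bit more concrete, here is a rough Python sketch of how such a tool might treat that kind of header. Everything in it is my own assumption for illustration (the `scan_header` function, the `NOTICE_PATTERNS` table and the shape of the result); it is not taken from REUSE or any existing scanner. An explicit SPDX tag yields a single licence, while a human-written phrase that does not say "only" or "or later" yields several candidates and is flagged for manual review.

```python
import re

# Hypothetical table mapping human-written licence phrases to candidate SPDX
# identifiers. A phrase that is not explicit about "only" vs "or later" maps
# to several candidates, so a human has to decide.
NOTICE_PATTERNS = [
    (
        re.compile(
            r"Distributed under the terms of the "
            r"GNU General Public License v2\s*$"
        ),
        ["GPL-2.0-only", "GPL-2.0-or-later"],
    ),
]

# Explicit machine-readable tag.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")

# Very loose copyright matcher: "Copyright <year or year range> <holder>".
COPYRIGHT = re.compile(r"Copyright\s+(\d{4}(?:-\d{4})?)\s+(.+)")


def scan_header(text):
    """Return copyright notices and candidate licences found in text."""
    found = {"copyrights": [], "licences": [], "needs_review": False}
    for line in text.splitlines():
        match = COPYRIGHT.search(line)
        if match:
            found["copyrights"].append((match.group(1), match.group(2).strip()))

        match = SPDX_TAG.search(line)
        if match:
            # An explicit tag is taken as-is, with no ambiguity.
            found["licences"].append([match.group(1).strip()])
            continue

        for pattern, candidates in NOTICE_PATTERNS:
            if pattern.search(line):
                found["licences"].append(list(candidates))
                if len(candidates) > 1:
                    # More than one plausible identifier: ask a human.
                    found["needs_review"] = True
    return found


header = (
    "# Copyright 1999-2021 Gentoo Authors\n"
    "# Distributed under the terms of the GNU General Public License v2\n"
)
print(scan_header(header))
```

A formal specification would essentially pin down what goes into a table like `NOTICE_PATTERNS`, so that developers know which phrasings are guaranteed to be found and packagers know which ones still need a human to look at them.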
By the way, I would be interested in helping to create such a specification, with reasonable coverage that I think could be verified thanks to the work already done within distributions.
Best Regards