Hello, I'm currently doing some research about the best way(s) to publish open data for a local administration. For most of the data they wish to (or can) publish, the licence question is quite straightforward, but there's one database which raises many questions. I'll try to lay out the problem as simply as possible:
1-the database : shape files with routes and POI about a region.
2-the context : the administration makes some money selling a walking guide based on this information.
3-the aim : opening the data so that any reuses will publish correct information about paths, dangers, etc.
4-the dilemma : openly publishing this information may let any competing publisher build a competing guide on the data.
5-the question : how could licensing help prevent such a use while welcoming friendlier reuses, such as a promotional guide for hiking in the area that would reuse some of the data for the sake of promotion ?
*4b : let's assume that the dilemma is real. I mean : the risk may not be as big as their fears, but that is not my point here.
Licences evaluated, and where I currently stand :
- Open Licence (https://wiki.data.gouv.fr/images/0/05/Open_Licence.pdf), or CC-BY : welcomes *every* reuse, so won't solve the dilemma.
- CC-BY-NC : may sound appropriate, but it is non-free (I would very much dislike ending up recommending a non-free licence) and would prevent some legitimate reuses (imagine that the promotional guide involves some commercial use).
- CC-BY-SA : this is the one I would choose :
  - it's free
  - the only constraint it introduces is to require openness in return
  - it may solve the dilemma, but can you help me answer the following questions ?
Let's assume the competitor wants to use the shape files under CC-BY-SA together with a closed base map bought from a vendor, in order to produce a nice-looking printed guide. => Publishing such a reuse of the data seems to lead to a licence conflict : the closed licence would say something like "all rights reserved, reproduction forbidden", whereas the CC-BY-SA part would claim "any reuse of this data will be under this same licence". => How can that be solved ? If it is not solvable, then commercial reuse of data opened under CC-BY-SA is quite complicated, which I find a reasonable way to solve the original dilemma.
Am I on the right track with my assumptions and questions ?
Michel Roche
Dear Michel,
[This is a personal opinion, and has not been discussed with team@.]
On 12/03/2014 05:04 PM, Michel Roche wrote:
I'm currently doing some research about the best way(s) to publish open data for a local administration.
The best way for public administrations is *always* public domain or CC0. The taxpayer has already paid for the production of that data, and frankly the state has no business telling its customers how they may or may not use the data they have paid for.
1-the database : shape files with routes and POI about a region.
It is absolutely necessary for a European public administration to waive any database rights they may have in the data collection (or license those rights – IIRC new CC licences cover that avenue). Otherwise a two-tier system will result, where European data users will be restricted by database rights and American ones will not be (the US does not recognize db rights).
2-the context : the administration makes some money selling a walking guide based on this information.
The fact that the administration makes money selling a walking guide should have no influence on the discussion. Merchandise sales are not a core competency of any public administration; if it helps them to popularize the area, it is OK to engage in such activities; however, if a private party were to take the data and started to produce better guides, the administration's primary goal of popularizing the area would still be fulfilled. In fact, this would probably free up the administration's resources to better handle its core functions.
3-the aim : opening the data so that any reuses will publish correct information about paths, dangers, etc.
This cannot be ensured by any free licensing scheme. It is important to recognize that the administration will not be responsible for malicious or grossly negligent undertakings of any private party.
4-the dilemma : openly publishing this information may let any competing publisher build a competing guide on the data.
This would be a good thing!
5-the question : how could licensing help prevent such a use while welcoming friendlier reuses, such as a promotional guide for hiking in the area that would reuse some of the data for the sake of promotion ?
This should not be avoided.
Let's assume the competitor wants to use the shape files under CC-BY-SA together with a closed base map bought from a vendor, in order to produce a nice-looking printed guide. => Publishing such a reuse of the data seems to lead to a licence conflict : the closed licence would say something like "all rights reserved, reproduction forbidden", whereas the CC-BY-SA part would claim "any reuse of this data will be under this same licence". => How can that be solved ? If it is not solvable, then commercial reuse of data opened under CC-BY-SA is quite complicated, which I find a reasonable way to solve the original dilemma.
This seems like a possible licence violation, but would be better left to the potential parties' lawyers to sort out.
In any case, public administrations exist to serve the public. They do not exist to make a profit or hoard public resources. Keeping that in mind is the best way to solve such conundrums.
Best,
Hello
Yesterday I made my first contribution to the OpenStreetMap project by adding a node that describes the shop at Berliner Straße 8 in Frankfurt am Main, Germany. OpenStreetMap uses the ODbL[1]. Recently I found out that a free software clone of VOCALOID uses the same licence. However, you could also use the GPL for databases such as soundfonts. Freepats adds an exception to the GPL[2], which allows use of the soundfont in proprietary compositions. I don't know whether one of these licences fits your needs.
Tobias Platen
[1] http://opendatacommons.org/licenses/odbl/
[2] http://metadata.ftp-master.debian.org/changelogs//main/f/freepats/freepats_2...
Hi all, thank you Heiki and Tobias for your answers.
First, in reply to Tobias : I'm aware of the Open Data Commons licences, but it seems to me that version 4 of the Creative Commons licences has now extended its scope to cover databases as well. The CC licences would have my preference because they have (will have, for 4.0) French translations. Comparing two "equivalent" licences, say ODBL vs CC-BY v4, doesn't show me any great difference in the obligations imposed and the rights granted. I understand I may be wrong in this assumption, because I should read the texts more closely, but at this point I feel I can consider them equivalent (or at least compatible).
The second pointer you gave me is very interesting (http://metadata.ftp-master.debian.org/changelogs//main/f/freepats/freepats_2... ); it kind of turns the GPL into a BSD-like licence. That said, the question in the case under study would be exactly the opposite : how to prevent such a reuse. The ethics behind that are discussed later, in the answer to Heiki.
In reply to Heiki : Short answer to your remarks : from my ethical point of view I totally agree with you. A kind of "free up all the stuff and see what happens" attitude. *But* although this point of view has some great and important supporters in the organization in question, some key people also have fears about opening data. Not that they decide alone, but they are *the* technical person on this or that matter, so it would be a mistake to bypass them in the process. So I have two ways of giving advice : adopting an RMS-style point of view and promoting the full opening of datasets, at the risk that some datasets won't be opened at all in the end because of unaddressed fears ; or trying to find a way around the fears surrounding the opening of some datasets, so that they are finally opened, possibly with some restrictions. Freedom has this peculiarity : is it still freedom when it is not total ? I'm sure we could write tons of pages on this question ;-) The long answer comes below, argument by argument.
I'm currently doing some research about the best way(s) to publish open data for a local administration.
The best way for public administrations is *always* public domain or CC0.
While I agree at a philosophical/ethical level, the fact is that the organization has spent time and effort (even some [public] money) to produce the data. They want to see it reused, and they also want to be acknowledged, credited in some way. I see the Open Licence (or CC-BY or ODBL) as an appropriate tool to address this wish while still providing openness and freedom for the published databases. OK, we are already less than perfect, but it sounds like a fair trade, no ?
1-the database : shape files with routes and POI about a region.
It is absolutely necessary for a European public administration to waive any database rights they may have in the data collection (or license those rights – IIRC new CC licences cover that avenue). Otherwise a two-tier system will result, where European data users will be restricted by database rights and American ones will not be (the US does not recognize db rights).
Thanks for pointing that out, as I wasn't aware at all that things could be that different outside Europe. I have a question about your wording : when you say waiving rights, do you mean Public Domain only, or do you mean that any Attribution licence (Open Licence, ODBL, CC-BY v4) does the job of clarifying the situation ? (Even if a US-based reuser could try to argue that the licence terms do not apply to him, as they have no corresponding terms in US law ?) There's a discussion about that on the ODBL FAQ page (questions 2.1 and 2.2) : http://opendatacommons.org/faq/licenses/#Are_Share-Alike_Provisions_Such_as_... drawing the conclusion that licensing even with a controversial share-alike provision is better than nothing, since it at least clearly states your intentions, leaving the decision on its applicability in a particular country to the relevant court.
2-the context : the administration makes some money selling a walking guide based on this information.
The fact that the administration makes money selling a walking guide should have no influence on the discussion. Merchandise sales are not a core competency of any public administration; if it helps them to popularize the area, it is OK to engage in such activities; however, if a private party were to take the data and started to produce better guides, the administration's primary goal of popularizing the area would still be fulfilled. In fact, this would probably free up the administration's resources to better handle its core functions.
Well, maybe some additional context is needed here. As you may know, here in France there is a strong tendency (that's an understatement :-( ) to turn public services into profit-making organizations. So each of them tends to develop whatever can be sold, to get a little more resources than those allocated for carrying out their public mission. In this context, the organization in question has applied its skills and knowledge to the legitimate objective of publishing a comprehensive guide to hiking in the region, alongside a collection of more cultural publications. Needless to say, this second category of publication is far from profitable. Over the years, publishing the guide has become a way to balance the other publications in the organization's income statement. At this point, I think you get the picture better and begin to grasp their fear : if we harm the income from the guide, they may have difficulty making up the difference in next year's budget. I do agree with you that this situation is a pity ; nevertheless they asked for my advice on how they could open their data, and I want to give them as useful an answer as I can. By useful, I mean advice they will be able to apply, and hopefully will.
3-the aim : opening the data so that any reuses will publish correct information about paths, dangers, etc.
This cannot be ensured by any free licensing scheme. It is important to recognize that the administration will not be responsible for malicious or grossly negligent undertakings of any private party.
I think they know that. They only hope that, by making the data available, reusers will find it easier to use and the quality of their reuses will improve as a result. There is no wish to constrain or to guarantee anything here, only an attempt to feed a virtuous circle.
4-the dilemma : openly publishing this information may let any competing publisher build a competing guide on the data.
This would be a good thing!
Yes, but that's where the main fear lies. Suppose that the owner of very good base maps publishes a direct competitor to the guide. This would make a really great guide (I'd be happy because it would be helpful to me), but it could potentially kill the organization's other publications (I wouldn't be happy on this side, losing interesting thoughts). Well, it's really because the original situation is so weird that I end up wondering whether a better map is really a good thing ! Suppose now that the new edition is rubbish, but still on the vendors' shelves. Some people will still buy it (possibly realising their mistake afterwards) and the result can be the same, without the positive side. Reading this paragraph a second time, I realise that they are leading me to play the awful role of devil's advocate ;-)
Let's assume the competitor wants to use the shape files under CC-BY-SA together with a closed base map bought from a vendor, in order to produce a nice-looking printed guide. => Publishing such a reuse of the data seems to lead to a licence conflict : the closed licence would say something like "all rights reserved, reproduction forbidden", whereas the CC-BY-SA part would claim "any reuse of this data will be under this same licence". => How can that be solved ? If it is not solvable, then commercial reuse of data opened under CC-BY-SA is quite complicated, which I find a reasonable way to solve the original dilemma.
This seems like a possible licence violation, but would be better left to the potential parties' lawyers to sort out.
Well, that's a path I'd like to explore a little further, to be able to formulate an educated answer. Sure, I'm not trying to play the match before it takes place, but I'd like some hints.
Do you think it would be a relevant question to ask the legal team ?
Michel Roche
On 12/04/2014 10:16 PM, Michel Roche wrote:
While I agree at a philosophical/ethical level, the fact is that the organization has spent time and effort (even some [public] money) to produce the data. They want to see it reused, and they also want to be acknowledged, credited in some way. I see the Open Licence (or CC-BY or ODBL) as an appropriate tool to address this wish while still providing openness and freedom for the published databases. OK, we are already less than perfect, but it sounds like a fair trade, no ?
Public administrations that act like businesses should be reminded that they are not businesses.
It is absolutely necessary for a European public administration to waive any database rights they may have in the data collection (or license those rights – IIRC new CC licences cover that avenue). Otherwise a two-tier system will result, where European data users will be restricted by database rights and American ones will not be (the US does not recognize db rights).
Thanks for pointing that out, as I wasn't aware at all that things could be that different outside Europe. I have a question about your wording : when you say waiving rights, do you mean Public Domain only, or do you mean that any Attribution licence (Open Licence, ODBL, CC-BY v4) does the job of clarifying the situation ?
In my opinion any licence with a permissive, non-copyleft database rights licence grant resolves the issue. ODBL is not such a licence. Sui generis database rights are a horrible mistake of the European parliament. In jurisdictions where they do not exist (the US), no licensee can tell whether they have to consider these restrictions or not: the answer will not be uniform across the US, nor for any specific copyleft DB licence, because any disputes would be resolved under contract law, and whether or not a contract exists will vary with the manner in which the database is accessed, the manner in which the licence is presented, and the US state of the user.
In my opinion, the only decent thing to do would be to license the contents of the database using a free content licence (e.g., CC; I am assuming that the contents are sufficiently original to warrant copyright protection, not simply factual information) and waive the database rights. Free licences should aim to provide legal clarity, not make things fuzzier than they are.
http://opendatacommons.org/faq/licenses/#Are_Share-Alike_Provisions_Such_as_... drawing the conclusion that licensing even with a controversial share-alike provision is better than nothing, since it at least clearly states your intentions, leaving the decision on its applicability in a particular country to the relevant court.
The intentions of the licensor are irrelevant here. When most of us make Free Software, we don't intend it to be used for illegal surveillance, yet we do not loudly proclaim that we do not intend our software to be used for illegal surveillance. We recognize that software freedom is the greater good, and we cannot guard against every wrong.
Similarly, legal clarity throughout the world is a greater good than ensuring RoI. An arrangement of information should not preclude others from making use of that same arrangement simply because someone has invested resources into arranging that information. If editorial control is exercised and the arrangement is a selection that possesses originality, then such an arrangement is protected by compilation copyright anyway. Facts, however, do not receive copyright protection, and a systematic collection of facts that lacks originality should receive no protection either. Unfortunately, this is not the case in Europe; however, this does not mean that these unfortunate restrictions should be exported to other regions as legal uncertainty and contractual terms.
Well, that's a path I'd like to explore a little further, to be able to formulate an educated answer. Sure, I'm not trying to play the match before it takes place, but I'd like some hints.
Do you think it would be a relevant question to ask the legal team ?
I think they would be able to explain the relevant factors in the infringement analysis for the particular case, but without a specific case, they will not be able to determine the probable outcome.
Michel Roche listes.pichel@free.fr writes:
[…] the fact is that the organization has spent time and effort (even some [public] money) to produce the data.
So, the public has already paid for it. The public should never again have a barrier to access that data, since they own it now.
Yes, but that's where the main fear resides. Suppose that the owner of very good base maps publishes a direct competitor to the guide.
Excellent! This is what government's role is: to help society build on its work. The situation you describe is something to stand aside for and allow to happen, not something to put artificial restrictions on.
Publicly funded, government-commissioned work should, in general, be released to the same public under licence terms that fully respect the software freedom of all recipients.
Some cases may have special reasons that justify restricting that freedom. This case does not sound like one deserving of such restrictions.
Hello Michel,
Thanks for launching this discussion! I hope we can help you there!
I just wanted to point out that licensing, as such, cannot resolve fear of the unknown. Choosing a license based on fears and unverified assumptions does not seem like a good basis to start opening up government data.
So I think it's a good idea to make that distinction and to steer the discussion to where we can reach results. I haven't got much experience with government databases and opening them up, but other organisations do. Have you been in touch with any of them?
Good luck!
On 05/12/2014 09:35, Hugo Roy wrote:
Hello Michel,
Hello all,
I just wanted to point out that licensing, as such, cannot resolve fear of the unknown. Choosing a license based on fears and unverified assumptions does not seem like a good basis to start opening up government data.
The three of you express, in the end, the same thought, which is summarized above. And well... that was my first reaction when the problem arose. In my advisory role I'm in touch with the organization's IT guy, who is pretty open-minded and clearly willing to implement a real and sincere open data project. But when he stopped by the desk of X, who is in charge of geographical data, and announced that the organization would open all its data, including the geographical data, X's face turned green. Although the organization's board has decided to open its data, a technical person coming in and howling that they are going to kill the goose that lays the golden eggs would have some effect on them ;-) We agree that we should open X's mind, but well... while talking with the IT guy, he led me to look into whether licensing options could help with the issue.
That's when I launched this discussion. All your answers helped me return firmly to my first assumptions, and while I'm not in a position to teach them how they should behave as a public organization, I'll let them know my advice : the best way to open data is to open it !
I think that Heiki's sentence about the aim of licensing is very true and should be upheld by all means :
Free licences should aim to provide legal clarity, not make things fuzzier than they are.
And trying to use them to make some reuses difficult clearly does not provide clarity ; it only sets the stage for later disputes between lawyers.
Now, I have some technical questions raised by Heiki's remark :
Sui generis database rights are a horrible mistake of the European parliament.
I read through Directive 96/9/EC, which establishes these sui generis database rights, on the model of Intellectual Property as I understand it. Which aspect(s) do you consider horrible ?
- the simple fact of having invented a new type of copyright ?
- the fact that (article 7.4) the right applies independently of any copyright that may apply to the database or its content ? This makes it impossible to avoid unless, as you seem to say, we totally waive the database rights with a permissive licence (such as CC, Public Domain, and possibly CC-BY or the Open Licence)
- why don't you include ODC-BY among the acceptable licences ? (I don't mind not using it, the question is just for the sake of precision) http://opendatacommons.org/licenses/by/summary/
Now, very pragmatically : let's assume I have a set of meteorological data stored in a text file (say a .csv). Am I correct in saying :
- The Database is the csv file itself ; the Content is the raw data it contains.
- If I put the database under a PD licence and the content under CC-BY, this means : anyone can reuse the file, but as its content is CC-BY they will have to attribute the work accordingly. In the end, I could have put both of them under CC-BY without changing much ?
- If I do the opposite (Db under CC-BY, Content under PD) : anyone can reuse some parts of the data without having to give attribution. But if someone wants to take advantage of the file (the Db) in their reuse, they would have to give attribution ?
I'm just trying to understand the licensing scheme ; I'm not really planning to set up such licensing arrangements ;-)
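To make the two-layer question above concrete, here is a minimal Python sketch. It only encodes the reading proposed in the question and is not a legal conclusion: the helper names `describe_dataset` and `attribution_needed`, the file name `meteo.csv`, the set of attribution licences, and the use of CC0-1.0 as a stand-in for "public domain" are all assumptions made for this illustration.

```python
# Licences treated here as requiring attribution (SPDX-style identifiers).
# This set is an assumption for the example, not an exhaustive list.
ATTRIBUTION_LICENCES = {"CC-BY-4.0", "ODC-By-1.0"}


def describe_dataset(name, database_licence, content_licence):
    """Record the two licence layers of a dataset separately."""
    return {
        "name": name,
        "licences": {
            "database": database_licence,  # the collection as a whole (the .csv file)
            "content": content_licence,    # the individual values inside it
        },
    }


def attribution_needed(dataset, takes_whole_database):
    """Encode the reading from the question (hypothetical, not legal advice)."""
    licences = dataset["licences"]
    # Attribution on the content layer applies to any reuse of the values.
    if licences["content"] in ATTRIBUTION_LICENCES:
        return True
    # Attribution on the database layer only matters when the file is reused as such.
    return takes_whole_database and licences["database"] in ATTRIBUTION_LICENCES


pd_db = describe_dataset("meteo.csv", "CC0-1.0", "CC-BY-4.0")   # Db PD, Content CC-BY
by_db = describe_dataset("meteo.csv", "CC-BY-4.0", "CC0-1.0")   # Db CC-BY, Content PD

print(attribution_needed(pd_db, takes_whole_database=False))
print(attribution_needed(by_db, takes_whole_database=False))
print(attribution_needed(by_db, takes_whole_database=True))
```

Whether a court would actually draw the line this way is exactly the open point of the question; the sketch only shows that the two layers can be stated separately and unambiguously in a dataset's metadata.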
Michel
Sorry to jump into this discussion a bit late. I heard that meteorological data was mentioned in this thread but only just managed to register with the list. I'm replying to an earlier message because the discussion gets into quite detailed questions about uses of database licenses later on.
On Wed Dec 3 18:04:30 CET 2014, Michel Roche wrote:
I'm currently doing some research about the best way(s) to publish open data for a local administration. For most of the data they wish to (or can) publish, the licence question is quite straightforward, but there's one database which raises many questions. I'll try to lay out the problem as simply as possible:
1-the database : shape files with routes and POI about a region.
2-the context : the administration makes some money selling a walking guide based on this information.
3-the aim : opening the data so that any reuses will publish correct information about paths, dangers, etc.
4-the dilemma : openly publishing this information may let any competing publisher build a competing guide on the data.
5-the question : how could licensing help prevent such a use while welcoming friendlier reuses, such as a promotional guide for hiking in the area that would reuse some of the data for the sake of promotion ?
I understand the concerns around incorrect re-use of data. I think my employer tries to address that by using a license that allows re-use but requires attribution (CC BY 3.0) and one I'm not familiar with (NLOD) which includes a section on proper use.
http://met.no/English/Data_Policy_and_Data_Services/
As to points 4 and 5, I think it's possible to choose very open licenses that encourage re-use of data in ways that are beneficial to society. Merely requiring attribution may be enough to dissuade users of the data from simply repackaging and reselling it, since they would be required to acknowledge where the data came from.
I believe that I read somewhere that even that modest requirement puts off would-be users of OpenStreetMap data, for example.
I tried to find presentations from colleagues that talk about the motivations behind making data available under open licenses. In the end, I found this one:
http://2013.data-forum.eu/person/kristin-lyng.html
It makes a point about the trade-off between revenue generation and the overall benefit to society:
"The institute decided to stop selling weather information that was produced by the core service. Hence, all data and products that you can see on yr.no as numbers, figures and animations are available as open data. The consequence was that we gave up a marginal income in favour for the society at large."
Unfortunately, public agencies are often required to raise funds for themselves by governments who want to be seen as being tough on public spending. However, it is good to remember where the money comes from in the first place, as others have mentioned, and this is also covered in the talk abstract:
"The new openness is about making the results of our work available to the public, the ones that funds our service. Our work should be available for re-use with no restrictions. We believe that allowing re-users to have free access to data will have a huge potential to become an important contribution to foster innovation and value creation in the business sector."
Regards,
David Boddie
On 11/12/2014 17:07, David Boddie wrote:
Sorry to jump into this discussion a bit late. I heard that meteorological data was mentioned in this thread but only just managed to register with the list. I'm replying to an earlier message because the discussion gets into quite detailed questions about uses of database licenses later on.
Thanks for your thoughts, David. There is a real tension between the pure intention of opening data and securing their own income for such organizations. Nevertheless, as has been thoroughly pointed out in this discussion, open licences should be used to simplify things, not to complicate them. The SA and NC clauses fall into the latter category, and while my first thought was not to use them, I now understand better why that position holds.
My final advice to the organization has been to use a Public Domain licence, or an Attribution one where relevant. As for opening the data that raises problems, the answer will lie in their "political" decision, where they will have to choose between a small income for themselves and the benefit to society. There is even an intermediate choice, which is to publish subsets of the data that do not raise that risk.
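The intermediate choice of publishing only a subset could be sketched as follows. This is a minimal illustration in Python on GeoJSON-like features; the `category` property, its values, the sample feature names and the function name `publishable_subset` are all hypothetical and are not taken from the organization's actual database.

```python
def publishable_subset(features, withheld_categories):
    """Keep only the features whose category is not in the withheld set."""
    return [
        feature
        for feature in features
        if feature["properties"]["category"] not in withheld_categories
    ]


# Hypothetical sample data: geometry is left empty to keep the example short.
features = [
    {"type": "Feature", "geometry": None,
     "properties": {"name": "Rockfall zone", "category": "hazard"}},
    {"type": "Feature", "geometry": None,
     "properties": {"name": "Mountain hut", "category": "poi"}},
    {"type": "Feature", "geometry": None,
     "properties": {"name": "Ridge loop", "category": "guide_route"}},
]

# Assumption: only the routes that make up the paid guide are held back.
open_features = publishable_subset(features, withheld_categories={"guide_route"})
print([feature["properties"]["name"] for feature in open_features])
```

Which categories to withhold is the "political" decision itself; the filter merely makes that decision explicit and repeatable each time the data is exported.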
I very much like the text of the talk abstract you cited and will forward it to the organization.
Michel Roche
Michel Roche pichel@vercors-net.com writes:
My final advice to the organization has been to use a Public Domain licence, or an Attribution one where relevant.
I think there is no problem for this data having a license with attribution as a condition.
Thank you very much for consulting us and for working to convince publicly-funded organisations to license their products respecting the public's freedom.