Wikidata talk:Requests for comment/Improve bot policy for data import and data modification

Comment 1

I thought we could write about what bots and tools should do, and about what we would like to have someday. The latter would help those who write bots, tools and Lua modules to know in which direction they should go. --Molarus 22:09, 20 November 2015 (UTC)

Bots vs. Semi-automatic editors

This RfC focuses on bot behavior, which, to my understanding, refers to bot-flagged user accounts that perform (semi-)automatic editing (cf. Wikidata:Bots). However, semi-automatic edits can be made by any registered user without many restrictions, for instance by using mechanisms like Widar and one of the many great tools written by Magnus and others. While I like these semi-automatic edit tools from the author’s point of view, they of course come with the same problems that bots have: malicious edits can be performed at a (quite) high rate. Do we have any numbers that compare the amounts of undesired edits by bots and by semi-automatic users? Based on these numbers, we might want to treat bot edits and Widar edits in a similar manner.

In my opinion, it’s not the bots that are the problem here. Bots just do the edits they’re told to do, i.e. they need input from their maintainers in order to edit. While I’m sure that the vast majority of bot maintainers operate their bots with great responsibility (and Widar users do the same), it appears to me that the problems discussed in this RfC arise from one of the following two causes:

  1. careless bot/Widar run preparations (i.e. lack of detailed checks in advance of the run)
  2. different points of view on how to structure data

In both cases, a stricter bot policy would not really solve the problem. The general Wikidata rules already do the job here. If a bot (or Widar editor) then repeatedly fails to comply with the general rules, we need to take appropriate action.

What’s your opinion? Am I missing something here? How can we go about this problem? --MisterSynergy (talk) 15:36, 9 December 2015 (UTC)

Good points. I would hope that supported Wikidata tools implement or support all Wikidata rules. Do we have a central place for such rules right now? Help:Contents has "guidelines"; are they really our rules? Should we be engaging Magnus and others here? ArthurPSmith (talk) 16:19, 9 December 2015 (UTC)
In my opinion, this is not a technical problem. Thus, tools do not need to be updated to include “Wikidata rules” (which are not machine-readable anyway). --MisterSynergy (talk) 19:40, 9 December 2015 (UTC)
The case of OAuth tool users was raised several times and always quickly vanished into the archives... --Succu (talk) 19:14, 9 December 2015 (UTC)
Do you happen to know whether the number of malicious semi-automatic (“OAuth”) incidents is of a similar order of magnitude as the number of bot incidents? It would be worth at least mentioning the problem in this RfC. --MisterSynergy (talk) 19:40, 9 December 2015 (UTC)
You mean dumb bot vs. dumb OAuth contributions? No. Casual errors are not fine, but tolerable in both cases, I think. Systematic or repeated errors are not. BTW, OAuth tools are fully automated and bot-like. --Succu (talk) 19:52, 9 December 2015 (UTC)
Well, I meant “semi-automatic” because you need to prepare the edits manually. A bot can autonomously search for work to some extent, which would be a fully automatic edit procedure. Anyway, this is what I meant: dumb bot vs. OAuth contributions. --MisterSynergy (talk) 20:03, 9 December 2015 (UTC)
Both have to choose (or accept) a job. Often this is the result of a category scan or a WDQ query. In both cases (bot owner / OAuth user) the responsibility for changes in Wikidata and the restrictions on making them should be the same. --Succu (talk) 20:14, 9 December 2015 (UTC)
This kind of manual semi-automatic editing has a tendency to spread some bad edits, for example through the mirroring of symmetric relations. But I am not sure that has to be a bad thing. If those edits had not been made, many of them would probably never have been detected. -- Innocent bystander (talk) 19:15, 10 December 2015 (UTC)
I do semi-automated edits myself. Most tools have auto-controls; I haven't tested all of them, but IMHO tools like this one may be used for vandalism/promotion by anyone with basic Excel knowledge. -- Hakan·IST 14:37, 12 December 2015 (UTC)