Wikidata talk:WikiProject Books/2013


A book can be a work or a single edition of this work.

"Book" in the meaning of (single) edition (like 2nd ed., translation etc.).
Comments I

I suggest "title" is better for this purpose. It includes also essays or parts of a collection. --Giftzwerg 88 (talk) 16:46, 3 February 2013 (UTC)

I'm afraid "title" is nonspecific. --Kolja21 (talk) 22:16, 3 February 2013 (UTC)
At enWS, we have used the word "work" as the generic term, though we also know that paintings can be "works" as well, and an illustrated work could have many engravings within it that are themselves works. Surely this has been thought through by greater minds, and we can just utilise their preferred terms.
Comments II

Hi all, I wanted to understand whether we are following an existing schema or just proposing a brand new one. Librarians have designed different metadata models for books for centuries :-), so I was wondering if we want to import one (or more) of them into Wikidata. Think about en:Dublin Core (qualified), or en:MARC 21, or en:MODS. Moreover, we should synchronize the model with the template Book on Commons (well, maybe synchronize the template to this model). --Aubrey (talk) 21:08, 10 February 2013 (UTC)

I forgot to mention this: http://schema.org/Book. --Aubrey (talk) 09:52, 11 February 2013 (UTC)
Also relevant here is the focus of http://www.w3.org/community/schemabibex - to build a consensus around proposals to extend Schema.org to better represent bibliographic data. Many of the questions raised here have also been discussed in that group - there is potential for sharing experience - FYI I chair that group. Rjw (talk) 21:55, 14 February 2013 (UTC)
Hi Andrew, from what I have understood (and it is maybe wrong) we are discussing the properties for books, and for example Q170583 would be an Item, and will have the property title: "Pride and prejudice", author:"Jane Austen", and so on. Right now, we are choosing a model for those metadata, which we are going to represent every book. --Aubrey (talk) 11:46, 11 February 2013 (UTC)
I've also found this: https://www.wikidata.org/wiki/Wikidata:Infoboxes_task_force/works#Work_of_literature_.2F_Literarisches_Werk_.2F_.C5.92uvre_litt.C3.A9raire --Aubrey (talk) 12:01, 11 February 2013 (UTC)
The header here is "Book" in the meaning of (single) edition (like 2nd ed., translation etc.)., which is why I'm a little confused. "Book in the sense of overall work" makes more sense for us, I think! Andrew Gray (talk) 12:04, 11 February 2013 (UTC)
If you are thinking of the en:FRBR model, yeah, maybe work is the best expression. I still think that the data model should be FRBR-agnostic, though. For example, if we create an Edition property, we could use it in Commons and Wikisource, which are different from Wikipedia and work at the manifestation and even item level. --Aubrey (talk) 13:49, 11 February 2013 (UTC)
Are there any downsides (besides complexity) of supporting multiple metadata standards? Wikidata concepts will always point to different FRBR levels. Even in a single language a page does not uniformly correspond to either a work, expression, or manifestation. There could also be a complication in that Wikidata concept C could point to language X and language Y, where language X's page is a work and language Y's page is an expression. Therefore I would propose that we have multiple properties: a Dublin Core property, a MARC 21 property, a Commons Book Template property. Maximilianklein (talk) 14:33, 11 February 2013 (UTC)
I'm not sure how useful it would be to have different properties for different metadata standards. But maybe I'm not understanding your point of view, Max. As I said before, we do not have to think about FRBR when designing these Book properties. We'll define a set of potential metadata, and in my understanding, we'll need a schema complex enough to fully support Dublin Core and the core MARC properties. Wikidata should host the richest description for books, and every project should use the metadata and the properties that it needs (e.g. Commons and Wikisource need the Edition, Wikipedia often doesn't). We are now defining the metadata for Wikidata, and thus for all the projects. Max, can you explain to me what you plan for the DC and MARC properties? Aubrey (talk) 16:38, 12 February 2013 (UTC)
@Andrew: I've changed the note "A book can be a ..." back to the old version. Hopefully it's less confusing. --Kolja21 (talk) 00:07, 13 February 2013 (UTC)
Thanks - that makes a lot more sense. In which case, I guess we'll need a property work= to tie editions (for citations, WS, etc) back to a main concept. For the moment, it'd probably be best to try and see if we can map something like w:Resource Description and Access standards across... Andrew Gray (talk) 20:11, 14 February 2013 (UTC)
Under the schema we also need to remember that in the 19thC that many of the books were published in volumes, and even these volumes could be spread over years. Then the work could be republished with all the volumes merged into the one edition. Similarly, a work by Author A in language M, is then translated by Author B, to language N; and nn years later the work is translated by Author C to language N. All needs to be manageable. I suggest to think of the most convoluted work with multiple variations of the publishing details. Re the datasets, it is quite important that there is the ability to link back to these publicly available catalogues.  — billinghurst sDrewth 13:25, 6 March 2013 (UTC)
Comments III

Could I ask (sorry for my ignorance) what the foreseen purposes of the schema we are designing are? I imagine we need these metadata for at least three different tasks:

  • metadata about books used in Wikipedia article infoboxes. We'll need to detail all the relations in Q170583, and those metadata will be used in the Wikipedia article en:Pride and prejudice.
  • metadata about books used in Commons (now with the Book template) and thus Wikisource. Mind that we may need to add metadata about DjVu and PDF files, that is, the scans of the books.
  • (optional) metadata about books used in the citations, for references and bibliographies in Wikipedia article.

Am I right? Do I miss something? --Aubrey (talk) 10:43, 13 February 2013 (UTC)

Moreover, Kolja21, could you explain to us the difference between these proposals and this? I'm a bit confused. --Aubrey (talk) 10:50, 13 February 2013 (UTC)
Hi Aubrey, the Wikidata:Infoboxes task force started at the end of November. Later another editor started "Wikidata:Property proposal". So we have two pages with the same goal. Multiply these pages by your three different tasks and you see how difficult it gets. My proposal was to wait with phase 2 until the software is ready and we have developed a concept, but now it's the other way around and we have to make the best of it. --Kolja21 (talk) 10:16, 14 February 2013 (UTC)
Ok, thanks Kolja21 for the answer. I'm still confused about the scope and boundaries of the "book model" we are creating, because I don't understand which pages will be created in Wikidata regarding books. Will we create just one page per "work"? Or different pages? Will we have a page both per work and per expression and manifestation (like translations and different editions)? AFAIU, this matters for creating the properties.
Beside that, my idea is to have the richest BUT simplest book model possible (we don't have to replicate MARC21) for the wikiproject to use. Wikipedia infoboxes about books are the simplest case: it is often about the general "work", so the metadata are less.
But we'll have also Commons, and I think we should create a property for every field of the Book template:
  • Author --> Creator
  • Editor --> TBD
  • Translator --> Translator
  • Illustrator --> TBD
  • Title --> original title
  • Subtitle --> TBD
  • Series title --> TBD
  • Volume --> TBD
  • Edition --> Edition
  • Authority control --> ISSN, ISBN, OCLC, VIAF
  • Publisher --> Publisher
  • Printer --> TBD
  • Year of publication --> Date of publication
  • Place of publication --> Place of publication
  • Language --> Language
  • Description --> TBD?
  • Page overview --> TBD?
  • Source --> TBD?
  • Permission --> TBD?
Another possibility is to have a different structure of the Wikidata page. (don't know if I'm doing this right, so please correct me if wrong).
We could structure the Wikidata book page as being a default "work", as in Der Prozeß, with all the interwikis and aliases.
Then, we could have a property which says "has translation", and that would create a subsection with all the metadata related to that specific edition of the translation. The same thing could happen for a particular edition (non a translation, just an edition).
In this way, we could be able to add metadata related to single book scans, and that would solve the Wikisource and Commons issue. What do you think? --Aubrey (talk) 11:09, 18 February 2013 (UTC)
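The field list in the comment above can be written down as a simple lookup table, which makes it easy to see which Book template fields still have no target property. This is only a sketch: the dictionary name and the helper function are made up for illustration, and `None` is used here as a stand-in for "TBD".

```python
# Commons Book template fields mapped to the property labels proposed above.
# None stands for "TBD" (no property decided yet). The names of the dictionary
# and of the helper below are hypothetical, chosen only for this sketch.
BOOK_TEMPLATE_TO_PROPERTY = {
    "Author": "Creator",
    "Editor": None,
    "Translator": "Translator",
    "Illustrator": None,
    "Title": "original title",
    "Subtitle": None,
    "Series title": None,
    "Volume": None,
    "Edition": "Edition",
    "Authority control": ["ISSN", "ISBN", "OCLC", "VIAF"],
    "Publisher": "Publisher",
    "Printer": None,
    "Year of publication": "Date of publication",
    "Place of publication": "Place of publication",
    "Language": "Language",
    "Description": None,
    "Page overview": None,
    "Source": None,
    "Permission": None,
}


def undecided_fields(mapping):
    """Return the template fields that still have no target property."""
    return [field for field, prop in mapping.items() if prop is None]


if __name__ == "__main__":
    for field in undecided_fields(BOOK_TEMPLATE_TO_PROPERTY):
        print("still TBD:", field)
```

Ten of the nineteen fields are still open in this table, which is a quick way to track what remains to be discussed.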

Let me be clearer about the confusion which is going on here (also in my brain). We are designing a set of properties for books, but we are (I, at least) confusing the levels of the books. We often think of them as works (again, look at the w:FRBR model): this is the case in Wikipedia, for example. Look here: Q274744 is a Wikidata page for a work called "Sense and sensibility", which has been translated and published (and scripted and directed) many times. Its infobox on the Wikipedia page has these fields:

  • Author(s): Jane Austen
  • Country: United Kingdom
  • Language: English
  • Genre(s): Romance, Novel
  • Publisher: Thomas Egerton, Military Library (Whitehall, London)
  • Publication date: 1811
  • OCLC Number: 44961362
  • Followed by: Pride and prejudice

But, in Wikisource, there is the text of the book itself, and on Commons we have 3 different files for 3 volumes ([[:commons:File:Sense and Sensibility (Volume 1).djvu |1]], [[:commons:File:Sense and Sensibility (Volume 2).djvu |2]], [[:commons:File:Sense and Sensibility (Volume 3).djvu |3]]) of a particular edition of the book. The volume 1 of that book (as a manifestation of a work), has the following metadata (in template Book):

  • Author: Jane Austen
  • Editor:
  • Translator:
  • Illustrator:
  • Title: Sense and Sensibility
  • Subtitle:
  • Series title:
  • Volume: 1
  • Edition: First
  • Authority control:
  • Publisher: T. Egerton
  • Printer: Whitehall
  • Year of publication: 1811
  • Place of publication: London
  • Language: english
  • Description: This is the scanned DjVu text Sense and Sensibility (Volume 1) by Jane Austen.
  • Page overview:
  • Source: Internet Archive
  • Permission: Public Domain

Many metadata match, so it's ok, but I still think we should define a way to handle different levels of books while we define the properties. Otherwise it will be a mess later on. --Aubrey (talk) 10:04, 19 February 2013 (UTC)

As I see it the metadata intended for infoboxes should match the FRBR level that the Wikipedia page is describing. For citations in articles it would be good in my opinion to use the source component of the properties. In this context a source is yet another triple whose datatype could be a link to another Wikidata item. Imagine if Wikidata did a full import of some Library catalog. Then your source would be another Wikidata item, which is actually an import of a Library catalog item - and the metadata used would be that of the Library's who you imported. Maximilianklein (talk) 09:05, 21 February 2013 (UTC)
I think it would be very cool to import library records (and to work out ways to use them as sources). So we need a metadata field that states the "level" of the book recorded (FRBR, maybe?).
Anyway, I tried to draft a mapping here: https://docs.google.com/spreadsheet/ccc?key=0AlPNcNlN2oqvdFQyR2F5YmhrMWpXaUFkWndQWUZyemc#gid=0 Please give me feedback. I'm really interested in developing properties such as "subject", "mentions", "genre", "keywords"... That would help us a lot in stating the "aboutness" of a resource, and this could be extremely useful. Moreover, "mentions" can also be used right now for Wikisource, which often has proper templates to cite other works or authors (see s:it:Template:AutoreCitato). --Aubrey (talk) 14:54, 25 February 2013 (UTC)

Properties discussion

  • Country: is ambiguous (and redundant with Place of publication). I think we don't need this property.
 Support I also think we don't need it. Aubrey (talk) 12:14, 22 March 2013 (UTC)
✓ Done I removed "country" from the list. --Kolja21 (talk) 14:02, 25 March 2013 (UTC)
  • Place of publication:
✓ Done --Kolja21 (talk) 14:02, 25 March 2013 (UTC)
  • Publisher: what happens if a company has a name change (the label of an item can change)?
Well, I don't think we care. We'll have a page for that publisher which says that its name changed over the years :-). Aubrey (talk) 12:14, 22 March 2013 (UTC)
+1. I don't think such a change would be a problem: the name of the publisher item would change, and so would every property that points to it. Alternative names, including old ones, should be listed in the publisher item. --Don-kun (talk) 18:51, 7 April 2013 (UTC)
A link to commons would be better than choosing one picture of many possible. --Don-kun (talk) 18:51, 7 April 2013 (UTC)
As you know, we'll have time to discuss Authority control properties. Aubrey (talk) 12:14, 22 March 2013 (UTC)

--Kolja21 (talk) 06:51, 16 March 2013 (UTC)

Do you mean the number of all volumes of a work or the number of a particular volume, which is the topic of the item? --Don-kun (talk) 18:51, 7 April 2013 (UTC)
I mean the number of a particular volume.--Snaevar (talk) 11:06, 9 April 2013 (UTC)


Moreover, I'd like to propose these properties:

  • Wikisource: it's the URL for the digitized book on Wikisource. It can be ns0 or nsIndex.
  • Commons: it's the URL for the digitized scan on Commons. It could be a File or a category.
  • awards: it's a parameter in the English Infobox. ✓ Done
  • mentions: I think this could be particularly useful, as it could exploit data generated by templates like it:s:Template:AutoreCitato. A book often mentions other books and authors. On the Italian Wikisource we have templates to state when an author or a text is cited. We could use these data to have on Wikidata an updated list of authors cited and books cited. Aubrey (talk) 12:25, 22 March 2013 (UTC)
Wikisource and Commons should be links, not properties. --EugeneZelenko (talk) 14:55, 4 April 2013 (UTC)
+1. And mention may be a good idea in the future, but it seems to be a lot of work to collect mentions of books. So maybe we should wait to use such a property and concentrate on other things. --Don-kun (talk) 18:51, 7 April 2013 (UTC)
Ok for Wikisource and Commons: we'll probably have a chat with the WD team, so we'll figure it out with them. As for mentions, well, I think that the property would do no harm, as it is something that could be used in other ways (even Wikipedia entries "mention" books and documents, as citations). I agree that right now this property would maybe benefit only the Italian Wikisource (because I'm sure a bot can be written to list in Wikidata all the authors mentioned, parsing the categories). But I'm also sure this is a fundamentally good idea, which would be better spread across the other Wikisources (which maybe have a similar system). I'm in no rush for this kind of property (there are other priorities), but please keep discussing it :-) Aubrey (talk) 14:34, 8 April 2013 (UTC)

I missed a property like "dcterms:subject", so I added Property:P31, which comes close to that. As a side effect, awards can be added with that property. Dr0i (talk) 08:19, 17 September 2013 (UTC)

Sorry to come late to the discussion, but we need a country of publication. This is especially important regarding the copyright of a work. Besides, the city of publication is often not known, although the country is. And a work could be published in several places in one country, e.g. once in New York and once in San Francisco. And last, I have seen the place of publication include several countries (PAL regions). Yann (talk) 16:57, 14 November 2013 (UTC)

First of all, a definition for Book entity

Excellent idea Aubrey! Ubi major minor cessat, so I'll only contribute in some detail. I encourage the group to define exactly what's a book "entity" (or a "written work" entity?) and what's its best label and description content. --Alex brollo (talk) 15:12, 6 March 2013 (UTC)

This is a tricky issue, because, AFAIU, we in Wikimedia projects look at books from different perspectives.
FRBR is a theoretical framework which gives us 4 main "views" of the Book: work, expression, manifestation, and item.
Wikipedia says:
Group 1 entities and basic relations (RDF version)
  • Work is a "distinct intellectual or artistic creation." For example, Beethoven's Ninth Symphony apart from all ways of expressing it is a work. When we say, "Beethoven's Ninth is magnificent!" we generally are referring to the work.
  • Expression is "the specific intellectual or artistic form that a work takes each time it is 'realized.'" An expression of Beethoven's Ninth might be each draft of the musical score he writes down (not the paper itself, but the music thereby expressed).
  • Manifestation is "the physical embodiment of an expression of a work. As an entity, manifestation represents all the physical objects that bear the same characteristics, in respect to both intellectual content and physical form." The performance the London Philharmonic made of the Ninth in 1996 is a manifestation. It was a physical embodiment even if not recorded, though of course manifestations are most frequently of interest when they are expressed in a persistent form such as a recording or printing. When we say, "The recording of the London Philharmonic's 1996 performance captured the essence of the Ninth," we are generally referring to a manifestation.
  • Item is "a single exemplar of a manifestation. The entity defined as item is a concrete entity." Each copy of the 1996 pressings of that 1996 recording is an item. When we say, "Both copies of the London Philharmonic's 1996 performance of the Ninth are checked out of my local library," we are generally referring to items.
In my understanding, Wikipedia often views books at the work level. At that level, you don't have a publisher or an edition. But in Wikisource (and in Commons), we view them at the manifestation level: a single edition, a single translation, with a publishing date, an editor, etc. We can have different editions of the same book, because we have scans, and we have them on Commons. So my mapping was meant just to collect all the metadata needed, without thinking about the FRBR level. I just picked the templates we use in Wikimedia and listed the set of core properties needed. Wikidata will be FRBR-agnostic, I think: we will have a set of properties you can use on Wikidata, but speaking about Hamlet (the work) is one thing, and speaking of a 1922 Italian translation of Hamlet is another. How we will really structure all these things is yet to be decided. --Aubrey (talk) 15:35, 6 March 2013 (UTC)
Aren't the djvu/pdf files on Commons and Wikisource items rather than manifestations? (I'm not sure I understand the difference, but the files on Commons that are then used on Wikisource come from a specific library via a specific source, often with a specific stamp.)
In any case, a work is usually defined by its first item, no? (Along with some other significant items, but the first one is almost always significant.)
What about the languages too? Hamlet or Romeo & Juliet are « universal » works; the Wikipedia articles speak about the work but usually say « first presented/published in 16.. » (referring to the first items in English and sometimes adding the first in the local language). So Works are « translingual » and Items are « language specific » (instrument specific for musical works); what about Expressions and Manifestations? How does it work?
Cdlt, VIGNERON (talk) 16:22, 6 March 2013 (UTC)
In my view, the relationship between works and books is (unluckily) many-to-many. Given a list of works, they are collected into many kinds of books: single-work books, collections of many works in a single book, or a single work published in different books.
So, I think that at least two different kinds of entities should be created:
  1. at work level (author/authors, topic/topics, original language/languages .....)
  2. at "book" level (author/authors, work/works, editor, publisher.... see Aubrey's table; I presume that the Commons template Book could be very useful, but its single set of fields only covers data for proofread books).
Nevertheless, there's one more complication, since a single "work" is sometimes a collection of different works. Collections of novels and of poems are a typical example. Any one of Shakespeare's sonnets is a "work", just like any novel by our Giovanni Verga. And... in some cases it is far from simple to avoid controversies. Doesn't the Inferno of Dante Alighieri deserve its place among the entities and its own unique Wikidata ID? --Alex brollo (talk) 23:22, 6 March 2013 (UTC)
I would say that Wikisource books are really "manifestations" - they're used as examples of that edition even though they're scans of a single physical copy. A true item-record would be Q80935 - only one exists! Andrew Gray (talk) 23:39, 6 March 2013 (UTC)
I agree with Andrew. VIGNERON, "items" are single, physical instances of books. For example, 2 different copies of a 1918 edition of "I Malavoglia". This is actually important for scans: you can scan a book in good condition, and one in bad condition. But, AFAIK, right now we never keep different scans of the same edition of a book! We just keep the best one. This is why I think we work on "manifestations". Aubrey (talk) 09:47, 7 March 2013 (UTC)
So. I presume we need a double set of entities ("works" and "manifestations") and a table matching the two kinds of entities to record their many-to-many relationship. Can any of you give me a link to understand how a many-to-many relationship can be implemented here in Commons?
Another many-to-many table is needed to record the authors/works relationship. Alex brollo (talk) 13:23, 8 March 2013 (UTC)
Wikipedia may not even be on "work" level. I have tried to model the book The Skeptical Environmentalist. It exists in both Danish version(s) and an English version(s?). It might be that FRBR would regard the Danish and English versions as separate (but derived). — Fnielsen (talk) 11:11, 14 March 2013 (UTC)
Q1755935 is about the work Verdens sande tilstand. On the long run we also need items for the different editions (Danish, English with ISBN ...). I wouldn't use the term "book", because both (work and editions) are books. --Kolja21 (talk) 03:48, 15 March 2013 (UTC)
The difference between a work and a book is deep and tricky to implement fully in a database.
It's obvious, but allow me to mention some possibilities.
Some works are split into different books. Most works have one author; some works have more than one.
Some books collect different works by one author. Many books contain works by different authors. A work can be translated into a variety of languages. Books containing the same work in the same language differ, sometimes substantially, in orthography and punctuation, after the editor's critical work.
I don't see any other solution than a two-level work/book structure, where a book is the entity that can be identified by an ISBN. "Work" entities should be unique, "abstract", language-independent, linked by a many-to-many relationship to books (a completely different set of entities), and linked by a many-to-many relationship to the set of authors. --193.43.176.15 12:04, 15 March 2013 (UTC)
I see that I wasn't logged in when I posted the previous comment. I apologize; it's mine. --Alex brollo (talk) 10:10, 27 March 2013 (UTC)
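The many-to-many relationship between works and books described above can be sketched as a plain link table with one (work, book) pair per row, indexed in both directions. This is an illustration only, not a statement about how Wikidata stores anything: the function name is made up, and the sonnet collection is a hypothetical book invented for the example.

```python
from collections import defaultdict

# Link table: one (work, book) pair per row. The three-volume 1811 edition of
# "Sense and Sensibility" is the case discussed earlier on this page; the
# sonnet collection is a hypothetical book invented for this sketch.
WORK_BOOK_LINKS = [
    ("Sense and Sensibility", "T. Egerton 1811, vol. 1"),
    ("Sense and Sensibility", "T. Egerton 1811, vol. 2"),
    ("Sense and Sensibility", "T. Egerton 1811, vol. 3"),
    ("Sonnet 18", "hypothetical collected sonnets"),
    ("Sonnet 130", "hypothetical collected sonnets"),
]


def index_links(pairs):
    """Index a list of (left, right) pairs in both directions."""
    by_left = defaultdict(set)
    by_right = defaultdict(set)
    for left, right in pairs:
        by_left[left].add(right)
        by_right[right].add(left)
    return by_left, by_right


books_of_work, works_in_book = index_links(WORK_BOOK_LINKS)

if __name__ == "__main__":
    # One work split over several books ...
    print(sorted(books_of_work["Sense and Sensibility"]))
    # ... and one book collecting several works.
    print(sorted(works_in_book["hypothetical collected sonnets"]))
```

The same helper would serve for the authors/works table, since it only assumes a list of pairs.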

Bot specs

Firstly I wanted to mention that I think getting the "Work" properties right is going to be our most difficult challenge. That's actually something that goes further than books. Having a general Creative work set of properties that is not tied to the expression is important because that's the level that most Wikipedia articles are.

That being said, for Books, we will still need a good set of properties that are at the expression level. I have plans to write a bot that consults the infoboxes of the major Wikipedias and fills in non-conflicting properties, and, if an ISBN or OCLC number is available, uses worldcat.org to find extra missing data. Are there any objections or suggestions about making a BooksBot like this? Maximilianklein (talk) 22:05, 6 March 2013 (UTC)
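One possible reading of "fills in non-conflicting properties" is sketched below: a property is kept only when every Wikipedia that states it gives the same value, and everything else is set aside for human review. This is a guess at the intended behaviour, not the actual BooksBot; the function name and the sample infobox values are hypothetical, and no real infobox parsing or WorldCat lookup is attempted.

```python
def merge_non_conflicting(infoboxes):
    """Split infobox data into agreed and conflicting properties.

    infoboxes maps a wiki code to a dict of {property: value}.
    A property is "agreed" if all wikis that state it give the same value.
    Returns a pair (agreed, conflicts).
    """
    seen = {}
    for fields in infoboxes.values():
        for prop, value in fields.items():
            seen.setdefault(prop, set()).add(value)
    agreed = {p: next(iter(v)) for p, v in seen.items() if len(v) == 1}
    conflicts = {p: sorted(v) for p, v in seen.items() if len(v) > 1}
    return agreed, conflicts


# Hypothetical sample data, loosely based on the infobox quoted on this page.
SAMPLE = {
    "en": {"author": "Jane Austen", "publisher": "Thomas Egerton",
           "publication date": "1811"},
    "it": {"author": "Jane Austen", "publisher": "T. Egerton"},
}

if __name__ == "__main__":
    agreed, conflicts = merge_non_conflicting(SAMPLE)
    print("would be written:", agreed)
    print("needs review:", conflicts)
```

Note that a purely textual comparison flags "Thomas Egerton" and "T. Egerton" as a conflict even though they name the same publisher, which is one reason to compare linked items rather than strings where possible.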

I think we should avoid importing ISBNs/OCLCnumbers for now until we've actually figured out work/expression differences and what entries we're actually going to use them on; WP uses ISBNs and OCLC numbers in a very mishmashy way. Andrew Gray (talk) 23:41, 6 March 2013 (UTC)
Identifiers are many and more, and I think that theoretically we could have them all. Obviously, they work at the "expression/item" levels, not at the work level. I still don't know how we will really sort it out, but I guess we'll have an entity for the work (eg. Wikipedia page) and an entity for the expression (eg. Commons scans, Wikisource index page). We will repeat the metadata, I fear (author, title, etc.)...
Another possibility would be to store all the metadata in one single page (the "work" page) and then have a section, or whatever, that clearly states there is an "expression" level, and then the metadata would be added for that specific level. Or we could use a "has Expression" property... --Aubrey (talk) 13:24, 8 March 2013 (UTC)

archive.org API

Cute: http://want.archive.org/api?isbn=1930841426 --Nemo 21:15, 8 March 2013 (UTC)

Notability

Right now only books that have a Wikipedia article are notable. I think we should expand the rules. Proposals:

  1. Every edition (2nd edition, translations etc.) is notable: too early
  2. Every work is notable: too early
  3. Every work is notable, that is used as a source: imho a good start

There is already a collection of no. 3 items, see de:Kategorie:Vorlage:BibISBN. --Kolja21 (talk) 07:04, 16 March 2013 (UTC)

Good idea. I would also add that every work/edition which is present on Wikisource and Commons can stay on Wikidata. But right now, it is not implemented. --Aubrey (talk) 11:37, 18 March 2013 (UTC)
I would also include all (or most) books which we have on Wikisource, or Commons. One of those years those projects will be supported by Wikidata too. --Jarekt (talk) 19:47, 18 March 2013 (UTC)
There are plans to use Wikidata for citations on Wikipedia. For that, solution 3 would work, but does not seem practical. If we want to make serious use of Wikidata for citations, it means we have to create items for most of them. But restricting that to books used in Wikipedia sounds like an unnecessary maintenance headache. It would need to be updated each time a new book is used in Wikipedia, and also each time a book ceases to be used in Wikipedia, unless we have weird notability rules like "books that used to be cited in Wikipedia even if it was nothing but spam".
Using solution 1 or 2 seems much simpler. We could then upload books from external databases like WorldCat. And provided we devise the right gadgets, when a Wikipedia user wants to cite a book, she can easily make use of Wikidata, even when the book is not yet cited in any Wikipedia. --Zolo (talk) 10:07, 19 March 2013 (UTC)
That will be o.k. with me, but I don't know if we will get a majority for that. Let's move this discussion to WD:N. --Kolja21 (talk) 15:30, 19 March 2013 (UTC)
Sure, done. --16:46, 19 March 2013 (UTC)
This section has been moved to Wikidata talk:Notability

Relations between books

I was reading this discussion (please join), and I think we need to think about relations between different expressions/manifestations of a work. I think that Pichpich is raising a paramount point. We'd need to be able to describe the relationships between different expressions and manifestations of books; this is very important (see w:FRBR for definitions). I'm not sure whether we'd need to create FRBR properties such as "is manifestation", "is expression", etc. I would say no. But we need (I think) to express that a book is a translation of another book: we have the translator property, but it's a property about a person and a book, not about the book and its translation. AFAIK, we are now discussing whether:

  • we'll have a page for each book (intended as a manifestation). It's sort of the Openlibrary structure.
  • we'll have a page for just the work and then create a lot of subclasses within that page for every "expression/manifestation". I think I disagree with this.

Is any of you familiar with the logical structure of Open Library? We could study that and see what fits within Wikidata (unfortunately, I know for sure that that project is stuck at the moment: we could even think of reproducing (some of) its data here). We have the opportunity to create a detailed network of books; we should not waste it. --Aubrey (talk) 11:02, 29 March 2013 (UTC)

Open Library has two pages:
1. The Autobiography of Alice B. Toklas (Q6618986) as a work (/works/OL37277W/) and
2. Published 1955 by Vintage Books in New York as single edition aka "book" or "media" (/books/OL15026461M/)
Imho we have to do the same. It helps citing a book. Unfortunately Wikidata is missing basic functions like controlled vocabulary (it would be nice creating properties with limited answers like: "work" and "book") and reciprocal linking. Even though Q6618986 has been written by Q188385 (G. Stein), the work does not appear on her page. --Kolja21 (talk) 13:39, 30 March 2013 (UTC)
The main building block that we are missing is "reciprocal linking" or "bidirectional linking", with that feature it should be possible to have the pairs "has translation/translated from", "has edition/edition from", etc. In the mean time we could do it manually and fill the missing links with a bot. What do you think?--Micru (talk) 14:05, 4 April 2013 (UTC)
The development team has made it clear that they do not intend to develop bidirectional properties anytime soon, and it seems that such properties would raise thorny questions. However, we will have the ability to make queries that provide the same information as bidirectional properties. Because of that, I think a "translation of" property would be enough. "Has translations" will be easily computable from it, and hardcoding, say, a list of all translations of the Bible would be cumbersome.
As to the work / book distinction. That sounds rather appealing but I wonder how it would scale. Does that mean that every minor book with only one edition should have two items, one for the work and one for the book ? An alternative might be to use the first edition as a main item of sorts (for instance, it would be the one linked to Wikipedia articles). --Zolo (talk) 15:19, 4 April 2013 (UTC)
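The point that "has translations" is computable from a one-way "translation of" property can be shown with a small inversion. This is only a sketch over an in-memory dictionary, not a Wikidata query; the item identifiers are hypothetical placeholders and the function name is made up.

```python
from collections import defaultdict

# Each translation stores only the one-way claim "translation of".
# All identifiers below are hypothetical placeholders, not real item IDs.
TRANSLATION_OF = {
    "edition:Hamlet-it-1922": "work:Hamlet",
    "edition:Hamlet-de": "work:Hamlet",
    "edition:Der-Prozess-en": "work:Der Prozess",
}


def has_translations(translation_of):
    """Derive the reverse relation from the one-way 'translation of' claims."""
    inverse = defaultdict(list)
    for translation, original in translation_of.items():
        inverse[original].append(translation)
    return {original: sorted(items) for original, items in inverse.items()}


if __name__ == "__main__":
    for original, translations in has_translations(TRANSLATION_OF).items():
        print(original, "->", translations)
```

Because the reverse list is derived on demand, nothing needs to be kept in sync by hand when a new translation is added.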
Ok, then we'll have to live with one-way links plus queries.
My proposal is to use a single item at first to represent both the work and the book; if further editions appear, then we should make a further distinction with "edition of" or "translation of". If it is done that way, then we might need a gadget to simplify the process. The process would be something like this: 1) first the work/book is represented as one item with mixed data, that is, work data and book data; 2) when a user wants to add a translation/edition, the gadget divides the original "work/book item" in two and creates a new one where the user enters the data related to the new edition/translation. --Micru (talk) 20:25, 5 April 2013 (UTC)
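The splitting step of the gadget described above could look roughly like this. It is a sketch under stated assumptions: which properties count as work-level is a guess made for the example, the identifiers are hypothetical, and the real gadget would of course operate on Wikidata items rather than Python dictionaries.

```python
# Assumption for this sketch: these properties describe the work itself,
# everything else describes one particular edition.
WORK_LEVEL = {"author", "genre", "original language"}


def split_mixed_item(item_id, claims, new_edition_id):
    """Split one mixed work/book item into a work item and an edition item.

    The work keeps the original identifier (so existing links stay valid);
    the edition gets a new identifier and an "edition of" link to the work.
    """
    work = {p: v for p, v in claims.items() if p in WORK_LEVEL}
    edition = {p: v for p, v in claims.items() if p not in WORK_LEVEL}
    edition["edition of"] = item_id
    return {item_id: work, new_edition_id: edition}


# Hypothetical mixed item and identifiers, for illustration only.
MIXED = {
    "author": "Jane Austen",
    "genre": "Novel",
    "publisher": "T. Egerton",
    "publication date": "1811",
}

if __name__ == "__main__":
    for ident, item in split_mixed_item("item:work", MIXED, "item:edition-1").items():
        print(ident, item)
```

Keeping the original identifier on the work side is a design choice made here so that Wikipedia sitelinks, which usually describe the work, would not need to move.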
OL does create a Work page for every book, even those that only exist in one published form. The stats from WorldCat indicate that about 85% of titles have only one published form. My preference is not to create a Work entity but to treat Works as clusters of published books, although I can't say today how I would do that. Note that having a separation between Work and Book/Edition can complicate search and display. In OL, if you search on something that is Edition-specific (e.g. Publisher), the display still shows you the Work and all of the Editions -- exactly what you were trying to avoid by searching on a specific publisher or a specific date. So think through the search and display before deciding that you want separate work/edition entities. Kcoyle (talk) 16:06, 2 October 2013 (UTC)
Hi Zolo. Referring also to the thread below (Original title / vs title and similar issues), I want to make clear that I proposed the work/manifestation distinction mainly because I'm thinking about sister projects like Commons and Wikisource. Micru and I will probably have a chat with the Wikidata team regarding the grant we won (yeah :-): it is assumed, in fact, that per WD:N only Wikipedia articles will have a Wikidata entry, but I think it will be easy in the coming months to establish a similar convention for sister projects (if it's in Commons/Wikisource, then it can be in Wikidata). Thus the point is to define properties that will manage the different "statuses" of books, which are seen from one perspective in Wikipedia but from a different one in Wikisource and Commons... Aubrey (talk) 20:48, 7 April 2013 (UTC)
Congrats on the grant, that seems like a valuable project. The notability policy was explicitly conceived for phase 1 and clearly cannot do for phase 2.
I am not sure that sister projects integration fundamentally affects the requirements for Wikidata. Wikipedia also needs information about both a text in general (for infoboxes) and for a particular edition of a book (for citations). Or did I miss something ? --Zolo (talk) 07:42, 8 April 2013 (UTC)
Well, that's a thing I haven't understood yet. Will Wikidata store metadata for citations in references and bibliographies? Is someone working on those? That would be paramount, but IMO extremely difficult to do well. We would have to work at a high granularity of metadata, and I haven't seen discussions on WD about these topics. If you know of any such discussion, please point it out. For now, I am assuming that we are working for infoboxes in all wiki projects, and this is how I developed my mapping... Aubrey (talk) 07:24, 9 April 2013 (UTC)
I can actually reply myself, look at this page. Aubrey (talk) 07:26, 9 April 2013 (UTC)

I think that we are tackling the 2 most important issues here:

which structure do we want to reproduce for books?
    1. One page for the "work" (eg. Wikipedia) and one page for the "edition/manifestation" (eg File on Commons, Index on Wikisource).
    2. a unique page, which will store all the data of the work and the editions too.

I prefer (at least at the beginning) the first option. For this, I am assuming for now that Wikidata will host data for infoboxes, both for Wikipedia and sister projects. Thus, I am assuming that it will be easier for the WD team to integrate metadata from Commons templates in WD, creating WD items called, for example, File:FILE.djvu, etc. We still don't know what they plan, but tomorrow Micru and I will have a chat with them and we'll know more.

This structure is similar to the OpenLibrary structure (one page for the work, one for every edition), and moreover it is in this direction that brand new bibliographic frameworks such as BIBFRAME are being developed (source: Karen Coyle). Please bear in mind that we don't expect a billion more WD items because of this, since on average we have a single scan of a single book (we may have more than one scan, but it's rare).

Of course, we can create properties/links to connect all the editions of a single work (to each other and to the work itself), and vice versa. We can thus use new properties/links to solve these "many to many" relationships (the ones Alex was talking about a few weeks ago).

The other main issue is:

will WD store also bibliographic metadata of cited books?

This is what is discussed in Wikidata:Sources. I don't have an answer yet. I think this matters a lot because it will affect the number of "edition" pages on WD, and thus the structure we want to use for books (option 1 or option 2). Please give me feedback: tomorrow we will discuss these 2 issues with the WD team. Aubrey (talk) 14:30, 11 April 2013 (UTC)

If I had to choose one of the two systems, the first one would also sound simpler to me.
It is rather hard to know what will come out of Wikidata talk:Sources, as not many people take part. user:Snipre has asked the development team for some feedback about that too, but has not got any so far.
About using them in Wikipedia's cite book templates, I suppose you can ask for opinions in the Wikipedias. My bet is that if that could be done in a user-friendly way, it would be very helpful. --Zolo (talk) 16:45, 11 April 2013 (UTC)
Few notes after an online meeting with Wikidata team
  • The WD team will tackle sister projects one by one. They will start in a few months, after the end of the present phase. They will start with the simpler ones, e.g. Wikivoyage.
  • They don't know which "structure" the Commons files will have in Wikidata. We discussed option 1 (above) with them, but they said it's unlikely that there will be a WD item for every manifestation. We then discussed it in more detail; they understood the concepts involved better and said that it's maybe doable. But it also depends on the community.
  • They say a lot of metadata will be stored in Commons and Wikisource; not everything will be in Wikidata.
This means that they will provide tools and such, but much of the data will remain in the sister projects. This is somewhat unexpected, but not bad per se. We need to stay tuned on this.
  • It is not yet certain if Sources were to be stored in Wikidata or not.
  • The only Wikidata ID will be for pages. There will be none for subpages/sections. Aubrey (talk) 13:32, 12 April 2013 (UTC)
Sorry, but I don't understand this sentence: "It is not yet certain if Sources were to be stored in Wikidata or not". What do you mean by sources? In the end, what kind of information can we put in the source section for each statement? Snipre (talk) 13:13, 16 April 2013 (UTC)

Original title / vs title and similar issues

As suggested in the above thread, when a book has translations, the translations should have their own items. That makes properties like "original title" rather odd. The title should always be the title of the item. If the item is the original edition, it should be the original title; if it is a translation, it should be the translation's title, and we can easily find the original title by following the link in the "translation of" property. So I think that what we need are general "title", "subtitle" and "language" properties with wide applicability, and qualifiers for fringe cases. --Zolo (talk) 07:19, 6 April 2013 (UTC)

The title is always the title. But for people who are interested in The Diary of a Young Girl, "original title", "original subtitle" and "original language" are important properties. "Translations should have their own item"? Maybe, but right now it's not even allowed to create an item without a Wikipedia article (WD:N). --Kolja21 (talk) 21:36, 6 April 2013 (UTC)
What do you mean by "the title is always the title"? We cannot always store all relevant data about the title in the label, so we need a property for it, and for the subtitle as well. That is true whether the title is the original one or not. My point was not that the original title is unimportant; it was that storing it in a separate property is not a good way to handle the issue. It is fairly clear that the notability guidelines have to be changed, so I do not think it makes sense to base our data structure on the current ones. --Zolo (talk) 06:53, 7 April 2013 (UTC)
What titles are you talking about? And in what languages? One or multiple? --Kolja21 (talk) 17:41, 7 April 2013 (UTC)
I think I was fairly explicit: I was saying that items about a particular edition should just store the title of the edition, that clearly means in the language in which it was published. If the item is about the text in general, it can have several titles, and the original one should be marked as original. But it should be done with a qualifier, doing it with a specific property makes the structure more complex and offers no benefit. --Zolo (talk) 19:37, 7 April 2013 (UTC)
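The idea of storing only each edition's own title and finding the original title through "translation of" can be sketched as follows. The item identifiers and the dictionary layout are invented for the example; only the two titles are real.

```python
# Hypothetical items: each stores only its own title and language.
items = {
    "Q_orig": {"title": "Het Achterhuis", "language": "nl"},
    "Q_en": {
        "title": "The Diary of a Young Girl",
        "language": "en",
        "translation of": "Q_orig",
    },
}

def original_title(item_id, items):
    """Follow 'translation of' links until an item without one is reached."""
    seen = set()
    while "translation of" in items[item_id]:
        if item_id in seen:
            raise ValueError("cycle in 'translation of' links")
        seen.add(item_id)
        item_id = items[item_id]["translation of"]
    return items[item_id]["title"]
```

Calling original_title("Q_en", items) walks one link and returns the title stored on the original item, so no separate "original title" property is needed on the translation.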

Pages number

What does "Pages number" mean: the total number of pages? Only the numbered pages (including pages numbered in roman numerals)? Note that pagination in bibliographical records is often in the form pp. i-xii, 5-362. It looks like "pages number" will be the number of pages in the Commons item, but that is not very useful for most purposes, except to describe the Commons item itself. This question is related to the above talk: what are we describing here? A manifestation, a physical book item, or a Commons item which is tied through a scan to a physical book item? (Some items on Commons are built from scans of different physical copies, because different pages are damaged in each available scan.) Phe (talk) 15:35, 8 April 2013 (UTC)

The properties are taken from en:Template:Infobox book, so "pages" means total pages. As far as I know you can enter both types: "xi, 200" or (total) "211". The page number for referencing is Property:P304. --Kolja21 (talk) 17:36, 8 April 2013 (UTC)
I have changed it to "Total number of pages" to make it more clear.--Micru (talk) 18:00, 8 April 2013 (UTC)
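To make the relation between the two notations concrete ("xi, 200" versus the total "211"), here is a minimal sketch that sums the parts of such a statement. The function names are made up for the example, and it deliberately does not handle page ranges such as "pp. i-xii, 5-362".

```python
import re

ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}

def roman_to_int(text):
    """Convert a roman numeral such as 'xi' to an integer."""
    values = [ROMAN[ch] for ch in text.lower()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (iv = 4).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total

def total_pages(pagination):
    """Sum the comma-separated parts of a statement such as 'xi, 200'."""
    total = 0
    for part in pagination.split(","):
        part = part.strip()
        if part.isdigit():
            total += int(part)
        elif re.fullmatch(r"[ivxlcdm]+", part, flags=re.IGNORECASE):
            total += roman_to_int(part)
        else:
            raise ValueError(f"unrecognised pagination part: {part!r}")
    return total
```

With this, total_pages("xi, 200") gives 211, matching the example above.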

Collection/series

Some items are subdivided into Period and then Series. E.g. in the few-hundred-volume "Revue des deux Mondes", you can get a volume Period III, series 2, volume 45 and a Period IV, series 2, volume 45. How do we identify these? Looking at Dublin Core, I see only Coverage, which could be used for the period. Phe (talk) 15:47, 8 April 2013 (UTC)

What do you think about using a 3-level item approach? There would be the general item "Revue des deux Mondes", then an item for the period "Revue des deux Mondes - Period IV" (series: "Revue des deux Mondes") and an item for the series "Revue des deux Mondes - Period IV - series 2" (series: "Revue des deux Mondes - Period IV").--Micru (talk) 17:58, 8 April 2013 (UTC)
I think that everything depends on what Wikidata will store, that is, which WD items we'll create for Wikipedia and sister projects. Above there's a discussion on that. --Aubrey (talk) 14:38, 11 April 2013 (UTC)
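The 3-level approach can be sketched with plain dictionaries keyed by label, where "series" points one level up. The layout is an assumption made for the illustration, not a proposed Wikidata structure; it only shows that the two "volume 45" cases become distinguishable once the full chain is taken into account.

```python
# Hypothetical items keyed by label; "series" points to the parent level.
items = {
    "Revue des deux Mondes": {},
    "Revue des deux Mondes - Period III": {"series": "Revue des deux Mondes"},
    "Revue des deux Mondes - Period III - series 2": {
        "series": "Revue des deux Mondes - Period III"
    },
    "Revue des deux Mondes - Period IV": {"series": "Revue des deux Mondes"},
    "Revue des deux Mondes - Period IV - series 2": {
        "series": "Revue des deux Mondes - Period IV"
    },
}

def series_chain(label, items):
    """Return the labels from the given item up to the top-level series."""
    chain = [label]
    while "series" in items[chain[-1]]:
        parent = items[chain[-1]]["series"]
        if parent in chain:
            raise ValueError("cycle in 'series' links")
        chain.append(parent)
    return chain

def volume_key(series_label, volume, items):
    """Identify a volume by its full series chain plus its volume number."""
    return (tuple(series_chain(series_label, items)), volume)

period_iii_vol_45 = volume_key("Revue des deux Mondes - Period III - series 2", 45, items)
period_iv_vol_45 = volume_key("Revue des deux Mondes - Period IV - series 2", 45, items)
```

Both keys end in volume 45 and share the top-level series, but they differ in the period level of the chain.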

Local data storage/manipulation

The previous talk, "Pages number", points to a general need for something that should be built, and that IMHO is too granular to be hosted in Wikidata but too complex to be implemented in regular project pages. Really, any book is a complex "data container" that needs a formally decent and fast database approach to be managed properly, both when editing (think about the complex relationship between links in nsPage and their conversion into running links in ns0) and in view mode (think about internal and external quotes, glossaries, analytical indexes, and so on). While Wikidata seems an excellent project for managing a general database of high-level data, I feel that it is not an appropriate container for such low-level, project-specific data. I presume that other specialized "sister projects" have the same need that I feel working in Wikisource.

If I'm right, it would be very interesting to export some Wikidata principles into local, simplified but well-structured database structures that are easy to implement and update, fast, and inexpensive for the servers.

Moderately complex sets of data (some thousands) can be implemented roughly in normal wiki pages (as JSON structures to be read by JavaScript; as templates containing mega-switches; as tables of labeled sections (very server-expensive!); as structured texts; as data attributes of HTML tags), but such data structures are far from effective, safe, and easy to build and update; they are far from standardized and almost unusable by other projects. A major drawback is that any minor update of such data-container pages implies storing the whole page in the page history, a terrible waste of space when many frequent, minute updates are made to a heavy page.

I presume that it would be great if the Wikidata experience were used to suggest some standard for local data storage and management. I think that an interesting step would be to add a special, optional nsData, with no page history or with a differential history mechanism, and to suggest one, or a few, standardized data structures and editing tools to be used in such special pages. --Alex brollo (talk) 22:17, 10 April 2013 (UTC)

Hi Alex. Well, this is an interesting point of view, I hope to find the time to show your message to the WD team. I still think WD is a main opportunity for WS, but we should find a way to use it as it is, because no one will ever spend a cent to think for a Wikidata optimized for Wikisource. Not in this decade, I fear. But maybe your perspective can be implemented with qualifiers, or other tricks. --Aubrey (talk) 14:35, 11 April 2013 (UTC)
As soon as we arrive at a good definition of what a book is and what a work is, Wikidata will host interesting and useful metadata about these two related entities, and this will be a great help in managing such entities and in sharing related general entities. After a minimal look into Wikidata, I agree with you that the idea that minute, granular book and work data could ever be managed by Wikidata is both unrealistic and ineffective. Think simply of the growing need to build book-specific databases of words, to use them both as editing tools (as Distributed Proofreaders does) and as research tools for language analysis (an obvious and old side result of the full digitization of written works); or think about the deep but extremely complex relationship between Wikisource and Wikiquote, which would need perfect cross-linking between authors, specific sentences, and their location inside specific works and books. This is why I feel that good, local, well-structured databases that are easy to implement, update, and manage should be designed, and that the Wikidata experience could be used and exported locally. I hope that the database approach to data will be exported and used locally too, using a good, exportable schema, so that granular, local data can be shared. In brief, I feel that Wikidata could be both a data container and a centralized container of shared ideas, methods, and tools. --Alex brollo (talk) 05:22, 12 April 2013 (UTC)

Course of action

After the meeting with Wikidata (see above) we know that it will take a few months until it is deployed both on Commons and Wikisource. There are some tasks that can be done anyway:

  • Complete book data from Wikipedia. At least first from the English Wikipedia, where the Infobox book matches quite well the proposed properties.
  • In Commons:
    • Mass migration to the Template:Book, though this would have to be discussed
    • Complete book information with OCLC/OL/LCC IDs
    • Use standard templates for books imported from Google Books and Internet Archive (needed for eventual synchronization with these projects).

Aubrey suggested creating a mailing list to coordinate all these efforts and, after that, organizing an online meeting if needed. The Wikisource vision work pages are almost ready for discussion, which means that when we start organizing the conversations about the WS vision, we could bring these tasks up with the different Wikisources.--Micru (talk) 18:02, 12 April 2013 (UTC)

Summary

Sorry, but the whole discussion above is about what we can do, what we want, ... but offers no solution for the present situation. I propose a solution for now, which can be updated when new tools or features become available. 1) To have an item, a work has to have a Wikipedia article or be used as a source in another item (in the Source section). 2) A manifestation of a work cannot have an item on WD: properties referring to a manifestation have to be added in the Source section of the item using the work as a reference. 3) WD won't integrate all data from Commons and Wikisource. Did I catch most of the conclusions? Snipre (talk) 09:09, 13 April 2013 (UTC)

Just some clarifications:
  1. So far yes, 1 wikipedia article = 1 wikidata item, that is the rule until other sister projects are added in a few months.
  2. Manifestations won't have an item until Wikidata is deployed either on Commons or on Wikisource (hopefully both). It is feasible for manifestations to have an item once that happens.
  3. That is highly advisable. There is information like file resolution, EXIF, etc. that is better kept in Commons and not in WD. That doesn't mean that the data won't be accessible. File data in Commons can be stored in a structured format (as Wikidata does) and linked. The data would be stored in Commons, but still be queryable.
--Micru (talk) 17:22, 13 April 2013 (UTC)
From what I see, the general concept for sources is the following: data about manifestations of a work will be stored outside of Wikidata; then, using a tool, these data will be added to the Source section of a statement when someone wants to refer to a specific manifestation. Snipre (talk) 08:39, 16 April 2013 (UTC)
There is no decision yet, so I opened a RFC to discuss them.--Micru (talk) 01:46, 17 April 2013 (UTC)

Collecting book data by bot

I just got a bot flag on Commons, and I'm writing a script to collect any useful data from it.source and Commons for the djvu/pdf description pages of books proofread on it.source. The idea is to rebuild the description pages following a standard model. The collected and parsed data will include every parameter in the Commons Information/Book templates and the Wikisource MediaWiki:Proofreadpage_index_template, categories, and PD templates, but I also have to add the parameters of the it.source Autore template. The whole set of data will be wrapped in a Python class Book. At present the data are very messy; I hope to get some useful results. I'll report any result (good or bad...). --Alex brollo (talk) 22:28, 14 April 2013 (UTC)

That is an excellent idea to start cleaning up the data. I think Jarekt also did some work in that regard for the migration to the Commons template. But I fear that the OCLC codes will have to be added manually if there is no way to do it automatically. About the data storage in Wikisource, there is this GSoC proposal to store the data as JSON. Anyway, maybe we should have a talk about all this when you have time?--Micru (talk) 23:47, 14 April 2013 (UTC)
Do you know de:Vorlage:BibISBN? These pages should imho be transferred to Wikidata, just as pictures have been transferred to Commons. --Kolja21 (talk) 00:37, 15 April 2013 (UTC)
Hi Kolja21, yes, I'm aware of that template and also of the doi templates. I agree with you that Wikidata should hold that information, and I left my proposal on the Sources discussion about how to do it using the Open Annotation Model. What do you think about using an "R" letter to define that kind of item versus using normal items with an "instance of: Reference" identifier? Remember that those items are not normal items, in the sense that they represent neither a Wikipedia article nor the source itself.--Micru (talk) 02:02, 15 April 2013 (UTC)
I'm in contact with Jarekt; I discuss anything about the Creator and Book templates with him. About OCLC codes: I don't know anything about them, nor is my present aim to add any missing data; the aim is only to explore and align the data as they are, but the vision is to use Book (and Creator) as a "reference data container" for it.wikisource.
PS To parse template code fully, without limitations from any level of nested templates, I use a simple Python routine parseTemplate() that transforms any possible template into a Python object (a dictionary of parameters and a list of parameter names in their original order). A second routine, rewriteTemplate(), rebuilds the template code in a "beautified" style. If any of you is interested, the Python scripts are in subpages of commons:User:Alex brolloBot/Python scripts. --Alex brollo (talk) 04:59, 15 April 2013 (UTC)
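For readers who want a feel for what such a routine does, here is a simplified sketch. It is not the parseTemplate() from the bot's script pages; the names parse_template and split_top_level, and the sample template call, are invented for this illustration. It returns the template name, a dictionary of parameters, and the parameter names in their original order, and it leaves nested templates and links untouched inside the values.

```python
def split_top_level(text, sep="|"):
    """Split on sep, ignoring separators nested inside {{...}} or [[...]]."""
    parts, current, depth, i = [], [], 0, 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]"):
            depth -= 1
            current.append(pair)
            i += 2
        elif text[i] == sep and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts

def parse_template(code):
    """Return (name, params, order) for a single template call."""
    code = code.strip()
    if not (code.startswith("{{") and code.endswith("}}")):
        raise ValueError("not a template call")
    fields = split_top_level(code[2:-2])
    name = fields[0].strip()
    params, order, positional = {}, [], 0
    for field in fields[1:]:
        key, eq, value = field.partition("=")
        if eq and "{{" not in key and "[[" not in key:
            key, value = key.strip(), value.strip()
        else:
            # Unnamed parameter: number it as MediaWiki does (1, 2, ...).
            positional += 1
            key, value = str(positional), field.strip()
        params[key] = value
        order.append(key)
    return name, params, order

sample = "{{Book|Author={{Creator:Jane Austen}}|Title=Pride and Prejudice|Image=[[File:X.jpg|thumb]]}}"
name, params, order = parse_template(sample)
```

Keeping the order list next to the dictionary is what would let a second routine write the template back out with its parameters in their original sequence.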
@Micru. If you can convince the development team to create a new entity type S or R I think this will simplify the reference management. But this won't be so different from creating new item Q for each manifestation: you just change Q entity by S/R entity and the problem of the number of new entities to create stays the same Snipre (talk) 09:33, 16 April 2013 (UTC)

New RFC about references and sources in Wikidata

Following the discussions in Help:Sources and here, I have started a new RFC about sources and references in Wikidata to summarize the different options and to gather feedback from the community.--Micru (talk) 22:41, 16 April 2013 (UTC)

Another RfC

Micru created another important RfC: https://meta.wikimedia.org/wiki/Requests_for_comment/Interproject_links_interface Please join! --Aubrey (talk) 11:13, 24 April 2013 (UTC)

Listing all authority controls?

I know that there are more authority controls that have been created that relate to books, eg. Property:P409 NLA. Is the list selective or meant to be complete?  — billinghurst sDrewth 12:10, 26 April 2013 (UTC)

Originally it was a list of basic properties needed for books. You can expand it with an authority control section if you want, but since that section should be the same as Wikidata:List_of_properties#Authority_control, I would add only the properties missing in that list.--Micru (talk) 20:08, 29 April 2013 (UTC)

Different ISBN in different languages

The whole ISBN thing confuses me. The book The Hundred-Year-Old Man Who Climbed Out the Window and Disappeared (Q3793400) is originally a Swedish book. Should property 212 be the original Swedish ISBN? In my opinion, what use do people on any other (non-Swedish) Wikipedia have for the ISBN of the Swedish edition? The book has been released in English. Wouldn't the English Wikipedia want the ISBN of the first English edition? //abbedabbtalk 22:57, 28 April 2013 (UTC)

ISBN is going to be used for sources not for the item describing the general data about the work. As for each edition or each translation we have a new ISBN, you can't put all that information in the same item as main properties. Snipre (talk) 07:45, 29 April 2013 (UTC)

[Experimental] Possible solution for edition data

Here is an experiment with edition data using qualifiers. Please do not use it yet, as I would first like to confirm whether it could be a valid solution.--Micru (talk) 14:55, 5 May 2013 (UTC)

Interesting, but for me that solution is only a home improvement: people will need to use a Q item and an edition code in order to source their statements, instead of a unique identifier. I am not against your solution, but it is not appropriate for the present structure of Wikidata, which mixes information for different manifestations and for their work in a single entity.
But we have at least 3 possibilities now. Good work. Snipre (talk) 00:43, 6 May 2013 (UTC)
Well, there could be a default edition (maybe the first one, or depending on the language WP where it is being cited). I don't know if that would be possible, but it might be an option to explore.--Micru (talk) 00:56, 6 May 2013 (UTC)
If q-items are going to be used for sources, I think it would be better with q-items both for works and manifestations. The usual way of citing works is to the edition. — Finn Årup Nielsen (fnielsen) (talk) 12:16, 6 May 2013 (UTC)

I received the confirmation that it is up to us to decide how we want to do it. So we can have edition data in the same item, as different items, or a mixed solution with all edition data in the same item when there are not many editions, and editions as single items wherever it makes sense. My proposal is:

  • Work data and first edition information placed on the "ground" level of the item (using book properties normally)
  • Up to 10-15 editions stored using edition qualifiers
  • For items with a larger number of editions, split editions into separate items (property "edition of" will be needed), trying to keep editions in the original language on the parent item, and language editions grouped together.
  • Special cases where the edition is extremely relevant for citation purposes might have separate items.

What do you think?--Micru (talk) 12:53, 10 May 2013 (UTC)
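The proposed mixed scheme can be read as a small decision rule, sketched below. The cutoff of 15 is an assumption (the upper end of the "10-15" range mentioned above), the function name and the output layout are invented for the illustration, and the rule about keeping original-language editions on the parent item is left out.

```python
# Assumption: take the upper end of the proposed "10-15" range as the cutoff.
MAX_QUALIFIER_EDITIONS = 15

def storage_plan(editions, special=frozenset()):
    """Decide where each edition's data would live under the mixed proposal.

    editions: edition labels, first edition first.
    special:  editions judged extremely relevant for citation purposes.
    """
    editions = list(editions)
    plan = {"ground level": [], "qualifiers": [], "separate items": []}
    if not editions:
        return plan
    first, rest = editions[0], editions[1:]
    plan["ground level"].append(first)
    too_many = len(rest) > MAX_QUALIFIER_EDITIONS
    for edition in rest:
        if too_many or edition in special:
            plan["separate items"].append(edition)
        else:
            plan["qualifiers"].append(edition)
    return plan
```

For three editions with none marked as special, the first goes to the ground level and the other two become qualifier groups; once the number of later editions exceeds the cutoff, all of them move to separate items.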

  •  Support --Aubrey (talk) 13:24, 10 May 2013 (UTC)
    Too complex, especially for data extraction in Wikipedia: we have to choose one solution, either your all-in-one-item solution or the multiple-items solution. And the choice has to be made by the community, so I would prefer to rewrite the request with the three solutions we have and ask the community to choose. No mix of two or three solutions can be allowed, because 1) it will be too difficult for contributors to choose the correct one (most contributors will add data only a few times, and they won't spend an hour reading and understanding the rules on sourcing a statement correctly: they want a simple solution they can apply quickly), and 2) multiple formats are a problem for data extraction and infobox building: the Lua code or the inclusion syntax would have to test all format combinations in order to fill in the templates correctly. Think: with your proposal, each piece of data can be stored in 3 different formats, meaning that you have to detect from Wikipedia which format is used in Wikidata for every reference, and if somebody mixes the formats (like putting the author property in one section of your multiple-editions format) this will be a mess.
    So we need one format which does not depend on the number of editions, and the simpler it is the better. I just want to remind you that we have other types of sources (newspaper or scientific articles, reports, TV or radio broadcasts, web sites...), so simplicity is the most important criterion. Snipre (talk) 07:10, 11 May 2013 (UTC)
    @Micru I just thought of a very special situation for books: a work like The Lord of the Rings can have several first editions in one language, because several publishers published the same text. So your solution is good for present-day works, but for old works which have been published several times by different publishers in each language this is more difficult. Snipre (talk) 07:32, 11 May 2013 (UTC)
    @Snipre: the option with qualifiers does allow multiple first editions in different languages, but I notice that it is hard to tag one as the "main" one. Another way to keep consistent edition data could be to use a specific user interface, would you like to propose something in that regard? Feel free to start a new rfc or expand the one about sources if you think it is needed.--Micru (talk) 20:59, 14 May 2013 (UTC)

I have expanded the RFC about references/sources to decide how to store edition data.--Micru (talk) 13:24, 16 May 2013 (UTC)

Need of comment

As Micru says, we could wait for this property. --Aubrey (talk) 12:58, 24 June 2013 (UTC)

Quote property proposed for deletion

See Wikidata:Properties_for_deletion#Property:P387. Snipre (talk) 01:30, 7 June 2013 (UTC)

Books without authors

Magnus is doing a lot of useful things:

  • here is the list of all books without authors

http://tools.wmflabs.org/wikidata-todo/?action=noauthor

  • and here is a useful tool to be pasted into your common.js
importScript( 'User:Magnus_Manske/missing_props.js' )

With both, it's fairly easy to add missing properties. Please mind that the books on Wikidata are often at the "work" FRBR level, which means that not all the properties relating to editions apply. --Aubrey (talk) 10:46, 12 June 2013 (UTC)

New batch of books/artwork-related properties for discussion

Your comments are always appreciated.--Micru (talk) 14:52, 26 June 2013 (UTC)

Proposal: property "included into" for work items

I would like to propose this new property for work items: "Included into", ie when a work is also part of some broader work. Like, a poem which the author included into a collection of poems. Both the poem and the collection are works, I think there are no doubts about that. For example, s:en:O Captain! My Captain! is "included into" s:en:Leaves of Grass. Same applies to novels which are part of a cycle. Candalua (talk) 13:10, 11 July 2013 (UTC)

I think the part of (P361) property is already really fine for that. TomT0m (talk) 16:01, 11 July 2013 (UTC)

Hum, ok. Looking at P361, it is not very clear that it can apply to creative work also. Maybe this should be added. And it should be added also in the work items properties. Candalua (talk) 19:25, 11 July 2013 (UTC)

It's a generic property, and it's meant to represent composition of any nature, so it's also meant to represent the composition of works to build another work. TomT0m (talk) 23:01, 11 July 2013 (UTC)

The difference between Book (literary work) and Book (individual printed object)

I have added a section for properties related to items about a single copy of a book. There are a few of these in Wikipedia - I have included some links.

English Wikipedia has an en:Book article which mostly deals with the physical objects and an en:Literature article that deals with the creative works. I think we should probably try to split these concepts on Wikidata. This would mean that a lot of pages will have to be changed from 'instance of' 'book' to 'instance of' 'literature'.

What do you think? Filceolaire (talk) 20:29, 18 August 2013 (UTC)

True, we didn't have properties for single copies, thanks for updating that! It will be even more interesting for Commons, since there you can specify which volume from a library was scanned, or which exemplar from a numbered edition it is.
There are two main problems: the first one is that the templates in Wikipedia combine different levels of metadata into one single template, which wouldn't be problematic if we only catered for one edition in one language, but we want to support all editions in all languages, so it will require to create a more intelligent infobox once properties from other items can be displayed in the article. The other problem is that it should be a system as simple as possible, and one that conveys the most without having to learn how it works. "Instance of book / edition" (and maybe a third level "exemplar"?) is quite simple, and even if it is not semantically accurate, I think it can be easily understood. If we wanted to use more precise words, we would lose that simplicity, so I am not sure that there are that many "single exemplars" in Wikipedia to compensate the effort.
Btw, I have migrated the first table to the multilingual format table, please feel free to migrate all other tables.--Micru (talk) 20:46, 18 August 2013 (UTC)
I think the idea that a novel is a novel which can exist in several editions and several physical copies is quite easy to explain and understand. TomT0m (talk) 21:04, 18 August 2013 (UTC)
@Filceolaire: The Books task force (like Open Library) uses the expressions "work" (literary work) and "edition" (individual printed object). --Kolja21 (talk) 01:10, 2 September 2013 (UTC)

IA ARK

I left a question at Property talk:P724. --Nemo 09:03, 23 August 2013 (UTC)

Ping? --Nemo 22:17, 10 September 2013 (UTC)

Is it correct that for a book and an edition, this value should be book and edition, respectively? Now examples in the tables say differently. --DixonD (talk) 18:34, 9 October 2013 (UTC)

What do you mean? The difference between work (unpublished; multiple editions etc.) and edition? --Kolja21 (talk) 12:25, 11 October 2013 (UTC)
Check the example for instance of in the table at Wikidata:Books task force#Work_item_properties. Now it says that <The Forever War> instance of <Category:Hugo Award for Best Novel winning works>. I'm just saying that it should be instance of <book> for all books. --DixonD (talk) 14:24, 11 October 2013 (UTC)

How many ISBNs we want to add?

See also: ISBN for different languages and formats

Regarding ISBN-10 (P957) and ISBN-13 (P212): how many and what kind of ISBNs do we want to add? Popular works have dozens of editions with (of course) dozens of ISBNs. Imho using qualifiers like "paperback", "hard cover", "2nd edition", "French translation", "original edition" etc. would be helpful. Should we write a list of ISBN qualifiers? BTW: French WP uses the parameters "isbn" (= French edition) and "isbn_orig" (= original edition). --Kolja21 (talk) 19:50, 21 October 2013 (UTC)

In my opinion major editions and translations should have their own item. For the edition type it might be good to have qualifiers for the isbn. Maybe expanding the constraints of distribution format (P437)? As for ISBN-10 (P957), all ISBN-10 can be converted easily into ISBN-13... do we really need both?--Micru (talk) 05:45, 23 October 2013 (UTC)
Strong +1. I don't think that an ISBN-10 property is needed, and we should convert ISBN-10 into ISBN-13 so that we don't have two properties to check when we are looking for an ISBN. Tpt (talk) 06:41, 23 October 2013 (UTC)
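To illustrate the conversion mentioned above, here is a minimal sketch in Python. It assumes the standard rule that an ISBN-10 maps to an ISBN-13 by prefixing "978", dropping the old check digit and recomputing a new one. The function name `isbn10_to_isbn13` is hypothetical and is not taken from any existing bot or Wikidata tool.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to its ISBN-13 form (hypothetical helper)."""
    # Drop hyphens and spaces; the ISBN-10 check character may be "X" (= 10).
    chars = isbn10.replace("-", "").replace(" ", "").upper()
    body, check = chars[:9], chars[9:]
    if (
        len(chars) != 10
        or not (body.isascii() and body.isdigit())
        or check not in "0123456789X"
    ):
        raise ValueError(f"not an ISBN-10: {isbn10!r}")

    # Verify the ISBN-10 checksum: weights 10..1, total divisible by 11.
    values = [int(c) for c in body] + [10 if check == "X" else int(check)]
    if sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 != 0:
        raise ValueError(f"bad ISBN-10 check digit: {isbn10!r}")

    # Prefix "978" and recompute the check digit with weights 1, 3, 1, 3, ...
    core = "978" + body
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


print(isbn10_to_isbn13("0-306-40615-2"))
```

The sketch rejects an ISBN-10 whose own check digit is wrong instead of converting it silently, so a conversion run would not turn a typo into a value that looks valid.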
There are already an edition property and a language property, so there is no need for a qualifier for the original or Xth edition, nor for translations. And look at Wikidata:Sources to understand the system of work/edition items: each edition is described by a specific item. Snipre (talk) 09:18, 23 October 2013 (UTC)
+1 with Tpt: we should merge ISBN-10 (P957) and ISBN-13 (P212). Snipre (talk) 09:19, 23 October 2013 (UTC)
"Each edition is described by a specific item" is fine with me, but WD already has about 10,000 ISBNs. (BTW: I have fought for the distinction between work and edition since the beginning of Wikidata, but I doubt the majority of users are interested in bibliographical details.) Also I don't mind merging ISBN-10 and -13, but again, first we need a cleanup and of course an idea of what the constraint violations would look like for a unified ISBN, since we need a well-functioning error list. --Kolja21 (talk) 09:52, 23 October 2013 (UTC)
As constraints for the use of the ISBN-13 property:

Again, fine with me, but look at Wikidata:Database reports/Constraint violations/P212: until ISBN-10 (P957) existed, users added the ten-digit variant as ISBN-13. We can add a comment like: "ISBN-13 preferred. Please use a converter." Imho this is the correct way. A bot can do the conversion once in a while and empty the ISBN-10 property. (BTW: As constraint no. 4 the checksum should be validated.) --Kolja21 (talk) 14:03, 24 October 2013 (UTC)
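The checksum validation suggested above could look roughly like this. It is only a sketch under the standard ISBN-13 rule (digits weighted 1, 3, 1, 3, ... must sum to a multiple of 10). The function name `is_valid_isbn13` is hypothetical and does not describe how the constraint reports are actually generated.

```python
def is_valid_isbn13(value: str) -> bool:
    """Return True if value is a 13-digit ISBN with a correct check digit."""
    # Hyphens and spaces are only separators, so ignore them.
    digits = value.replace("-", "").replace(" ", "")
    # A ten-digit ISBN entered in the ISBN-13 property fails here.
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weighted sum over all 13 digits, including the check digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


for candidate in ("978-0-306-40615-7", "9780306406158", "0306406152"):
    print(candidate, is_valid_isbn13(candidate))
```

A check like this catches both of the problems mentioned in this thread: ten-digit values stored as ISBN-13, and thirteen-digit values with a mistyped digit.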

Sorry, but I don't think I understand what your problem is: the only thing to do is to work through the list of items which don't respect the constraints. I already tried to correct one item, but the problem is the bot work: I deleted a statement which had been added to the wrong item, but a bot added the statement again. So we have to find a solution with the bot owners. Snipre (talk) 14:39, 24 October 2013 (UTC)
The question is: How can we achieve high-standard bibliographic data? Since not only members of the Books task force are adding works and books (editions), it is not enough to use a property like ISBN-13 without qualifiers and hope that one day everyone will be a librarian. I'm afraid WD is going the same way as Open Library: it started with great enthusiasm, and after the bots imported millions of items (full of errors and duplicates like "our" ISBN-13) the data now rest in peace. In other words: Even if we correct an ISBN-13 that has been added without a qualifier, we still don't know whether this ISBN fits the item. (Example: WP articles are normally about a work. So we have thousands of ISBNs added to a "work", not to a single edition. Clear distinctions like ISBN-10, -13 and qualifiers like edition number (P393): "7th edition", "French translation" etc. are needed since everyone is welcome to edit WD.) --Kolja21 (talk) 15:58, 24 October 2013 (UTC)
1) Merge the properties ISBN-10 (P957) and ISBN-13 (P212), 2) ask a bot to convert ISBNs with 10 digits into ISBNs with 13 digits. Snipre (talk) 19:36, 24 October 2013 (UTC)
This strategy of 1) merge, then 2) hope someone will do the cleanup did not work. The test results are clear: Take a look at Open Library or the P212 violations count: 6393 (> 50%). Nevertheless: I agree with you that ISBN-13 should be the first choice. --Kolja21 (talk) 05:11, 25 October 2013 (UTC)
Why should we use ISBN-13 for books from 1970? -- Lavallen (talk) 08:31, 4 November 2013 (UTC)

Authority file as (single) book number?

"Edition item properties" lists: LCCN, BnF and SELIBR. Imho all three authority file properties should be taken from this list since they are "only for authority control" (description P244). --Kolja21 (talk) 22:59, 3 November 2013 (UTC)

✓ Done Please revert if I was wrong. --Kolja21 (talk) 14:49, 4 November 2013 (UTC)