Wikidata talk:Schemas

This page is based on two RFCs about data modelling on Wikidata

Just to note, this page was created based on two RFCs that dozens of people took part in:

Wikidata:Requests for comment/Wikidata to use data schemas to standardise data structure on a subject
Wikidata:Requests for comment/How to model curricula and link them to educational resources?

Thanks

John Cummings (talk) 17:40, 23 March 2021 (UTC)

Schemas and statements

"What is a schema?" section might want a sentence or two about the relation of schemas and statements, or if it needs more than a couple of sentences, then a link to where that can be found. For a lot of people who have experience with Wikidata but not with schemas, that's probably the best place for them to start getting their head around this. - Jmabel (talk) 20:52, 23 March 2021 (UTC)[reply]

Exporting a Schema

How can I export several Schemas at one time? I tried Special:Export, but it did not work. I want to try generating possible questions based on defined data structures, and schemas are useful for that. I am also interested in creating templates for entering data through structured sentences with gaps for the arguments. I would like to try writing schemas in natural language in German and then bring them into the structure used for EntitySchemas here in Wikidata. --Hogü-456 (talk) 21:56, 23 October 2021 (UTC)

Feedback from the "Entity Schema build-out and integration" task of WMDE's development plan, and OpenRefine

WMDE shared in its development plan that it wanted to assess the existing Entity Schema integration. I wonder if any conclusions have been reached about this topic, and if so, whether they can be read somewhere. This topic is relevant for OpenRefine (see below).

First, here are my two cents about why this functionality (Entity schemas) is not being used much (which is probably nothing new, but worth stating to motivate the rest):

ShEx is tailored to the validation of RDF datasets. While Wikibase entities can be exported to RDF, this translation is pretty much one-way: given the RDF representation of a Wikibase entity, it is not easy (but surely not impossible) to translate it to the native Wikibase data model (formulated with statements, ranks, qualifiers, references). For ShEx validation to be really useful in the context of Wikibase (whatever the concrete use cases where it is meant to be used are, which is still not completely clear to me), one should be able to reliably translate back the results of ShEx validation to the Wikibase data model, and that is difficult for the same reasons.
Even if those inverse transformations could be implemented, their complexity would be likely to make the integration not so user-friendly. To write and use entity schemas, users need:
- Knowledge of the rules for the translation of Wikibase entities into RDF. Arguably the community masters this quite well because it is required to write SPARQL queries (but it is by no means trivial);
- Knowledge of the syntax of ShEx;
- But also: some intuitive understanding of the process which maps RDF back to the Wikibase data model and ShEx warnings to some notion of warnings applicable to the Wikibase data model. Because even if those transformations can be implemented, they are likely to have their own oddities given the complexity of the task, in my opinion.
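To illustrate why the translation is easy in one direction only, here is a minimal Python sketch of how a single statement with one qualifier fans out into several RDF triples. The `p:`, `ps:` and `pq:` prefixes follow the Wikidata RDF export; the function name `statement_to_triples`, the input dictionary layout, the statement identifier and the qualifier value are assumptions made up for illustration.

```python
# Illustrative only: a simplified view of how one Wikibase statement
# becomes several RDF triples. The input layout is hypothetical.

def statement_to_triples(subject, statement):
    """Return (subject, predicate, object) triples for one statement."""
    node = "wds:" + statement["id"]  # intermediate statement node
    prop = statement["property"]
    triples = [
        # entity -> statement node
        ("wd:" + subject, "p:" + prop, node),
        # statement node -> main value
        (node, "ps:" + prop, "wd:" + statement["value"]),
    ]
    # each qualifier hangs off the statement node, not off the entity
    for qual_prop, qual_value in statement.get("qualifiers", []):
        triples.append((node, "pq:" + qual_prop, qual_value))
    return triples


example = {
    "id": "Q42-example-statement",  # made-up statement identifier
    "property": "P31",
    "value": "Q5",
    "qualifiers": [("P580", '"2020-01-01"')],  # made-up qualifier value
}
triples = statement_to_triples("Q42", example)
for triple in triples:
    print(triple)
```

Going the other way means regrouping triples by statement node and working out which of them are main values, qualifiers or references, which is where the difficulty described above comes from.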

Now to why this is currently relevant for OpenRefine. As part of the Wikimedia Commons integration project, we want to offer what we currently call "schema templates". In OpenRefine, what we call a "schema" is a structure which describes how to translate tabular data to Wikibase edits. When using OpenRefine to upload data to a Wikibase, one needs to know how the data is typically represented in the Wikibase instance, to make sure we conform to the existing data modelling conventions. This is something we want to support by offering "schema templates", which are basically schemas with missing values. By using a schema template, the user would only need to drag and drop the columns of their project into the schema template, without needing to know in advance which properties or structures of statements they should be using, as this is already in the schema template.
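As a rough illustration of "a schema with missing values", here is a small Python sketch. It does not use OpenRefine's actual JSON schema format: the dictionary layout, the `fill_template` helper and the column names are all hypothetical, and only the idea (properties fixed in advance, values left empty until a column is dropped in) comes from the description above.

```python
# Hypothetical sketch: NOT OpenRefine's actual schema format.
import copy

TEMPLATE = {
    "subject": None,  # to be filled with a column reference
    "statements": [
        {"property": "P31", "value": None},  # value left empty in the template
        {"property": "P18", "value": None},
    ],
}


def fill_template(template, subject_column, column_for_property):
    """Return a copy of the template with column references filled in."""
    schema = copy.deepcopy(template)  # never mutate the shared template
    schema["subject"] = {"column": subject_column}
    for statement in schema["statements"]:
        column = column_for_property.get(statement["property"])
        if column is not None:
            statement["value"] = {"column": column}
    return schema


filled = fill_template(TEMPLATE, "name", {"P18": "image file"})
# Properties that still need a column or a constant value:
missing = [s["property"] for s in filled["statements"] if s["value"] is None]
print(missing)
```

The point of the sketch is that the user only supplies column names; which properties to use is already decided by the template.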

It seems to me that the Wikibase "entity schemas" were introduced to let the community codify data representation conventions, so it is tempting to suggest that OpenRefine should just take those to be its "schema templates". We are not planning to do so for two reasons:

The problem mentioned above of being able to lift back from RDF to the Wikibase data model. So far, OpenRefine lets people upload data with no knowledge of the RDF translation of Wikibase entities, because it lets them operate directly on the native Wikibase data model. Using Wikibase's "entity schemas" as our new "schema templates" would require this inverse transformation mentioned above.
ShEx is designed for data validation, and this is not exactly our use case: we just want to prefill a data input UI. So although validating data and prefilling a UI both require representing data modelling conventions, it is likely that some design choices of ShEx would not work out well for our use cases. We would likely need something closer to R2RML for instance (which we do not use either for the reason above).

So far, we are planning to introduce our own format for those "schema templates", based on our current JSON format for schemas. Feedback on this plan is very welcome.

Of course one can be sad that we are not following standards and making up our own thing, but I feel like we have no other choice since Wikibase does not use RDF as its primary data model (which should not be understood as blame: the fact that Wikibase chose its own format probably plays an important role in its success). Although we plan to use our own format too, I can imagine it being possible to have some sort of partial translation from our schema templates to ShEx, and maybe also a similar one in the other direction.

Pinging people who might be interested in this: Lydia Pintscher (WMDE), YULdigitalpreservation, Andrawaag, EricP, Spinster, Loz.ross − Pintoch (talk) 10:38, 23 July 2022 (UTC)

@Pintoch, I added my detailed reply here. I feel that ShEx could still be used for proposing "schema templates". John Samuel (talk) 10:14, 25 July 2022 (UTC)

@Pintoch: Thanks for bringing this up. We should definitely talk more about this. Maybe a call would help?

In general here is my thinking: Wikidata needs better ways to discuss and encode modeling decisions. This is needed to both validate/check the data we have as well as enable tool builders to get a machine-readable interpretation of those modeling decisions so they can build their tools to make correct edits. EntitySchemas are supposed to be the way to achieve this in Wikidata. They do not yet achieve this. A big part of it is that they are still lacking important basic integration points with existing tools and processes. That is the part I want to tackle next. This for me currently includes the following:

Make it possible to link to EntitySchemas in statements so editors can define for each class which EntitySchema governs it. We need to introduce a new datatype for linking to EntitySchemas (phab:T214884)
Make it possible to query for Schemas in the Query Service so editors can create dynamic lists of EntitySchemas (phab:T225701)
Make labels of EntitySchemas more useful to ensure it is easier to understand what is behind an EntitySchema ID
- Ensure that EntitySchemas show up by their label instead of ID in listings like Recent Changes and watchlist (phab:T214885)
- Introduce language fallbacks for EntitySchema labels (phab:T228423)
- Make it possible to show labels for EntitySchemas referenced in wikitext discussions and help pages (phab:T214886)
Make it possible to merge two duplicate EntitySchemas (phab:T224537)

Once we have that I expect EntitySchemas to become more useful in day-to-day quality workflows.

Now to OpenRefine using or not using EntitySchemas: I think we should try to dig more into the issues preventing you from doing it. It'd be a shame if people were required to basically encode the same modeling decisions several times. It leads to divergence, as we are already seeing, for example, with lexicographical data, where similar modeling decisions are encoded in the templates for the Lexeme Forms tool and in EntitySchemas. If there is something preventing tool builders from relying on EntitySchemas for providing forms and similar things for correct data entry, then I think we need to address that. Lydia Pintscher (WMDE) (talk) 14:26, 25 July 2022 (UTC)

@Lydia Pintscher (WMDE): Sounds great. Yes, happy to talk about it any time! Let's coordinate by email to find a slot. − Pintoch (talk) 15:18, 25 July 2022 (UTC)

@Pintoch I would like to share this publication Creating Knowledge Graphs Subsets using Shape Expressions (Q109812520) as a resource for this discussion. In this paper there is a discussion of how to describe Wikibase graphs and related workflows involving ShEx. While OpenRefine users would be looking to contribute new data rather than getting a subset of Wikibase data out for reuse, perhaps the use of WShEx to describe Wikibase graphs could be useful. YULdigitalpreservation (talk) 17:25, 25 July 2022 (UTC)[reply]

@YULdigitalpreservation: thanks for the link! It looks relevant indeed: figure 8 (page 25) is a good description of my first reason not to use ShEx. A notion of ShEx tailored to the Wikibase datamodel (such as WShEx) would already be more workable (even though there still remains the problem that it is designed for data validation and not data entry). But then, that does not exist as a specification yet, has no tooling, and the existing Entity schema integration is based on ShEx, not WShEx, so it is not really something we can tap into right now. − Pintoch (talk) 07:15, 26 July 2022 (UTC)[reply]

@Pintoch To be sure I'm understanding the OpenRefine "schema templates", are these templates used to be sure that people are using the same data structure, or are they rules that recognize a specific CSV and emit a conformant data structure?

If we consider Entity Schema E366, we can use any of the validators to convert a schema written in ShExC to ShExJ, which would give us:

{
  "type": "Schema",
  "start": "http://shex.io/webapps/shex.js/doc/coin_hoard",
  "shapes": [
    {
      "type": "Shape",
      "id": "http://shex.io/webapps/shex.js/doc/coin_hoard",
      "expression": {
        "type": "EachOf",
        "expressions": [
          {
            "type": "TripleConstraint",
            "predicate": "http://www.wikidata.org/prop/P31",
            "valueExpr": {
              "type": "Shape",
              "expression": {
                "type": "TripleConstraint",
                "predicate": "http://www.wikidata.org/prop/statement/P31",
                "valueExpr": {
                  "type": "NodeConstraint",
                  "values": [
                    "http://www.wikidata.org/entity/Q15484785"
                  ]
                }
              }
            }
          },
          {
            "type": "TripleConstraint",
            "predicate": "http://www.wikidata.org/prop/P18",
            "valueExpr": {
              "type": "Shape",
              "expression": {
                "type": "TripleConstraint",
                "predicate": "http://www.wikidata.org/prop/statement/P18",
                "valueExpr": {
                  "type": "NodeConstraint",
                  "values": [
                    {
                      "type": "IriStem",
                      "stem": "http://commons.wikimedia.org/wiki/Special:FilePath"
                    }
                  ]
                }
              }
            },
            "min": 0,
            "max": 1
          },
          {
            "type": "TripleConstraint",
            "predicate": "http://www.wikidata.org/prop/P189",
            "valueExpr": {
              "type": "Shape",
              "expression": {
                "type": "TripleConstraint",
                "predicate": "http://www.wikidata.org/prop/statement/P189"
              }
            },
            "min": 0,
            "max": 1
          },

(truncated example)

This is truncated to fit here, but it shows how, e.g., property picklists could be composed from the schema. An example with ps: properties would show how to pick qualifiers.

Perhaps the ShExJ would be enough to drive the GUI to help the user supply the values. More info about ShExJ: http://shexspec.github.io/primer/ShExJ
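A minimal Python sketch of the picklist idea, under stated assumptions: the embedded ShExJ is a reduced version of the truncated example above (nested value expressions dropped), `property_picklist` is a hypothetical helper, and only top-level triple constraints are read. A missing `min` is treated as 1, which is the ShEx default cardinality.

```python
import json

# Reduced version of the truncated ShExJ example above.
SHEXJ = json.loads("""
{
  "type": "Schema",
  "shapes": [
    {
      "type": "Shape",
      "id": "http://shex.io/webapps/shex.js/doc/coin_hoard",
      "expression": {
        "type": "EachOf",
        "expressions": [
          {"type": "TripleConstraint",
           "predicate": "http://www.wikidata.org/prop/P31"},
          {"type": "TripleConstraint",
           "predicate": "http://www.wikidata.org/prop/P18",
           "min": 0, "max": 1},
          {"type": "TripleConstraint",
           "predicate": "http://www.wikidata.org/prop/P189",
           "min": 0, "max": 1}
        ]
      }
    }
  ]
}
""")

PROP_PREFIX = "http://www.wikidata.org/prop/"


def property_picklist(schema):
    """List (property id, required?) pairs for top-level triple constraints."""
    picklist = []
    for shape in schema.get("shapes", []):
        expression = shape.get("expression", {})
        if expression.get("type") == "EachOf":
            constraints = expression.get("expressions", [])
        else:
            constraints = [expression]
        for constraint in constraints:
            predicate = constraint.get("predicate", "")
            is_triple = constraint.get("type") == "TripleConstraint"
            if is_triple and predicate.startswith(PROP_PREFIX):
                pid = predicate[len(PROP_PREFIX):]
                required = constraint.get("min", 1) >= 1
                picklist.append((pid, required))
    return picklist


picklist = property_picklist(SHEXJ)
print(picklist)
```

This only covers the simple flat case; nested shapes, alternatives and references to other shapes are ignored here.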

YULdigitalpreservation (talk) 17:01, 27 July 2022 (UTC)

@YULdigitalpreservation: our proposed notion of schema template is comparable to the templates of the Lexeme Forms tool. In that tool, if you pick a particular type of grammatical structure in a given language, you are presented with a form which helps you add the relevant statements on your lexeme. Similarly, we want OpenRefine users uploading media to Commons to be able to pick a particular type of media (artwork, book…) and then prefill an OpenRefine schema with the relevant properties for the type of media being uploaded (leaving most statement values empty). The same functionality would be available to other Wikibase users: for instance, on Wikidata, one could offer schema templates for sportspeople, films, and the like.

So our notion of schema template would not be used for validation at all (for instance, you are free to delete some of the statements from the template you are using if they do not make sense in your case). This is purely a matter of prefilling a user interface with fields that are likely relevant for the user.

One good experiment to understand why ShEx is not a natural fit to represent this information is to go through the list of possible node types in ShExJ (such as "TripleConstraint" and "EachOf"), and ask ourselves how this node should be interpreted, when reading a ShExJ object to turn it into an OpenRefine schema with some missing values. For a node like "TripleConstraint", this sounds relatively easy (although you need to keep track of whether this triple is a main statement triple, a qualifier, a reference or even other things, and that might not be so easy). A node "EachOf" is a priori also simple, at least when it appears as a root node as in your example above, but what if it appears, say, somewhere nested? For instance it could be requiring statements on a particular value of a root statement, or in fact any other part of the datamodel. What about "OneOf"? Just treat it like "EachOf" and let users figure out that not all of the corresponding statements are required? Or bake into the interface some sort of visual clue that only some are required, but then that is not a schema with missing values anymore, it is something more complicated that we need to handle with a different interface, probably? What about cardinality constraints? And so on.
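The thought experiment can be sketched in Python as a walk over ShExJ-like nodes that collects candidate form fields and records every node for which the interpretation is unclear. The function name `walk`, the wording of the recorded questions and the tiny example input (with abbreviated predicates) are assumptions for illustration; this is not how OpenRefine or Cradle process schemas.

```python
def walk(node, fields, questions, depth=0):
    """Collect candidate form fields; note nodes with no obvious meaning."""
    if not isinstance(node, dict):
        # ShExJ also allows references to other expressions by name
        questions.append("reference to another expression: %r" % (node,))
        return
    kind = node.get("type")
    if kind == "TripleConstraint":
        fields.append(node["predicate"])
        nested = node.get("valueExpr")
        if isinstance(nested, dict) and "expression" in nested:
            walk(nested["expression"], fields, questions, depth + 1)
    elif kind == "EachOf":
        if depth > 0:
            questions.append("nested EachOf: which part of the data model?")
        for child in node["expressions"]:
            walk(child, fields, questions, depth + 1)
    elif kind == "OneOf":
        questions.append("OneOf: show all alternatives, or only one?")
        for child in node["expressions"]:
            walk(child, fields, questions, depth + 1)
    else:
        questions.append("unhandled node type: %r" % (kind,))


example = {
    "type": "EachOf",
    "expressions": [
        {"type": "TripleConstraint", "predicate": "p:P31"},
        {"type": "OneOf", "expressions": [
            {"type": "TripleConstraint", "predicate": "p:P18"},
            {"type": "TripleConstraint", "predicate": "p:P189"},
        ]},
    ],
}
fields, questions = [], []
walk(example, fields, questions)
print(fields)
print(questions)
```

Even on this small input, the list of fields alone loses the information that only one of the two alternatives is expected, which is the kind of mismatch described above.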

Note that I am not saying that such a translation cannot be done: I can totally imagine that one could write something that works-ish for 80% of the schemas in use on Wikidata for instance (just like Cradle is able to make some sense of entity schemas to build forms for data inputs - with a red "EXPERIMENTAL" warning). And having such a translation may be useful to some. But that means ShEx cannot (in my opinion) be the primary format we use to store such schema templates, because we need to be able to reliably read and write to this format. − Pintoch (talk) 20:46, 27 July 2022 (UTC)

Here is a quick update about this after a call with Lydia Pintscher (WMDE), Lucas Werkmeister (WMDE) and Loz.ross:

We identified three data input tools which need to access data modelling conventions on Wikidata, and do not use ShEx as their primary format to represent those conventions:
- Cradle (with experimental support of ShEx)
- Wikidata Lexeme Forms
- OpenRefine (planned)
Among the reasons for this are those mentioned above, as well as:
- The richness of ShEx, which makes it very hard to parse and support fully
- The need to represent specific information needed by those tools (for instance, the Wikidata Lexeme Forms tool displays example phrases to put the forms into context.)
To provide some sort of integration, it looks like developers would need to work independently on parsing ShEx and translating it to their internal format. To avoid duplicating efforts, we considered introducing a common intermediate format, which would serve as a "data input tool"-friendly translation of ShEx. There could then be a single tool doing the conversion from ShEx to this common format. There are no concrete plans for working on defining such a format.

There was some interest in a follow-up call to discuss that more.

Perhaps it is also worth mentioning my role in this: I am currently trying to step away from Wikibase integration in OpenRefine, to be able to focus on other areas of OpenRefine. The work I am currently doing on the integration with Wikimedia Commons was planned for another developer, but work has been slower than expected so I am just helping out in the short term to finish off the project. In this context I do not have time to work on any form of ShEx integration. Therefore I do not expect to be involved in follow-up discussions: if any ShEx integration in OpenRefine is planned, someone else will have to work on it. OpenRefine is actively looking for developers interested in getting involved in maintaining and developing our Wikibase integration. Obviously, this is a great opportunity to have a say in which direction this integration goes (for instance, by developing some ShEx integration). Therefore let me reach out to EricP, YULdigitalpreservation and Jsamwrites: if you are interested in working on this yourself or know people who would, now is a great time to get involved. − Pintoch (talk) 07:11, 4 August 2022 (UTC)

@Pintoch, I still think that our "schema template" could actually just be called "mapping template" to avoid confusion all around (or some other term like 'shaping template'). Your phrase "without needing to know in advance which properties or structures of statements they should be using" seems to assume that there is an inherent taxonomy that is always known? That is, Class/Type information, where perhaps Wikibase out of the box could support a properties for this type (P1963) that is always available in every instance, maybe with a simple id such as 'PT'. But since Wikibase is not a typed knowledge graph like Freebase was, but instead a directed labeled graph with a user-driven taxonomy (a data graph), baking in a P1963 sort of pushes it away from a general labeled graph with taxonomy. An intermediate format instead of ShEx might indeed be something like R2RML, but I think YARRRML might interest you. With YARRRML, you are likely interested in 9.7 or all of chapter 9 and others, in particular where essentially 'predicateObjects' could be thought of as 'PT' or 'P1963'. If I am wrong in my understanding of the need to drive a UI for applying predicate mappings to classes/types in OpenRefine for Wikibase integration, then please correct me in those areas. --Thadguidry (talk) 16:30, 10 August 2022 (UTC)

New EntitySchema data type is ready for testing on Test Wikidata

Hello,

As many of you may already know, we have been working on introducing a new Wikidata data type that will make it easier to find EntitySchemas and use them to connect to other Wikibase Entities. This will allow editors to refer to existing EntitySchemas in statements to indicate what class of Items, Lexemes etc. are governed by an EntitySchema. This new EntitySchema datatype is now live on Test Wikidata for testing and your feedback.

Background

EntitySchemas were first introduced in 2019 as a way to model the structure of Wikidata Items and validate data against those specifications. There are still a number of shortcomings with EntitySchemas, which means they are not as useful or as widely used as they should be. We are now addressing a number of those issues, starting with this new data type.

In 2019, we built the first version of the EntitySchema datatype, but it was eventually rolled back based on your feedback. We have made a lot of progress since then and have taken your feedback into account when developing this new iteration.

The main goal of this development is to help editors model data more consistently by making EntitySchemas more visible and integrated into day-to-day editing work. The new EntitySchema data type offers the following features:

A new data type that makes it possible to create statements that take an EntitySchema ID as a value
A canonical URI scheme for EntitySchemas has been developed that matches prefixes of other Semantic Entities (Items, Lexemes, and Properties) to identify them as concepts and access them when they are referred to in statements in various formats such as RDF
"What Links Here" now enables you to see what Items, Lexemes, and Properties link to an EntitySchema in a statement
A “Concept URI” link has also been added to the EntitySchema’s sidebar, mirroring the same format as Items

What will come next for EntitySchemas

Displaying EntitySchemas linked in statements by their labels instead of their IDs, making them more readable and easier to understand.

Support for language fallback to make EntitySchemas legible across languages. An updated termbox (the table with labels, descriptions and aliases) to provide a more consistent experience between Items, Properties and EntitySchemas in the future.

Testing and Feedback

Today, we’d love for you to explore EntitySchemas on Test Wikidata and provide feedback.

We hope that the new EntitySchema data type will increase centralised discussions around the modeling of specific classes in Wikidata. This new visibility will allow for more integration of EntitySchemas into the ecosystem, leading to improved data quality through more consistent modeling. Ultimately, this should make reusing our data easier, especially for small to medium-sized reusers.

Here is an example we prepared earlier: Q497.

If you encounter any issues, have questions or concerns, or want to provide feedback, please don’t hesitate to reach out to us on Wikidata talk:Schemas or leave a comment on this ticket phab:T332724.

Cheers, Arian Bozorg (WMDE) (talk) 15:49, 13 June 2023 (UTC)

Looks good! I would like to be pinged when this is active, as I'd like to query for schemas with SPARQL and get the relevant properties from them. Carlinmack (talk) 16:16, 15 June 2023 (UTC)

Absolutely! We will let you know when it's live here, and on Telegram. Arian Bozorg (WMDE) (talk) 08:54, 22 June 2023 (UTC)

Thanks for requesting our feedback on this demo!

As a user, when looking at https://test.wikidata.org/wiki/EntitySchema:E3300, I have indeed the impression that entity schemas are a new sort of entity (comparable to items, properties or lexemes) - which is the impression you intend to convey according to your description above. But when I try to edit the statement linking to it on https://test.wikidata.org/wiki/Q497, I notice that there is no auto-completion set up. I need to type the raw entity id (E3300) and I cannot type the corresponding label (human) for this schema to be suggested to me. Is this an intended end state of this feature or do you plan on changing this?
As a tool developer, I observe that the API exposes those values as string datavalues and not entities: https://test.wikidata.org/w/api.php?action=wbgetentities&ids=Q497. This is consistent with the user experience of not having any auto-completion. This also has the benefit that third-party tools (such as OpenRefine) should be able to treat such statements as any string-based custom datatype, and therefore support retrieving and editing them without any update required. But I wonder if this is the desired end state (given the slightly confusing UX). If you deploy this prototype on Wikidata and want later on to turn those entity schemas into real Wikibase entities, then you'll be changing the structure of datavalues of all statements using this new datatype, which is probably a difficult operation to carry out on live Wikidata, and could also cause some friction in third-party tools which banked on the string datavalue type, no?
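For tool developers who want to check this themselves, here is a small Python sketch that reports which datavalue types appear in a `wbgetentities`-style response. The nesting (`entities`, `claims`, `mainsnak`, `datavalue`) follows the usual Wikibase JSON layout; the sample itself is hand-written rather than fetched, and the property id `P1` and the datatype name shown are assumptions for illustration.

```python
# Hand-written sample in the shape of a wbgetentities response.
SAMPLE = {
    "entities": {
        "Q497": {
            "claims": {
                "P1": [  # made-up property id
                    {"mainsnak": {
                        "snaktype": "value",
                        "property": "P1",
                        "datatype": "entity-schema",  # assumed datatype name
                        "datavalue": {"value": "E3300", "type": "string"},
                    }}
                ]
            }
        }
    }
}


def datavalue_types(response):
    """Map each (entity, property) pair to the set of datavalue types seen."""
    seen = {}
    for entity_id, entity in response["entities"].items():
        for prop, statements in entity.get("claims", {}).items():
            for statement in statements:
                datavalue = statement["mainsnak"].get("datavalue")
                # somevalue/novalue snaks carry no datavalue at all
                if datavalue is not None:
                    key = (entity_id, prop)
                    seen.setdefault(key, set()).add(datavalue["type"])
    return seen


types = datavalue_types(SAMPLE)
print(types)
```

A tool that dispatches on the datavalue type would treat the value above like any other string, which is why the prototype works without updates on the tool side.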

But I assume you already thought about all that in your planning. − Pintoch (talk) 08:34, 18 June 2023 (UTC)

Thanks so much for the feedback Pintoch! This is really helpful for us.

Regarding your first point, we will be making the new data type mimic what we have for Items: allowing you to enter labels and aliases, and adding auto-completion. You can find the ticket for that here: phab:T338615

As for the API exposing the EntitySchema values as strings, we would like to adjust this now. Before we do, it would be great to get some feedback from you. We would like to propose that the value type be entityschemaid as opposed to wikibase-entityid. Would this cause any friction on your end or for any other tool developers? Arian Bozorg (WMDE) (talk) 09:12, 22 June 2023 (UTC)

@Arian Bozorg (WMDE): so far, the types of datavalues exposed by Wikibase have been very stable so I would not be surprised if it would break quite some tools and libraries. Introducing a new datavalue type would at least break Wikidata-Toolkit and its reusers (including OpenRefine). For the past few years, when a new type of statement was introduced, a new datatype was created using an existing datavalue type. So I think going down that route would be much more convenient for most people. That being said, I do not know if the stability of datavalue types is a formal policy. Maybe it is worth clarifying the intended stability of those.

I also wonder what the motivation is for introducing an entityschemaid instead of reusing the wikibase-entityid datavalue type. In which sense does it help you? Is it not sufficient to specify a new datatype? − Pintoch (talk) 05:29, 26 June 2023 (UTC)

@Lydia Pintscher (WMDE), Arian Bozorg (WMDE): here are some more thoughts. My previous comments have been about two different things, which might have made my position unclear.

As far as Wikidata-Toolkit / OpenRefine are concerned, I don't want to make it sound like there are big blockers. It's no big deal, we can work with whatever you go for. Some solutions will require updates on our side, and that's obviously fine. As I wrote earlier, the current situation (new datatype relying on a string datavalue) will work out of the box. Anyway that's only affecting people who want to edit statements pointing to EntitySchemas via OpenRefine - which feels like it would be a pretty rare use case. If we ever wanted to add support for editing EntitySchemas via OpenRefine (even more of a niche use case I guess?) then it will be weird if their ids are just strings and not proper Wikibase entity ids like for Items, Properties, Lexemes, Forms, Senses and MediaInfo. But we can probably hack around to make it work.
The other thing I mentioned was my own perception of the consistency with the existing state of Wikibase. Obviously you are much better placed than I am to make a judgment on this, but my own expectation about this would be to have a new datatype (such as wikibase-entityschema) relying on the existing datavalue type wikibase-entityid (since that's how all the other entity types work, I think). I don't have good enough knowledge of Wikibase's internals to understand why this would be difficult and why it would be preferable to use a new datavalue type. Maybe because EntitySchemas aren't proper entities internally? If that's the case, then I expect more problems will show up in other situations (such as the fact that they cannot be retrieved/edited via the wbgetentities/wbeditentity API).

All of those considerations are also independent from the problems voiced earlier (see section above) about using EntitySchemas as "schema templates" in OpenRefine. The way EntitySchemas are linked to in statements has no influence on that. − Pintoch (talk) 11:35, 11 July 2023 (UTC)

+1 to what Pintoch said. I believe both ways (either using the string datatype or using wikibase-entityid) should work with external tools like WikidataToolkit. Using wikibase-entityid seems to me the "least unexpected" approach if we consider that schemas are already a kind of entity. It also has the advantage of allowing easier integration by reusing the tooling already built for entity ids (validation, link creation...). However, it is also more of a breaking change, because tools seem to me to make more assumptions about entity ids than they do about strings (but there are already 5 other kinds of entity id in use at Wikimedia at the moment, so I would guess most codebases are already pretty open to extensions). Also, tools already have to deal with Wikimedia Commons image names, which are very similar to schema ids: they are encoded with the string datatype but should be validated and displayed using links. Tpt (talk) 12:22, 11 July 2023 (UTC)

Thanks @Pintoch and @Tpt, this is very helpful and insightful feedback for us in developing this feature. Based on what you said we will go with entityschemaid.

Regarding our technical decision, this is the first step for us to be able to deliver features and changes according to various use cases in a fast and flexible manner. For that we needed to make some technical changes to how entities are conceptualised, and more specifically how Wikibase and EntitySchemas rely on each other.

This makes things on our end more maintainable and allows us to be more flexible. Arian Bozorg (WMDE) (talk) 12:41, 11 July 2023 (UTC)

I had a poke at it trying to break it, and it seemed to resist this. I was able to add somevalue and novalue, which I imagine is OK assuming "Only existing EntitySchemas can be selected as values" means specific values.

It was picky about requiring E rather than e, when it might have autofixed that (though on mobile it will default to E anyway). It didn't allow a raw number. But it only told me about validation errors when attempting to publish, rather than when I shifted focus - as it would for, say, a property link. GreenReaper (talk) 08:57, 14 July 2023 (UTC)
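The "autofix" idea mentioned above could look roughly like the following Python sketch. This is not what the Wikibase UI does: the helper `normalise_schema_id` is hypothetical, and the ID pattern (an uppercase E followed by a positive integer without leading zeros) is an assumption based on IDs such as E366 and E3300 seen on this page.

```python
import re

# Assumed ID shape: "E" followed by a positive integer.
ID_PATTERN = re.compile(r"E[1-9][0-9]*")


def normalise_schema_id(text):
    """Return a canonical EntitySchema ID, or None if the input is not one."""
    candidate = text.strip().upper()  # lenient: accept "e3300" and padding
    if ID_PATTERN.fullmatch(candidate):
        return candidate
    return None  # e.g. a raw number is still rejected


for raw in ["e3300", " E10 ", "3300", "E0"]:
    print(repr(raw), "->", normalise_schema_id(raw))
```

Running such a check when the input loses focus, rather than only on publish, would match the behaviour described for property links.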

New Schema Validation Tool UI

Hello everyone. As part of my thesis project I've made a new UI mode for the Schema Validator used by Wikidata. I discussed this already during the Data Modelling Days last year, but now the code is ready to use. The new UI represents validation reports as a table rather than a very long string, and replaces most links with hyperlinks with some of the text behind them. I just started hosting it yesterday on https://shex-validator.toolforge.org/packages/shex-webapp/doc/shex-simple-improved.html. I'm also looking for people willing to participate in evaluating the user experience and the ease of use of this new UI in a roughly 1 hour interview sometime in May. For more information, check out my user page. If you want to register for the evaluation interviews, you can do so at https://datumprikker.nl/event/index/fuwv62b5tatqq4vr. M.alten.tue (talk) 11:10, 30 April 2024 (UTC)

Re: New EntitySchema data type is ready for testing on Test Wikidata

Hello,

You may recall our previous announcement inviting you to test a new EntitySchema datatype on Test Wikidata. Once fully implemented, this data type will allow editors to refer to existing EntitySchemas in statements to indicate what class of Items are governed by an EntitySchema for example. Many thanks to everyone who provided feedback during the initial testing phase.

Based on your input, we have reassessed the architecture of the datatype. It should now return as a ‘wikibase-entityid’ when accessed through the API. See phab:T339920 for more information on this.

In addition to this update, we have made the following improvements:

Improved Display: EntitySchemas linked in statements, Wikitext, edit summaries and Special Pages are now displayed by their labels instead of their IDs, making them more readable and easier to understand.
Language Support: We've added support for language fallback to ensure EntitySchemas are legible across different languages.

We invite you to test these changes once again and provide us with your feedback by June 6. Unless any major issues arise, we will enable the new datatype on Wikidata during the next train deployment. Once this is done, the open property proposal at Wikidata:Property_proposal/Pending#Shape_Expression_for_class can be created.

Please note when testing: the data type still needs to be registered with Wikibase Client to ensure full accessibility of Items with EntitySchema statements on client wikis. See phab:T363153 for more information on this.

If you encounter any issues, have questions or concerns, or want to provide feedback, please don’t hesitate to reach out to us on this talk page or leave a comment on this ticket (phab:T332157).

Cheers,

- Mohammed Abdulai (WMDE) (talk) 13:34, 29 May 2024 (UTC)
