How can we make multimedia data easier to use on Wikipedia and Wikimedia sites?
Today, information about media files on Wikimedia sites is stored in unstructured formats that cause a range of issues: for example, file information is hard to search, some of it is only available in English, and it is difficult to edit or re-use files in compliance with their license terms.
To address these issues, members of the Wikidata team and our Multimedia team met with community volunteers for a week-long bootcamp in Berlin in October 2014.
The focus of this event was to investigate how to structure data on Wikimedia Commons, reusing the technology developed for Wikidata. Participants collaborated in small workgroups to explore a range of problems and solutions, in parallel sessions focused on community, design, engineering, licensing and product management challenges.
Each workgroup produced concrete examples of how these ideas could be implemented, including:
- first data models for making file information machine-readable and license-compliant
- first user interface designs for viewing and editing structured data seamlessly
- a working prototype of a high-level API for transferring metadata about media files
- improvements to a prototype identifying files missing machine-readable data.
These preliminary ideas are now being documented on Commons so they can be discussed and improved with Wikimedia community members. For a project overview, check out this development page and these project slides.
The bootcamp was very productive, but many questions remain unanswered. Next steps include community discussions, design, prototyping, testing and a series of experiments, all before starting actual development and data migration next year.
Everyone is invited to contribute to this important project. Your ideas and comments are very welcome, and developers would love your active participation in defining and guiding this project.
We look forward to working with our community to modernize our multimedia infrastructure and better support the needs of our users.
Update: To learn more, check out this video of my presentation at the Wikimedia Foundation’s November metrics meeting.
Credits: I am the main author of this blog post, which was originally published on the Wikimedia Blog, under CC-BY-SA 3.0. Photo: Structured Data Bootcamp Group Photo by Christopher Schwarzkopf, under CC-BY-SA 2.0.