This year the conference theme was “Digital Representation of the Artefact – methods, reliability, sustainability”. The conference took place in Wrocław on 19-20 November 2012. It featured many interesting presentations, including ones on digital preservation, visualization, and online access to digital cultural heritage assets. Several standards and formats were presented (e.g. the STARC metadata schema for cultural heritage documentation), 3D visualization approaches were proposed, and cataloging techniques were described together with best practices. Attendees also had the opportunity to learn about several tools developed by Poznań Supercomputing and Networking Center for cultural heritage institutions: the dMuseion system for building digital museums, the dLab system for managing digitization workflows, and dArceo, which focuses on long-term preservation of digital cultural heritage assets.
The Theory and Practice of Digital Libraries conference (formerly the European Conference on Digital Libraries, ECDL) was held in Paphos (Cyprus) on September 23-27, 2012. PSNC presented two papers, which can be found in the conference proceedings published as Lecture Notes in Computer Science (7489):
- Creation of Textual Versions of Historical Documents from Polish Digital Libraries (Adam Dudczak, Miłosz Kmieciak, Marcin Werla)
- Advanced Automatic Mapping from Flat or Hierarchical Metadata Schemas to a Semantic Web Ontology (Justyna Walkowska, Marcin Werla)
The former paper describes the prototype of the Virtual Transcription Laboratory created by PSNC as part of the SYNAT project. The work included experiments in training an OCR engine to automatically recognize text in digital scans of old documents (Polish texts printed in the 16th and 17th centuries). The paper explains the rationale behind the prototype, its capabilities, and directions for further development.
The latter paper concerns the transformation of data described with traditional metadata schemas (such as MARC 21 or Dublin Core) into ontological formats designed for the Semantic Web and Linked Open Data environment. The paper describes requirements for languages that express such mapping rules and for the tools that implement them. It also briefly presents the jMet2Ont mapping tool.
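As a rough illustration of the kind of rule-based mapping the paper discusses, the Python sketch below turns a flat, Dublin Core style record into subject-predicate-object triples. The rule table, the `example.org` URI, the sample record, and the function name `map_record` are assumptions made up for this example; this is not the jMet2Ont rule language or its implementation.

```python
# Hypothetical mapping rules: flat field name -> predicate URI (Dublin Core terms).
DCTERMS = "http://purl.org/dc/terms/"
RULES = {
    "title": DCTERMS + "title",
    "creator": DCTERMS + "creator",
    "date": DCTERMS + "created",
}


def map_record(record_uri, record):
    """Return (subject, predicate, object) triples for one flat metadata record."""
    triples = []
    for field, values in record.items():
        predicate = RULES.get(field)
        if predicate is None:
            continue  # fields without a mapping rule are skipped
        for value in values:
            triples.append((record_uri, predicate, value))
    return triples


# Made-up sample record; "format" has no rule and is therefore ignored.
sample = {
    "title": ["Example title"],
    "creator": ["Example author"],
    "format": ["image/jpeg"],
}
for triple in map_record("http://example.org/object/1", sample):
    print(triple)
```

A real mapping to an ontology usually needs more than one-to-one field renaming (for example, creating separate resources for people and places), which is why the paper discusses requirements for dedicated mapping languages.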
For us, the conference started on Sunday with a so-called doctoral consortium. A doctoral consortium is a meeting during which each PhD student is assigned a mentor who is obliged to read (before the meeting) an extended abstract of the planned PhD thesis, and to prepare a list of comments and questions. During the meeting, each student presents their work and results to date. The mentor is expected to facilitate discussion after the presentation. Such an event is very beneficial for the students who are offered a chance to learn experts’ opinion on the strong and weak points of the research, all in a safe and friendly environment (the meeting is closed to the public).
The main conference lasted three days, Monday to Wednesday.
An outstanding keynote speech was Cathy Marshall’s (Microsoft Research) Whose content is it anyway? Social media, personal data, and the fate of our digital legacy. The author raised a number of interesting issues concerning the transience of digital media, the expectations of ordinary users, and how the situation has been changed by social media such as Twitter or Facebook. The talk was well prepared and full of surprising points, twists, and inspiring conclusions.
The same subject appeared in a presentation by Hany M. SalahEldeen and Michael L. Nelson of Old Dominion University. In their paper entitled Losing My Revolution: How Many Resources Shared on Social Media Have Been Lost? the authors analyzed archived social media content related to six important events of the last few years (including the Egyptian revolution, the H1N1 pandemic, and Michael Jackson’s death). It turns out that after a year 11% of the content linked from social portals is no longer available. One more year means another dozen percent of dead links. The full paper is available on arXiv.org. The study was also considered important by traditional media, including the BBC.
A large number of papers were dedicated to machine learning applications. Digital library data seem to be a perfect field for applying and testing machine learning algorithms. Another very interesting talk was Finding Quality Issues in SKOS Vocabularies (Christian Mader, Bernhard Haslhofer, Antoine Isaac). The authors defined a set of quality indicators and good practices for thesauri encoded in the SKOS format, and also created a validation tool called qSKOS.
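To give a feel for what an automated vocabulary quality check can look like, here is a minimal Python sketch that runs two simple checks over a toy thesaurus. The data structure, the concept identifiers, and the two checks (concepts without a preferred label, concepts with no hierarchical links) are assumptions chosen for illustration; they are not taken from the qSKOS tool or from the paper's list of indicators.

```python
# Toy thesaurus: concept id -> preferred labels by language, plus
# broader/narrower links (loosely modelled on skos:prefLabel,
# skos:broader and skos:narrower).
concepts = {
    "c1": {"prefLabel": {"en": "Maps"}, "broader": [], "narrower": ["c2"]},
    "c2": {"prefLabel": {"en": "Atlases"}, "broader": ["c1"], "narrower": []},
    "c3": {"prefLabel": {}, "broader": [], "narrower": []},
}


def missing_labels(concepts):
    """Return ids of concepts that have no preferred label in any language."""
    return sorted(cid for cid, c in concepts.items() if not c["prefLabel"])


def unlinked_concepts(concepts):
    """Return ids of concepts with no broader/narrower link in either direction."""
    linked = set()
    for cid, c in concepts.items():
        for other in c["broader"] + c["narrower"]:
            linked.add(cid)
            linked.add(other)
    return sorted(set(concepts) - linked)


print(missing_labels(concepts))     # ['c3']
print(unlinked_concepts(concepts))  # ['c3']
```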
One of the most interesting events during the conference was the poster and demo session. The best demo contest was won by FrbrVis: An Information Visualization Approach to Presenting FRBR Work Families (Tanja Mercun, Maja Zumer, and Trond Aalberg). The authors, aware that more and more libraries and metadata aggregators are considering the FRBR model, set themselves the task of designing an effective way of displaying FRBR data, so that users could benefit from the model without feeling overwhelmed by it. They proposed four interface options and then performed usability testing on a large number of users. Two graphical representations were picked as favourites: a concentric one (called sunburst) and a hierarchical one. An unexpected conclusion was that the graph-based representation (popular in the Semantic Web world due to the very nature of RDF data), even though considered attractive at first glance, proved difficult to use. A notable poster in this session was presented by the already mentioned Hany M. SalahEldeen, who studied the temporal intention of users publishing links to online resources in social networks.
Thursday was the day of workshops. Conference participants were given the following choice:
- International Workshop on Supporting Users’ Exploration of Digital Libraries
- Networked Knowledge Organisation Systems and Services. The 11th European Networked Knowledge Organisation Systems (NKOS) Workshop
- 2nd International Workshop on Semantic Digital Archives
The NKOS workshop was dedicated mainly to the ISO 25964 thesaurus standard and its relation to SKOS. Only the first part of the standard is ready as of now. The documents describing the standard are not available for free, but a number of materials can be downloaded from the ISO 25964 webpage, including the XML schema (xsd) definition.
The archives workshop included a Semantic Technologies & Ontologies session in which Vladimir Alexiev of Ontotext gave a very interesting presentation about CIDOC CRM Search Based on Fundamental Relations and OWLIM Rules. Mentioning the FORTH (Foundation for Research and Technology – Hellas) A New Framework for Querying Semantic Networks study, he presented a model of searching which translates the 82 classes and 142 properties of CIDOC CRM to a smaller number of so-called fundamental classes (e.g. Person, Place) and properties, making the search much easier. Ontotext is the producer of the RDF repository called OWLIM. The presentation also described a set of OWLIM reasoning rules producing the simplified model.
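The class side of this simplification can be pictured as walking up the class hierarchy until a "fundamental" class is reached. The Python sketch below does this for a tiny, partial excerpt of the CIDOC CRM hierarchy. The choice of fundamental classes, the dictionary layout, and the function name `fundamental_class` are assumptions made for this example; the sketch does not reproduce the FORTH framework or the OWLIM rules from the presentation.

```python
# Partial excerpt of the CIDOC CRM class hierarchy (child -> direct parents).
SUBCLASS_OF = {
    "E21 Person": ["E39 Actor"],
    "E74 Group": ["E39 Actor"],
    "E7 Activity": ["E5 Event"],
    "E11 Modification": ["E7 Activity"],
    "E12 Production": ["E11 Modification"],
}

# Assumed set of fundamental classes for this illustration.
FUNDAMENTAL = {
    "E39 Actor": "Actor",
    "E5 Event": "Event",
    "E53 Place": "Place",
}


def fundamental_class(crm_class):
    """Return the first fundamental class found among crm_class and its ancestors."""
    stack = [crm_class]
    seen = set()
    while stack:
        current = stack.pop()
        if current in FUNDAMENTAL:
            return FUNDAMENTAL[current]
        if current in seen:
            continue
        seen.add(current)
        stack.extend(SUBCLASS_OF.get(current, []))
    return None  # no fundamental ancestor in this partial hierarchy


print(fundamental_class("E12 Production"))  # Event
print(fundamental_class("E21 Person"))      # Actor
```

A search interface can then offer a handful of fundamental classes instead of the full class list, while the mapping back to the detailed model stays in the rules.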
In the shortest of the workshops, on supporting users’ exploration (additional materials available here), the participants had a chance to listen to a talk by David Haskiya (Europeana Foundation) about Europeana’s existing and planned features supporting users’ exploration of resources. The workshop ended with an interesting panel discussion in which the most prominent subject was the needs and expectations of current and future users of digital libraries, especially in the context of the youngest generation.
The conference was held in a beautiful and historically significant corner of Europe which unfortunately is very hard to reach from Poland. Last year’s location (Berlin) was easier to get to for most of the participants. Next year the conference is to be held in Malta.
Post authors: Adam Dudczak, Justyna Walkowska, Marcin Werla
- Evidence Finder (http://labs.ukpmc.ac.uk), which allows searching 2M documents and 71M sentences.
- MEDIE (http://www.nactem.ac.uk/medie/), which allows semantic searching of biomedical information (it is based on MEDLINE, http://www.nlm.nih.gov/pubs/factsheets/medline.html).
- Argo (www.nactem.ac.uk/Argo), which allows creating workflows for text analysis and processing.
- HIVE and its extension HIVE-ES (https://www.nescent.org/sites/hive/), which make it easy to create metadata and vocabularies.
- CORE (http://core-project.kmi.open.ac.uk/), which allows searching both data and metadata from various documents, including searching for content based on harvested metadata.
During the development of the above systems various tools have been used, e.g. TextCat (http://odur.let.rug.nl/vannoord/TextCat/), U-Compare (http://u-compare.org/), OSCAR4 (https://bitbucket.org/wwmm/oscar4/wiki/Home), ANTLR (http://www.antlr.org/), MAUI (http://code.google.com/p/maui-indexer/), KEA (http://www.nzdl.org/Kea/), Sesame (http://www.openrdf.org/index.jsp), H2 (http://www.h2database.com/).
- “Build to scale” – a presentation showing how to build a search system based on Apache Solr that handles 250M records and returns results in two seconds or less.
- “Inter-repository Linking of Research Objects with Webtracks” – a presentation describing the InteRCom protocol for exchanging semantic information between repositories.
- “ResourceSync: Web-based Resource Synchronization” – a presentation of a protocol for data synchronisation. It builds on experience with the OAI-PMH and OAI-ORE protocols.
- “Griffith’s Research Data Evolution Journey: Enabling data capture, management, aggregation, discovery and reuse.” – a description of the research infrastructure at Griffith University, including semantic tools such as VIVO (http://sourceforge.net/apps/mediawiki/vivo/) and VITRO (http://vitro.mannlib.cornell.edu/).
- “Multivio, a flexible solution for in-browser access to digital content” – a presentation describing a multi-purpose viewer for PDF, GIF, JPEG and PNG files that understands Dublin Core, MARC 21, MODS and METS.
- “ORCID update and why you should use ORCIDs in your repository” – a presentation on the current status of ORCID, a system for researcher identification (http://about.orcid.org/).
- “Digital Preservation Network, Saving the Scholarly Record Together” – a presentation on an initiative by several institutions in the USA to build a heterogeneous system for long-term preservation (http://d-p-n.org/).
Registration for the 2011 Polish Digital Libraries Conference will be open for only two more weeks. This year’s edition of the conference will be held at the Kórnik Library of the Polish Academy of Sciences from the 10th to the 13th of October. The conference will be accompanied by two parallel tutorials, a workshop and the demonstration day of the IMPACT project.
The detailed program of the conference (in Polish) is available at: http://www.man.poznan.pl/PBC/2011-program-konferencji/
The “CIDOC 2011 – Knowledge Management and Museums” conference took place in Sibiu, Romania, on September 4-9, 2011. The conference is an annual event organized by ICOM-CIDOC, the Committee for Documentation of the International Council of Museums.
The conference participants came from very different but cooperating environments: museologists, librarians, programmers and museum software vendors, researchers in the field of ontologies and the semantic web, as well as people and institutions concerned with museum documentation standards.
The conference included meetings of CIDOC working groups:
- Archaeological Sites
- Conceptual Reference Model Special Interest Group
- Data Harvesting and Interchange
- Digital Preservation
- Documentation Standards
- Information Centres
- Transdisciplinary Approaches in Documentation
A number of topics were raised at the conference which are tightly connected with PSNC’s work in the SYNAT project. The most prominent ones were:
- LIDO (Lightweight Information Describing Objects) specification (www.lido-schema.org/) for description of museum resources made available online
- recommendation to use persistent, unique identifiers (URIs) of museum resources
- FRBRoo ontology which merges CIDOC CRM and FRBR (Functional Requirements for Bibliographic Records) to properly describe digital resources online (www.nla.gov.au/lis/stndrds/grps/acoc/tillett2004.ppt, http://www.frbr.org/categories/frbroo)
- Wiss-ki system presentation (http://wiss-ki.eu/, http://www8.informatik.uni-erlangen.de/transdisc/hohmann_cidoc09_wisski-2.pdf). The goals and assumptions of the project are very close to those of SYNAT, and some of the solutions already used there might be reusable in SYNAT.
The next CIDOC conference will take place in June 2012 in Helsinki. Additionally, the CIDOC “summer school” for people taking care of museum documentation is planned for the holiday period of 2012.
The Open Repositories 2011 conference was held on June 6-11. It is an important international event for exchanging information about the development, management and application of digital repositories.
Over 300 participants from over 20 countries had the opportunity to hear lectures by such prominent representatives of the IT and digital libraries communities as Jim Jagielski and Clifford Lynch. Conference sessions were dedicated to various topics related to digital repositories, including the semantic web, tools and standards, long-term preservation and social networks.
The opening speech was given by Jim Jagielski, president of the Apache Software Foundation, who described how open-source communities are organised and how they collaborate. He underlined that open-source projects are developed mainly by volunteers, and that the key aspect of cooperation is trust between project participants. Bradley McLean from DuraSpace identified key trends for the future of digital repositories: mobile technologies, long-term preservation, cloud computing, and mashups. Richard Rodgers from M.I.T. Libraries presented the ORCID initiative, whose aim is to create a central registry of researchers to solve the problem of author ambiguity.
Many tools, systems and initiatives related to digital repositories were also presented at the conference: Memento, Hathi Trust, DAR, FITS, OTS-Schemas, BatchBuilder, ReDBox and Mint, Exhibit, Fascinator, Recollection, SWORD, CUPID.
At the conference, Tomasz Parkoła from PSNC presented a poster describing the concept of building multiple virtual digital repositories mapped over collections of a shared digital library. Digital repositories are currently an important new trend in the network of Polish digital libraries. The main aim is to increase the online visibility of contemporary Open Access research works. Such activities are also supported by the Integrated Knowledge System developed by PSNC within the SYNAT project.
On the 7th of December, 2010, the second seminar in the series Computerization of Cultural Institutions – Cultural Institutions on the Internet was held in Warsaw. The objective was to demonstrate how cultural institutions can exist and promote themselves on the Internet and, going further, what benefits their presence on the World Wide Web brings.
During this seminar Agnieszka Lewandowska from Poznan Supercomputing and Networking Center presented the Europeana project, the Europeana.eu portal and the options for transferring data to Europeana via the EuropeanaLocal project. The presentation focused on describing Europeana and the benefits of connecting to it (increased visibility, promotion, interesting neighborhood, etc.). The model of connecting to Europeana.eu was presented, as well as the possible paths for Polish regional and local institutions. We hope that as a result new institutions will join Europeana via the EuropeanaLocal project and the Digital Libraries Federation.
Among the many lectures deserving attention was the speech by Tomasz Rodowicz, “Theatre Online – or new media culture”, about attracting internet users to the theater. The speaker presented a method of drawing the attention of internet users based on his own experience. He described broadcasting a traditional theatrical performance (with a regular audience present) over the Internet, with a large screen showing live comments from internet users forming part of the scenery. This creates a spectator-actor-net user triangle in which all parties interact with one another.
Finally, we encourage you to have a look at two pages:
- Map of Culture – “the first nationwide interactive portal to promote Polish culture in the regions co-created by the users”,
- Platform Culture – “an interactive portal dedicated to culture and cultural education designed for information exchange, presentation of interesting practices and projects from across Poland and the activation of cultural communities”.
These two pages were presented during the seminar by Karolina Szczepanowska from the Polish National Center of Culture.
The “i3: Internet-Infrastructures-Innovations” conference took place in Wrocław on December 1-3, 2010. The conference was organized by the PIONIER Consortium. Poznań Supercomputing and Networking Center and Wrocław Center for Networking and Supercomputing played a significant role in the organization of this year’s edition.
During the conference we had a chance to present two papers: “Technical Challenges Associated with Presentation of Digital Cultural Heritage in the Internet” and “Architecture and Protocols for Building a Knowledge System – PSNC Tasks in the SYNAT Project”.
The first paper gave an overview of the PSNC approach to presenting multi-domain digital objects over the Internet. The idea is to store various types of digital objects (e.g. texts, graphics, audio and video) using the generic data model of the dLibra system and to present the objects through personalized, dedicated web portals for different institutions, such as libraries, museums or oral history institutes.
The second presentation, “Architecture and Protocols for Building a Knowledge System – PSNC Tasks in the SYNAT Project”, was a general overview of the work done by PSNC in the SYNAT project. The development of SYNAT began in August 2010, so it is still in an early phase. The main purpose of this presentation was to show our initial assumptions and gather feedback which might be useful in our further work.
A number of interesting presentations were given during the conference. There were three main session themes (“Internet”, “Infrastructures”, “Innovations”). Subjects within these themes included e-health, e-education, Future Internet and network safety.
During the regions session a representative of the IZIP company presented a Czech Electronic Health Record system. The system was commissioned by the biggest health insurer in the Czech Republic. It stores information about the patient’s health, diagnoses, ordered analyses and their results. This way the analyses are not repeated by every doctor the patient visits, which significantly limits the costs. Another important goal of the system is health service quality improvement thanks to easy access to the patient’s history, facilitating the decision-making process.
We found the following presentations particularly interesting:
- “Open Administration Services in Urban Area”, given by Bartosz Lewandowski from PSNC. Bartosz described the open e-administration policy promoted by PSNC and the Municipal Office of Poznań.
- “The Importance of Digitization and Archival Resources for the Development of New Social Initiatives”, given by Piotr Skałecki. Piotr presented digitization projects from the field of genealogy, e.g. the “Poznań Project: Marriage Database Search”, which allows its users to search a database containing data about 540,825 marriages contracted in the years 1820-1889 in the former Province of Posen. The projects of the Polish Genealogical Society were also mentioned, among them Geneteka and the transcription of a document entitled “Polish Declarations of Admiration and Friendship for the United States”.
- “The URBANCARD Service” – a presentation about the city card (URBANCARD) for which the Municipal Office of Wrocław was awarded the Innovation Pioneer prize. This presentation was given by Dariusz Jędryczek.
Most presentations are available online at the conference webpage (http://www.i3conference.net/online/program.php).
This year’s edition of the “Cyfrowe spotkania z zabytkami 4” conference was organised with the subtitle “Cultural Heritage on the Web: Access and Exchange of Information”. The conference brought together representatives of various institutions, including libraries, museums and archives, as well as humanists and computer science specialists. It covered topics related to standards for information representation (such as CIDOC CRM and LIDO), methods of knowledge acquisition from unconventional sources (e.g. blogs), and standards for describing iconographic resources. In the context of cartography, a system for geographical localisation of digital cultural heritage resources was presented.
During the conference Poznan Supercomputing and Networking Center presented dMuseion, a digital museum platform developed by PSNC in cooperation with the National Museum in Warsaw. Although dMuseion is not yet publicly available, it already offers functionality for building digital collections of museum holdings.