Tag Archives: dArceo

dArceo source code is open!

dArceo is a long-term preservation system for source data (e.g. master files), initially developed by PSNC as part of the SYNAT project. The system has been successfully included in the PSNC package DInGO, a set of tools for digitizing collections and sharing them online, and it is currently used by dozens of Polish cultural and scientific institutions.

We are pleased to announce that the source code has been released under the GNU GPL v3.0. This is a result of cooperation with the Open Preservation Foundation, of which PSNC is a member. Releasing the dArceo sources under a free software license makes the system more transparent and accessible to everyone interested; these are key conditions in professional archiving and in ensuring long-term access to digital content. We encourage you to use the system and to cooperate in its development!

dArceo source code repository:

http://github.com/psnc-dl/darceo

5th Digital Encounters with Cultural Heritage

This year the conference was devoted to the topic “Digital Representation of the Artefact – methods, reliability, sustainability”. It took place in Wrocław on 19-20 Nov 2012. Many interesting presentations were given, including those related to digital preservation, visualization and access to cultural heritage digital assets over the internet. Several standards and formats were presented (e.g. the STARC metadata schema for cultural heritage documentation), 3D visualization approaches were proposed, and cataloging techniques were described together with best practices. Conference attendees also had the opportunity to learn about several tools developed by Poznań Supercomputing and Networking Center for cultural heritage institutions, including the dMuseion system for building digital museums, the dLab system for managing digitization workflows, and dArceo, which focuses on long-term preservation of cultural heritage digital assets.

Open Repositories 2012 conference

The Open Repositories 2012 conference and the accompanying workshops were held in Edinburgh (Scotland) on 9-13 July 2012. The rich conference programme and workshops, as well as the large number of participants, confirm the importance of the Open Repositories series.
The workshops consisted of several sessions. Especially interesting in the context of digital libraries were those related to data and text mining and to long-term preservation. The data mining workshops mainly covered search engines, semantic search, metadata and data aggregation, information extraction from texts, and text-related workflow systems. Various topics and systems were presented.

Various tools were used in the development of the systems presented, e.g. TextCat (http://odur.let.rug.nl/vannoord/TextCat/), U-Compare (http://u-compare.org/), OSCAR4 (https://bitbucket.org/wwmm/oscar4/wiki/Home), ANTLR (http://www.antlr.org/), MAUI (http://code.google.com/p/maui-indexer/), KEA (http://www.nzdl.org/Kea/), Sesame (http://www.openrdf.org/index.jsp), H2 (http://www.h2database.com/).

The workshops on long-term preservation focused mainly on the Trident system and its capabilities. They covered the most important aspects of long-term preservation, including the identification of files that should be migrated or normalised, as well as tools that can be used to create a long-term preservation workflow: Kepler (https://kepler-project.org/), Taverna (http://www.taverna.org.uk/), Ptolemy II (http://ptolemy.eecs.berkeley.edu/ptolemyII/) and Triana (http://www.trianacode.org/).
The conference itself lasted three days. Various topics were raised and a number of interesting articles were presented, e.g.:
  • “Build to scale” – a presentation showing how to build a search system based on Apache Solr that handles 250M records and returns results in 2 seconds or less.
  • “Inter-repository Linking of Research Objects with Webtracks” – a presentation describing the InteRCom protocol for exchanging semantic information between repositories.
  • “ResourceSync: Web-based Resource Synchronization” – a presentation of a protocol for data synchronisation. It is based on experience from the OAI-PMH and OAI-ORE protocols.
  • “Griffith’s Research Data Evolution Journey: Enabling data capture, management, aggregation, discovery and reuse.” – a description of the research infrastructure of Griffith University, including semantic tools such as VIVO (http://sourceforge.net/apps/mediawiki/vivo/) and VITRO (http://vitro.mannlib.cornell.edu/).
  • “Multivio, a flexible solution for in-browser access to digital content” – a presentation describing a multi-purpose viewer for PDF, GIF, JPEG and PNG files that can interpret Dublin Core, MARC21, MODS and METS.
  • “ORCID update and why you should use ORCIDs in your repository” – a presentation on the current status of ORCID, a system for researcher identification (http://about.orcid.org/).
  • “Digital Preservation Network, Saving the Scholarly Record Together” – a presentation on an initiative among several institutions in the USA focused on building a heterogeneous system for long-term preservation (http://d-p-n.org/).
During the conference, a representative of Poznań Supercomputing and Networking Center presented the article entitled “dArceo services: advancing long-term preservation” and described long-term preservation services for text, image and audiovisual content, dedicated to Polish scientific and cultural heritage institutions. We invite you to visit the OR2012 website (http://or2012.ed.ac.uk/) and view the available presentations.