All posts by Adam Dudczak

ACCESS IT plus course in Bosnia and Herzegovina


More good news from the Balkans ;-). The “Digital Repositories for small memory institutions” and “Cooperation with Europeana” e-learning courses developed by PSNC in the AccessIT and Access IT plus projects (funded under the EU Culture Programme) are now available in Serbian, thanks to the joint effort of the National and University Library of Republika Srpska (NULRS, our Bosnian partner) and Belgrade City Library (the Serbian partner of ACCESS IT). A few days ago NULRS released both courses on their e-learning portal.

Just as in Poland, Greece, Croatia, Serbia and Turkey, the courses are free for anyone interested in digitisation, digital libraries and Europeana. Congratulations to our colleagues from Bosnia and Herzegovina, you did a great job! 😉

Next stage in beta testing of VTL

On Friday, 15 February 2013, we released a number of new functions and improvements in the Virtual Transcription Laboratory (VTL) portal.

These are the most prominent ones:

  • a noticeable improvement in the performance and stability of the whole portal,
  • a change in the way the transcription edit history is stored,
  • import of an existing DjVu publication based on its OAI identifier (this feature is described in the end-user documentation),
  • batch OCR for all files in a project,
  • notifications showing whether changes made in the transcription editor were saved,
  • many minor improvements and fixes for bugs reported by users,
  • the first version of the user documentation has been published.
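
To illustrate what importing "on the basis of the OAI identifier" can involve, here is a minimal sketch in Python. It assumes the source digital library exposes a standard OAI-PMH endpoint; it is not a description of how VTL itself performs the import. The base URL and the identifier are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration (not a real library address).
BASE_URL = "https://library.example.org/oai-pmh"


def parse_oai_identifier(identifier: str) -> tuple[str, str]:
    """Split 'oai:<repository>:<local id>' into (repository, local_id)."""
    scheme, _, rest = identifier.partition(":")
    repository, _, local_id = rest.partition(":")
    if scheme != "oai" or not repository or not local_id:
        raise ValueError(f"not an OAI identifier: {identifier!r}")
    return repository, local_id


def get_record_url(base_url: str, identifier: str,
                   metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord request URL for a single publication."""
    parse_oai_identifier(identifier)  # validate before building the URL
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Hypothetical identifier of a publication in the example repository.
print(get_record_url(BASE_URL, "oai:library.example.org:12345"))
```

The response to such a request is an XML record with the publication's metadata; locating and downloading the DjVu files themselves depends on the repository software and is outside this sketch.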

A few months have passed since the BETA release of VTL. We would like to thank everyone for their feedback ;-). After the initial release it became clear that serious changes had to be made in the portal engine. The most important one was the change in the way a transcription is represented and stored in the database. This was a major undertaking, but it resulted in a significant improvement in performance and stability.

In the near future two new functions will be added:

  • export of project results in EPUB format,
  • the possibility to upload TIFF files into a project (they will be automatically converted to PNG files at 300 DPI).

Authors of the post: Bogna Wróż, Adam Dudczak

Summary of 2nd edition of FBC e-learning courses

The second edition of the Polish Digital Libraries Federation’s e-learning courses ran from October 2012 until the end of January. The courses “Cooperation with Europeana” and “Digital repositories for small memory institutions” attracted considerable interest. The Europeana course had exactly 100 participants, and the second one, covering digitisation and digital libraries, had 128 registered participants. Apart from knowledge, every participant could also gain a certificate confirming participation in the course and the grade obtained. Passing the tests was the condition for receiving a certificate; together, the courses include 11 tests and 255 questions (57 in the first and 198 in the second). We certified 78 people: 27 participants of the Europeana course and 51 participants of the digital repositories course (31 in this second edition). We congratulate everybody who managed to finish all the tests ;-).

Keene High School (old) Graduating Class of 1875, Keene, New Hampshire

People who finished a course were asked to grade it. Each topic of the course was rated on a scale of 1 to 5, and we also received personal comments about the course content.

As for the “Cooperation with Europeana” course, the first topic, “Overview of Europeana”, obtained an average rating of 4.71 and the second, “Technical aspects of Europeana”, 4.24. The critical comments mostly concerned the technical aspects covered in the course; participants asked for more practical examples. We promise to improve 😉 Nonetheless, most comments were positive: participants wrote that the courses improved skills they use in their everyday work.

As far as the course “Digital repositories for small memory institutions” is concerned, the highest rating was given to the topic “Introduction to digitisation” (4.77), while “Building digital collections” got the lowest one (4.57). There was a critical comment about standardizing the grading of quizzes. A few participants complained about the level of detail in the instructions for particular software packages, but this was also one of the biggest advantages mentioned by other graduates.

With the end of the second edition, enrollment is closed until late March, when we plan to start another edition of the courses.

Authors: Bogna Wróż, Adam Dudczak

ACCESS IT plus course in Croatian

View on Rijeka

The “Digital Repositories for small memory institutions” and “Cooperation with Europeana” e-learning courses developed by PSNC under the AccessIT and Access IT plus projects (funded under the EU Culture Programme) have been released in Croatian by the City Library in Rijeka.

Just as in Poland, Greece, Serbia and Turkey, the courses are free for anyone interested in digitisation, digital libraries and Europeana. Each course lasts a maximum of 3 months; upon completion, participants can obtain a certificate, which is issued in cooperation with CSSU (Center for permanent education of librarians) at the National and University Library in Zagreb.

More information can be found at the Access IT plus project website.

Culture 2.0: Digital Archives – Tool Shop

On October 26-27, the Polish National Audiovisual Institute (NiNA) organized the annual Culture 2.0 conference/festival. Attendees were given a chance to participate in a range of lectures, workshops, games and demonstrations. The full conference programme is available at the conference website. PSNC was the event’s partner, Platon TV recorded the event, and the Digital Libraries Team was responsible for operating the “Digital Archives Tool Shop”.

What was the idea behind the Tool Shop? Digital libraries, archives and museums are usually associated with big institutions and their priceless, historical collections. But each one of us can stumble upon some family mementoes – old documents, photos or postcards – hidden in a long-forgotten drawer, but worthy of preservation and display. Our goal was to show the visitors how to create a digital (e.g. family) archive using widely available tools: a simple scanner and camera, open source software, and how to make it accessible online in accordance with digital librarianship canons and guidelines. The Tool Shop consisted of three stands: “Scanning and Processing”, “Tran2|>rip>ion”, and “Let Everyone See!”

fot. Justyna Walkowska

At the first stand (Scanning and Processing) the visitors were invited to digitize materials they had brought from home. We presented the scanning process and its result, and explained how the quality of the result can be improved after it finds its way to a computer disk. All tasks at this stage were performed using the DigitLab system, with tools such as ScanTailor, gScan2PDF, Tesseract or SimpleScan. We treated the task as a kind of exam for DigitLab, and we think it passed with flying colours. Direct contact with users is a great opportunity for every tool creator. The comments and suggestions we received will be reflected in the next release of the system.

Stand no. 2, the one with the peculiar name (Tran2|>rip>ion), presented the “Virtual Transcription Laboratory”. VTL is a portal which allows users to create full-text versions (transcriptions) of textual documents. As demonstrated at the first stand, the result of a digitization process is a graphics file: the digital representation of the document that was scanned. However, it does not contain the text of the document in a form understandable for the computer. The textual contents are necessary to create effective search mechanisms, to enhance the document’s visibility online, and to open new research possibilities. Using VTL, the conference visitors were able to automatically convert their scans to digital text in the OCR (Optical Character Recognition) process. VTL also makes it possible for users to co-edit the automatically recognized text, correcting any errors made by the OCR program. VTL brings together automatic and crowdsourcing methods, thanks to which librarians, researchers and hobbyists can create high-quality textual representations of historical documents.

At the last stand the visitors were encouraged to make their digitized resources available online in the same way professional librarians do. We taught them how to create a private archive using tools such as Omeka, and also presented the publication process of the biggest Polish digital libraries (which mostly use our dLibra software). As the next step, we explained how to check who is linking to our online resources and how to monitor their usage with free tools. A significant number of visitors had not heard about the Digital Libraries Federation or Europeana, so we put some effort into describing those portals’ functions and goals.

For us this event offered a priceless opportunity to test our solutions in direct interactions with the users. Those were two very busy days, and unfortunately we did not have much time to participate in lectures or workshops happening in the same place. We did manage to look around Level 2.0 (that is, the 2nd floor, on which we were located), where different installations were presented. One of our favourites was Waldemar Węgrzyn’s Electrolibrary, in which a traditional book was used as the interface to an enriched electronic version.

Electrolibrary from Waldek Wegrzyn on Vimeo.

This short post is far from being a complete description of what conference participants were able to see. We hope that the recorded lectures will be made available soon, giving us the chance to catch up and see what we missed. 😉

First Polish THATCamp

The first Polish THATCamp will be organized on 24-25 October 2012 and will be held alongside the “Zwrot Cyfrowy w humanistyce Internet Nowe Media-Kultura 2.0” conference in Lublin. The event is organized by the Polish THATCamp coalition and will take place at the headquarters of NN Theater in Lublin’s Old Town (Grodzka 21). Poznań Supercomputing and Networking Center is an official partner of this event.

THATCamp (The Humanities And Technology Camp) is a meeting of people interested in new technologies in the humanities, sociology, and the activities of academic and artistic institutions (universities, galleries, archives, libraries and museums). THATCamps are organized all over the world, and participation in these events is free.

The beginnings of THATCamp date back to 2008, when it was organized for the first time in the USA by the Center for History and New Media (CHNM) at George Mason University.

More information about the event can be found here (in Polish).

Post authors: Bogna Wróż, Adam Dudczak

Digital Humanities 2012 conference

Digital Humanities 2012 was one of the best conferences we attended in 2012. The organizers managed to gather more than 500 attendees from all around the world. The conference was held at the University of Hamburg, a great venue for hosting more than 200 sessions (5 parallel tracks of the main conference plus various workshops and tutorials) over the 5 days starting on Monday, 16 July.

As a summary of the conference we would like to bring your attention to a few very interesting projects and tools which were presented there. If you are interested in getting more information about them, you may check the conference website, because all the lectures were filmed and the videos are available online for free.

The first project on our list, “Programming Historian 2”, is an effort to create a second edition of a textbook which shows how programming tools like Python can be used by digital historians in their research. It sounds like a very ambitious and interesting task. The project is a collaborative effort; it consists of lessons lasting 30 to 60 minutes which try to show what can be done using modern programming tools, and how.

One of the most interesting projects from the user interface point of view was Neatline. It is a set of plugins for the Omeka digital library framework. Neatline allows users to create visually rich presentations, e.g. it helps them tell a story using a timeline and a map (see the example exhibitions about the Battle of Chancellorsville). The tool itself is very nicely done; apart from the normal fully-fledged version it is also mobile-ready and really worth trying out.

The next interesting project introduced at the conference was Pelagios. The name of the project stands for ‘Pelagios: Enable Linked Ancient Geodata In Open Systems’. It is a collection of online ancient world projects (e.g. Google Ancient Places, LUCERO and many others) used to find information about ancient places and visualize it in a meaningful way. To achieve this, they use a common RDF model to represent place references and align all of them to the Pleiades Ancient World Gazetteer. As the authors say, the project currently focuses on the ancient world, but this is only the first step towards building a Geospatial Semantic Web for the Humanities.

Among many other interesting things, the project named “Visualizing the History of English”, introduced by Marc Alexander, was one of the best projects at the conference. Marc Alexander presented a method of visualizing English vocabulary from different time periods using treemap charts. For this purpose he uses the huge database of the Historical Thesaurus of English (793,747 entries in 236,346 categories). We highly recommend watching the video of this presentation.
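
For readers unfamiliar with treemaps: the idea is to divide a rectangle into smaller rectangles whose areas are proportional to the sizes of the categories they represent. Below is a minimal Python sketch of one level of the simplest ("slice") layout. It is only an illustration of the general technique, not the method used in the presentation, and the category names and sizes are hypothetical, not figures from the Historical Thesaurus.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    label: str
    x: float
    y: float
    width: float
    height: float


def slice_layout(sizes: dict[str, float], x: float, y: float,
                 width: float, height: float) -> list[Rect]:
    """Split one rectangle into vertical strips with areas proportional to sizes."""
    total = sum(sizes.values())
    rects = []
    offset = x
    for label, size in sizes.items():
        strip_width = width * size / total
        rects.append(Rect(label, offset, y, strip_width, height))
        offset += strip_width
    return rects


# Hypothetical category sizes, for illustration only.
example = {"category A": 50, "category B": 30, "category C": 20}
layout = slice_layout(example, 0, 0, 100, 10)

for rect in layout:
    print(rect)
```

A full treemap applies the same split recursively to each sub-category, usually alternating between vertical and horizontal strips.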

There were also a few very exciting projects related to different aspects of history and geography. One of the best examples was the MayaArch3D Project, which combines art history and archaeology with GIS and virtual reality for teaching and research on ancient architecture. The current prototype is a virtual, searchable repository of the Maya city of Copan in western Honduras. One of the purposes of the paper is to analyse the visual and spatial relationships between built forms and landscape elements. The project was developed in the Unity3D game engine in combination with PHP and PostgreSQL.

QueryArch3D Demo Film from Jennifer von Schwerin on Vimeo.

This is of course not all. Below you can find a list of a few interesting tools mentioned in various presentations during the conference:

  • Stanford NLP Group: the website publishes the results of the work of the NLP group at Stanford University and offers access to multiple tools which can be used for Natural Language Processing.
  • Apache OpenNLP, a machine learning based toolkit for the processing of natural language text.
  • Alchemy API – it helps to transform text into knowledge. Alchemy is a cloud-based text mining platform providing semantic tagging to over 18,000 developers. AlchemyAPI provides the most comprehensive set of natural language processing capabilities of any text mining platform, including: named entity extraction, author extraction, web page cleaning, language detection, keyword extraction, quotations extraction, intent mining, and topic categorization.
  • A few presentations during DH mentioned Open Calais, a semantic enrichment API powered by Thomson Reuters. Now in version 4.6, it is a very interesting project that has been available for quite a while; it is nice to see that it is widely used.
  • D3.js – Data-Driven Documents is a very nice JavaScript library for manipulating documents based on data. It helps to bring data to life.
  • OKF annotator, developed by the Open Knowledge Foundation, allows users to annotate virtually any resource on the web.
  • GeoStoryteller is one of the tools used during the German Traces NYC project. It is an educational tool that allows you to create stories about physical places. Users can take a walking tour and engage with the GeoStories you have created using their mobile phone.

Last but not least, links to two interesting documents: Research Infrastructure in the Digital Humanities (from the European Science Foundation) and an inventory of FLOSS dig_hum tools.

authors: Piotr Smoczyk, Adam Dudczak