Tag Archives: wlt

New version of the Virtual Transcription Laboratory portal

A few days ago a new version of Virtual Transcription Laboratory portal has been implemented (http://wlt.synat.pcss.pl). In the current version you can find some new functions and conveniences, also all mistakes reported by users were repaired.

Most important changes that were implemented:

  • transcription’s editor supports multicolumn documents f.ex. newspapers (this option is available in new projects),
  • lines verification mechanism was added, each line is connected to the information about whether it has been looked through or not,
  • the import of TIFF files mechanism was expedited,
  • the possibility to download transcription results in text format,
  • link to preview whole scan in transcription’s editor,
  • numbers of lines were added in a transcription’s editor view,
  • it is also possible to move lines to a certain position in transcription’s view (by giving its number),
  • a mail is sent to the project’s owner after finishing the batch OCR,
  • the information about changes author was added to a history view,
  • author of the project is an optional field in a formula of creating new project

Detailed note about release with all changes and adjustments can be found here.

Apart from changes mentioned above, a suggestions and improvements forum where you can inform us about your propositions of improvements in VTL and vote for other user’s ideas has started (it is available here). You can enter the forum using an orange bookmark “Your suggestion”, which appears in the right upper corner of the VTL site. We strongly encourage to report your ideas and vote for those already visible on forum. Further works on VTL will include functions which mostly appeal to users.

New functions of the Virtual Transcription Laboratory portal

Source: http://pl.wikipedia.org/wiki/Plik:Escribano.jpg

We can gladly inform about the release of the newest version of the Virtual Transcription Laboratory (http://wlt.synat.pcss.pl).

This version contains new functions and correction of errors reported by users. Among most important changes there are:

  • the possibility to export the results of work in a ePUB file,
  • share project only with chosen VTL users,
  • support for scans in TIFF format (after uploading they will be automatically converted into the PNG/300DPI format),
  • changes in transcription editor dialogue,
  • a number of corrections in outcome hOCR files,

Full list of changes with screens can be found on our wiki:

Next stage in beta testing of VTL

On Friday 15th of February 2013, we have released a number of new functions and improvements in the Virtual Transcription Laboratory (http://wlt.synat.pcss.pl) portal.

These are the most prominent ones:

  • noticeable improvement of capability and stability of whole portal activities,
  • change in the way of transcription edition history is stored,
  • import of existing DjVu publication on the basis of the OAI identifier (this feature is described in an end-user documentation),
  • batch OCR for all files in the project,
  • notifications showing whether changed performed in transcription editor were saved,
  • many minor improvements and bug fixes reported by users,
  • the first version of documentation for users has been published (http://confluence.man.poznan.pl/community/display/WLT).

A few months passed since the BETA release of VTL. We would like to thank everyone for their feedback ;-). After the initial release it became clear that serious changes must be done in the portal engine. The most important was the change in the way transcription is represented and stored in database. This was a very significant thing but it resulted in a significant performance and stability improvement.

In the near future two new functions will be added:

  • export of project results in EPUB format,
  • the possibility to upload TIFF files into the project (they will be automatically converted to  PNG file in 300 DPI).

Authors of the post: Bogna Wróż, Adam Dudczak

Culture 2.0: Digital Archives – Tool Shop

On October 26-27, Polish National Audiovisual Institute (NiNA) organized the annual Culture 2.0 conference/festival. The attendants were given a chance to participate in a range lectures, workshops, games and demonstration. Full conference programme is available at the conference website. PSNC was the event’s partner,  Platon TV recorded the event, and the Digital Libraries Team was responsible for operating the “Digital Archives Tool Shop”.

What was the idea behind the Tool Shop? Digital libraries, archives and museums are usually associated with big institutions and their priceless, historical collections. But each one of us can stumble upon some family mementoes – old documents, photos or postcards – hidden in a long-forgotten drawer, but worthy of preservation and display. Our goal was to show the visitors how to create a digital (e.g. family) archive using widely available tools: a simple scanner and camera, open source software, and how to make it accessible online in accordance with digital librarianship canons and guidelines. The Tool Shop consisted of three stands: “Scanning and Processing”, “Tran2|>rip>ion”, and “Let Everyone See!”

fot. Justyna Walkowska

At the first stand (Scanning and Processing) the visitor were invited to digitize materials they brought from home. We presented the scanning process and its result, and explained how the quality of the result can be improved after it finds it way to a computer disk. All tasks at this stage were performed using the DigitLab system, with tools such as ScanTailor, gScan2PDF, Tesseract or SimpleScan. We treated the taks as a kind of exam for DigitLab, and we think it passed with flying colours. Direct contact with users is a great opportunity for every tool creator. The comments and suggestions we received will be reflected in the next release of the system.

Stand no. 2, the one with the peculiar name (Tran2|>rip>ion), presented the “Virtual Transcription Laboratory”. VTL is a portal which allows users to create full-text versions (transcriptions) of textual documents. As demonstrated at the first stand, the result of a digitization process is a graphics file – the digital representation of the document that was scanned. However, it does not contain the text of the document in a form understandable for the computer. The textual contents are necessary to create effective search mechanisms, to enhance the document’s visibility online, and to open new research possibilites. Using VTL, the conference visitors were able to automatically convert their scans to digital text in the OCR (Optical Character Recognition) process. VLT also makes it possible for users to co-edit the automatically recognized text, correcting any programme’s errors. VTL brings together automatic and crowdsourcing methods, thanks to which librarians, researchers and hobbyists can create high-quality textual representations of historical documents.

At the last stand the visitors were encouraged to make their digitized resources available online in the same way  professional librarians do. We tought them how to create a private archive using tools such as Omeka, and also presented the publication process of the biggest Polish digital libraries (which mostly use our dLibra software). As the next step, we explained how to check who is linking to our online resources and how to monitor their usage with free tools. A significant number of visitors had not heard about the  Digital Libraries Federation or Europeana, so we put some effort into describing those portals’ functions and goals.

For us this event offered a priceless opportunity to test our solutions in direct interactions with the users. Those were two very busy days, and unfortunately we did have much time to participate in lectures or workshops happening in the same place. We did manage to look around Level 2.0 (that is the 2nd floor on which we were located), where different installations were presented. One of our favourites was Waldemar Węgrzyn’s Electrolibrary in which a traditional book was used as the interface to an enriched electronic version.

Electrolibrary from Waldek Wegrzyn on Vimeo.

This short post is far from being a complete description of what conference participants were able to see. We hope that the recorded lectures will be made available soon, giving us the chance to catch up and see what we missed. 😉