All posts by Adam Dudczak

Workshop on “Open data and re-use of public sector information”

I had a chance to participate in the first day of the Digital Agenda Assembly, a conference organized by the European Commission to summarize developments in the European digital economy. During the two days of the conference (16-17 June), participants could take part in a series of parallel workshops. The topics were related to the seven pillars of the Digital Agenda for Europe: stimulating the growth of a single European digital market, interoperability and standards, trust and security, fast and ultra-fast Internet access, research and innovation, enhancing digital literacy, skills and inclusion, and ICT-enabled benefits for EU society.

On the first day of the conference I attended the “Open data and re-use of public sector information” workshop. The agenda featured case studies from several member states, showing that open government data can benefit all interested parties, including citizens, public administration and commercial entities. Discussions covered the financial models behind releasing public data as well as the boundaries of transparency in public administration (mainly in the context of security).

Among the case studies presented during the workshop, I would like to draw your attention to the talk given by Katalin Gallyas, who showed how the Amsterdam authorities approach problems related to open data and its reuse. Katalin talked about various projects and activities, including Open Cities and the Open Government Data Initiative. I hope that this short summary will encourage you to take a closer look at the slides from her talk.

Apart from case studies and discussions, participants had a chance to see the results of the Open Data Challenge and of the Hack4Europe! series of hackathons. I will come back to Hack4Europe! in my next post. The Open Data Challenge was a competition organized by the Open Knowledge Foundation in cooperation with multiple institutions from all around Europe. Participants competed in several categories, including best application using public open data, best idea for an application and best visualization. The organizers received 430 entries from 24 member states, and there were 20 000 euros in prizes to win. I encourage you all to watch the short video which summarizes the results of the competition.

Open Data Challenge from Open Knowledge Foundation on Vimeo.

To summarize my impressions after the workshop: open data can be really valuable for various commercial entities. This opportunity has already been noticed by IT giants like Google (with its http://opendatakit.org/ and involvement in the Open Data Challenge) and Microsoft (Open Government Data Initiative). I hope that the involvement of the private sector will help to accelerate change in the public sector.

Hack4Europe! summary

During the second week of June the Europeana Foundation organized a series of four hackathons under the common slogan Hack4Europe! The events were organized by local partners in Poznań, London, Barcelona and Stockholm. The Polish hackathon was organized by Poznań Supercomputing and Networking Center and the Kórnik Library of the Polish Academy of Sciences.

All hackathons aimed at developing innovative applications built on top of data about 18 million cultural heritage objects collected by Europeana. Developers competed in four categories: application with the greatest commercial potential, application with the greatest potential for inclusion, most innovative application and the audience award (this one was voted on by the developers).

In total, the Hack4Europe! events gathered 85 developers, who prepared 48 prototypes. The most common development themes included applications for mobile devices, applications using the potential of social networks, solutions allowing users to curate content, integration of Europeana content into games, connecting cultural heritage data with Wikipedia and, finally, visualizations showing how various objects are related.

The Polish edition of Hack4Europe! was held on 7-8 June in the Działyński Palace, in the heart of Poznań’s Stary Rynek. The 18 participants, who came from all over Poland, managed to create 8 prototypes during two very intensive days. Three of them were awarded by the jury:

  • “Art4Europe” was awarded as the application with the greatest commercial potential. The project was created by Jakub Jurkiewicz, Marcin Szajek, Jakub Porzuczek and Tomasz Grzywalski, who represented ITraff Technology.
  • Zbigniew Tenerowicz and Piotr Kaleta (students from Poznań University of Technology) created the most innovative application, called “Europeana Field Game”.
  • The winner of the greatest social inclusion category was Hackmemory, a simple game developed by Bartek Indycki and Darek Walczak. This game also won the audience award.

The rest of the prototypes were tools integrating the Europeana API with Google Maps and with MediaWiki. It is also worth mentioning that the awards for the best projects were funded by Speed Up Group, which sponsored the Poznań Hack4Europe!

The authors of the winner in the greatest commercial potential category created an application which allows its users to identify a given artwork using a picture taken with the camera of their mobile phone. Art4Europe identifies the object and presents its description; it can also translate the description into any European language and read it aloud using speech synthesis. Users might also be interested in buying reproductions of, or books about, the artwork.

In “Europeana Field Game” the user can “carry” elements, pin them to a location and see elements pinned to a location by other users. The game encourages geotagging by introducing quests to accomplish and interaction with other users. The geotags created by players can later be used to suggest interesting Europeana content to everyone based on location data.

The last winner, Hackmemory (http://hackmemory.drivent.pl/memory/start), is a simple educational application for kids and adults based on the well-known memory games. Players have to find two exactly matching pictures. After finding each pair, the user can read about the content of the picture. Users can create their own quiz and easily share it with friends using various social media. The content of the puzzle comes from Europeana and is filtered by the creator of the quiz (e.g. a teacher).

The Polish winners took part in a second round, competing with the applications awarded during the hackathons in Barcelona, Stockholm and London. The results of this final round were announced a few days after the end of the last hackathon. We are very glad to announce that Art4Europe won once again! Apart from the Polish project, three other applications were also awarded:

  • Casual Creator (developed during the London hackathon): an application which facilitates the use of pictures of cultural heritage objects in teaching.
  • Time Mash (Stockholm): a fully functional, geolocation-aware search of Europeana for mobile phones. Users can take photos and associate them with existing Europeana objects. A built-in function overlays new pictures on Europeana pictures, creating a seamless “Then-Now” effect. The new photos are uploaded with the current GPS position, so the app can also function as a geotagging tool for Europeana.
  • Timebook (Barcelona): an app that integrates content from Europeana and DBpedia and presents it in an easy-to-use format with, for instance, posts for famous quotes, friend statuses for influential persons and photos of paintings.

The Hack4Europe! awards ceremony took place on 16 June during the Digital Agenda Assembly in Brussels. The winners received their prizes from European Commission Vice-President Neelie Kroes.

Tesseract 3.0 installation on Ubuntu 10.10 server

Tesseract is an optical character recognition (OCR) engine originally developed by Hewlett Packard. In 2005 it was open sourced under the Apache license, and its development is now supported by Google. Version 3.0 was released in September 2010; among other things, it offers support for the Polish language.

The wiki on the Tesseract website is a bit messy, which is why I decided to describe my experience with building and installing Tesseract 3.0. I was working on Ubuntu 10.10 server edition, deployed on a virtual machine created using Oracle VirtualBox.

First, I installed build-essential and autoconf:

sudo apt-get install build-essential
sudo apt-get install autoconf

The next step, according to the Tesseract wiki, is to install the dependencies:

sudo apt-get install libpng12-dev
sudo apt-get install libjpeg62-dev
sudo apt-get install libtiff4-dev
sudo apt-get install zlib1g-dev

Please note that the name of the zlib1g-dev package is misspelled in the wiki.

I tried to install the libleptonica package (Leptonica is also a required dependency) from the default Ubuntu repositories, but Tesseract’s ./configure script did not recognize that it was installed. To cope with that, I downloaded the sources of Leptonica 1.6.7 from its Google Code website and then followed a rather standard build process:

./configure
make
sudo make install
sudo ldconfig
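
To confirm that the freshly built Leptonica is visible to the dynamic linker, you can search the cache listing printed by ldconfig -p. The snippet below is only a sketch: has_lib is a hypothetical helper name, and liblept is assumed to be the name of the shared library installed by the Leptonica build.

```shell
# has_lib PREFIX
# Reads an "ldconfig -p" listing from standard input and succeeds if it
# contains a shared library whose file name starts with PREFIX.
# (has_lib is a hypothetical helper, not part of any package.)
#
# Example:
#   ldconfig -p | has_lib liblept && echo "Leptonica is visible to the linker"
has_lib() {
    grep -q "^[[:space:]]*$1\.so"
}
```

If the check fails, the library was probably installed into a directory that the linker does not search, or ldconfig has not been run yet.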

The next step was downloading tesseract-3.00.tar.gz from the Tesseract project website. Uncompress the archive, go to the tesseract-3.0 directory and invoke:

./runautoconf
./configure

After invoking ./configure you should check in config_auto.h whether the dependencies were recognised by the ./configure script. The header file should contain a #define for each of HAVE_LIBLEPT, HAVE_LIBPNG, HAVE_LIBTIFF, HAVE_LIBJPEG and HAVE_ZLIB.
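
Checking the header by eye is easy to get wrong, so the lookup can be scripted. This is a minimal sketch: check_config is a hypothetical helper name, and it assumes that a recognised dependency appears as a plain "#define NAME" line, while an unrecognised one is absent or commented out.

```shell
# check_config [HEADER]
# Reports, for each dependency macro named above, whether HEADER
# (default: config_auto.h in the current directory) defines it.
# Returns non-zero if at least one macro is missing.
# (check_config is a hypothetical helper, not part of Tesseract.)
#
# Example, run in the Tesseract source directory after ./configure:
#   check_config config_auto.h
check_config() {
    header="${1:-config_auto.h}"
    missing=0
    for macro in HAVE_LIBLEPT HAVE_LIBPNG HAVE_LIBTIFF HAVE_LIBJPEG HAVE_ZLIB; do
        # Match "#define MACRO" at the start of a line, so that lines such as
        # "/* #undef MACRO */" are not counted as definitions.
        if grep -Eq "^[[:space:]]*#[[:space:]]*define[[:space:]]+${macro}([[:space:]]|\$)" "$header"; then
            echo "found:   $macro"
        else
            echo "MISSING: $macro"
            missing=1
        fi
    done
    return "$missing"
}
```

Any macro reported as MISSING points to a dependency that ./configure did not find; install it (or run ldconfig) and run ./configure again before building.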

make
sudo make install
sudo ldconfig

Without ldconfig you might experience problems with launching Tesseract.

Download the languages of your choice from the Tesseract website, uncompress them and place them in your tessdata folder (by default /usr/local/share/tessdata).
Now run the OCR using:

# Tesseract appends .txt to the output name, so pass "out" to get out.txt
tesseract phototest.tiff out -l eng
more out.txt
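
To process several images in one go, the same command can be wrapped in a small loop. The sketch below rests on a few assumptions: lang_installed and ocr_all are hypothetical helper names, the language data is assumed to be stored as LANG.traineddata files, and TESSDATA_DIR is a variable introduced here only to hold the default path mentioned above.

```shell
# Default data directory named in the text; set TESSDATA_DIR beforehand
# if your installation uses a different location.
TESSDATA_DIR="${TESSDATA_DIR:-/usr/local/share/tessdata}"

# lang_installed LANG
# Succeeds if LANG.traineddata is present in the data directory.
lang_installed() {
    [ -f "${TESSDATA_DIR}/$1.traineddata" ]
}

# ocr_all LANG IMAGE...
# Runs OCR on every IMAGE and writes the text next to it, using the image
# name without its extension as the output name (page.tiff -> page.txt).
#
# Example:
#   ocr_all pol scan1.tiff scan2.tiff
ocr_all() {
    lang="$1"
    shift
    if ! lang_installed "$lang"; then
        echo "language data for '$lang' not found in $TESSDATA_DIR" >&2
        return 1
    fi
    for image in "$@"; do
        tesseract "$image" "${image%.*}" -l "$lang"
    done
}
```

Checking for the language file first gives a clear message when the data for the requested language has not been copied into the tessdata folder.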

I hope this will be helpful.

More good news from Turkey

In our previous post we mentioned the great success of our Turkish colleagues, who managed to launch an e-learning portal and skills certification in the area of digitization and digital libraries. This development was based on ACCESS IT materials initially developed by the PSNC Digital Libraries Team.

The educational program was launched on 10 January. It must have been a very busy period for our Turkish colleagues: yesterday we received news that the Turkish portal now has 895 registered users! We hope to hear about the anatomy of such a great educational success during the ACCESS IT final conference, which will be held at the end of March in Istanbul.

ACCESS IT online course to be launched in Turkey

ACCESS IT e-learning courses (for more information see our previous post) were developed to facilitate education in the area of digitisation and digital libraries in Greece, Serbia and Turkey. The reference materials developed by PSNC, released in October 2010, were deployed, translated and customized by the ACCESS IT partners. We are very pleased to announce that this work has been finished in Turkey.

Prof. Bülent Yılmaz (coordinator of the ACCESS IT project in Turkey) from Hacettepe University in Ankara informed us that the Turkish online education program will be launched on 10 January 2011 (next Monday). The Turkish portal is available here.

Most of the “Digital Repositories for small memory institutions” (DRMSI) course was translated into Turkish. In addition, the DRMSI course was divided into two smaller courses: “Digitisation” and “Digital Content management”. As a result, Turkish students can participate in three courses (the third one is the unchanged “Cooperation with Europeana”). Students who finish a given course and pass the final exam will obtain an ACCESS IT certificate signed by Hacettepe University.

The first edition of the course will be open until 6 March 2011.

i3 Conference: Internet – Infrastructures – Innovations

The “i3: Internet-Infrastructures-Innovations” conference took place in Wrocław on December 1-3, 2010. The conference was organized by the PIONIER Consortium. Poznań Supercomputing and Networking Center and Wrocław Center for Networking and Supercomputing played a significant role in organization of this year’s edition.

During the conference we had a chance to present two papers: “Technical Challenges Associated with Presentation of Digital Cultural Heritage in the Internet” and “Architecture and Protocols for Building a Knowledge System – PSNC Tasks in the SYNAT Project”.

The first paper gave an overview of the PSNC approach for the presentation of multi-domain digital objects over the Internet. The idea is to store various types of digital objects (e.g. texts, graphics, audio and video) using generic data model of the dLibra system and present the objects using personalized and dedicated web portals for different institutions, such as libraries, museums or oral history related institutes.

The second presentation, “Architecture and Protocols for Building a Knowledge System – PSNC Tasks in the SYNAT Project”, was a general overview of the work done by PSNC in the SYNAT project. The development of SYNAT began in August 2010, so it is still in an early phase. The main purpose of this presentation was to show our initial assumptions and gather feedback which might be useful in our further work.

A number of interesting presentations were given during the conference. There were three main session themes (“Internet”, “Infrastructures”, “Innovations”). Subjects within these themes included e-health, e-education, Future Internet and network safety.

During the regions session a representative of the IZIP company presented a Czech Electronic Health Record system. The system was commissioned by the biggest health insurer in the Czech Republic. It stores information about the patient’s health, diagnoses, ordered analyses and their results. This way the analyses are not repeated by every doctor the patient visits, which significantly limits the costs. Another important goal of the system is health service quality improvement thanks to easy access to the patient’s history, facilitating the decision-making process.

We found the following presentations particularly interesting:

  • “Open Administration Services in Urban Area”, given by Bartosz Lewandowski from PSNC. Bartosz described the open e-administration policy promoted by PSNC and the Municipal Office of Poznań.
  • “The Importance of Digitization and Archival Resources for the Development of New Social Initiatives”, given by Piotr Skałecki. Piotr presented digitization projects from the field of genealogy, e.g. the “Poznań Project: Marriage Database Search” which allows its users to search a base containing data about 540825 marriages contracted in the years 1820-1889 in the former Province of Posen. Also the projects of the Polish Genealogical Society were mentioned, among them Geneteka and the transcription of a document entitled “Polish Declarations of Admiration and Friendship for the United States”.
  • “The URBANCARD Service” – a presentation about the city card (URBANCARD) for which the Municipal Office of Wrocław was awarded the Innovation Pioneer prize. This presentation was given by Dariusz Jędryczek.

Most presentations are available online at the conference webpage (http://www.i3conference.net/online/program.php).

9th “Automatyzacja bibliotek publicznych” conference

On 25-26 November we had a chance to participate in the 9th edition of the “Automatyzacja bibliotek publicznych” (Automation of Public Libraries) conference. The conference is organized as a joint effort of the Warsaw Public Library, the Polish Librarians Association and the National Library of Poland.

This year’s conference was held under the slogan “Regional cooperation – strategy, tools and realisation”. Most of the presentations were strongly inspired by this theme, so we had a chance to hear about various approaches to the automation of public libraries, including the development of IT infrastructure, integrated library systems, models for successful cooperation between libraries and various aspects of education in the area of new technologies.

The conference program was filled with interesting presentations, including Edwin Bendyk’s truly inspiring “Between atoms and bits. Future of the digital book”. Apart from this one, we would like to highlight the presentations given by employees of the National Library of Poland (NLP):

  • Director Katarzyna Ślaska spoke about the competence centre for digitisation which was launched this year at the NLP.
  • Joanna Potęga and Dariusz Paradowski spoke about the digitisation workflow model developed at the NLP.

Also worth mentioning is the presentation by Magda Miller from the City Public Library in Gorlice, who spoke about building a consortium of local libraries led by her library.

The PSNC Digital Libraries Team presented the ACCESS IT e-learning materials. As the goal of ACCESS IT was to reach small and medium memory institutions, the materials may also be useful for Polish public libraries.