PSNC will be a co-organizer of the Polish edition of Hack4Europe! More details: http://hack4europepl.eventbrite.com/
After the Europeana Open Culture Conference in 2010 we started cooperation with Europeana on a prototype use of the Europeana API in some of our services. After some initial discussions we decided to develop two widgets based on the API: one for the Polish Digital Libraries Federation (DLF) and the other one for the Digital Library of Wielkopolska (DLW).
DLF is a Polish metadata aggregator which harvests information from around 60 digital libraries. It currently provides information about more than 550,000 objects, and this information is also contributed to Europeana. DLW, on the other hand, is the largest Polish digital library. It holds around 130,000 digital objects, mostly national and local cultural heritage from dozens of memory institutions in the Wielkopolska (Greater Poland) region. DLW contributes its metadata to the Digital Libraries Federation.
We wanted to use the Europeana API to provide easier access to European cultural heritage artifacts for users of Polish digital libraries without forcing the users to change their usual workflow. Therefore we have made some initial assumptions about the workflow. We assumed that search in aggregated metadata is the main DLF functionality for the majority of end users. Each displayed DLF search result contains only a few elements of the harvested metadata and redirects the user to full information in the source digital library (e.g. in DLW). The functionality left to the source digital library is to display the full metadata record and to give access to the content of the digital object.
Finally, we decided to achieve this aim by enriching the information presented to DLF and DLW users with links to additional objects available via Europeana. In practice this means placing widgets based on the Europeana API on the DLF search results page and on the DLW full metadata record page.
Further analysis focused on technical aspects. The Europeana API is an Open Search protocol interface, so an input query is needed to get any results. We assumed that for the DLF widget the input will be the query submitted to the DLF by the user, and for the DLW widget the query will be built from selected elements of the particular metadata record displayed by the user.
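The two kinds of input query can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the URL template, the host name and the parameter names are hypothetical placeholders in the style of an Open Search description document, not the real Europeana endpoint, and the choice of the title and subject fields for the DLW query is likewise an assumption.

```python
import re
from urllib.parse import quote_plus

# Hypothetical URL template in the style of an Open Search description document.
# The host and parameter names are placeholders, not the real Europeana endpoint.
TEMPLATE = "https://api.example.org/opensearch?q={searchTerms}&page={startPage?}"


def build_search_url(template, **params):
    """Fill an Open Search URL template; '{name?}' marks an optional parameter."""

    def substitute(match):
        name, optional = match.group(1), match.group(2)
        value = params.get(name)
        if value is None:
            if optional:
                return ""  # optional parameters may be left empty
            raise KeyError(f"missing required parameter: {name}")
        return quote_plus(str(value))  # percent-encode, e.g. Polish characters

    return re.sub(r"\{(\w+)(\?)?\}", substitute, template)


def query_from_record(record, fields=("title", "subject")):
    """Build a DLW-style query from selected metadata elements of one record."""
    return " ".join(record[f] for f in fields if record.get(f))


# DLF case: the user's own query is passed through unchanged.
dlf_url = build_search_url(TEMPLATE, searchTerms="Boże Narodzenie")

# DLW case: the query is assembled from the displayed metadata record.
record = {"title": "Memoirs", "subject": "occupation", "creator": "unknown"}
dlw_url = build_search_url(TEMPLATE, searchTerms=query_from_record(record))
```

In both cases the widget only has to produce a query string; everything else is handled by the same template-filling step.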
As the metadata from DLF is visible in Europeana, we had to take into account that the DLF database is updated every night, while the DLF-to-Europeana data transfer in practice takes place every three months. As a result, DLF is the more up-to-date source for searching the metadata of Polish digital libraries. On the other hand, Europeana contains far more information than the Federation. The final decision was to join the data from Europeana and the DLF at runtime: when preparing the final set of information to be shown to the user, the results from Europeana should not include data that originates from DLF, as this data should be taken directly from DLF. Another issue was related to cross-language searching. We decided that the subject element from the DLW metadata records will be translated with Google Translate into English, Spanish, German and French before it is sent by the widget as a query to the Europeana API.
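The runtime join described above can be illustrated with a minimal Python sketch. It assumes that every Europeana result carries a field naming the aggregator it came from; the field name `provider`, the value `"DLF"` and the sample records are all hypothetical, and the production services may identify DLF records differently.

```python
def join_results(dlf_results, europeana_results, dlf_provider="DLF"):
    """Combine both result sets, dropping Europeana records that came from DLF.

    Records originating from DLF are taken from DLF itself, because its
    database is fresher than the copy held by Europeana.
    """
    external = [r for r in europeana_results if r.get("provider") != dlf_provider]
    return {"federation": list(dlf_results), "europeana": external}


# Hypothetical sample data, for illustration only.
dlf_hits = [{"title": "Kolędy polskie", "provider": "DLF"}]
europeana_hits = [
    {"title": "Kolędy polskie", "provider": "DLF"},       # stale copy, dropped
    {"title": "Weihnachtslieder", "provider": "Other"},   # kept
]

joined = join_results(dlf_hits, europeana_hits)
```

Filtering on the Europeana side, rather than de-duplicating after the fact, keeps the two result lists independent, so each can be displayed in its own section of the page.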
The final result of our technical discussions was the architecture presented in the image below.
Figure 1. Final architecture of Europeana API usage by the Digital Libraries Federation and the Digital Library of Wielkopolska.
As you can see in Figure 1, we try to integrate the functionalities provided by three services – Europeana, Digital Libraries Federation and Digital Library of Wielkopolska:
- Europeana exposes an Open Search API.
- Digital Libraries Federation uses the Europeana API to provide search results from Europeana together with search results from the Federation. Those results are presented together on the Digital Libraries Federation website, as you can see in Figure 2.
Figure 2. DLF’s search results page with results from Europeana [source].
Moreover, the Federation exposes two Open Search APIs for external services. One is the Federation's own API and the other is a proxy to the Europeana API intended for use by Polish digital libraries. The main reason for proxying the Europeana API through the DLF is to simplify the development and use of the widget prepared by the Federation for Polish digital libraries.
- This widget is embedded in the pages presenting metadata records of digital objects published by the Digital Library of Wielkopolska. The widget extracts parts of the metadata, translates them on the browser side with the Google Translate service and sends the translated metadata together with the OAI identifier of the digital object to both Open Search APIs exposed by the Federation. After the widget has processed the responses, the search results are presented as part of the page with the digital object metadata. You can see an example of such results in the left column in Figure 3.
Figure 3. Search results from Europeana and DLF on the Digital Library of Wielkopolska site [source].
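The widget flow described in the last bullet (extract, translate, query both APIs) can be sketched as follows. The real widget runs in the browser; this Python version only illustrates the logic. The `translate` and `search` callables, the record field names and the `exclude_id` parameter are hypothetical stand-ins, not the actual Google Translate or Federation interfaces.

```python
TARGET_LANGUAGES = ("en", "es", "de", "fr")  # languages named in the article


def build_queries(record, translate):
    """Translate the record's subject into each target language, without repeats."""
    queries = []
    for lang in TARGET_LANGUAGES:
        translated = translate(record["subject"], lang)
        if translated and translated not in queries:
            queries.append(translated)
    return queries


def related_objects(record, translate, search_apis):
    """Send every translated query, with the record's OAI identifier, to each API."""
    queries = build_queries(record, translate)
    results = {}
    for name, search in search_apis.items():
        hits = []
        for query in queries:
            hits.extend(search(query, exclude_id=record["oai_id"]))
        results[name] = hits
    return results


# Stand-ins for illustration: a lookup table instead of a translation service,
# and a fake search function that simply echoes its arguments.
FAKE_TRANSLATIONS = {"en": "Christmas", "es": "Navidad", "de": "Weihnachten", "fr": "Noël"}


def fake_translate(text, lang):
    return FAKE_TRANSLATIONS[lang]


def fake_search(query, exclude_id):
    return [f"{query} (excluding {exclude_id})"]


record = {"subject": "Boże Narodzenie", "oai_id": "oai:example:1"}
found = related_objects(
    record, fake_translate, {"federation": fake_search, "europeana_proxy": fake_search}
)
```

Passing the translation and search functions in as arguments keeps the sketch independent of any particular service and makes the flow easy to test with stand-ins like the ones above.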
If you would like to see some live examples, you can try the following links:
- search results for Boże Narodzenie in DLF with search results from Europeana (Boże Narodzenie means Christmas in Polish),
- search results for Christmas in DLF with search results from Europeana,
- memoirs of the German occupation period in DLW with search results from Europeana and DLF on the left side,
- doctoral thesis on hepatitis C in DLW with search results from Europeana and DLF on the left side.
The design and implementation of both widgets and the other necessary code took about 10 person-days of work by a skilled programmer. The API-based widgets were first deployed in a test environment and reviewed with the Europeana team, which was also responsible for providing technical information about the API. Then, on 22 and 23 December 2010, the widgets were deployed in the production environment.
While working on the design, implementation and deployment of the widgets based on the Europeana API, we were hoping to contribute to the following (expected) user flow (see Figure 4):
Figure 4. Expected user flow after widgets’ deployment.
We assumed that the widgets will encourage their users to visit Europeana. In return, the growing number of Europeana users should eventually bring more users to DLF and DLW. As the widgets were deployed just two months ago, it is not yet possible to observe such an increase in traffic. Nevertheless, at the beginning of March we contacted the Europeana team and asked for statistics on the traffic coming to Europeana from Poland and on the traffic attracted to Europeana by the Digital Libraries Federation and the Digital Library of Wielkopolska. For the purpose of this article, we compared those statistics with our own data. All statistics were gathered with Google Analytics.
First, let's try to find out whether the widgets were useful for end users. Both DLF and DLW get about 70,000 visits each month. Figure 5 compares the percentage share of three types of such visits for DLF (blue bars) and DLW (red bars). The middle pair of bars (marked as 100%) represents visits during which a visitor displayed a page with the widgets installed. Those pages (DLF: search results page; DLW: metadata record page) are so crucial to the functionality provided by each service that we assumed that any visit skipping those pages must have been somehow accidental. The pair of bars on the left shows the number of all visits. As you can see, around one third of all visits do not reach the crucial functionality of the website. This is certainly something that could be improved. Coming back to the Europeana API widgets: the last pair of bars in Figure 5 shows the percentage of users who reached the page with the widget and decided to click on a digital object link provided by Europeana via the API. As you can see, 7.5% of the Federation users went to Europeana and 0.67% of DLW users did the same.
Figure 5. Comparison of user visits in DLF and DLW.
At this stage it is hard to estimate whether this is satisfactory, but when we think of the additional links as some kind of targeted advertisement placed on a cultural heritage website, the results may be seen as quite good.
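To put these percentages into absolute terms, a rough back-of-the-envelope estimate can be made from the rounded figures quoted above. The sketch below assumes that each service gets about 70,000 visits per month and that about two thirds of those visits reach a page with a widget in both services; these are approximations read from the text, not exact Google Analytics values.

```python
def estimated_clicks(monthly_visits, share_reaching_widget, click_through_rate):
    """Estimate monthly click-throughs to Europeana from rounded input figures."""
    return round(monthly_visits * share_reaching_widget * click_through_rate)


# Rounded inputs taken from the text; the two-thirds share is an approximation.
dlf_clicks = estimated_clicks(70_000, 2 / 3, 0.075)   # 7.5% of DLF widget-page visits
dlw_clicks = estimated_clicks(70_000, 2 / 3, 0.0067)  # 0.67% of DLW widget-page visits

print(dlf_clicks, dlw_clicks)
```

With these inputs the estimate comes to roughly 3,500 visits per month sent to Europeana from DLF and roughly 300 from DLW. Since the inputs are rounded, the result should be read only as an order of magnitude.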
Another interesting analysis is the comparison of the traffic going out to Europeana from those two services with the traffic coming the other way. This is shown in Figure 6. First, let us take a look at the traffic going out to Europeana (blue bars). As mentioned earlier, the widgets were deployed in the second half of December. In November 2010 there was no traffic from our services to Europeana at all. In December we sent around 1,000 visits, and in January 2011 almost 3,800. This again suggests that our users found the widgets and the data provided by Europeana useful.
Figure 6. Comparison of traffic to and from Europeana over time.
At this moment it is hard to say whether the number of users coming from Europeana (red bars) to DLF and DLW changed after the widgets were deployed. The number of users in November 2010 and January 2011 is quite similar. The December 2010 traffic is significantly smaller, but this is probably due to the Christmas break. Another interesting figure is the number of new users coming from Europeana each month. New users are users who have never visited the service before. It seems that Europeana provides us with a constant flow of new users, around 400 each month.
The last statistic that we would like to present is a simple comparison of traffic sources for the Digital Library of Wielkopolska (the largest Polish digital library). Traffic sources are the ways in which users reach our service.
The table above shows all traffic sources for the DLW in January 2011 which generated at least 1% of overall January traffic. The first two results are quite obvious: direct access (for example a bookmark in the web browser) and access from Google search results. But positions 3 and 4 are very interesting. Third place is taken by the national aggregator, the Federation, and fourth by the European aggregator, Europeana. The last three positions are taken by two genealogical services and Polish Wikipedia. These statistics suggest that the model of multilevel aggregation described in the Europeana Content Strategy is very good at attracting users to the participating digital libraries.
In this article we have described our experience with the Europeana API. It offers very interesting possibilities and it is easy to use (as it is based on the well-known Open Search standard). We hope that there will be more such mechanisms in Europeana in the near future, as they make it possible to move the knowledge about European cultural heritage from metadata aggregation to services integration. And this seems to be the direction of evolution desired by the users.
The presentation accompanying this article can be found at http://dl.psnc.pl/biblioteka/dlibra/publication/349/content