gMan: From research data repositories to virtual research environments: (re-)activating archival knowledge for the Humanities
Studies of humanities scholars have demonstrated that they continue to rely on primary materials held in dedicated collections in special places, in repositories and archives, and it is in repositories (and archives) that the scholar carries out the work of assessing these source materials. In the UK and elsewhere there are significant digitisation programmes for humanities material, which to an increasing extent are able to provide the humanities researcher with digital surrogates for the physical archives. In some cases major memory institutions are systematically digitising the material for which they are responsible, but nevertheless digitisation is on the whole a somewhat piecemeal affair, and is carried out to different extents (e.g. image only or image plus OCR) and quality levels, depending on the availability of funds. Individual projects may address a particular set of archival material relating to a particular research topic, resulting in numerous dispersed (albeit usually online) resources, developed using different technologies and standards. Archival material is thus made easier to access, creating new possibilities for the researcher, but on the other hand this very availability raises new issues.

Our work sets out to investigate how (digital) repository content can be delivered to humanities researchers more effectively, independently of the location and implementation of that content, and with special means provided for customising the retrieval, management and manipulation of this information. Traditional finding aids are to be complemented by more sophisticated retrieval means. In particular, the personal copy of a finding aid that is often quoted as an important prerequisite for specialised research in archives is complemented by the ability to create on demand relevance indexes on the unstructured resources, and to combine the resources in new ways. We consider this to be the grand integration challenge for research repositories in the humanities, delivering data-driven humanities.
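To illustrate what an on-demand relevance index over unstructured resources could look like, here is a minimal sketch in Python. It is not gCube's indexing service or its API. The function names (`tokenize`, `build_index`, `search`) and the simple term-frequency / inverse-document-frequency scoring are assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index from a dict mapping document id to raw text.

    Returns (postings, n_docs), where postings maps each term to a dict
    of {document id: term frequency}.
    """
    postings = defaultdict(dict)
    for doc_id, text in documents.items():
        for term, count in Counter(tokenize(text)).items():
            postings[term][doc_id] = count
    return postings, len(documents)


def search(query, postings, n_docs):
    """Return document ids ranked by a simple TF-IDF relevance score."""
    scores = Counter()
    for term in tokenize(query):
        matches = postings.get(term, {})
        if not matches:
            continue  # term does not occur in any document
        # Rarer terms get a higher weight (illustrative IDF variant).
        idf = math.log(1 + n_docs / len(matches))
        for doc_id, tf in matches.items():
            scores[doc_id] += tf * idf
    return [doc_id for doc_id, _ in scores.most_common()]
```

A researcher could build such an index over a personal selection of transcribed sources with `build_index` and then query it with `search`. In a real deployment the indexing of large collections would be handed over to the infrastructure rather than run in a single process.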
Our starting point was D4Science, a production-level infrastructure that mainly serves scientific communities but is not biased towards any particular discipline. It has great potential for meeting the needs we have identified for building virtual research environments (VREs) by combining repository resources.
gCube, on which the infrastructure is based, is a distributed, service-based system designed to support the full life-cycle of modern research, with particular emphasis on application-level requirements for information and knowledge management. In gCube, VREs can be interactively designed and configured on demand, and the system is responsible for their physical deployment and correct operation in the infrastructure. The infrastructure's computational resources are used for demanding tasks such as on-demand indexing of large collections.
We have been investigating how humanities repository resources can be imported into gCube, and how the VRE can be enhanced with further services according to the needs of the targeted research community. The gCube system is designed for extensibility; communities are encouraged to tailor the functionality to their particular needs by developing new services or plugins. We have focused on importing existing humanities research collections, of which there are many, often held in databases hidden behind web front ends. gCube provides a well-defined archival import service, which is of great use in the humanities, where many collections have been produced by various digitisation and online analysis projects. Once such projects end, it is often difficult to reuse these collections for future collaborative research. In gMan, we have shown a possible way forward based on scientific research infrastructures.
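As a rough illustration of the kind of preparation involved in moving a collection out of a database behind a web front end, the following Python sketch flattens database rows into generic records with an identifier, a content field and free-form metadata. It does not show the gCube archival import service or the gMan import scripts. The function `export_collection`, the record layout, and the table and column names used with it are hypothetical.

```python
import sqlite3


def export_collection(conn, table, id_column, text_column):
    """Turn every row of a database table into a generic import record.

    Hypothetical record layout (not a gCube format):
      id       -- identifier of the item, as a string
      content  -- the main textual content of the item
      metadata -- all remaining columns, kept as key/value pairs
    """
    conn.row_factory = sqlite3.Row  # rows become accessible by column name
    records = []
    # The table name is interpolated directly, so it must come from a
    # trusted source (for example the project's own configuration).
    for row in conn.execute(f'SELECT * FROM "{table}"'):
        fields = dict(row)
        records.append({
            "id": str(fields.pop(id_column)),
            "content": fields.pop(text_column) or "",
            "metadata": fields,
        })
    return records
```

Records of this shape could then be serialised (for instance as JSON) and handed to whatever import mechanism the target infrastructure provides. Keeping the unmapped columns as metadata avoids losing project-specific information during the transfer.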
Some of the services are better explained in the screencast we produced: http://gman.cerch.kcl.ac.uk/?p=105
Some example collections we used to show the potential of the gMan services: http://fresh.cerch.kcl.ac.uk/collections/
Link to technical documentation of gCube: https://wiki.gcore.research-infrastructures.eu/documentation/index.php/GCube_Wiki
The import scripts we used can be reused directly from within the gMan infrastructure. If you have further questions about this, please contact us directly.
Date prototype was launched at Digital Humanities 2010: 08/07/2010
Website of the gMan service: http://portal.d4science.research-infrastructures.eu/ If you want to try it out, please contact us. We had to protect it with a password because it runs on expensive infrastructure, but access is generally available to anybody.
Project Team Names, Emails and Organisations: http://gman.cerch.kcl.ac.uk/about
If you are interested in details and further collaboration, please contact Tobias Blanke.