4. Tiers and Key Technologies of Digital Libraries

4.1 Tiers

Figure 3 shows the functional tiers of the entire system. The digital library framework comprises five major tiers. From the bottom up, they are the infrastructure device tier, the digital resource tier, the system management tier, the business logic tier, and the user interface tier. This multi-tier structure forms a controllable, self-adaptable digital library service system.

The infrastructure device tier provides the fundamental environment of the system platform. Its overall scale is determined by the size of the library and the number of users served. The system software environment and the hardware devices should be planned together, with an eye to the expansion of digital resources and the expected growth in users over time. Hardware construction covers the network environment, communication and transmission equipment, servers and storage equipment, computer room conditions, and system software, as well as associated concerns such as network security and system operation and maintenance. Based on an analysis of these key components, the infrastructure device tier is distributed across the computer nodes of the business and service integration subsystems or hosted in a trusted cloud service environment.

The digital resource tier includes information resources developed by libraries in accordance with standard specifications, acquired commercial databases, and other network information resources and open social resources. Because each business subsystem owns its own digital resources, software, and basic hardware devices, the digital resource tier is logically distributed across the business subsystems.


Figure 3. Function Tier Chart of Digital Libraries

The system management tier includes application system modules such as user authentication, resource processing, resource purchase, copyright protection, performance evaluation, resource scheduling, and system statistics. It is also logically distributed across the business subsystems, because it must directly manage the corresponding hardware devices and digital resources. In addition, certain functions of the system management tier reside in the service integration subsystems, such as the uniform authentication system, account management, the log system, resource distribution control, data resource scheduling, and secure data links.

Parallel to the system management tier is the business logic tier, which includes all the business functions and services of libraries. By function, the business logic tier covers heterogeneous retrieval, remote access, union catalog, resource guide, reference service, document delivery, information push, My Library, and various other services. The services provided by the business logic tier are specific user-oriented digital library service items, and they are also the basis for resource integration, information organization, and online service in digital libraries. The business logic tier sits directly above the digital resource tier and is closely coupled with it.

On top of the system framework is the user interface tier. It is the interface through which users access information resources and digital library services. User authentication and authorization are handled by a user management component in the system management tier. Once users have logged on, they can easily access the required resources and services and customize individualized pages without repeatedly logging on to different business subsystems.
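
The single log-on behavior described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the user management component issues an HMAC-signed token after one successful login and that every business subsystem verifies the token locally instead of asking for credentials again. The names SECRET_KEY, issue_token, and verify_token are hypothetical and are not part of the system described in this paper.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical shared secret known to the user management component and
# to every business subsystem (an assumption made for this sketch).
SECRET_KEY = b"demo-secret-not-for-production"


def issue_token(user_id: str, ttl_seconds: int = 3600,
                now: Optional[float] = None) -> str:
    """Issued once by the user management component after a successful login."""
    issued_at = time.time() if now is None else now
    payload = json.dumps({"user": user_id, "exp": issued_at + ttl_seconds})
    body = base64.urlsafe_b64encode(payload.encode()).decode()
    signature = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Called by any business subsystem; returns the user id, or None if invalid."""
    try:
        body, signature = token.rsplit(".", 1)
    except ValueError:
        return None  # no separator: not a token produced by issue_token
    expected = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison of the presented and the recomputed signature.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    try:
        claims = json.loads(base64.urlsafe_b64decode(body.encode()))
    except ValueError:
        return None
    current = time.time() if now is None else now
    if claims["exp"] < current:
        return None  # token has expired
    return claims["user"]
```

In this sketch a subsystem would call verify_token on each request and fall back to the central login page when it returns None. A real deployment would also need key distribution and token revocation, which are outside the scope of the sketch.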

4.2 Key Technologies

The key technologies needed to realize the overall framework described in this paper fall into the following five categories:

 (1) Master-slave distributed architecture. The business subsystem is the slave, and the service integration subsystem is the master of the entire system. The master holds the information on all registered nodes, the meta-information on business requirements, and the corresponding business subsystems. The master-slave framework is simple in structure and easy to implement, but it introduces a possible single point of failure (SPOF) [8]: a breakdown in the master could leave the entire system inoperable. To solve this problem, the master can adopt a hot standby or multi-machine backup mechanism. Because reads far outnumber writes in the business processes of a digital library system, hot standby combined with checkpoint-based log management is well suited to overcoming the SPOF weakness, so that the failure of a computing node in the service integration subsystem does not cause errors in the overall service.

 (2) One-stop authentication mechanism. The service integration subsystem provides a uniform user authentication mechanism. Once users have logged in to the system successfully, they can pass through the identity authentication processes of all business subsystems via the service integration subsystem. An API allows the service integration subsystem to perform secure calls and identity authentication, and ensures that all logged-in users can pass the trusted access control of the security mechanism. Any application system that calls an interface requiring identity authentication must follow the standard specifications defined by the service integration subsystem. In addition, one-stop authentication requires that all business subsystems conform to the authentication system of the service integration subsystem being called. In this way, seamless integration can be achieved effectively.

 (3) VPN technology. VPN is a widely used secure communication mechanism. VPN technology can build a secure virtual private cloud in an open Internet environment. When combined with the one-stop authentication mechanism, VPN technology can help users gain secure and easy access to resources and services provided by digital libraries over the Internet wherever they are.

 (4) Multi-channel business integration. The service integration subsystem offers multiple flexible methods of integration. For instance, users can create and customize mash-ups, and services such as REST, RSS, and OAI are integrated so that the scope and permissions of the integrated services can be controlled. In this way, security is ensured not only for the integrated service but also for the integrated results shared with cooperating parties.

 (5) Integration of data and results. When a new system joins the overall system by way of proxy users, the integration is non-invasive, so its metadata remain a black box, and the duplication of data and resources across subsystems produces overlapping query results. A result-matching and data-fusion mechanism is therefore adopted to optimize the query results and improve the user experience.
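
The hot standby and checkpoint-based log management mentioned in item (1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the project: the class MasterRegistry, the function recover_standby, the node names, and the endpoint URLs are all hypothetical. The master records every write in a log, periodically takes a checkpoint of its node registry, and a standby rebuilds the current state from the last checkpoint plus the log written since then.

```python
import json
from typing import Dict, List, Tuple

LogEntry = Tuple[str, str, str]  # (operation, node name, endpoint)


class MasterRegistry:
    """Registry of business subsystem nodes held by the master (hypothetical)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, str] = {}       # node name -> service endpoint
        self.log: List[LogEntry] = []         # writes since the last checkpoint
        self.checkpoint: str = json.dumps({})  # serialized snapshot of nodes

    def register(self, name: str, endpoint: str) -> None:
        # Log the write first, then apply it to the in-memory state.
        self.log.append(("register", name, endpoint))
        self.nodes[name] = endpoint

    def unregister(self, name: str) -> None:
        self.log.append(("unregister", name, ""))
        self.nodes.pop(name, None)

    def take_checkpoint(self) -> None:
        # Snapshot the full state; entries logged so far are no longer needed.
        self.checkpoint = json.dumps(self.nodes)
        self.log.clear()


def recover_standby(checkpoint: str, log: List[LogEntry]) -> MasterRegistry:
    """Rebuild the master state on a standby from a checkpoint plus the log."""
    standby = MasterRegistry()
    standby.nodes = json.loads(checkpoint)
    for operation, name, endpoint in log:
        if operation == "register":
            standby.nodes[name] = endpoint
        elif operation == "unregister":
            standby.nodes.pop(name, None)
    standby.checkpoint = checkpoint
    standby.log = list(log)
    return standby


# Usage: the standby reaches the same state as the master after replay.
master = MasterRegistry()
master.register("beijing", "http://bj.example/api")
master.register("shanghai", "http://sh.example/api")
master.take_checkpoint()
master.register("guangdong", "http://gd.example/api")
master.unregister("shanghai")
standby = recover_standby(master.checkpoint, master.log)
```

Because writes are rare in a read-heavy workload, the log between checkpoints stays short, which keeps the replay on the standby cheap. That is the property the text relies on.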
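
The result matching and data fusion described in item (5) can be illustrated with a short sketch. The paper does not specify a matching rule, so the sketch assumes that two records describe the same resource when their normalized title and author agree. The record fields (title, author, source) and the function names normalize and fuse_results are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


def fuse_results(result_lists: Iterable[List[Dict[str, str]]]) -> List[dict]:
    """Merge per-subsystem result lists, collapsing records that match."""
    merged: Dict[Tuple[str, str], dict] = {}
    for results in result_lists:
        for record in results:
            key = (normalize(record["title"]), normalize(record["author"]))
            if key not in merged:
                # First occurrence: keep its display form, start a source list.
                merged[key] = {
                    "title": record["title"],
                    "author": record["author"],
                    "sources": [],
                }
            if record["source"] not in merged[key]["sources"]:
                merged[key]["sources"].append(record["source"])
    # Dictionaries preserve insertion order, so first-seen order is kept.
    return list(merged.values())


# Usage: two subsystems return overlapping hits for the same query.
subsystem_a = [
    {"title": "Cloud Computing", "author": "Zhang Q", "source": "A"},
    {"title": "Digital Libraries", "author": "Wang W", "source": "A"},
]
subsystem_b = [
    {"title": "cloud  computing", "author": "zhang q", "source": "B"},
]
fused = fuse_results([subsystem_a, subsystem_b])
```

Keeping the list of sources for each fused record lets the interface show a single entry while still offering every subsystem that can deliver the resource.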

5. Application

The China Central Radio & TV University (CCRTVU, also known as “The Open University of China”, OUC) is developing the Digital Library Project of Nationwide Open Universities. The project aims to unify the OUC’s digital library and the 44 digital libraries of RTVUs (Open Universities) in different provinces and cities into a uniform RTVU (OUC) digital library system that better serves the RTVUs (Open Universities), teachers, and students nationwide. In combination with the CALIS third-phase construction project [9], the CALIS application framework is taken as the main framework of the RTVU (OUC) digital library system. A nationwide RTVU (Open University) digital library system is formed through the sharing and interconnection of resources and services by open universities at all levels, and it joins CALIS as an entity shared domain.

In building the RTVU (Open University) digital library system, the system framework introduced in this paper is used to build a prototype system, in which local RTVUs (Open Universities) set up business subsystems in line with local circumstances and their tasks for resource construction and service development. The CCRTVU (OUC) builds and maintains the service integration subsystems, takes responsibility for integrating and coordinating the digital resources and service functions provided by all business subsystems, and offers the business subsystems leased SaaS services, such as inter-library lending and document delivery, combined reference and consultation, and a thesis library system. Experience with the prototype indicates that the proposed distributed digital library framework model can operate effectively and provide constant and reliable service.

6. Conclusion

As a new distributed computing technology, cloud computing offers numerous advantages, including high scalability, easy access, low cost, and loose coupling, which provide new ideas and technical references for digital library construction. However, the application of cloud computing in digital libraries is still a new topic and faces a series of technical difficulties, such as distributed liability, resource effectiveness, resource integration, and authentication control mechanisms.

In this paper, a new distributed digital library framework is proposed that combines techniques such as a cloud computing platform, SOA, and VPN, drawing on the RTVU (Open University) digital library project. The framework takes full account of the construction requirements of the nationwide RTVU (Open University) digital libraries: multi-layer management, distributed access, resource sharing, and service collaboration. It is generally applicable and integrates well with document information service systems such as CALIS. The key nodes of the framework fall into two categories, the business subsystem and the service integration subsystem. The business subsystem is the specific implementation system of the digital library service; it is responsible for the direct management and organization of digital resources and provides basic digital library services. The service integration subsystem is in charge of integrating the various types of resources and services provided by the business subsystems and releasing them to users. In terms of the tiered system structure, the business subsystem includes all functions of the infrastructure device and digital resource tiers, as well as some functions of the business logic tier. In contrast, the service integration subsystem is in charge of system management, user interface, and the remaining business logic. The multi-tier architecture forms a loosely coupled, highly cohesive, controllable, and self-adaptable digital library service system, which better matches the diversified requirements of different types of users with differing levels of access privileges to information resources.

References

[1] Qi Zhang, Lu Cheng, Raouf Boutaba. Cloud computing: state-of-the-art and research challenges [J]. J Internet Serv Appl. (2010) 1: 7-18.

[2] Zhang Zhenglu. A Review of Cloud Computing Research Among the Library and Information Circle in China [J]. Journal of The National Library of China. 2010(3): 73-76.

[3] Library Cloud Atlas: A Guide to Cloud Computing and Storage Stacking the Tech. [OL]. [2011-6-1]. http://www.libraryjournal.com/article/CA6695772.html.

[4] Rochkind J. OCLC library management services. Bibliographic Wilderness. [OL]. [2011-6-1]. http://bibwild.wordpress.com/2009/04/29/oclc-library-management-services/.

[5] Kevin Liu. How Libraries Uprising with the Cloud Computing [J]. Journal of Academic Libraries. 2009(4): 2-6.

[6] NoSQL. [OL]. [2011-6-1]. http://en.wikipedia.org/wiki/NoSQL.

[7] Shared-nothing Architecture. [OL]. [2011-6-1]. http://en.wikipedia.org/wiki/Shared_nothing_architecture.

[8] Single Point of Failure. [OL]. [2011-6-1]. http://en.wikipedia.org/wiki/Single_point_of_failure.

[9] Wang Wenqing, Chen Ling. The Model of CALIS Cloud Service Platform for Distributed Digital Services [J]. Journal of Academic Libraries. 2009(4): 13-18.