Managing data and information to safeguard our environment
- Avinash Chuntharpursat
The various natural resource sectors and disciplines in South Africa have been actively engaged in a wide range of research programmes for several decades, in some cases for over 100 years. Because there were no data management systems, data and information from this research were sometimes lost by the institutes that carried out these studies. Sharing of data, information and knowledge was largely limited to the organisation that produced it.
Societal change in South Africa has led to the emergence of a knowledge-driven society. This is evident from the government’s legal support for the promotion of access to information. Advances in information and communication technology (ICT) have enabled the rapid dissemination of knowledge and its components. This, combined with a developing culture of sharing, is a major enabling factor in a knowledge society.
Given the global concern over environmental issues, long-term monitoring via the various SAEON nodes is crucial for supplying data for decision-making at all levels of society. It is with these considerations in mind that the SAEON Data Management System (DMS) has been devised.
The system above shows the interaction between the components and requirements of a DMS, set against a background of data ethics. The components (people, data, hardware, software and processes) give a broad indication of what the DMS will consist of. The requirements indicate what the DMS is expected to deliver.
Of particular note are the public awareness and analytical capability requirements. These two special requirements are largely relevant to developing countries such as South Africa. The DMS should be in a position to facilitate various science awareness projects via SAEON’s educational programme or via those of various science promotion organisations.
Analytical software tends to be prohibitively expensive. Providing free analytical capabilities via the web or as downloadable analytical software goes a long way towards promoting analysis and exploration of data. Open-source software would feature prominently here.
Data ethics is a concept currently being developed and promoted by SAEON. A data ethic would provide overall guidance to the human networking that is inherent in data management. This applies not only to SAEON, but also to the knowledge society as a whole. SAEON’s aim is to ensure free access to data. However, in achieving free access there are ethical issues to consider, such as data sharing, intellectual property, giving credit, long-term custodianship, protection of endangered species, and privacy. Developing a data ethic is a long-term undertaking that requires societal change. SAEON hopes to encourage this by taking the first steps towards the development of such an ethic.
The first point of access and association with a dataset is through its metadata, which is the information used to describe the relevant data. It is therefore crucial that the metadata accurately represent the data and provide the enquirer with as much information as possible. SAEON aims to utilise international standards for metadata, including Ecological Metadata Language (EML), which is based on XML (eXtensible Markup Language), and associated tools.
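To illustrate what a metadata record of this kind might look like, here is a minimal sketch in Python that builds a simplified, EML-style XML description of a dataset. The element names loosely follow EML, but the record is not validated against the EML schema, the function `build_metadata` is hypothetical, and all values are placeholders rather than real SAEON data.

```python
import xml.etree.ElementTree as ET


def build_metadata(title, creator_surname, abstract, keywords):
    """Return a simplified, EML-style metadata record as an XML element.

    Assumption: element names loosely follow EML; this is an
    illustration only and is not checked against the EML schema.
    """
    root = ET.Element("eml")
    dataset = ET.SubElement(root, "dataset")

    # What the dataset is called
    ET.SubElement(dataset, "title").text = title

    # Who produced it, so that credit can be given
    creator = ET.SubElement(dataset, "creator")
    name = ET.SubElement(creator, "individualName")
    ET.SubElement(name, "surName").text = creator_surname

    # A free-text description for the enquirer
    abstract_el = ET.SubElement(dataset, "abstract")
    ET.SubElement(abstract_el, "para").text = abstract

    # Keywords that help others discover the dataset
    keyword_set = ET.SubElement(dataset, "keywordSet")
    for word in keywords:
        ET.SubElement(keyword_set, "keyword").text = word

    return root


# Placeholder values, invented purely for illustration
record = build_metadata(
    title="Example rainfall observations (placeholder title)",
    creator_surname="Example",
    abstract="Placeholder description of what was measured, where and when.",
    keywords=["rainfall", "long-term monitoring"],
)
print(ET.tostring(record, encoding="unicode"))
```

Because the record is plain XML, any system that can read XML can search or exchange it, which is what makes a shared metadata standard useful across organisations.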
Internationally, SAEON forms part of a network called International Long Term Ecological Research (ILTER). SAEON regularly exchanges information on data and information management with its LTER partners. Currently China and Taiwan are the leaders in countrywide information management. Taiwan LTER has the only known system for implementing EML nationwide. This system was initiated a few months ago and serves as inspiration for the SAEON DMS. Prior to this, there was no practical system that SAEON could relate to.
Interoperability, which allows for the free exchange of data between different systems, is a key feature of the SAEON DMS. Interoperability involves more than technical issues. A recent trip to an Information Management workshop in China revealed the need for interoperability between languages and alphabets. EML allows for different languages and formats. This is important globally and locally, particularly since South Africa has 11 official languages that need to be catered for.
To ensure sustainability, a strong human network is required. From international observations it emerged that good relations with the relevant stakeholders are crucial for the formation of a strong DMS and related human network. SAEON is in the fortunate position of having a network of experts who are called upon to provide technical advice in implementing the DMS.
From the above principles and requirements it is evident that SAEON faces a daunting task in the implementation of such a system. Currently, the Collaborative Spatial Analysis and Modelling Platform (CoSAMP) of the CSIR and Innovation Hub is the preferred platform for the implementation of the DMS. The CoSAMP caters for a distributed system of nodes with a central platform that allows for the compilation of data from different sources and systems, their integration into a single dataset or product, and interoperability between different systems. It is flexible since it caters for data management at individual nodes as well as a central data repository.
This article provides an outline of the SAEON data management system. However, many issues still need to be finalised. SAEON recently opened its first node at Phalaborwa. More nodes are required, and more input from the nodes will be needed to develop a more representative strategy that best caters for nodal requirements.
The stakeholder community is invited to comment on the above developments by writing to Avinash – avinash@saeon.ac.za