The concept of big data has been in existence for nearly 40 years, and its current high level of integration with technology has led to many changes in industry and everyday life. But what have been its effects on education, and how can it be applied to education in China?

A journalist from E-Learning interviewed Professor Li Qing of the Education Technology Research Laboratory at the Beijing University of Posts and Telecommunications. Professor Li said that “big data has initiated major transformations, and everyone should have data literacy. The application of big data will bring about dramatic changes in education. At present, the priority is to provide training as early as possible so as to improve the data literacy of teachers. The goal is for education in China not merely to keep up with the times but to stand in the global vanguard.”

The era of big data is in need of new thinking.

Journalist: In 1980, the futurist Alvin Toffler exalted big data as “the third wave’s cadenza” in his book The Third Wave. Nearly forty years have passed; how should big data be defined now?

Li Qing: Gartner, a world-leading research and advisory company, defines it as massive, fast-growing and diversified information assets that require new forms of processing to enable stronger decision-making, insight, and process optimisation. McKinsey Global Institute defines it as datasets too large for typical database software tools to capture, store, manage, and analyse. It is generally recognised that big data is too vast to be managed by traditional means.

Big data is not just huge but also diverse and largely unstructured. IBM has proposed five features to describe it: volume, velocity, variety, value and veracity. Volume refers to the vast amounts of data generated; velocity to the speed at which new data are obtained and processed; variety to their different types and formats. In terms of value, individual data items are less valuable than in traditional datasets, but the overall value is higher because of the sheer volume. Finally, veracity means that the results of big-data analysis may not be one hundred percent accurate. These five “Vs” are generally regarded as the criteria for judging whether a dataset counts as big data.

However, in everyday use, the “big data” we frequently speak of may refer both to big datasets themselves and to the practice of handling routine data with big-data methods.

Journalist: Big data has permeated various walks of life and is having a profound impact. What should be our attitude toward it?

Li Qing: Big data has initiated major transformations, and our way of approaching data should change accordingly, as the authors of Big Data: A Revolution That Will Transform How We Live, Work, and Think argue.

First, we should reject data sampling. In traditional research, sampling was necessary because the full range of data could not be obtained or processed. This, however, is no longer the case.

Second, efficiency should take priority over accuracy. Big-data calculations are used to give practical guidance, which makes timeliness essential, as when products are recommended during online shopping. When the system has to choose between timeliness and accuracy, some accuracy can be sacrificed for a faster result.

Third, correlation should take priority over causation. It is very difficult, if possible at all, for computers to infer cause and effect. However, in most cases, correlation is enough to find patterns and make forecasts.

Big data will reform education completely.

Journalist: What role can big data play in education? What changes can it bring about in this field?

Li Qing: Among the natural and social sciences, pedagogy lags behind in its research methods and its use of tools. However, teachers are gradually moving from reliance on subjective experience and an intuitive understanding of students to evidence based on data analysis.

What, then, is the significance of big data to education? At the micro level, we can use big data to reveal problems that are normally hidden, find evidence for or overturn our assumptions about students and schools, identify the areas in greatest need of change, and guide the allocation of resources. Big data can also help us identify teaching structures or learning activities that are better suited to specific tasks or students than the ones now in use.

At the macro level, the application of big data to education can promote reform. First, it is conducive to individualised and data-driven teaching. For example, it will help teachers divide students into groups, and help allocate teaching resources at the regional (district and county) level, using online learning to meet the actual needs of students. This promises to enhance educational fairness. Second, it is conducive to constructing an all-round, diversified and multi-dimensional evaluation system. On the one hand, the evaluation of individual students can draw on multiple perspectives instead of relying only on examinations; on the other hand, establishing benchmark data in each area will give a clearer understanding of development levels and allow teaching strategies to be adjusted. Third, it is conducive to scientific decision-making in education management, replacing subjective judgement with data-based evidence. Finally, big data will allow education research to shift its focus from theory to evidence, and will offer advanced research methods and tools.

In short, the application of big data can facilitate the evolution of education.

Journalist: What types of data count as educational data?

Li Qing: Evaluation data are among the most common types of educational data. They include examination results, benchmark data on student abilities, and diagnostic data obtained during the course of teaching, along with data from questions and feedback, in-class activities, tests, experiments, and group reports; teachers' records of observations related to student attentiveness, engagement and behaviour can also be included. Data on individual students can cover motivation, attitude, sociability, attendance, and so on. In addition, there are data related to special-needs students, which cover their degrees of disability and levels of rehabilitation.

Some data are generated directly through teaching activities, examinations, tests, and online interactions. Other data are collected through management, such as information about families, teaching staff and schools. Still others are generated through campus life, including records of food consumption and internet access.

Traditional educational data are non-electronic, and very difficult and time-consuming to organise and process. As education moves increasingly online, ever more electronic data of great diversity are being generated, unstructured and in real time, creating new opportunities for researchers to explore the learning environments of students.

Data-literacy education has fallen behind international standards.

Journalist: Teachers are essential to applying big data. What is your understanding of the situation in foreign countries, and how do teachers in China compare?

Li Qing: The ability of teachers to support their teaching with big data is related to their data literacy. In developed countries, data literacy is part of teachers' professional training, and requires them to understand what data are and to be able to collect, analyse and interpret different types of data in order to improve their teaching.

Due to their lack of training in this respect, most Chinese teachers are confused about how to collect and use data. For example, they cannot identify the data required or their sources, are poor at organising and analysing them, and do not know how to use them for evaluation and decision-making.

In addition, our current level of IT development in education does little to enable teachers to use big data. Teachers have difficulty obtaining most student-related data, and lack the tools to process them. At the same time, missing or private data on the basic conditions of students at the district, county, city and provincial levels have also hindered teachers' evaluation and decision-making.

Journalist: You have studied several aspects of teacher accomplishment, and served on the Study of Teacher Data Literacy in the Era of Big Data in the Beijing Municipal Educational Science Plan. What levels of data literacy should teachers at universities, middle schools and primary schools have?

Li Qing: The research team analysed the data literacy of teachers in foreign countries in relation to the national average, and formulated a preliminary system for evaluating it. Teacher data literacy was divided into 4 types, covering 10 dimensions and 26 categories. The first type is a basic understanding of data and knowledge of how to use data tools; the second is the ability to obtain, manage and analyse data as well as to evaluate them in use; the third is applying data to teaching, which includes the ability to explore, communicate and make decisions through data; the fourth is awareness of and attitude toward data, which includes ethical aspects. In addition, teachers should be capable of enabling students to gain data literacy.

Teachers at all levels should be able to use data to discover the learning processes of individual students and entire classes, and to make teaching decisions. Diagnosis and discussion should also be supported by data, and university teachers should additionally have the ability to conduct scientific research using data when these are relevant to their research interests.

Training requires collaboration at the national, local and school levels.

Journalist: Training is an important way of improving data literacy. What arrangements do you think should be made and what are the priorities in teacher training?

Li Qing: Improving teachers' data literacy should be part of their systematic training. This has been the case in Europe and especially America for 10 years, and considerable experience has been accumulated. The training of teachers in data literacy deserves attention from national and local authorities as well as from schools and individuals.

At the national level, data literacy needs to be included in teacher-qualification standards and become a basis for funding research, setting up demonstrations, and building open educational databases.

At the local level, education administrators should, on the one hand, promote the use of educational data together with data-literacy instruction for teachers, include data literacy in pre-service training, facilitate and support data-based professional development, and assess the data-literacy skills of teachers. On the other hand, data-use platforms should be made available to teachers, the IT infrastructure of schools should be upgraded, and local authorities and schools should be encouraged to provide resources, including manpower, for data-driven teaching.

At the school level, administrators should ensure that data are included in the evaluation of teachers. An organisational culture should be established around the use of data, teachers should be provided with technical support and expert instruction, and cooperation among them should be promoted.

The profound integration of virtual reality, big-data technology, cloud computing, sensor technology, and artificial intelligence with mobile networks, the internet of things, and other network technologies has made the concept of intelligent education a major trend in education. Teacher education will become an important driver of this trend as well as of education reform.

Li Qing is a professor and supervisor of Master’s students with the Education Technology Research Laboratory at the Beijing University of Posts and Telecommunications. His main research interests are mobile learning, the development of technology-based learning systems, and the standards for such technology. In recent years, he has published about 40 academic papers in China Educational Technology, E-education Research, Distance Education in China, Open Education Research and other journals, as well as at several international conferences. Since 2006, he has taken part in studies on mobile learning in online education, mobile learning and key technologies, innovations in undergraduate teaching activities from the perspective of u-learning, web-based wireless ubiquity, and others. He has years of research and development experience in the fields of mobile-software development and mobile learning.

By E-Learning, Liu Zenghui