As artificial intelligence (AI) and robotics technologies expand their scope of applications across the economy, understanding their impact becomes increasingly critical. The AI and the Future of Skills (AIFS) project at the OECD's Centre for Educational Research and Innovation (CERI) is developing a comprehensive framework for regularly measuring AI capabilities and comparing them to human skills. The capability measures will encompass a wide range of skills that are crucial in the workplace and cultivated within education systems. They will establish a common foundation for policy discussions about AI's potential effects on education and work.

The AIFS project has gone through two phases of developing the methodology of the assessment framework. The first phase focused on identifying relevant AI capabilities and existing tests to evaluate them. It drew on a wealth of skill taxonomies and assessments across various disciplines, including computer science, psychology and education. The second phase, the focus of this report, delves deeper into methodological development. It comprises three distinct exploratory efforts:

  • Rating AI on education tests using expert judgement
  • Rating AI on occupational tests
  • Direct measures of AI capabilities

This report is organised as follows:

  • Chapter 1 by Mila Staneva provides an overview of the report.
  • Chapter 2 by Abel Baret, Nóra Révai, Gene Rowe and Fergus Bolger presents the evolution of methods the project used to collect expert judgement on AI capabilities from computer scientists and other experts.
  • Chapter 3 by Mila Staneva, Abel Baret et al. presents the exploratory work on the use of education tests for collecting experts’ assessments on AI. 
  • Chapter 4 by Mila Staneva, Britta Rüschoff and Phillip L. Ackerman discusses the usefulness of complex occupational tasks for collecting expert judgement on AI and robotics capabilities. 
  • Chapter 5 by Margarita Kalamova presents two exploratory assessments of AI and robotics performance on complex occupational tasks. 
  • Chapter 6 by Anthony Cohn and José Hernández-Orallo proposes a method for describing the characteristics of AI direct measures to guide the selection of existing measures for the assessment. 
  • Chapter 7 by Guillaume Avrin, Elena Messina and Swen Ribeiro provides an overview of the direct measures of AI resulting from the numerous evaluation campaigns organised by NIST and LNE. 
  • Chapter 8 by Yvette Graham, edited by Nóra Révai, reviews existing benchmark tests in the field of NLP and synthesises their results into a conceptual model for assessing AI language competence. 
  • Chapter 9 by Stuart Elliott summarises the results of the explorations described in this volume. It then outlines how these insights will be used to develop measures of key AI capabilities in the subsequent stage of the AIFS project.

Skills intelligence publication details

Target audience
Digital skills in education.
Digital technology / specialisation
Digital skill level
Geographic scope - Country
Austria
Belgium
Bulgaria
Cyprus
Industry - field of education and training
Education not elsewhere classified
Geographical sphere
International initiative
Publication type
Study