Using Pilot Programme for International Student Assessment (PISA) and Trends in International Mathematics and Science Study (TIMSS) to Inform Education Policy Discussion in Ethiopia

Executive Summary

This policy brief outlines the complexities of student learning assessment in the Ethiopian education system. It advocates a broader method of assessing student learning outcomes and their associated factors and processes by incorporating international learning assessment systems, such as PISA and TIMSS, into our understanding of evidence-based policy and practice. It urges policymakers to move towards a broader definition of student learning assessment and towards participation in these international assessment systems. This broader conceptualization and participation can provide a sound basis for generating empirical evidence and developing a national assessment framework and strategies to enhance the quality and relevance of the education system.


For the education system to function more effectively and efficiently, student learning assessment data must be used, and this often involves a challenging process of examination and evaluation. So far, researchers in the Ethiopian education system have looked for evidence in national and regional assessment practices that draw on national data and apply methodologies of modest rigor. However, this narrow approach, confined exclusively to national data, is not sufficient to identify evidence for relevant and high-quality interventions, nor is it always the best way to do so.

Education programs are designed and implemented with the aim of achieving successful outcomes for students across a range of domains, including academic achievement, attitude, and wellbeing, among others. How ‘success’ is defined depends upon the desired outcomes and the indicators or measures in use for achieving them. In addition to being able to demonstrate that the selected programs are logically related to the desired outcomes, educational programs and services also need to be able to show that they are effective in achieving these outcomes.

Research shows that evidence of effective education is based on scientifically validated theoretical frameworks and methodologies that articulate clearly how education processes achieve the desired outcomes. Without this evidence, efforts to improve infrastructure and the quality of teaching and learning will be based only on what seems right or has always been done, which is less likely to achieve desirable outcomes. Globally, the education systems of many countries have numerous high-quality studies from which to draw solid conclusions about the appropriateness of a particular intervention for a particular student group. However, this is usually not the case in the Ethiopian education system, for a variety of reasons. Consequently, some countries, including middle- and low-income countries, have moved to redefine the concept of student learning assessment and how evidence is collected and analysed from a comparative perspective.

While Ethiopian national assessments collect valuable data on education quality and performance at the system level, data from international assessments, such as PISA and TIMSS, allow for comparison across education systems, giving Ethiopia the opportunity to share techniques, policies, and strategies with other countries. Some countries achieve much higher levels of educational performance in terms of system operation as well as outcomes. Detailed and internationally comparable evidence about education systems helps identify these strong performers in specific areas while also flagging weaknesses in other areas.

International learning assessment systems such as PISA and TIMSS are complex systems that involve selected student groups, specific frameworks, and core subjects. A major reason for conducting the Ethiopian pilot PISA and TIMSS studies was to shed light on the educational outcomes, processes, and wider contexts (home, school, and classroom) of primary and secondary education. Large-scale international assessments, such as PISA and TIMSS, can be a valuable resource for studying global trends in the performance of education systems and for generating evidence of change in a country's performance over time (Prenzel & Sälzer, 2019).

This policy brief provides an overview of the main findings of the pilot international large-scale learning assessments, PISA and TIMSS, as applied in the Ethiopian school context. In particular, it summarizes the preparation, implementation, analysis, and reporting of these assessments in Ethiopian schools and draws relevant conclusions to stimulate policy discussion. The proposed recommendations are presented under four areas: i) preparation, ii) administration, iii) analysis, and iv) reporting.

Approaches, Methodologies, and Challenges


PISA uses the Education Prosperity Framework for approaching student learning assessment. The framework consists of two components, called “prosperity outcomes” and “foundation factors or enablers of success” (OECD, 2016). TIMSS, on the other hand, uses the curriculum, broadly defined, as the major organizing concept in considering how educational opportunities are provided to students. The curriculum model used in the pilot TIMSS study comprises three components: the expected, the implemented, and the assessed curriculum. In its later tripartite form, the model defines curriculum at three levels: the Intended (what a system intends students to study and learn), the Implemented (what is taught in classrooms), and the Attained (what students can demonstrate that they know) (Mullis & Martin, 2017). Similarly, the prosperity model in PISA provided a broader framework to capture the complexities inherent in the Ethiopian education system (Willms, 2015).


The research team used a survey method, collecting quantitative data via questionnaires distributed to students and their teachers. The essence of the survey method is the description of responses. In education studies, the survey method of primary data collection is used to test concepts, reflect people's attitudes, establish levels of satisfaction, conduct segmentation research, and serve a range of other purposes. The survey method used in this study pursued two main purposes: describing certain aspects or characteristics of a population and testing hypotheses about the nature of relationships within a population.

By studying the item-by-item results in the national and international contexts, the research team provided detailed insights into the performance of the Ethiopian student samples included in the pilot PISA and TIMSS. By combining item-by-item results into topic areas, the team also evaluated achievement in these topic areas across groups defined by geographic area, school ownership, and gender. In addition to providing data on student achievement, a major purpose of the pilot PISA and TIMSS studies is to support educational improvement efforts by identifying positive and negative factors associated with cognitive and non-cognitive learning outcomes (Mohtar, Halim, Samsudin, & Ismail, 2019).

For example, the competencies assessed by the pilot PISA covered the fundamental school knowledge that 15-year-olds should possess in the 21st century. The TIMSS mathematics and science frameworks, in turn, are each based on two main organizing dimensions: a content dimension and a cognitive dimension.

Several international comparison statistics are given in the pilot PISA and TIMSS reports. In both PISA and TIMSS, the scale centrepoint and the international average are given along with the corresponding standard deviation. The scale centrepoint, which is the mean of the scale for each subject (Mathematics, Science, or Reading), is 500, with a standard deviation of 100 score points. The international average is the mean score, or the percentage of students who answered correctly, across all countries participating in PISA or TIMSS at that year level. Both assessments also have scoring rubrics for each constructed-response item in each subject domain.
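The reporting scale described above can be illustrated with a short sketch. It assumes, purely for illustration, that raw proficiency estimates are rescaled linearly so that a reference group has a mean of 500 and a standard deviation of 100. The operational PISA and TIMSS scaling relies on IRT models and plausible values, which this sketch does not reproduce. The function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def to_reporting_scale(raw_scores, centre=500.0, spread=100.0):
    """Linearly rescale raw scores so the group has mean `centre` and SD `spread`."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population SD of the reference group
    if sigma == 0:
        raise ValueError("scores have no variation; cannot rescale")
    return [centre + spread * (x - mu) / sigma for x in raw_scores]


# Hypothetical raw proficiency estimates (not data from the pilot studies)
raw = [-1.2, -0.4, 0.0, 0.3, 1.3]
scaled = to_reporting_scale(raw)
```

After rescaling, a score of 400 on such a scale would sit one standard deviation below the centrepoint of the reference group.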

Challenges encountered

  • Selecting adequate assessment items for measuring students’ competencies in PISA tests was a challenge, as only a limited number of items have been officially released and made available on the web. Even finding the results of item analyses was sometimes quite difficult. Still, this evidence was the basis for item selection.
  • Customizing the items to the Ethiopian context was challenging because of the unfamiliar nature of the PISA questions, which demanded familiarity and extra caution. The research team also had difficulty finding suitable experts to translate the TIMSS questions into more than 10 languages, particularly for TIMSS Grade 4 Mathematics and Science. It was also a challenge to determine the number of items and the grade level of students to be assessed. While a proportional number of students was expected to be sampled in each school, the sample sometimes went far above or below the recommended level.
  • Administering more than one subject test to each student sample and including additional contextual questions was time-consuming. Teachers also complained about the time allotted for completing the questionnaires.
  • Expert involvement in marking and scoring constructed-response items was also costly and problematic.
  • There were various other challenges in combining test data and contextual data involving social background variables as well as the characteristics of the educational environments.



On a scale of 0 to 1,000, the average Mathematics Literacy score of 15-year-old students was 166.93 (SD = 101.92). For Reading Literacy, the average score was 252.56 (SD = 104.07), and for Science Literacy it was 252.78 (SD = 97.98). The Ethiopian average scores for the three subjects tested were much lower than the PISA 2018 OECD averages and the scores of other participating African countries, for example Morocco in PISA 2018 (OECD, 2019) and Zambia in PISA-D, and lower than the overall PISA-D average for lower- and middle-income countries (OECD, 2018).

Similarly, the scale mean score of the Ethiopian Grade 8 student samples was 246.19 (SD = 137.28) for the pilot TIMSS mathematics and 265.35 (SD = 112.10) for science. These were similar to the results of the Ethiopian fourth-grade student samples, whose scale mean scores for the pilot TIMSS mathematics and science were 258.58 and 276.90, respectively. These results are far below the low benchmark score of 400 (Mullis & Martin, 2017).
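To make the distance from the benchmark concrete, the gaps can be worked out directly from the reported means. The sketch below uses only the figures quoted in the paragraph above; the variable names are illustrative.

```python
LOW_BENCHMARK = 400.0  # benchmark score cited in the text

# Pilot TIMSS scale mean scores as reported in the text
pilot_means = {
    "Grade 8 mathematics": 246.19,
    "Grade 8 science": 265.35,
    "Grade 4 mathematics": 258.58,
    "Grade 4 science": 276.90,
}

# Points by which each mean falls short of the benchmark
gaps = {name: round(LOW_BENCHMARK - score, 2) for name, score in pilot_means.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.2f} points below the benchmark")
```

Every reported mean falls more than 120 scale points short of the benchmark, with the largest gap in Grade 8 mathematics.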

Seen from a different perspective, the Ethiopian student samples had moderate attitudes towards learning, as measured by students’ learning interest, confidence, and values. However, the presence of some common health issues among 30-45% of them on average is a serious challenge. Moreover, according to the students’ contextual questionnaire, some students reported that they did not have access to books at home and lacked study support. Others reported that they came to school feeling tired or hungry every day.

Group Differences

Almost all achievement-related variables included in the pilot PISA and TIMSS studies showed significant differences when student sample groups were compared by geographic area and school ownership. High-performing schools tend to be in urban locations and under private or faith-based ownership. Accordingly, students with higher average achievement in the pilot PISA and TIMSS are found in urban, private, or faith-based schools. Effect size scores indicate that location and ownership are the variables that explain the most variation in performance (in terms of percentage of variance explained).
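The idea of an effect size expressed as a percentage of variance explained can be sketched as follows. The brief does not state which statistic was used, so eta squared (between-group sum of squares divided by total sum of squares) is assumed here only as one common choice. The function name and the scores are hypothetical and are not data from the pilot studies.

```python
def eta_squared(groups):
    """Share of total score variance explained by group membership."""
    all_scores = [s for g in groups.values() for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Total variation of individual scores around the grand mean
    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total


# Hypothetical scores by school location (illustration only)
scores_by_location = {
    "urban": [300.0, 320.0, 340.0],
    "rural": [200.0, 220.0, 240.0],
}
effect = eta_squared(scores_by_location)
print(f"Variance explained by location: {effect:.1%}")
```

A value near 1 means that group membership accounts for most of the score variation, while a value near 0 means that it accounts for very little.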

Students in private and faith-based schools performed significantly higher than students in governmental schools on most of the achievement measures. There was also a statistically significant difference between the achievement of boys and girls on the educational attainment measures.


Internationally and in the pilot PISA and TIMSS contexts, there was a clear negative association between students’ attitudes towards Mathematics and Science learning and a higher sense of tiredness or hunger, frequency of bullying, and extent of disorderly behaviour (Mullis, Martin, Foy, Kelly, & Fishbein, 2020; Thomson et al., 2021; Thomson, Wernert, Rodrigues, & O’Grady, 2020). Similarly, students’ absenteeism, sleeplessness, and depression symptoms increased when students had a lower sense of belonging, a higher frequency of bullying, fewer meals per day, and stronger feelings of tiredness or hunger, among others.

Of the different predictors of students’ learning attitudes in Mathematics and Science, clarity of instruction and school belongingness were consistently the most powerful positive influences. In addition, the results indicate that the more positive students’ attitudes towards learning Mathematics, Reading, and Science, the higher their test results tended to be.

The use of initial SEM models effectively guided the analysis to consider established evidence. At the same time, the use of a preliminary conceptual framework allowed new evidence to emerge for the Ethiopian student samples. As the SEM analysis results of the pilot PISA and TIMSS clearly show, teaching quality was what mattered most for student learning in school (Kyriakides, Christoforou, & Charalambous, 2013). The clarity with which teachers deliver instruction has significant implications for student learning. The pilot PISA and TIMSS studies found this to be consistently true across the student samples.


The results on students’ outcomes, attitudes, and wellbeing indicate that Ethiopia needs to work very hard on infrastructure and on improving teaching and learning in schools. If priorities must be set, it is important to put in place intervention measures to narrow the achievement differences across geographic areas, school ownership types, and gender. According to existing evidence, several factors across the individual, home, classroom, and school dimensions are predictors of educational achievement. However, factors such as clarity of instruction, support received during the COVID-19 pandemic, and bullying have a more consistent influence than others. Taken as a whole, it is clear from the evidence that the factors associated with student learning in schools are complex and contextually embedded.

Policy Recommendations


In terms of preparation, it is recommended to use the specific framework, processes, tools, and procedures of the standard PISA and TIMSS. The inclusion of a broader range of stakeholder groups should also be a target. These include students, parents, teachers, principals, curriculum experts, and supervisors, depending on the specific guidelines of the particular international assessment system.

Moreover, it is highly recommended to begin participation in PISA or TIMSS based on demand and the prevailing context. A systematic entry into the international assessment systems, with long-term preparation and empowerment, is preferable. For this, first entry should target the less difficult version of TIMSS Mathematics instead of the regular TIMSS, moving gradually to the regular TIMSS after reviewing performance and making the necessary adjustments. Entry to PISA should be a long-term target, pursued after obtaining ample experience with TIMSS and building better capacity, and should target the young population beyond in-school students so as to accommodate out-of-school children as well.


It is recommended that the Ministry of Education (MoE) conduct a full-scale field test of all instruments and operational procedures in preparation for the standard PISA and TIMSS data collection. The field test will enable MoE staff to become acquainted with the operational activities. The feedback they provide will be used to improve the data collection procedures and to establish a much stronger baseline and capacity. The field test will also lead to enhancements in survey operations procedures and in the administration of test booklets, helping to ensure the successful execution of the standard procedures. The following are the major operational activities the MoE needs to coordinate for data collection:

  • Contacting sample schools and sampling classes;
  • Selecting assessment instruments;
  • Sampling schools;
  • Verifying translation and layout of the assessment instruments;
  • Managing the assessment administration.

The other important recommendation concerns eAssessment. It is believed that changes in data collection technologies will transform the actual conduct of the studies (Gustafsson, 2018). Given the powerful advantages of eAssessment in providing enhanced measurement of student performance and an efficient assessment system, it is recommended that Ethiopia’s future participation in PISA and TIMSS consider transitioning to computer-based administration. A blended administration, combining the paper-and-pencil format of previous assessments with eAssessment, should be given special attention.

To support the transition to eAssessment, it is highly recommended to develop eAssessment systems that increase operational efficiency in item development, translation and translation verification, assessment delivery, data entry, and scoring (Martin & Mullis, 2019).


It is recommended to score the constructed-response items using the item-level rubrics, with experts marking and scoring accordingly. Also, IRT modelling with a 2-parameter logistic model, together with item analysis, would provide item-level information while analysing students’ test results. Item information not only helps to measure the difficulty and discrimination of the items but also helps in creating the data files for item banking. Moreover, it is recommended to use SEM analysis to examine the influence of educational processes and other factors on student learning outcomes.
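As a minimal sketch of what a 2-parameter logistic (2PL) model provides, the functions below compute the probability of a correct response and the item information for a single item. The standard 2PL form is used, with discrimination a and difficulty b. The parameter values are hypothetical, and estimating such parameters from real response data would require a dedicated IRT package, which is not shown here.

```python
import math


def prob_correct(theta, a, b):
    """2PL probability that a student with ability theta answers the item correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = prob_correct(theta, a, b)
    return a ** 2 * p * (1.0 - p)


# Hypothetical item: discrimination a = 1.2, difficulty b = 0.5
p_at_difficulty = prob_correct(theta=0.5, a=1.2, b=0.5)
info_at_difficulty = item_information(theta=0.5, a=1.2, b=0.5)
```

When a student's ability equals the item difficulty, the probability of success is 0.5 and the item information peaks at a squared divided by 4. This is why difficulty and discrimination together indicate where on the ability scale an item measures most precisely, which is useful when assembling an item bank.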


In terms of reporting, it is recommended to pay attention to the multidimensional nature of educational outcomes, which goes beyond test results. For data reporting, it is recommended that Ethiopia use the formats and templates of PISA and TIMSS, as well as SEM methodology, as was done for the pilot PISA and TIMSS. SEM comprising the personal, home, school, and classroom contexts improves on the traditional single-level analysis approach (Kyriakides, 2008; Mohammadpour, Shekarchizadeh, & Kalantarrashidi, 2015). By analysing multi-dimensional models, researchers can ensure rigor.
