A systematic review of relationships and sex education outcomes for students with intellectual disability reported in the international literature


Abstract
Background: Little is known about how to evaluate relationships and sex education (RSE) delivered to students with intellectual disability and which outcomes stakeholders perceive as important. The present study aimed to systematically review existing studies on outcomes of RSE, as the first step in the development of a core outcome set (COS) for students with intellectual disability.
Method: The systematic literature search included two stages: (1) searching for studies reporting on RSE outcomes for students with intellectual disability and (2) searching for studies reporting on measurement properties (e.g. validity, reliability and responsiveness) of standardised instruments identified in stage 1.
Results: A total of 135 RSE outcomes were extracted from 42 studies: 43 outcomes for students in secondary education and 92 outcomes for students in further education. No RSE outcomes were reported for primary education. Outcomes referred to the human body, hygiene, relationships, sexuality, sex and its consequences, inappropriate and appropriate social and sexual behaviour, keeping safe, emotional vocabulary and positive self-esteem. Outcomes were predominantly knowledge-based, rather than relating to the development of skills and attitudes. Students with intellectual disability, parents and teachers perceive different RSE outcomes as meaningful. Five instruments were used to measure the outcomes, but none have established psychometric properties with this population.
Conclusions: The comprehensive list of RSE outcomes for students with intellectual disability will be used to inform the next steps in developing a Core Outcome Set needed for RSE evaluations in research and education settings. There is an urgent need to develop standardised instruments validated for students with intellectual disability.

Introduction
Current evidence suggests that individuals with intellectual disability have limited knowledge and skills regarding sexuality and relationships (Galea et al. 2004; Jahoda and Pownall 2014; Borawska-Charko et al. 2017; Baines et al. 2018). They are twice as likely as people in the general population to experience unintended pregnancies, sexually transmitted diseases, and sexual abuse (Jahoda and Pownall 2014; Baines et al. 2018).
In 2020, relationships and sex education (RSE) became compulsory in English schools for all students, including those with intellectual disability (Department for Education [DfE] 2019). The aim of RSE is to equip all students with knowledge, attitudes, and skills about 'the emotional, social and physical aspects of growing up, relationships, sex, human sexuality and sexual health' (Sex Education Forum 2020). The new policy indicates that all students, including those with intellectual disability, must receive Relationships Education in primary schools (when students are 5-11 years) and RSE in secondary schools (when students are 11-16 years) (DfE 2019). Schools can adapt the content to meet individual needs and developmental levels, and the frequency, duration, and mode of delivery of RSE are not specified (DfE 2019). Across the UK, there is a lack of standardisation of RSE delivery and content (Lafferty et al. 2012). For example, RSE is not compulsory in Scotland, whereas in Northern Ireland it is compulsory (although parents have a right to withdraw their children from parts of this education in primary schools, and schools can develop their own RSE lesson content to match the school's ethos), and RSE became compulsory in Wales from 2022 (The Scottish Government 2014; Council for the Curriculum, Examinations and Assessments 2019; Welsh Government 2021). However, little is known about how to deliver RSE effectively to students with intellectual disability and, importantly, how to measure the impact of RSE in schools. Families are very concerned about children's safety, and teachers are unclear about what RSE should achieve in the school environment and how to get parents on board with its aims and objectives (Todd 2009; Lafferty et al. 2012; Pownall et al. 2012).
Existing systematic reviews on RSE have focused on its content, delivery, and effectiveness for people with intellectual disability of any age and have only included academic papers written in English (Schaafsma et al. 2015;McCann et al. 2019;Sala et al. 2019;Brown et al. 2020). These reviews have suggested that existing RSE programmes lack specific outcome goals and that outcomes measured are heterogeneous (e.g. some studies measured knowledge of sexuality topics and some self-protection skills). This inconsistent use of RSE outcomes in the literature hinders the comparison of the effectiveness of RSE curricula across studies which in turn could affect the development of appropriate RSE for this vulnerable population. In addition, researchers highlight that people with intellectual disability are rarely involved in the development of RSE and thus their needs and concerns are not considered (Schaafsma et al. 2015;McCann et al. 2019;Sala et al. 2019). Instead, researchers are the ones to decide what outcomes to select in their evaluation studies to assess the effectiveness of RSE for students with intellectual disability.
Given these important limitations, there is an urgent need to clearly map the outcomes of RSE for students with intellectual disability whilst also engaging all stakeholders (students with intellectual disability, parents, teachers, policy makers, and researchers) in this process. The development of a Core Outcome Set (COS) involves identifying 'what' to measure and includes stakeholders' opinions on what constitutes meaningful outcomes (Williamson et al. 2012). A COS provides a minimum standard of outcomes that all trials, evaluation studies and practice-based audits should measure and report within a specific health or social care area (Williamson et al. 2012). The Core Outcome Measures in Effectiveness Trials (COMET) Initiative proposes a standardised process, where the first step of COS development, the systematic review, aims to identify outcomes measured in quantitative studies and to identify stakeholders' perspectives on outcomes reported in qualitative studies. These outcomes will then form an inclusive 'long list' of potential outcomes for later parts of the process. An important aspect of such a systematic review involves identifying all outcome measurement instruments used in the literature and evaluating their psychometric properties (Prinsen et al. 2016). The COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN) Initiative has developed criteria for evaluating instruments' measurement properties (reliability, validity, and responsiveness) that help to select the most reliable instruments to measure outcomes (Mokkink et al. 2018).
To date, there is no published COS of RSE for students with intellectual disability. The development of such a COS will help to develop consensus in this sensitive area and would also provide, for the first time, a standardised set of outcomes to be used in research and educational practice to assess RSE delivery and develop effective education for this population. Therefore, we aim to develop a COS for RSE for students with intellectual disability for use in English educational and research settings. This paper reports findings of a systematic review that was carried out as a first step in this COS development. The objectives of the review were (1) to identify outcomes of RSE for students with intellectual disability reported in existing studies; (2) to identify measurement instruments used to measure RSE outcomes; (3) to evaluate the identified instruments' measurement properties (validity, reliability and responsiveness) using COSMIN criteria (Mokkink et al. 2018).

Methods
The systematic review was registered with PROSPERO (registration number: CRD42021243176) and carried out following PRISMA guidelines (Liberati et al. 2009). The entire COS study is registered prospectively in the COMET database (registration number: 1787).

Search strategy
The search consisted of two stages. The first stage of the search was carried out in March 2021 to identify all outcomes of RSE and their measurement instruments using the electronic databases and grey sources listed in Table 1. Citation and reference searching was also performed. The search included a combination of controlled and free-text terms related to intellectual disability, the age of the population and RSE, which were combined using the Boolean operator 'AND' and adapted for each database (see Table S1 in the supporting information).
The second stage of the search was carried out in August 2021 to retrieve studies on identified instruments' measurement properties using the same databases and grey sources as in the first stage. For each instrument the search was carried out separately and included the name of the instrument and terms on intellectual disability which were combined with the Boolean operator 'AND' and adapted for each database (see Table S1 in the supporting information). We also planned to include in the search the COSMIN filter (Terwee et al. 2009) developed to support retrieval of studies on instruments' measurement properties but in the end it was not included in the final search as no studies were retrieved using this filter.
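The way the search strings were assembled can be illustrated with a minimal sketch. It assumes that synonyms within each concept are joined with 'OR' (a common convention that is not stated in the text) and that the concepts are then joined with 'AND', as described above. The terms and the function name `build_query` are hypothetical examples, not the review's actual search strings (these are given in Table S1), and real database syntax differs between providers.

```python
def build_query(concept_blocks):
    """Combine synonyms within a concept with OR and concepts with AND."""
    groups = []
    for terms in concept_blocks:
        # Quote multi-word phrases so they are searched as a unit.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical example terms; the review's actual terms are in Table S1.
STAGE1_BLOCKS = [
    ["intellectual disability", "learning disability"],  # population
    ["student", "adolescent", "young adult"],            # age group
    ["sex education", "relationships education"],        # RSE
]

# Stage 1: population AND age group AND RSE.
stage1_query = build_query(STAGE1_BLOCKS)

# Stage 2: one query per instrument name, combined with the population terms.
stage2_query = build_query([["Sex Ken-ID"], STAGE1_BLOCKS[0]])

print(stage1_query)
print(stage2_query)
```

In the second stage, the same helper would be called once per identified instrument, which mirrors the separate searches described above.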

Inclusion and exclusion criteria
In the first stage of the review, articles were eligible for inclusion if they met the following criteria: (1) at least 75% of participants were students with intellectual disability aged 5-25 years or parents, teachers, and school staff of students with intellectual disability aged 5-25 years. Studies were included whether intellectual disability was administratively defined or defined using standardised assessments of IQ and/or adaptive functioning. Students with co-occurring conditions (e.g. autism) in addition to intellectual disability were eligible for inclusion. The upper age limit of 25 years was selected to include young adults still in education (e.g. special schools that serve students until the age of 25); (2) the study was about any type of RSE delivered in schools, home, or social care settings. Studies that focused on clinical interventions to treat clinical problems related to sexuality that are not part of an educational curriculum were excluded; (3) the study contained information on outcomes measured after RSE delivery or stakeholders' views of RSE content; (4) the study used a qualitative, quantitative, observational, or mixed-methods design. Secondary research (for example, systematic reviews) was excluded; (5) the study was published from 1999 onwards in any language. Searches covered 1999 onwards because most of the development of RSE policies in Europe took place at this time (Ketting and Ivanova 2018), as did their integration in education (Department for Education and Employment 2000).
In the second stage of the review, articles were eligible for inclusion if (1) at least 75% of the sample consisted of students with intellectual disability aged 5-25 years and (2) the study provided information on the instrument's development or measurement properties (e.g. reliability and validity) or evaluated the interpretability of scores (e.g. distribution of scores or floor and ceiling effects).
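The first-stage criteria can be expressed as a simple screening check. The sketch below is illustrative only: the record fields and the function `exclusion_reasons` are hypothetical names introduced here, and the judgement behind each field (for example, whether a study concerns educational RSE rather than a clinical intervention) was made by the reviewers, not by code.

```python
from dataclasses import dataclass


@dataclass
class StudyRecord:
    """Hypothetical screening record for one retrieved study."""
    year: int
    # Share (0-1) of participants who are students with ID aged 5-25 years,
    # or parents, teachers, or school staff of such students.
    share_eligible_participants: float
    is_educational_rse: bool          # RSE in school, home, or social care
    reports_outcomes_or_views: bool   # outcomes after RSE, or views of content
    is_secondary_research: bool       # e.g. a systematic review


def exclusion_reasons(study):
    """Return the first-stage criteria a study fails (empty list = include)."""
    reasons = []
    if study.share_eligible_participants < 0.75:
        reasons.append("fewer than 75% eligible participants")
    if not study.is_educational_rse:
        reasons.append("not educational RSE")
    if not study.reports_outcomes_or_views:
        reasons.append("no outcomes or stakeholder views reported")
    if study.is_secondary_research:
        reasons.append("secondary research")
    if study.year < 1999:
        reasons.append("published before 1999")
    return reasons


eligible = StudyRecord(2018, 0.90, True, True, False)
ineligible = StudyRecord(1995, 0.50, True, True, True)
print(exclusion_reasons(eligible))
print(exclusion_reasons(ineligible))
```

Returning the list of failed criteria, rather than a single yes or no, matches the practice of recording the reasons for excluding studies.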

Study selection and data extraction
In both stages of the review, retrieved studies were stored in EPPI-Reviewer 4 (Thomas et al. 2020) and checked for duplicates. Screening of titles/abstracts and full-texts was performed by the main reviewer (LP). Twenty per cent of titles and full-text records were independently screened by a second reviewer (AP). We recorded numbers and reasons for excluding studies. Any disagreements or discrepancies were resolved through discussion with the senior review team (CV and VT). Studies written in non-English languages that had an abstract written in English were screened against the inclusion criteria; we did not exclude any studies based on whether they had an English abstract or not. The full text of articles written in a non-English language where the abstract appeared to meet the criteria or where there was no abstract were screened by the review team who were proficient or native-level fluent in the relevant language. Other researchers and PhD students (mentioned in the acknowledgments section) helped to screen studies written in a language the review team was not fluent in. We also used professional translation services for some studies that the core review team was unable to translate. Data were extracted by the main reviewer (LP) and 10% of the data were also independently extracted by the second reviewer (AP) to check reliability. In the first stage of the review, data were extracted on study characteristics, sample, data collection method, RSE, outcomes and outcome measurement instruments. In the second stage of the review, we planned to extract data on: (1) study characteristics; (2) results on the instruments' measurement properties (validity, reliability, and responsiveness) and (3) information on the interpretability of scores (such as the distribution of scores or floor and ceiling effects) and instruments' feasibility characteristics (e.g. cost of the instrument, required equipment or training). However, this was not possible as no studies retrieved met the inclusion criteria.

Risk of Bias
In the first stage of the review, a risk of bias assessment was not planned as it was not relevant to the research question. In the second stage of the review, we planned to use the COSMIN Risk of Bias checklist (Mokkink 2017) to evaluate risk of bias, but this was not performed as no study retrieved met the inclusion criteria.
into a long list. An 'outcome' was defined as any construct measured following RSE delivery, stakeholders' views of RSE content, or any RSE-related construct that was part of the education delivered to students with intellectual disability.
First, outcomes were extracted verbatim from the papers with their definitions (if available) and compiled into a list. The main reviewer (LP) grouped the verbatim outcomes by educational stage into three separate lists. RSE outcomes reported for/by students aged 5-10 years were grouped into a list of outcomes for primary education, outcomes reported for/by students aged 11-16 years were grouped into a list of outcomes for secondary education and outcomes reported for/by students aged 16-25 years were grouped into a list of outcomes for further education. Then, the outcomes in those three lists that were overlapping or had the same definitions were merged under the same outcome name by the main reviewer (LP), who recorded how often each outcome was reported in the studies and who reported it (e.g. students with intellectual disability, parents, teachers and/or whether it was measured by researchers). The outcomes that were considered semantically related by the main reviewer (LP) were presented to the senior review team (CV and VT) for evaluation and were then grouped into outcome domains. The senior review team (CV and VT) also reviewed the lists and outcome categorisation. The feasibility and measurability of identified outcomes were also reviewed independently by each core reviewer (LP, CV, and VT) (see Table 2 for the criteria). Any disagreements or discrepancies about outcomes were resolved by consensus through discussion with another reviewer.
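The merging and counting step described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure's actual tooling: the merging decisions were made by the reviewers, so the sketch takes them as a reviewer-supplied mapping (`canonical_names`), and the study labels, outcome wordings, and function name `merge_outcomes` are all hypothetical.

```python
from collections import defaultdict


def merge_outcomes(records, canonical_names):
    """Merge verbatim outcomes per educational stage, counting distinct
    studies and collecting who reported each merged outcome."""
    merged = defaultdict(lambda: {"studies": set(), "reported_by": set()})
    for rec in records:
        # Reviewer-supplied mapping; unmapped wordings keep their own name.
        name = canonical_names.get(rec["verbatim"], rec["verbatim"])
        entry = merged[(rec["stage"], name)]
        entry["studies"].add(rec["study"])
        entry["reported_by"].add(rec["reported_by"])
    return {
        key: {
            "n_studies": len(val["studies"]),
            "reported_by": sorted(val["reported_by"]),
        }
        for key, val in merged.items()
    }


# Hypothetical extracted records (labels are placeholders, not real studies).
records = [
    {"study": "Study A", "stage": "secondary", "reported_by": "parents",
     "verbatim": "knows private body parts"},
    {"study": "Study B", "stage": "secondary", "reported_by": "teachers",
     "verbatim": "understanding of private body parts"},
    {"study": "Study B", "stage": "further", "reported_by": "students",
     "verbatim": "understanding of what contraception is"},
]
canonical_names = {
    "knows private body parts": "understanding of private body parts",
}

long_list = merge_outcomes(records, canonical_names)
print(long_list)
```

Keying the result on both stage and outcome name keeps the three stage-specific lists separate, so the same outcome can appear in the secondary and further education lists with different frequencies.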

Evaluation of measurement properties of identified instruments
In the second stage of the review, we planned for two reviewers to independently apply the COSMIN criteria for content validity (Terwee et al. 2018) to appraise instrument development studies and content validity evaluations. Two reviewers would also have independently applied the COSMIN criteria for good measurement properties (Mokkink et al. 2018) to rate the other measurement properties reported in the identified studies. However, this was not performed because none of the retrieved studies met the inclusion criteria, as discussed in detail in the results section.

First stage of the review
The search retrieved 2219 unique articles, of which 326 were screened on full text. A total of 284 articles were excluded as not meeting the inclusion criteria. Of the papers that did not meet the criteria, 17 were written in languages other than English (for example, French, Arabic, Turkish, Polish, Portuguese and Spanish). A total of 42 articles were included in the review. See Figure 1 for the PRISMA flow diagram. Table 3 provides a description of the 42 included studies. Of these, 27 studies provided qualitative data, six quantitative data and nine mixed-methods data. Studies were conducted in the UK, USA, Africa, Turkey, Australia, Sweden, Ireland, Spain, Greece, Cyprus, Lithuania, Iran, India, Israel, Japan, and Republic of North Macedonia. Papers were written in English, Lithuanian, and Spanish.

Study characteristics
A total of 1767 participants took part in the included studies: 38% (678) were teachers and school staff, 35% (612) were parents and caregivers, 23% (413) were students with intellectual disability and 4% (64) were staff members from community intellectual disability services. Of the 413 students with intellectual disability included, 211 were male and 202 were female; they were aged 5-25 years (73% were aged 16-25 years) and 51% had mild to moderate intellectual disability. Two studies focused on students with genetic conditions (e.g. Down syndrome, Williams syndrome, velocardiofacial syndrome and neurofibromatosis) (Plaks et al. 2010; Gokgoz et al. 2021) and four studies involved students with intellectual disability and co-occurring conditions (e.g. autism, ADHD, speech delay and physical disabilities).

The aim of the RSE programme was only reported in four studies (Sheppard 2006; Todd 2009; Plaks et al. 2010; Graff et al. 2018) (see Table 4). Across the studies, the content mainly focused on protection and managing risks rather than being more comprehensive with a balance of risk management and other skill development. Some studies reported delivering RSE in small groups whilst other studies delivered RSE on an individual basis for a period ranging from 10 weeks to 6 months. RSE was delivered in special schools, group homes, students' homes and community intellectual disability services (e.g. day centres, youth clubs and health services) by teachers, parents, youth service workers, community intellectual disability nurses and researchers. Only two studies reported that parents were involved in the development of RSE content delivered in schools (Sheppard 2006; Graff et al. 2018). None of the studies reported that parents were involved in the delivery of RSE in schools (e.g. asked to deliver after-school activities to their children at home).

Outcomes
A total of 135 individual outcomes were extracted from the studies. Outcomes were grouped by educational stage. Some extracted outcomes were not included in the tables presented here but will still be included in the later stages of the COS process (see Table S2 in the supporting information for these outcomes).

Outcomes for students in primary education
No studies reported outcomes specifically for students with intellectual disability in primary education (aged 5-10 years). However, three studies reported outcomes for students aged 5-25 years and the lower age of participants was in the primary education age range (see Table 5), including Peters (2007).

Outcomes for students in secondary education
Table 6 provides 43 individual outcomes for students with intellectual disability in secondary education (aged 11-16 years) extracted from 15 studies and grouped under 11 outcome domains: understanding of the human life cycle, the human body, relationships, hygiene, keeping safe, appropriate and inappropriate behaviours, human sexuality, sex and its consequences, rights and responsibilities, emotional vocabulary, and positive self-esteem. The most reported outcomes of RSE for students aged 11-16 years were: understanding of private body parts, understanding of areas of a body that should not be touched by others, and understanding what pregnancy is. Out of 43 extracted outcomes, 84% (36) were knowledge based (e.g. understanding of what contraception is), 5% (2) were skills based (e.g. friendship skills and hygiene skills), 9% (4) were behaviour based (e.g. protective behaviour) and 2% (1) were attitudes based (e.g. positive self-esteem).
Out of the 15 studies, only two studies provided information about how some of the extracted outcomes were measured (Sheppard 2006;Stankova & Trajkovski 2021). In Stankova and Trajkovski (2021), understanding of sexuality was assessed before and after RSE delivery using pictures and asking students to say what pictures depicted or point to the correct pictures when asked. For example, students' understanding of areas of the body that should not be touched by others was assessed by presenting illustrations of people touching each other and asking students to say if this is appropriate touch or inappropriate touch based on people's facial expressions in the pictures. In Sheppard (2006), outcomes were measured based on teachers' reports of students' behaviour before and after RSE delivery.
Teachers were asked to rate how frequently (always, regularly, occasionally, seldom, or never) students demonstrated friendship skills, understanding of what sexual relationships are, ability to identify elements of peer pressure, understanding of rights and responsibilities, understanding of different emotions and feelings, and ability to identify positive self-attributes. However, the paper did not report what teachers considered when evaluating these outcomes (e.g. what rights of self and rights of others actually involve and what specific skills were considered when evaluating friendship skills) or how these evaluations were conducted (e.g. whether teachers asked students to verbally describe positive self-attributes or point to pictures).
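The breakdown of outcomes by type can be recomputed directly from the counts reported in the text. The sketch below uses the counts for secondary education (36, 2, 4 and 1 out of 43) and for further education (72, 8, 9 and 3 out of 92); the function name `type_shares` is a hypothetical helper. Shares computed to one decimal place may differ slightly from the figures quoted in the text, depending on the rounding convention used.

```python
def type_shares(counts):
    """Return each outcome type's share of the total, in percent (1 d.p.)."""
    total = sum(counts.values())
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}


# Counts as reported in the text for each educational stage.
secondary = {"knowledge": 36, "skills": 2, "behaviour": 4, "attitudes": 1}
further = {"knowledge": 72, "skills": 8, "attitudes": 9, "behaviour": 3}

secondary_shares = type_shares(secondary)
further_shares = type_shares(further)

for stage, shares in (("secondary", secondary_shares), ("further", further_shares)):
    print(stage, shares)
```

In both stages, knowledge-based outcomes account for roughly four in five extracted outcomes, which is the pattern the review highlights.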

Outcomes reported by parents, teachers, and students in secondary education
There was an overlap between 13 (46%) out of 28 RSE outcomes reported by parents and teachers for students with intellectual disability in secondary education: understanding of personal hygiene, appropriate and inappropriate social and sexual behaviour, consequences of sexual activities, contraception, and protection from sexual abuse. Outcomes that were reported only by parents or only by teachers closely corresponded. However, in three retrieved studies (Todd 2009; Menon & Sivakami 2019; Goli et al. 2020) parents and teachers did not agree that understanding of menstruation, masturbation, and sexual intercourse should be outcomes of RSE for students with intellectual disability.
Students with intellectual disability in secondary education (aged 11-16 years) reported six RSE outcomes: understanding of puberty-related body changes for males and females, understanding of relationships (e.g. how to make close friends), and understanding of sexual intercourse (e.g. understanding of appropriate sexual behaviour and contraception).
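The overlap between parent-reported and teacher-reported outcomes can be thought of as a set comparison. The sketch below assumes that the denominator is the combined set of outcomes reported by either group, which the text does not state explicitly. The outcome labels are a small illustrative subset chosen for the example, not the 28 outcomes extracted in the review, and `overlap_summary` is a hypothetical helper.

```python
def overlap_summary(parent_outcomes, teacher_outcomes):
    """Summarise outcomes reported by both groups relative to all outcomes
    reported by either group."""
    shared = parent_outcomes & teacher_outcomes
    combined = parent_outcomes | teacher_outcomes
    percent = round(100 * len(shared) / len(combined)) if combined else 0
    return {
        "shared": sorted(shared),
        "n_shared": len(shared),
        "n_combined": len(combined),
        "percent_shared": percent,
    }


# Illustrative subsets only; not the review's extracted outcome lists.
parents = {"personal hygiene", "contraception",
           "protection from sexual abuse", "menstruation"}
teachers = {"personal hygiene", "contraception",
            "protection from sexual abuse", "friendship skills",
            "rights and responsibilities"}

summary = overlap_summary(parents, teachers)
print(summary)
```

With the review's own figures, the same calculation gives 13 shared outcomes out of 28, or about 46%.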

Outcomes for students in further education
A total of 92 individual outcomes for students in further education (aged 16-25 years) were extracted from 37 studies and grouped under 13 domains: understanding of the human body, hygiene, menstruation, human sexuality, sex and its consequences, appropriate and inappropriate behaviours, relationships, keeping safe, assertiveness skills, social skills, positive self-esteem, emotional vocabulary, and improvement in attitudes towards sexuality topics (see Table 7). The most reported outcomes across the studies were understanding of private body parts for males and females and understanding of what contraception is, both reported in 13 studies. Out of 92 outcomes, 78.3% (72) were knowledge based (e.g. understanding of what sexual intercourse is), 8.7% (8) were skills based (e.g. social skills, assertiveness skills), 9.8% (9) were about attitudes and feelings (e.g. positive self-esteem, improvement in attitudes towards homosexuality) and 3.3% (3) were behaviour based (e.g. engagement in inappropriate sexual behaviour).
Only one of 37 studies (Graff et al. 2018) reported how some of the extracted outcomes were measured. In this study, students' understanding of consent, of public and private places, and of terms such as 'self-esteem' and 'empathy' was assessed by asking students to match a word (e.g. consent) with the correct definition out of several possible definitions on a paper questionnaire developed for the study. Understanding of private body parts for males and females was assessed by asking students to draw an 'X' on private parts on pictures of male and female bodies. Five studies (Garwood & McCabe 2000; Dukes & McGuire 2009; Plaks et al. 2010; Box & Shawe 2014) only mentioned the names of the instruments used to evaluate the outcomes. For example, in the Dukes and McGuire (2009) study, researchers administered the Sexual Consent and Education Assessment (SCEA) (Kennedy 1993), an interview schedule, to students with intellectual disability to measure students' understanding of sexuality topics (e.g. private body parts, ability to identify own gender). Students' engagement in inappropriate sexual behaviour and understanding of safety practices were assessed by administering the SCEA instrument to students' carers. However, the authors did not report the specific questions students and carers might have been asked.
In Luque and Lugo's (2015) study, students' understanding of private body parts, of how to express affection in different relationships, and their ability to listen to and respect their peers were assessed based on students' reports (e.g. whether they think they are able to respect their peers) and on teachers' and parents' reports of students' behaviour. However, the paper specified neither what specific behaviours teachers and parents were considering when evaluating these outcomes nor how students' understanding of private body parts was assessed (e.g. not showing private body parts in public or asking students to describe it verbally).

Outcomes reported by parents, teachers, and students in further education
There was an overlap between 14 (32%) out of 44 RSE outcomes reported by parents and teachers for students with ID in further education. These outcomes relate to private body parts, understanding of sex and its consequences, relationships, different sexualities, appropriate and inappropriate behaviours, social skills and how to keep safe. We also looked at whether outcomes that were reported only by parents and only by teachers were different. We found that outcomes identified by each stakeholder group separately tended to be similar to outcomes reported by both stakeholder groups together, suggesting that RSE outcomes reported by teachers and parents tend to converge. For example, parents reported an outcome of RSE 'understanding of what menstruation is', whereas teachers reported an outcome of RSE 'understanding of how to choose, use menstrual products'. It is likely that when parents discuss what menstruation is with their children, parents also discuss how to choose menstrual products, but this information was not specified in the papers. However, in two studies (Menon and Sivakami 2019; Goli et al. 2020) parents and teachers did not agree that understanding of menstruation, masturbation, and sexual intercourse should be outcomes of RSE for students with intellectual disability.
Students with intellectual disability reported 52 individual outcomes of RSE that they perceive as important. The most frequently reported outcomes of RSE by students with intellectual disability were understanding of contraception and what sexually transmitted diseases are, both reported in four studies (Löfgren-Mårtenson 2012; Williams et al. 2014; Phasha and Runo 2017; Wiseman and Ferrie 2020). However, students with intellectual disability reported as meaningful some outcomes that were not mentioned by other stakeholders in the studies. Students reported that in RSE they would like to learn how to identify problems with menstruation, how to have sex and enjoy it, how to use contraception (e.g. how to apply condoms), how to take care of a baby, what homosexual sexual experiences are, what it is like living with a partner, how to flirt, how to break up in relationships, how to love and accept yourself, and what to do in case of abuse. On the other hand, outcomes reported by parents and teachers were about how to protect yourself and appropriate and inappropriate behaviours.

Table 7. (Continued)
Outcome | Studies | Who reported/measured
Listening to and respecting your peers | 1 study: Luque and Lugo (2015) | Measured by researchers
Respecting other people's private space | 1 study: Luque and Lugo (2015), Kostigen (2020) | Measured by researchers
Outcome domain: Emotional vocabulary
Ability to recognise and express emotions | 1 study: Luque and Lugo (2015); Lafferty et al. (2012) | Teachers and content of RSE
Outcome domain: Improvement in attitudes towards sexuality topics
Improvement in attitudes towards friendships | 1 study: Garwood and McCabe (2000) | Measured by researchers
Improvement in attitudes towards dating and intimacy | 1 study: Garwood and McCabe (2000) | Measured by researchers
Improvement in attitudes towards sexual interaction | 1 study: Garwood and McCabe (2000) | Measured by researchers
Improvement in attitudes towards masturbation | 1 study: Garwood and McCabe (2000) | Measured by researchers
Improvement in attitudes towards homosexuality | 1 study: Garwood and McCabe (2000) | Measured by researchers
STDs, sexually transmitted diseases; ID, intellectual disability.
† One study contradicts that this should be an outcome of RSE (Goli et al. 2020).
‡ One study contradicts that this should be an outcome of RSE (Menon and Sivakami 2019).
§ One study contradicts that this should be an outcome of RSE (Menon and Sivakami 2019).

Where located: Box and Shawe (2014)
Instrument: Name: "Not a Child Anymore" (Fraser 1987). Original language: English. Available translations: not reported.

Where located: Garwood and McCabe (2000); Plaks et al. (2010)
Instrument: Name: The Sexuality Knowledge, Experience, Feelings and Needs Scale for people with Intellectual Disability (Sex Ken-ID) (McCabe 1994).
Constructs: Knowledge of sexual topics: friendship, dating and intimacy, marriage, body part identification, sex education, menstruation, sexual interaction, contraception, pregnancy, abortion and childbirth, STDs, masturbation, and homosexuality. Feelings towards sexuality topics: friendship, dating and intimacy, marriage, body part identification, sex education, menstruation, sexual interaction, contraception, pregnancy, abortion and childbirth, STDs, masturbation, and homosexuality.
Target population: people with mild ID.
Intended contexts of use: not reported.
Mode of administration: interviews with people with ID (3 separate interviews lasting 1-1.5 h).
When administered: before and after program (specific timings not reported).
No. of items and subscales: 248-item interview schedule classified into Knowledge, Experience, Feelings and Needs areas (only knowledge and feelings scales used in the study).
Response options: The knowledge questions are open-ended questions, with responses scored as 0, 1 or 2. The feelings questions are either yes-no questions (responses scored as 1 or 2) or scored on a 5-point Likert scale (ranging from 1 to 5): "very bad, bad, neutral, good, very good". A total score is obtained for each of the sub-areas (knowledge or feelings) within each subscale (e.g. friendship, STDs).
Original language: English. Available translations: not reported.

Characteristics of the outcome measurement instruments
Five instruments were used to measure RSE outcomes in the retrieved studies (see Table 8).
The Not a Child Anymore (Fraser 1987) instrument was administered in Box and Shawe's (2014) study with people with intellectual disability aged 20-24 years to assess their knowledge of sexual behaviour (e.g. kissing and sexual assault) and knowledge of sexual topics (e.g. masturbation and contraception). The authors reported that this is a self-report questionnaire that contains 111 questions; no other information about the instrument was provided (e.g. whether it was developed for people with intellectual disability, or its measurement properties).
The Sexual Consent and Education Assessment (SCEA) (Kennedy 1993) is an interview schedule developed to assess the capacity of individuals with cognitive impairment (e.g. intellectual disability or traumatic brain injury) to consent to sexual activities and to identify areas where individuals need further education. This instrument was administered in Dukes and McGuire's (2009) study to students with intellectual disability aged 22-23 years before the RSE delivery, at the end of each session and at 6 months' follow-up. It contained 35 questions distributed over three scales: the Knowledge of Human Sexuality Scale (only the first four items of the scale were delivered to students, in an interview lasting 20-50 min), the Safety Practices Scale and the Inappropriate Sexual Behaviour Scale (the latter two scales were delivered in a separate interview to students' carers). Dukes and McGuire (2009) reported that the instrument 'has high internal stability and inter-rater reliability and satisfactory test-retest reliability' without providing data to support these statements.
The Sexuality Knowledge, Experience, Feelings and Needs Scale for people with Intellectual Disability (Sex Ken-ID) (McCabe 1994) is an interview schedule developed for people with mild intellectual disability to assess their knowledge, experiences, feelings and needs over 12 sexuality topics: Friendship, Dating and Intimacy, Marriage, Body Part Identification, Sex and Sex Education, Menstruation, Sexual Interaction, Contraception, Pregnancy, Abortion and Childbirth, Sexually Transmitted Diseases, Masturbation, and Homosexuality. The instrument contains 248 questions administered in three separate interviews, each lasting 1-1.5 h. This instrument was administered in two studies (Garwood and McCabe 2000; Plaks et al. 2010) to students with intellectual disability aged 12-25 years before and after RSE delivery. In Garwood and McCabe (2000), only the Knowledge and Feelings scales were administered. The authors reported that the instrument has 'moderate to high' internal consistency within each of the 12 sexuality topics. In their paper, the authors provided Cronbach's alpha for selected scales: 0.47 for the Knowledge of Dating and Intimacy scale; 0.79 for the Feelings of Dating and Intimacy scale; 0.79 for the Knowledge of Sexual Interaction scale; and 0.46 for the Feelings of Sexual Interaction scale. The authors reported that the feelings scale was less reliable as it has only 2-3 questions within each of the 12 sexuality topics.
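For readers less familiar with the internal consistency statistic quoted above, the following minimal sketch shows how Cronbach's alpha is conventionally computed from item-level scores, using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the example scores are hypothetical and are not taken from any of the reviewed studies; the reviewed papers do not report how their coefficients were calculated.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a respondents-by-items score table.

    `scores` is a list of rows, one per respondent, each holding that
    respondent's score on every item of a single scale.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent needs a score for every item")

    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data (an assumption for illustration only):
# five respondents answering three items on a 1-5 Likert scale.
demo_scores = [
    [4, 5, 4],
    [2, 3, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 2],
]
demo_alpha = cronbach_alpha(demo_scores)
print(f"alpha = {demo_alpha:.2f}")
```

Because the coefficient depends on the number of items as well as on how strongly they correlate, scales with only two or three items will often yield lower values, which is in line with the authors' explanation for the weaker feelings scales.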
The Skills and Independent Activities Questionnaire (Argaman 2003), a parent-report scale, was administered in Plaks et al. (2010) to assess social skills and engagement in independent activities at home before and after RSE delivery for students with intellectual disability aged 15-25 years. The assessment of the understanding of the concept of a friend (Argaman 2003), a self-report questionnaire, was also administered in Plaks et al. (2010) to students with intellectual disability. No other information about these two instruments was provided in the paper (e.g. whether the instruments were developed for people with intellectual disability and instruments' measurement properties).

Second stage of the review
A separate systematic search was conducted for each of the five instruments: the Not a Child Anymore (Fraser 1987); the SCEA (Kennedy 1993); the Sex Ken-ID (McCabe 1994); the Skills and Independent Activities Questionnaire (Argaman 2003) and the assessment of the understanding of the concept of a friend (Argaman 2003). For three instruments, namely the Not a Child Anymore (Fraser 1987), the SCEA (Kennedy 1993) and the Sex Ken-ID (McCabe 1994), the searches retrieved studies additional to those retrieved in the first stage of the review (see supporting information for PRISMA flow diagrams).
The search for studies on the SCEA (Kennedy 1993) retrieved eight individual articles, which were screened on title and abstract. Four papers were screened on full text, but they did not meet the inclusion criteria: one did not evaluate measurement properties of the instrument; one was a duplicate; and two provided information on the instrument's development, internal consistency, criterion validity and construct validity, but the evaluations involved people with intellectual disability aged above 25 years.
The search on the Sex Ken-ID (McCabe 1994) retrieved six individual studies, which were screened on title and abstract. Three studies were screened on full text, but they did not meet the inclusion criteria: two did not evaluate the instrument and one provided information on the instrument's development, internal consistency and test-retest reliability, but the evaluations were carried out with people with intellectual disability aged above 25 years.
The search on the Not a Child Anymore (Fraser 1987) retrieved two individual articles which were screened on title and abstract and full-text, but these studies did not meet the inclusion criteria: one did not evaluate the instrument and one full-text was not available.
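The screening counts reported above can be tallied as a simple consistency check. The sketch below is illustrative only: the class and field names are hypothetical, the exclusion labels paraphrase the reasons given in the text, and the numbers are copied from the three paragraphs above rather than from the PRISMA flow diagrams in the supporting information.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class InstrumentSearch:
    instrument: str
    retrieved: int               # records screened on title and abstract
    full_text_stage: int         # records taken forward to full-text screening
    exclusions: Dict[str, int]   # exclusion reason -> number of records

    def included(self) -> int:
        """Records remaining after full-text exclusions."""
        return self.full_text_stage - sum(self.exclusions.values())


SEARCHES = [
    InstrumentSearch(
        "SCEA (Kennedy 1993)", retrieved=8, full_text_stage=4,
        exclusions={
            "did not evaluate measurement properties": 1,
            "duplicate": 1,
            "participants aged above 25 years": 2,
        },
    ),
    InstrumentSearch(
        "Sex Ken-ID (McCabe 1994)", retrieved=6, full_text_stage=3,
        exclusions={
            "did not evaluate the instrument": 2,
            "participants aged above 25 years": 1,
        },
    ),
    InstrumentSearch(
        "Not a Child Anymore (Fraser 1987)", retrieved=2, full_text_stage=2,
        exclusions={
            "did not evaluate the instrument": 1,
            "full text not available": 1,
        },
    ),
]

for search in SEARCHES:
    print(f"{search.instrument}: retrieved {search.retrieved}, "
          f"full text {search.full_text_stage}, included {search.included()}")
```

Under these counts, every record that reached the full-text stage is accounted for by a stated exclusion reason, so no study on any of the three instruments met the inclusion criteria.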

Discussion
This review identified 135 RSE outcomes for students with intellectual disability aged 5-25 years that were reported as important by stakeholders across 42 studies. The majority of outcomes (92) related to students with intellectual disability in further education (aged 16-25 years), 43 outcomes were extracted for students in secondary education (aged 11-16 years) and there were no outcomes reported for students with intellectual disability in primary education (aged 5-10 years).
Outcomes were predominantly knowledge-based and focused on improving understanding of the human body, hygiene, relationships, sexual intercourse and its consequences, how to protect yourself from potential abuse, and appropriate and inappropriate behaviours, rather than pertaining to skills and attitudes. However, many students with intellectual disability have difficulties understanding abstract and complex sexuality concepts (e.g. consent in relationships) and applying the new knowledge in practice (Finlay et al. 2015; Bundock & Hewitt 2017). This may also explain why students with intellectual disability in the retrieved studies reported that in RSE lessons they want to develop skills (e.g. how to apply condoms) and to learn how to have romantic relationships, enjoy sex and embrace their sexuality. In contrast, outcomes reported by the parents and teachers focused on developing knowledge of appropriate and inappropriate behaviours and how to keep safe.
Most of the outcomes were extracted from qualitative data and only eight studies reported outcomes measured following RSE delivery. However, RSE evaluation studies included in the present review were reported in ways that suggest they were not developed systematically; the aim of RSE programmes was rarely specified, none of the studies mentioned theoretical underpinnings of their programmes and only two studies delivered RSE based on materials developed for people with intellectual disability. Studies also lacked detail on what specific outcomes were measured, why these outcomes were selected and how the measurement was performed.
These represent some significant limitations as RSE programmes without clearly specified aims and theoretically linked outcomes cannot be robustly evaluated to determine the effects of their implementation (Fernandez et al. 2019). Without a theory of change, an understanding of what 'active ingredients' may bring about change is also hindered (O'Cathain et al. 2019). Conclusions drawn from such studies are of limited value and it is very challenging to replicate the findings in different settings without additional information.
In addition, RSE programme aims should be developed on the basis of a needs assessment and stakeholders' views (Fernandez et al. 2019; Skivington et al. 2021). However, we found that none of the studies reported including the views of students with intellectual disability in the development of RSE programmes. This is a significant limitation, as learner-centred teaching that incorporates students' interests and individual needs is associated with higher student motivation, achievement and skills development (Meece et al. 2003; Alfassi 2004; Dano-Hinosolango and Vedua-Dinagsao 2014). Thus, these RSE programmes are likely to be less relevant and effective for students with intellectual disability.
RSE content delivered to students with intellectual disability was only briefly mentioned in the papers and focused mainly on protection and managing risks. Compared with the suggested topics of comprehensive sexuality education by the World Health Organization (2010), the content of RSE in the studies did not seem to include topics on sexual identity, sexual pleasure, how to seek help and information on sexuality topics (e.g. how to access sexual health services), how to express needs and wishes within relationships and information on their sexual rights (including the right to marriage and choosing romantic partners). This gap also contrasts with the RSE statutory guidance for English schools (DfE 2019).
Our review identified that no instruments have been validated to measure outcomes of RSE for students with intellectual disability aged 5-25 years. Searches carried out on measurement properties of the five identified instruments revealed that only two instruments (the Sex Ken-ID and SCEA) had been evaluated. However, the evaluations had not involved students with intellectual disability aged 5-25 years. For example, the Sex Ken-ID instrument was administered in two studies (Garwood and McCabe 2000; Plaks et al. 2010) to students with intellectual disability aged 12-25 years, but the piloting of the instrument was performed with people with intellectual disability aged above 27 years (McCabe et al. 1999). Further, the measure's reported internal reliability was low (e.g. Cronbach's alpha values for the Knowledge of Dating and Intimacy and Feelings of Sexual Interaction scales were below 0.60) (Garwood and McCabe 2000) and we found no studies on its content validity.
Neither instrument was designed to be used across the intellectual disability spectrum, and their content is likely outdated (both were developed in the 1990s). The SCEA instrument was designed for people with cognitive impairments, such as people with traumatic brain injury and people with intellectual disability. However, those two populations are not the same and the instrument might fail to capture relevant aspects for people with intellectual disability. The administration of the Sex Ken-ID requires three individual interviews of 1-1.5 h and is not feasible with students with more severe intellectual disability and verbal communication difficulties. Therefore, additional studies are needed to evaluate these instruments' validity, reliability and feasibility with students with intellectual disability.

Strengths and limitations
To our knowledge, this is the first systematic review on RSE outcomes for students with intellectual disability. This is also the first systematic review with the aim of evaluating RSE outcome measurement instruments' measurement properties. We used a comprehensive search strategy that involved searching nine electronic databases and grey literature and including studies published in any language. Thus, we were able to capture a range of stakeholders' perspectives of meaningful outcomes.
Although the review included studies from several countries, the majority of studies were carried out in Western countries. Therefore, the identified list of outcomes might be less generalisable to non-Western countries.
Outcomes identified in this review also do not represent RSE outcomes for all students with intellectual disability. Our review identified that there were no RSE outcomes reported specifically for students with intellectual disability in primary education (5-10 years). It is likely that RSE outcomes for younger students would differ from the outcomes extracted for older students with intellectual disability. Further, outcomes reported by students with intellectual disability were extracted from studies that carried out interviews with verbal students. There were no studies that had included the views of students with more severe intellectual disability and verbal communication difficulties.
We were not able to retrieve full-texts of 23 papers (7%) to see if they would meet our inclusion criteria and there is the possibility that we missed some RSE outcomes.

Implications for research and practice
Our review identified a discrepancy between the RSE outcomes perceived as important by students with intellectual disability and those perceived as important by their parents and teachers. Therefore, it is essential to include students with intellectual disability in the development of RSE. Our findings suggest a need for RSE programmes to focus on skills development rather than solely providing facts for this population. This could be carried out by including activities that allow students with intellectual disability to practise skills (e.g. using role plays in schools or asking students to complete ethically sensitive activities such as applying condoms at home) and by embedding the new skills learned in everyday life (e.g. asking for a student's consent before touching them).
RSE programme developers could consult available guidelines more closely, such as the Intervention Mapping toolkit (Fernandez et al. 2019) or the UK Medical Research Council guidance on developing complex interventions (Skivington et al. 2021), to ensure that the programmes developed are comprehensive and reflect the needs and priorities of students with intellectual disability. The findings here clearly highlight the need to develop validated instruments to assess the effectiveness of RSE with students with different levels of intellectual disability and verbal abilities in school settings.
The list of outcomes resulting from the current review could be used to guide research and school-based evaluations in Western countries. This list of outcomes could also be used as a starting point for researchers interested in developing a stakeholder consensus-based COS of RSE for students with intellectual disability in different cultural contexts.

Supporting Information
Additional Supporting Information may be found online in the supporting information tab for this article.
Table S1: Example of search strategy used in the 1st and 2nd stage of review
Table S2: RSE outcomes for students with intellectual disability that age was not specified
Figure S1: PRISMA Flow Diagram: study selection process on the Sexual Consent and Education Assessment (SCEA)
Figure S2: PRISMA Flow Diagram: study selection process on the Sexuality Knowledge, Experience, Feelings and Needs Scale for people with Intellectual Disability (Sex Ken-ID)
Figure S3: PRISMA Flow Diagram: study selection process on the Not a Child Anymore