Different Modality, Different Design, Different Results: Exploring Self-regulated Learner Clusters' Engagement Behaviours at Individual, Group and Cohort Activities

Self-Regulated Learning (SRL) competence is an important aspect of online learning. SRL is an internal process, but analytics can offer an externalisation trigger to allow for observable effects on learner behaviours. The purpose of this paper is to explore the relationship between students’ SRL competence and their learning engagement behaviours observed in multimodal data. In a postgraduate course with 42 students, eighteen features from three types of data in seven learning activities were extracted to investigate the engagement behaviours of students with different levels of SRL competence. The results revealed that students in different SRL competence clusters might exhibit different behaviours in individual, group, and cohort level learning activities. Also, students with similar SRL competence might exhibit significantly different engagement behaviours in different learning activities, depending on the learning design. Therefore, when engagement data is used in AIED systems, the modality of the data, the specific analysis techniques used to process it, and the contextual particularities of the learning design should all be explicitly presented, so that they can be considered in the interpretations of automated decisions about student achievement.


Introduction and Background
Online learning is considered to have the advantage of breaking the barriers of time and space, which may provide more flexibility and accessibility for students [1,2]. Even though online learning has great potential, its effectiveness depends on students' active engagement as well as on how they plan, control, and reflect on their learning activities in these settings [3,4]. This process has been broadly referred to as self-regulated learning (SRL) [5]. Self-regulated learning has been defined as the process of modulating emotion, cognition and behaviours to accomplish desired goals, which could lead to a higher level of academic achievement [6][7][8][9]. However, as SRL is an internal and hidden process, representative measures such as behavioural proxies or self-report data are required for probing SRL [10].
Self-report questionnaires are extensively used to assess students' SRL characteristics due to their advantage in capturing learners' perceptions. However, it is questionable whether this subjective measure can comprehensively represent SRL. Rather than viewing SRL as an aptitude, many researchers hold that SRL is a dynamic process in which individual learners respond to the constraints imposed by the learning environment in accordance with their personal goals and the task at hand [11][12][13]. Although alternative measures such as think-aloud and event-based interviews are more responsive to the situation, their time-demanding nature and the cognitive load they impose on learners during task engagement make them less practical and unsatisfactory instruments to assess SRL [14]. With the rise in computer-based learning environments, digital traces (how students selectively interact with the available digital resources) are another available measure to observe SRL, especially in terms of how and when students deploy an SRL strategy within the environment.
Research has shown that self-report measures and online traces can both capture SRL, albeit from different perspectives [14][15][16]. Hadwin et al. [15] found that while students did not accurately report their actual behaviours, trace data can be applied to study students' actual enactment of SRL strategies. Similarly, Jamieson-Noel and Winne [16] concluded that students used different criteria when reporting SRL compared to their deployed tactics that emerged through log data. Van Halem et al. [14] suggested that self-reports are better at capturing students' prior performance, whereas trace data are better at explaining variance in students' subsequent performance. Therefore, self-reports can be used as a complement to trace data. The method that integrates multiple data sources to analyze learners' interactions and examine complex learning processes is known as Multimodal Learning Analytics (MMLA) [17]. MMLA aims to generate tailored and timely support for learners and to aid teachers' decision-making in learning contexts [18]. Previous research has shown that models generated from multimodal data usually achieve higher predictive accuracy than models generated from unimodal data [19][20][21].
SRL research with MMLA has been introduced in many contexts such as individual distance learning [22][23][24] and collaborative learning [25][26][27][28]. For example, perceived individual versus group challenges for regulation were studied in collaborative contexts by Järvelä et al. [29], whereas the emotional regulation of students in an agent-based individual learning platform was investigated by Azevedo et al. [30]. However, most available studies focus on a single learning context and generate results based on learners' interaction data from a single digital environment or learning activity. Learning design refers to a set of learning activities or learning interactions, including resources and materials prepared for students to achieve learning goals [31]. Learner behaviours are likely to vary depending on the learning design, domain, method of assessment, and expected learning outcomes, as well as the social plane in which students operate (e.g., individual, group, and cohort levels). Although the impact of learning design on learner behaviours is well-established in the learning sciences literature, cross-context and cross-modality differences in students' SRL behaviours in digital environments are underexplored. Here, we investigate the differences in students' engagement behaviours based on their SRL competence clustering when they are engaging in learning activities alone, in small groups, and at the cohort level. More specifically, we investigate two research questions:
1) To what extent do students with different SRL competence exhibit different engagement behaviours in learning activities?
2) To what extent do students with the same SRL competence engage differently in learning activities from different social planes (individual, group, and cohort)?
This study aims to provide insights into how SRL behaviours might potentially differ depending on the learning activities and social planes in which these activities are implemented. Through the identification of such potential differences, our ultimate goal is to support future designs of AI and analytics solutions that may provide appropriate support in different learning activities and at different social planes.

2 Context of Study

2.1 The Educational Context
Forty-two postgraduate students from a UK university enrolled in and completed the 10-week course in Design and Use Technology for Education. All of them consented to participate in the study, which obtained ethical approval from the institution. During the course, students participated in learning activities that required them to interact at different social planes: individual level, group level and cohort level. At the individual level, each week students were asked to (1) finish the readings available on the online platform, (2) watch the video lectures about the weekly topic, and (3) write individual reflections on what they had learnt. At the cohort level, students (4) participated in an online asynchronous debate on a debate platform and (5) voluntarily attended a live Q&A session for clarification. In addition, a forum was available for asynchronous discussion. At the group level, students were divided into ten groups (with 3-5 members each). Groups had interdisciplinary backgrounds and mixed genders, and members were from similar time zones so that synchronous group meetings could be scheduled conveniently. Each group had to select an educational challenge and carry out an educational technology design case to solve the challenge it identified. Every week, the group (6) had an online synchronous group meeting for discussion and (7) completed the group tasks in a co-design environment. During the meeting, the speakers' timestamps were recorded to automatically generate students' speech time pie charts and their turn-taking networks [32]. These graphs were manually combined with narrative feedback from two teaching assistants and sent to students via weekly emails.
The assignment consisted of two parts: Part A focused on students' individual reflections (40%) and Part B was a 2500-word essay (60%) providing a critical analysis of the key issues in the design case. Feedback was provided as (1) formative feedback on Part A at mid-term (week 6), (2) feedback on the draft of Part B before the final submission, and (3) summative feedback at the end of the module (week 15). The final submissions were marked by three reviewers. Double marking was applied to 35% of the final submissions, achieving 96% inter-rater reliability in the final grades given.

2.2 Measuring students' SRL competence
Before the course started, all students completed a questionnaire that was used to evaluate their SRL competence. The questionnaire was adapted from a meta-review in the SRL field concerning multiple SRL dimensions in relation to academic achievement, including metacognition (metacognitive strategies, goal-setting, self-monitoring and self-control), attention, learning strategies, time management, environmental structuring, help-seeking, motivation, emotion control, and attributions [9]. The adapted version can be found here. Reliability was tested using Cronbach's alpha (0.771). In this study, students were divided into high (n=13, Mean = 4.18, SD = 0.14), medium (n=14, Mean = 3.80, SD = 0.15) and low (n=15, Mean = 3.37, SD = 0.14) SRL competence clusters according to the mean score of all dimensions of the questionnaire.
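As an illustration of the reliability check mentioned above, Cronbach's alpha can be computed from a student-by-item response matrix. The sketch below is a minimal, generic version of the standard formula; the function name and the toy responses are hypothetical and it is not the authors' analysis script.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row per student, one column per item)."""
    k = len(responses[0])                      # number of questionnaire items
    items = list(zip(*responses))              # transpose: one tuple per item
    item_variances = [variance(col) for col in items]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data (hypothetical): three students answering three perfectly consistent items.
toy_responses = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(toy_responses)
```

With real questionnaire data the rows would hold each student's item scores; values closer to 1 indicate higher internal consistency.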

Data Collection and Preprocess
In total, three types of data were collected from seven sources: 1) timestamp data was collected from Zoom meetings to investigate students' speech behaviours within the group contexts; 2) log data was collected from Moodle, Ligilo, Echo360, Miro and Google Docs; and 3) qualitative manual coding data was recorded from live Q&A sessions to investigate students' engagement in the sessions. Since the submission deadline for the individual reflection was five weeks after the course ended, there were 15 weeks of log data from Google Docs. For the other sources, there were a total of 11 weeks of log data, which included the reading week (a mid-term break at week 6). All data was anonymized according to the requirements of the institutional ethical approval before its analysis.
Timestamp Data. During the ten-week collaboration, groups used Zoom (https://zoom.us) as an online synchronous meeting platform to meet, plan and complete their design tasks. In each meeting, which lasted around 1-2 hours, a system was used to collect timestamp data from Zoom. Once a user's microphone was detected as active, the system captured the timestamp and the user's ID. If no user's microphone was detected as active, the system also captured the timestamp and marked the user's ID as 0. For each meeting, the system also recorded all users' IDs and names, as well as the meeting's ID and title. All this data was integrated into one JSON file and stored in a cloud database. In total, 130000 lines of timestamp data were successfully collected from eighty-five remote group meetings (some data was lost due to technical issues). The timestamps were 13-digit values, accurate to milliseconds.
During data preprocessing, the data was first extracted from the JSON files and restructured into a CSV file. Each line in the CSV file presented the week number, the group number, the speaker's student ID, the timestamp of the speech beginning, the timestamp of the speech end, and the duration of the speech. The meetings' IDs and titles were used to identify the week number and group number. The speakers' usernames were used to identify who the speaker was. Since the system only records the timestamps of speech beginnings, the timestamp at which a new speaker appeared was taken as the end of the previous speech. The duration of a speech was calculated as the difference between the timestamp of its beginning and the timestamp of its end.
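The segmentation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the system used in the study: the event layout `(timestamp_ms, speaker_id)`, the function name, the choice to treat a silence marker (ID 0) as ending the previous speech, and the choice to drop the final open-ended speech are all hypothetical.

```python
def speech_segments(events):
    """Turn time-ordered (timestamp_ms, speaker_id) events into speech segments.

    speaker_id 0 marks silence. A segment ends when a different ID appears.
    """
    segments = []
    start_ts, current = None, None
    for ts, speaker in events:
        if speaker == current:
            continue  # the same speaker is still active, so no new segment starts
        if current not in (None, 0):
            segments.append({
                "speaker": current,
                "start": start_ts,
                "end": ts,
                "duration_ms": ts - start_ts,
            })
        start_ts, current = ts, speaker
    return segments  # the last speech has no known end and is not emitted


# Toy events (hypothetical IDs and millisecond timestamps).
toy_events = [(1000, 7), (1500, 7), (4000, 0), (5000, 9), (9000, 7)]
toy_segments = speech_segments(toy_events)
```

Summing `duration_ms` per speaker would give a speech-time measure, and counting segments per speaker would give a speech-frequency measure.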
Log Data. In this study, log data was collected from five online platforms/tools: Moodle, Ligilo, Echo360, Miro and Google Docs. The log data from Moodle (https://moodle.org) consisted mostly of the number of pages viewed, posts created, and tasks completed. Ligilo (https://www.go-ligilo.com) hosted the online asynchronous debate activity in the course. It is an online discussion platform designed specifically for peer learning, which enables students to connect their posts with others' through a specified relationship. The data from Ligilo recorded the number of posts viewed and posts created. Echo360 (https://echo360.com) is a video platform on which students can view all pre-recorded lectures in the course. It recorded how many lectures each student viewed and the duration of each lecture viewing. Miro (https://miro.com) is an online visual collaboration platform that supports, for example, brainstorming, mind mapping and prototyping. Every week, each group was asked to finish pre-set tasks related to the weekly topic on Miro. The platform recorded the timestamped actions made by students. Google Docs (https://docs.google.com) was used as a tool for students to write individual reflections. An open-source Chrome extension, Draftback (http://draftback.com), was used to extract log data from the edit history of Google Docs. The data provided information about 1) what type of change (insertion or deletion) was made; 2) the start and end index in the document at which the change was made; 3) when the change was made; 4) the unique identifier of the change, called the 'revision number'; and 5) what content was inserted or deleted. During data preprocessing, all log data was restructured into CSV files. Activity data from teachers, course administrators, and students who dropped the course were removed. Then, duplicated data was identified using the timestamps: if the same action was made more than once in a very short period, a single record was kept as the actual action.
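The duplicate-removal step can be sketched as below. The text does not state how short the "very short period" was, so the one-second window, the record layout `(timestamp_ms, student_id, action)` and the function name are assumptions made purely for illustration.

```python
def drop_rapid_duplicates(records, window_ms=1000):
    """Keep one record per burst of identical actions by the same student.

    records must be sorted by timestamp. window_ms is a hypothetical threshold.
    """
    last_seen = {}   # (student_id, action) -> timestamp of the latest occurrence
    kept = []
    for ts, student, action in records:
        key = (student, action)
        is_duplicate = key in last_seen and ts - last_seen[key] <= window_ms
        last_seen[key] = ts  # extend the burst so chained repeats collapse together
        if not is_duplicate:
            kept.append((ts, student, action))
    return kept


# Toy log (hypothetical): three rapid repeats, one later view, one other student.
toy_log = [
    (0, "s1", "view"),
    (300, "s1", "view"),
    (900, "s1", "view"),
    (5000, "s1", "view"),
    (5100, "s2", "view"),
]
cleaned = drop_rapid_duplicates(toy_log)
```

Comparing each record with the latest occurrence, rather than with the first record of the burst, is a design choice: it collapses a long run of rapid repeats into a single action.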
Qualitative Observation Data. To investigate students' engagement in Q&A sessions, manual coding was applied to record: 1) students' attendance, and 2) whether students made relevant contributions to the discussion in the Q&A sessions. For attendance, the researchers recorded the list of participants in each Q&A session three times (at the beginning, in the middle and at the end of the session) in order to account for students who arrived late or left early. Students were marked as "attended" only if they appeared in at least two of the participant lists. Meanwhile, during the sessions, students were marked as "made contributions" if they asked questions, answered questions, or shared their opinions on the content of the course.
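The attendance rule (present in at least two of the three participant lists) is simple enough to express directly. The sketch below uses hypothetical student identifiers and a hypothetical function name.

```python
def attended(student_id, participant_lists, min_lists=2):
    """True if the student appears in at least min_lists of the recorded lists."""
    appearances = sum(student_id in plist for plist in participant_lists)
    return appearances >= min_lists


# Toy session (hypothetical): lists taken at the beginning, middle and end.
session_lists = [{"a", "b"}, {"a", "c"}, {"c"}]
attendance = {s: attended(s, session_lists) for s in ("a", "b", "c")}
```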

3.2 Behavioural Features
Behavioural features were generated from the different modalities of data described above to evaluate students' engagement in different learning activities across the whole course. Table 1 shows the extracted behavioural features and their description.

Reflection_Avg-StrCountPerDay: the average string count of the revisions that a student made per day in the reflection document

Comparison tests
To explore the relationship between students' SRL competence levels and their engagement in different learning activities, significance tests of difference were used. Students' general SRL competence levels were examined as independent factors, while the eighteen features described above were considered as dependent factors. Tests for homogeneity of variance were conducted for each feature to determine the normality of data distributions, and the appropriate tests were administered accordingly. If the p-value was less than 0.05, indicating a violation of the assumption, the Kruskal-Wallis test for nonparametric comparisons was applied. Otherwise, a one-way ANOVA was used to explore whether there were differences between the SRL competence groups. Bonferroni corrections for multiple comparisons were applied in the post-hoc tests.
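The decision procedure described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' analysis: the function names and toy data are hypothetical, the assumption-test p-value is taken as an input, and obtaining a p-value for the F statistic would additionally require the F distribution from a statistics library.

```python
def choose_test(assumption_p, alpha=0.05):
    """Pick the comparison test from the p-value of the assumption check."""
    return "kruskal-wallis" if assumption_p < alpha else "one-way anova"


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of comparisons, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy feature values (hypothetical) for high, medium and low SRL clusters.
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value, df_b, df_w = one_way_anova_f(toy_groups)
adjusted = bonferroni([0.01, 0.02, 0.5])
```

With three clusters there are three pairwise post-hoc comparisons, so each raw p-value is multiplied by three before being compared with 0.05.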

3.4 Time Series Analysis
Due to the time-dependent nature of the data being collected, a time-series analysis was also applied. Time-series analysis is useful for exploring behaviours over time and extracting time-series components to better understand and model future behaviours [33]. This analysis is based on the assumption that earlier data points are good predictors of later data points [33]. The main time-series component used in this study was the trend. The trend represents the long-term direction of the dataset, which can be increasing (upward) or decreasing (downward). In the module, most of the tasks (e.g., cohort debates and group discussions) were advised to be completed on a specific day of the week, except the individual reflection, which students were free to complete on any day of the week until the optional formative feedback was given at mid-term. Therefore, students' weekly behaviours on individual reflection tasks were investigated using time-series analysis.

The first columns of Table 2 show the mean value and standard deviation for each behavioural feature at the cohort level. An ANOVA on the value for Moodle_DiscussionView produced a significant difference, F(2,39) = 5.237, p < 0.05, partial η² = 0.212. Follow-up post-hoc tests (with Bonferroni correction) showed that for the Moodle_DiscussionView value, the high SRL group was significantly higher than the medium SRL group, p = 0.013. This means that high SRL competence students viewed the discussions on Moodle more frequently than medium SRL competence students. Furthermore, there was no significant difference in Moodle_DiscussionCreate, Ligilo_View, Ligilo_Post, Q&A_Attendance, and Q&A_Speech. This means that students with different SRL competence levels did not exhibit different engagement in online asynchronous debate activities and live Q&A sessions.

At the individual level, the first columns of Table 4 show the mean value and standard deviation for each behavioural feature. The results of the ANOVA in the last column show that no statistical difference was found in these learning activities. Students with different SRL competence levels did not exhibit statistically significant differences in engagement behaviours at this level. Given the time-dependent nature of the reflection task, time series analysis was conducted to investigate time-series components. The average number of edited strings per day was selected as a proxy to represent student reflection behaviours. Figure 1 shows the average number of edited strings per day of students with high, medium, and low SRL competence across 15 weeks. Although all three groups showed a similar trend of reflecting more productively towards the end of the course (submission date in week 15), the graph visually suggested that students with high SRL competence and students with medium SRL competence tended to be more active in reflecting on their learning. For instance, in Figure 1, these two groups of students are observed with more and higher peaks than the students with low SRL competence. These differences are significant in the week after students received their mid-term feedback and the week before the submission deadline. A more detailed discussion of the time-series analysis can be found in [34].
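To make the notion of a trend component concrete, the sketch below smooths a daily series with a moving average and estimates its overall direction with a least-squares slope. The text does not specify how the trend was extracted, so both techniques are generic stand-ins chosen for illustration, and the function names, window size and toy series are hypothetical.

```python
def moving_average(series, window=7):
    """Moving average over full windows; returns len(series) - window + 1 values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def trend_slope(series):
    """Least-squares slope of the series against its index (positive = upward trend)."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(series))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


# Toy daily averages of edited strings (hypothetical values).
toy_daily = [1, 2, 3, 4, 5]
smoothed = moving_average(toy_daily, window=3)
slope = trend_slope([1, 3, 5, 7])
```

Applied per SRL cluster to a series such as the daily average of edited strings, the smoothed curves could be compared across clusters in the same way the lines in Fig. 1 are compared visually.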

Discussion
In this study, eighteen features were extracted from three different types of data in seven learning activities to investigate the engagement behaviours of students with different SRL competence at cohort, group, and individual levels. The results illustrated significant differences in students' behaviours at different social planes. The first research question explored whether students with different SRL competence levels exhibit different engagement behaviours in the learning activities. At the cohort level, a significant difference was observed in students' engagement with the discussion activities on Moodle. The students with high SRL competence tended to view more posts in the discussion forums than the students with medium SRL competence. Further investigation of the log data from Moodle revealed that these engagement behaviours mainly focused on sharing individual reflections and asking questions to teachers. This links in particular to the recent findings of Jivet et al. [35] that only students with high SRL skills, specifically high help-seeking skills, value the discussion forum as a source of help. It was also observed that students with high SRL competence more frequently viewed others' reflection input as well as discussions between students and teachers. In terms of the online asynchronous debate and live Q&A sessions, students with different SRL competence did not appear to exhibit different behaviours. At the group level, students with different SRL competence did not exhibit significantly different behaviours in their online design platform activities (Miro). Similarly, there was no difference in speech time during the online group meetings between students with different SRL competence. However, high SRL students had a lower frequency of speaking during the group meetings. This suggests that the speeches made by high SRL students may have been longer than those made by low SRL students. One potential interpretation of this result is that low SRL competence students tended to ask more questions with short sentences, while high SRL competence students used long sentences to share and explain their opinions.
In terms of the learning activities at the individual level (viewing lectures and writing individual reflections), no statistical difference was found between the SRL groups. This indicates that they might have had similar levels of engagement in these two activities when the data is accumulated over the whole course. However, the time series analysis showed that students with different SRL competence seemed to behave differently in individual reflective writing. For instance, students with high and medium levels of SRL competence tended to edit their individual reflections immediately after they received the mid-term feedback, whereas low SRL students reacted to the feedback two weeks later, if at all. Furthermore, students with high and medium SRL competence levels were more active than the students with low SRL competence in the week before the submission deadline.
Our second research question investigated whether students with the same SRL competence exhibit the same engagement behaviours in learning activities from different social planes (individual, group, and cohort). The results show that high SRL students were more active in cohort-level and individual-level activities but less active in group-level activities. Although high SRL students did not create a significantly larger number of posts on Moodle or Ligilo, they viewed a significantly larger number of posts on Moodle. Similarly, at the individual level, the high SRL students exhibited a higher number of engagement behaviours than the low SRL students. Time series analysis illustrated that students with high SRL were more active in some specific weeks, even though no significant difference was found in the analysis of the accumulated data. However, compared with individual-level and cohort-level activities, students with high SRL competence exhibited different engagement behaviours in group-level activities. As mentioned above, students with high SRL competence tended to have a lower frequency of speaking in remote group meetings. This contrasts with the results of Cho, Kim and Choi [36], who found that students with high SRL competence tended to perceive a greater sense of community of inquiry (CoI), i.e., beliefs in knowledge acquisition through social inquiry within a community. From their findings, it could be inferred that high regulators would be expected to participate more in group meetings. However, in their study, the authors only considered students' self-perception of CoI without observing their actual behaviours, unlike in our study. Moreover, these results also highlight the importance of considering contextual specificity when studying students' SRL behaviours from trace data. In different learning activities and different social planes, high SRL competence students might exhibit different engagement behaviours.
Looking at the learning activities we investigated in this study, the strategic importance of assessment also becomes clear. High SRL students tended to be more active in viewing Moodle discussions and in individual reflective writing activities. Students often used the Moodle forums to share their individual reflections or ask questions about final submissions. It is important to note that both of these learning activities are strongly tied to the course's summative assessment and student grades. On the other hand, in Ligilo discussions, which were strongly encouraged but were not part of the summative evaluation, different SRL groups did not exhibit different engagement behaviours. Similarly, high SRL competence students appeared to be less active in remote group meetings, for which feedback was provided to promote equality in the meeting contributions but there was no summative evaluation. These results may indicate that high SRL competence students tend to engage more in the activities that are strategic for their performance measures (i.e., are related to the final assessment). The relationship between assessment and self-regulation has long been affirmed in many studies, e.g., an entanglement between teachers' assessment criteria and students' learning goals, and the effects of formative/summative assessment on students' monitoring of progress [37].

Conclusion
This study has some implications for the future design and implementation of multimodal AIED tools for promoting students' SRL. First, it shows that data from different channels and modalities may bring in different interpretations of student engagement behaviours and their connection to students' SRL competence. Second, it underlines the value of bringing in temporal data analysis approaches, compared to accumulative ones, for investigating dynamic student behaviours. To be more specific, we showed that a time series analysis can reveal potential differences in students' individual reflective writing behaviours which were not found in the analysis of students' accumulated data. Third, and perhaps more importantly, the same group of students with similar SRL competence might exhibit significantly different engagement behaviours in different learning activities and different social planes, depending on learning design requirements such as the assessment and expected learning outcomes. If we are to design AI systems that automatically detect, monitor, and support students with different levels of SRL competence, the modality of the data fed into the system, the specific data analysis techniques used, and the contextual particularities of the learning design should all be taken into account, since they can have significant implications for the decisions made by the systems and their interpretation by humans.

Fig. 1. The average number of edited strings per day between students with high (blue line), medium (orange line) and low (green line) SRL competence across 15 weeks

Table 1. Extracted behavioural features and description according to the task levels and tools.

Table 2. Comparison tests of cohort-level behavioural features

The first columns of Table 3 show the mean value and standard deviation for each behavioural feature at the group level. An ANOVA on the value for Zoom_SpeechFrequency produced a significant difference, F(2,39) = 3.452, p < 0.05, partial η² = 0.15. Follow-up post-hoc tests (with Bonferroni correction) showed that for the Zoom_SpeechFrequency value, the high SRL group was significantly lower than the low SRL group, p = 0.046. This means that students with low SRL competence exhibited a higher frequency of speech in online group meetings than students with high SRL competence. Apart from this, no statistical differences were observed in the Zoom_SpeechTime and Miro_Engagement behaviours in students' group-level interactions.
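The reported effect sizes can be cross-checked against the reported F statistics, since for a one-way design partial η² equals F·df_effect / (F·df_effect + df_error). The short sketch below applies this identity to the two significant results reported in the paper; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the reported tests, both with F(2, 39).
moodle_view = partial_eta_squared(5.237, 2, 39)   # Moodle_DiscussionView
zoom_freq = partial_eta_squared(3.452, 2, 39)     # Zoom_SpeechFrequency
```

Rounded, these give 0.212 and 0.15 respectively, which agrees with the effect sizes stated in the text.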

Table 3. Comparison tests of group-level behavioural features

Individual Level. At the individual level, we investigated two learning activities: lecture viewing and individual reflective writing.

Table 4. Comparison tests of individual-level behavioural features