Learning Management Systems (LMS), or learning platforms such as Moodle and Blackboard, have become key tools in education. LMS generate a huge volume of student and teacher data on a daily basis.
Transforming this data into relevant information for decision-making is a major challenge due to the complexity of the data structure and the difficulty of summarizing the learning process based on it.
Ignacio Alvarez
Mauro Loprete
Oscar Montañés
Jimena Padín
Bruno Tancredi
Identify the key drivers in the use of LMS, understand the sources of variability, and define measures for student engagement.
Predict student performance in English tests based on their use of the LMS.
Year | N | Students | Teachers |
---|---|---|---|
2018 | 1.912801 | 98.054 | 3801 |
2019 | 2.652856 | 103.168 | 4474 |
2020 | 7.403004 | 109.019 | 5477 |
2021 | 8.607445 | 119.065 | 5299 |
Transforming these data into relevant information for analysis and decision-making requires pre-processing the data, defining relevant summaries, and building statistical visualizations.
Should be scalable
Should be modularized
Should be flexible enough to maintain, modify and test
Should be reproducible
Should be prepared to be used by many users at the same time
The app was developed with rhino
and is connected to a PostgreSQL database.
Goal: Predict English test performance based on LMS student usage.
Different problems:
Predict the final points in the English test for each student.
Predict the average points in the English test at the class level.
Predict whether a 6th grade student will reach the A2.1 level, as expected.
Workflow based on tidymodels
Data sources:
Focus:
Monthly attempts by Grade and Socioeconomic context
Monthly right answers by Socioeconomic context and English level
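A summary such as monthly attempts by grade and socioeconomic context amounts to a grouped aggregation over the activity log. The following is a minimal sketch in plain Python, not the project's own pipeline; the log layout, the field order, and the use of quintiles to encode socioeconomic context are assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical activity log: (month, grade, socioeconomic quintile, attempts).
# These rows are made up for illustration and are not project data.
log = [
    ("2021-03", 4, 1, 5),
    ("2021-03", 4, 1, 3),
    ("2021-03", 6, 5, 7),
    ("2021-04", 4, 1, 2),
]

def monthly_attempts(rows):
    """Total attempts per (month, grade, socioeconomic context) group."""
    totals = defaultdict(int)
    for month, grade, context, attempts in rows:
        totals[(month, grade, context)] += attempts
    return dict(totals)

summary = monthly_attempts(log)
for key in sorted(summary):
    print(key, summary[key])
```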
Children in 6th grade are expected to reach the A2.1 level.
Response: \[Y_i =\begin{cases} 1 & \text{ reaches A2 level or higher} \\ 0 & \text{ otherwise } \end{cases}\]
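The binary response can be derived from each student's test level by comparing it with the target level. The sketch below assumes a hypothetical ordered list of level labels and takes A2.1 as the threshold; the actual labels and cut-off used in the test are not specified here.

```python
# Hypothetical ordering of level labels, from lowest to highest.
# The labels actually used by the English test may differ.
LEVELS = ["Pre-A1", "A1.1", "A1.2", "A2.1", "A2.2", "B1"]

def reaches_target(level, target="A2.1"):
    """Return 1 if `level` is at or above `target`, and 0 otherwise."""
    return int(LEVELS.index(level) >= LEVELS.index(target))

# Made-up students, for illustration only.
students = {"s1": "A1.2", "s2": "A2.1", "s3": "B1"}
y = {sid: reaches_target(lvl) for sid, lvl in students.items()}
print(y)
```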
Use accumulated LB work up to July
Fit several statistical learning methods
Include school random effect, BART
A single tree is denoted as:
\(g(x | T, M ) = \sum_k \mu_k I(x \in R_k)\)
It has two basic parameters: the tree structure \(T\) and the set of leaf values \(M = (\mu_{1}, \ldots, \mu_{b})\).
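To make the notation concrete, a single tree can be evaluated as a sum of leaf values weighted by region indicators. This is a minimal sketch: the predicates stand in for the regions \(R_k\) induced by \(T\), and the threshold and leaf values are hypothetical.

```python
def tree_predict(x, regions, mu):
    """Evaluate g(x | T, M) = sum_k mu_k * I(x in R_k).

    `regions` stands in for the tree structure T: one predicate per leaf,
    together partitioning the input space.
    `mu` stands in for M: one value per leaf.
    """
    return sum(m * int(in_region(x)) for in_region, m in zip(regions, mu))

# Hypothetical one-split tree on a single predictor.
regions = [lambda x: x < 10, lambda x: x >= 10]
mu = [-0.5, 0.8]

print(tree_predict(3, regions, mu))
print(tree_predict(25, regions, mu))
```

Because the regions partition the input space, exactly one indicator equals 1 for any \(x\), so the sum returns the value of the leaf that \(x\) falls into.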
BART: sums of trees model
\[\begin{array}{cl} Y_{i} &= f(X_i) + \epsilon_i \\ &= \sum_j g_j(X_i | T_j, M_j) + \epsilon_i \\ & \epsilon_i \sim N(0, \sigma^2) \end{array}\]
It is possible to add random effects, such as a school-level intercept \(\alpha_g\): \(f(X_{ig}) + \alpha_g\).
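The mean function of the sum-of-trees model, with an optional group intercept, can be sketched as follows. This only evaluates \(f(X_{ig}) + \alpha_g\) for fixed trees. It does not fit anything: in BART the trees, leaf values, and random effects are estimated from data, and the noise term \(\epsilon_i\) is left out. The stumps, thresholds, and school labels are hypothetical.

```python
def stump(threshold, left, right):
    """A one-split tree: returns `left` if x < threshold, else `right`."""
    return lambda x: left if x < threshold else right

def sum_of_trees(x, trees, alpha=0.0):
    """f(x) + alpha_g: sum of the tree outputs plus a group intercept."""
    return sum(g(x) for g in trees) + alpha

# Three hypothetical trees g_j and two hypothetical school intercepts alpha_g.
trees = [stump(10, -0.5, 0.5), stump(20, -0.25, 0.25), stump(30, 0.0, 1.0)]
alpha = {"school_a": 0.5, "school_b": -0.5}

print(sum_of_trees(15, trees))                     # no random effect
print(sum_of_trees(15, trees, alpha["school_a"]))  # school_a intercept added
```

Two students with the same predictors but in different schools get predictions that differ only by the difference between their school intercepts.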
There are schools with positive effects in most quintile groups
LMS data can be used to improve education.
Transforming data into relevant information requires appropriate computational tools and statistical methods.
We present summary statistical information in a Shiny app based on rhino to monitor and evaluate LMS usage at different levels.
We built a BART model to predict academic performance at the end of the year.
Model results can be used as an early warning tool to identify students at risk and centers in need of intervention, as well as centers to learn from.
Some of the relevant tools used in this project for data wrangling, visualization, and predictive modeling were data.table, shiny, rhino, plotly, ggplot2, tidymodels, dbarts, and databases (PostgreSQL).