1) The three categories of student data
- Identity data
Name, email, student ID, class, school, parent contact details.
- Learning data
Progress, quiz scores, assignments, participation, attendance, timestamps.
- Behavioral/telemetry data
Clicks, time-on-task, scroll depth, device info, IP address, session recordings.
2) Data minimization: the most underrated strategy
Collect only the data you need to deliver learning outcomes.
Ask:
- Do we need a date of birth, or can we store an age range?
- Do we need full addresses, or just country/state?
- Do we need session recordings, or aggregated events?
Minimization reduces:
- Breach impact
- Compliance burden
- Legal risk
- Internal access complexity
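The questions above can be turned into a small transformation step at the point of collection. The following is a minimal sketch, not a definitive implementation: the field names (`student_id`, `date_of_birth`, `country`, `state`), the `MinimalProfile` type, and the age bands are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


def to_age_range(date_of_birth: date, today: date) -> str:
    """Map a date of birth to a coarse age band (band edges are illustrative)."""
    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    age = today.year - date_of_birth.year - (0 if had_birthday else 1)
    if age < 13:
        return "under-13"
    if age < 16:
        return "13-15"
    if age < 18:
        return "16-17"
    return "18+"


@dataclass(frozen=True)
class MinimalProfile:
    """Hypothetical stored profile: no date of birth, no street address."""
    student_id: str
    age_range: str
    country: str
    state: str


def minimize(raw: dict, today: date) -> MinimalProfile:
    """Keep only the fields needed; everything else in `raw` is dropped."""
    return MinimalProfile(
        student_id=raw["student_id"],
        age_range=to_age_range(raw["date_of_birth"], today),
        country=raw["country"],
        state=raw["state"],
    )
```

Because the exact date of birth and the street address are never written to storage, they cannot leak in a breach or an export.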
3) What you should always implement (baseline controls)
Role-based access control (RBAC)
- A teacher should not see administrative billing.
- A school admin should not see other schools' data.
- A support agent should have scoped, audited access.
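One way to express the three rules above is a single access check that combines a role's permissions with a school scope. This is a sketch under stated assumptions: the role names, permission strings, and the `support_grants` field are hypothetical, and a real system would typically load them from a policy store rather than hard-code them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical role-to-permission map for illustration only.
ROLE_PERMISSIONS = {
    "teacher": {"view_student_records"},
    "school_admin": {"view_student_records", "view_billing"},
    "support_agent": {"view_student_records"},
}


@dataclass(frozen=True)
class User:
    user_id: str
    role: str
    school_id: Optional[str] = None            # None for staff not tied to one school
    support_grants: frozenset = frozenset()    # schools a support agent was granted


def can_access(user: User, permission: str, resource_school_id: str) -> bool:
    """Allow only if the role holds the permission AND the school is in scope."""
    if permission not in ROLE_PERMISSIONS.get(user.role, set()):
        return False
    if user.role == "support_agent":
        # Support access is scoped to explicitly granted schools.
        return resource_school_id in user.support_grants
    return user.school_id == resource_school_id
```

Unknown roles get an empty permission set, so the check denies by default.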
Audit logs
Log:
- who accessed student records
- what changed
- when exports happened
- permission updates
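The four items above map naturally onto a structured log entry with an actor, an action, a target, and a timestamp. A minimal sketch follows; the action names and the `audit_event` helper are assumptions, and where the resulting line is written (ideally an append-only store) is left to the caller.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Hypothetical action names covering the four cases listed above.
AUDITED_ACTIONS = {
    "record.read",        # who accessed student records
    "record.update",      # what changed
    "export.create",      # when exports happened
    "permission.update",  # permission updates
}


def audit_event(
    actor_id: str,
    action: str,
    target: str,
    details: Optional[dict] = None,
    now: Optional[datetime] = None,
) -> str:
    """Return one JSON line describing an audited action."""
    if action not in AUDITED_ACTIONS:
        raise ValueError(f"unknown audit action: {action!r}")
    entry = {
        "timestamp": (now or datetime.now(timezone.utc)).isoformat(),
        "actor_id": actor_id,
        "action": action,
        "target": target,
        # Record which fields changed, not the personal data itself.
        "details": details or {},
    }
    return json.dumps(entry, sort_keys=True)
```

For example, `audit_event("teacher-42", "record.update", "student:1001", {"changed_fields": ["grade"]})` records what changed without copying the student's data into the log.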
Encryption
- Encrypt data at rest
- Encrypt data in transit
- Keep secrets out of code and logs (no plaintext tokens); load them from environment variables or a secrets manager
Retention rules
Define how long you keep:
- inactive user accounts
- old assignments
- analytics events
- logs and exports
Retention should not be "forever by default."
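Retention rules are easier to enforce when they live in one table that a cleanup job consults. The sketch below assumes hypothetical data kinds and retention periods; the numbers are placeholders, not recommendations, and should come from your own legal and contractual requirements.

```python
from datetime import datetime, timedelta

# Placeholder retention periods per data kind (illustrative values only).
RETENTION = {
    "inactive_account": timedelta(days=730),
    "assignment": timedelta(days=1095),
    "analytics_event": timedelta(days=180),
    "log": timedelta(days=90),
    "export": timedelta(days=30),
}


def is_expired(kind: str, last_touched: datetime, now: datetime) -> bool:
    """True if an item of this kind has outlived its retention period."""
    try:
        limit = RETENTION[kind]
    except KeyError:
        # Refuse to guess: a kind with no rule would otherwise be kept forever.
        raise ValueError(f"no retention rule for {kind!r}") from None
    return now - last_touched > limit
```

Raising on an unknown kind forces every new data type to get an explicit retention rule instead of silently defaulting to "forever."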
4) AI features increase privacy responsibility
If you use AI for:
- feedback generation
- personalization
- tutoring
- recommendations
…you need clear policies about:
- what data is used for prompts
- whether data is stored by third parties
- how outputs are moderated
- how you prevent leakage of personal information
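For the first and last points, one common approach is to strip direct identifiers from text before it is placed in a prompt. The following is a rough sketch only: `build_prompt`, the placeholder tokens, and the prompt wording are assumptions, and pattern-based redaction will miss identifiers it does not know about, so it complements a policy rather than replacing one.

```python
import re

# Simple email pattern; deliberately loose, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def build_prompt(submission_text: str, known_names: list) -> str:
    """Redact emails and known student names, then wrap the text in a prompt."""
    text = EMAIL_RE.sub("[EMAIL]", submission_text)
    for name in known_names:
        # Names come from the student's own record, so they can be matched exactly.
        text = re.sub(re.escape(name), "[STUDENT]", text, flags=re.IGNORECASE)
    return "Give constructive feedback on this student answer:\n" + text
```

The redacted prompt, not the raw submission, is what would be sent to any third-party model.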
5) Implementation reality: privacy needs engineering discipline
6) What to never do (common mistakes)
- Collecting sensitive data "just in case"
- Leaving teacher/admin accounts without MFA
- Storing exports on shared drives with no access controls
- Using production data in dev/testing environments
- Logging personal data in error logs
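The last mistake can be partly guarded against in code by scrubbing log records before they are written. Below is a minimal sketch using the standard `logging` module; the `RedactEmailFilter` class, the logger name, and the `demo` helper are hypothetical, and the filter only catches email addresses, so it is a safety net rather than a substitute for not logging personal data in the first place.

```python
import io
import logging
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Replace email addresses in the formatted message before it is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format msg % args first so addresses passed as arguments are caught too.
        record.msg = EMAIL_RE.sub("[REDACTED_EMAIL]", record.getMessage())
        record.args = ()
        return True  # keep the record, just with the scrubbed message


def demo() -> str:
    """Log one error through the filter and return what was written."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.addFilter(RedactEmailFilter())
    logger = logging.getLogger("privacy_demo")
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.error("Sync failed for %s", "ada@example.org")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

Attaching the filter to the handler means every record that reaches that output is scrubbed, regardless of which logger produced it.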






