Student Mobility Management Platform
Overview
A microservices-based platform serving four user personas — students, guardians, faculty, and bus drivers — across a US school district. Integrates with two external school systems (TXEIS, Transfinder) via automated ETL, provides real-time bus tracking with anomaly detection for guardians, and runs a digital behavioral incentive program for faculty and students.
| Field | Details |
|---|---|
| Domain | EdTech: K-12 Student Transportation & Behavioral Management |
| Role | Senior Software Engineer / Technical Lead |
| Team | 3 Backend Engineers, 2 Mobile Developers, 1 DevOps |
| Stack | Node.js, Express, MongoDB, Redis, RabbitMQ, Socket.io, React, React Native |
| Scale | 5 microservices, 25+ data models, 60+ API endpoints, 5 external integrations |
Key Outcomes
- 5 microservices orchestrated behind a reverse proxy with centralized logging
- 25+ data models with automated ETL from two external school systems
- Real-time event tracking for student bus boarding/alighting with anomaly detection
- Role-based behavioral system — faculty award points, students redeem prizes
- Push notification pipeline via RabbitMQ to guardian mobile devices
System Architecture
High-Level Overview
Technology Stack
| Layer | Choice | Rationale |
|---|---|---|
| Runtime | Node.js | Non-blocking I/O for real-time transport tracking; shared language across all services |
| Framework | Express.js | Lightweight, middleware-driven — suited for the custom IAM pattern |
| Database | MongoDB | Schema flexibility for varied entity shapes (students, trips, mobility logs); GridFS for document storage |
| Real-time | Socket.io | Bidirectional communication for live bus tracking and instant notifications |
| Message Queue | RabbitMQ | Decoupled notification delivery — school events and transport alerts produced independently |
| Sessions/Cache | Redis | Shared session store enabling stateful auth across horizontally scaled instances |
| Observability | ELK Stack | Centralized log aggregation with structured nginx log parsing via Logstash grok patterns |
| Proxy | Nginx | Path-based routing to service backends; Filebeat sidecar for log shipping |
| Containers | Docker | Per-service isolation with Docker Compose for local development |
| CI/CD | GitLab CI | Automated build → test → deploy pipeline |
Service Architecture
I decomposed the system into domain-bounded services, each owning a specific business capability:
Service Map
School API
The core business service managing the behavioral incentive program and academic records. Faculty award rewards (positive points) and deficiencies (negative points) to students. Students accumulate points and redeem them for prizes through faculty. Guardians receive push notifications for every transaction.
Key design decision: I implemented a custom IAM (Identity and Access Management) framework rather than using off-the-shelf RBAC libraries. Each route declares a permission string (module:resource:action), and a role-to-permission ACL resolves authorization at the middleware layer. This gave us fine-grained control. For example, a guardian can view their own children’s history but not that of other students, while faculty can give rewards or deficiencies only to students in their assigned classes.
Transport API
Handles real-time bus operations. When a driver scans a student’s QR code at boarding, the system:
- Validates the student belongs to this trip, bus, and stop
- Logs a mobility event (success or anomaly type)
- Constructs a contextual notification message (e.g., “John was picked up from Oak Street on Trip AM-12”)
- Publishes the event to RabbitMQ for guardian notification delivery
Anomaly detection catches three scenarios: wrong trip, wrong bus, and wrong stop — each generating a distinct alert to guardians.
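The boarding checks above can be sketched as a pure function. This is a minimal sketch under stated assumptions, not the platform's code: the `assignment` and `scan` shapes, the check order (trip, then bus, then stop), and the helper names `classifyBoarding` and `buildMobilityEvent` are hypothetical. Only the event type names come from this document.

```javascript
"use strict";

// Hypothetical shapes: `assignment` is what the student is scheduled for,
// `scan` is what the driver's device reports when the QR code is scanned.
function classifyBoarding(assignment, scan) {
  // Assumed precedence: a trip mismatch is reported first, then bus, then stop.
  if (assignment.tripId !== scan.tripId) return "wrong_trip";
  if (assignment.busId !== scan.busId) return "wrong_bus";
  if (assignment.stopId !== scan.stopId) return "wrong_stop";
  return "onboarding_successful";
}

// Builds the mobility event that would be logged and then published.
function buildMobilityEvent(student, assignment, scan, now = new Date()) {
  return {
    studentId: student.id,
    type: classifyBoarding(assignment, scan),
    tripId: scan.tripId,
    busId: scan.busId,
    stopId: scan.stopId,
    occurredAt: now.toISOString(),
  };
}

const assignment = { tripId: "AM-12", busId: "bus-7", stopId: "oak-street" };
const student = { id: "s-1", firstName: "John" };

const okEvent = buildMobilityEvent(student, assignment, { ...assignment });
const wrongBusEvent = buildMobilityEvent(student, assignment, {
  ...assignment,
  busId: "bus-9",
});
console.log(okEvent.type, wrongBusEvent.type);
```

Keeping the classification free of I/O means the anomaly rules can be unit-tested without a database or a broker.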
ETL Service
An event-driven data pipeline that synchronizes the school district’s external systems with our platform. It downloads files via FTP, converts XLS/CSV to JSON, deduplicates and normalizes records, then upserts to MongoDB — all orchestrated through an EventEmitter pattern with dependency-ordered processing.
Insertion order is dependency-driven: Courses → Students → Buses → Drivers → Faculty → Guardians → Report Cards → Trips → TripStops → Transport Student Data → Attendance. Each insertion emits a completion event that triggers the next dependent stage.
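A minimal sketch of that completion-event chain, using Node's built-in `EventEmitter`. The stage order is taken from the paragraph above; the event names (`start`, `<stage>:done`), the `insertors` map, and `runPipeline` are hypothetical, and the stub insertors only record the call order instead of touching MongoDB.

```javascript
"use strict";
const { EventEmitter } = require("events");

// Stage order from the text above; identifiers are illustrative.
const STAGES = [
  "courses", "students", "buses", "drivers", "faculty", "guardians",
  "reportCards", "trips", "tripStops", "transportStudentData", "attendance",
];

function runPipeline(insertors) {
  const bus = new EventEmitter();
  const completed = [];

  return new Promise((resolve, reject) => {
    STAGES.forEach((stage, i) => {
      // A stage starts only when the previous stage announces completion.
      const trigger = i === 0 ? "start" : `${STAGES[i - 1]}:done`;
      bus.once(trigger, async () => {
        try {
          await insertors[stage]();
          completed.push(stage);
          bus.emit(`${stage}:done`);
        } catch (err) {
          reject(err); // the chain stops here; later stages never fire
        }
      });
    });
    bus.once(`${STAGES[STAGES.length - 1]}:done`, () => resolve(completed));
    bus.emit("start");
  });
}

// Stub insertors that only record the order in which they were called.
const calls = [];
const stubs = Object.fromEntries(
  STAGES.map((stage) => [stage, async () => { calls.push(stage); }])
);

const pipelineDone = runPipeline(stubs);
pipelineDone.then((done) => console.log(done.join(" -> ")));
```

Because each stage listens only for its predecessor's event, a failure rejects the promise and leaves all later stages untouched.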
Data Model
Entity Relationship Overview
The system manages 25+ entities across four bounded contexts.
Design pattern: The base User entity uses a role-polymorphic pattern — a single authentication model linked 1:1 to role-specific profile entities (Student, Faculty, Guardian, Driver, Administrator). This enabled a unified auth flow while allowing each role’s profile to evolve independently.
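The role-polymorphic lookup can be illustrated with in-memory stand-ins for the collections. This is only a sketch: the field names, the sample records, and `loadPrincipal` are hypothetical, and a real implementation would query the User and profile models instead of `Map` objects.

```javascript
"use strict";

// In-memory stand-ins for the User collection and the per-role profile
// collections. All field names and values are hypothetical.
const users = new Map([
  ["u1", { id: "u1", email: "pat@example.org", role: "guardian" }],
  ["u2", { id: "u2", email: "sam@example.org", role: "driver" }],
]);

const profilesByRole = {
  guardian: new Map([["u1", { userId: "u1", childIds: ["s1", "s2"] }]]),
  driver: new Map([["u2", { userId: "u2", employeeId: "E-1042" }]]),
};

// One auth record plus one role-specific profile (1:1), resolved by role.
function loadPrincipal(userId) {
  const user = users.get(userId);
  if (!user) return null;
  const profiles = profilesByRole[user.role];
  const profile = profiles ? profiles.get(userId) || null : null;
  return { user, profile };
}

const principal = loadPrincipal("u1");
console.log(principal.user.role, principal.profile.childIds);
```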
Authentication & Authorization
Custom IAM Framework
Rather than using a generic RBAC library, I built a declarative IAM system where routes define their required permissions and authorization is resolved at the middleware layer:
Route declaration pattern — each module declares its routes as a data structure with permission strings:
```javascript
// Declarative route + permission binding
{
  prefix: "/auth",
  routes: [{
    path: "/signin",
    methods: {
      post: {
        middlewares: [users.signin],
        iam: "users:auth:signin" // permission string (module:resource:action)
      }
    }
  }]
}
```

Role hierarchy. Permissions cascade and compose:
| Role | Inherits | Unique Permissions |
|---|---|---|
| guest | none | signup, signin, forgot password |
| user | none | signout, profile, files |
| student | user | view own grades, attendance, prizes, QR card |
| faculty | user | give rewards/deficiencies, redeem prizes, view class students |
| guardian | user | view children, children’s history and prizes |
| driver | user | trip selection, student boarding, discipline recommendations |
| admin | user | full CRUD on all entities |
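One way the role-to-permission resolution could work is sketched below. It is an assumption-laden illustration rather than the framework itself: the `ACL` structure, the specific permission strings (apart from `users:auth:signin`), and the names `permissionsFor` and `requirePermission` are hypothetical. The middleware follows the standard Express `(req, res, next)` signature and assumes the authenticated user is available as `req.user`.

```javascript
"use strict";

// Hypothetical ACL: role -> { inherits, permissions }. Strings follow the
// module:resource:action convention; most are invented for illustration.
const ACL = {
  guest:    { inherits: null,   permissions: ["users:auth:signin", "users:auth:signup"] },
  user:     { inherits: null,   permissions: ["users:auth:signout", "users:profile:read"] },
  faculty:  { inherits: "user", permissions: ["school:rewards:create"] },
  guardian: { inherits: "user", permissions: ["school:children:read"] },
};

// Collects a role's own permissions plus everything it inherits.
function permissionsFor(role, acl = ACL) {
  const granted = new Set();
  const seen = new Set();
  for (let r = role; r && acl[r] && !seen.has(r); r = acl[r].inherits) {
    seen.add(r); // guards against accidental inheritance cycles
    acl[r].permissions.forEach((p) => granted.add(p));
  }
  return granted;
}

// Express-style middleware factory: the route declares `iam`, and the
// middleware resolves it against the caller's role. Callers without a
// session are treated as "guest".
function requirePermission(iam) {
  return function iamMiddleware(req, res, next) {
    const role = (req.user && req.user.role) || "guest";
    if (permissionsFor(role).has(iam)) return next();
    res.status(403).json({ error: "forbidden", required: iam });
  };
}
```

A route loader could then wrap each declared `iam` string with `requirePermission(iam)` before the route's own middlewares, which keeps authorization out of the handlers.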
Multi-Strategy Authentication
The School API and Auth Gateway use standard email + password authentication (Passport.js local strategy with PBKDF2 hashing). The Transport API uses a distinct strategy — drivers authenticate with their employee ID badge, which maps through the Driver entity to the underlying User record. This was a deliberate UX decision to minimize friction for drivers who operate handheld devices in a moving vehicle.
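For the password side, a PBKDF2 hash-and-verify round trip with Node's built-in `crypto` module might look like the sketch below. The source states only that PBKDF2 was used, so the iteration count, key length, digest, and the `iterations:salt:hash` storage format are assumptions, as are the function names.

```javascript
"use strict";
const crypto = require("crypto");

// Assumed parameters; the document does not specify them.
const ITERATIONS = 210000;
const KEY_LENGTH = 32;
const DIGEST = "sha512";

// Returns "iterations:saltHex:hashHex" so the parameters travel with the hash.
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, DIGEST);
  return `${ITERATIONS}:${salt.toString("hex")}:${hash.toString("hex")}`;
}

function verifyPassword(password, stored) {
  const [iterations, saltHex, hashHex] = stored.split(":");
  const expected = Buffer.from(hashHex, "hex");
  const actual = crypto.pbkdf2Sync(
    password,
    Buffer.from(saltHex, "hex"),
    Number(iterations),
    expected.length,
    DIGEST
  );
  // Constant-time comparison avoids leaking how many bytes matched.
  return crypto.timingSafeEqual(actual, expected);
}

const stored = hashPassword("correct horse battery staple");
console.log(verifyPassword("correct horse battery staple", stored));
```

Storing the iteration count next to the salt lets the cost be raised later without invalidating existing hashes.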
Notification Pipeline
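The notification system uses a producer-consumer pattern with RabbitMQ as the message broker. The School API and Transport API produce events, and a consumer delivers them to guardian devices as push notifications.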
Device targeting: When a notification is produced, the system resolves the student’s guardians, finds their registered devices (via the Device model), and includes the device IDs in the message payload. The consumer uses these to target the correct push notification recipients.
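The device-targeting step can be sketched with in-memory data. The record shapes, the sample IDs, and `buildNotification` are hypothetical; in the platform these lookups would be queries against the Guardian and Device models.

```javascript
"use strict";

// Hypothetical in-memory data standing in for the Guardian and Device models.
const guardians = [
  { id: "g1", childIds: ["s1"] },
  { id: "g2", childIds: ["s1", "s2"] },
  { id: "g3", childIds: ["s3"] },
];
const devices = [
  { id: "d1", guardianId: "g1" },
  { id: "d2", guardianId: "g2" },
  { id: "d3", guardianId: "g2" },
  { id: "d4", guardianId: "g3" },
];

// Producer side: resolve student -> guardians -> devices, then embed the
// device IDs in the payload so the consumer needs no further lookups.
function buildNotification(studentId, event, text) {
  const guardianIds = new Set(
    guardians.filter((g) => g.childIds.includes(studentId)).map((g) => g.id)
  );
  const deviceIds = devices
    .filter((d) => guardianIds.has(d.guardianId))
    .map((d) => d.id);
  return { event, studentId, text, deviceIds };
}

const payload = buildNotification("s1", "reward", "Your child received a reward");
console.log(JSON.stringify(payload));
```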
Message types:
| Source | Event | Guardian Receives |
|---|---|---|
| School API | reward | “Your child received a reward from [Faculty]” |
| School API | deficiency | “Your child received a deficiency from [Faculty]” |
| School API | redeem | “Your child redeemed [Prize] through [Faculty]” |
| Transport API | onboarding_successful | “[Child] was picked up from [Stop] on [Trip]” |
| Transport API | dropoff_successful | “[Child] has been dropped off at [Stop]” |
| Transport API | wrong_trip | “[Child] took the wrong trip: [Trip Name]” |
| Transport API | wrong_bus | “[Child] got on the wrong bus: [Vehicle]” |
| Transport API | wrong_stop | “[Child] was picked up from the wrong stop: [Street]” |
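The table maps naturally onto a template lookup. In the sketch below the message texts are transcribed from the table, while the `{placeholder}` syntax, the field names, and `renderMessage` are assumptions made for illustration.

```javascript
"use strict";

// Templates transcribed from the table above. The placeholder keys
// (child, faculty, ...) are hypothetical names for the payload fields.
const TEMPLATES = {
  reward: "Your child received a reward from {faculty}",
  deficiency: "Your child received a deficiency from {faculty}",
  redeem: "Your child redeemed {prize} through {faculty}",
  onboarding_successful: "{child} was picked up from {stop} on {trip}",
  dropoff_successful: "{child} has been dropped off at {stop}",
  wrong_trip: "{child} took the wrong trip: {trip}",
  wrong_bus: "{child} got on the wrong bus: {vehicle}",
  wrong_stop: "{child} was picked up from the wrong stop: {street}",
};

// Fails loudly on unknown events or missing fields rather than sending a
// half-filled message to a guardian.
function renderMessage(event, fields) {
  const template = TEMPLATES[event];
  if (!template) throw new Error(`Unknown event type: ${event}`);
  return template.replace(/\{(\w+)\}/g, (_, key) => {
    if (!(key in fields)) throw new Error(`Missing field "${key}" for ${event}`);
    return String(fields[key]);
  });
}

const text = renderMessage("onboarding_successful", {
  child: "John",
  stop: "Oak Street",
  trip: "Trip AM-12",
});
console.log(text);
```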
Observability
Centralized Logging
Logstash parses structured Nginx access logs, extracting the client IP, HTTP method, URL path, response status, response time, bytes transferred, and request body. This enabled us to build Kibana dashboards for API latency monitoring, error rate tracking, and usage pattern analysis.
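The actual parsing was done with Logstash grok patterns, which are not reproduced in this document. As a rough illustration of the field extraction only, the sketch below applies a JavaScript regular expression to an assumed log format (`ip "METHOD path protocol" status bytes request_time`). The format, the sample line, and `parseAccessLine` are all hypothetical, and the request body is left out.

```javascript
"use strict";

// Assumed line format (not taken from the real Nginx config):
//   <client ip> "<method> <path> <protocol>" <status> <bytes> <request time>
const ACCESS_LINE = /^(\S+) "(\S+) (\S+) [^"]*" (\d{3}) (\d+) ([\d.]+)$/;

function parseAccessLine(line) {
  const match = ACCESS_LINE.exec(line);
  if (!match) return null; // unparseable lines are skipped, not guessed at
  const [, clientIp, method, path, status, bytes, responseTime] = match;
  return {
    clientIp,
    method,
    path,
    status: Number(status),
    bytes: Number(bytes),
    responseTimeSec: Number(responseTime),
  };
}

const entry = parseAccessLine(
  '203.0.113.7 "POST /transport/boarding HTTP/1.1" 200 512 0.043'
);
console.log(entry);
```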
Scaling Design
Identified Bottlenecks & Solutions
Target Production Architecture
Retrospective: What I Would Do Differently
Having taken this system through its full lifecycle, I would revisit the following decisions:
1. Shared Database Was a Mistake
What happened: All services read/write the same MongoDB instance with their own Mongoose model definitions. Over time, the User schema diverged across services — different field names (birth_date vs birthdate), different structures (full_name string vs name.first/name.last object), different type definitions (localID as String vs Number).
What I’d do instead: Database-per-service from the start, with a shared schema package (npm module) for the canonical model definitions. Cross-service data needs would be handled through an event bus (change streams or explicit domain events), not direct database access.
2. Build the Auth Service First, Not Last
What happened: The Auth Gateway (meta-api) was added late as a cross-service communication layer. By then, each service had already implemented its own authentication — including the Transport API’s unusual employeeID-as-password pattern.
What I’d do instead: Start with a centralized auth service issuing JWTs. All other services verify those tokens locally against the auth service’s signing key instead of looking up a shared session. This enables stateless horizontal scaling from the start and eliminates the session-stickiness problem.
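A sketch of what that could look like, assuming asymmetric (RS256) signing so that other services need only the public key. This hand-rolled version uses Node's built-in `crypto` module purely to show the mechanics; `issueToken`, `verifyToken`, the claim names, and the 15-minute lifetime are assumptions, and a production system would more likely rely on a maintained JWT library. The `base64url` encoding requires a reasonably recent Node release.

```javascript
"use strict";
const crypto = require("crypto");

const b64url = (input) => Buffer.from(input).toString("base64url");

// Auth service side: signs header.payload with its private key (RS256).
function issueToken(claims, privateKey, ttlSeconds = 900) {
  const now = Math.floor(Date.now() / 1000);
  const header = b64url(JSON.stringify({ alg: "RS256", typ: "JWT" }));
  const payload = b64url(
    JSON.stringify({ ...claims, iat: now, exp: now + ttlSeconds })
  );
  const signature = crypto
    .sign("sha256", Buffer.from(`${header}.${payload}`), privateKey)
    .toString("base64url");
  return `${header}.${payload}.${signature}`;
}

// Any other service: verifies locally with the public key, with no session
// lookup. The algorithm is fixed here and never read from the token header.
function verifyToken(token, publicKey) {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const valid = crypto.verify(
    "sha256",
    Buffer.from(`${header}.${payload}`),
    publicKey,
    Buffer.from(signature, "base64url")
  );
  if (!valid) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
  return claims.exp > Math.floor(Date.now() / 1000) ? claims : null;
}

const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
});
const token = issueToken({ sub: "u1", role: "driver" }, privateKey);
console.log(verifyToken(token, publicKey).role);
```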
3. The Custom IAM Was Valuable — But Needed a Shared Package
What happened: The declarative IAM pattern (route → permission → role ACL) worked extremely well for developer ergonomics and fine-grained access control. But the implementation was copy-pasted across services rather than extracted as a shared package.
What I’d do instead: Publish the IAM framework as an internal npm package from day one. Define the role/permission matrix in a central config consumed by all services.
4. Event-Driven ETL Was the Right Call
What worked: The EventEmitter-based parser with dependency-ordered processing was elegant and maintainable. Adding a new data source meant writing an insertor module and wiring up the event chain. The upsert pattern (findOneAndUpdate with upsert: true) made re-runs idempotent.
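The normalize, deduplicate, and upsert steps can be sketched as pure functions. Note the swap: instead of calling `findOneAndUpdate` per record as described above, this sketch shapes the records into `updateOne` operations with `upsert: true`, the form accepted by `bulkWrite()` in the MongoDB driver and Mongoose. The column names (`localID`, `first`, `last`) and the function names are hypothetical.

```javascript
"use strict";

// Normalizes one raw row. Casting localID to a trimmed string gives every
// record the same natural-key type.
function normalize(row) {
  return {
    localID: String(row.localID).trim(),
    firstName: String(row.first || "").trim(),
    lastName: String(row.last || "").trim(),
  };
}

// Keeps the last occurrence per natural key so a re-sent row wins.
function dedupe(records, key = "localID") {
  return [...new Map(records.map((r) => [r[key], r])).values()];
}

// Matching on the natural key with upsert: true means running the same
// file twice produces the same documents.
function toUpsertOps(records, key = "localID") {
  return records.map((r) => ({
    updateOne: { filter: { [key]: r[key] }, update: { $set: r }, upsert: true },
  }));
}

const rows = [
  { localID: " 1001 ", first: "Ana", last: "Lopez" },
  { localID: 1002, first: "Ben", last: "Kim" },
  { localID: "1001", first: "Ana", last: "Lopez-Diaz" },
];
const ops = toUpsertOps(dedupe(rows.map(normalize)));
console.log(JSON.stringify(ops, null, 2));
```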
What I’d improve: Add a proper job queue (Bull/BullMQ) with retry logic, progress tracking, and a dashboard. The current implementation has no visibility into failures beyond a log file.
5. RabbitMQ Was Over-Engineered for the Scale
What happened: We introduced RabbitMQ for the notification pipeline, but at the scale of a single school district, a simpler approach (direct HTTP calls to a notification microservice, or even in-process event handling) would have sufficed.
When it becomes the right choice: If the platform expands to multiple districts with thousands of concurrent bus trips, the message queue becomes essential for decoupling and backpressure handling. The architecture was forward-looking but added operational complexity prematurely.
6. Node 8 Should Have Been Flagged Earlier
What happened: We started on Node 8.11.1 and never prioritized the runtime upgrade. By the time we recognized it, the ecosystem had moved on significantly (async iterators, native ESM, improved diagnostics).
Lesson: Pin the runtime version in the project charter with a planned upgrade cycle. Node LTS versions have a defined end-of-life — treat it like any other dependency.
7. Missing Contract Testing Between Services
What happened: With services sharing a database and communicating via RabbitMQ, there were no formal contracts. A schema change in one service could silently break another.
What I’d do instead: Implement consumer-driven contract testing (Pact) and a message schema registry for the RabbitMQ payloads. Every service change would be validated against its consumers before deployment.
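Short of adopting Pact, even a small shared payload check would have caught silent breakage. The sketch below is a hand-rolled stand-in, not Pact and not a real schema registry: the `CONTRACTS` map, the fields listed for `wrong_bus`, and `contractViolations` are hypothetical.

```javascript
"use strict";

const isString = (v) => typeof v === "string" && v.length > 0;
const isStringArray = (v) => Array.isArray(v) && v.every(isString);

// Hypothetical contract registry: message type -> field -> validator.
const CONTRACTS = {
  wrong_bus: { studentId: isString, vehicle: isString, deviceIds: isStringArray },
};

// Returns a list of human-readable violations; an empty list means the
// payload satisfies the registered contract.
function contractViolations(type, payload) {
  const contract = CONTRACTS[type];
  if (!contract) return [`no contract registered for "${type}"`];
  return Object.entries(contract)
    .filter(([field, check]) => !check(payload[field]))
    .map(([field]) => `${type}.${field} is missing or has the wrong type`);
}

const good = { studentId: "s1", vehicle: "Bus 9", deviceIds: ["d1"] };
const bad = { studentId: "s1", vehicle: 9, deviceIds: ["d1"] };
console.log(contractViolations("wrong_bus", good));
console.log(contractViolations("wrong_bus", bad));
```

Running the same check in the producer's tests and the consumer's tests gives both sides one definition of each message to fail against.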
Impact & Metrics
| Metric | Value |
|---|---|
| Services designed & built | 5 microservices + infrastructure |
| Data models | 25+ MongoDB collections |
| API endpoints | 60+ REST endpoints |
| User roles | 6 (student, faculty, guardian, driver, admin, user) |
| External integrations | 5 (FTP/TXEIS, Transfinder, Twilio, OneSignal, SMTP) |
| Data pipeline | 11-stage ETL with dependency resolution |
| Real-time features | Socket.io for live bus tracking, instant notifications |
| Notification types | 8 distinct event types across 2 domains |
Technical Skills Demonstrated
- System Design — Decomposed a monolithic problem into bounded-context microservices
- API Design — RESTful APIs with declarative IAM authorization framework
- Data Modeling — Polymorphic user model, event sourcing for mobility logs, denormalized transport data
- Event-Driven Architecture — RabbitMQ producer/consumer, EventEmitter-based ETL pipeline
- Real-Time Systems — Socket.io with session-authenticated WebSocket connections
- ETL Engineering — FTP ingestion, format conversion, deduplication, idempotent upserts
- DevOps — Docker containerization, Nginx reverse proxy, ELK observability stack, GitLab CI/CD
- Security — Custom IAM, PBKDF2 password hashing, role-based access control, session management
- Technical Leadership — Architecture decisions, trade-off analysis, scaling roadmap