Introducing Myself
Hi, I'm Luyang Si, a 2025 graduate of the University of Illinois Urbana–Champaign, where I majored in Information Sciences + Data Science.
I work at the intersection of data engineering, analytics, and decision science. My focus is building systems that transform messy data into reliable insights and help people make better decisions under uncertainty.
My work combines data infrastructure, statistical analysis, and behavioral experimentation to turn complex datasets into actionable knowledge.
What I Work On
Data Infrastructure
I design and build data pipelines and analytical systems that enable reliable decision-making.
My work includes:
- Incremental ETL pipelines
- Data warehouse modeling
- SQL-based analytical workflows
- Data quality validation systems
- Metadata and lineage tracking
Technologies I frequently use:
- Python
- SQL / T-SQL
- Azure SQL
- Pandas
- Data visualization and statistical analysis tools
I focus on building data systems that are:
- Reproducible
- Scalable
- Observable
- Easy to maintain
Analytics & Decision Systems
Beyond infrastructure, I'm interested in how analytics can improve real-world decision-making.
I work on systems that combine:
- Behavioral data
- Statistical evaluation
- Structured experiments
to better understand how people reason about risk, uncertainty, and long-term planning.
Current Work
Decision Science Experiments – RtB
I currently work with RtB, an early-stage research-driven company building tools that help individuals improve their decision-making.
My work includes designing and analyzing experiments such as:
Calibration Experiments
Participants assign probabilities to uncertain events and receive feedback on prediction accuracy. We evaluate performance using metrics such as:
- Brier Score
- Calibration curves
- Confidence vs. accuracy patterns
These experiments help reveal systematic biases in human judgment.
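As an illustration of the metrics listed above, here is a minimal sketch that computes a Brier score and a simple binned calibration table. It is not the code used at RtB: the function names, the equal-width binning, and the sample forecasts are all assumptions made for the example.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes, n_bins=5):
    """Group forecasts into equal-width bins (an assumed binning scheme).

    Returns rows of (bin_index, mean_forecast, observed_frequency, count),
    which are the points one would plot on a calibration curve.
    """
    bins = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[idx].append((p, o))
    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        observed = sum(o for _, o in pairs) / len(pairs)
        rows.append((idx, mean_forecast, observed, len(pairs)))
    return rows


# Hypothetical participant data: stated probabilities and what actually happened.
forecasts = [0.95, 0.85, 0.75, 0.35, 0.25, 0.65]
outcomes = [1, 1, 0, 0, 0, 1]

score = brier_score(forecasts, outcomes)
table = calibration_table(forecasts, outcomes)
```

A lower Brier score is better (0 is perfect). In the table, a bin whose mean forecast sits well above its observed frequency points to overconfidence in that probability range.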
Financial Decision Simulators
Another project involves interactive simulations where participants make decisions about:
- Savings and consumption
- Risk allocation
- Long-term financial planning
These experiments generate behavioral datasets that can be analyzed to understand decision patterns.
Selected Projects
Data Pipeline Architecture
I have built several production-style data pipelines designed for analytics and research workflows.
Key features include:
- Incremental ingestion with watermark tracking
- Idempotent upserts
- Late-arriving data handling
- Automated data quality checks
- Metadata logging for pipeline runs
These pipelines ensure that analytical datasets remain consistent and reproducible.
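To make the first two features concrete, here is a minimal sketch of watermark-based incremental ingestion with an idempotent upsert. It uses an in-memory SQLite database so it runs anywhere; the table names (`events`, `etl_watermark`), the column names, and the sample rows are hypothetical, and a production pipeline on Azure SQL would typically use a T-SQL `MERGE` or equivalent instead of SQLite's `ON CONFLICT`.

```python
import sqlite3


def run_incremental_load(conn, source_rows):
    """Load only rows newer than the stored watermark; re-running is a no-op."""
    cur = conn.cursor()
    cur.execute("SELECT last_updated_at FROM etl_watermark WHERE pipeline = 'events'")
    row = cur.fetchone()
    watermark = row[0] if row else ""  # empty string sorts before any ISO timestamp

    # ISO-8601 strings in one consistent format compare correctly as text.
    new_rows = [r for r in source_rows if r["updated_at"] > watermark]

    for r in new_rows:
        # Upsert keyed on event_id: inserting the same row twice leaves one row.
        cur.execute(
            """
            INSERT INTO events (event_id, payload, updated_at)
            VALUES (?, ?, ?)
            ON CONFLICT(event_id) DO UPDATE SET
                payload = excluded.payload,
                updated_at = excluded.updated_at
            WHERE excluded.updated_at >= events.updated_at
            """,
            (r["event_id"], r["payload"], r["updated_at"]),
        )

    if new_rows:
        high = max(r["updated_at"] for r in new_rows)
        cur.execute(
            """
            INSERT INTO etl_watermark (pipeline, last_updated_at)
            VALUES ('events', ?)
            ON CONFLICT(pipeline) DO UPDATE SET
                last_updated_at = excluded.last_updated_at
            """,
            (high,),
        )
    conn.commit()
    return len(new_rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, payload TEXT, updated_at TEXT)")
conn.execute("CREATE TABLE etl_watermark (pipeline TEXT PRIMARY KEY, last_updated_at TEXT)")

batch = [
    {"event_id": "a", "payload": "v1", "updated_at": "2025-01-01T00:00:00"},
    {"event_id": "b", "payload": "v1", "updated_at": "2025-01-02T00:00:00"},
]
first = run_incremental_load(conn, batch)   # both rows are new
second = run_incremental_load(conn, batch)  # same batch again: nothing to load

# A later update to event "a" arrives alongside the already-loaded rows.
batch2 = batch + [{"event_id": "a", "payload": "v2", "updated_at": "2025-01-03T00:00:00"}]
third = run_incremental_load(conn, batch2)

event_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
payload_a = conn.execute("SELECT payload FROM events WHERE event_id = 'a'").fetchone()[0]
```

This sketch does not cover late-arriving data: a strict watermark filter skips any row whose timestamp is older than the watermark. One common approach, not shown here, is to re-read a lookback window behind the watermark and rely on the upsert to keep the result idempotent.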
Decision Framework Analysis
I also work on extracting structured decision frameworks from interview transcripts.
This involves:
- Parsing decision statements
- Identifying decision objects
- Clustering decisions by structure and trade-offs
The goal is to identify reusable decision frameworks that can inform the design of decision-support tools.
Dataset Discovery Tools
Another project explores how to improve dataset discovery for researchers and analysts.
I built a prototype dataset recommender system that uses:
- Semantic similarity
- Dataset metadata
- Structured dataset catalogs
to help users discover relevant datasets more efficiently.
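The general idea can be sketched as ranking catalog entries by their similarity to a query. The example below is an assumption-heavy stand-in, not the prototype itself: it uses bag-of-words cosine similarity over descriptions and tags in place of true semantic embeddings, and the catalog entries, field names, and stopword list are all hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "and", "by", "of", "the", "with"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(query, catalog, top_k=2):
    """Return names of the top_k catalog entries with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = []
    for entry in catalog:
        text = entry["description"] + " " + " ".join(entry["tags"])
        scored.append((cosine(q, Counter(tokenize(text))), entry["name"]))
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, name breaks ties
    return [name for score, name in scored[:top_k] if score > 0]


catalog = [
    {
        "name": "household_savings",
        "description": "Annual survey of household savings and consumption",
        "tags": ["finance", "survey"],
    },
    {
        "name": "weather_daily",
        "description": "Temperature and rainfall by station",
        "tags": ["climate"],
    },
    {
        "name": "forecast_calibration",
        "description": "Probability forecasts with resolved outcomes",
        "tags": ["forecasting", "calibration"],
    },
]

results = recommend("household savings and consumption data", catalog)
```

Swapping the `Counter` vectors for embeddings from a sentence-encoder model would turn this from keyword overlap into semantic similarity while keeping the same ranking structure.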
Research Interests
The problems I care most about sit between several fields:
- Data infrastructure
- Decision science
- Dataset bias and research data quality
I'm particularly interested in how better data systems can support more reliable knowledge and better decision-making.
Looking Ahead
I'm interested in roles involving:
- Data engineering
- Analytics and experimentation
- Decision-support systems
I enjoy working on problems where data systems, human behavior, and analytical reasoning intersect.