Introducing Myself

Hi, I'm Luyang Si. I graduated from the University of Illinois Urbana–Champaign in 2025 with a major in Information Sciences + Data Science.

I work at the intersection of data engineering, analytics, and decision science. My focus is building systems that transform messy data into reliable insights and help people make better decisions under uncertainty.

In practice, that means combining data infrastructure, statistical analysis, and behavioral experimentation.

What I Work On

Data Infrastructure

I design and build data pipelines and analytical systems that enable reliable decision-making.

My work includes:

Incremental ETL pipelines
Data warehouse modeling
SQL-based analytical workflows
Data quality validation systems
Metadata and lineage tracking

Technologies I frequently use:

Python
SQL / T-SQL
Azure SQL
Pandas
Data visualization and statistical analysis tools

I focus on building data systems that are:

Reproducible
Scalable
Observable
Easy to maintain

Analytics & Decision Systems

Beyond infrastructure, I'm interested in how analytics can improve real-world decision-making.

I work on systems that combine:

Behavioral data
Statistical evaluation
Structured experiments

to better understand how people reason about risk, uncertainty, and long-term planning.

Current Work

Decision Science Experiments – RtB

I currently work with RtB, an early-stage research-driven company building tools that help individuals improve their decision-making.

My work includes designing and analyzing experiments such as:

Calibration Experiments

Participants assign probabilities to uncertain events and receive feedback on prediction accuracy. We evaluate performance using metrics such as:

Brier Score
Calibration curves
Confidence vs. accuracy patterns

These experiments help reveal systematic biases in human judgment.
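
To make the metrics above concrete, here is a minimal sketch of a Brier score and a binned calibration table. The function names, the five-bin layout, and the sample forecasts are illustrative assumptions, not taken from the RtB experiments.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes, n_bins=5):
    """Return (mean forecast, observed frequency, count) for each non-empty bin."""
    bins = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        # Equal-width bins over [0, 1]; a forecast of exactly 1.0 goes in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, o))
    table = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        observed_rate = sum(o for _, o in pairs) / len(pairs)
        table.append((mean_forecast, observed_rate, len(pairs)))
    return table


# Hypothetical participant data: stated probability and whether the event happened.
forecasts = [0.9, 0.85, 0.7, 0.3, 0.1, 0.65]
outcomes = [1, 1, 0, 0, 0, 1]

score = brier_score(forecasts, outcomes)
table = calibration_table(forecasts, outcomes)
for mean_forecast, observed_rate, count in table:
    print(f"forecast~{mean_forecast:.2f}  observed={observed_rate:.2f}  n={count}")
```

A lower Brier score is better, and a well-calibrated participant shows observed frequencies close to the mean forecast in each bin. Gaps between the two columns are where confidence and accuracy diverge.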

Financial Decision Simulators

Another project involves interactive simulations where participants make decisions about:

Savings and consumption
Risk allocation
Long-term financial planning

These experiments generate behavioral datasets that can be analyzed to understand decision patterns.

Selected Projects

Data Pipeline Architecture

I have built several production-style data pipelines designed for analytics and research workflows.

Key features include:

Incremental ingestion with watermark tracking
Idempotent upserts
Late-arriving data handling
Automated data quality checks
Metadata logging for pipeline runs

These pipelines ensure that analytical datasets remain consistent and reproducible.

Decision Framework Analysis

I also work on extracting structured decision frameworks from interview transcripts.

This involves:

Parsing decision statements
Identifying decision objects
Clustering decisions by structure and trade-offs

The goal is to identify reusable decision frameworks that can inform the design of decision-support tools.

Dataset Discovery Tools

Another project explores how to improve dataset discovery for researchers and analysts.

I built a prototype dataset recommender system that uses:

Semantic similarity
Dataset metadata
Structured dataset catalogs

to help users discover relevant datasets more efficiently.
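
The ranking idea can be sketched with a simple stand-in: cosine similarity over word counts drawn from each catalog entry's description and tags. This is a simplification for illustration only. A semantic approach would typically swap the word counts for embeddings, and the catalog entries and function names below are hypothetical rather than the prototype's own.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(query, catalog, top_k=3):
    """Rank catalog entries by similarity between the query and their metadata."""
    query_vec = Counter(tokenize(query))
    scored = []
    for entry in catalog:
        text = entry["description"] + " " + " ".join(entry["tags"])
        scored.append((cosine(query_vec, Counter(tokenize(text))), entry["name"]))
    # Highest score first; ties broken by name so the order is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for score, name in scored[:top_k] if score > 0]


# Hypothetical structured catalog.
catalog = [
    {"name": "household_finance",
     "description": "Household survey of savings, debt, and consumption",
     "tags": ["finance", "survey"]},
    {"name": "weather_daily",
     "description": "Temperature and rainfall by station",
     "tags": ["climate"]},
    {"name": "forecast_panel",
     "description": "Probability forecasts with resolved outcomes",
     "tags": ["calibration", "survey"]},
]

results = recommend("household savings survey", catalog)
```

Entries with no overlapping terms are dropped rather than returned with a zero score, so an unrelated query yields an empty list instead of arbitrary suggestions.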

Research Interests

The problems I care most about sit between several fields:

Data infrastructure
Decision science
Dataset bias and research data quality

I'm particularly interested in how better data systems can support more reliable knowledge and better decision-making.

Looking Ahead

I'm interested in roles involving:

Data engineering
Analytics and experimentation
Decision-support systems

I enjoy working on problems where data systems, human behavior, and analytical reasoning intersect.