
Aviation Shipping and Transportation Analytics


Predicting Flight Delays with AWS & Machine Learning


This project focused on helping a logistics company reduce cargo delays at San Diego International Airport. I built a full machine learning pipeline using AWS SageMaker, S3, and Athena to clean and merge weather and flight data. Using XGBoost, I trained a classification model that reached 95% recall on delayed or canceled flights, meaning it flagged 95% of the flights that actually ended up disrupted. The goal was to support better planning and reduce unplanned disruptions for air cargo shipments.
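To make the recall figure concrete, here is a minimal sketch of how recall is computed for the delayed/canceled class. The `recall` helper and the labels are made up for illustration and are not the project's data or code.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model flagged as positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no positive examples in y_true")
    return true_pos / (true_pos + false_neg)


# Hypothetical labels: 1 = delayed or canceled, 0 = on time.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# 3 of the 4 disrupted flights were caught; the false alarm on an
# on-time flight does not affect recall.
example_recall = recall(y_true, y_pred)
```

Recall only counts missed disruptions, so a model tuned for high recall can still raise false alarms on flights that depart on time. For a planning tool, that trade-off is often acceptable.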







Full Paper Report:


Final Presentation for Non-Technical Audience



Results & Impact

  • Achieved 95% recall on the delayed/canceled flight class, ensuring the model caught nearly all disruptions.

  • Used AWS SageMaker's built-in XGBoost algorithm for scalable, cloud-based training and deployment.

  • Reduced class imbalance through undersampling, leading to more accurate predictions across flight statuses.

  • Connected real-time weather patterns to flight outcomes, helping identify key contributors like wind speed and visibility.

  • Built a modular pipeline with S3 buckets, Athena queries, and a clear data flow from raw ingestion to predictions.

  • Designed the model as a decision-support tool, helping logistics teams plan around potential delays more effectively.
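The undersampling step mentioned above could look roughly like the sketch below. It assumes simple random undersampling down to the size of the smallest class; the page does not say which variant was used, and the `undersample` helper, the `disrupted` field, and the sample rows are all hypothetical.

```python
import random


def undersample(rows, label_key, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        # Sample without replacement so no row is duplicated.
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical data: 90 on-time flights and 10 disrupted flights.
flights = [{"flight_id": i, "disrupted": 0} for i in range(90)]
flights += [{"flight_id": i, "disrupted": 1} for i in range(90, 100)]

balanced = undersample(flights, "disrupted")
```

Undersampling is normally applied only to the training split, so that evaluation metrics such as recall still reflect the real class balance.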

 
 
 


© 2025 by Tanya Ortega | Data Science Portfolio. Powered and secured by Wix 

 
