Data Warehouse Modernization

Predicting Completion Time of Production Jobs using ML

Bitwise used machine learning to help a leading payment technology and software solutions provider increase efficiency by accurately predicting the daily completion time (ETA) of its critical applications. The predictions are used to track daily loads and to notify the support team and stakeholders of any expected delay, so that they can take preventive action.

Client Challenges and Requirements

  • The current support process is time-consuming, based on assumptions, and prone to human error.
  • Identify the factors that predict the expected completion time of production jobs.
  • Derive information from date- and time-related factors and determine how to use it in predictive modelling.
  • Train ML models on large-scale data using ML packages in Python.
  • Deploy and schedule the model to produce a daily prediction report.
  • Generate interactive reports from the predictions.

Bitwise Solution

  • Analysis of historical job logs to extract the factors that contribute to the total execution time of pipelines.
  • Selection, creation, and rejection of features based on their influence on total execution time.
  • Feature engineering to transform date- and time-related features into values an ML model can use, such as trigonometric transformations or one-hot encodings.
  • Evaluation of different solution strategies, such as forecasting and regression.
  • Machine learning model building and tuning using the selected features.
  • Selection of the best algorithms based on evaluation metrics.
  • End-to-end pipeline for daily completion-time predictions for production pipelines.
  • UI and reporting integration for better visualization of predictions.
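
As an illustration of the feature engineering and model selection steps above, here is a minimal sketch. It is not the implementation used in the engagement. The feature names (`start_hour`, `day_of_week`, `input_rows`), the synthetic data, and the two candidate models are assumptions chosen for the example; only the use of Python, Scikit-Learn, trigonometric transformations, and one-hot encodings comes from the case study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split


def cyclic_encode(values, period):
    """Map a periodic value (e.g. hour 0-23) onto the unit circle.

    This keeps hour 23 close to hour 0, which a raw integer does not.
    """
    angle = 2.0 * np.pi * np.asarray(values, dtype=float) / period
    return np.sin(angle), np.cos(angle)


def one_hot(values, n_categories):
    """One-hot encode integer category codes in the range [0, n_categories)."""
    return np.eye(n_categories)[np.asarray(values, dtype=int)]


def build_features(start_hour, day_of_week, input_rows):
    """Assemble the feature matrix from hypothetical job-log columns."""
    hour_sin, hour_cos = cyclic_encode(start_hour, period=24)
    return np.column_stack(
        [hour_sin, hour_cos, one_hot(day_of_week, 7), np.log1p(input_rows)]
    )


# Synthetic stand-in for historical job logs (all values are made up).
rng = np.random.default_rng(0)
n_jobs = 2000
start_hour = rng.integers(0, 24, n_jobs)
day_of_week = rng.integers(0, 7, n_jobs)
input_rows = rng.integers(10_000, 5_000_000, n_jobs)
duration_min = (
    30
    + 10 * np.sin(2 * np.pi * start_hour / 24)  # time-of-day effect
    + 15 * (day_of_week == 0)                   # heavier load on one weekday
    + 4 * np.log1p(input_rows)                  # volume effect
    + rng.normal(0, 3, n_jobs)                  # noise
)

X = build_features(start_hour, day_of_week, input_rows)
X_train, X_test, y_train, y_test = train_test_split(
    X, duration_min, test_size=0.25, random_state=0
)

# Candidate models are an assumption; the case study does not name algorithms.
candidates = {
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each candidate and score it by mean absolute error on held-out jobs.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, model.predict(X_test))

best_name = min(scores, key=scores.get)
print(scores, "best:", best_name)
```

Mean absolute error is used here because it is expressed in minutes and is easy to compare against an operational tolerance; other metrics could be substituted.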

Tools & Technologies We Used

Python
Scikit-Learn
GCP Compute Engine
Google Data Studio

Key Results

Quick, automated training of multiple models for different applications

Expected completion time was predicted within a low error margin of ±10 minutes (with buffer)

Enabled support teams to respond quickly to product owners about data availability
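
An error margin such as the 10-minute figure above could be checked with a simple tolerance metric. The sketch below is an assumption about how that might be done, not the client's reporting logic: the function names, the sample durations, and the interpretation of "buffer" as minutes added to the predicted ETA are all hypothetical.

```python
from datetime import datetime, timedelta


def within_tolerance_rate(actual_min, predicted_min, tolerance_min=10.0):
    """Share of jobs whose absolute prediction error is at most tolerance_min."""
    if len(actual_min) != len(predicted_min) or not actual_min:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(a - p) <= tolerance_min for a, p in zip(actual_min, predicted_min)
    )
    return hits / len(actual_min)


def eta_with_buffer(start_time, predicted_min, buffer_min=0.0):
    """Predicted completion timestamp, padded by an optional safety buffer."""
    return start_time + timedelta(minutes=predicted_min + buffer_min)


# Made-up durations in minutes for four jobs.
actual = [62.0, 45.5, 120.0, 33.0]
predicted = [58.0, 57.0, 113.5, 35.0]

rate = within_tolerance_rate(actual, predicted)  # 3 of 4 jobs within 10 min
eta = eta_with_buffer(datetime(2024, 1, 15, 2, 0), predicted[0], buffer_min=5.0)
print(f"within tolerance: {rate:.0%}, first job ETA: {eta:%H:%M}")
```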
