Machine Learning Practice
 

Our machine learning solutions follow a process based on CRISP-DM (the Cross-Industry Standard Process for Data Mining), adapted for integrity & risk management. As shown below, the practice is an iterative, business-objective-driven process whose key elements include learning & prediction data preparation, method selection, model validation, scoring & predictions, results analysis, and continuous improvement. Check out our technical paper Machine Learning for Pipeline Integrity for more details.

Establish Business Case

Establish the business case and identify the expected role and value of incorporating the machine learning practice. Set clear metrics of success, outline the overall process, identify team members & stakeholders, set roles & responsibilities, and secure financial commitments and leadership buy-in.

Define Data Requirements

Assess data requirements against the business case. This is often an iterative process that leverages domain expertise and learning results. Create a clear metadata document identifying the source, cost & value of each data set, and integrate the data into the learning process early to quickly learn its value.
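The metadata document described above can start as a lightweight, structured catalog. The sketch below is a minimal illustration; the record fields and the example source are assumptions, not a fixed schema.

```python
from dataclasses import dataclass

# Minimal sketch of a metadata record for one data source.
# Field names (source, cost_per_year, expected_value, refresh_cadence)
# are illustrative assumptions, not a prescribed standard.
@dataclass
class DataSourceRecord:
    name: str
    source: str           # system of record for the data
    cost_per_year: float  # acquisition / maintenance cost
    expected_value: str   # qualitative value to the learning process
    refresh_cadence: str  # how often the source is updated

catalog = [
    DataSourceRecord(
        name="inline_inspection",
        source="ILI vendor export",        # hypothetical example source
        cost_per_year=50_000.0,
        expected_value="high - direct anomaly measurements",
        refresh_cadence="per inspection run",
    ),
]

def total_annual_cost(records):
    """Sum acquisition costs across the catalog to support the business case."""
    return sum(r.cost_per_year for r in records)
```

Keeping cost and expected value side by side in one record makes the iterative cost/value trade-off explicit as new sources are evaluated.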

Prepare, Integrate, DynSeg & QA Data

Often the most resource-intensive part of machine learning, the preparation, integration, dynamic segmentation, and QA of data is an iterative process. It leverages statistical & feature analysis methods, learning curves to optimize learning data sets, outlier analysis, and purposed quality metrics to ensure data is ready for learning and subsequent prediction.
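To make the QA step concrete, the sketch below computes two simple quality metrics over a data set: per-column completeness and a z-score outlier count. The column names, thresholds, and pandas-based approach are illustrative assumptions, not our specific tooling.

```python
import numpy as np
import pandas as pd

def qa_report(df: pd.DataFrame, z_thresh: float = 3.0) -> dict:
    """Simple data-quality metrics: completeness per column and a
    z-score outlier count per numeric column (threshold is illustrative)."""
    completeness = df.notna().mean().to_dict()
    outliers = {}
    for col in df.select_dtypes(include=[np.number]).columns:
        x = df[col].dropna()
        z = (x - x.mean()) / x.std(ddof=0)
        outliers[col] = int((z.abs() > z_thresh).sum())
    return {"completeness": completeness, "outliers": outliers}

# Hypothetical example data: one extreme value and one missing observation.
df = pd.DataFrame({
    "wall_loss": [1.0] * 10 + [50.0],
    "depth": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9, 2.0, 2.1, 1.9, 2.0, None],
})
report = qa_report(df)
```

Metrics like these can be tracked across QA iterations so "ready for learning" becomes a measurable gate rather than a judgment call.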

Evaluate & Optimize Learning Methods

Hundreds of methods are available to learn underlying patterns; the selection depends on required transparency, process performance, and performance on unseen data processed through the method. Learning curves support optimization of method hyper-parameters, sampling, feature selection, and method type. Once a method is "learned" and "validated," it is defined as a model appropriate to support scoring & predictions.
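One concrete way to optimize hyper-parameters with cross-validation is a grid search, sketched below. The scikit-learn estimator, the synthetic data, and the parameter grid are illustrative choices, not a statement of our production method stack.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic placeholder data standing in for prepared learning data.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Grid search: each hyper-parameter combination is scored by
# 5-fold cross-validation; the best combination is retained.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X, y)

best_model = grid.best_estimator_  # the "learned" method, now a candidate model
```

The same cross-validated machinery extends to comparing method types and sample sizes (learning curves), with the winning configuration passed on to validation.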

Model Validation

The "models" resulting from the learning methods process are validated through cross-validation and testing with unseen observations. Prediction results are then uniquely associated with levels of confidence and other performance measures.
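The validation step above can be sketched as cross-validation on training data plus a final check on a held-out set of unseen observations. The logistic-regression model and synthetic data are placeholder assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic placeholder data; a hold-out split reserves unseen observations.
X, y = make_classification(n_samples=400, n_features=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1)

model = LogisticRegression(max_iter=1000)

# Cross-validation on the training data...
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# ...then a test against observations the model has never seen.
model.fit(X_train, y_train)
test_score = model.score(X_test, y_test)
```

Comparing the cross-validation scores against the hold-out score is a basic check that the model's reported performance will generalize.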

Model Application & Prediction Results

The validated models are then applied to score observations and generate predictions. Each prediction carries its associated level of confidence and other performance measures, and is ready for use to support the business case.
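One way to attach a confidence level to each prediction, as described above, is to use the model's class probabilities. The sketch below assumes a scikit-learn classifier on synthetic data; both are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic placeholder data standing in for scoring observations.
X, y = make_classification(n_samples=200, n_features=6, random_state=2)
model = LogisticRegression(max_iter=1000).fit(X, y)

new_obs = X[:5]                          # observations to be scored
probs = model.predict_proba(new_obs)     # class probabilities per observation
predictions = model.predict(new_obs)     # predicted class labels
confidence = probs.max(axis=1)           # confidence attached to each prediction
```

Pairing each prediction with its probability lets downstream analysis weigh high-confidence results differently from marginal ones.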

Risk Analysis & Decision-Making

Results are normally expressed as levels of susceptibility or probability for the target of interest, and may be monetized to support mitigation decision-making as well as maintenance and capital planning.
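Monetizing model output can be as simple as multiplying a predicted probability by a consequence cost and ranking by the result. The segment names, probabilities, and costs below are hypothetical values for illustration only.

```python
# Hypothetical pipeline segments: failure_prob would come from the model;
# consequence_cost is an assumed monetized consequence of failure.
segments = {
    "MP 10.2-10.8": {"failure_prob": 0.04, "consequence_cost": 2_000_000},
    "MP 23.1-23.6": {"failure_prob": 0.01, "consequence_cost": 5_000_000},
}

def expected_cost(seg: dict) -> float:
    """Monetized risk: probability of failure times consequence cost."""
    return seg["failure_prob"] * seg["consequence_cost"]

# Rank segments by monetized risk to prioritize mitigation spend.
ranked = sorted(segments, key=lambda k: expected_cost(segments[k]), reverse=True)
```

Here the higher-probability segment outranks the higher-consequence one, which is exactly the kind of trade-off a monetized view makes visible for maintenance and capital planning.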