Neural Network for Predicting Network Failure

Exploring AI-driven solutions to enhance network reliability

About the Project

Our research focuses on using neural networks to predict network failures. By analyzing traffic patterns and system metrics, we aim to improve the reliability of network infrastructure and reduce downtime.

Project Proposal

Title: Neural Network for Predicting Network Failure

Objective: To develop a predictive model using neural networks that can analyze network traffic and system metrics to identify potential failure points before they occur.

Approach: Our approach includes traffic pattern analysis, system metric analysis, failure simulations using ns-3 or Mininet, and hybrid modeling techniques. We will implement the solution in Python using TensorFlow or PyTorch.
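
To make this concrete, below is a minimal sketch of the kind of PyTorch model we have in mind: a small feedforward network that maps a fixed-length vector of traffic and system metrics to a single failure logit. The layer sizes, feature count, and loss weighting are illustrative assumptions, not final design decisions.

    import torch
    import torch.nn as nn

    class FailurePredictor(nn.Module):
        """Feedforward classifier over a fixed-length vector of network metrics."""
        def __init__(self, n_features: int, hidden: int = 64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_features, hidden),
                nn.ReLU(),
                nn.Linear(hidden, hidden),
                nn.ReLU(),
                nn.Linear(hidden, 1),  # single logit; sigmoid gives P(failure)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.net(x)

    model = FailurePredictor(n_features=20)  # feature count is a placeholder
    # BCEWithLogitsLoss combines sigmoid + binary cross-entropy; pos_weight
    # up-weights the rare failure class (the value here is an assumption).
    loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([50.0]))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)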

Expected Outcome: A prototype system capable of accurately predicting network failures, contributing to proactive maintenance and improved network reliability.

Midterm Update

Dataset Selection: We have switched from our initial plan of using generic public datasets to adopting the SOFI dataset (“Symptom-fault relationship for IP-network”). This dataset provides a labeled collection of normal (“NE”) versus faulty (“F”) network states captured from a large, emulated IP network. It includes SNMP-based performance metrics and covers a range of artificially induced failures such as link down events, line card failures, and high link utilization.

Reason for the Change: The SOFI dataset’s clear labeling and rich feature set (e.g., inbound/outbound packets, error rates, operational status) make it ideal for training our predictive model. It also aligns with real-world conditions where failure instances can be rare.
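
To illustrate the label mapping, here is a short sketch of the binarization step in pandas. The file name and column names are assumptions for illustration; only the "NE"/"F" state labels come from the dataset itself.

    import pandas as pd

    # Hypothetical file and column names; the real dataset ships SNMP-derived
    # counters (inbound/outbound packets, error rates, operational status)
    # plus a state label that is "NE" (normal) or "F" (faulty).
    df = pd.read_csv("sofi_coreswitch.csv")

    # Binarize the target: normal -> 0, faulty -> 1.
    df["label"] = df["state"].map({"NE": 0, "F": 1})

    # Failures should be rare, mirroring real-world conditions.
    print(df["label"].value_counts(normalize=True))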

Progress: See the biweekly updates below.

Biweekly Update #3

Progress Overview:
Over the past two weeks, we have made substantial progress in building and evaluating machine learning models for network failure prediction using the SOFI ("Symptom-fault relationship for IP-network") dataset. This dataset, developed to capture symptom-fault causal relationships in IP-based enterprise networks, provided a rich and well-labeled foundation for our experiments.

Our initial focus was on implementing two baseline models using PyTorch in Google Colab: a supervised classifier and an unsupervised anomaly detector, corresponding to the first two approaches listed below.
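
For reference, below is a minimal sketch of the kind of training loop we run in Colab for the supervised baseline. The names model, loss_fn, and optimizer follow the earlier sketch, and train_loader is an assumed torch DataLoader over (features, label) batches.

    import torch

    def train_one_epoch(model, train_loader, loss_fn, optimizer, device="cpu"):
        """One pass over the training set; returns the mean per-sample loss."""
        model.train()
        total = 0.0
        for x, y in train_loader:
            x, y = x.to(device), y.to(device).float().unsqueeze(1)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            total += loss.item() * x.size(0)
        return total / len(train_loader.dataset)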

Updated Project Direction:
Based on our results, we’ve expanded the scope of our project to compare four different machine learning approaches:

  1. Supervised Learning (Completed)
  2. Unsupervised Learning (Completed)
  3. Semi-Supervised Learning (In Progress) – Combines labeled and unlabeled data to improve generalization while reducing reliance on fully labeled datasets. We plan to experiment with pseudo-labeling and consistency regularization techniques (a sketch of the pseudo-labeling step follows this list).
  4. Time-Dependent (Temporal) Model (Upcoming) – Will involve sequential models such as RNNs or LSTMs to account for temporal dependencies in network behavior. This model aims to capture how patterns evolve over time, which is essential for forecasting failures.
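
As referenced in item 3, here is a minimal sketch of the pseudo-labeling (self-training) step under our assumptions: model is a trained binary classifier with a single-logit output, and the confidence threshold is a tunable guess.

    import torch

    CONF_THRESHOLD = 0.95  # assumed value; to be tuned

    @torch.no_grad()
    def pseudo_label(model, unlabeled_x):
        """Keep only unlabeled samples the model is highly confident about."""
        model.eval()
        probs = torch.sigmoid(model(unlabeled_x)).squeeze(1)
        confident = (probs >= CONF_THRESHOLD) | (probs <= 1 - CONF_THRESHOLD)
        hard_labels = (probs > 0.5).float()
        return unlabeled_x[confident], hard_labels[confident]

The confidently labeled samples are then merged with the labeled training set and the model is retrained, repeating until few new pseudo-labels are added.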

Next Steps:
Complete the semi-supervised experiments, implement the time-dependent model, and evaluate all four approaches under a common protocol so the results are directly comparable.

Challenges:
Our primary challenges are preparing the data in a format compatible with each model type and ensuring fair comparisons between them. Time-series preprocessing for the temporal model, in particular, is a focus for the coming week; a sketch of the windowing step is shown below. Handling the dataset's class imbalance consistently across model types also remains an active focus.
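
As a concrete example of that time-series preprocessing, below is a sketch of the sliding-window step that turns a chronologically ordered metrics matrix into sequence/label pairs for an RNN or LSTM. The window length is an assumption we will tune.

    import numpy as np

    def make_windows(features, labels, window=10):
        """features: (T, n_features) array ordered by time; labels: (T,) binary.

        Returns X of shape (N, window, n_features) and y of shape (N,),
        where each window of past snapshots predicts the state at its end.
        """
        xs, ys = [], []
        for t in range(window, len(features)):
            xs.append(features[t - window:t])  # the preceding `window` snapshots
            ys.append(labels[t])               # the state we want to forecast
        return np.stack(xs), np.array(ys)

Batches of this shape feed directly into torch.nn.LSTM(input_size=n_features, hidden_size=..., batch_first=True).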

Overall, we’re excited by our progress and believe the comparative evaluation of these models will provide valuable insights for real-world network failure prediction applications.

Final Report Summary

Dataset: We used the SOFI CoreSwitch-II dataset, which captures over 12,000 labeled network states with performance metrics and artificially induced failures. The dataset is highly imbalanced, with failures comprising less than 2% of records.

Preprocessing: We cleaned the data by removing placeholder values, dropped low-variance features, engineered error-based ratios, and scaled inputs. To address class imbalance, we used class weighting and SMOTE for supervised models.
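
A condensed sketch of this pipeline is shown below. The sentinel value, the engineered ratio, and the column names are illustrative assumptions; SMOTE (from the imbalanced-learn package) is applied to the training split only, so the test set keeps its natural imbalance.

    import numpy as np
    import pandas as pd
    from imblearn.over_sampling import SMOTE
    from sklearn.feature_selection import VarianceThreshold
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    def preprocess(df: pd.DataFrame):
        # Drop rows containing the dataset's placeholder value (-1 assumed here).
        df = df.replace(-1, np.nan).dropna()

        # Hypothetical engineered feature: errors as a fraction of inbound traffic.
        df["error_ratio"] = df["in_errors"] / (df["in_packets"] + 1)

        X = df.drop(columns=["state", "label"])
        y = df["label"]

        # Drop constant (zero-variance) features, then split with stratification
        # so the rare failure class appears in both splits.
        X = VarianceThreshold(threshold=0.0).fit_transform(X)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        scaler = StandardScaler()
        X_tr, X_te = scaler.fit_transform(X_tr), scaler.transform(X_te)

        # Oversample failures in the training split only.
        X_tr, y_tr = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
        return X_tr, X_te, y_tr, y_te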

Models Implemented: The four approaches outlined in our midterm update: supervised, unsupervised, semi-supervised (pseudo-labeling), and a time-dependent sequential model.

Key Takeaways:

Conclusion: This project gave us hands-on experience with a wide range of modeling strategies for anomaly detection. Our findings reinforce that model selection, data handling, and evaluation strategy all contribute significantly to real-world performance.

📄 View Full Final Report (PDF)