Predictive ML Experiments

Exploring machine learning techniques in challenging prediction domains to enhance understanding and practical skills.

48%

Top-3 Placement Accuracy

50%

Forex Entry Accuracy

Active Applications

R&D

Ongoing Learning

The Challenge

Horse racing and forex markets represent some of the most challenging domains for predictive modeling. These environments are characterized by high noise, numerous variables, and outcomes that financial institutions and professional handicappers struggle to predict consistently.

Rather than avoiding these difficult domains, I chose them as learning laboratories to push my understanding of feature engineering, model selection, and performance evaluation in real-world scenarios where even modest improvements have significant value.

The Approach

I built two distinct applications, each demonstrating a different aspect of predictive ML: a horse racing predictor that processes daily race cards and historical performance data, and a forex signal generator that analyzes EUR/USD price movements for trading opportunities.

Project Details

Domain

Personal R&D

Focus

Predictive ML Learning

Status

Ongoing Experiments

Technologies

Python, Scikit-learn, Pandas, Web Scraping

Two Experimental Applications

Horse Racing Predictor

Daily race analysis and winner prediction

Data Pipeline

Parses daily race cards, historical performance data, jockey statistics, track conditions, and betting odds to create comprehensive feature sets for each runner.
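As an illustration of the kind of per-runner feature set described above, here is a minimal pandas sketch. The column names (`race_id`, `horse`, `jockey`, `odds`, `finish_pos`), the sample rows, and the `build_features` helper are all hypothetical; the actual pipeline's schema is not shown in this write-up.

```python
import pandas as pd

# Hypothetical race card for one race: one row per runner.
race_card = pd.DataFrame({
    "race_id": [1, 1, 1],
    "horse": ["A", "B", "C"],
    "jockey": ["J1", "J2", "J1"],
    "odds": [2.5, 4.0, 10.0],  # decimal odds
})

# Hypothetical past results: one row per previous run.
history = pd.DataFrame({
    "horse": ["A", "A", "B", "C", "C"],
    "jockey": ["J1", "J1", "J2", "J1", "J2"],
    "finish_pos": [1, 4, 2, 6, 3],
})

def build_features(card: pd.DataFrame, past: pd.DataFrame) -> pd.DataFrame:
    """Join per-horse and per-jockey history onto each runner in the card."""
    past = past.assign(top3=past["finish_pos"] <= 3)
    horse_stats = past.groupby("horse").agg(
        horse_runs=("finish_pos", "size"),
        horse_avg_pos=("finish_pos", "mean"),
        horse_top3_rate=("top3", "mean"),
    ).reset_index()
    jockey_stats = past.groupby("jockey").agg(
        jockey_top3_rate=("top3", "mean"),
    ).reset_index()
    feats = card.merge(horse_stats, on="horse", how="left")
    feats = feats.merge(jockey_stats, on="jockey", how="left")
    # Market-implied win probability, normalized so each race sums to 1.
    raw = 1.0 / feats["odds"]
    feats["implied_prob"] = raw / raw.groupby(feats["race_id"]).transform("sum")
    return feats

features = build_features(race_card, history)
```

The left joins keep runners with no recorded history; their statistics come back as missing values, which a later cleaning step would need to handle.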

Model Approach

Trains fresh models daily using recent historical data, focusing on predicting top-3 finishers rather than outright winners to improve accuracy in this high-variance domain.
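The daily retraining idea can be sketched as follows. This is an assumption-laden example, not the project's actual model: it uses scikit-learn's `LogisticRegression` as a stand-in classifier, synthetic data, a hypothetical 90-day training window, and hypothetical feature names.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for historical runner rows (all columns hypothetical).
n_races, field_size = 200, 8
rows = pd.DataFrame({
    "race_day": np.repeat(np.arange(n_races), field_size),  # one race per day
    "implied_prob": rng.uniform(0.02, 0.5, n_races * field_size),
    "horse_top3_rate": rng.uniform(0.0, 1.0, n_races * field_size),
})
# Label the three runners with the highest noisy score in each race as top 3.
score = (rows["implied_prob"] + 0.3 * rows["horse_top3_rate"]
         + rng.normal(0, 0.2, len(rows)))
rank = score.groupby(rows["race_day"]).rank(ascending=False, method="first")
rows["top3"] = (rank <= 3).astype(int)

FEATURES = ["implied_prob", "horse_top3_rate"]

def train_for_day(data: pd.DataFrame, today: int, window: int = 90):
    """Fit a fresh model on the `window` days strictly before `today`."""
    recent = data[(data["race_day"] >= today - window) & (data["race_day"] < today)]
    model = LogisticRegression()
    model.fit(recent[FEATURES], recent["top3"])
    return model

def pick_top3(model, race: pd.DataFrame) -> pd.DataFrame:
    """Rank the runners of one race by predicted top-3 probability."""
    proba = model.predict_proba(race[FEATURES])[:, 1]
    return race.assign(p_top3=proba).nlargest(3, "p_top3")

today = 150
model = train_for_day(rows, today)
picks = pick_top3(model, rows[rows["race_day"] == today])
```

Training only on rows dated before `today` keeps the current day's results out of the fit, which matters when the model is rebuilt every day.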

Current Performance: 48% Top-3 Accuracy

EUR/USD Trading Signals

Currency pair prediction model

Feature Engineering

Incorporates technical indicators, price momentum, volatility measures, and time-series patterns to identify profitable entry and exit points.
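A minimal sketch of features in this spirit, computed with pandas on a synthetic price series. The window lengths (12, 24, 10/50), the hourly bar size, and the feature names are illustrative assumptions, not the indicators the project actually uses.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic hourly closes: a random walk standing in for real EUR/USD quotes.
close = pd.Series(
    1.10 + rng.normal(0, 0.0005, 500).cumsum(),
    index=pd.date_range("2024-01-01", periods=500, freq="h"),
    name="close",
)

def make_features(close: pd.Series) -> pd.DataFrame:
    """Build features from data up to each bar, plus a next-bar direction label."""
    ret = close.pct_change()
    feats = pd.DataFrame({
        "ret_1": ret,                                   # last-bar return
        "momentum_12": close.pct_change(12),            # 12-bar price momentum
        "volatility_24": ret.rolling(24).std(),         # rolling volatility
        "sma_ratio": close.rolling(10).mean() / close.rolling(50).mean() - 1,
    })
    feats["hour"] = close.index.hour                    # simple time-of-day pattern
    # Label: 1 if the next bar closes higher than the current one.
    feats["target_up"] = (close.shift(-1) > close).astype(int)
    # Drop the last bar (no next close) and the warm-up rows with incomplete windows.
    return feats.iloc[:-1].dropna()

features = make_features(close)
```

Every feature at a given bar uses only prices up to that bar, while the label looks one bar ahead; keeping that separation avoids look-ahead leakage.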

Signal Generation

Focuses on binary buy/sell signals with confidence scoring, emphasizing risk management and position sizing over frequency of trades.
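One way such a signal could be derived from a model's predicted probability is sketched below. The `make_signal` helper, the confidence formula, the 0.2 confidence floor, and the 1% risk cap are hypothetical choices for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Signal:
    side: str          # "buy" or "sell"
    confidence: float  # 0 (coin flip) to 1 (model is certain)
    units: float       # position size in units of the base currency

def make_signal(p_up: float, equity: float, stop_pips: float,
                pip_size: float = 0.0001, min_conf: float = 0.2,
                max_risk: float = 0.01) -> Optional[Signal]:
    """Turn a predicted probability of an up-move into a sized buy/sell signal.

    Returns None when the model is not confident enough to trade.
    """
    confidence = abs(p_up - 0.5) * 2
    if confidence < min_conf:
        return None  # skip low-conviction setups rather than trade often
    side = "buy" if p_up > 0.5 else "sell"
    # Risk at most `max_risk` of equity, scaled down by confidence.
    risk_amount = equity * max_risk * confidence
    # Loss per unit if the stop is hit is stop_pips * pip_size (in quote currency).
    units = risk_amount / (stop_pips * pip_size)
    return Signal(side, confidence, units)

signal = make_signal(p_up=0.7, equity=10_000, stop_pips=20)
skipped = make_signal(p_up=0.55, equity=10_000, stop_pips=20)
```

With these assumed parameters, a 0.7 probability yields a buy of roughly 20,000 units, while a 0.55 probability falls below the confidence floor and produces no trade.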

Entry Accuracy: 50% Profitable

Technical Learning

Feature Engineering in Noisy Domains

Both applications required careful feature selection and engineering to extract signal from noise. Horse racing demanded understanding of racing dynamics, while forex required technical analysis integration.

I learned how much domain expertise matters in feature creation, and how hard time-series prediction becomes in non-stationary environments, where the statistical properties of the data shift over time.
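A common way to evaluate models under non-stationarity is walk-forward validation: train on a sliding window of past samples and test on the samples that immediately follow. The generator below is a minimal sketch of that idea; `walk_forward_splits` and its parameters are hypothetical names, and scikit-learn's `TimeSeriesSplit` offers a related ready-made splitter.

```python
from typing import Iterator, Tuple

def walk_forward_splits(n_samples: int, train_size: int,
                        test_size: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges that slide forward through time.

    Each test block starts right after its training window, so the model
    is never evaluated on data older than what it was trained on.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # advance by one test block

splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
```

Because the training window slides rather than grows, older regimes drop out of the fit, which is one simple way to cope with drifting data.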

Model Performance in Real-World Conditions

Traditional accuracy metrics proved insufficient for these domains. I developed custom evaluation approaches focused on practical utility: top-3 predictions for racing and risk-adjusted returns for forex.

I gained a deep appreciation for the gap between laboratory performance and real-world application, especially in high-stakes prediction scenarios.
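The write-up does not define these metrics precisely, so the following is only one plausible reading: top-3 precision (the share of predicted top-3 runners that actually placed) for racing, and Sharpe ratio plus maximum drawdown as risk-adjusted measures for forex. All function names and the 252-period annualization are assumptions.

```python
import math
from statistics import mean, stdev

def top3_precision(predicted: dict, actual: dict) -> float:
    """Share of predicted top-3 runners that actually finished in the top 3.

    Both arguments map race_id -> collection of runner ids.
    """
    hits = sum(len(set(predicted[r]) & set(actual[r])) for r in predicted)
    total = sum(len(predicted[r]) for r in predicted)
    return hits / total

def sharpe_ratio(returns: list, periods_per_year: int = 252) -> float:
    """Annualized mean return divided by volatility (risk-free rate taken as 0).

    Needs at least two returns with non-zero spread.
    """
    return mean(returns) / stdev(returns) * math.sqrt(periods_per_year)

def max_drawdown(returns: list) -> float:
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = max(worst, 1 - equity / peak)
    return worst

predicted = {1: ["A", "B", "C"], 2: ["D", "E", "F"]}
actual = {1: ["A", "C", "X"], 2: ["F", "Y", "Z"]}
precision = top3_precision(predicted, actual)   # 3 hits out of 6 picks
drawdown = max_drawdown([0.1, -0.5, 0.2])
```

A strategy with a 50% win rate can still have a positive Sharpe ratio if winners are larger than losers, which is why win rate alone says little about a trading signal.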

Data Pipeline Architecture

I built robust data collection and preprocessing pipelines capable of handling inconsistent data sources, missing values, and real-time updates. These systems taught valuable lessons about production ML infrastructure and data quality management.
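A small sketch of the kind of cleaning step this implies, using pandas. The raw columns (`horse`, `odds`, `going`), the `"n/a"` placeholder, and the median-imputation choice are hypothetical examples of scraped-data problems, not the project's actual rules.

```python
import pandas as pd

# Hypothetical scraped rows with stray whitespace, mixed case,
# non-numeric placeholders, missing values, and a duplicate.
raw = pd.DataFrame({
    "horse": [" Alpha", "bravo ", "Charlie", "bravo "],
    "odds": ["2.5", "n/a", "7", "n/a"],
    "going": ["Good", None, "soft", None],
})

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Normalize text, coerce numbers, flag and fill gaps, drop duplicates."""
    out = df.copy()
    out["horse"] = out["horse"].str.strip().str.title()
    # Anything that is not a number becomes NaN instead of raising.
    out["odds"] = pd.to_numeric(out["odds"], errors="coerce")
    # Keep a flag so a model can tell imputed values from observed ones.
    out["odds_missing"] = out["odds"].isna()
    out["odds"] = out["odds"].fillna(out["odds"].median())
    out["going"] = out["going"].str.lower().fillna("unknown")
    return out.drop_duplicates().reset_index(drop=True)

cleaned = clean(raw)
```

In a training pipeline the fill value would normally be computed on the training data only and reused at prediction time, so that statistics from later data do not leak into the model.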

Key Insights

These experiments suggest that even in notoriously difficult prediction domains, thoughtful feature engineering and domain understanding can yield models that outperform random chance by meaningful margins. The real value lies not in perfect predictions but in learning to extract actionable insights from complex, noisy data.
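Whether a top-3 figure beats random chance depends on field size: a runner picked at random from a field of n finishes in the top 3 with probability 3/n. The sketch below computes that baseline for a set of races; the field sizes are made-up examples, since the actual race data is not given here.

```python
def random_top3_baseline(field_sizes: list) -> float:
    """Expected top-3 precision of picking three runners at random per race.

    For a field of n runners each random pick places with probability
    min(3, n) / n; the result is the average over races.
    """
    return sum(min(3, n) / n for n in field_sizes) / len(field_sizes)

# Hypothetical field sizes for four races.
baseline = random_top3_baseline([6, 8, 10, 12])
```

For these example fields the random baseline is about 36%, so a 48% top-3 rate would be a real edge there, whereas in fields of six runners random picks already reach 50%.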

"Satisfies ML curiosity by tackling domains where even modest accuracy improvements have significant value—shows willingness to learn through challenging real-world applications."

Learning

Focus on continuous improvement through challenging applications

Domain

Expertise crucial for effective feature engineering

Real-World

Applications reveal gaps in traditional ML approaches
