👋 Hi, I’m Martin Orkuma
Data Scientist with an active Secret Clearance and over 4 years of experience turning data into clear insights and actionable strategies.
My work focuses on building reproducible pipelines for analyzing complex datasets.
I am currently pursuing a Master’s degree in Biological Data Science and have experience in statistical modeling, machine learning, and cross-platform analytics workflows.
📫 Connect with me on LinkedIn: https://www.linkedin.com/in/martin-orkuma/
🚀 Technical Toolkit
- Programming & Scripting: Python, R, Bash (WSL/Linux)
- Machine Learning & Analytics: Regression, Classification, Clustering, PCA, Model Evaluation, Cross-Validation, TensorFlow, PyTorch
- Statistical Methods: Hypothesis Testing, ANOVA, Experimental Design, Inferential Statistics, Longitudinal Data Analysis
- Python: Pandas, NumPy, scikit-learn, scikit-image, statsmodels, virtual environments (venv)
- Data Management: SQL (Joins, Common Table Expressions, Window Functions), data cleaning and validation
- Data Visualization: Matplotlib, Seaborn, Tableau, Power BI, ggplot
- Computer Vision / Imaging: OpenSlide, QuPath, whole-slide image (WSI) tiling, HSV-based tissue segmentation
- Platforms & Tools: Git/GitHub, Jupyter, RStudio, WSL, Azure, virtual environments
📂 Featured Repositories
- Naked Mole-Rat Ovarian Follicle Machine-Learning Project
End-to-end machine learning pipeline for automated ovarian follicle segmentation and counting in naked mole-rat histological images, integrating reproducible preprocessing, annotation, model training, and evaluation workflows.
- Human Accelerated Regions Comparative Genomic Project
Human Accelerated Regions (HARs) in Comparative Genomics: Association of Human-Lineage Accelerated Noncoding Regions with Brain Development and Function.
- Anolis Ecomorph Classification
An end-to-end biological data science project that uses Bash and Python to automate data ingestion, cleaning, exploratory analysis, feature engineering, and machine learning on Anolis lizard trait data, with the goal of studying evolutionary patterns in a reproducible ML pipeline.
- Comparative Fetal Ovarian Reserve Analysis
Comparative analysis of fetal ovarian reserve formation across three mammalian species using histological and biological datasets.
🎯 Interests & Focus Areas
- Computational biology
- Computer vision
- Machine learning
- Population health outcomes

