Python Skills Overview
The projects included here range from basic applied Python to machine learning models. Most are scripts and notebooks, fully open-sourced in my GitHub repository, Python.
Below is a summary of some of the Python packages and tasks I’m familiar with:
1. Data Manipulation & Processing
Libraries: pandas, numpy
Proficient in handling missing data, outliers, and feature engineering
Strong in data aggregation, merging, and transformation
Optimized large dataset operations using vectorized computations
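A minimal sketch of the tasks listed above, assuming a small hypothetical sales table (the column names and values are invented for illustration and are not taken from any project). It fills a missing value with the median, caps outliers at the 1.5 * IQR fences, and aggregates per group, all with vectorized pandas calls rather than Python loops.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one missing value and one obvious outlier.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B", "B"],
    "sales": [100.0, np.nan, 120.0, 110.0, 5000.0],
})

# Fill missing sales with the column median (vectorized, no Python loop).
median_sales = df["sales"].median()
df["sales_filled"] = df["sales"].fillna(median_sales)

# Cap outliers at the 1.5 * IQR fences.
q1, q3 = df["sales_filled"].quantile([0.25, 0.75])
iqr = q3 - q1
df["sales_capped"] = df["sales_filled"].clip(
    lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr
)

# Aggregate the cleaned values per store.
summary = df.groupby("store", as_index=False)["sales_capped"].mean()
print(summary)
```

Median imputation and IQR capping are only one reasonable choice here. The right strategy depends on why the values are missing and whether the extreme values are errors or genuine observations.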
2. Machine Learning & Predictive Modeling
Libraries: scikit-learn, statsmodels
Trained models like Linear Regression, Logistic Regression, Decision Trees, and Random Forests
Applied regularization (LASSO for feature selection, Ridge for coefficient shrinkage), VIF checks, and statistical testing
Optimized models with cross-validation and hyperparameter tuning
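As an illustration of cross-validated hyperparameter tuning, the sketch below runs a small grid search over a Random Forest with scikit-learn. The synthetic dataset and the parameter grid are assumptions chosen to keep the example short, not settings from any actual project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Small hypothetical grid; real searches usually cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

# 5-fold cross-validation on the training split for each combination.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_
# The held-out test split is scored only once, after tuning.
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```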
3. Unsupervised Learning & Clustering
Libraries: scikit-learn, hdbscan
Strong in K-Means, DBSCAN, and hierarchical clustering
Familiar with Principal Component Analysis (PCA) for dimensionality reduction
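The two techniques above are often combined. A minimal sketch, assuming synthetic blob data with three groups: features are standardized, reduced to two principal components, and then clustered with K-Means.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic 6-dimensional data with 3 groups (assumption for illustration).
X, _ = make_blobs(n_samples=200, centers=3, n_features=6, random_state=42)

# Standardize so that no single feature dominates the principal components.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the first two principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

# Cluster in the reduced space.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_2d)

print(pca.explained_variance_ratio_.sum(), len(set(labels)))
```

Here the number of clusters is known in advance. On real data it would typically be chosen with a diagnostic such as the silhouette score.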
4. Time Series & Forecasting
Libraries: statsmodels, pmdarima, prophet (formerly fbprophet)
Built ARIMA, SARIMA, and exponential smoothing models
Used autocorrelation analysis and differencing to identify lag structure and remove trend
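A short sketch of differencing and autocorrelation, using only pandas and NumPy. The series is a hypothetical one built from a linear trend plus a repeating 4-step pattern. First differencing removes the trend, and the autocorrelation of the differenced series at lag 4 exposes the seasonal period.

```python
import numpy as np
import pandas as pd

# Hypothetical series: linear trend plus a repeating 4-step seasonal pattern.
t = np.arange(40)
seasonal = np.tile([0.0, 5.0, 0.0, -5.0], 10)
series = pd.Series(2.0 * t + seasonal)

# First difference removes the linear trend; the first value becomes NaN.
diffed = series.diff().dropna()

# Autocorrelation of the differenced series at the seasonal lag.
lag4_autocorr = diffed.autocorr(lag=4)
print(round(lag4_autocorr, 3))
```

A strong autocorrelation at lag 4 like this would suggest a seasonal term of period 4 when specifying a SARIMA model.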
5. Natural Language Processing (NLP)
Libraries: nltk, spacy, transformers
Performed tokenization, stemming, lemmatization, stopword removal
Built text models using TF-IDF features and BERT, including sentiment analysis
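As a sketch of the TF-IDF approach, the example below trains a tiny sentiment classifier with scikit-learn. The six labelled sentences are invented for illustration, and the logistic regression classifier is an assumption (the list above does not say which classifier was paired with TF-IDF). A real model would need far more data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labelled examples: 1 = positive, 0 = negative.
texts = [
    "great product and I love it",
    "excellent quality, works great",
    "love the fast delivery",
    "terrible product, I hate it",
    "awful quality, broke quickly",
    "hate the slow delivery",
]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF handles tokenization and English stopword removal,
# then logistic regression classifies the weighted term vectors.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(),
)
model.fit(texts, labels)

new_text = ["great quality, love it"]
prediction = int(model.predict(new_text)[0])
probabilities = model.predict_proba(new_text)
print(prediction, probabilities)
```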
6. Data Engineering & Big Data
Libraries: sqlalchemy, pyspark, dask
Built ETL pipelines for large-scale data processing
Optimized SQL queries using Python (SQLAlchemy)
7. Data Visualization
Libraries: matplotlib, seaborn, plotly, folium, geopandas
Created interactive and static visualizations
Built geospatial visualizations using Folium & GeoPandas
8. Model Evaluation & Optimization
Libraries: scikit-learn, shap, lime
Evaluated models using ROC-AUC, confusion matrices, precision-recall
Explained complex models with SHAP & LIME
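The evaluation metrics above can be computed directly with scikit-learn. A minimal sketch, assuming a handful of hypothetical labels and predicted scores, with a 0.5 decision threshold chosen for illustration:

```python
from sklearn.metrics import (
    confusion_matrix,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Hypothetical ground truth and predicted probabilities for the positive class.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.45, 0.7, 0.9, 0.6]

# ROC-AUC is threshold-free: it uses the scores themselves.
auc = roc_auc_score(y_true, y_score)

# The other metrics need hard predictions, here thresholded at 0.5.
y_pred = [int(score >= 0.5) for score in y_score]
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)

print(auc, cm.tolist(), precision, recall)
```

With these values, 13 of the 16 positive/negative pairs are ranked correctly, giving an AUC of 0.8125. The confusion matrix has 3 true negatives, 1 false positive, 1 false negative, and 3 true positives.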
9. Automation & Scripting
Automated data workflows with Python scripts & Jupyter Notebooks
Built Python-based task schedulers and reporting tools
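A minimal sketch of a scheduled reporting task, using only the standard library's sched module. The function names, metrics, and delays are hypothetical. A production scheduler would more likely rely on cron, a task queue, or a dedicated scheduling library.

```python
import sched
import time


def build_report(rows):
    """Format (name, value) pairs as a plain-text report."""
    return "\n".join(f"{name}: {value}" for name, value in rows)


results = []
scheduler = sched.scheduler(time.monotonic, time.sleep)


def report_job():
    # Hypothetical metrics; a real job would query a database or read a file.
    results.append(build_report([("rows_processed", 120), ("errors", 0)]))


# Queue the job twice, 0.1 and 0.2 seconds from now (priority 1).
scheduler.enter(0.1, 1, report_job)
scheduler.enter(0.2, 1, report_job)

# Blocks until every queued event has run.
scheduler.run()

print(results[0])
```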