Personal Information
Company/Workplace
United States
Occupation
Data Scientist
Industry
Technology / Software / Internet
Info
I lead the development and deployment of scalable models, with expertise in both real-time and big data architecture.
= Apache: Spark, Hadoop, Pig, Hive, and Oozie.
= Python: scikit-learn, pandas, NumPy, and Luigi.
= R: PivotalR, MADlib, Time Series Analysis with X12-ARIMA.
= Modeling: MLlib, H2O, yhat, Sense
= Machine Learning: Random Forests, Clustering, Association Rules, and Logistic Regression.
= Software Development: Streaming, Distributed Systems, REST APIs.
= Visualization: Matplotlib, ggplot2, Seaborn, and D3.
= Database: Hive, Postgres, SQL
I build data science pipelines and frameworks (see my presentations below).
Tags
model
classification
machine learning
kaggle
predictive analytics
analytics
data science
software
scikit-learn
logistic regression
xgboost
tensorflow
pipeline
pandas
python
gradient boosting
random forest
framework
stock market
regression
market analysis
change point
nfl
fantasy
sports
Presentations (3)
Likes (4)
AlphaPy
Robert Scott • 7 years ago
kaggle_meet_up
Marios Michailidis • 7 years ago
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Vivian S. Zhang • 8 years ago
General Tips for participating Kaggle Competitions
Mark Peng • 8 years ago