rage against the machine learning
personal website of ryan r. curtin
objective
Understand an algorithm. Make it faster.
education
Ph.D. in Electrical and Computer Engineering
Georgia Institute of Technology, Atlanta, GA
Advisors: Dr. David V. Anderson, Dr. Alexander G. Gray, Dr. Charles L.
Isbell, Jr.
completed August 2015
Master of Science in Electrical and Computer Engineering
Georgia Institute of Technology, Atlanta, GA
received May 2009
Bachelor of Science with Highest Honors in Electrical
Engineering
Georgia Institute of Technology, Atlanta, GA
received May 2008
relevant and recent publications
- ``Rk-means: Fast Clustering for Relational Data''. R.R.
Curtin, B. Moseley, H.Q. Ngo, X.L. Nguyen, D. Olteanu, M. Schleich. In
Proceedings of the 23rd International Conference on Artificial Intelligence and
Statistics (AISTATS 2020), p. 2742-2752, 2020. [pdf]
- ``On Coresets for Regularized Loss Minimization''. A.
Samadian, K. Pruhs, B. Moseley, S. Im, R.R. Curtin. In Proceedings of the
23rd International Conference on Artificial Intelligence and Statistics (AISTATS
2020), p. 482-492, 2020. [pdf]
- ``Flexible numerical optimization with ensmallen''. R.R.
Curtin, M. Edel, R.G. Prabhu, S. Basak, Z. Lou, C. Sanderson. arXiv preprint
arXiv:2003.04103, 2020. [pdf] [code]
- ``mlpack 3: a fast, flexible C++ machine learning
library''. R.R. Curtin, M. Edel, M. Lozhnikov, Y. Mentekidis, S. Ghaisas,
S. Zhang. Journal of Open Source Software, vol. 3, issue 26, p. 726,
2018. [pdf] [code]
- ``Detecting adversarial samples from artifacts''. R.
Feinman, R.R. Curtin, S. Shintre, A.B. Gardner. arXiv preprint
arXiv:1703.00410, 2017. [pdf]
- ``A dual-tree algorithm for fast k-means clustering with
large k''. R.R. Curtin. In Proceedings of the 2017 SIAM
International Conference on Data Mining, p. 300-308, Houston, Texas,
2017. [pdf]
- ``Armadillo: a template-based C++ library for linear
algebra''. C. Sanderson, R.R. Curtin. Journal of Open Source
Software, vol. 1:26, p. 1-2, 2016. [pdf]
- ``Tree-independent dual-tree algorithms''. R.R. Curtin,
W.B. March, P. Ram, D.V. Anderson, A.G. Gray, C.L. Isbell, Jr. In
Proceedings of the 30th International Conference on Machine Learning (ICML
'13), p. 1435-1443, Atlanta, Georgia, 2013. [pdf]
skills
- Extensive knowledge of Linux and related UNIX-like systems (as well
as Windows)
- Good understanding of and experience with 1930s automotive
technology
- Extremely comfortable with C and C++ as well as a plethora of other
languages and design paradigms
- Experience with distributed, multicore, and GPU technologies such
as MPI, OpenMP, OpenCL, CUDA, and others
- Basic machining knowledge: lathes, mills, drill presses, routers,
saws, etc.
- Knowledgeable about state-of-the-art machine learning techniques for
classification, regression, density estimation, and other similar tasks
- Experienced with hand-optimizing programs for substantial runtime
improvement
- Amateur metallurgist
- Conversant in circuit design and physical implementation
- Skilled woodworker
- Nationally-known indoor kart racer (multiple national-level
wins, 12th overall in KWC 2019)
- Capable of operating voting machines
professional experience
RelationalAI, Atlanta, GA
Summer 2018 - present
Computer Scientist
At RelationalAI my work consists of developing new accelerated in-database
algorithms for machine learning problems, as well as helping design and
implement the database system on which these algorithms will be run. My work
at RelationalAI consists of a few main themes:
- Design the machine learning API for RelationalAI's database
query language (Rel), and bind in
support from external tools such as mlpack and other machine learning libraries
available in Julia.
- Work as part of a research team to develop and implement an
automatic differentiation system for use inside of Rel (a fully declarative
language).
- Implement query optimization strategies to allow efficient
training of machine learning models on relational data, without materializing
joins unnecessarily.
Symantec Corporation, Atlanta, GA
Center for Advanced Machine Learning
Fall 2015 - Summer 2018
Principal Research Scientist
My responsibilities at Symantec fell into roughly three categories:
- Pursue a research programme loosely focused on Symantec-relevant
applications such as malware classification and related tasks
- Continue work as lead developer of mlpack (http://www.mlpack.org), a C++ machine learning
library
- Apply machine learning approaches to internal Symantec problems, or
help other internal Symantec groups improve their machine learning
approaches
Georgia Institute of Technology, Atlanta, GA
Fall 2009 - Fall 2015
Graduate Research Assistant
At various times I worked for four different labs.
I was/am also the primary developer and maintainer for mlpack, an open-source scalable C++ machine
learning library that is in use by scientists worldwide, currently with over
125k downloads and 150 contributors.
I was also involved as a TA or guest lecturer for multiple courses and
groups.
Compuglobalhypermeganet, L.L.C., Atlanta, GA
Spring 2013 - present
CEO/Founder
I do machine learning consulting and advisement.
Google, Inc., Mountain View, CA
Summer 2010
Software Engineering Intern
I worked with the Similar Pages team to provide improved search results.
Georgia Tech Research Institute, Atlanta, GA
Food Processing Technology Division
Fall 2009 - Spring 2010
Graduate Research Assistant
I applied machine learning techniques for stress detection in broiler
chickens.
Georgia Tech Research Institute, Atlanta, GA
ELSYS Lab
Spring 2009 - Fall 2009
Graduate Research Assistant
I investigated techniques for the A-to-D frontend of a radar warning
receiver.
Nexidia, Inc., Buckhead, GA
Summer 2007
Research Intern
I created voice synthesizers that can generate missing samples and still
be comprehensible.
advising, mentoring, and professional service
Through both Google Summer of Code and the labs I have worked for, I have
advised and mentored a number of students.
- 5 Master's students from 2010 to 2014.
- 6 undergraduate students from 2010 to 2015.
- 19 Summer of Code students from 2013 to 2020.
I have also served in a number of volunteer positions.
- Co-organizer for MLOSS 2018 workshop at NIPS [site]
- Program Committee (PC) for EDML 2019
- Reviewer for The Journal of Machine Learning Research (JMLR),
ICML, CVPR, WACV, ICLR, MLOSS 2015 workshop at NIPS,
Science of Computer Programming, GlobalSIP 2014, Transactions
on Knowledge and Data Engineering (IEEE TKDE)
- Fedora Package Maintainer (2013-present)
- President, Linux Users Group at Georgia Tech (2006-2011)
- Treasurer, Eta Kappa Nu, Beta Mu chapter (2007-2009)
full publication list
(journal publications)
- ``The ensmallen library for flexible numerical
optimization''. R.R. Curtin, M. Edel, R. Prabhu, S. Basak, Z. Lou, C.
Sanderson. The Journal of Machine Learning Research (JMLR), vol. 22, p.
1-6, 2021.
- ``Functional Aggregate Queries with Additive Inequalities''.
M.A. Khamis, R.R. Curtin, B. Moseley, H.Q. Ngo, X. Nguyen, D. Olteanu, M.
Schleich. ACM Transactions on Database Systems (TODS) 45.4, pp. 1-41,
2020.
- ``Practical Sparse Matrices in C++
with Hybrid Storage and Templated-Based Expression Optimisation''. C.
Sanderson, R.R. Curtin.
Mathematical and Computational Applications, vol. 24, no. 3, article 70,
2019. [pdf] [html] [bib] [code]
- ``mlpack 3: a fast, flexible machine learning library''.
R.R. Curtin, M. Edel, M. Lozhnikov, Y. Mentekidis, S. Ghaisas, S. Zhang.
The Journal of Open Source Software, vol. 3, issue 26, p. 726, 2018.
[pdf]
- ``Exploiting the structure of furthest neighbor search for fast
approximate results''. R.R. Curtin, J. Echauz, A.B. Gardner.
Information Systems, 2018. [pdf]
- ``gmm_diag and gmm_full: C++ classes
for multi-threaded Gaussian mixture models and Expectation-Maximisation''.
C. Sanderson, R.R. Curtin. The Journal of Open Source
Software, vol. 2, 2017. [pdf]
- ``Armadillo: a template-based C++ library for linear
algebra''. C. Sanderson, R.R. Curtin. Journal of Open Source
Software, vol. 1:26, pp. 1-2, 2016. [pdf]
- ``Plug-and-play runtime analysis for dual-tree algorithms''.
R.R. Curtin, D. Lee, W.B. March, P. Ram. The Journal of Machine
Learning Research, vol. 16, p. 3269-3297, 2015. [pdf]
- ``Dual-tree fast exact max-kernel search''. R.R. Curtin,
P. Ram. Statistical Analysis and Data Mining, vol. 7, issue 4, p.
229-253, 2014. [pdf]
- ``mlpack: a scalable C++ machine learning library''. R.R.
Curtin, J.R. Cline, N.P. Slagle, W.B. March, P. Ram, N.A. Mehta, A.G. Gray. In
The Journal of Machine Learning Research (JMLR), vol. 14, p. 801-805,
2013. [pdf]
(conference and workshop publications)
- ``An Approximation Algorithm for the Matrix Tree Multiplication
Problem''. M. Abo Khamis, R.R. Curtin, S. Im, B. Moseley, H. Ngo, K. Pruhs,
A. Samadian. The 46th International Symposium on Mathematical
Foundations of Computer Science (MFCS 2021), vol. 202, p. 6:1-6:14,
2021.
- ``An Adaptive Solver for Systems of Linear Equations''. C.
Sanderson, R.R. Curtin. In The 14th International Conference on
Signal Processing and Communication Systems (ICSPCS '20), pp. 1-6,
2020. [pdf]
- ``Rk-means: Fast Clustering for Relational Data''. R.R.
Curtin, B. Moseley, H.Q. Ngo, X.L. Nguyen, D. Olteanu, M. Schleich. In
Proceedings of the 23rd International Conference on Artificial Intelligence and
Statistics (AISTATS 2020), p. 2742-2752, 2020. [pdf]
- ``On Coresets for Regularized Loss Minimization''. A.
Samadian, K. Pruhs, B. Moseley, S. Im, R.R. Curtin. In Proceedings of the
23rd International Conference on Artificial Intelligence and Statistics (AISTATS
2020), p. 482-492, 2020. [pdf]
- ``On functional aggregate queries with additive
inequalities''. M.A. Khamis, R.R. Curtin, B. Moseley, H.Q. Ngo, X.L. Nguyen, D.
Olteanu, M. Schleich. In Proceedings of the 2019 ACM SIGMOD/PODS International
Conference on Management of Data, p. 414-431, 2019. [pdf]
- ``Detecting DGA domains with recurrent neural networks and side
information''. R.R. Curtin, A.B. Gardner, S. Grzonkowski, A. Kleymenov, A.
Mosquera. Proceedings of The 14th International Conference on
Availability, Reliability, and Security, p. 1-10, 2019.
- ``ensmallen: a flexible C++ library for efficient function
optimization''. S. Bhardwaj, R.R. Curtin, M. Edel, Y. Mentekidis, C.
Sanderson. Proceedings of the Systems for ML Workshop at NeurIPS 2018,
2018. [pdf]
- ``A User-Friendly Hybrid Sparse Matrix Class in C++''. C.
Sanderson, R.R. Curtin. Proceedings of The 2018 International Congress on
Mathematical Software (ICMS 2018), p. 422-430, South Bend, Indiana, 2018. [pdf] [bib]
[code]
- ``An open source C++ implementation of multi-threaded Gaussian
Mixture Models, k-means and expectation maximisation''. C. Sanderson, R.R.
Curtin. Proceedings of the 11th International Conference on Signal Processing
and Communication Systems (ICSPCS 2017), p. 1-8, Surfers Paradise, Gold
Coast, Australia, 2017. [pdf]
- ``pfsuper: simulation-based prognostics to monitor and predict
sparse time series''. J. Echauz, A.B. Gardner, R.R. Curtin, N. Vasiloglou,
G.J. Vachtsevanos. In Annual Conference of the Prognostics and
Health Management Society 2017 (PHM '17), p. 1-9, St. Petersburg, Florida,
2017. [pdf]
- ``A dual-tree algorithm for fast k-means clustering with large
k''. R.R. Curtin. In Proceedings of the 2017 SIAM International Conference on
Data Mining, p. 300-308, Houston, Texas, 2017. [pdf]
- ``Fast approximate furthest neighbors with data-dependent
candidate selection''. R.R. Curtin, A.B. Gardner. In Similarity Search
and Applications 2016 (SISAP 2016), p. 221-235, Tokyo, Japan, 2016.
[pdf]
- ``Faster dual-tree traversal for nearest neighbor search''.
R.R. Curtin. In Similarity Search and Applications, p. 77-89, Glasgow,
Scotland, 2015. [pdf]
- ``Collaborative filtering via matrix decomposition in
mlpack''. S. Agrawal, R.R. Curtin, S. Ghaisas, M.R. Gupta. In
ICML 2015 Workshop on Machine Learning Open Source Software, Lille,
France, 2015. [pdf]
- ``An automatic benchmarking system''. M. Edel, A. Soni,
R.R. Curtin. In NIPS 2014 Workshop on Software Engineering for
Machine Learning, Montreal, Canada, 2014. [pdf]
- ``Classifying broiler chicken condition using audio data''.
R.R. Curtin, W. Daley, D.V. Anderson. GlobalSIP 2014 Symposium
on Signal Processing Applications Related to Animal Environments, Atlanta,
Georgia, 2014. [pdf]
- ``Tree-independent dual-tree algorithms''. R.R. Curtin,
W.B. March, P. Ram, D.V. Anderson, A.G. Gray, C.L. Isbell, Jr. In
Proceedings of The 30th International Conference on Machine Learning (ICML
'13), p. 1435-1443, Atlanta, Georgia, 2013. [pdf]
- ``Fast exact max-kernel search''. R.R. Curtin, P. Ram, A.G.
Gray. In SIAM International Conference on Data Mining (SDM '13), p. 1-9,
Austin, Texas, 2013. Nominated for Best Paper Award. [pdf]
- ``mlpack: a scalable C++ machine learning library''. R.R.
Curtin, J.R. Cline, N.P. Slagle, M.L. Amidon, A.G. Gray. In NIPS 2011
Workshop on Big Learning, Granada, Spain, 2011. [pdf]
- ``Learning distances to improve phoneme classification''.
R.R. Curtin, N. Vasiloglou, D.V. Anderson. In Proceedings of the 2011 IEEE
International Workshop on Machine Learning in Signal Processing (MLSP 2011),
p. 1-6, Beijing, China, 2011. [pdf]
(technical reports/other)
- ``Flexible numerical optimization with ensmallen''. R.R.
Curtin, M. Edel, R.G. Prabhu, S. Basak, Z. Lou, C. Sanderson. arXiv preprint
arXiv:2003.04103, 2020. [pdf] [code]
- ``A generic and fast C++ optimization framework''. R.R.
Curtin, S. Bhardwaj, M. Edel, Y. Mentekidis. arXiv preprint
arXiv:1711.06581, 2017. [pdf]
- ``Designing and building the mlpack open-source machine
learning library''. R.R. Curtin, M. Edel. Submitted to The Fourth
International Conference of PUST (ICOPUST 2017); conference cancelled.
2017. [pdf]
- ``Detecting adversarial samples from artifacts''. R.
Feinman, R.R. Curtin, S. Shintre, A.B. Gardner. arXiv preprint
arXiv:1703.00410, 2017. [pdf]
- ``Improving dual-tree algorithms''. R.R. Curtin. Ph.D. thesis, Georgia
Institute of Technology, 2015. [pdf]
- ``Single-tree GMM training''. R.R. Curtin. Technical
report GT-CSE-2015-01, Georgia Institute of Technology, School of Computational
Science and Engineering, 2015. [pdf]
references available upon request.