Machine Learning Track

Work experience

2015 SDE, @Amazon

2014 SDE Intern, @Amazon

Used various AWS services and visualization techniques to automatically generate daily usage reports for the Amazon Marketplace Feed and Report Platform services, and provided a web portal for users to view and customize their daily reports.

2012 Research Intern, @THU & Tencent

Used Hadoop to process collected data covering more than 320M users and 3.7B microblogs on a cluster of 36 servers. Studied Tencent Weibo at both the macro and micro levels and discovered interesting differences between Tencent Weibo and Twitter.

Academic experience

2014 Research project, @CS, Columbia U

Developed two probabilistic topic models, Correlated LDA and Correlated HDP, for analyzing topic correlations between large, asymmetric, and potentially weakly-related collections. Used C-LDA to compare over 300k documents in collections of sciences and humanities research from JSTOR.

Developed a Bayesian probabilistic tensor factorization model for generating word vector representations and per-perspective linear transformations from any number of word similarity perspectives. Evaluated the word embeddings on GRE antonym questions and achieved state-of-the-art performance. Paper presented at EMNLP’14; the project will be integrated into IBM Watson.

2014- Research assistant, @CS, Columbia U

Developed a semi-supervised learning application that uses the Author Topic Model and the graph Laplacian to automate role discovery with a small amount of training data.

Introduced a novel mention network built from emails and demonstrated that such a network can help predict organizational dominance in the Enron corporation. Implemented a graph walk and re-ranking algorithm for name disambiguation.

Used natural language processing techniques to perform social event extraction over a large tranche of newspapers and magazines produced by the Taliban, aiming to construct a map of the social network of the Taliban leadership.

Representative projects

Contest Awards