My projects range from building a Meetup visual analytics dashboard to fighting corruption in New York City. They draw on text mining, predictive modeling, and data visualization. Feel free to check out the code, the visuals, and the reports.

Knowledge Share: AI 101

The future is already here — it's just not very evenly distributed. ---- William Gibson

Political Sentiment Visualization: Data Analysis and Visualization Using Voxgov US Federal Government Media Releases
Rongyao Huang, Mengying Li, Yuqi Bi, Darrick Leow

What does the US Federal Government say in its “boring” media releases? How do Republicans and Democrats differ in what they say? What kinds of emotions are attached to each topic? This collaborative project explores Federal Government media releases with a structural topic model and sentiment analysis. Results are presented as interactive visualizations showcased on our project website.

Powering Social Network Analysis with Graph Data Model: An Example of the Meetup Visual Analytics
Yiwei Zhang, Rongyao Huang, Mengge Li, Rahul Gaur

The increasing popularity of social network services provides a great opportunity to study what we care about and how we interact with others. However, because network relationship data is complex, research and applications are limited by the query efficiency of the relational data model and SQL-like languages. This collaborative project uses the graph database Neo4j to store, extract, and analyze group network data from Meetup.com. The result is an interactive dashboard that lets users query Meetup topics of interest and view the results as interactive visuals.

Language Use in Teenage Crisis Intervention and the Immediate Outcome: A Machine Automated Analysis of Large Scale Text Data
Rongyao Huang (Master's Thesis)

You are what you say. People’s words reveal important information about their identity, emotions, and relationships with others. This provides new insight into the evaluation of teenage crisis intervention. Using text mining, LIWC-based psycholinguistic analysis, and Analysis of Variance, my research reveals significant correlations between language use during a counseling session and the effectiveness of the treatment. For example, teens who felt better after their treatments generally used more prepositions, more conjunctions, and more words representing cognitive processes. Based on the detected language use patterns, predictive models achieve above 75% accuracy in detecting the “better” interventions and above 90% accuracy in detecting the “worse” ones.
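To give a rough sense of what a LIWC-style measure looks like, here is a minimal sketch that computes the share of words in a message falling into each word category. The tiny word lists, the `CATEGORIES` name, and the `category_rates` function are hypothetical stand-ins for illustration; the actual LIWC dictionary is proprietary and much larger, and this is not the code used in the thesis.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; the real LIWC
# dictionary is proprietary and far larger.
CATEGORIES = {
    "preposition": {"in", "on", "at", "with", "to", "about", "for"},
    "conjunction": {"and", "but", "or", "because", "so"},
    "cognitive": {"think", "know", "because", "understand", "realize"},
}


def category_rates(text):
    """Return the share of words in `text` that fall in each category."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {name: 0.0 for name in CATEGORIES}
    counts = Counter()
    for word in words:
        # A word may belong to more than one category, as in LIWC.
        for name, vocab in CATEGORIES.items():
            if word in vocab:
                counts[name] += 1
    return {name: counts[name] / len(words) for name in CATEGORIES}


message = "I think I feel better because I talked to someone about it"
rates = category_rates(message)
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

Rates like these, computed per counseling session, are the kind of features that can then be compared across outcome groups or passed to a predictive model.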

Predicting Salary using Job Description: Topic Modeling and Supervised Learning on NYC Job Posting Data
Rongyao Huang, Greg Werbin, Yiran Dong, Chang Liu

Among the many things that haunt a graduate student’s mind, finding a job is probably the most important and the most monstrous. While an ideal job is the intersection of three sets – what one loves, what one is good at, and what society values – an answer to the third question is enough to guarantee good pay. This collaborative project aims to predict a job’s salary from the text that describes it. We experiment with topic modeling as a dimension reduction method that transforms unstructured text into quantitative dimensions representing latent topics. The results are then fed into statistical models that predict salary levels with up to 84% accuracy.

CTL Google AdWords Campaign (A marketing analytics project at Crisis Text Line)
Rongyao Huang

Crisis Text Line (CTL) is a data-driven NGO start-up providing free crisis intervention to teens 24/7 across the whole United States. During my internship there, I was in charge of a Google AdWords campaign to raise funds and recruit volunteer counselors. With support from my supervisor and the team, I learnt important concepts and strategies of PPC marketing, and even redesigned the organization’s landing page to target potential donors. Below are some designs and AdWords tutorials I developed for the project.

Saving Gotham: Fighting Corruption in New York City’s Property Tax System
Paul Lagunes, Rongyao Huang

The 2002 New York City property assessment scandal has been called the greatest case of municipal fraud in U.S. history. Mayor Michael Bloomberg referred to it as the “largest and most damaging corruption scheme ever conducted within city” of New York. A tax assessor turned consultant masterminded a ploy wherein property owners paid him substantial fees. In exchange, the high-powered consultant secured questionable reductions to their property taxes by bribing government officials. The three-decade-old corruption scheme cost New York City an estimated $40 million in vital revenue a year. This research project examines the scandal closely by studying government reports, news articles, and legal documents, interviewing reporters and prosecutors, and performing quantitative analysis of public indictment data. A paper based on this work is in the final stage of preparation for publication.


All content copyright 2018 Rongyao Huang. Modified from Dan Foreman-Mackey.
