I’m thrilled you’re here!

I am an enthusiastic data scientist interested in turning data into actionable insights and products. Please take a moment to explore my site, where you’ll find details on my background, experience, skills and more.

To learn more or connect on an opportunity, don’t hesitate to reach out.

Tianshi Wang

Insight Data Science Fellow

tianshi_wang@outlook.com

https://www.linkedin.com/in/tianshi-wang/

Data Science and Machine Learning
Insight project: Buy2Sell, converting collectors to sellers in an online marketplace


Covetly's inventory levels vary widely across categories. To add inventory in the categories that are short, the client wanted this project to suggest a list of collectors who could be prompted to sell.

This project:

•  Consolidated 7 million user behavior records from the client's MongoDB and MSSQL databases
•  Used supervised classification to recommend potential sellers at an 80% confidence level
•  Built a real-time dashboard with Dash and Flask, deployed on AWS EC2, with cached data stored on AWS RDS
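
As a rough illustration of the recommendation step, the sketch below ranks collectors by a predicted probability of selling and keeps those at or above 0.8. Reading the "80% confidence level" as a probability cutoff is an assumption, and the function name, user IDs, and scores are all hypothetical; this is not the project's actual model.

```python
def recommend_sellers(scores, threshold=0.8):
    """Return user IDs whose predicted selling probability meets the threshold.

    scores: dict mapping a (hypothetical) user ID to a classifier probability.
    The result is sorted from the most to the least likely seller.
    """
    selected = [(uid, p) for uid, p in scores.items() if p >= threshold]
    selected.sort(key=lambda item: item[1], reverse=True)
    return [uid for uid, _ in selected]


# Made-up probabilities standing in for real classifier output.
example_scores = {"user_a": 0.91, "user_b": 0.42, "user_c": 0.80, "user_d": 0.79}
recommended = recommend_sellers(example_scores)
```

Raising the threshold shortens the list but should make each prompt more likely to reach a collector who is actually willing to sell.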

The website: http://tianshi-wang.com/

GitHub: https://github.com/tianshi-wang/Buy2Sell_Insight

Project: Metal/nonmetal Classification and Bandgap Prediction

The workflow below shows how the code classifies materials as metal or nonmetal and predicts band gaps. The data ingestion module collects 44 features for each of the ~4100 materials from databases such as the Materials Project. The collected data is then used to train a support vector machine (SVM) classifier and an SVM regressor. Finally, the models are evaluated on a test set. The results show that the trained classifier separates metals from nonmetals with an accuracy of ~90%, and that the band gap prediction (RMSE = 0.7 eV) outperforms DFT-LDA calculations.
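
For reference, the two metrics quoted above can be computed as in the minimal sketch below. Accuracy is the fraction of correctly predicted labels, and RMSE = sqrt((1/N) Σ (y_i − ŷ_i)²), in eV for band gaps. The function names and the toy numbers are made up for illustration and are not results from the project.

```python
import math


def accuracy(true_labels, predicted_labels):
    """Fraction of materials whose metal/nonmetal label is predicted correctly."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def rmse(true_values, predicted_values):
    """Root-mean-square error, in the same unit as the inputs (eV for band gaps)."""
    if len(true_values) != len(predicted_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(true_values, predicted_values)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up toy data, not results from the project.
labels_true = ["metal", "nonmetal", "nonmetal", "metal"]
labels_pred = ["metal", "nonmetal", "metal", "metal"]
gaps_true = [1.0, 2.0, 3.0]  # eV
gaps_pred = [1.5, 2.0, 2.5]  # eV

toy_accuracy = accuracy(labels_true, labels_pred)
toy_rmse = rmse(gaps_true, gaps_pred)
```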

Workflow of the developed code for metal/nonmetal classification and bandgap prediction

Project: Stock Clustering and Selection from S&P500

This algorithm assumes that stocks which performed similarly in the past will likely continue to do so in the near future. A stock that has recently performed worse than the others in its cluster is therefore regarded as a buy signal.

S&P500 stocks were clustered by their daily performance from January 2014 to January 2018 using the k-means method. Based on their performance from February to May 2018, about 40 underperforming (to-buy) and outperforming (to-sell) stocks were selected. The selected to-buy stocks returned 2.2% in June 2018 (about 26% annualized), compared with 0.5% for the market and -0.6% for the to-sell stocks.
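
The procedure can be sketched in a few lines of plain Python: cluster stocks by their historical daily returns with k-means (Lloyd's algorithm), then compare each stock's recent return with the mean of its cluster. This is a toy sketch under stated assumptions: the tickers and numbers are invented, the initial centroids are picked by hand to keep the run deterministic, and the "below or above the cluster mean" rule is one possible selection criterion, not necessarily the one used in the project.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm; returns the cluster index assigned to each point."""
    centroids = [list(c) for c in initial_centroids]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:  # converged: no point changed cluster
            break
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = mean_vector(members)
    return assignment


def pick_signals(tickers, assignment, recent_returns):
    """Within each cluster, flag stocks below the cluster mean as to-buy
    and stocks above it as to-sell (an assumed selection rule)."""
    to_buy, to_sell = [], []
    for k in sorted(set(assignment)):
        members = [t for t, a in zip(tickers, assignment) if a == k]
        cluster_mean = sum(recent_returns[t] for t in members) / len(members)
        for t in members:
            if recent_returns[t] < cluster_mean:
                to_buy.append(t)
            elif recent_returns[t] > cluster_mean:
                to_sell.append(t)
    return to_buy, to_sell


# Hypothetical tickers with invented daily returns (clustering window).
tickers = ["AAA", "BBB", "CCC", "DDD"]
history = [
    [0.010, 0.020, -0.010],
    [0.011, 0.019, -0.009],
    [-0.020, 0.000, 0.030],
    [-0.021, 0.001, 0.029],
]
assignment = kmeans(history, initial_centroids=[history[0], history[2]])

# Invented returns over the later selection window.
recent = {"AAA": 0.05, "BBB": 0.01, "CCC": -0.02, "DDD": 0.03}
to_buy, to_sell = pick_signals(tickers, assignment, recent)
```

In this toy run AAA and BBB fall into one cluster and CCC and DDD into the other, so BBB and CCC (the laggards of their clusters) come out as to-buy.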

Diagram of stock-selection procedure using this algorithm

For more information on the projects and to download the source files, please visit Data Science

First-principles Simulations

My collaborators and I use cutting-edge first-principles methods based on density functional theory to solve challenging problems in materials science, electrical engineering, and mechanical engineering.

My research includes:

  • Electronic structure (defect levels, band edge positions, and band diagrams of heterostructures) of advanced semiconductors

  • Transport properties, e.g. thermal conductivity and electron mobility, in electronic materials, with electron-phonon scattering taken into account

  • Thermodynamic analyses of phase stability and structure-property relationships in materials

Slab model used to calculate band edge positions in ZnSnN2 and GaN

Phonon-electron scattering rates compared to phonon-phonon scattering in SiC

To find out more about my experience with first-principles simulations, click First-principles

Scientific Programming

Developing code to solve real-world problems is important to me. In addition, I have gained experience in large-scale scientific computing on clusters such as SDSC Comet and UTexas Stampede.

The example below is a code that Wei Li and I developed to simulate the epitaxial growth process in molecular beam epitaxy (MBE). By tuning the input parameters, you can observe different growth morphologies: statistical roughening, step-flow, and islanding. We first wrote the code in Python and Cython, based on the KMCInterative code. To increase its running speed, we then rewrote it in C++ using the Qt and OpenGL libraries.
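
To give a feel for how an input parameter changes the surface morphology, here is a toy 1D Monte Carlo deposition model in plain Python. It is not the code described above: atoms land on random sites of a periodic surface, and an optional relaxation rule (an assumed, very simplified stand-in for surface diffusion) lets each atom drop to a lower neighbouring site. All names and parameter values are illustrative.

```python
import random


def simulate_growth(n_sites, n_atoms, relax, seed=0):
    """Deposit n_atoms on a 1D periodic surface and return the height profile.

    relax=False: pure random deposition (atoms stick where they land).
    relax=True: an atom moves to the lower of its two neighbours when that
    neighbour is lower than the landing site.
    """
    rng = random.Random(seed)
    heights = [0] * n_sites
    for _ in range(n_atoms):
        site = rng.randrange(n_sites)
        if relax:
            left = (site - 1) % n_sites
            right = (site + 1) % n_sites
            lowest = min((left, right), key=lambda s: heights[s])
            if heights[lowest] < heights[site]:
                site = lowest
        heights[site] += 1
    return heights


def roughness(heights):
    """Root-mean-square deviation of the surface height from its mean."""
    mean = sum(heights) / len(heights)
    return (sum((h - mean) ** 2 for h in heights) / len(heights)) ** 0.5


rough = simulate_growth(n_sites=50, n_atoms=5000, relax=False)
smooth = simulate_growth(n_sites=50, n_atoms=5000, relax=True)
```

Comparing `roughness(rough)` with `roughness(smooth)` shows the effect of the single `relax` switch: without relaxation the surface roughens statistically, while with it the height differences stay small.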

The features of this code are summarized as follows:

  1. Two-thread design: one thread for 3D rendering and the other for Monte Carlo calculations.

  2. Thread control that lets users pause, resume, and restart the calculations.

  3. More than 40 input/output functions, enabling easy control of the code.

  4. Portable: no installation needed, and available for both Windows and macOS.
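
The pause, resume, and restart control in item 2 can be sketched with Python's standard `threading` module. This only illustrates the control pattern; the actual code is written in C++ with Qt, and the class name, method names, and step counter here are hypothetical. The main thread plays the role of the user interface, and the worker thread stands in for the Monte Carlo calculation.

```python
import threading


class PausableWorker:
    """Runs a fixed number of calculation steps on a background thread."""

    def __init__(self, total_steps):
        self.total_steps = total_steps
        self.steps_done = 0
        self._running = threading.Event()  # set = run, clear = paused
        self._running.set()
        self._stop_requested = threading.Event()
        self._thread = None

    def _loop(self):
        while self.steps_done < self.total_steps and not self._stop_requested.is_set():
            # Block while paused, waking periodically to check for a stop request.
            if not self._running.wait(timeout=0.05):
                continue
            self.steps_done += 1  # placeholder for one Monte Carlo step

    def start(self):
        self._stop_requested.clear()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def pause(self):
        self._running.clear()

    def resume(self):
        self._running.set()

    def stop(self):
        self._stop_requested.set()
        if self._thread is not None:
            self._thread.join()

    def restart(self):
        """Stop the current run, reset the progress, and start again."""
        self.stop()
        self.steps_done = 0
        self._running.set()
        self.start()

    def wait(self):
        """Block until the worker thread has finished."""
        self._thread.join()


worker = PausableWorker(total_steps=1000)
worker.pause()  # begin in the paused state
worker.start()
steps_while_paused = worker.steps_done
worker.resume()
worker.wait()
steps_after_run = worker.steps_done
worker.restart()
worker.wait()
steps_after_restart = worker.steps_done
```

Using an event rather than a busy loop keeps the paused worker idle, so the thread that handles rendering or user input stays responsive.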

To find out more about the code, please visit Scientific Programming

Publications, Presentations and Proposals

My publications include seven first-author and five co-authored peer-reviewed articles (including recently submitted manuscripts). For the complete list, please visit Publications

I have presented my work at 10 meetings, including the APS March Meeting and the MRS Fall Meeting.

In addition, I was involved in preparing six proposals for funding and computational resources. 

©2018 by Tianshi Wang at the University of Delaware.