Data for Impact Summer Institute

The Data for Impact Summer Institute is offered in partnership with the Office of Creative Inquiry, the Martindale Center for the Study of Private Enterprise, and the Institute for Data, Intelligent Systems, and Computation (I-DISC). The D4I Summer Institute specifically welcomes applications from rising sophomores and juniors but is open to all Lehigh students. 

Summer 2021

10-week program: June 1st - August 6th
Full details of the Summer 2021 program and application can be found on the Creative Inquiry website.
At this time, both programs are scheduled to operate fully virtually, though some in-person work may be possible in Building C or elsewhere on campus, depending on safety considerations and protocols.

Data for Impact 2021 Projects
Projects led by I-DISC Faculty Members in bold

1. Designing Fast Machine Learning
Project Mentor: Joshua Agar, Materials Science & Engineering
Project Description: Machine learning has provided an opportunity to achieve beyond-human performance in a range of tasks. A less-considered advantage is the ability of machine learning models to make complex decisions with latencies far shorter than human reaction times (>300 ms). Achieving FastML requires careful co-design of models, optimization methods, networking, and hardware to create end-to-end cyberphysical systems. This project will co-design machine learning models for deployment on field-programmable gate arrays for applications with hard latency constraints from milliseconds to nanoseconds. Students will have the opportunity to deploy these models for a variety of applications including additive manufacturing of materials, scanning and electron microscopy, biomedical devices, and much more.

Fellow and Associate positions available. Students from all majors are welcome to apply. 

2. Machine Learning for High-Speed Biomedical Imaging
Project Mentor: Yaling Liu, Bioengineering/Mechanical Engineering
Project Description: Combining the fields of data science, bioengineering, high-speed imaging, and machine learning, this project's goal is to design and build an intelligent and automated platform for high-throughput imaging cell flow cytometry. Students will use data-driven models for the classification of cell types based on optical/fluorescence cell imaging and the physical properties of cells, ultimately investigating the application of machine learning in dynamic intelligent cell sorting. This cell sorting system could be used for early diagnosis of tumor cells, tracking stem cell differentiation, and more.

Fellow and Associate positions available. Interested students will ideally have a background in programming and/or hardware integration for image processing and machine learning implementation.

3. Does Inclusive Finance for the Poor Help Mitigate COVID-19's Impact on Women and Families?
Project Mentor: Todd Watkins, Economics / Martindale Center for the Study of Private Enterprise
Project Description: Students on this project will build a database that can help uncover how effective inclusive financial services and FinTech are for the communities they purport to serve, namely women and lower socioeconomic strata. This short-term information and data gathering will pay off in a longer-term design of solutions to make these services better. The team will begin by examining the microeconomic consequences of the COVID-19 pandemic, particularly for women in the workforce, and whether financial efforts made to help those most affected have worked at all. Future applications of this work could include broader interrogations of payday lending, youth unemployment, and global economic mobility.

Fellow and Associate positions available. Ideal students for this project have interests in one or more of the following: development economics, microfinance, impact investing, FinTech, and related financial inclusion tools. Students ideal for the team will also have some experience with statistical analysis, data visualization, and scraping data for building databases. Open to interested students from any major or minor.

4. Practical Training and Research in Computer-Aided Drug Discovery
Project Mentor: Wonpil Im, Bioengineering/Biological Sciences
Project Description: One of the most remarkable features of proteins is their ability to bind other molecules specifically and reversibly. Such molecular recognition is associated with almost all biological functions in living systems. Drug compounds bind to proteins and regulate their functions to achieve beneficial effects in treating diseases. Therefore, a better understanding of protein-ligand interactions at the molecular level and accurate quantification or prediction of their binding affinity are at the core of computer-aided drug discovery. Students on this project will study protein-ligand interactions computationally using three families of impactful therapeutic targets for cancers and AIDS. In particular, students will gain practical hands-on research experience in computer-aided drug discovery. The lectures and tools in CHARMM-GUI will be used for student learning and research.

Fellow and Associate positions available. Students should have some knowledge of, or interest in, coding/programming or data analysis.

5. Covered Interest Rate Parity in the Crypto-Currency Market
Faculty Mentor: Patrick Zoro, Finance
Project Description: The goal of this project is to disprove the covered interest rate parity argument in the general crypto-currency decentralized finance market. Briefly, covered interest rate parity states that if you borrow in currency A and invest in currency B, and simultaneously sell currency B forward, any profit opportunity should be eliminated.
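The parity condition described above can be illustrated with a minimal sketch. It assumes simple (non-compounded) interest over a single period and uses made-up prices and rates; the function names and numbers are hypothetical and are not taken from the project.

```python
def cip_forward(spot, rate_a, rate_b, years=1.0):
    """Forward price of B (in units of A) implied by covered interest rate parity.

    Assumes simple interest; `spot` is the price of one unit of B in units of A.
    """
    return spot * (1 + rate_a * years) / (1 + rate_b * years)


def arbitrage_profit(spot, forward, rate_a, rate_b, years=1.0):
    """Profit in A from borrowing 1 unit of A, investing in B, and selling B forward."""
    units_b = 1.0 / spot                      # convert the borrowed A into B at spot
    grown_b = units_b * (1 + rate_b * years)  # invest B at its interest rate
    proceeds_a = grown_b * forward            # convert back at the locked-in forward price
    repayment_a = 1 + rate_a * years          # repay the A loan with interest
    return proceeds_a - repayment_a


# Hypothetical inputs, for illustration only.
spot = 2000.0
rate_a, rate_b = 0.05, 0.08

fair = cip_forward(spot, rate_a, rate_b)
profit_at_fair = arbitrage_profit(spot, fair, rate_a, rate_b)      # parity holds
profit_at_quote = arbitrage_profit(spot, 1980.0, rate_a, rate_b)   # quoted forward too high
print(fair, profit_at_fair, profit_at_quote)
```

When the quoted forward equals the parity-implied forward, the round trip earns nothing; any nonzero profit before transaction costs is a deviation from parity of the kind the project sets out to look for.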

Fellow and Associate positions available.

6. Interpretability of a Supervised Learning-based Trading Strategy / Bitcoin Trading Strategy
Faculty Mentor: Patrick Zoro, Finance
Project Description: The primary objective of this project is to interpret a supervised learning-based trading strategy using different approaches and to gain insight into which signals drive it. The strategy is based on a random forest classifier, which already has a feature importance plot for interpretability. The team will generate other forms of interpretability for this approach. We will use a deep artificial neural network (ANN), rather than the random forest, for the trading strategy and for exploring interpretability techniques.

Fellow and Associate positions available.

7. Pennsylvania Index
Faculty Mentor: Patrick Zoro, Finance
Project Description: This project will create a Pennsylvania stock market index and its derivatives. The team will first identify all publicly listed companies headquartered in PA, then create a value-weighted index of these firms, irrespective of industry. We will then monitor this index and publish its current value on the MFE website.

Fellow and Associate positions available.

8. DeFi (Decentralized Finance) Market Analysis
Faculty Mentor: Patrick Zoro, Finance
Project Description: This project will analyze historical trades (confirmed transactions), monitoring large/active accounts. What tokens are they buying and selling? Where are they providing liquidity? What other insights can be gained?

Fellow and Associate positions available.

9. Data-to-Control: Toward Data-Driven Model Predictive Control for Chemical Process Automation
Faculty Mentor: Srinivas Rangarajan, Chemical & Biomolecular Engineering
Project Description: Most chemical and biological processes are dynamical systems. This means that their state variables (i.e., variables that characterize what state the system is in) are continuously changing. Modern plants in the energy and chemical industry have advanced data acquisition technologies, enabled in many cases by solutions offered by OSISoft LLC, the industrial partner on this project. These technologies allow for collecting, storing, and analyzing data from thousands of sensors every second (or faster). Our ultimate goal is to leverage this data to design, optimize, and control new energy and chemical systems. Students will develop algorithms that extract the underlying ordinary differential equations from time-varying data. These algorithms will then allow us to take time-varying plant data and build data-driven dynamic equations that accurately capture the overall process. We specifically intend to build on state-of-the-art algorithms from the applied mathematics community for inferring equations from data, which have been successfully applied in fluid mechanics, by incorporating a number of new features, including infusing chemical engineering domain knowledge as constraints while training the data-driven model.
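As a rough illustration of what extracting an ordinary differential equation from time-varying data can look like, the sketch below fits the coefficients of a small library of candidate terms by least squares. It is a toy under stated assumptions, not the project's algorithm: the single-variable system, the candidate terms (x and x^2), and the synthetic "sensor" data are all hypothetical, and no domain-knowledge constraints are included.

```python
def simulate(r, k, x0, dt, steps):
    """Euler-integrate dx/dt = r*x - (r/k)*x**2 to produce synthetic time-series data."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + dt * (r * x - (r / k) * x * x))
    return xs


def fit_ode(xs, dt):
    """Least-squares fit of dx/dt ~ a*x + b*x**2 from uniformly sampled data."""
    # Estimate derivatives with forward differences between consecutive samples.
    dxdt = [(xs[i + 1] - xs[i]) / dt for i in range(len(xs) - 1)]
    x1 = xs[:-1]
    x2 = [x * x for x in x1]
    # Build and solve the 2x2 normal equations for the library [x, x^2].
    s11 = sum(u * u for u in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    t1 = sum(u * d for u, d in zip(x1, dxdt))
    t2 = sum(v * d for v, d in zip(x2, dxdt))
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


# Hypothetical ground truth: a = 0.8, b = -0.8 / 10 = -0.08.
data = simulate(r=0.8, k=10.0, x0=0.5, dt=0.1, steps=200)
a, b = fit_ode(data, dt=0.1)
print(a, b)
```

Real plant data would be noisy and multivariate, so derivative estimation and the choice of candidate terms would need far more care than this sketch suggests.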


Summer 2020

Data for Impact Summer Institute Expo 2020

Data for Impact Summer Institute 2020

The Data for Impact Summer Institute is one of Lehigh's Creative Inquiry initiatives and is planned in collaboration with the Martindale Center and I-DISC. This is an eight-week program that will run from June 15th - August 7th.

Data for Impact Summer Expo
Learn more about the program & projects: Data for Impact Summer Institute Webpage