Big Data & Quantum Mechanics#

Overview of Medford Group research#

MedfordImg

About Prof. AJ Medford#

  • Started as a professor (and this VIP course) in Spring 2017.

  • Experience in developing and contributing to several open-source software packages (CatMAP, ElectroLens, TAPSolver, SPARC, AMPTorch)

  • Instructor for “Data Analytics for Chemical Engineers” and Numerical Methods.

  • Interest in applying data science techniques to problems in quantum chemistry and physics.

Introductions#

We will go around the class and introduce ourselves to everyone. When it is your turn to speak, tell everyone your preferred name, major, and something random about yourself.

How does VIP work?#

The premise of VIP is teams working on projects. Much like a real-world engineering team, individual members work on different aspects of the project. Team members range from sophomores through graduate students, from first-time participants to students who have been involved for four or more semesters. Some students take the course for one credit, and others take it for two credits; naturally, the bar will be higher for those taking it for two credits.

How is VIP graded?#

You will receive a grade for the course based on three criteria:

  • Documentation (33.3%): Based on bi-weekly updates of progress on tasks.

  • Personal Accomplishments (33.3%): Based on how well you achieve your research goals.

  • Teamwork and Participation (33.3%): Peer evaluations will be used to establish how well you work on a team.
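As a rough illustration of the equal weighting above, the sketch below averages the three criteria. It assumes each criterion is scored on a 0-100 scale and treats each 33.3% weight as exactly one third; the function name and the example scores are hypothetical, and the syllabus remains the authority on how grades are actually computed.

```python
def course_grade(documentation, accomplishments, teamwork):
    """Equal-weight average of the three grading criteria.

    Assumption: each score is on a 0-100 scale (not stated in the text).
    """
    scores = (documentation, accomplishments, teamwork)
    for score in scores:
        if not 0 <= score <= 100:
            raise ValueError(f"score {score} is outside the 0-100 range")
    # Each criterion carries one third of the grade.
    return sum(scores) / len(scores)


# Hypothetical scores, for illustration only.
example = course_grade(documentation=90, accomplishments=85, teamwork=95)
print(f"{example:.1f}")  # prints 90.0
```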

Grading process:

  • Bi-weekly on Thursday: Submit “bi-weekly update” and literature review to Canvas. Complete peer grading (instructions in syllabus).

  • Midterm: Submit personal accomplishment documentation to Canvas. Complete peer evaluations. Complete peer grading. Note for returning students: this grade will be counted towards your final grade in the course.

  • Final: Identical to midterm. Final grade will be a weighted average of the midterm and final submissions.

The following deliverables are expected at the midterm and final evaluations. Note that the “personal accomplishment” documentation will be graded using a combination of peer grading and instructor grading, so you will also need to complete the peer grading at each point.

  • Deliverables:

    • Compiled bi-weekly update

    • Personal accomplishment documentation

    • Peer grading

    • Peer evaluation

    • Demonstrated completion of all assigned training materials

See syllabus for more details.

VIP is not like a regular course#

Regular courses have a clear direction:

MarioImg

VIP lets you choose your own adventure:

ZeldaImg

Group Communication:#

  • Slack group used for all communication, join using this link

    • training: discussion related to training project.

    • general: channel for general discussions with the whole group.

Overarching Project: ChatDFT development#

This semester, we are starting to work on a new large language model (LLM) based tool that we hope will make it easier for students to learn how to do DFT calculations. This is also partially educational research, and we will send out various surveys throughout the semester. The bot, named “ChatDFT-F24”, will be accessible via Slack and is based on ChatGPT. This semester, all projects will involve performing DFT simulations with the aid of ChatDFT to study reproducibility in DFT and how LLMs can assist in performing scientific computing tasks. Note that all conversations with ChatDFT will be recorded and may be reviewed and/or used for fine-tuning or development of future versions of ChatDFT. We also appreciate any feedback you have about ChatDFT, including best practices for prompt engineering, ideas about how to improve the architecture, or failure modes that you experience. The tool is far from perfect, and like any LLM it should not be blindly trusted!

All sub-teams will also work with the SPARC DFT code, a recent DFT package developed by Prof. Suryanarayana at Georgia Tech and designed for massively parallel calculations. This code is very powerful but relatively new, and thus less familiar to LLM tools. We hope that through this project we can help tailor an LLM interface that allows new users to quickly learn to use SPARC to perform reliable calculations.

Team Structure: New Students#

All new undergraduate students will automatically join the training subteam:

  • Training (led by Logan Brabson and Sayan Bhowmik) - All new undergraduate students must complete a training program that involves the basics of high-performance computing, common quantum-mechanical techniques such as Density Functional Theory (DFT), and the development and use of machine learning techniques such as neural networks. DFT software packages are well-established, but require the use of supercomputing resources and must be converged with respect to several numerical parameters. The training will span approximately 10 weeks and is highly representative of training programs for new graduate students in this field.
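To give a flavor of what "converged with respect to a numerical parameter" means, here is a minimal sketch of a convergence check. It is not part of the training materials: the function name, the tolerance, and all numbers are made up for illustration, and the choice of parameter and units is left generic. The idea is to repeat a calculation at increasingly fine settings and stop refining once the energy no longer changes by more than a chosen tolerance.

```python
def coarsest_converged_setting(settings, energies, tol=1e-3):
    """Return the coarsest setting whose energy changes by less than `tol`
    when refined to the next setting, or None if no pair is that close.

    `settings` must be ordered from coarsest to finest, with one energy
    per setting.
    """
    if len(settings) != len(energies):
        raise ValueError("settings and energies must have the same length")
    pairs = list(zip(settings, energies))
    # Compare each setting with the next finer one.
    for (setting, energy), (_, next_energy) in zip(pairs, pairs[1:]):
        if abs(next_energy - energy) < tol:
            return setting
    return None


# Hypothetical data: five settings and the total energy obtained at each.
settings = [200, 300, 400, 500, 600]
energies = [-10.100, -10.180, -10.195, -10.1955, -10.1956]
print(coarsest_converged_setting(settings, energies))  # prints 400
```

A consecutive-difference test like this is only one possible criterion; comparing every setting against the finest one is a common alternative.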

Team Structure: Returning Students, OMSCS Students, Post-Training New Students#

Returning students will join one of the “sub-teams” described below, and OMSCS students can decide whether to complete the training exercises, join a sub-team, or both.

Sub-teams#

  • There are several sub-teams, each of which will function as a small research group advised by a graduate mentor. The main tasks in each sub-team will be calculation of adsorption energies, but each team will focus on different systems and mentors may focus on different aspects (e.g. reproducibility, accuracy at different levels of theory, etc.).

    • Simulation of transition metal surfaces (led by Neung-Kyung Yu) - Transition metals are some of the most common catalysts, and adsorption on transition metals has been widely studied. However, there are still open questions about the accuracy of these models since adsorption energies are difficult to measure, and the vast literature provides lots of opportunities to study reproducibility. This team will focus on calculating some well-known adsorption energies and evaluating accuracy and reproducibility.

    • Adsorption on electrochemical oxide surfaces (led by Kaylee Tian) - Electrochemistry and electrocatalysis are promising sustainable alternatives for a wide range of processes including fuel production and waste treatment. However, calculating adsorption energies on oxides can be challenging due to complex electronic structure (magnetism, localized electrons) and the fact that many metal oxides are highly reactive and can change under reaction conditions. This team will focus on the influence of different atomic configurations of oxide surfaces and different DFT settings and initializations on the adsorption energies of intermediates related to electrocatalytic nutrient recovery from waste sludge.

    • Simulations of Nano-porous Materials (led by Lucas Timmerman) - This team will focus on electronic structure theory (EST) calculations for zeolites and metal organic frameworks. These will primarily consist of adsorption energy calculations for materials and adsorbates of interest for sustainability applications, including direct air capture and biomass conversion. Outcomes include basic proficiency in EST calculations as well as a fundamental understanding of the chemistry and physics concepts underpinning at least one of the materials listed above.

  • Students should work on self-defined individual tasks within the scope of a sub-team.

    • Each student should have a clearly defined task that they work on independently. This task should be self-determined and within the scope of the broader goal of the sub-team, based on consultations with other sub-team members and the graduate student advisor.

    • Students may work together on a given task or direction, but should also have individual goals.

    • Students should regularly communicate with their sub-team to (1) coordinate progress on individual tasks to work toward the larger goal of the team, (2) ask for and provide assistance to other sub-team members, and (3) seek advice from and provide updates to the sub-team graduate mentor.

    • It is fully expected that the goals of research tasks change throughout the semester. The goals document can be updated at any time up to 2 weeks prior to the end of the semester. Students should revise goals as needed to ensure they are achievable.

    • Note that the achievements grade is determined by your sub-team advisor. Different advisors may have different expectations and organizational standards. Communicate clearly with your sub-team advisor to make sure that you are meeting their expectations.
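Since every sub-team above centers on adsorption energies, the sketch below shows one common way the quantity is assembled from three separate DFT total energies. This is an assumed convention rather than a definition taken from this course: the function name is hypothetical, the energies are made-up numbers in eV, and sign conventions vary between papers, so check with your mentor which one your sub-team uses.

```python
def adsorption_energy(e_slab_with_adsorbate, e_clean_slab, e_gas_adsorbate):
    """Adsorption energy under an assumed, commonly used convention:

        E_ads = E(slab + adsorbate) - E(clean slab) - E(gas-phase adsorbate)

    With this sign convention a negative value indicates favorable binding.
    All three inputs must use the same units and consistent DFT settings.
    """
    return e_slab_with_adsorbate - e_clean_slab - e_gas_adsorbate


# Hypothetical total energies in eV, for illustration only.
e_ads = adsorption_energy(
    e_slab_with_adsorbate=-250.75,
    e_clean_slab=-240.25,
    e_gas_adsorbate=-9.5,
)
print(f"{e_ads:.2f} eV")  # prints -1.00 eV
```

Because the result is a small difference between large numbers, the three calculations need matching settings, which is one reason reproducibility is a theme across the sub-teams.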

Standard meeting format#

Subsequent meetings will follow one of three formats. We will start each meeting virtually in this Zoom room regardless of the format.

  • Lecture meetings: The main lecture will be used to briefly discuss logistics before breaking into sub-groups. All training students should plan on attending the training breakout session for discussion regarding that week’s lecture(s), while other subgroups will attend their own breakout rooms. Sub-groups where all (or some) of the students are on campus may elect to meet in person or in a hybrid mode based on the preference of group members.

  • Update meetings: For midterm and final updates, each team will post a 10-15 minute update presentation to Canvas, and each student will be assigned 3 update presentations to watch and provide peer reviews before class. During the class time, all students are expected to be present, and we will go through each group to field questions and discuss their work. Any remaining time will be used for sub-group meetings.

  • Workshops: No official lecture topic. The entire lecture will be used as unstructured time to work on projects and interact with mentors and instructors. Training students should use this as an opportunity to join the breakout room of the subteams they are most interested in joining.

Note: If no member of your sub-team is present for the synchronous lecture, then everyone from the group will lose 1/2 point (out of 5) from the teamwork grade. If you cannot attend, please coordinate with your group, and confirm with an instructor at least 24 hours ahead of time if nobody from your group will be there.

Lecture schedule and syllabus#

The course syllabus is available within this book, and includes a list of all the lecture topics and dates.

Week 1 Assignment#

  • Join the Slack channel

  • Start discussion for selecting your group or sub-team project

  • Install necessary software following instructions below.

Software Installation:#

Install Anaconda3#

We’ll be using Python3 and Jupyter notebooks extensively in this class. To access these easily, we’ll need to install Anaconda3. To do that, go to the Anaconda website below and follow the buttons to download and install it (ensure that you’re downloading the correct version for your operating system).

https://www.anaconda.com/distribution/
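Once the installation finishes, a quick sanity check like the one below can confirm that Python 3 is the interpreter being run and that a couple of packages can be found. This is only a sketch: the function name is hypothetical, and the default package list (`numpy` and `notebook`) is an assumption about what the installer provides, so adjust it to whatever the course actually requires.

```python
import importlib.util
import sys


def check_environment(packages=("numpy", "notebook")):
    """Report whether Python 3 is running and which packages are importable.

    The default package names are assumptions, not course requirements.
    """
    report = {"python3": sys.version_info.major == 3}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


print(sys.version)
for item, ok in check_environment().items():
    print(f"{item}: {'found' if ok else 'MISSING'}")
```

Save it to a file and run it with the Python that Anaconda installed; any line marked MISSING points to a package that still needs to be installed.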

Ensure you can access a Linux/Unix prompt#

Windows users:#

Please install the Windows Subsystem for Linux (WSL) with Ubuntu using these instructions:

https://docs.microsoft.com/en-us/windows/wsl/install-win10

Mac users:#

Be sure you can open a terminal.