The Caltech Library offers an Advanced Python Programming workshop to help researchers get their work done in less time and with less pain by teaching them basic research computing skills. This hands-on workshop consists of three one-hour sessions on consecutive days and assumes skills learned in the Introduction to Python workshop. The workshop covers pandas ("Python for data analysis"), command line Python, and importing functions.
Data Visualization 101
This workshop will cover how to choose the right chart type for your data and how good design choices will make your chart easier to understand. The focus will be on visualization best practices, independent of any specific visualization software, and making interactive data visualizations in Python. Learners will wrangle data into the proper format using pandas, create visualizations using Plotly, and display these visualizations and create widgets using Streamlit.
Databases & SQL
The Caltech Library offers an introductory Databases and SQL workshop designed for researchers working with data. This hands-on workshop, spread over four days, covers selecting, sorting, and deduplicating data; creating subsets; calculating derived data and aggregations; and data formats. The final session addresses accessing databases from Python programs.
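To give a sense of what accessing a database from a Python program can look like, here is a minimal sketch using the standard-library sqlite3 module. It is not taken from the workshop materials: the `survey` table, its columns, the `summarize_readings` helper, and the sample values are all hypothetical.

```python
import sqlite3


def summarize_readings(rows):
    """Load (site, reading) rows into an in-memory database and return
    the distinct sites plus the mean reading per site."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table layout, for illustration only.
        conn.execute("CREATE TABLE survey (site TEXT, reading REAL)")
        conn.executemany("INSERT INTO survey VALUES (?, ?)", rows)

        # Deduplicate and sort: one row per site, in alphabetical order.
        sites = [
            row[0]
            for row in conn.execute(
                "SELECT DISTINCT site FROM survey ORDER BY site"
            )
        ]

        # Aggregate: mean reading per site, as a {site: mean} dictionary.
        means = dict(
            conn.execute("SELECT site, AVG(reading) FROM survey GROUP BY site")
        )
        return sites, means
    finally:
        conn.close()


# Hypothetical sample data.
sample = [("DR-1", 2.0), ("DR-1", 4.0), ("MSK-4", 1.5)]
sites, means = summarize_readings(sample)
print(sites)
print(means)
```

The `?` placeholders let sqlite3 pass the values separately from the SQL text, which is generally safer than building queries by string formatting.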
Dissertation Writers Bootcamp
This two-day dissertation bootcamp provides a calm, distraction-free environment for writing, along with refreshments and support. Optional consultations with Caltech librarians and Hixon Writing Center staff address formatting, copyright, data, and writing questions.
Finding U.S. Maps Online
Are you looking for geological or topographic maps online? This class will show various websites where you can find free online maps used in the geological sciences. We’ll show colorful, detailed geologic maps, as well as historical and current topographic maps for the United States. You will learn how to navigate these sites to find exactly what you are looking for. Highly recommended if you are new to the geological sciences, but all are welcome.
Getting Started at the Caltech Library
From our library hours and how to borrow materials to printing large posters at our TechHub makerspace, this online workshop will introduce all the resources available to you at the Caltech Library. Getting Started at the Caltech Library is intended for new Caltech students, but is open to anyone who would like to see the breadth of services available.
Getting Started with LaTeX using Overleaf
This class is designed for beginning LaTeX users. Using the online collaborative LaTeX editor Overleaf, it covers the basics of creating and formatting a document; inserting equations, images, and citations; and exporting to PDF. This class is only for the Caltech community.
Introduction to ArcGIS Online
ArcGIS Online is a geographic information system (GIS) used to create, edit, visualize, and analyze spatial data, all in a cloud-based system that runs in your browser. In this introduction, you’ll get an overview of its features, learn how to access your organization and connect to data sources, and see examples of how to create and share a map. Some basic knowledge of GIS concepts, such as basemaps, layers, and shapefiles, is recommended but not required. This class is open only to members of the Caltech community who have valid, current login credentials. A login link to the Caltech GPS GIS organization will be provided.
Introduction to Machine Learning
This series of three one-hour workshops introduces the main concepts and techniques of machine learning, including defining a machine learning problem, choosing data features and an appropriate algorithm, and handling the challenges of overfitting and underfitting. Algorithms covered include linear regression, logistic regression, k-nearest neighbors, and neural networks and deep learning. The class uses scikit-learn and Keras.
Introduction to Python Programming
The Caltech Library offers a Python programming workshop to help researchers get their work done in less time and with less pain by teaching them basic research computing skills. This hands-on workshop covers basic concepts and tools in the context of scientific data analysis, including working in Jupyter notebooks, basic plotting, programming structures including loops and lists, working with files and conditionals, functions, and defensive programming.
Introduction to QGIS
QGIS is an open-source geographic information system (GIS) software program that has been popular for many years. This introduction offers an overview of the QGIS interface, shows where to download spatial data, and walks through three examples of what you can do with QGIS. No previous knowledge of GIS is required. Highly recommended for first-year geology students, but all are welcome.
Introduction to Tableau Public
Tableau Public is a free program for exploring and understanding data through interactive visualization. It is used by data analysts, data scientists, students, teachers, businesses, and many other kinds of people and organizations. This session will demonstrate how to turn a sample data set into charts and graphs, build a dashboard, and publish it to the web using Tableau Public's online interface.
Introduction to Zotero
Are you writing a research paper or ready to start your thesis? Looking for an easy way to collect, organize, share, and cite bibliographic references? "Introduction to Zotero" may be just the quick-start session you need. Learn how to import citations into Zotero from academic sources, prepare bibliographies, use word processor integration, and work with cloud sharing functionality. Other topics will be covered as audience interest and time allow.
Inventions, Patents, and Licensing: The Process at Caltech
The Office of Technology Transfer and Corporate Partnerships presents an overview of different kinds of intellectual property (IP) and how they are managed at Caltech, with a particular emphasis on patents as a means of protecting inventions developed in the course of Caltech research. Ownership and licensing of Caltech IP are also addressed. A Caltech librarian then presents a brief overview of patent searching, locating English-language patent equivalents, legal status issues, and current awareness techniques.
Managing Data with Pandas
Pandas (Python Data Analysis Library) is a fast, powerful, flexible, and easy-to-use data analysis and manipulation tool built on top of the Python programming language. This one-hour, hands-on workshop covers techniques for selecting and sorting data, creating subsets, and calculating derived data. The workshop is designed to complement the Databases and SQL workshop.
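As an illustration of the kinds of operations named above, here is a short sketch using pandas. It is not drawn from the workshop materials: the column names (`site`, `length_mm`, `length_cm`) and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {
        "site": ["A", "B", "A", "C"],
        "length_mm": [12.0, 30.0, 18.0, 25.0],
    }
)

# Select a single column.
lengths = df["length_mm"]

# Sort rows by a column, largest first.
by_length = df.sort_values("length_mm", ascending=False)

# Create a subset with a boolean mask.
site_a = df[df["site"] == "A"]

# Calculate a derived column from an existing one.
df["length_cm"] = df["length_mm"] / 10

# Aggregate: mean length per site.
mean_by_site = df.groupby("site")["length_mm"].mean()

print(by_length)
print(site_a)
print(mean_by_site)
```

Each step has a rough SQL counterpart (`SELECT`, `ORDER BY`, `WHERE`, a computed column, and `GROUP BY`), which is one way to see how the two workshops relate.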
Text as Data: An Introduction to Natural Language Processing
This introduction to Natural Language Processing (NLP) covers the management and analysis of text using the core Python programming language and the open-source libraries NLTK (Natural Language Toolkit) and spaCy. Some prior experience with Python programming will be useful, but is not assumed.
The Unix Shell
The Caltech Library offers a hands-on workshop on using the shell to manage and automate analysis on your computer. It covers navigating on the command line, managing files and directories, combining multiple commands with pipes, repeating commands with loops, and basic shell scripting. This workshop does not require any prior experience working on the command line.
Version Control with Git
The Caltech Library offers a hands-on workshop on automated version control using Git. It covers managing a Git repository on a local machine and using Git to work with collaborators. No prior experience with Git is required, but you should be comfortable working on the Unix command line, for example from having taken our Unix Shell workshop.
You & Your Thesis
This class offers a brief overview of techniques useful in the production and publication of Caltech electronic theses, including tips on formatting and submitting. It also touches on intellectual property considerations and access, as well as thesis dissemination policies. Additional topics may include author identification (ORCIDs) and preservation of thesis-related research data.