Computational and Inferential Thinking: The Foundations of Data Science

Books
First-steps
Regression
Clustering
Code-along
Author

Peter Cruz Parrilla

Published

August 18, 2025

About this resource

The Computational and Inferential Thinking (CIT) book is provided here as a resource for those who need a gentler introduction to AI. The book begins with basic statistics and the principles of data science to provide a solid foundation in numerical thinking, the structure of common real-world datasets, and the extraction of relationships from data. It also introduces Python as a tool for data science through easy-to-follow, hands-on exercises. No experience with statistics or Python programming is assumed.

From the author(s)

For whatever aspect of the world we wish to study—whether it’s the Earth’s weather, the world’s markets, political polls, or the human mind—data we collect typically offer an incomplete description of the subject at hand. A central challenge of data science is to make reliable conclusions using this partial information.

In this endeavor, we will combine two essential tools: computation and randomization. For example, we may want to understand climate change trends using temperature observations. Computers will allow us to use all available information to draw conclusions. Rather than focusing only on the average temperature of a region, we will consider the whole range of temperatures together to construct a more nuanced analysis. Randomness will allow us to consider the many different ways in which incomplete information might be completed. Rather than assuming that temperatures vary in a particular way, we will learn to use randomness as a way to imagine many possible scenarios that are all consistent with the data we observe.

Applying this approach requires learning to program a computer, and so this text interleaves a complete introduction to programming that assumes no prior knowledge. Readers with programming experience will find that we cover several topics in computation that do not appear in a typical introductory computer science curriculum. Data science also requires careful reasoning about numerical quantities, but this text does not assume any background in mathematics or statistics beyond basic algebra. You will find very few equations in this text. Instead, techniques are described to readers in the same language in which they are described to the computers that execute them—a programming language.
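
As a rough illustration of the idea above of using randomness to imagine many scenarios consistent with the observed data, here is a minimal Python sketch of one common technique of this kind, bootstrap resampling. The temperature values are made up for illustration, and the function name `bootstrap_means` and its parameters are hypothetical; this is not code from the book.

```python
import random
import statistics


def bootstrap_means(observations, num_resamples=1000, seed=0):
    """Resample the observations with replacement; return the mean of each resample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(observations)
    means = []
    for _ in range(num_resamples):
        # Each resample is one "possible scenario" consistent with the data.
        resample = rng.choices(observations, k=n)
        means.append(statistics.mean(resample))
    return means


# Hypothetical temperature readings in degrees Celsius (made-up values).
temperatures = [14.2, 15.1, 13.8, 16.4, 15.9, 14.7, 15.3, 16.0]

means = sorted(bootstrap_means(temperatures))

# Keep the middle 95% of the resampled means as a rough plausible range.
k = len(means) * 25 // 1000
lower = means[k]
upper = means[-k - 1]

print(f"observed mean: {statistics.mean(temperatures):.2f}")
print(f"approximate 95% interval: {lower:.2f} to {upper:.2f}")
```

Rather than reporting a single average, the sketch uses every observation and reports a range of averages that the same data could plausibly have produced.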

Prerequisites

  • No prerequisites.

Estimated time to complete

TBD: Use the "Improve this page" functionality to add your own estimate!

Questions?

If you have any lingering questions about this resource, please feel free to post to the Nexus Q&A on GitHub. We will improve the materials on this website as additional questions come in.

See also