Statistics is about using data to understand real-world situations. Part I introduces you to data. You’ll notice that each chapter begins with a story and data, and most have additional stories and more data inside. Among other topics, you can read about the Titanic and Enron. It is important that you start thinking about the context of data so that you can answer questions using statistics and describe your answers in language that is consistent with the story. Communication is critical to statistical analysis.
Every discipline has its own language, and Statistics is no exception. During the semester, you’ll be exposed to many terms that are familiar – variable, variation, distribution, normal, bias, independence, confidence, and random are some of them. As you’re introduced to each of these terms, you will most likely discover that your definition and that of a statistician are not the same. In addition to other terms in Part I, be prepared to learn the statistical definition of variable, variation, distribution, normal, and model.
Part I also introduces you to summary statistics and to ways of displaying data graphically and numerically. You’ll learn to do some of these by hand, but most will be calculated with technology. Technology plays an important role in this course (and in the work of statisticians), so start learning to use JMP as soon as your instructor introduces it.
Throughout this course you’ll see that Statistics deals with the real world, which contains a lot of gray areas. Even though you’ll calculate numbers, the answer is rarely just a number. Instead, it must come with an interpretation in terms of the context of the data and the questions that were asked of the data. If you’re thinking, “This doesn’t sound like any other math class I’ve had,” you’re correct. This course will examine issues that require deep thought, careful analysis, and clear writing.
You will draw scatterplots and look for patterns. You will describe the association in terms of direction, form, and strength. When the form is approximately linear, you will find the correlation. You will use these concepts and some new vocabulary, but be careful not to jump to conclusions about cause and effect.
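To make the idea of correlation concrete, here is a minimal sketch of how the correlation coefficient could be computed from paired data. It is written in Python only for illustration (the course itself uses JMP), and the function name `correlation` and the small data set of study hours and quiz scores are hypothetical, invented for this example.

```python
from math import sqrt


def correlation(xs, ys):
    """Return the correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of products and squares of the deviations from each mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable does not vary")

    return sxy / sqrt(sxx * syy)


# Hypothetical data, made up for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 78]

r = correlation(hours, scores)
print("r =", round(r, 3))
```

A value of r near +1 or -1 indicates a strong linear association and the sign gives its direction, but r only makes sense after a scatterplot has shown that the form is roughly linear, and a strong correlation by itself says nothing about cause and effect.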
You will be introduced to assumptions and conditions for the first of many times. Most statistical procedures are valid only under certain assumptions, and you must think about those assumptions by checking the corresponding conditions. This process continues throughout the course.
Another model is introduced here – the linear model. What constitutes a good linear model? What is the equation of a good linear model? After we have an equation, how do we use the model? Does the slope or y-intercept have meaning? And can we push a model beyond the range where it is useful?
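The questions above can be explored with a short sketch. It assumes that a "good" linear model means the least-squares line, which is the usual choice in an introductory course but is not stated in this overview. The code is Python for illustration only (the course uses JMP), and the function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when x does not vary")

    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data, made up for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 78]

intercept, slope = fit_line(hours, scores)

# Using the model: predicted score for 3 hours of study.
predicted = intercept + slope * 3
print("intercept:", round(intercept, 2))
print("slope:", round(slope, 2))
print("predicted score at 3 hours:", round(predicted, 2))
```

In this made-up setting the slope would be read as the predicted change in score per additional hour of study, and the intercept as the predicted score at zero hours. Zero hours lies outside the observed range of 1 to 5 hours, so that reading is an extrapolation and deserves caution.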
Be prepared to revisit the statistical definition of model, and encounter the terms correlation, linear, and regression.
Adapted from De Veaux, Velleman, & Bock, Intro Stats, 2nd edition, with permission from Addison Wesley.