Teaching Statistics: A Bag of Tricks

GUI can be bad and promote bad habits, or it can help develop good habits. With JMP I believe it is the latter. Talk about lazy thinking — but that is the natural result of having such a poor user interface. And perhaps the answers I received from the JMP representatives were not accurate. But I left there being told that I would not be able to automate the data cleaning, recoding, calculations, etc. Modifying code is a great way to begin the learning process in an analytical program. It is not a stopping point, but an entry into a new method.

This may have changed, but it was still the case a couple of years ago anyway. Despite most of my work being with data, I have found few reasons to use the scripting language — that does not mean that others feel the same way. The GUI has many capabilities that are quite sophisticated and, in some ways, reinforce thinking carefully and meaningfully about the data.

In ways that code does not promote — for me. Perhaps, as Martha says, it depends on the world you come from and how your experience has been constructed. I would suggest that programming is a red herring. As I said at the outset, it is neither necessary nor sufficient for good work with data.


As with NHST, that is a loaded statement. I could just as easily say that a good GUI is neither necessary nor sufficient for good data analysis. One other point about JMP is worth noting. Despite its superiority in my view over other GUI-based statistics programs, it is far less well known and less used. I believe this is due to it being under the SAS umbrella — I believe SAS is worried about cannibalizing their more expensive products, and it is a legitimate worry in my mind.

That point is somewhat off topic, but I believe it may be relevant to some of your experiences with exposure to the product. I agree that programming is not required for good and careful data analysis, but it does in fact depend on what you are trying to accomplish. There is a reason Andrew Gelman and the Stan group felt compelled to develop the Stan software. It was because the existing programs did not do something they wanted to do. GUI front ends do not provide the same functionality as the coding languages that underlie them.

This is inevitably true in that they would have to code every bit of existing functionality into the GUI, which is typically not very efficient. Also, I certainly accept your argument that JMP has been adequate to your needs. If a GUI does the trick, then have at it. If it is the JMP scripting language, then awesome. If it is R, Python, Stata, Stan, whatever.


Then use it. The great thing about R and Stan is that they are relatively cheap. I am with Dale in this debate. For beginners, trying to teach programming takes away class time from teaching data literacy and methods. I have learned first-hand that the young generations know how to navigate software, and what is particularly gratifying is that they hand in assignments which demonstrate that they have explored the software and used functions that are not covered in my lectures.

When I taught programming languages, this did not happen… probably because going beyond the lecture notes means having to read the coding manuals, and apparently no one likes to read manuals. Of course, for more advanced students and for research, one has to learn coding. While I agree that it is important to teach good statistical analysis, I think it is a mistake to ignore real-world data issues. It has been my observation that many who have completed graduate-level statistics courses, even with high grades, often have little understanding of the complexities of data and modeling in the real world, even in the real world of research.

I would not argue that we trade good analytical training for data-cleaning skills, as both are essential. My primary argument is the importance of having a data-analysis process in place that can be easily reviewed and audited from start to finish, and the difference in how easily this is done with code relative to a GUI.
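As a concrete illustration of a process that can be reviewed from start to finish, here is a minimal Python sketch (entirely hypothetical; the file name and column names are made up, not anyone's actual workflow):

```python
# Each step of the analysis is an explicit, named function, so a reviewer can
# re-run and audit the whole process in order.
import pandas as pd

def load(path="survey_raw.csv"):
    return pd.read_csv(path)

def clean(df):
    df = df.dropna(subset=["age", "income"])
    return df[df["age"].between(18, 99)]

def recode(df):
    return df.assign(high_income=(df["income"] > 50_000).astype(int))

def analyze(df):
    return df.groupby("high_income")["age"].describe()

if __name__ == "__main__":
    results = analyze(recode(clean(load())))
    results.to_csv("age_by_income_group.csv")  # saved output keeps the run auditable
    print(results)
```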

Perhaps JMP allows for this currently, but when I was considering using it a couple of years ago, it did not. I spent a bit of time looking at their website last night and it does appear that there have been some substantial improvements made to the scripting functionality. However, it is not clear to me how easy it would be.

Here is where it becomes an issue. If it is possible to produce a scripting document that I can review one step at a time, I will be able to make my way through the project easily and provide feedback where I think it might be helpful. If, on the other hand, the GUI you are using does not allow such a script to be produced, then I will have to go into each of the 30 nodes, or notes, or whatever form of documentation you might add to the project, and figure out how you produced what is coming out the other end.

I find it is far easier to review code for complex projects than it is to review GUI processes. Now, perhaps my experience is outdated and this is no longer an issue for most GUI-based interfaces, but that would really surprise me. Kaiser, I actually think data cleaning is very important, and I introduce it in my earliest courses. It is one of the things I like best about JMP — it facilitates cleaning large data sets very well.

I am referring to the analysis nodes in Enterprise Guide, and I take from your response that JMP is not structured similarly. Daniel, I simply do not agree with you. The fact that there is a script does mean there is programming, but it does not follow that therefore people must know how to program.

Now, regarding what should be taught in a program — I completely agree that a statistics or data science program must have programming as part of it — even a large part. I do not think the first course is the appropriate place, except for some fields where students are well-equipped for it. But many people who are not majoring in these subjects will work with data. I think it is naive to believe that you can keep data analysis to the small group of people who have majored in these fields.

Many people will — and should — use data analysis, and one or two courses should be enough to teach them something. Must that something include programming? My answer is no — there are more important things for them to learn first. They could send you the script — or better yet, they could send you the data set with the script saved inside.


Then you just have to run the script to reproduce the analysis. And this is where I think I just completely disagree with you. Jennifer and I are breaking up our book into two. The first volume, Regression and Other Stories, is nearly done and should be available soon; the second volume, on multilevel models, coauthored with Ben Goodrich and Jonah Gabry, will probably come later. Oh, that is wonderful.

Do you think it will be out this summer or later in the year? I do fake-data simulation to understand my model. Interestingly, Amazon also sells the first edition of the book, at significantly more than twice the price of the second edition (both paperback and hardcover). For this second edition we have added chapter 4 on graphics, chapter 14 on teaching statistics to social scientists, chapter 15 on statistics diaries, chapter 16 on a course in statistical communication and graphics, and chapter 21 on teaching data science.

All these new chapters reflect our view of the unity of statistics education: we see no sharp distinction between introductory classes and more advanced teaching. The educational principle of active learning, and the statistical principles of variation and uncertainty, apply at all levels. We have also added new activities in the data collection chapter: a sampling project that involves digital photos, and an alternative taste-testing experiment. There is a new section in the how-to-do-it chapter on the large lecture class that includes experiences with document cameras, clickers, online forums, near-peers, and reproducible documents.

We have added a new first-week-of-class activity in the chapter on the survey sampling class, and a new project description for an analysis of data from a complex survey. You still sell hard copies while, at the same time, making the entire book available for free online. It is tough, after all, for owners of the first edition like me! Andrew, I thought you might have included a section on slicing and dicing data, along the lines of the good work you have done with the Deaton dataset.

That seems to be lost in most intro stats classes. I look forward to this. One result is that engineering schools sometimes create their own stats courses. That may be good, or bad. I hate to disappoint you, but the chapter on lying with statistics is all old material from the first edition. In my opinion it looks spindly and light on the page, and is unpleasant to read.





I love love love this book. Looking forward to reading it. I have had some success over several years with health professionals, generally quite scared of their statistics module. I show them bootstrapping before standard errors, and get them to do an approximate randomisation test by flipping coins and counting how many people in the room had results as or more extreme than the data (8 out of 10 cats prefer Whiskas), before they encounter p-values.
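A rough sketch of that coin-flipping randomisation idea in Python, assuming a 50/50 null and the "8 out of 10 cats" result (the numbers are only illustrative, not anything from the actual classes):

```python
# Approximate the classroom randomisation test in code: under a 50/50 null,
# how often do we see a result as or more extreme than 8 preferences out of 10?
import random

def simulate_tail_probability(n_cats=10, observed=8, n_sims=100_000, seed=1):
    """Estimate P(at least `observed` 'heads' out of `n_cats` fair coin flips)."""
    random.seed(seed)
    extreme = 0
    for _ in range(n_sims):
        heads = sum(random.random() < 0.5 for _ in range(n_cats))
        if heads >= observed:
            extreme += 1
    return extreme / n_sims

print(simulate_tail_probability())  # close to the exact binomial tail of about 0.055
```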

I love fake-data simulation. Regarding your other points: I recommend not teaching p-values at all, and not teaching goodness-of-fit tests either. I think it also does a disservice by suggesting that avoiding (really, more so hiding) assumptions is somehow inherently good. Doing fake-data simulations with clear and purposeful assumptions that are varied, and not calling it a parametric bootstrap, is arguably better. McElreath is apparently trying to work out how to do it with a physical two-stage Galton quincunx. Do you happen to remember the title of this paper or something close to it?
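For readers who have not seen it, here is a minimal fake-data-simulation sketch in Python (not from the book; the model and all the numbers are assumptions chosen for illustration): pick "true" parameter values, simulate data from the assumed model, refit, and check that the fit recovers what was put in.

```python
# Fake-data simulation for a simple linear model y = a + b*x + noise:
# simulate with known a and b, then refit and compare.
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true, sigma = 1.0, 2.5, 0.8   # illustrative "true" values
n = 200

x = rng.uniform(0, 10, size=n)
y = a_true + b_true * x + rng.normal(0, sigma, size=n)

b_hat, a_hat = np.polyfit(x, y, deg=1)  # least squares; returns slope, then intercept
print(f"true:  a = {a_true}, b = {b_true}")
print(f"fit:   a = {a_hat:.2f}, b = {b_hat:.2f}")
```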

I did not cover the bootstrap when I taught the course the next term. Strangely, the bootstrap material seemed to go over very well with the students. Now, we had just covered survey sampling with and without replacement, and the text was Freedman, Pisani, and Purves, which used box models, so it was easy to start with a sample of three and enumerate all possible sample paths of size 3 (that is, all 3^3 = 27 resamples with replacement). Now increase the sample size to show why enumeration becomes impossible but sampling with replacement using the bootstrap remains easy.
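To make that enumeration concrete, here is a small sketch with made-up data values: for a sample of size 3 there are only 27 equally likely resamples with replacement, so the bootstrap distribution of the mean can be written out exactly.

```python
# Enumerate every bootstrap resample of a sample of size 3 and tabulate the
# exact bootstrap distribution of the mean (27 ordered resamples in total).
from itertools import product
from collections import Counter

sample = [2, 5, 9]  # illustrative values only

exact = Counter()
for resample in product(sample, repeat=len(sample)):
    exact[sum(resample) / len(resample)] += 1

for m, count in sorted(exact.items()):
    print(f"mean = {m:5.2f}   probability = {count}/27")
```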

So now they really know what it actually is and does — now what to make of it when used on real data? The Efron paper was convincing that what to make of it when used on real data was really hard, as most statisticians (according to Efron) were not doing it right, mostly because one should not use the vanilla bootstrap but some corrected version of it. The take-home message was that the vanilla bootstrap was mostly a distraction, except perhaps for getting an assessment of whether the sampling distribution of the statistic was approximately normal. So I removed it from the course.

It has not worked well at all with people who have no programming experience. I tried it once with a group of psychology undergrads, and that did more harm than good. They just thought it sounded like a hack (which it is, but a useful hack).



My one experience was that the students sensed it was a pedagogical trick to illustrate a principle, rather than a useful tool, and powered down their brains, checked Facebook, etc. But I am optimistic… Programming experience is probably one of the best ways to get that. A beautiful theory that keeps getting killed by nasty ugly facts ;-) This I do not agree with.

I am not saying it is without value — I am also not saying that students today should not learn programming (is that a double or triple negative?). Programming does help develop many critical thinking skills. I think understanding data and how to think about data is often more easily done before learning programming. This probably reflects my own abilities — programming was always a void in my toolkit, and it is by using smart software that I have been able to analyze data with minimal programming skills.

I am not advocating that as a model for others to follow. But it has taught me that pedagogically it may be better to start with learning about data — how it is measured, what it means, what it looks like, etc. I think the appeal to programming as a prerequisite to learning statistics is just a form of setting a hurdle for entry into the field. I agree with much of what you say, but cannot entirely agree with the last two sentences. I do believe that requiring programming before statistics can unnecessarily limit the field.

But I think that people can genuinely believe that programming should be a prerequisite to statistics simply because that worked well for them, not realizing that different paths may work well for different people. My point was simply that it might be helpful to learn some programming before attempting to learn statistics implicitly from using two-stage simulation (aka ABC, or a Galton machine) — that is, statistics based on data-generating models and priors.
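For what it is worth, here is a minimal sketch of that two-stage idea in Python: stage one draws a parameter from a prior, stage two simulates data given that draw, and you keep the draws whose fake data match the observed data (a crude ABC rejection step). The uniform prior and the "8 successes out of 10" data are assumptions made up for illustration.

```python
# Two-stage simulation / crude ABC rejection for a binomial proportion.
import random

random.seed(2)
observed_successes, n = 8, 10

kept = []
for _ in range(200_000):
    theta = random.random()                                     # stage 1: draw from prior
    successes = sum(random.random() < theta for _ in range(n))  # stage 2: simulate data
    if successes == observed_successes:                         # keep exact matches
        kept.append(theta)

print(f"kept {len(kept)} draws; posterior mean is roughly {sum(kept) / len(kept):.2f}")
# With a uniform prior this should be close to the Beta(9, 3) mean of 0.75.
```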

If people are going to learn this way (two-stage simulation), they will need to get some experience making representations and working with them. Otherwise they are just inducing stories to explain to themselves what is going on. If one has the math to comfortably do that — as almost play rather than hard work — then they can do it that way, but that will likely take more than one or two calculus courses.

I agree with Martha that different paths may work well for different people — so it's not going to be best for everyone. I am perplexed, though, why one would want to do statistics these days without some decent programming skills. A number of times I have taken over work done by statisticians who left, and with an afternoon of programming I was able to do in about 15 to 30 minutes what they had spent days doing. To me and their former supervisors? I am not pretending to be a statistician, but I do not feel any need for programming. I use JMP, if that helps, and it has its own scripting language, but I also rarely need to use that.

I am not advocating that students learn to do statistical work without programming — but I am advocating that they first learn to appreciate data, and that for many, that is better accomplished without programming. I will venture as far as to say that most of the serious problems I have seen with statistical work result from failure to understand your data, conceptual problems with what is being measured and how, or lack of recognition of the limits of an analysis. Failures to use the correct technique are rarely as important, in my experience.

I tried taking a Stanford machine learning course on Coursera — it was good, and I did the conceptual parts of the course fine, but I gave up on the programming assignments after the first one. Far too much time wasted making errors of syntax for which I saw too little benefit. I love statistics and I hate programming. Should you (1) teach a moderate amount of programming first, or (2) avoid teaching programming until some later class?

You give an initial assessment to incoming students, and then you teach two sections of this course: in one you do some programming early, in the other you avoid programming entirely. You look at the distribution of changes in the final-exam-type scores and try to estimate how much having the programming improved outcomes. It seems to me the ideal case, if you can manage it, is to teach two classes and let people self-assort into the one they think is best.
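As a quick illustration of what that comparison could look like, here is a simulation sketch in Python; the sample sizes, effect size, and noise levels are all invented.

```python
# Simulate pre/post scores for two sections, one with early programming (row 1)
# and one without (row 0), and estimate the difference in mean score change.
import numpy as np

rng = np.random.default_rng(3)
n_per_section = 60
assumed_boost = 3.0   # hypothetical extra improvement from early programming

pre = rng.normal(70, 10, size=(2, n_per_section))
gain = rng.normal(5, 8, size=(2, n_per_section))
gain[1] += assumed_boost
post = pre + gain

change = post - pre
estimate = change[1].mean() - change[0].mean()
se = np.sqrt(change[1].var(ddof=1) / n_per_section + change[0].var(ddof=1) / n_per_section)
print(f"estimated effect of early programming: {estimate:.1f} points (se {se:.1f})")
```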

They might frequently make the wrong decision based on fear or what they heard from their friends or whatever. This is a good point. It brings to mind someone who was very adept at programming who volunteered to teach introductory statistics. He had the students learn R — but he did not understand statistics well enough to use R well himself.

So I think his students got a lot of mis-learning. In other words, good programming skills without good understanding of statistics is a recipe for poor use of statistics. Jump is a great program if the data has already been cleaned and prepared for analysis. But it is not a great program for automating the production of large numbers of graphs create reproducible analyses. I am not claiming that a well designed GUI could not be developed that overcame the limitations.

But, I am saying that I have not come across that GUI in my exploration of statistical analytic tools. Making an analysis reproducible and auditable is important.
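To show what "automating the production of large numbers of graphs" can look like in a scripting language, here is a short sketch; the data, group labels, and output directory are all hypothetical stand-ins.

```python
# Produce one plot per group and write each to disk, so the whole figure run
# can be repeated with a single command.
from pathlib import Path

import matplotlib
matplotlib.use("Agg")   # render straight to files
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

outdir = Path("figures")
outdir.mkdir(exist_ok=True)

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "group": np.repeat(list("ABCDE"), 50),
    "x": np.tile(np.arange(50), 5),
    "y": rng.normal(size=250).cumsum(),
})

for name, sub in df.groupby("group"):
    fig, ax = plt.subplots()
    ax.plot(sub["x"], sub["y"])
    ax.set_title(f"Group {name}")
    fig.savefig(outdir / f"group_{name}.png")
    plt.close(fig)
```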


Curious, I could not disagree with you more regarding JMP. Reproducing an analysis is simple by saving the scripts — easy to save with the data sets, which makes it easy to reproduce the analysis and see the code at the same time.
