Summaries and Customer Reviews are supplied by Amazon.com
Summary:
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book.
This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression & path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates.
Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Customer Reviews:
Full of Pleasant Surprises
I originally bought this book for coverage of some topics about which I know nothing. But having had it for just a week, I have been more impressed with what I have learned about stuff that I thought I knew. The material in Chapter 7 is one of several cases in point. The discussion of cross validation is great, and the way it is related to other aspects of the model selection problem is masterly. More generally, the book manages a nice balance between theory and application, and its scope is truly impressive. Thanks to authors and publisher.
authoritative textbook for data mining
This edition adds some essential features for supervised statistical learning, such as supervised principal components, which I find fairly useful.
Not the best textbook for a class
I used this book for my stats course. While I do enjoy reading some parts of the book, I have to say that I am rather disappointed with its presentation.
1. This book assumes that you already have some background and quite a bit of familiarity with the subject.
2. While it covers many topics, most of the material is only "presented" rather than "clearly explained". So while it may be good as a reference book, at least for me, it definitely shouldn't be your main resource when first studying the subject.
3. The authors are definitely experts in the field, and I just hope they will come up with a much better revision of the book.
4. One nice feature of the book ... it contains pretty pictures! Unfortunately, just like the old saying, "a picture contains a thousand words". That's exactly what happens here. Some of the pictures are hard to understand.
It may or may not be fair to give this book 1 star (I might update my rating in the future). But the simple truth is that I was not impressed when I first read the book. It surely falls below my expectations for such a highly acclaimed book.
Has the most post-its of any book on my shelf
This is one of the best books in a difficult field to survey and summarize. Like 'Pattern Recognition', 'Statistical Learning' is an umbrella term for a broad range of techniques of varying complexity, rigor and acceptance by practitioners in the field. The audience for such a text ranges from the user requiring a code library to the mathematician seeking proof of every statement. I sit somewhere in the middle, but more towards the mathematical end. I subscribe to the traditional statistician's view of Machine Learning. It is a term invented in order to avoid having to prove theorems and dodge the rigors of 'real' statistics. However, I strongly support such a course of action. There is an immense need for Machine Learning algorithms, whether they have actual properties or not, and an equal need for books to introduce these topics to people like myself who have a strong mathematical background, but have not been exposed to these techniques.
Hastie & Tibshirani has the most Post-its of any book on my shelf. When my company built a custom multivariate statistical library for our targeted product, we largely followed Hastie & Tibshirani's taxonomy. Their overview of support vector machines is excellent, and I found little of value to me in dedicated volumes like Cristianini & Shawe-Taylor that wasn't covered in Hastie & Tibshirani. Hastie & Tibshirani is another book with excellent visual aids. In addition to some great 2-D representations of complex multidimensional spaces, I thought the 'car going up hill' icon was a very useful cue that the level was going up a notch.
Having praised this book, I can't argue with any of the negative reviews. There is no right answer of where to start or what to cover. This book will be too mathematical for some, insufficiently rigorous for others, but was just right for me. It will offer too much of a hodge-podge of techniques, miss someone's favorite, or offer just the right balance. In the end, it was the best one for me, so if you're like me (someone with a very solid math base, not a mathematician, who appreciates rigor but isn't married to it, and who is looking to self-start on this topic), you'll like it.
not a machine learning book
I owned this book. It introduces machine learning from the traditional statistics point of view. I would recommend other books because (1) you don't want to read on if you find so many typos in the first several pages; (2) the data used in the book is still the iris data, which has only 150 points. How much data does Google have to process?