


Tshepo Chris Nokeri - Data Science Solutions with Python

Data Science Solutions with Python: Fast and Scalable Models Using Keras, PySpark MLlib, H2O, XGBoost, and Scikit-Learn




Availability: Normally available in 15 days
Due to Brexit-related supply problems, delivery delays are possible.


PRICE
37,98 €
NICEPRICE
36,08 €
DISCOUNT
5%



This product qualifies for FREE SHIPPING
when you select the Corriere Veloce (express courier) option at checkout.


Can also be paid for with Carta della cultura giovani e del merito, 18App Bonus Cultura, and Carta del Docente





Details

Genre: Book
Language: English
Publisher: Apress
Publication date: 10/2021
Edition: 1st ed.





Description

Apply supervised and unsupervised learning to solve practical and real-world big data problems. This book teaches you how to engineer features, optimize hyperparameters, train and test models, develop pipelines, and automate the machine learning (ML) process. 

The book covers PySpark, an in-memory, distributed cluster computing framework; the machine learning frameworks scikit-learn, PySpark MLlib, H2O, and XGBoost; and Keras, a deep learning (DL) framework.

The book starts by presenting supervised and unsupervised ML and DL models, and then examines big data frameworks along with ML and DL frameworks. Author Tshepo Chris Nokeri considers a parametric model known as the Generalized Linear Model and two survival regression models, the Cox Proportional Hazards model and the Accelerated Failure Time (AFT) model. Also presented are a binary classification model (logistic regression) and an ensemble model (Gradient Boosted Trees). The book introduces DL and an artificial neural network known as the Multilayer Perceptron (MLP) classifier. It covers cluster analysis using the K-Means model and explores dimension reduction techniques such as Principal Components Analysis and Linear Discriminant Analysis. Finally, it introduces automated machine learning.

This book is for intermediate-level data scientists and machine learning engineers who want to learn how to apply key big data frameworks and ML and DL frameworks. You will need prior knowledge of basic statistics, Python programming, probability theory, and predictive analytics.



What You Will Learn
  • Understand widespread supervised and unsupervised learning, including key dimension reduction techniques
  • Know the big data analytics layers such as data visualization, advanced statistics, predictive analytics, machine learning, and deep learning
  • Integrate big data frameworks with a hybrid of machine learning frameworks and deep learning frameworks
  • Design, build, test, and validate machine learning models and deep learning models
  • Optimize model performance using data transformation, regularization, outlier remedying, hyperparameter optimization, and data split ratio alteration

 

Who This Book Is For

Data scientists and machine learning engineers with a basic knowledge and understanding of Python programming, probability theory, and predictive analytics.





Table of Contents

Chapter 1: Understanding Machine Learning and Deep Learning.

Chapter goal: It carefully presents supervised and unsupervised ML and DL models and their application in the real world. 

  • Understanding Machine Learning.

  • Supervised Learning.

    • The Parametric Method.

    • The Non-parametric Method.

    • Ensemble Methods.

  • Unsupervised Learning.

    • Cluster Analysis.

    • Dimension Reduction.

  • Exploring Deep Learning.

  • Conclusion.

Chapter 2: Big Data Frameworks and ML and DL Frameworks.

Chapter goal: It explains a big data framework known as PySpark, machine learning frameworks such as SciKit-Learn, XGBoost, and H2O, and a deep learning framework called Keras.

  • Big Data Frameworks and ML and DL Frameworks.

  • Big Data.

    • Characteristics of Big Data.

  • Impact of Big Data on Business and People.

    • Better Customer Relationships.

    • Refined Product Development.

    • Improved Decision-Making.

  • Big Data Warehousing.

    • Big Data ETL.

  • Big Data Frameworks.

    • Apache Spark.

      • Resilient Distributed Datasets.

      • Spark Configuration.

      • Spark Frameworks.

  • ML Frameworks.

    • SciKit-Learn.

    • H2O.

    • XGBoost.

  • DL Frameworks.

    • Keras.

  • Conclusion.

Chapter 3: The Parametric Method – Linear Regression.

Chapter goal: It considers the most popular parametric model – the Generalized Linear Model.

  • Regression Analysis.

  • Regression in practice.

    • SciKit-Learn in action.

    • Spark MLlib in action.

    • H2O in action.

  • Conclusion.
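
As a rough, framework-free illustration of the regression analysis this chapter covers, a simple linear regression can be fitted in closed form by ordinary least squares. The sketch below uses only the Python standard library. The function name `fit_simple_ols` and the toy data are assumptions made for illustration and are not taken from the book, which works with SciKit-Learn, Spark MLlib, and H2O.

```python
from statistics import mean


def fit_simple_ols(xs, ys):
    """Return (intercept, slope) minimizing squared error for y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data generated from y = 1 + 2x with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_simple_ols(xs, ys)
```

On noise-free data like this the fit recovers the generating line; on real data the same formulas give the least-squares estimate.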

Chapter 4: Survival Regression Analysis.

Chapter goal: It covers two main survival regression analysis models, the Cox Proportional Hazards model and the Accelerated Failure Time model.

  • Cox Proportional Hazards.

    • Lifelines in action.

  • Accelerated Failure Time (AFT) model.

    • Spark MLlib in Action.

  • Conclusion.

Chapter 5: The Non-Parametric Method - Classification.

Chapter goal: It covers a binary classification model, known as Logistic Regression, using SciKit-Learn, Keras, PySpark MLlib, and H2O.

  • Logistic Regression.

  • Logistic Regression in Practice.

    • SciKit-Learn in action.

    • Spark MLlib in Action.

    • H2O in action.

  • Conclusion.
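
To make the idea of a binary logistic regression classifier concrete, here is a minimal sketch that fits one by batch gradient descent on the log loss, using only the Python standard library. The helper names (`sigmoid`, `fit_logistic`, `predict`), the learning rate, the epoch count, and the toy data are all assumptions for illustration. The book itself uses SciKit-Learn, Spark MLlib, and H2O, whose solvers differ from this one.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(w, b, x, threshold=0.5):
    """Return class 1 when the predicted probability reaches the threshold."""
    return 1 if sigmoid(w * x + b) >= threshold else 0


# Hypothetical toy data: the label is 1 for positive x.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(xs, ys)
preds = [predict(w, b, x) for x in xs]
```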

Chapter 6: Tree-based Modelling and Gradient Boosting.

Chapter goal: It covers two main ensemble methods, the decision tree model and the gradient boost model.

  • Decision Tree.

    • SciKit-Learn in action.

  • Gradient Boosting.

    • XGBoost in action.

    • Spark MLlib in Action.

    • H2O in action.

  • Conclusion.
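
The core idea behind gradient boosting can be sketched without any framework: each round fits a small tree to the current residuals and adds a shrunken copy of it to the ensemble. The example below is an illustrative assumption, not the book's code and not how XGBoost, Spark MLlib, or H2O implement it. It uses one-split regression stumps on a single feature, squared-error loss, and hypothetical names (`fit_stump`, `gradient_boost`) and toy data.

```python
def fit_stump(xs, residuals):
    """Best one-split regression stump on 1-D inputs (needs >= 2 distinct x)."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boost stumps on squared loss; returns a prediction function."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data: a noisy step between x = 4 and x = 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
baseline = sum(ys) / len(ys)
baseline_mse = mean_squared_error(lambda x: baseline, xs, ys)
boosted_mse = mean_squared_error(gradient_boost(xs, ys), xs, ys)
```

Comparing `boosted_mse` with `baseline_mse` shows how much the ensemble improves on always predicting the mean. This is training error only, so it says nothing about generalization.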

Chapter 7: Artificial Neural Networks.

Chapter goal: It covers deep learning and its application in the real world. It shows ways of designing, building, and testing an MLP classifier using the SciKit-Learn framework and an artificial neural network using the Keras framework.

  • Deep Learning.

    • Restricted Boltzmann Machine.

  • Multi-Layer Perceptron Neural Network.

    • SciKit-Learn in action.

    • Deep Belief Networks.

    • Keras in action.

    • H2O in action.

  • Conclusion.

Chapter 8: Cluster Analysis using K-Means.

Chapter goal: It covers a technique for finding k, and for modelling and evaluating a cluster model known as K-Means, using frameworks such as SciKit-Learn, PySpark MLlib, and H2O.

  • K-Means.

  • K-Means in practice.

    • SciKit-Learn in action.

    • Spark MLlib in Action.

    • H2O in action.

  • Conclusion.
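
One common way to choose k is the elbow method: run K-Means for several values of k and look for the point where the within-cluster sum of squares (inertia) stops dropping sharply. Whether the book uses exactly this technique is not stated here, so treat the sketch below as an assumption. It implements Lloyd's algorithm in plain Python with a deterministic farthest-point initialization, chosen only for reproducibility. All names and the toy data are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centroids(points, k):
    """Deterministic farthest-point initialization (requires k <= len(points))."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels, inertia)."""
    centroids = init_centroids(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


# Hypothetical toy data: three well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
          (20.0, 0.0), (20.0, 1.0), (21.0, 0.0)]
inertias = {k: kmeans(points, k)[2] for k in range(1, 6)}
```

On this data the inertia falls steeply up to k = 3 and only marginally afterwards, which is the "elbow" that suggests three clusters.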

Chapter 9: Dimension Reduction – Principal Components Analysis.

Chapter goal: It covers a technique for reducing data to a few components using Principal Components Analysis.

  • Principal Components Analysis.

    • SciKit-Learn in action.

    • Spark MLlib in Action.

    • H2O in Action.

  • Conclusion.
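
Principal Components Analysis finds the directions of greatest variance in centered data, which are the leading eigenvectors of its covariance matrix. As a small sketch under stated assumptions, the code below extracts only the first component using power iteration in plain Python. Library implementations typically use a singular value decomposition instead. The function names and the toy data are hypothetical and not taken from the book.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of the columns; also returns centered rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered


def first_principal_component(cov, iters=200):
    """Dominant eigenvector and eigenvalue of `cov` via power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:  # zero covariance: no direction of variance exists
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance explained along v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


# Hypothetical toy data: points lying exactly on the line y = x / 2.
rows = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0], [8.0, 4.0]]
cov, centered = covariance_matrix(rows)
component, variance = first_principal_component(cov)
# Project each centered row onto the component to get 1-D scores.
scores = [sum(c[j] * component[j] for j in range(2)) for c in centered]
```

Because the toy points lie on a single line, one component captures all of the variance and the two-dimensional data reduces to the one-dimensional `scores` without loss.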

Chapter 10: Automated Machine Learning.

Chapter goal: It acquaints the reader with the H2O AutoML model.

  • Automated Machine Learning.

    • H2O in Action.

  • Conclusion.





Author

Tshepo Chris Nokeri harnesses advanced analytics and artificial intelligence to foster innovation and optimize business performance. In his professional work, he has delivered complex solutions to companies in the mining, petroleum, and manufacturing industries. He initially completed a bachelor's degree in information management. Afterward, he graduated with an Honours degree in business science at the University of the Witwatersrand on a TATA Prestigious Scholarship and a Wits Postgraduate Merit Award. He was unanimously awarded the Oxford University Press Prize.










Additional Information

ISBN: 9781484277614

Condition: New
Dimensions: 254 x 178 mm, 274 g
Format: Paperback
Illustration Notes: XVI, 119 p. 35 illus.
Arabic-numbered pages: 119
Roman-numbered pages: xvi

