Last updated on 15 May 2024

Here's how you can master machine learning algorithms as a data engineer.

Powered by AI and the LinkedIn community

As a data engineer, you are already adept at managing and organizing large datasets. But to take your career to the next level, mastering machine learning (ML) algorithms can be a game changer. Understanding these algorithms lets you extract valuable insights and make data-driven predictions, which is a highly sought-after skill. This article walks you through the steps to gain a firm grasp of ML algorithms and strengthen your data engineering toolkit.

1 Learn the basics

Before diving into complex algorithms, make sure you have a solid understanding of machine learning basics. This includes knowing the difference between supervised, unsupervised, and reinforcement learning. Supervised learning uses labeled data to teach models to predict outcomes, while unsupervised learning finds hidden patterns in data without pre-existing labels. Reinforcement learning is about making sequences of decisions: an agent learns to achieve a goal in an uncertain, potentially complex environment.
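To make the supervised/unsupervised distinction concrete, here is a minimal sketch using Scikit-learn. The synthetic data, the choice of logistic regression and k-means, and all parameter values are assumptions made for illustration; reinforcement learning is not covered here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data: two well-separated groups of 2-D points (illustrative only).
X, y = make_blobs(
    n_samples=300, centers=[[-3, -3], [3, 3]], cluster_std=1.0, random_state=0
)

# Supervised: the labels y are used to fit the model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
classifier = LogisticRegression().fit(X_train, y_train)
supervised_accuracy = classifier.score(X_test, y_test)

# Unsupervised: only X is used; the algorithm has to find the groups itself.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = kmeans.labels_

print(f"supervised test accuracy: {supervised_accuracy:.2f}")
print(f"clusters found: {np.unique(cluster_ids).tolist()}")
```

Note that the cluster ids returned by k-means are arbitrary: cluster 0 need not correspond to label 0, because the algorithm never saw the labels.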

Prayson Wilfred Daniel

🐉 Principal Data Scientist | Director of Transformation Lab
Beyond supervised, unsupervised, and reinforcement learning, there is a perhaps lesser-known family of algorithms: imitation learning (IL), which is used heavily in autonomous vehicles and robotics. In IL, a machine or agent mimics human behavior by learning from expert demonstrations rather than by trial and error. This can be done either through behavioural cloning, which maps states to actions directly, or through inverse reinforcement learning, which infers the expert's reward function.
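As a rough illustration of behavioural cloning, the sketch below treats it as ordinary supervised learning on recorded (state, action) pairs. The `expert_policy` function, its gain of -0.8, and the linear form of the cloned policy are all hypothetical choices for this toy example, not how a real autonomous system would be built.

```python
import numpy as np

rng = np.random.default_rng(0)


def expert_policy(states):
    """Hypothetical expert: steers back toward zero with a fixed gain."""
    return -0.8 * states


# 1. Record expert demonstrations as (state, action) pairs, with a little noise.
demo_states = rng.uniform(-1.0, 1.0, size=(200, 1))
demo_actions = expert_policy(demo_states) + rng.normal(0.0, 0.01, size=(200, 1))

# 2. Behavioural cloning: fit a policy that maps states to actions directly.
#    Here the policy is linear and fitted by least squares.
learned_gain, *_ = np.linalg.lstsq(demo_states, demo_actions, rcond=None)


def cloned_policy(states):
    """Policy learned purely from demonstrations, with no trial and error."""
    return states @ learned_gain


print("learned gain:", learned_gain.item())
```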

Pavel Popov

Senior Data Engineer at Playrix | Ex-Lead Data Engineer at Glowbyte Consulting | Master’s degree in Information Technologies - National Research University "MPEI" ‘22 | 2x AWS Certified
- Linear Regression: Predictive modeling technique for establishing relationships between variables.
- Logistic Regression: Used for binary classification problems.
- Decision Trees: Hierarchical tree structures for classification/regression tasks.
- k-Nearest Neighbors (k-NN): Instance-based learning for classification/regression.
- Naive Bayes: Probabilistic classifier often used for text classification.
- Random Forest: Ensemble learning method of decision trees, providing high accuracy and robustness.
- Gradient Boosting Machines (GBM): Boosting ensemble technique for improving predictive performance.
- k-Means Clustering: Unsupervised learning algorithm for partitioning data into clusters based on similarity.
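One way to get a feel for the classifiers named above is to run them side by side on the same data. The sketch below assumes Scikit-learn and a synthetic dataset; the dataset parameters and model settings are illustrative assumptions, and the ranking it prints says nothing about how these models would compare on real data. Linear regression and k-means are left out because they are not classifiers.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification problem (illustrative only).
X, y = make_classification(
    n_samples=400,
    n_features=8,
    n_informative=4,
    n_clusters_per_class=1,
    random_state=0,
)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "naive Bayes": GaussianNB(),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:20s} {score:.3f}")
```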

Praveen C T
Build a strong understanding of core machine learning concepts like supervised vs unsupervised learning, classification vs regression, cost functions, and optimization algorithms. This foundation will help you grasp the nuances of specific algorithms. Focus on mastering some of the most popular and versatile algorithms like linear regression, decision trees, random forests, and support vector machines (SVMs). Brush up on your statistics and probability knowledge. Familiarize yourself with popular machine learning libraries like TensorFlow, PyTorch, or scikit-learn in Python. These libraries offer pre-built implementations of various algorithms, allowing you to focus on understanding the concepts and applying them to your data.

Swapnil Surushe

Data Engineer | ETL Specialist | AWS Certified Solution Architect | 2 x GCP Certified Professional | Building a community with 4k followers on LinkedIn | SQL 5 ⭐ on HackerRank | Python 4 ⭐ on HackerRank.
1. **Master the Basics**: Start with statistics, linear algebra, and calculus.
2. **Learn Programming**: Focus on Python and R.
3. **Explore Libraries**: Get familiar with Scikit-learn, TensorFlow, and PyTorch.
4. **Understand Algorithm Types**: Study supervised, unsupervised, and reinforcement learning.
5. **Data Preprocessing**: Learn about normalization, one-hot encoding, and feature scaling.
6. **Feature Selection and Engineering**: Understand how to improve model performance.
7. **Model Evaluation**: Master techniques like cross-validation and precision-recall curves.
8. **Real-World Projects**: Gain practical experience and collaborate with others.
9. **Stay Updated**: Follow industry trends and participate in communities.
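The preprocessing and evaluation steps above (feature scaling, one-hot encoding, cross-validation) can be combined in a single Scikit-learn pipeline. This is a minimal sketch under stated assumptions: the column names `monthly_spend` and `plan`, the toy churn label, and the choice of logistic regression are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 200

# Hypothetical customer table with one numeric and one categorical feature.
df = pd.DataFrame(
    {
        "monthly_spend": rng.normal(50, 15, n),
        "plan": rng.choice(["basic", "plus", "pro"], n),
    }
)
# Toy label: in this made-up data, higher spenders are more likely to churn.
churned = (df["monthly_spend"] + rng.normal(0, 5, n) > 50).astype(int)

# Scale the numeric column and one-hot encode the categorical one.
preprocess = ColumnTransformer(
    [
        ("scale", StandardScaler(), ["monthly_spend"]),
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["plan"]),
    ]
)
model = Pipeline([("preprocess", preprocess), ("classifier", LogisticRegression())])

# 5-fold cross-validation: preprocessing is re-fitted inside every fold,
# so no information from the held-out fold leaks into training.
scores = cross_val_score(model, df, churned, cv=5)
print("fold accuracies:", scores.round(2))
```

Keeping the preprocessing inside the pipeline, rather than transforming the whole table up front, is what keeps the cross-validation estimate honest.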

ASTIKAR VIVEK KUMAR

Linkedin Top Data Engineering Voice | @Google @Microsoft Certified | Magma M Scholar | @Data Maverick | Building the Future with AI
While data engineers focus on building and maintaining data pipelines, mastering machine learning algorithms gives them a toolbox to extract insights from that data. Example:
- Imagine you have a system tracking website clicks to recommend products.
- By understanding machine learning algorithms, you can analyze click data to suggest items users might like, boosting sales!
- This way, you go beyond data pipelines and unlock the hidden value within the data.
#Happy_Learning

Mehmet GÜNER 🔅

Generative AI & Large Language Models & AI Policy and Ethics
Before delving into intricate algorithms in machine learning, it's essential to establish a firm grasp of the fundamentals. This entails understanding the distinctions between supervised, unsupervised, and reinforcement learning. Supervised learning relies on labeled data to train models in predicting outcomes accurately. In contrast, unsupervised learning identifies underlying patterns within data without predefined labels. Reinforcement learning, on the other hand, revolves around making sequential decisions to accomplish goals in uncertain and possibly intricate environments. Mastery of these foundational concepts lays a solid groundwork for navigating more advanced machine learning techniques effectively.

Sachin D N 🇮🇳

Data Consultant @ Lumen Technologies | Data Engineer | Big Data Engineer | Azure | Apache Spark | Databricks | Delta Lake | Agile | PySpark | Hadoop | Python | SQL | Hive | Data Lake | Data Warehousing
Mastering machine learning algorithms as a data engineer involves a combination of theoretical understanding and practical application. Start by learning the basics of machine learning, including different types of algorithms such as supervised, unsupervised, and reinforcement learning. Understand the math behind these algorithms to grasp how they work. Use online resources, books, and courses for learning. Then, implement these algorithms on real-world datasets. Platforms like Kaggle provide datasets and competitions that can help you practice. Remember, mastering machine learning is a journey, so be patient and consistent in your learning efforts.

2 Choose tools

Select the tools and programming languages that are prevalent in machine learning. Python is a popular choice because of its readability and extensive libraries such as Scikit-learn, TensorFlow, and PyTorch, which support ML development. Familiarize yourself with these libraries, as they provide pre-built functions and methods that simplify implementing machine learning algorithms. In addition, knowing how to query databases with SQL and manipulate data with Pandas will be beneficial.
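As a small illustration of how SQL and Pandas fit together when preparing data for ML, the sketch below aggregates rows in SQL and then derives one more feature in Pandas. The in-memory SQLite database, the `orders` table, and its columns are assumptions made up for this example.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 20.0), (1, 35.0), (2, 10.0), (3, 50.0), (3, 5.0)],
)

# SQL does the aggregation close to the data.
features = pd.read_sql_query(
    "SELECT customer_id, COUNT(*) AS n_orders, SUM(amount) AS total_spend "
    "FROM orders GROUP BY customer_id ORDER BY customer_id",
    conn,
)
conn.close()

# Pandas derives an extra column before the table is handed to an ML library.
features["avg_order"] = features["total_spend"] / features["n_orders"]
print(features)
```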

Pavel Popov

Senior Data Engineer at Playrix | Ex-Lead Data Engineer at Glowbyte Consulting | Master’s degree in Information Technologies - National Research University "MPEI" ‘22 | 2x AWS Certified
- Python: Versatile language with rich ML libraries like TensorFlow, PyTorch, and scikit-learn.
- TensorFlow: Open-source ML framework developed by Google, offering flexibility and scalability.
- Scikit-learn: Python library providing simple and efficient ML tools for data preprocessing, modeling, and evaluation.
- R: Statistical computing language with comprehensive ML packages for data analysis and modeling.
- Apache Spark: Unified analytics engine supporting MLlib for scalable machine learning on distributed systems.
- SQL: Essential for data manipulation and querying, with ML capabilities in databases like PostgreSQL and Oracle.
- Java: Widely used for building scalable ML applications with frameworks like Weka and Deeplearning4j.

Mehmet GÜNER 🔅

Generative AI & Large Language Models & AI Policy and Ethics
In the machine learning field, selecting the appropriate tools and programming languages is crucial. Python stands out as a preferred language due to its readability and the robust libraries it offers, such as Scikit-learn, TensorFlow, and PyTorch, which streamline ML development. Familiarizing oneself with these libraries is essential as they provide pre-built functions and methods facilitating the implementation of ML algorithms. Additionally, proficiency in SQL for database querying and Pandas for data manipulation enhances one's skill set, enabling comprehensive data handling and analysis in the ML pipeline.

Sasha Korovkina

Financial Data Developer | Towards Data Science writer | Microsoft Founders Member
Familiarise yourself not only with the tool, such as a Python library, but also with the development environment as a whole. Learn about modular setups, virtual environments and administrator permissions, as well as how your files are structured and synced to version control systems. This will make you more confident in the development environment and let you experiment more without the fear of breaking anything.

Agathamudi Leela Vara Prasad

Microsoft Certified Azure Data Engineer(DP-203) | Python | SQL | Big Data |Azure Data Factory | Azure Databricks | Spark-SQL | ADLS | Pyspark | ETL | Hadoop | Hive | PowerBI
First, understand supervised and unsupervised learning to get a solid grounding. Next, concentrate on Python programming and on scikit-learn, which is gaining popularity among developers. Regression and classification are the main algorithm types to practice with it.

3 Practice coding

Hands-on experience is crucial. Start by implementing basic algorithms from scratch in Python to understand their inner workings. For example, write a simple linear regression model using NumPy or a decision tree classifier using Scikit-learn. By coding these algorithms by hand, you'll gain a deeper understanding of the theory behind them and how they can be tuned for better performance on your datasets.
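As one possible starting point, here is a minimal sketch of simple linear regression written from scratch with NumPy and fitted by gradient descent. The synthetic data, the learning rate, and the iteration count are assumptions chosen for this toy problem; the closed-form fit from `np.polyfit` is included only as a sanity check.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y is roughly 3 * x + 2 plus noise (illustrative only).
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=100)

# Model: y_hat = w * x + b, trained by gradient descent on mean squared error.
w, b = 0.0, 0.0
learning_rate = 0.01
for _ in range(5000):
    error = (w * x + b) - y
    w -= learning_rate * 2 * np.mean(error * x)  # d(MSE)/dw
    b -= learning_rate * 2 * np.mean(error)      # d(MSE)/db

# Sanity check against the closed-form least-squares solution.
slope, intercept = np.polyfit(x, y, 1)
print(f"gradient descent: w={w:.3f}, b={b:.3f}")
print(f"closed form:      w={slope:.3f}, b={intercept:.3f}")
```

Changing the learning rate or the number of iterations and watching how the fit changes is a quick way to see how tuning affects convergence.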

Pavel Popov

Senior Data Engineer at Playrix | Ex-Lead Data Engineer at Glowbyte Consulting | Master’s degree in Information Technologies - National Research University "MPEI" ‘22 | 2x AWS Certified
- Implement Basic Algorithms: Code simple models like linear regression with numpy or decision trees with Scikit-learn from scratch in Python.
- Understand Inner Workings: Gain insights into algorithm theory by coding them manually.
- Experiment with Datasets: Apply implemented models to different datasets to observe performance variations.
- Debug and Optimize: Identify and debug errors in code, then optimize algorithms for better performance.
- Learn from Results: Analyze model outputs.
- Document and Review: Document coding processes regularly to reinforce learning.
- Explore Advanced Techniques: Gradually tackle more complex algorithms as proficiency grows.
- Continuous Practice: Dedicate regular time to coding practice to hone skills.

Sasha Korovkina

Financial Data Developer | Towards Data Science writer | Microsoft Founders Member
Best practice is industry practice. When working on real-world projects, always analyse where your models and pipelines can be optimised. Also note down the variable parameters and thresholds: these are your assumptions, which can be improved through optimisation and hill-climbing approaches. When you run out of industry projects, or get bored of them, you can have a go at building on scientific datasets. There are plenty available on Kaggle of varying complexity to experiment with.
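To show what a hill-climbing approach to a threshold might look like, here is a small pure-Python sketch that tunes a classification threshold for accuracy. The scores, labels, starting point, and step size are all made up for illustration; a real project would evaluate on held-out data.

```python
# Hypothetical model scores and true labels for ten examples.
scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def accuracy(threshold):
    """Fraction of examples classified correctly at this threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def hill_climb(start, step=0.1, max_iters=100):
    """Move to the better neighbouring threshold until nothing improves."""
    current, current_acc = start, accuracy(start)
    for _ in range(max_iters):
        best = max([current - step, current + step], key=accuracy)
        if accuracy(best) <= current_acc:
            break  # no neighbour improves: a (possibly local) optimum
        current, current_acc = best, accuracy(best)
    return current, current_acc


best_threshold, best_accuracy = hill_climb(start=0.9)
print(f"threshold={best_threshold:.1f}, accuracy={best_accuracy:.2f}")
```

Hill climbing only guarantees a local optimum, so the result can depend on the starting point; trying several starts is a common, cheap safeguard.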

Agathamudi Leela Vara Prasad

Microsoft Certified Azure Data Engineer(DP-203) | Python | SQL | Big Data |Azure Data Factory | Azure Databricks | Spark-SQL | ADLS | Pyspark | ETL | Hadoop | Hive | PowerBI
For real experience, you should do hands-on projects through platforms such as Kaggle. Use different models and methods to see how they work, and learn how to evaluate them well too.

4 Study algorithms

Next, study machine learning algorithms in depth. Dive into the logic behind algorithms such as decision trees, neural networks, clustering, and regression models. Understand the use cases for each algorithm and how it makes predictions or categorizes data. Knowing when and why to use a particular algorithm is as important as knowing how to implement it. Resources such as online courses, textbooks, and tutorials can be very helpful for this step.
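To illustrate the logic behind one of these algorithms, the sketch below implements the core step of a decision tree in plain Python: choosing the single split that minimises Gini impurity. It is a one-level "decision stump" on made-up data, not a full tree, and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted_impurity) of the best single split."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Toy data (illustrative): weekly hours of use and whether the user renewed.
hours = [1, 2, 3, 4, 6, 7, 8, 9]
renewed = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(hours, renewed)
print(f"split at hours <= {threshold}, weighted Gini impurity = {impurity}")
```

A full decision tree repeats this search recursively on the left and right subsets until a stopping rule, such as a maximum depth, is reached.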

Sasha Korovkina

Financial Data Developer | Towards Data Science writer | Microsoft Founders Member
Understand the concepts (whether logical or mathematical) behind the algorithms you are using. This may not seem significant at first, but when you inevitably want to improve your accuracy metrics, knowing the backbone of your algorithms is key. A good way to build that understanding is to approach algorithms like maths problems: start with the simplest case to understand the mechanics, then increase the complexity to your desired level.

5 Build projects

Nothing beats hands-on experience. Start small by working on projects that interest you and gradually increase the complexity. For example, you could begin by predicting housing prices with regression or identifying customer segments with clustering. These projects will help you apply the algorithms you've learned to real-world scenarios, refine your skills, and build a portfolio that showcases your expertise to potential employers or collaborators.
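A first customer-segmentation project could look something like the sketch below, which clusters synthetic customers with k-means in Scikit-learn. The two features (annual spend and visits per month), the three made-up customer groups, and the choice of three clusters are assumptions for illustration; with real data, the number of segments would itself need to be investigated.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic customers: columns are annual spend and visits per month.
occasional = rng.normal([200, 1], [30, 0.3], size=(50, 2))
regular = rng.normal([800, 4], [60, 0.5], size=(50, 2))
loyal = rng.normal([2000, 10], [150, 1.0], size=(50, 2))
customers = np.vstack([occasional, regular, loyal])

# Scale features so spend (hundreds) does not dominate visits (single digits).
scaled = StandardScaler().fit_transform(customers)
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)

# Describe each segment by its size and average (unscaled) feature values.
for segment in range(3):
    members = customers[segments == segment]
    print(segment, len(members), members.mean(axis=0).round(1))
```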

Pavel Popov

Senior Data Engineer at Playrix | Ex-Lead Data Engineer at Glowbyte Consulting | Master’s degree in Information Technologies - National Research University "MPEI" ‘22 | 2x AWS Certified
Here are some project ideas for any data engineer learning machine learning:
- Predictive Modeling: Build models for sales forecasting, customer churn prediction, or stock price prediction.
- Recommendation Systems: Design personalized recommendation engines for products, movies, or music.
- Time Series Analysis: Analyze temporal data for trend forecasting, anomaly detection, or demand forecasting.
- E-commerce Optimization: Optimize product recommendations, pricing strategies, or marketing campaigns to improve sales and customer satisfaction.
- Sentiment Analysis: Analyze social media data to understand public opinion or sentiment trends.

Penninah Gathu

Data Engineer | BI Developer | Data Analyst | SQL | Python | Cloud Technologies
As with any new skill, the best way to get good at ML algorithms is by applying the knowledge you have learnt to real-world problems. Building projects will help you gain practical experience and bridge the gap between being a beginner at ML algorithms and being a pro.

6 Keep learning

Machine learning is a constantly evolving field, so continuous learning is key. Stay up to date with the latest trends and advances by reading research papers, attending workshops, and participating in online forums. Engage with the community to learn from both peers and experts. The more you immerse yourself in the world of machine learning, the more proficient you'll become at applying these algorithms as a data engineer.

Ivan de Castro

Founder @ DataFlex: Data Integration, Analytics & AI | Ex-Adidas Global Analytics Leader | Full Stack Engineer
Great ways to stay ahead in machine learning:
- DeepLearning.AI released an amazing online course (Machine Learning Specialization by Andrew Ng) that also provides a lot of practical tips.
- DeepLearning.AI publishes a weekly newsletter ("The Batch").
- Substack, Medium, and following some influencers in the space can be another great way to keep up to date.

Pavel Popov

Senior Data Engineer at Playrix | Ex-Lead Data Engineer at Glowbyte Consulting | Master’s degree in Information Technologies - National Research University "MPEI" ‘22 | 2x AWS Certified
- Online Courses: Enroll in ML courses for structured learning.
- Research Papers: Stay updated by reading the latest research.
- Hands-on Projects: Apply concepts in real-world projects.
- Coding Practice: Regularly code ML algorithms.
- Peer Collaboration: Learn from peers and share insights.
- Workshops/Webinars: Attend to explore new topics.
- ML Communities: Join for networking and knowledge sharing.
- Follow Experts: Stay updated with thought leaders.
- Teaching: Share knowledge to reinforce learning.
- Stay Curious: Explore new topics and experiment.

7 Here's what else to consider

This is a space to share examples, stories, or insights that don't fit into any of the previous sections. What else would you like to add?

Ryan Garaygay

Vice President of Engineering | Cloud Data Products and Analytics
As a data engineer, you do not even need to know machine learning algorithms, much less master them. It helps to know the ML foundations, and it is useful to know that others will use the data that just went through the pipeline you engineered. Some do both, and few do both well, but these are two different specialized roles that are hard enough by themselves, and it is unreasonable to expect someone in one role to master what is expected of the other. The debate between generalists and specialists is complex, and this question or advice could create misconceptions about what is expected of a data engineer. Before we know it, LinkedIn advice will start with questions like "How can one become more effective in craniotomy as a data engineer?". Can we downvote questions?
