L1 Regularization
A technique that adds the sum of the absolute values of the model coefficients as a penalty term to the loss function, inducing sparsity.
Types of L1 Regularization
- Lasso Regression - Uses L1 regularization for feature selection.
- Sparse Coding - Uses the L1 norm to learn sparse representations of data.
Example
Used in linear regression models to reduce overfitting by selecting relevant features.
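A minimal sketch of L1 regularization using scikit-learn's Lasso; the synthetic data and the alpha value are illustrative assumptions, not from the source.
```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))  # 10 features, only 2 actually informative
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)  # most coefficients are driven to exactly zero
```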
L2 Regularization
A regularization technique that penalizes the squared magnitude of model coefficients to prevent overfitting.
Types of L2 Regularization
- Ridge Regression - Applies L2 penalty to linear regression models.
- Weight Decay - Used in deep learning to control model complexity.
Example
Used in logistic regression to prevent large coefficient values that could lead to overfitting.
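A minimal sketch of L2 regularization using scikit-learn's Ridge; the data and the alpha value are illustrative assumptions.
```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=100)

model = Ridge(alpha=1.0).fit(X, y)
print(model.coef_)  # coefficients are shrunk toward zero, not zeroed out
```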
Lagrange Multiplier
A mathematical optimization technique used to handle constraints in machine learning models.
Types of Lagrange Multipliers
- Equality Constraints - Handle conditions of the form g(x) = 0.
- Inequality Constraints - Handle bounds of the form g(x) ≤ 0, typically via the KKT conditions.
Example
Used in support vector machines (SVMs) to maximize the margin while satisfying constraints.
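As a small worked example (standard calculus, not from the source): maximize f(x, y) = xy subject to x + y = 1.
```latex
\mathcal{L}(x, y, \lambda) = xy - \lambda (x + y - 1), \qquad
\frac{\partial \mathcal{L}}{\partial x} = y - \lambda = 0, \quad
\frac{\partial \mathcal{L}}{\partial y} = x - \lambda = 0, \quad
\frac{\partial \mathcal{L}}{\partial \lambda} = -(x + y - 1) = 0
\;\Rightarrow\; x = y = \tfrac{1}{2}, \quad f\!\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = \tfrac{1}{4}
```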
Latent Dirichlet Allocation (LDA)
A generative probabilistic model used for topic modeling in text data.
Types of LDA
- Collapsed Gibbs Sampling - A method for estimating topics in LDA.
- Variational Bayes - An alternative approach to inference in LDA.
Example
Used in news categorization and document classification.
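A minimal topic-modeling sketch with scikit-learn's LatentDirichletAllocation; the toy corpus and n_components=2 are illustrative assumptions.
```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the stock market fell sharply today",
    "investors worry about interest rates",
    "the team won the championship game",
    "the coach praised the players after the match",
]
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
print(lda.transform(counts))  # per-document topic proportions
```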
Latent Semantic Analysis (LSA)
A technique that uses singular value decomposition (SVD) to find hidden structures in text data.
Types of LSA
- Probabilistic LSA - Uses probability distributions instead of matrix decomposition.
- Standard LSA - Uses SVD to reduce dimensionality.
Example
Used in information retrieval systems to improve search accuracy.
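A minimal LSA sketch: TF-IDF vectors followed by truncated SVD; the toy corpus and n_components=2 are illustrative assumptions.
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "cats chase mice",
    "dogs chase cats",
    "stocks rose on earnings news",
    "markets fell after the earnings report",
]
tfidf = TfidfVectorizer().fit_transform(docs)
embedding = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
print(embedding)  # documents embedded in a 2-dimensional latent space
```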
Lazy Learning
A type of machine learning where the model defers processing until a query is made.
Types of Lazy Learning
- K-Nearest Neighbors (KNN) - Classifies new data points based on stored examples.
- Case-Based Reasoning - Solves problems by reusing previous cases.
Example
Used in recommendation systems where predictions are made in real-time.
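A minimal lazy-learning sketch with k-nearest neighbors; the toy dataset is an illustrative assumption.
```python
from sklearn.neighbors import KNeighborsClassifier

X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = [0, 0, 0, 1, 1, 1]

# "Training" only stores the data; the real work happens at query time.
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print(knn.predict([[0.5, 0.5], [5.5, 5.5]]))  # -> [0 1]
```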
Layer Normalization
A normalization technique that normalizes activations across the features of a single training example within a neural network layer.
Types of Normalization
- Batch Normalization - Normalizes across mini-batches.
- Instance Normalization - Normalizes across individual samples.
Example
Used in transformers like BERT for stabilizing training.
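A minimal layer-normalization sketch in NumPy; the eps value and the default gain/bias (gamma, beta) are assumptions standing in for learned parameters.
```python
import numpy as np

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each example over its feature dimension (last axis).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
print(layer_norm(x))  # each row now has ~zero mean and unit variance
```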
Learned Embeddings
A representation of categorical variables as continuous-valued vectors in machine learning.
Types of Learned Embeddings
- Word Embeddings - Represent words as dense vectors (e.g., Word2Vec).
- Graph Embeddings - Represent nodes in a graph as vectors.
Example
Used in natural language processing (NLP) for semantic understanding.
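A minimal sketch of an embedding lookup table; the vocabulary, dimension, and random initialization are illustrative assumptions (real embeddings are learned during training).
```python
import numpy as np

vocab = {"cat": 0, "dog": 1, "car": 2}
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 4))  # 4-dimensional vectors

def embed(word):
    # Map a categorical token to its continuous-valued vector.
    return embeddings[vocab[word]]

print(embed("cat"))  # dense vector standing in for the token "cat"
```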
Leaky ReLU
A variation of the ReLU activation function that allows small negative values.
Types of Leaky ReLU
- Parametric Leaky ReLU - Allows learning of the negative slope.
- Standard Leaky ReLU - Uses a fixed small negative slope.
Example
Used in deep learning to mitigate the dying ReLU problem.
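A minimal Leaky ReLU sketch; the slope 0.01 is the common default but an assumption here.
```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Pass positives through; scale negatives by a small slope
    # so gradients never become exactly zero.
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [-0.02 -0.005 0. 1.5]
```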
Log Loss
A loss function that measures the performance of a classification model where output is a probability.
Types of Log Loss
- Binary Log Loss - Used for binary classification.
- Multi-class Log Loss - Used when multiple classes exist.
Example
Used in logistic regression for evaluating predictive accuracy.
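A minimal binary log-loss computation; the labels and predicted probabilities are illustrative assumptions.
```python
import numpy as np

def binary_log_loss(y_true, p_pred, eps=1e-15):
    p = np.clip(p_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.1, 0.8, 0.4])
print(binary_log_loss(y, p))  # confident wrong predictions are penalized hardest
```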
Logistic Regression
A statistical model that predicts binary outcomes using the logistic function.
Types of Logistic Regression
- Binary Logistic Regression - Predicts two possible outcomes.
- Multinomial Logistic Regression - Predicts multiple categorical outcomes.
Example
Used in spam detection to classify emails as spam or not spam.
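A minimal logistic-regression sketch; the two-feature toy data (crude "spam signals") is an illustrative assumption.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0, 1], [1, 0], [1, 1], [0, 0], [2, 2], [2, 1]])
y = np.array([0, 0, 1, 0, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[1, 2]]))  # probability of each class
```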
Long Short-Term Memory (LSTM)
A type of recurrent neural network (RNN) that addresses the vanishing gradient problem in sequential data.
Types of LSTM
- Standard LSTM - Uses input, forget, and output gates.
- Peephole LSTM - Connects gates to memory cells for additional control.
- Bidirectional LSTM - Processes sequences in both directions to use past and future context.
Example
Used in speech recognition and machine translation.
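A minimal LSTM sketch in PyTorch; the layer sizes and random input are illustrative assumptions.
```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 10, 8)    # batch of 4 sequences, 10 steps, 8 features
output, (h_n, c_n) = lstm(x)  # hidden states plus final hidden/cell state
print(output.shape)           # torch.Size([4, 10, 16])
```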
Loss Function
A mathematical function that quantifies the difference between predicted and actual values in a model.
Types of Loss Functions
- Mean Squared Error (MSE) - Used for regression tasks.
- Cross-Entropy Loss - Used for classification tasks.
Example
Used in deep learning models to guide optimization.
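Minimal sketches of the two loss functions named above; the toy values are illustrative assumptions.
```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error for regression.
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true_onehot, p_pred, eps=1e-15):
    # Average cross-entropy over one-hot targets for classification.
    p = np.clip(p_pred, eps, 1.0)
    return -np.sum(y_true_onehot * np.log(p)) / len(y_true_onehot)

print(mse(np.array([1.0, 2.0]), np.array([1.5, 1.5])))             # 0.25
print(cross_entropy(np.array([[0, 1]]), np.array([[0.2, 0.8]])))   # ~0.223
```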
Low-Rank Approximation
A technique that approximates a large matrix with one of lower rank, preserving its dominant structure while reducing storage and computation.
Types of Low-Rank Approximation
- Singular Value Decomposition (SVD) - Factorizes matrices into singular vectors.
- Non-Negative Matrix Factorization (NMF) - Ensures all elements remain non-negative.
Example
Used in recommendation systems to reduce computational complexity.
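A minimal rank-k approximation via truncated SVD in NumPy; the random matrix and k=2 are illustrative assumptions.
```python
import numpy as np

A = np.random.default_rng(0).normal(size=(6, 5))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # best rank-2 approximation
print(np.linalg.norm(A - A_k))               # Frobenius error of the truncation
```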
Learning Rate
A hyperparameter that controls how much model weights update during training.
Types of Learning Rate Strategies
- Fixed Learning Rate - Uses a constant value.
- Adaptive Learning Rate - Changes dynamically during training (e.g., Adam, RMSprop).
Example
Used in gradient descent optimization to balance convergence speed and stability.
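A minimal gradient-descent sketch showing the learning rate as step size; the quadratic objective f(w) = (w - 3)^2 is an illustrative assumption.
```python
lr = 0.1   # learning rate
w = 0.0
for _ in range(50):
    grad = 2 * (w - 3)  # derivative of (w - 3)^2
    w -= lr * grad      # each step is scaled by the learning rate
print(w)                # converges toward the minimum at w = 3
```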
Learning Rate Decay
A technique that reduces the learning rate over time to improve convergence.
Types of Learning Rate Decay
- Step Decay - Reduces the rate at fixed intervals.
- Exponential Decay - Reduces the rate continuously over time.
Example
Used in neural networks to stabilize training over long epochs.
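Minimal step-decay and exponential-decay schedules; the constants (base rate, decay factors, interval) are illustrative assumptions.
```python
import math

def step_decay(epoch, base_lr=0.1, drop=0.5, every=10):
    # Halve the learning rate every 10 epochs.
    return base_lr * (drop ** (epoch // every))

def exponential_decay(epoch, base_lr=0.1, k=0.05):
    # Shrink the learning rate continuously.
    return base_lr * math.exp(-k * epoch)

for epoch in (0, 10, 20):
    print(epoch, step_decay(epoch), exponential_decay(epoch))
```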
Label Propagation
A semi-supervised learning algorithm that spreads label information through a graph.
Types of Label Propagation
- Hard Label Propagation - Assigns discrete labels to unlabeled data.
- Soft Label Propagation - Assigns probabilistic labels.
Example
Used in social network analysis for community detection.
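A minimal semi-supervised sketch with scikit-learn's LabelPropagation; unlabeled points are marked -1 per the library's convention, and the toy data is an illustrative assumption.
```python
from sklearn.semi_supervised import LabelPropagation

X = [[0, 0], [0.2, 0.1], [5, 5], [5.1, 4.9], [0.1, 0.2], [4.9, 5.1]]
y = [0, -1, 1, -1, -1, -1]  # only two points are labeled

model = LabelPropagation().fit(X, y)
print(model.transduction_)   # labels inferred for every point
```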
Label Smoothing
A regularization technique that prevents overconfidence by adjusting target labels.
Types of Label Smoothing
- Uniform Label Smoothing - Distributes probability evenly among classes.
- Adaptive Label Smoothing - Adjusts based on model predictions.
Example
Used in neural networks to improve generalization in classification tasks.
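A minimal uniform label-smoothing sketch; epsilon = 0.1 is a common choice but an assumption here.
```python
import numpy as np

def smooth_labels(onehot, eps=0.1):
    n_classes = onehot.shape[-1]
    # Move eps of the probability mass uniformly onto all classes.
    return onehot * (1 - eps) + eps / n_classes

y = np.array([[0.0, 1.0, 0.0]])
print(smooth_labels(y))  # [[0.0333 0.9333 0.0333]]
```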
Latency in Machine Learning
The time taken for a model to process an input and return an output.
Types of Latency
- Inference Latency - Time taken for a trained model to make a prediction.
- Training Latency - Time taken to train a model on a dataset.
Example
Reduced latency is critical for real-time applications like autonomous driving.
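A minimal sketch of measuring inference latency; model_predict is a hypothetical stand-in for any trained model's predict call.
```python
import time

def model_predict(x):  # placeholder for a real model's inference
    return sum(x) / len(x)

start = time.perf_counter()
_ = model_predict([1.0, 2.0, 3.0])
latency_ms = (time.perf_counter() - start) * 1000
print(f"inference latency: {latency_ms:.3f} ms")
```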
Linear Discriminant Analysis (LDA)
A classification technique that projects data into a lower-dimensional space to maximize class separability.
Types of LDA
- Fisher’s LDA - Uses a single projection to separate two classes.
- Multiple Discriminant Analysis - Extends LDA to multiple classes.
Example
Used in facial recognition for feature extraction.
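A minimal LDA sketch with scikit-learn; the toy two-class data is an illustrative assumption (with two classes, at most one discriminant axis exists).
```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = [[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 8]]
y = [0, 0, 0, 1, 1, 1]

lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)
print(lda.transform(X))  # data projected onto the discriminant axis
```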
Linear Kernel
A kernel function used in support vector machines (SVMs) that computes the dot product between feature vectors.
Types of Kernels
- Linear Kernel - Suitable for linearly separable data.
- Polynomial Kernel - Captures more complex relationships.
Example
Used in SVMs for text classification tasks.
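A minimal linear-kernel sketch: the kernel value is simply the dot product of two feature vectors; the vectors are illustrative.
```python
import numpy as np

def linear_kernel(x1, x2):
    return np.dot(x1, x2)

a = np.array([1.0, 2.0, 3.0])
b = np.array([0.5, 1.0, -1.0])
print(linear_kernel(a, b))  # 1*0.5 + 2*1 + 3*(-1) = -0.5
```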
Linear Regression
A supervised learning algorithm that models the relationship between a dependent variable and one or more independent variables.
Types of Linear Regression
- Simple Linear Regression - Models a relationship between two variables.
- Multiple Linear Regression - Uses multiple independent variables.
Example
Used in predicting house prices based on features like area and number of rooms.
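A minimal house-price sketch with scikit-learn's LinearRegression; the tiny dataset (area, rooms, price) is an illustrative assumption.
```python
from sklearn.linear_model import LinearRegression

X = [[50, 2], [80, 3], [120, 4], [65, 2], [100, 3]]  # area, rooms
y = [150_000, 240_000, 360_000, 190_000, 300_000]    # price

model = LinearRegression().fit(X, y)
print(model.predict([[90, 3]]))  # estimated price for a new listing
```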
Lipschitz Continuity
A mathematical property that bounds how much a function's output can change relative to a change in its input.
Types of Lipschitz Conditions
- Lipschitz Continuous - The function changes at a controlled rate.
- Locally Lipschitz - The function is Lipschitz within a limited domain.
Example
Used in neural network stability analysis.
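Formally (a standard definition, not from the source), a function f is Lipschitz continuous with constant L if:
```latex
\lVert f(x) - f(y) \rVert \le L \, \lVert x - y \rVert \quad \text{for all } x, y
```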
Local Minima
Points in an optimization landscape where the function value is lower than at all nearby points but not necessarily the lowest overall, so a model can converge to a suboptimal solution.
Types of Minima
- Local Minimum - The function value is lower than at all neighboring points.
- Global Minimum - The function attains its absolute lowest value.
Example
Gradient descent may get stuck in local minima during deep learning training.
Local Outlier Factor (LOF)
An unsupervised learning algorithm that identifies outliers by measuring local density deviations.
Types of Outlier Detection
- Global Outliers - Deviate from the entire dataset.
- Local Outliers - Deviate within a specific region.
Example
Used in fraud detection to spot anomalous transactions.
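A minimal outlier-detection sketch with scikit-learn's LocalOutlierFactor; the toy points (one obvious outlier) are an illustrative assumption.
```python
from sklearn.neighbors import LocalOutlierFactor

X = [[0, 0], [0.1, 0.1], [-0.1, 0.0], [0.0, -0.1], [8, 8]]
lof = LocalOutlierFactor(n_neighbors=2)
print(lof.fit_predict(X))  # -1 marks outliers: [ 1  1  1  1 -1]
```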
Low-Bias Models
Machine learning models that closely fit the training data and have minimal bias errors.
Types of Bias
- High-Bias Models - Underfit the data.
- Low-Bias Models - Have better training accuracy but may overfit.
Example
Deep neural networks often exhibit low bias but high variance.
Low-Variance Models
Models whose predictions remain stable across different training sets, making them resistant to overfitting.
Types of Variance
- High-Variance Models - Overfit training data.
- Low-Variance Models - Generalize well but may underfit.
Example
Linear regression models are often low-variance models.
Latent Variable Models
Statistical models that include hidden variables to explain observed data.
Types of Latent Variable Models
- Factor Analysis - Explains variance among variables.
- Hidden Markov Models - Used for sequential data.
Example
Used in natural language processing to model hidden topics in documents.
Learning Curve
A graph that shows how a model's performance changes as training progresses or as more training data is added.
Types of Learning Curves
- Training Curve - Measures model accuracy over time.
- Validation Curve - Measures generalization ability.
Example
Used in deep learning to track model convergence.
Least Squares Method
A mathematical technique to minimize the sum of squared differences between predicted and actual values.
Types of Least Squares
- Ordinary Least Squares (OLS) - Used in linear regression.
- Weighted Least Squares (WLS) - Gives different weights to errors.
Example
Used in linear regression for fitting best-fit lines.
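A minimal ordinary-least-squares fit via the normal equations, beta = (X^T X)^{-1} X^T y; the toy data is an illustrative assumption.
```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

X = np.column_stack([np.ones_like(x), x])  # add an intercept column
beta = np.linalg.solve(X.T @ X, X.T @ y)   # [intercept, slope]
print(beta)                                # close to [0, 2]
```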
Lempel-Ziv-Welch (LZW) Algorithm
A lossless data compression algorithm.
Types of Compression Algorithms
- Lossless Compression - Retains original data (e.g., LZW, Huffman coding).
- Lossy Compression - Discards some data (e.g., JPEG, MP3).
Example
Used in GIF image compression.
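A minimal LZW compression sketch over byte strings; real implementations add details such as fixed code widths and dictionary resets.
```python
def lzw_compress(data: bytes) -> list[int]:
    # Start with all single-byte sequences in the dictionary.
    table = {bytes([i]): i for i in range(256)}
    current, out = b"", []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the match
        else:
            out.append(table[current])     # emit code for the longest match
            table[candidate] = len(table)  # learn the new sequence
            current = bytes([byte])
    if current:
        out.append(table[current])
    return out

print(lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT"))
```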
Lexical Analysis
The process of converting text into tokens for processing.
Types of Lexical Analysis
- Tokenization - Splits text into words or subwords.
- Lexeme Analysis - Matches character sequences against defined token patterns.
Example
Used in NLP for text preprocessing.
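A minimal tokenizer sketch using a regular expression; the pattern (words vs. punctuation) is an illustrative assumption.
```python
import re

def tokenize(text):
    # Capture word characters as one token, punctuation as another.
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("Lexical analysis splits text, doesn't it?"))
# ['Lexical', 'analysis', 'splits', 'text', ',', 'doesn', "'", 't', 'it', '?']
```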
Logit Function
A function that maps probabilities to log-odds, used in logistic regression.
Types of Logit Functions
- Standard Logit - Used in binary classification.
- Multinomial Logit - Used for multi-class classification.
Example
Used in logistic regression for binary classification.
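A minimal logit (log-odds) sketch, shown alongside the sigmoid it inverts; the probabilities are illustrative values.
```python
import numpy as np

def logit(p):
    # Map a probability to log-odds.
    return np.log(p / (1 - p))

def sigmoid(z):
    # Map log-odds back to a probability.
    return 1 / (1 + np.exp(-z))

p = np.array([0.1, 0.5, 0.9])
print(logit(p))           # [-2.197  0.     2.197]
print(sigmoid(logit(p)))  # recovers [0.1 0.5 0.9]
```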
Machine Learning (ML)
ML is a subset of AI that enables machines to learn patterns from data and make predictions or decisions without explicit programming.
Types of ML
- Supervised Learning - Learns from labeled examples.
- Unsupervised Learning - Finds patterns in unlabeled data.
- Reinforcement Learning - Learns through trial and error guided by rewards.
Example
Spam detection in emails using classification models.
Deep Learning (DL)
DL is a subset of ML that uses artificial neural networks to process complex data and perform high-level computations.
Example
Image recognition in self-driving cars.
Generative AI (Gen AI)
Gen AI refers to AI models that generate new content, including text, images, and code, based on patterns learned from their training data.
Example
AI models like ChatGPT and Stable Diffusion that generate text and images.