DR

1) In machine learning, DR stands for dimensionality reduction, a feature engineering technique in which a large number of features in a dataset is reduced to a smaller number of features. It is important to ensure that the remaining features are meaningful and representative of the dataset and that ...

ETL

ETL stands for extract, transform, and load. It refers to a data science and machine learning procedure in which data is collected (extracted) from data sources, transformed into a usable form, and finally loaded into its destination, which in this context is the data fed to an ML model.
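The three ETL steps can be sketched roughly as follows. This is a minimal illustration, not a production pipeline: the in-memory CSV text, its column names, and the function names `extract`, `transform`, and `load` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical raw source: a small CSV held in memory instead of a real file or database.
RAW_CSV = "height_cm,weight_kg\n170,65\n180,\n160,55\n"


def extract(text):
    """Extract: read rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with missing values and convert strings to floats."""
    clean = []
    for row in rows:
        if all(row.values()):
            clean.append({key: float(value) for key, value in row.items()})
    return clean


def load(rows):
    """Load: arrange the cleaned rows as feature and target lists for a model."""
    features = [[row["height_cm"]] for row in rows]
    targets = [row["weight_kg"] for row in rows]
    return features, targets


features, targets = load(transform(extract(RAW_CSV)))
print(features, targets)
```

The row with the missing weight is discarded in the transform step, so only two rows reach the load step.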

fitting

Fitting, also called training, is the process by which a machine learning model learns from input data. Besides the ideal good fit (best fit), a model can be overfitted, meaning it follows the training data so closely that it generalizes poorly to new data, or underfitted, meaning it is too simple to capture the pattern in the data.
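As a small illustration of fitting, the sketch below fits a straight line to a handful of points using ordinary least squares. The data and the function name `fit_line` are made up for the example; real models and training procedures are usually far more involved.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative training data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Here the "learning" is the calculation of the two parameters, slope and intercept, from the input data.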

holdout

In machine learning, holdout validation is a data sampling method in which the dataset is split into two parts: the training dataset and the test dataset. The split does not have to be equal: training on 50% of the dataset and testing on the remaining 50% is possible, but it is more common to train on a larger share (for example 70% or 80%) and test on the remainder. Holdout validation is not recommended in ...
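A holdout split can be sketched in a few lines. This is a minimal example under stated assumptions: the function name `holdout_split`, the `test_fraction` parameter, and the fixed random seed are illustrative choices, and the data is just a list of integers.

```python
import random


def holdout_split(rows, test_fraction, seed=0):
    """Shuffle the rows and split them into (train, test) parts."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(10))
train, test = holdout_split(data, test_fraction=0.5)
print(len(train), len(test))
```

Other ratios work the same way by changing `test_fraction`. Every row ends up in exactly one of the two parts.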

NoSQL

NoSQL databases are also known as non-SQL databases. They are different from relational (SQL) databases and do not feature the SQL model with rows and tables. The following types of NoSQL databases are available. Examples of NoSQL databases are the following:

percentile

In statistics, a percentile describes how a score compares to other scores from the same set. While there is no universal definition of percentile, it is commonly expressed as the percentage of values in a data set that fall below a given value, for example the 25th percentile, the ...
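The "percentage of values below a given value" reading can be illustrated with a short sketch. The scores and the function name `percent_below` are made up for the example, and other percentile definitions would give different results on the same data.

```python
def percent_below(values, score):
    """Return the percentage of values strictly below the given score."""
    below = sum(1 for value in values if value < score)
    return 100.0 * below / len(values)


# Illustrative set of eight test scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90]

# Two of the eight scores (55 and 60) are below 65.
rank = percent_below(scores, 65)
print(rank)
```

Under this definition, a score of 65 sits at the 25th percentile of this set.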

PII

PII stands for personally identifiable information. This refers to information (data) that can be used to identify a specific person and that must be protected to ensure the privacy of the people it describes.

SVM

SVM stands for Support Vector Machine. SVM is a well-known family of supervised learning algorithms, usually described as non-parametric, which are used in classification and regression problems. For classification, they separate data points into classes using a hyperplane. SVM algorithms are often regarded as fairly robust when outliers are present in the model training data, although how well they cope depends on the settings used.
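The role of the hyperplane can be shown with a small sketch. It only illustrates how an already chosen hyperplane assigns a class to a point; it does not train an SVM. The weights, the bias, and the function names are hypothetical values picked for the example.

```python
def decision_value(weights, bias, point):
    """Signed value of w . x + b; its sign tells which side of the hyperplane the point is on."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(weights, bias, point):
    """Assign class +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(weights, bias, point) >= 0 else -1


# Hypothetical hyperplane in two dimensions: the line x1 - x2 = 0.
weights = [1.0, -1.0]
bias = 0.0

label_a = classify(weights, bias, [3.0, 1.0])
label_b = classify(weights, bias, [1.0, 3.0])
print(label_a, label_b)
```

Training an SVM amounts to choosing the weights and bias so that the hyperplane separates the classes with as wide a margin as possible.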

variance

In machine learning (ML), variance is a concept related to errors in the model's predictions that result from the machine learning algorithm being overly sensitive to the training data and fitting it too closely. Due to this over-sensitivity, the ML model becomes complex to explain (explainability) and it captures the complexity inside the training data ...