Algorithm

An algorithm is a series of instructions, written either in pseudo-code or in a computer programming language, which aims to solve a problem or perform some sort of computation. Algorithms are therefore strongly connected to various fields of mathematics and physics, such as probability and statistics, calculus and linear algebra. Algorithms apply ...

Apache Kafka

Apache Kafka is an open-source distributed event streaming platform for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. Microsoft Azure has a Kafka service implementation in its HDInsight cloud service. In Azure you can also use Azure Event Hubs to stream data from Apache Kafka applications without setting up a Kafka cluster on ...

augmentation

In data engineering, augmentation is the process of creating transformed versions of the data available in the ML training dataset. An example of an augmentation task is perturbing an image in different ways. Augmentation aims at increasing the amount of data input provided to a model to better fit the data into the model ...

BaaS

BaaS stands for Backup As A Service and refers to any managed service which allows for remote backup and recovery of any type of on-premises or cloud workload, including but not limited to:

DR

1) In machine learning, DR stands for dimensionality reduction, a feature engineering technique in which a large number of features in a dataset is reduced to a smaller number of features. It is important to ensure that the remaining features are meaningful and representative for the dataset and that ...

DRaaS

DRaaS (Disaster Recovery As A Service) is a managed service in which disaster recovery is offered as a cloud service. Disaster recovery includes business continuity (BCDR) and backup as a service (BaaS). The DRaaS management server can be either an on-premises or a cloud server. There are various DRaaS providers, most of which are already providing ...

GDPR

GDPR stands for General Data Protection Regulation. It is a privacy-related set of regulations in the European Union (EU) which control how personally identifiable data (PID) is stored, processed and deleted from computing systems. In GDPR the two basic roles in a system which are related to the processing of data are the GDPR controller ...

incident database

In machine learning systems, an incident database is a database which collects and stores information about incidents where AI systems have caused or contributed to negative outcomes or harms, such as accidents, errors, biases, discrimination, or violations.

NoSQL

NoSQL databases are also known as non-SQL databases. They differ from relational (SQL) databases and do not use the relational model of rows and tables. The following types of NoSQL databases are available. Examples of NoSQL databases are the following:

perturbation

Perturbation refers to a set of methods for distorting the pixels in an image without compromising the overall information contained within it. These methods include changing the image's resolution, converting it from color to greyscale, reverting the image dimensions, etc.
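
As a rough illustration of two of the methods named above, the following sketch converts a tiny RGB image to greyscale and then lowers its resolution. The image representation (rows of (r, g, b) tuples), the function names, and the luminance weights are assumptions made for this example, not part of the definition.

```python
def to_greyscale(image):
    """Convert rows of (r, g, b) tuples into rows of single intensity values."""
    # Weighted sum of the channels; these weights are one common choice.
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


def downsample(image, step=2):
    """Lower the resolution by keeping every `step`-th row and column."""
    return [row[::step] for row in image[::step]]


# A hypothetical 2x4 RGB image used only for demonstration.
image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 0), (0, 255, 255)],
]

grey = to_greyscale(image)   # same dimensions, one value per pixel
small = downsample(grey)     # half the rows and half the columns
print(grey)
print(small)
```

Both outputs still describe the same picture, which is the point of a perturbation: the pixels change while the overall content remains recognizable.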

PII

PII stands for personally identifiable information. This refers to information (data) which must be protected to ensure the privacy of the people described by that information.

sigmoid kernel

The sigmoid kernel is a kernel function, used with the kernel trick, which applies a hyperbolic tangent function (tanh) to the dot product of two samples to create an equivalent of a perceptron neural network.
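
A minimal sketch of the kernel computation, assuming the commonly used form tanh(gamma * <x, y> + coef0). The parameter names gamma and coef0 and their default values are assumptions chosen for this example.

```python
import math


def sigmoid_kernel(x, y, gamma=0.5, coef0=0.0):
    """Return tanh(gamma * <x, y> + coef0) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


# Similarity between two hypothetical feature vectors.
value = sigmoid_kernel([1.0, 2.0], [3.0, 4.0])
print(value)
```

Because tanh is bounded, the result always lies between -1 and 1, and swapping the two vectors gives the same value.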

silhouette analysis

Silhouette analysis is a method of calculating how well a particular data example fits within a cluster as compared to its neighboring clusters.
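
One common way to express this fit is the silhouette coefficient s = (b - a) / max(a, b), where a is the mean distance from the example to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. The sketch below assumes Euclidean distance, at least two clusters, and a score of 0 for a cluster with a single member; the function name and sample data are made up for illustration.

```python
import math


def silhouette(index, points, labels):
    """Silhouette coefficient of points[index] given cluster labels."""
    own = labels[index]
    distances = {}  # cluster label -> distances from the chosen point
    for j, (point, label) in enumerate(zip(points, labels)):
        if j != index:
            distances.setdefault(label, []).append(math.dist(points[index], point))
    if own not in distances:
        return 0.0  # single-member cluster (assumed convention)
    a = sum(distances[own]) / len(distances[own])
    b = min(sum(d) / len(d) for label, d in distances.items() if label != own)
    if max(a, b) == 0:
        return 0.0  # all compared points coincide
    return (b - a) / max(a, b)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
well_clustered = silhouette(0, points, [0, 0, 1, 1])
badly_clustered = silhouette(0, points, [1, 0, 1, 1])
print(well_clustered, badly_clustered)
```

Values near 1 indicate an example that sits comfortably inside its cluster, while negative values suggest it is closer to a neighboring cluster.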

skillful

The term skillful is used to describe an AI model which is useful for its intended task. There are degrees of skill; some models are more useful than others.

spectrogram

In audio data processing and analysis, a spectrogram is a type of plot that depicts the time, frequency, and amplitude of an audio signal.
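
The data behind such a plot can be sketched as a short-time Fourier transform: the signal is cut into frames (time), and each frame is converted into magnitudes (amplitude) per frequency bin. The example below is a deliberately simple version that uses a plain discrete Fourier transform with no window function; the function name, frame size, and test signal are assumptions made for illustration.

```python
import cmath
import math


def spectrogram(signal, frame_size, hop):
    """Return magnitudes indexed as result[frame][frequency_bin]."""
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        magnitudes = []
        for k in range(frame_size // 2 + 1):  # bins up to the Nyquist frequency
            coefficient = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            magnitudes.append(abs(coefficient))
        frames.append(magnitudes)
    return frames


# One second of an 8 Hz sine sampled at 64 Hz (hypothetical test signal).
sample_rate = 64
signal = [math.sin(2 * math.pi * 8 * n / sample_rate) for n in range(sample_rate)]
spec = spectrogram(signal, frame_size=16, hop=16)

# Each bin spans sample_rate / frame_size = 4 Hz, so 8 Hz falls in bin 2.
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
print(peak_bins)
```

Plotting `spec` as an image, with frames on one axis, bins on the other, and magnitude as color, gives the usual spectrogram view.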

stratified k-fold cross-validation

Stratified k-fold cross-validation is a k-fold cross-validation method in which each fold keeps approximately the same class proportions as the full dataset, so that every fold is a representative sample even when the dataset exhibits class imbalance.
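
A minimal sketch of how such folds could be built: group the example indices by class, then deal each class out across the folds in turn. The function name and the toy labels are assumptions for this example, and the sketch does no shuffling, which a practical implementation would normally add.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Split example indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the members of each class across the folds round-robin.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Imbalanced toy dataset: eight examples of class "a", two of class "b".
labels = ["a"] * 8 + ["b"] * 2
folds = stratified_folds(labels, k=2)
for fold in folds:
    print([labels[i] for i in fold])
```

With plain k-fold splitting, both "b" examples could land in the same fold; here each fold receives one of them, matching the 4:1 ratio of the full dataset.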

structured data

Data found in data sources (virtual machines, virtual containers, storage accounts, databases, data warehouses, data lakes, data marts and data hubs) can be classified into three (3) major categories with regard to the level of structure they present. Unstructured data, i.e. data which is in a format that makes it difficult to search, filter, or ...

z-score

A z-score in data science is also known as a standard score (standardization score). It is the number of standard deviations by which a value lies above or below the mean of all values in the sample.
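
In other words, z = (x - mean) / standard deviation. The short sketch below computes z-scores for a small list of values; it assumes the population standard deviation (`statistics.pstdev`), and the function name and sample values are made up for illustration.

```python
import statistics


def z_scores(values):
    """Return the z-score of every value relative to the whole list."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumed)
    return [(x - mean) / std for x in values]


values = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, standard deviation 2
scores = z_scores(values)
print(scores)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

A score of 2.0 for the value 9 means it lies two standard deviations above the mean. Using the sample standard deviation (`statistics.stdev`) instead would give slightly smaller magnitudes.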

zerto

Zerto is a software development and cloud service provider company offering Disaster Recovery As A Service (DRaaS) solutions. They also offer a Disaster Recovery (DR) technical term dictionary at https://www.zerto.com/resources/a-to-zerto/.