Neural networks are widely used in artificial intelligence. They can solve most predictive problems thanks to their scalability and flexibility: regression, classification, forecasting, object recognition, speech recognition, NLP, and more. So, what are neural networks, what makes them capable of all these tasks, and how do they learn to make predictions? To understand this, we need to know more about neurons and the math behind them. In this article, I am going to explain how neurons learn from given data and are used in…
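As a minimal sketch of the "math behind a neuron" mentioned above: a single neuron computes a weighted sum of its inputs plus a bias, then passes the result through an activation function. The weights, inputs, and sigmoid activation below are illustrative assumptions, not values from the article.

```python
import numpy as np

def sigmoid(z):
    # Squash the weighted sum into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, b):
    # A single neuron: weighted sum of inputs plus bias, then activation
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.5, -1.2, 3.0])   # input features (hypothetical)
w = np.array([0.4, 0.1, -0.2])   # weights, normally learned during training
b = 0.1                          # bias term
print(neuron_output(x, w, b))    # a value between 0 and 1
```

During training, the weights and bias are adjusted (typically by gradient descent) so that outputs like this one move closer to the targets in the data.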

On social media and e-commerce platforms, where a great deal of content is published, it is critical to show users the content most relevant to them. Users can view only a limited amount of content after entering the platform, so that content has to be chosen by a system or engine that can predict user preferences adaptively. Most services on the internet use such a system to improve the user experience. That is why Netflix shows you different series than it shows to your friends.

A system that individually predicts relevant content for users according to their activities…

The Poisson distribution models the number of independent random events that occur in a specific region or interval. We can use it to find the probability of a particular event occurring a given number of times in an interval. The interval is usually a period of time. For example: the probability that x vehicles cross a highway between 13:00 and 14:00, or the probability that x vehicles park in a parking lot within a specified time interval. These are examples where we use the Poisson distribution to obtain probability values.
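The highway example above can be sketched with `scipy.stats.poisson`. The average rate of 12 vehicles per hour is a hypothetical number chosen for illustration; only the distribution itself comes from the text.

```python
from scipy.stats import poisson

# Hypothetical rate: on average 12 vehicles cross between 13:00 and 14:00
lam = 12

# Probability that exactly 15 vehicles cross in that hour: P(X = 15)
p_exact = poisson.pmf(15, mu=lam)

# Probability that at most 10 vehicles cross in the same hour: P(X <= 10)
p_at_most_10 = poisson.cdf(10, mu=lam)

print(p_exact, p_at_most_10)
```

The `pmf` gives the probability of one exact count, while the `cdf` accumulates probabilities up to a count, which is usually what "fewer than n events" questions need.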

Every probability question depends on a probability distribution. The circumstances under which the probability is sought determine which distribution should be used. For example, there are two outcomes, heads and tails, in the experiment of tossing a coin. Suppose that this experiment depends on no circumstance other than the toss itself. A single toss follows the Bernoulli distribution, since there are only two outcomes. However, we need to reconsider the distribution if the question is “What is the probability of getting ten heads after tossing the coin 30 times?”. …
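The 30-toss question above concerns the number of successes in repeated Bernoulli trials, which is described by the binomial distribution. A short sketch with `scipy.stats.binom`, assuming a fair coin (p = 0.5):

```python
from scipy.stats import binom

# "What is the probability of getting ten heads after tossing the coin 30 times?"
n = 30    # number of tosses
k = 10    # number of heads we ask about
p = 0.5   # probability of heads on a single (fair) toss

prob = binom.pmf(k, n, p)  # P(X = 10) for X ~ Binomial(30, 0.5)
print(prob)
```

This illustrates the reconsideration the paragraph calls for: a single toss is Bernoulli, but counting heads over 30 tosses requires the binomial distribution built from those Bernoulli trials.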

When dealing with problems in statistics and machine learning, one of the most frequently encountered terms is covariance. While most of us know that variance represents the spread of values in a single variable, we may not be sure what covariance stands for. Knowing covariance, however, can provide much more insight when solving multivariate problems. Many methods for preprocessing or predictive analysis depend on it; multivariate outlier detection, dimensionality reduction, and regression are examples.
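As a quick illustration of variance versus covariance, `numpy.cov` returns a matrix with each variable's variance on the diagonal and the covariances off the diagonal. The synthetic data below (y constructed to move with x) is an assumption for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2 * x + rng.normal(scale=0.5, size=100)  # y deliberately moves with x

# 2x2 covariance matrix: variances on the diagonal, covariance off it
cov_matrix = np.cov(x, y)
print(cov_matrix)
```

A positive off-diagonal entry means the two variables tend to increase together; a negative one means they move in opposite directions, which is exactly the multivariate information variance alone cannot give.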

In this article, I am going to explain five things that you should know about covariance. Instead of explaining it…

Detecting outliers in multivariate data is often one of the challenges of the data preprocessing phase. There are various distance metrics, scores, and techniques for detecting outliers. Euclidean distance is one of the best-known distance metrics for identifying outliers based on their distance to the center point. There is also the Z-score, which defines outliers for a single numeric variable. In some cases, clustering algorithms may also be preferred. All these methods consider outliers from different perspectives: outliers found by one method may not be flagged as outliers by the others. Therefore, these methods and…
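The Z-score approach mentioned above can be sketched in a few lines: standardize the values and flag those far from the mean. The data and the |z| > 2 cutoff are illustrative assumptions (|z| > 3 is also commonly used).

```python
import numpy as np

# Hypothetical single numeric variable; 25.0 looks suspicious
data = np.array([10.1, 9.8, 10.4, 9.9, 10.2, 25.0, 10.0])

# Z-score: how many standard deviations each value lies from the mean
z_scores = (data - data.mean()) / data.std()

# Flag values beyond the chosen cutoff as outliers
outliers = data[np.abs(z_scores) > 2]
print(outliers)
```

Note that this is a univariate check; the extreme value inflates the mean and standard deviation it is measured against, which is one reason the paragraph's point about combining several detection methods matters in practice.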

Model evaluation is an essential part of the model development process. It is the phase in which we decide whether the model performs well. Therefore, it is critical to assess the model's outcomes with every applicable evaluation method, since applying different methods provides different perspectives.

There are different metrics (or methods) such as accuracy, recall, precision, and F1 score. These are the best-known and most widely used metrics for model evaluation in classification, and each evaluates the model in a different way. For example, while accuracy provides insight into how often your model predicts correctly, recall provides insight into…
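The four metrics named above are all available in `sklearn.metrics`. A minimal sketch on a small hypothetical set of true and predicted binary labels:

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)

# Hypothetical ground-truth labels and model predictions
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))   # fraction correct overall
print("precision:", precision_score(y_true, y_pred))  # of predicted 1s, fraction truly 1
print("recall   :", recall_score(y_true, y_pred))     # of true 1s, fraction found
print("f1       :", f1_score(y_true, y_pred))         # harmonic mean of precision and recall
```

Because each metric answers a different question, the same predictions can look strong by one metric and weak by another, which is why the paragraph recommends looking at all of them.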

In the Context API, a reducer and an initial state object are created to access or update state values from other components. Accordingly, the `useReducer` hook is used both for getting state values and for dispatching actions to update the values of the state provider. With the help of the `useReducer` hook, actions for saving values into the state can easily be triggered from components wrapped by the state provider. Before reading this article, it is important to know how the `useReducer` hook works, so please visit here to get more information about the `useReducer` hook.

In a…

According to the definition of risk on Wikipedia, risk is the possibility of something bad happening. Therefore, when the term risk is used, the number of negative cases should be considered. In other words, risk depends on how many more negative cases there are than positive ones. So what are the relative risk and the odds ratio? In this article, you will find the answers to the following questions:

- What are risk and relative risk?
- How to find the relative risk?
- What are the differences between risk, relative risk, and odds ratio?
- How to decide which risk measurement we…

When working with nominal data, we mostly focus on frequency tables. Unlike with numeric data, there are not many statistical methods for drawing conclusions from what nominal data convey. Methods such as correlation, confidence intervals, mean, and median work for numeric data types. Therefore, frequency tables are used to interpret nominal data: with the help of a frequency table, nominal data can be interpreted by considering the frequency values in that table.

After creating a frequency table, we obtain numeric values to which we can apply statistical methods. The Chi-Square Goodness of Fit test and…
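As a sketch of the Chi-Square Goodness of Fit test mentioned above, `scipy.stats.chisquare` compares observed category frequencies against expected ones (by default, a uniform split across categories). The four-category frequency table below is a hypothetical example.

```python
from scipy.stats import chisquare

# Hypothetical frequency table for a nominal variable with four categories,
# tested against the hypothesis that all categories are equally likely
observed = [48, 35, 62, 55]

# With no expected frequencies given, chisquare assumes a uniform distribution
result = chisquare(observed)
print(result.statistic, result.pvalue)
```

A small p-value suggests the observed frequencies deviate from the assumed distribution more than chance alone would explain, which is the kind of conclusion a raw frequency table cannot provide on its own.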

Data Scientist, Statistician, Python and R Developer