Data analysis | Pawlin

PAWLIN Technologies carries out projects for local and foreign customers in the intelligent analysis of accumulated historical data on enterprise operations, with the aim of building forecasting and scoring models.

Goal

Use available historical observation data and machine learning methods to automatically build a computer model that helps forecast certain production quantities or the occurrence of events, for example, the probability of default for a client characterized by a set of parameters. As a rule, the forecast allows the customer to obtain a specific economic effect: improving product quality, reducing costs, attracting more buyers, etc.

Input data

The customer provides historical observation data. This data is often poorly structured and consists of a set of separate tables with some logical connections between them. The data usually comes in different formats: spreadsheets, text files, database records. The customer describes the structure and meaning of all values. Optionally, the customer also provides intuitive guesses about which data may influence the forecasted value. Depending on the degree of certainty, these hypotheses and assumptions are handled in one of two ways: if the client is confident in them, they are built into the forecasting system from the start as attributes calculated from the source data (derived attributes); otherwise, the hypotheses about connections in the data are confirmed or refuted later, based on the model built on actual data.
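As an illustration of derived attributes, here is a minimal Python sketch. The two tables, their column names (`client_id`, `monthly_income`, `monthly_payment`, `days_late`) and the two attributes computed from them are hypothetical and chosen only for this example; they are not taken from any actual project.

```python
from collections import defaultdict

# Hypothetical source tables, linked by client_id.
clients = [
    {"client_id": 1, "monthly_income": 4000.0, "monthly_payment": 1000.0},
    {"client_id": 2, "monthly_income": 2500.0, "monthly_payment": 1500.0},
]
payments = [
    {"client_id": 1, "days_late": 0},
    {"client_id": 1, "days_late": 0},
    {"client_id": 2, "days_late": 12},
    {"client_id": 2, "days_late": 0},
    {"client_id": 2, "days_late": 35},
    {"client_id": 1, "days_late": 3},
]


def build_features(clients, payments):
    """Join the two tables on client_id and compute derived attributes."""
    total = defaultdict(int)  # number of payments per client
    late = defaultdict(int)   # number of late payments per client
    for p in payments:
        total[p["client_id"]] += 1
        if p["days_late"] > 0:
            late[p["client_id"]] += 1

    features = {}
    for c in clients:
        cid = c["client_id"]
        n = total[cid]
        features[cid] = {
            # Hypothesis: a heavier payment burden raises default risk.
            "payment_to_income": c["monthly_payment"] / c["monthly_income"],
            # Hypothesis: past delays predict future default.
            # Assumption: clients with no payment history get 0.0.
            "late_share": late[cid] / n if n else 0.0,
        }
    return features


features = build_features(clients, payments)
print(features)
```

Each derived attribute encodes one hypothesis about the data, so it can be kept or dropped later depending on whether the trained model confirms it.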

Output

The result is one or several computer programs or components capable of forecasting the target values from a number of previous values or other parameters. The model is tested for adequacy on a control sample that is not used in the learning process. At this stage, one or more metrics appropriate to the task are applied (Gini, KS, Precision/Recall, F-measure, RMS, etc.).
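Two of the metrics named above, Gini and KS, can be sketched in plain Python as follows. This is an illustrative implementation under common definitions (Gini = 2·AUC − 1; KS = the largest gap between the score distributions of the two classes), not necessarily the one used in the projects. The control sample `y_true` / `y_score` is made up for the example.

```python
from typing import Sequence


def _split(labels: Sequence[int], scores: Sequence[float]):
    """Separate scores of events (label 1) from non-events (label 0)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in the control sample")
    return pos, neg


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos, neg = _split(labels, scores)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Gini coefficient of a scoring model: 2 * AUC - 1."""
    return 2.0 * auc(labels, scores) - 1.0


def ks_statistic(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Maximum gap between the empirical score CDFs of the two classes."""
    pos, neg = _split(labels, scores)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Hypothetical control sample: 1 = default, 0 = no default.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.10, 0.35, 0.40, 0.20, 0.80, 0.65, 0.50, 0.90]

print("Gini:", gini(y_true, y_score))
print("KS:  ", ks_statistic(y_true, y_score))
```

The pairwise AUC is quadratic in the sample size, which is fine for a sketch; on large control samples a rank-based computation or a library routine would be the usual choice.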

Project examples

Forecasting the probability of a risk event depending on the season and event parameters;
Forecasting supply and demand;
Forecasting client default in banking or leasing, refining the probability of default based on customer behavior analysis;
Forecasting the future value of a credit/leasing object;
Forecasting the most probable time of a customer’s default, if it happens;
Forecasting how long an object will stay in the warehouse until it is sold;
Forecasting the number of visitors/calls to the call center;
Forecasting the NDVI vegetation index based on meteorological data and multispectral satellite images;
Forecasting average customer satisfaction with an IT service, depending on the parameters that characterize the project;
Forecasting client satisfaction, depending on the parameters that characterize the client;
Forecasting flash floods based on water levels and water discharge observations in rivers.

For specialists

Artificial neural networks (feedforward, recurrent, LSTM), selection of informative attributes, AdaBoost, GPU, binning, WoE, IV, PSI, CSI, transition matrix, vintage analysis, correlation
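For readers less familiar with the scoring terms, WoE, IV and PSI have compact definitions that fit in a short Python sketch. The sign convention used here (WoE = ln(share of goods / share of bads)) is one common choice and is an assumption, as are the function names and the sample data.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Tuple


def woe_iv(bins: Sequence[str], labels: Sequence[int]) -> Tuple[Dict[str, float], float]:
    """Weight of Evidence per bin and total Information Value.

    labels: 1 = bad (event, e.g. default), 0 = good.
    """
    good, bad = Counter(), Counter()
    for b, y in zip(bins, labels):
        (bad if y == 1 else good)[b] += 1
    total_good, total_bad = sum(good.values()), sum(bad.values())

    woe: Dict[str, float] = {}
    iv = 0.0
    for b in sorted(set(bins)):
        if good[b] == 0 or bad[b] == 0:
            # WoE is undefined for a one-class bin; merge such bins first.
            raise ValueError(f"bin {b!r} lacks goods or bads")
        share_good = good[b] / total_good
        share_bad = bad[b] / total_bad
        woe[b] = math.log(share_good / share_bad)
        iv += (share_good - share_bad) * woe[b]
    return woe, iv


def psi(expected_shares: Sequence[float], actual_shares: Sequence[float]) -> float:
    """Population Stability Index between two binned distributions."""
    if len(expected_shares) != len(actual_shares):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for e, a in zip(expected_shares, actual_shares):
        if e <= 0 or a <= 0:
            raise ValueError("empty bins should be merged first")
        total += (a - e) * math.log(a / e)
    return total


# Hypothetical binned attribute and default flags for ten clients.
income_bin = ["low", "low", "low", "mid", "mid", "mid",
              "high", "high", "high", "high"]
default = [1, 1, 0, 0, 0, 1, 0, 0, 0, 1]

woe, iv = woe_iv(income_bin, default)
print("WoE:", woe)
print("IV: ", iv)

# Hypothetical bin shares at model build time vs. in a recent period.
drift = psi([0.5, 0.3, 0.2], [0.4, 0.3, 0.3])
print("PSI:", drift)
```

In this convention a positive WoE marks a bin with relatively more good clients, IV summarizes how informative the attribute is, and PSI measures how far the current population has shifted from the one the model was built on.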